Differences From Artifact [34ec46a285]:
- File gc_lang/fr/dictionnaire/thes_build.py, part of check-in [e67b500b7b] at 2019-06-26 18:54:17 on branch trunk: [fr] thesaurus builder (merging synsets) (user: olr, size: 4469)
To Artifact [7fe275114e]:
- File gc_lang/fr/dictionnaire/thes_build.py, part of check-in [adf7f0e3af] at 2019-07-16 17:46:43 on branch trunk: [fr] update thesaurus builder (user: olr, size: 4576)
︙
             return
         for i, sLine in enumerate(readFile(spf), 1):
             sPOS, *lSynset = sLine.split("|")
             lSynset = self._removeDuplicatesFrom(lSynset)
             self.dSynset[i] = lSynset
             for sWord in lSynset:
                 if not sWord.endswith("*"):
+                    if "(" in sWord:
+                        sWord = re.sub("\\(.*\\)", "", sWord).strip()
                     if sWord not in self.dSynEntry:
                         self.dSynEntry[sWord] = [ (sPOS, i) ]
                     else:
                         self.dSynEntry[sWord].append( (sPOS, i) )

     def showSynsetEntries (self):
         for sWord, lSynset in self.dSynEntry.items():
︙
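
The effect of the two added lines can be tried in isolation. The following is a minimal sketch, not the project's code: `build_index` is a hypothetical standalone function, the sample lines are invented, and the order-preserving de-duplication stands in for `_removeDuplicatesFrom`, whose body is not shown in this hunk. It assumes, as the hunk suggests, that words ending in `*` are not indexed and that the parenthetical is stripped only from the index key, not from the stored synset.

```python
import re


def build_index(lines):
    """Hypothetical standalone version of the loop shown in the hunk."""
    synsets = {}   # synset number -> list of words as written in the source line
    entries = {}   # index key -> list of (part of speech, synset number)
    for i, line in enumerate(lines, 1):
        pos, *words = line.split("|")
        # Assumption: duplicates are dropped while keeping first-seen order.
        words = list(dict.fromkeys(words))
        synsets[i] = words
        for word in words:
            if word.endswith("*"):
                continue  # starred words get no index entry
            if "(" in word:
                # Greedy pattern: removes from the first "(" to the last ")".
                word = re.sub(r"\(.*\)", "", word).strip()
            entries.setdefault(word, []).append((pos, i))
    return synsets, entries


# Invented sample data in the "POS|word|word|..." shape the loop expects.
synsets, entries = build_index([
    "nom|maison|demeure (littéraire)|baraque*",
    "nom|maison|foyer",
])
print(entries)
```

Because `.*` is greedy, a word carrying two parentheticals loses everything between the first opening and the last closing parenthesis, so `"a (x) b (y)"` would be indexed as `"a"`. Whether that matters depends on the thesaurus data, which this page does not show.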