Overview
Comment: [graphspell][py] count words occurrence: count elided words
SHA3-256: f1c63ee223c5255895c238d1a91d5510
User & Date: olr on 2020-03-02 12:59:38
Context
2020-03-02
15:15 | [lo][fr] Recenseur de mots (word counter): double count for compound words that are interrogative verb forms check-in: f02cd26db1 user: olr tags: trunk, lo
12:59 | [graphspell][py] count words occurrence: count elided words check-in: f1c63ee223 user: olr tags: trunk, graphspell
12:07 | [fr] adjustments check-in: e72a9109a9 user: olr tags: trunk, fr
Changes
Modified graphspell/spellchecker.py from [9c6b027a4b] to [114a0237c3].
︙
Old version (excerpt):
    def countWordsOccurrences (self, sText, bByLemma=False, bOnlyUnknownWords=False, dWord={}):
        """count word occurrences. <dWord> can be used to cumulate count from several texts."""
        if not self.oTokenizer:
            self._loadTokenizer()
        for dToken in self.oTokenizer.genTokens(sText):
New version (excerpt):
    def countWordsOccurrences (self, sText, bByLemma=False, bOnlyUnknownWords=False, dWord={}):
        """count word occurrences. <dWord> can be used to cumulate count from several texts."""
        if not self.oTokenizer:
            self._loadTokenizer()
        for dToken in self.oTokenizer.genTokens(sText):
            if dToken['sType'].startswith("WORD"):
                if bOnlyUnknownWords:
                    if not self.isValidToken(dToken['sValue']):
                        dWord[dToken['sValue']] = dWord.get(dToken['sValue'], 0) + 1
                else:
                    if not bByLemma:
                        dWord[dToken['sValue']] = dWord.get(dToken['sValue'], 0) + 1
                    else:
︙
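The effect of testing the token type with startswith("WORD") can be illustrated with a small standalone sketch. It does not use graphspell: the tokenizer below, the type name WORD_ELIDED, and the function names gen_tokens and count_words are assumptions made for illustration, and only the sType/sValue keys and the "WORD" prefix test come from the diff. The exact comparison against "WORD" is included purely for contrast, since the pre-change line is not visible in the excerpt.

```python
import re

# Hypothetical tokenizer: only the "WORD" prefix is taken from the diff.
# The type name WORD_ELIDED and this regex are assumptions for illustration.
_TOKEN_RE = re.compile(r"(?P<WORD_ELIDED>\w+['’])|(?P<WORD>\w+)|(?P<OTHER>\S)")


def gen_tokens(text):
    """Yield token dicts with the same keys as in the diff (sType, sValue)."""
    for match in _TOKEN_RE.finditer(text):
        yield {"sType": match.lastgroup, "sValue": match.group()}


def count_words(text, counts=None, prefix_match=True):
    """Count word tokens; pass an existing dict as <counts> to accumulate."""
    if counts is None:
        counts = {}
    for token in gen_tokens(text):
        if prefix_match:
            # Accepts WORD and any type whose name begins with WORD.
            is_word = token["sType"].startswith("WORD")
        else:
            # Exact comparison, shown only for contrast.
            is_word = token["sType"] == "WORD"
        if is_word:
            counts[token["sValue"]] = counts.get(token["sValue"], 0) + 1
    return counts


print(count_words("l'arbre et l'oiseau"))
print(count_words("l'arbre et l'oiseau", prefix_match=False))
```

With the prefix test, the elided form l' is counted alongside the plain words; with the exact comparison it is skipped. The sketch defaults counts to None rather than {} so that separate calls share a dictionary only when one is passed explicitly.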