Overview
Comment: | [graphspell] tokenizer: handles all kinds of apostrophes |
---|---|
SHA3-256: | 1bdedd313329b16436a21845c9791d07 |
User & Date: | olr on 2019-09-01 08:22:01 |
Original Comment: | [graphspell] tokenizer: handles all kinds of aportrophes |
Context
2019-09-02
08:48 | [fr] adjustments check-in: b0f9309314 user: olr tags: trunk, fr | |
2019-09-01
08:33 | merge trunk check-in: 247bdef473 user: olr tags: tbme | |
08:22 | [graphspell] tokenizer: handles all kinds of apostrophes check-in: 1bdedd3133 user: olr tags: trunk, graphspell | |
08:18 | [fr] adjustments check-in: a1c110c578 user: olr tags: trunk, fr | |
Changes
Modified graphspell-js/tokenizer.js from [0f58ee1cf3] to [5a3085a08e].
︙ | |||
    [/^[,.;:!?…«»“”‘’"(){}\[\]·–—¿¡]/, 'PUNC'],
    [/^[A-Z][.][A-Z][.](?:[A-Z][.])*/, 'WORD_ACRONYM'],
    [/^(?:https?:\/\/|www[.]|[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬀ-ﬆ_-]+[@.][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬀ-ﬆ_-]{2,}[@.])[a-zA-Z0-9][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬀ-ﬆ_.\/?&!%=+*"'@$#-]+/, 'LINK'],
    [/^[#@][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬀ-ﬆ_-]+/, 'TAG'],
    [/^<[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬀ-ﬆ]+.*?>|<\/[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬀ-ﬆ]+ *>/, 'HTML'],
    [/^\[\/?[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬀ-ﬆ]+\]/, 'PSEUDOHTML'],
    [/^&\w+;(?:\w+;|)/, 'HTMLENTITY'],
︙ |
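The JavaScript tokenizer stores its rules as an ordered list of `[regex, label]` pairs, each anchored with `^`. A minimal sketch of how such a list could be consumed, written in Python for consistency with the other file in this check-in: the rules are tried in order at the current position and the first match wins. The check-in comment mentions apostrophe handling, but the rule it changes is not visible in the lines shown here, so the `WORD_ELIDED`, `WORD` and `SPACE` rules, the `APOSTROPHES` set and the `tokenize` function are assumptions made for illustration, not the project's code.

```python
import re

# Assumed set of apostrophe-like characters (illustrative, not taken from the check-in):
# ASCII ', U+2019, U+02BC, U+2018, U+201B, U+00B4, backtick, U+2032.
APOSTROPHES = "'\u2019\u02BC\u2018\u201B\u00B4`\u2032"

# Ordered (pattern, label) pairs. Order matters: WORD_ELIDED must come before
# PUNC, because PUNC also contains the curly quotes U+2018 and U+2019.
PATTERNS = [
    (re.compile(r"\s+"), "SPACE"),                                     # hypothetical rule
    (re.compile(r"[A-Z][.][A-Z][.](?:[A-Z][.])*"), "WORD_ACRONYM"),    # from the diff
    (re.compile(r"[#@][\w-]+"), "TAG"),                                # simplified from the diff
    (re.compile(r"\w+[" + APOSTROPHES + "]"), "WORD_ELIDED"),          # hypothetical rule
    (re.compile(r"\w+"), "WORD"),                                      # hypothetical rule
    (re.compile(r"[,.;:!?…«»“”‘’\"(){}\[\]·–—¿¡]"), "PUNC"),           # from the diff
]


def tokenize(text):
    """Yield (label, value, start) tuples; whitespace is skipped."""
    pos = 0
    while pos < len(text):
        for regex, label in PATTERNS:
            # match(text, pos) anchors at pos, like the ^ in the JavaScript rules
            # applied to the remaining text.
            m = regex.match(text, pos)
            if m:
                if label != "SPACE":
                    yield (label, m.group(), pos)
                pos = m.end()
                break
        else:
            # No rule matched: emit the single character and move on.
            yield ("OTHER", text[pos], pos)
            pos += 1
```

With these assumed rules, `list(tokenize("l'arbre"))` gives `[("WORD_ELIDED", "l'", 0), ("WORD", "arbre", 2)]`, and the same split is produced when the ASCII apostrophe is replaced by any other character in `APOSTROPHES`.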
Modified graphspell/tokenizer.py from [18d4ef9f97] to [a2c42f5f3e].
︙ | |||
    r'(?P<FOLDERWIN>[a-zA-Z]:\\(?:Program Files(?: [(]x86[)]|)|[\w.()]+)(?:\\[\w.()-]+)*)',
    r'(?P<PUNC>[][,.;:!?…«»“”‘’"(){}·–—¿¡])',
    r'(?P<WORD_ACRONYM>[A-Z][.][A-Z][.](?:[A-Z][.])*)',
    r'(?P<LINK>(?:https?://|www[.]|\w+[@.]\w\w+[@.])\w[\w./?&!%=+*"\'@$#-]+)',
    r'(?P<HASHTAG>[#@][\w-]+)',
    r'(?P<HTML><\w+.*?>|</\w+ *>)',
    r'(?P<PSEUDOHTML>\[/?\w+\])',
︙ |
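The Python rules use named groups, which suggests they are joined into a single alternation so that the name of the matching group identifies the token type. A minimal sketch of that technique follows. It reuses four patterns from the lines above; the `WORD_ELIDED` and `WORD` patterns, the `APOSTROPHES` set and the `gen_tokens` function are assumptions made for illustration, since the rule actually touched by this check-in is not visible in the lines shown.

```python
import re

# Assumed set of apostrophe-like characters (illustrative, not taken from the check-in):
# ASCII ', U+2019, U+02BC, U+2018, U+201B, U+00B4, backtick, U+2032.
APOSTROPHES = "'\u2019\u02BC\u2018\u201B\u00B4`\u2032"

# Alternatives are tried left to right, so more specific rules come first.
TOKEN_RE = re.compile("|".join((
    r'(?P<WORD_ACRONYM>[A-Z][.][A-Z][.](?:[A-Z][.])*)',   # from the diff
    r'(?P<HASHTAG>[#@][\w-]+)',                            # from the diff
    r'(?P<PSEUDOHTML>\[/?\w+\])',                          # from the diff
    r'(?P<WORD_ELIDED>\w+[' + APOSTROPHES + r'])',         # hypothetical rule
    r'(?P<WORD>\w+)',                                      # hypothetical rule
    r'(?P<PUNC>[][,.;:!?…«»“”‘’"(){}·–—¿¡])',              # from the diff
)))


def gen_tokens(text):
    """Yield one dict per token; characters matched by no rule are skipped."""
    for m in TOKEN_RE.finditer(text):
        # Each alternative has exactly one capturing group, so lastgroup
        # is the name of the rule that matched.
        yield {
            "type": m.lastgroup,
            "value": m.group(),
            "start": m.start(),
            "end": m.end(),
        }
```

For example, `[t["type"] for t in gen_tokens("[b]d’accord[/b] @olr")]` yields `["PSEUDOHTML", "WORD_ELIDED", "WORD", "PSEUDOHTML", "HASHTAG"]`. Keeping the apostrophe variants in one character class means a new variant only needs to be added in one place.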