Overview
| Comment: | [graphspell] tokenizer: remove hyphen in number detection (always considered as a separate sign) |
|---|---|
| SHA3-256: | 6950f5898f8aaf6b6046b8f39c770cc2 |
| User & Date: | olr on 2018-07-17 06:42:09 |
Context
| 2018-07-17 | |
|---|---|
| 11:38 | [core] better capitalization function check-in: 0b3f9d14dc user: olr tags: core, rg |
| 06:42 | [graphspell] tokenizer: remove hyphen in number detection (always considered as a separate sign) check-in: 6950f5898f user: olr tags: graphspell, rg |
| 06:24 | [fr] false positive: "quelle va être…" check-in: 5d1225ab9b user: olr tags: fr, rg |
Changes
Modified graphspell-js/tokenizer.js from [2fadfb42f5] to [4a5b091820].
| ︙ | |||
[/^[A-Z][.][A-Z][.](?:[A-Z][.])*/, 'WORD_ACRONYM'],
[/^(?:https?:\/\/|www[.]|[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬁ-ﬆ_-]+[@.][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬁ-ﬆ_-]{2,}[@.])[a-zA-Z0-9][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬁ-ﬆ_.\/?&!%=+*"'@$#-]+/, 'LINK'],
[/^[#@][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬁ-ﬆ_-]+/, 'TAG'],
[/^<[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬁ-ﬆ]+.*?>|<\/[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬁ-ﬆ]+ *>/, 'HTML'],
[/^\[\/?[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯﬁ-ﬆ]+\]/, 'PSEUDOHTML'],
[/^&\w+;(?:\w+;|)/, 'HTMLENTITY'],
[/^\d\d?h\d\d\b/, 'HOUR'],
|
| ︙ |
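The context lines above are entries in an ordered list of `[pattern, type]` pairs, each anchored with `^`. A minimal sketch of how such a list might be consumed is given here in Python rather than JavaScript, on the assumption that the first rule matching at the current position wins. The `WORD` and `OTHER` rules and the `tokenize` helper are hypothetical and are not taken from `tokenizer.js`.

```python
import re

# Ordered (pattern, type) pairs. The first three are adapted from the
# context lines of the diff; WORD is an assumed fallback added so the
# sketch can run on ordinary text.
RULES = [
    (re.compile(r"[A-Z][.][A-Z][.](?:[A-Z][.])*"), "WORD_ACRONYM"),
    (re.compile(r"[#@][\w-]+"), "TAG"),
    (re.compile(r"\d\d?h\d\d\b"), "HOUR"),
    (re.compile(r"\w+"), "WORD"),  # assumption, not from the source
]


def tokenize(text):
    """Yield (type, value, start) triples; the first matching rule wins."""
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for pattern, token_type in RULES:
            # pattern.match(text, pos) only matches starting at pos,
            # which plays the role of the ^ anchor in the JS rules.
            m = pattern.match(text, pos)
            if m:
                yield (token_type, m.group(), pos)
                pos = m.end()
                break
        else:
            # No rule matched: emit the character under a hypothetical type.
            yield ("OTHER", text[pos], pos)
            pos += 1


for token in tokenize("U.S.A. at 9h30 #tag"):
    print(token)
```

Order matters in this scheme: if the assumed `WORD` rule came before `HOUR`, the string `9h30` would be consumed as a plain word.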
Modified graphspell/tokenizer.py from [b1bcfc3595] to [daca54adb9].
| ︙ | |||
r'(?P<PUNC>[][,.;:!?…«»“”‘’"(){}·–—])',
r'(?P<WORD_ACRONYM>[A-Z][.][A-Z][.](?:[A-Z][.])*)',
r'(?P<LINK>(?:https?://|www[.]|\w+[@.]\w\w+[@.])\w[\w./?&!%=+*"\'@$#-]+)',
r'(?P<HASHTAG>[#@][\w-]+)',
r'(?P<HTML><\w+.*?>|</\w+ *>)',
r'(?P<PSEUDOHTML>\[/?\w+\])',
r'(?P<HOUR>\d\d?h\d\d\b)',
|
| ︙ |
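The changed lines themselves are not shown above, so the following is only a sketch of the behaviour the check-in comment describes: a number pattern that does not absorb a hyphen, which is therefore always emitted as a separate sign. The `HOUR` pattern comes from the context lines; the `NUM`, `SIGN`, and `WORD` patterns and the `tokenize` helper are assumptions made for illustration and may differ from the real `tokenizer.py`.

```python
import re

# Named-group alternatives joined with "|", in the style of the context lines.
PATTERNS = (
    r'(?P<HOUR>\d\d?h\d\d\b)',       # from the context lines of the diff
    r'(?P<NUM>\d+(?:[.,]\d+)?)',     # assumption: no hyphen inside a number
    r'(?P<SIGN>[-+=*/])',            # assumption: hyphen is its own token
    r'(?P<WORD>\w+)',                # assumption: fallback for plain words
)
TOKEN_RE = re.compile("|".join(PATTERNS))


def tokenize(text):
    """Return (type, value) pairs; the type is the name of the matched group."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


print(tokenize("-12 or 3-4 at 9h30"))
```

With these assumed patterns, both the leading hyphen in `-12` and the hyphen in `3-4` come out as `SIGN` tokens, and the digits on either side as `NUM` tokens.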