Overview
| Comment: | [graphspell][fr] tokenization: added signs €$# (false positive) |
|---|---|
| SHA3-256: | 365d3554c78bff6e9e27c69cff0ff290 |
| User & Date: | olr on 2019-02-22 11:53:36 |
Context
2019-02-22
| 17:10 | [fr] false positive: peut/peu confusion check-in: d6657f0c8b user: olr tags: trunk, fr |
| 11:53 | [graphspell][fr] tokenization: added signs €$# (false positive) check-in: 365d3554c7 user: olr tags: trunk, fr, graphspell |
| 11:27 | [fr] merge last commit (cherrypick) check-in: 90dc1dc3c1 user: olr tags: trunk, fr |
Changes
Modified gc_lang/fr/rules.grx from [6645daec9e] to [641b30219e].
︙
-2>> =suggPlur(\2) # Accord de nombre erroné : « \2 » devrait être au pluriel.
TEST: 00 heure, 01 heure
TEST: il a adopté 1 {{chiens}}.
TEST: 22 {{heure}}
TEST: 3 {{heure}}
TEST: les élèves sont inquiets après une année 2018 compliquée et riche en réformes.
︙
Modified graphspell-js/tokenizer.js from [1d91386df2] to [c05f88b98c].
︙
[/^(?:https?:\/\/|www[.]|[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯfi-st_-]+[@.][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯfi-st_-]{2,}[@.])[a-zA-Z0-9][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯfi-st_.\/?&!%=+*"'@$#-]+/, 'LINK'],
[/^[#@][a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯfi-st_-]+/, 'TAG'],
[/^<[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯfi-st]+.*?>|<\/[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯfi-st]+ *>/, 'HTML'],
[/^\[\/?[a-zA-Zà-öÀ-Ö0-9ø-ÿØ-ßĀ-ʯfi-st]+\]/, 'PSEUDOHTML'],
[/^&\w+;(?:\w+;|)/, 'HTMLENTITY'],
[/^\d\d?h\d\d\b/, 'HOUR'],
[/^\d+(?:[.,]\d+|)/, 'NUM'],
︙
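The JavaScript table above pairs each anchored regex with a token type, which suggests that patterns are tried in order at the current offset. A minimal Python sketch of that scanning loop follows, under stated assumptions: HOUR and NUM are taken from the context lines, while the SIGN pattern, the function name `scan`, and the skip-one-character fallback are hypothetical and are not shown in this diff.

```python
import re

# Ordered (regex, type) pairs, mirroring the shape of the JavaScript table.
# HOUR and NUM come from the diff context; SIGN is a hypothetical pattern
# standing in for the "signs €$#" mentioned in the check-in comment.
ORDERED_PATTERNS = [
    (re.compile(r"\d\d?h\d\d\b"), "HOUR"),
    (re.compile(r"\d+(?:[.,]\d+|)"), "NUM"),
    (re.compile(r"[€$#]"), "SIGN"),
]

def scan(text):
    """Return (type, value, start) tuples, trying patterns in order at each offset."""
    tokens = []
    pos = 0
    while pos < len(text):
        for regex, token_type in ORDERED_PATTERNS:
            # match() with a start offset only succeeds at that offset,
            # which plays the role of the leading ^ in the JavaScript regexes.
            m = regex.match(text, pos)
            if m:
                tokens.append((token_type, m.group(), pos))
                pos = m.end()
                break
        else:
            # Assumption: characters matched by no pattern are skipped.
            pos += 1
    return tokens

print(scan("12h30 5,5 € 42$"))
# [('HOUR', '12h30', 0), ('NUM', '5,5', 6), ('SIGN', '€', 10), ('NUM', '42', 12), ('SIGN', '$', 14)]
```

Order matters in this sketch: HOUR has to come before NUM, otherwise "12h30" would be cut into a number followed by leftover characters.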
Modified graphspell/tokenizer.py from [daca54adb9] to [13303390f7].
︙
r'(?P<WORD_ACRONYM>[A-Z][.][A-Z][.](?:[A-Z][.])*)',
r'(?P<LINK>(?:https?://|www[.]|\w+[@.]\w\w+[@.])\w[\w./?&!%=+*"\'@$#-]+)',
r'(?P<HASHTAG>[#@][\w-]+)',
r'(?P<HTML><\w+.*?>|</\w+ *>)',
r'(?P<PSEUDOHTML>\[/?\w+\])',
r'(?P<HOUR>\d\d?h\d\d\b)',
r'(?P<NUM>\d+(?:[.,]\d+))',
︙
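The Python patterns above use named groups, so one plausible way to consume them is to join them into a single alternation and read the token type from the name of the group that matched. The sketch below assumes that design. HASHTAG, HOUR and NUM are copied from the context lines; the SIGN and WORD patterns and the `tokenize` helper are hypothetical, added only to show how a bare €, $ or # could become a token of its own instead of being dropped or misread.

```python
import re

NAMED_PATTERNS = (
    r'(?P<HASHTAG>[#@][\w-]+)',      # from the diff context
    r'(?P<HOUR>\d\d?h\d\d\b)',       # from the diff context
    r'(?P<NUM>\d+(?:[.,]\d+))',      # from the diff context (decimal part required)
    r'(?P<SIGN>[€$#])',              # hypothetical: one token per sign
    r'(?P<WORD>\w+)',                # hypothetical fallback for words and integers
)

# Assumption: patterns are joined with "|" so the first alternative that
# matches at a given position wins.
TOKEN_RE = re.compile("|".join(NAMED_PATTERNS))

def tokenize(text):
    """Yield (type, value, start, end) for each token found in text."""
    for m in TOKEN_RE.finditer(text):
        # lastgroup is the name of the alternative that matched.
        yield (m.lastgroup, m.group(), m.start(), m.end())

print([(t, v) for t, v, _, _ in tokenize("3,50 € à 10h30 #promo # $")])
# [('NUM', '3,50'), ('SIGN', '€'), ('WORD', 'à'), ('HOUR', '10h30'),
#  ('HASHTAG', '#promo'), ('SIGN', '#'), ('SIGN', '$')]
```

In this sketch a "#" followed by word characters is still a HASHTAG because that pattern is listed first; only an isolated "#" falls through to SIGN.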