Grammalecte  Diff

Differences From Artifact [cbb922d3e1]:

To Artifact [c2a94b6c6f]:


From [cbb922d3e1], lines 55-69:

        @@@@END_GRAPH

There is no limit to the number of actions and the type of actions a rule can
launch. Each action has its own condition to be triggered.

There are several kinds of actions:

* Error warning, with a message, and optionally suggestions, and optionally an URL
* Text transformation, modifying internally the checked text
* [second pass only] Disambiguation action
* [second pass only] Tagging token
* [second pass only] Immunity rules


On the first pass, you can only write regex rules.

To [c2a94b6c6f], lines 55-69:

        @@@@END_GRAPH

There is no limit to the number of actions and the type of actions a rule can
launch. Each action has its own condition to be triggered.

There are several kinds of actions:

* Error warning, with a message, and optionally suggestions, and optionally a URL
* Text transformation, modifying internally the checked text
* [second pass only] Disambiguation action
* [second pass only] Tagging token
* [second pass only] Immunity rules


On the first pass, you can only write regex rules.

From [cbb922d3e1], lines 140-154:

    __[i]__
    __<s]__
    __[u>__
    __<s>__


User option activating/disactivating is possible with an option name placed
just after the LCR flags, i.e.:

    __[i]/option1__
    __[u]/option2__
    __[s>/option1__
    __<u>/option3__
    __<i>/option3__

To [c2a94b6c6f], lines 140-154:

    __[i]__
    __<s]__
    __[u>__
    __<s>__


User option activating/deactivating is possible with an option name placed
just after the LCR flags, i.e.:

    __[i]/option1__
    __[u]/option2__
    __[s>/option1__
    __<u>/option3__
    __<i>/option3__

From [cbb922d3e1], lines 332-377:

`~>> *`

>   Replace by whitespaces

`~>> @`

>   Replace with arrobas, useful mostly at first pass, where it is advised to
>   check usage of punctuations and whitespaces.
>   Successions of @ are automatically removed at the beginning of the second pass.

`~>> _`

>   Replace with underscores. Just a filler.
>   These characters won’t be removed at the beginning of the second pass.

You can use positioning with text rewriting actions.

    Mr(. [A-Z]\w+) <<- ~1>> *

You can also call Python expressions.

    __[s]__ Mr. ([a-z]\w+) <<- ~1>> =\1.upper()


### Text processing

The text processor is useful to simplify texts and write simplier checking
rules.

For example, sentences with the same grammar mistake:

    These “cats” are blacks.
    These cats are “blacks”.
    These cats are absolutely blacks.
    These stupid “cats” are all blacks.
    These unknown cats are as per usual blacks.

Instead of writting complex rules or several rules to find mistakes for all possible
cases, you can use the text preprocessor to simplify the text.

To remove the chars “”, write:

    [“”] ~>> *

The * means: replace text by whitespaces.

To [c2a94b6c6f], lines 332-377:

`~>> *`

>   Replace by whitespaces

`~>> @`

>   Replace with the at sign, useful mostly at first pass, where it is advised to
>   check usage of punctuations and whitespaces.
>   Successions of @ are automatically removed at the beginning of the second pass.

`~>> _`

>   Replace with underscores. Just a filler.
>   These characters won’t be removed at the beginning of the second pass.

You can use positioning with text rewriting actions.

    Mr(. [A-Z]\w+) <<- ~1>> *

You can also call Python expressions.

    __[s]__ Mr. ([a-z]\w+) <<- ~1>> =\1.upper()


### Text processing

The text processor is useful to simplify texts and write simpler checking
rules.

For example, sentences with the same grammar mistake:

    These “cats” are blacks.
    These cats are “blacks”.
    These cats are absolutely blacks.
    These stupid “cats” are all blacks.
    These unknown cats are as per usual blacks.

Instead of writing complex rules or several rules to find mistakes for all possible 
cases, you can use the text preprocessor to simplify the text.

To remove the chars “”, write:

    [“”] ~>> *

The * means: replace text by whitespaces.
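
As a rough illustration of what the `[“”] ~>> *` rule above does, the Python sketch below replaces each matched character with a space. It assumes the replacement keeps the text length unchanged, so that character offsets still line up with the original text. The function name `blank_out` is hypothetical and this is not Grammalecte's own code.

```python
import re

def blank_out(text, pattern):
    """Replace every match of `pattern` with as many spaces as it has characters.

    Hypothetical helper: it only mimics the described effect of `~>> *`.
    """
    return re.sub(pattern, lambda m: " " * len(m.group(0)), text)

original = "These “cats” are blacks."
cleaned = blank_out(original, "[“”]")
print(repr(cleaned))
```

With the quotation marks blanked out, a single checking rule can match `cats are blacks` whether or not the original words were quoted.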

From [cbb922d3e1], lines 418-432:

When Grammalecte analyses a word with morph, before requesting the
POS tags to the dictionary, it checks if there is a stored marker for the
position where the word is. If there is a marker, Grammalecte uses the stored
data and don’t make request to the dictionary.

The disambiguation commands store POS tags at the position of a word.

There is 3 commands for disambiguation.

`select(n, pattern)`

>   stores at position n only the POS tags of the word matching the pattern.

`exclude(n, pattern)`

To [c2a94b6c6f], lines 418-432:

When Grammalecte analyses a word with morph, before requesting the
POS tags to the dictionary, it checks if there is a stored marker for the
position where the word is. If there is a marker, Grammalecte uses the stored
data and don’t make request to the dictionary.

The disambiguation commands store POS tags at the position of a word.

There are 3 commands for disambiguation.

`select(n, pattern)`

>   stores at position n only the POS tags of the word matching the pattern.
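
The marker lookup and the `select` command described above could be sketched in Python as follows. Everything here is an assumption made for illustration: the names `select_tags` and `morph_tags`, the plain `dict` used as marker store, and the tag labels are hypothetical and do not reflect Grammalecte's internal API or its real tag format.

```python
import re

def select_tags(markers, n, tags, pattern):
    """Store at position n only the tags that match `pattern` (hypothetical helper)."""
    kept = [tag for tag in tags if re.search(pattern, tag)]
    if kept:
        # Only store a marker when at least one tag survives the filter.
        markers[n] = kept
    return kept

def morph_tags(markers, n, word, dictionary):
    """Return the stored tags for position n if a marker exists, else ask the dictionary."""
    if n in markers:
        return markers[n]
    return dictionary.get(word, [])

# Made-up dictionary entry with made-up tag labels.
dictionary = {"ferme": ["noun", "adjective", "verb"]}
markers = {}

select_tags(markers, 3, dictionary["ferme"], r"^verb$")
disambiguated = morph_tags(markers, 3, "ferme", dictionary)
untouched = morph_tags(markers, 4, "ferme", dictionary)
```

In this sketch, position 3 now yields only the tag kept by the selection, while position 4, which has no marker, still falls back to the full dictionary entry.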

`exclude(n, pattern)`

From [cbb922d3e1], lines 473-492:


### Standard functions

`word(n)`

>   catches the nth next word after the pattern (separated only by white spaces).
>   returns None if no word catched

`word(-n)`

>   catches the nth next word before the pattern (separated only by white spaces).
>   returns None if no word catched

`after(regex[, neg_regex])`

>   checks if the text after the pattern matches the regex.

`before(regex[, neg_regex])`

To [c2a94b6c6f], lines 473-492:


### Standard functions

`word(n)`

>   catches the nth next word after the pattern (separated only by white spaces).
>   returns None if no word caught

`word(-n)`

>   catches the nth next word before the pattern (separated only by white spaces).
>   returns None if no word caught
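
The behaviour described for `word(n)` and `word(-n)` can be approximated with the Python sketch below. It is an assumption-laden illustration, not Grammalecte's implementation: `next_word` and `prev_word` are hypothetical names, and the match offsets are passed in explicitly instead of being taken from the rule's pattern.

```python
import re

def next_word(text, end, n):
    """Return the nth word after offset `end`, or None.

    Words must be separated only by white space, so any punctuation
    in between makes the lookup fail.
    """
    m = re.match(r"(?:\s+\w+){%d}\s+(\w+)" % (n - 1), text[end:])
    return m.group(1) if m else None

def prev_word(text, start, n):
    """Return the nth word before offset `start`, or None."""
    m = re.search(r"(\w+)\s+(?:\w+\s+){%d}$" % (n - 1), text[:start])
    return m.group(1) if m else None

text = "These cats are absolutely blacks."
start = text.index("are")      # offset where the matched pattern begins
end = start + len("are")       # offset just past the matched pattern

first_after = next_word(text, end, 1)
second_after = next_word(text, end, 2)
first_before = prev_word(text, start, 1)
```

Both helpers return `None` as soon as something other than white space separates the words, which mirrors the "separated only by white spaces" condition stated above.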

`after(regex[, neg_regex])`

>   checks if the text after the pattern matches the regex.

`before(regex[, neg_regex])`