Grammalecte  Check-in [1dac73beb8]

Overview
Comment: [build][fr] build_data for locution: __END__ to skip what comes after
SHA3-256: 1dac73beb8e566705fcb8db096fd676f0eca8ac099f81780f71ada7e8a0396cd
User & Date: olr on 2017-11-03 19:43:18
Context
2017-11-03
21:26
[fr] adverbial locutions: sorting check-in: 9e283206be user: olr tags: fr, Lexicographe
19:43
[build][fr] build_data for locution: __END__ to skip what comes after check-in: 1dac73beb8 user: olr tags: fr, build, Lexicographe
19:40
[fr] adverbial locutions: sorting check-in: 239439f2cd user: olr tags: fr, Lexicographe
Changes

Modified gc_lang/fr/build_data.py from [01eee1eb89] to [1e628c0406].

Lines 315-330 of the new version are shown below. The two added lines are 322-323: the `if sLine == "__END__":` test and its `break`.

def makeLocutions (sp, bJS=False):
    "compile list of locutions in JSON"
    print("> Locutions ", end="")
    print("(Python et JavaScript)"  if bJS  else "(Python seulement)")
    dLocGraph = {}
    oTokenizer = tkz.Tokenizer("fr")
    for sLine in itertools.chain(readFile(sp+"/data/locutions.txt"), readFile(sp+"/data/locutions_vrac.txt")):
        if sLine == "__END__":
            break
        dCur = dLocGraph
        sLoc, sTag = sLine.split("\t")
        for oToken in oTokenizer.genTokens(sLoc.strip()):
            sWord = oToken["sValue"]
            if sWord not in dCur:
                dCur[sWord] = {}
            dCur = dCur[sWord]
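
The effect of the added `__END__` check can be illustrated with a small, self-contained sketch. It is not the project's code: `build_locution_graph`, the whitespace tokenizer, the `":"` leaf key, and the sample lines and tags are all assumptions made for illustration (the real function uses `tkz.Tokenizer` and is truncated above). One point worth noting is that `break` leaves the whole `itertools.chain` loop, so a marker found in the first source also skips every later source.

```python
import itertools

END_MARKER = "__END__"  # sentinel line, as in the check-in above


def build_locution_graph(*line_sources):
    """Build a nested-dict trie of tokens, stopping at the first END_MARKER line.

    Each line is expected to be "<locution>\\t<tag>". Tokenization by
    str.split() is a stand-in for the project's tokenizer (assumption).
    """
    graph = {}
    for line in itertools.chain(*line_sources):
        if line == END_MARKER:
            # Leaves the loop over the *chained* iterator: remaining lines
            # of this source and all following sources are ignored.
            break
        loc, tag = line.split("\t")
        cur = graph
        for word in loc.strip().split():
            # Descend into the child node, creating it if needed.
            cur = cur.setdefault(word, {})
        cur[":"] = tag  # hypothetical leaf key used to store the tag
    return graph


# Hypothetical sample data standing in for the two locution files.
first = ["à peu près\t:LW", "__END__", "à tort\t:LW"]
second = ["en vrac\t:LW"]

graph = build_locution_graph(first, second)
print(graph)
```

With this sample data, only the first locution ends up in the graph. If the intent were to skip only the rest of the file containing the marker, each file would need its own loop rather than a single chained one.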