Commit Graph

  • 7e36239d74 node: add setPadding in Tokenizer Pierric Cistac 2020-01-28 18:42:13 -05:00
  • 4c11ae1e1e node: add setTruncation in Tokenizer Pierric Cistac 2020-01-30 11:25:40 -05:00
  • 19878e7584 node: expose Encoding Pierric Cistac 2020-02-03 11:30:50 -05:00
  • 27880b3aaf Merge pull request #118 from huggingface/node-original-string MOI Anthony 2020-02-03 11:19:38 -05:00
  • e365c1992b Improve flexibility in some Python binding (#107) Funtowicz Morgan 2020-02-03 10:41:33 +00:00
  • 6524f09e99 Roberta PostProcessor (#111) Funtowicz Morgan 2020-02-03 10:39:48 +00:00
  • b027c63c37 Expose get_vocab_size in tokenizer python API. Karan Desai 2020-01-23 00:52:07 -05:00
  • 8138ece1a6 Merge pull request #117 from huggingface/fix-original-str MOI Anthony 2020-02-01 22:25:36 -05:00
  • 9a25b0e7f9 Node - Tweak getOriginalString to avoid clone Anthony MOI 2020-02-01 21:58:24 -05:00
  • a14c63343b node: add binding for original string Pierric Cistac 2020-01-31 18:09:05 -05:00
  • e9ecd5aeec Fix some more warnings Anthony MOI 2020-02-01 11:06:14 -05:00
  • 661cc185a4 Fix new clippy warnings Anthony MOI 2020-02-01 10:56:30 -05:00
  • 05275a9391 python: fix inverted normalized/original string range Pierric Cistac 2020-01-31 11:09:55 -05:00
  • c98eb04998 node: fix test config Pierric Cistac 2020-01-31 11:07:36 -05:00
  • 6d532fedb1 rust: fix OnlyFirst and OnlySecond truncation strategies (#112) Pierric Cistac 2020-01-30 09:20:18 -05:00
  • 3f6de4d33f node: add lint check in workflow Pierric Cistac 2020-01-29 16:58:59 -05:00
  • 8792790a91 node: fix build.js, actually exiting with status 1 when error Pierric Cistac 2020-01-29 16:58:11 -05:00
  • c396cff952 tweak error name / description Pierric Cistac 2020-01-29 15:33:23 -05:00
  • 429a619168 rust: fix OnlyFirst and OnlySecond truncation strategies Pierric Cistac 2020-01-29 11:23:25 -05:00
  • d977d82a72 node: update ts build Pierric Cistac 2020-01-28 15:42:31 -05:00
  • 880cd7199b python: align Cargo.lock package version Pierric Cistac 2020-01-28 16:44:48 -05:00
  • 88391dd185 node: bump version to 0.3.1 Pierric Cistac 2020-01-24 17:19:18 -05:00
  • 0e724dfeb4 node: remove redundant types in jsdoc Pierric Cistac 2020-01-24 17:18:31 -05:00
  • a331db24fc node: change declaration type of rust internal structures to interface Pierric Cistac 2020-01-24 17:17:46 -05:00
  • efd3f8b2ff node: cache workflows (#103) Pierric Cistac 2020-01-23 17:37:05 -05:00
  • c7f850415f generalize npm cache and forget about rust target Pierric Cistac 2020-01-23 16:39:06 -05:00
  • ef59535b54 try w/ source rust lib Pierric Cistac 2020-01-23 16:26:20 -05:00
  • 551b89e2a1 trigger node action Pierric Cistac 2020-01-23 16:19:12 -05:00
  • 5dacde7fa4 first try cache rust + npm Pierric Cistac 2020-01-23 16:11:06 -05:00
  • 8b8f2867e5 Merge pull request #102 from huggingface/python-release MOI Anthony 2020-01-23 15:34:55 -05:00
  • 7121cab87b trigger python release action on tag push Pierric Cistac 2020-01-23 15:15:51 -05:00
  • 698d267a5a node: fix windows build (#101) Pierric Cistac 2020-01-23 15:05:21 -05:00
  • 0fbc6f0aa1 fix windows build Pierric Cistac 2020-01-23 15:00:00 -05:00
  • 68cce806cd Node bindings v0.3.0 Pierric Cistac 2020-01-22 18:24:48 -05:00
  • c4938ff09d node: add tests + linting (#76) Pierric Cistac 2020-01-22 18:22:45 -05:00
  • 36edeebba3 use last rust version on node actions Pierric Cistac 2020-01-22 18:11:16 -05:00
  • d056fe4104 add more failing tests w/ last stable rust version Pierric Cistac 2020-01-21 14:08:54 -05:00
  • 40f022ccb2 npm run lint Pierric Cistac 2020-01-22 18:09:38 -05:00
  • 2aef3e9d9c add eslint/prettier Pierric Cistac 2020-01-15 16:15:08 -05:00
  • eedb57c3f3 quickfix imports Pierric Cistac 2020-01-15 12:25:04 -05:00
  • 26a52dd660 fix wordpiece tokenizer when no vocabfile provided Pierric Cistac 2020-01-15 12:23:52 -05:00
  • 0db8467fba add simple tests on Tokenizer bindings Pierric Cistac 2020-01-15 12:21:58 -05:00
  • 6b8deb90e2 prepare for tests Pierric Cistac 2020-01-15 12:18:25 -05:00
  • 0105021280 Bump version for Python Anthony MOI 2020-01-22 16:07:03 -05:00
  • 327de00d71 Merge pull request #95 from huggingface/vocab-serialization MOI Anthony 2020-01-22 15:49:48 -05:00
  • 36cd67fdf5 Merge pull request #93 from Tomarchelone/patch-1 MOI Anthony 2020-01-22 15:47:33 -05:00
  • b059190259 Merge pull request #94 from huggingface/fix-python35 MOI Anthony 2020-01-22 14:13:28 -05:00
  • 3a9badd2e0 save vocab in order of ID epwalsh 2020-01-21 13:32:13 -08:00
  • 0b782e4507 Removed invalid class-level variable declaration. Morgan Funtowicz 2020-01-21 15:10:47 -05:00
  • 048ab46089 Fix indexing bug in add_tokens() Denis Zolotukhin 2020-01-21 12:58:22 +03:00
  • da7e629e4a Bump Python version for release Anthony MOI 2020-01-20 09:14:46 -05:00
  • c3cfc16b63 Merge pull request #92 from huggingface/fix-bpe-tokenizer MOI Anthony 2020-01-20 09:01:02 -05:00
  • 395f605fd2 Use WhitespaceSplit for BPETokenizer Anthony MOI 2020-01-17 18:33:29 -05:00
  • fc601289eb Node - Bindings for WhitespaceSplit Anthony MOI 2020-01-17 18:18:40 -05:00
  • 9c408011ae Python - Bindings for WhitespaceSplit Anthony MOI 2020-01-17 18:15:14 -05:00
  • 8369b02312 Add WhitespaceSplit PreTokenizer Anthony MOI 2020-01-17 18:03:10 -05:00
  • 3895d14bf9 Improve the way BpeTrainer creates a BPE Anthony MOI 2020-01-17 17:41:35 -05:00
  • d1ae0bd576 Cache cargo registry and build target directory in CI (#78) Evan Pete Walsh 2020-01-17 13:30:08 -08:00
  • 5d76bf4749 Merge pull request #80 from iechevarria/patch-1 MOI Anthony 2020-01-17 06:45:32 -05:00
  • e82722a9c2 Fix typo in Python binding README Ivan Echevarria 2020-01-16 17:10:48 -08:00
  • 457e6c9932 Merge pull request #71 from huggingface/python_example_fix MOI Anthony 2020-01-15 10:07:34 -05:00
  • 7d10dd0fd6 Merge pull request #75 from huggingface/doc_parallelism MOI Anthony 2020-01-15 09:42:22 -05:00
  • 65b35385f8 Merge pull request #70 from huggingface/python-decode-kwargs-fix MOI Anthony 2020-01-15 09:40:42 -05:00
  • 11fb79cee8 Added doc for setting tokenizers level of parallelism. Morgan Funtowicz 2020-01-15 15:08:43 +01:00
  • 374f944e32 Use the same vocabs/merges for Python and Rust comparison. Morgan Funtowicz 2020-01-15 11:57:34 +01:00
  • 4839154145 Remove kwargs mapping on Tokenizer decode/decode_batch as their is only one possible arg. Morgan Funtowicz 2020-01-15 11:16:01 +01:00
  • a779714a9e don't forget to copy README!!! Pierric Cistac 2020-01-14 17:31:29 -05:00
  • 25546d00c6 bump node bindings version: 0.2.3 Pierric Cistac 2020-01-14 17:08:57 -05:00
  • cd167816d6 Github action for node release (#68) Pierric Cistac 2020-01-14 16:51:12 -05:00
  • 63c873ac50 don't build when only updates on python bindings Pierric Cistac 2020-01-14 15:20:18 -05:00
  • 74e18c13f5 rename workflow to Node Pierric Cistac 2020-01-14 14:12:39 -05:00
  • ed201271fc clean env vars in node.yml Pierric Cistac 2020-01-14 14:03:41 -05:00
  • 6a10dadcd1 move bindings build to each push Pierric Cistac 2020-01-14 14:01:35 -05:00
  • 3ea7917df5 exports => module.exports Pierric Cistac 2020-01-14 13:24:42 -05:00
  • bbc6023424 ready for primetime Pierric Cistac 2020-01-14 12:05:25 -05:00
  • 13449b0d95 update build script / actions Pierric Cistac 2020-01-14 11:57:15 -05:00
  • a0bc4c97ce test multiple env builds Pierric Cistac 2020-01-14 10:30:35 -05:00
  • d11c2dcb92 add public read permission Pierric Cistac 2020-01-14 09:54:36 -05:00
  • 4d70d0b1c2 complete windows build Pierric Cistac 2020-01-13 21:56:05 -05:00
  • b4607c828a first try node release workflow Pierric Cistac 2020-01-13 15:44:06 -05:00
  • f836f2109b build ts Pierric Cistac 2020-01-13 14:48:49 -05:00
  • 6a9f7f4304 Merge pull request #64 from huggingface/bytelevel_bpe_python_example MOI Anthony 2020-01-14 09:53:27 -05:00
  • 8e2328925b Merge pull request #60 from adelevie/patch-1 MOI Anthony 2020-01-14 09:18:58 -05:00
  • 894f887444 Updated train_bert_wordpiece.py as well. Morgan Funtowicz 2020-01-14 13:32:02 +01:00
  • 7caf9fd823 Updated train_bytelevel_bpe.py to use the high level Python API. Morgan Funtowicz 2020-01-14 12:00:50 +01:00
  • 554d6f7fd9 Typo in README.md Alan deLevie 2020-01-13 19:03:18 -05:00
  • b41ce0e9d6 Update README.md Anthony MOI 2020-01-13 10:11:43 -05:00
  • c3bd2dfa53 Merge pull request #55 from benjamin-ny/patch-1 MOI Anthony 2020-01-13 10:10:17 -05:00
  • 743d66340d Fix a few errors in the README.md benjamin-ny 2020-01-13 09:38:38 +01:00
  • fc9e81d4ab Fix split on special tokens & bump version Anthony MOI 2020-01-12 02:35:45 -05:00
  • 32d3955cd4 fix link counter Clement 2020-01-10 17:40:12 -05:00
  • 78b05a092f Adding download counter Clement 2020-01-10 17:37:24 -05:00
  • d2833fc71d Update README.md Julien Chaumond 2020-01-10 16:24:07 -05:00
  • a95d0e6ba1 Node - Fix import Anthony MOI 2020-01-10 16:11:44 -05:00
  • 76cabe4741 Merge pull request #50 from huggingface/node-typings MOI Anthony 2020-01-10 16:07:01 -05:00
  • 2b2fadf45b first readme Pierric Cistac 2020-01-10 16:03:56 -05:00
  • 24c08b2530 fix sentencepiece tokenizer name Pierric Cistac 2020-01-10 16:03:47 -05:00
  • df67eadeca fix path bin Pierric Cistac 2020-01-10 15:53:56 -05:00
  • c9da0ffa18 bump Pierric Cistac 2020-01-10 15:35:27 -05:00
  • e68b4ae501 publish script Pierric Cistac 2020-01-10 15:19:59 -05:00