Commit Graph

118 Commits

Author SHA1 Message Date
Anthony MOI
b06681cb1e Bump version for release 2020-01-06 21:05:01 -05:00
Anthony MOI
185b6f0b8b Add Sequence Normalizer 2020-01-06 21:03:05 -05:00
Anthony MOI
5c02bbbc4c Add basic unicode normalizers 2020-01-06 20:38:42 -05:00
Anthony MOI
4b9ae66419 WordPiece decoder with customizable prefix 2020-01-06 20:20:42 -05:00
Anthony MOI
772d0680b6 Python - Update all typings 2020-01-06 20:03:00 -05:00
Anthony MOI
0079a7a6b7 Python - Add NormalizedString + doc/typings 2020-01-06 17:55:22 -05:00
Anthony MOI
6de04bbaea Python - Add typings/doc for Encoding 2020-01-06 17:23:04 -05:00
Anthony MOI
7e9e0aa81c Python - Add Tokenizer doc with stub file 2020-01-06 16:40:27 -05:00
Anthony MOI
9a99e2bcb1 Python - Add missing Bpe constructor kwargs 2020-01-06 16:39:59 -05:00
Anthony MOI
b7d0acc562 Python - Improve decode/decode_batch API 2020-01-06 16:39:36 -05:00
Anthony MOI
1a083a6e6f Python - Improved stub file for models 2020-01-06 15:55:00 -05:00
Anthony MOI
0e41e0b327 Python - Include correct packages and stubs 2020-01-06 15:24:17 -05:00
Anthony MOI
8723f78e6f Python - build-sdist.sh +x mode 2020-01-06 14:24:08 -05:00
Anthony MOI
d7b6385566 Python - Adding some stub files 2020-01-06 13:04:30 -05:00
Anthony MOI
7eebd06409 Python - Improve imports 2020-01-06 12:03:01 -05:00
Anthony MOI
e1caacfce0 Rename package for crates.io 2020-01-04 23:42:32 -05:00
Anthony MOI
fab4e96b51 Python - Add bert wordpiece training example 2020-01-03 19:37:29 -05:00
Anthony MOI
c51e340492 Python - Add WordPieceTrainer 2020-01-03 19:37:29 -05:00
Anthony MOI
e64b54b29e Python - Update BpeTrainer interface 2020-01-03 19:37:29 -05:00
Anthony MOI
408490e6b4 Add missing kwargs support 2020-01-02 19:32:56 -05:00
Anthony MOI
22e499133b Python - Expose missing BPE options at creation (cc @epwalsh) 2020-01-02 19:30:50 -05:00
Anthony MOI
04cfeea2d5 Python - ByteLevel BPE training example file (cc @julien-c) 2020-01-02 18:39:31 -05:00
Anthony MOI
0589deb6e2 Python - Expose BpeTrainer options 2020-01-02 18:09:04 -05:00
Anthony MOI
d3c3f5a700 Python - Expose ByteLevel alphabet 2020-01-02 18:06:06 -05:00
Anthony MOI
722b61230d BPE handles UNK token 2020-01-01 14:49:03 -05:00
Anthony MOI
47e4b00e05 BpeTrainer shows some progress 2020-01-01 01:28:17 -05:00
Anthony MOI
90dfdc715d Expose Tokenizer parts 2019-12-31 22:57:47 -05:00
Anthony MOI
f28ca58fd9 [Fix #17] BPE & WordPiece models saving 2019-12-31 13:56:28 -05:00
Anthony MOI
225a886382 Python - Expose Whitespace PreTokenizer 2019-12-30 13:10:33 -05:00
Anthony MOI
4677a09626 Python - Expose pad and truncate on Encoding 2019-12-30 12:56:07 -05:00
Anthony MOI
8ddb2de64e Update unicode-normalization to published crate 2019-12-30 12:18:00 -05:00
Anthony MOI
06d515d41b Python - Add ability to retrieve a range of string 2019-12-29 01:37:03 -05:00
Anthony MOI
049029dc42 Python - Restore methods on Encoding 2019-12-29 01:26:42 -05:00
Anthony MOI
9c574ad1b7 Python - Fix some import warnings 2019-12-29 00:43:32 -05:00
Anthony MOI
3779bf3e19 Python - Update example 2019-12-29 00:38:37 -05:00
Anthony MOI
3dcf9f763c Python - Update pre tokenizers with offsets 2019-12-29 00:37:58 -05:00
Anthony MOI
3f79d9d5e0 Python - Add normalizers bindings & BertNormalizer 2019-12-29 00:36:09 -05:00
Anthony MOI
839239d3b4 Bump version 2019-12-27 10:43:34 -05:00
Anthony MOI
bddf7ba737 Python - Fix building from wheels 2019-12-27 10:39:19 -05:00
Anthony MOI
ffd28ba558 Bump for release 2019-12-26 14:56:13 -05:00
Anthony MOI
74cc6f6bde Python - Simplify padding interface 2019-12-26 14:34:13 -05:00
Anthony MOI
d93d4fc3cd Python - Simplify truncation interface 2019-12-26 10:35:20 -05:00
Anthony MOI
a7734ffc9f Python - Update doc and readme for add_prefix_space 2019-12-26 10:34:53 -05:00
Anthony MOI
1879cb0bcb Python - change with_added_tokens as kwarg 2019-12-25 22:22:35 -05:00
Anthony MOI
905c1eb77e Python - update some packages 2019-12-25 22:16:43 -05:00
Anthony MOI
597031b973 Python - remove unused variable 2019-12-25 22:16:11 -05:00
Anthony MOI
9d289d357d Python - change add_prefix_space as kwarg 2019-12-25 22:10:17 -05:00
Anthony MOI
4bc5a7bbe7 Python - fix example 2019-12-24 11:20:40 -05:00
epwalsh
c0ed873c4d simplify initialization of BpeTrainer 2019-12-23 20:13:48 -05:00
Anthony MOI
fab1d4cabc Bump version for release 2019-12-23 17:28:38 -05:00