Commit Graph

276 Commits

Author SHA1 Message Date
Anthony MOI
5038a7f74e Update CHANGELOGs 2020-03-27 17:49:02 -04:00
Anthony MOI
a2a6d80017 Python - expost get_vocab on Tokenizer 2020-03-27 11:53:18 -04:00
Anthony MOI
e8aec7a624 Bump version for Python release 2020-03-27 09:17:35 -04:00
Morgan Funtowicz
bc46064a61 Added missing imports for AddedToken 2020-03-27 12:32:23 +01:00
Anthony MOI
b132be34af Bump version for Python release 2020-03-26 17:26:14 -04:00
Anthony MOI
6c232886b8 Improve & update CHANGELOGs 2020-03-26 17:08:42 -04:00
Anthony MOI
4341c79d85 Python - last fixes on Encoding bindings/typings 2020-03-26 15:42:45 -04:00
Anthony MOI
14e3ab3787 Python - fix style 2020-03-26 15:42:45 -04:00
Morgan Funtowicz
39958a2f0f TokenizedSequence / TokenizedSequenceWithOffsets needs to be declared in .py files not only .pyi 2020-03-26 15:42:45 -04:00
Morgan Funtowicz
68405a6fae Forward type_id in encode_tokenized/encode_tokenized_batch python binding. 2020-03-26 15:42:45 -04:00
Anthony MOI
9bd9e0b3c1 Expose post_process on the Tokenizer 2020-03-26 15:42:45 -04:00
Anthony MOI
9ce895550b Add some new merging capability on Encoding 2020-03-26 15:42:44 -04:00
Anthony MOI
eec74ca3e6 Python - Add Model.encode_batch and improve typings 2020-03-26 15:42:44 -04:00
Anthony MOI
1150751ab6 Python - Update mappings API 2020-03-26 15:42:44 -04:00
Anthony MOI
a397a1da63 Python - Expose encode method on Model 2020-03-26 15:42:44 -04:00
Anthony MOI
8de6ef5a37 Python - Bind new Encoding's mappings 2020-03-26 15:42:44 -04:00
Anthony MOI
e8925a33da Python - remove add_special_tokens from BertWordPieceTokenizer init 2020-03-26 14:19:37 -04:00
Anthony MOI
f8d54edcdd Python - Fix cases where str expected instead of AddedToken 2020-03-25 19:22:53 -04:00
Anthony MOI
c65d53892d Python - Add bindings for new AddedToken options 2020-03-24 20:58:45 -04:00
Anthony MOI
d953d58cee Rust - Fix offsets when there are added tokens 2020-03-19 12:53:03 -04:00
Anthony MOI
d53de0e2da Python - Expose normalize on BaseTokenizer 2020-03-18 16:44:31 -04:00
Anthony MOI
ae0d330907 Update CHANGELOGs 2020-03-18 16:42:27 -04:00
Anthony MOI
60a4fb35f4 Python - Update bindings 2020-03-16 10:36:42 -04:00
Morgan Funtowicz
505bfbba82 Fix invalid error messages. 2020-03-12 15:38:29 +01:00
Morgan Funtowicz
5ed1f26c71 Throw a more meaningful error when provided python input is None. 2020-03-12 10:59:05 +01:00
Anthony MOI
257360acec Python - encode & encode batch with add_special_tokens 2020-03-10 16:21:10 -04:00
Anthony MOI
a9be177185 Update CHANGELOGs 2020-03-10 13:12:34 -04:00
Anthony MOI
28f022058c Keep default values as true 2020-03-10 12:58:53 -04:00
Anthony MOI
45f3eaaf72 Update bindings and typings 2020-03-10 12:28:24 -04:00
Anthony MOI
efbbfea558 Update ByteLevel PostProcessor 2020-03-10 12:05:04 -04:00
Anthony MOI
7e9003ccb7 Python - Update bindings 2020-03-09 18:37:03 -04:00
Anthony MOI
86d2e90ad2 Update CHANGELOGs 2020-03-06 17:44:44 -05:00
Anthony MOI
d778ed5e0a Python - Update README and implementation 2020-03-06 17:44:44 -05:00
Anthony MOI
52180a9179 Python - Add ByteLevel PostProcessor 2020-03-06 17:44:44 -05:00
Anthony MOI
b60eef5245 Python - Make style 2020-03-06 17:44:44 -05:00
Anthony MOI
d8e7a830b2 Update CHANGELOGs 2020-03-06 17:44:34 -05:00
Anthony MOI
b2e5f54b6f Python - Fix ByteLevelBPETokenizer implementation 2020-03-06 17:44:03 -05:00
Anthony MOI
f1460fadb9 Python - Update docs and implementations 2020-03-06 17:44:03 -05:00
Anthony MOI
2393506dc7 Python - Add ByteLevel Normalizer 2020-03-06 17:44:03 -05:00
Anthony MOI
47cef0e13a Python - Fix BPE and WordPiece builders usage 2020-03-06 12:20:39 -05:00
Anthony MOI
4b596e19dd Rust - Improve training progress for multiple files 2020-03-03 11:04:24 -05:00
Anthony MOI
8e791791d1 Python - prepare for release 2020-03-02 14:56:42 -05:00
Anthony MOI
4deeb9511f Update CHANGELOGs 2020-03-02 14:37:17 -05:00
Anthony MOI
f8f0702d98 Fix LongestFirst truncation strategy 2020-02-29 16:26:13 -05:00
Anthony MOI
657f8b6c15 Rust & Python - Update CHANGELOGs 2020-02-26 11:30:44 -05:00
Anthony MOI
3b10d640d5 Rust & Python - Update CHANGELOGs 2020-02-26 10:51:40 -05:00
Anthony MOI
2425fe877d Python - Update CHANGELOG 2020-02-26 09:31:17 -05:00
Anthony MOI
61b4c9c30a Python - Add missing tokens to BertWordPieceTokenizer 2020-02-26 09:21:54 -05:00
Anthony MOI
440e8e9bd9 Python - Bump version for release 2020-02-24 16:08:49 -05:00
Anthony MOI
be08d9574c Python - Add Changelog 2020-02-24 10:12:50 -05:00