Commit Graph

352 Commits

Author SHA1 Message Date
Anthony MOI
c205afe7a5 Python - Also allow creating Tokenizer from_buffer 2020-05-27 13:46:37 -04:00
Anthony MOI
0e890d0d05 Update CHANGELOGs 2020-05-27 13:46:37 -04:00
Anthony MOI
de9feae0b5 Python - Make Encoding picklable 2020-05-27 13:46:37 -04:00
Anthony MOI
c5bba91bf4 Python - Test and fix classes pickling 2020-05-27 13:46:37 -04:00
Anthony MOI
6a70162d78 Python - Make all relevant classes picklable 2020-05-27 13:46:37 -04:00
Anthony MOI
93bb82c657 Update READMEs and CHANGELOGs 2020-05-27 13:32:20 -04:00
Anthony MOI
b24904513c Update READMEs and CHANGELOGs 2020-05-27 13:12:46 -04:00
Anthony MOI
85c7c94809 Python - Add to/from str and files for Tokenizer 2020-05-27 13:07:53 -04:00
Anthony MOI
cffcbb95fc Rust - serialization fixes + loading/saving methods 2020-05-27 13:07:53 -04:00
Anthony MOI
c800813bbe Python - Add Tokenizer saving capability 2020-05-27 13:07:52 -04:00
Anthony MOI
2b17d4221c Python - Restore custom PyDecoder and PyPreTokenizer 2020-05-27 13:07:52 -04:00
Anthony MOI
07fb3283f4 Python - Disable custom Decoder/PreTokenizer for now 2020-05-27 13:07:52 -04:00
Anthony MOI
400d9545fd Update rust toolchain for now 2020-05-21 19:15:40 -04:00
Anthony MOI
5a01792413 Python - Update CHANGELOGs and bump to 0.8.0-dev for release 2020-05-21 18:57:02 -04:00
Anthony MOI
7ad3bda369 Merge pull request #249 from huggingface/pre-tokenized: Allow pre-tokenized inputs to encode/encode_batch 2020-05-21 18:39:46 -04:00
Anthony MOI
8cb4ca72b6 Python - Update dependencies 2020-05-20 19:55:14 -04:00
Anthony MOI
30216190e5 Python - Improve typings for new encode/encode_batch 2020-05-01 17:11:55 -04:00
Anthony MOI
3fb8033770 Python - Improve tests for new encode/encode_batch 2020-05-01 17:11:55 -04:00
Anthony MOI
efaa6f589a Python - Improve encode/encode_batch 2020-05-01 17:11:54 -04:00
Anthony MOI
dbc8e68c68 Python - Update tests for new encode 2020-05-01 17:11:54 -04:00
Anthony MOI
2e105c4258 Python - Update typings for new encode 2020-05-01 17:11:54 -04:00
Anthony MOI
835f08ab02 Python - Update bindings for new encode 2020-05-01 17:11:54 -04:00
Anthony MOI
02cc97756f Rust - Improve TruncationError 2020-04-24 12:13:17 -04:00
Anthony MOI
7d2b59b0aa Rust - Add len() and is_empty() on Encoding 2020-04-24 11:44:10 -04:00
jaymody
a28fd29204 Python - Fix bug in bert wordpiece example script 2020-04-18 17:50:52 -04:00
Anthony MOI
670f619ab5 Python - bump to 0.7.0 for final release 2020-04-17 12:48:10 -04:00
Anthony MOI
3312ad75d9 Python - Bump to 0.7.0rc6 for release 2020-04-16 19:39:04 -04:00
Anthony MOI
ad0e488998 Python - Update changelog 2020-04-16 19:32:54 -04:00
Anthony MOI
249a282f1d Python - Fix style 2020-04-16 19:31:00 -04:00
Thomas Wolf
77590b9291 style! 2020-04-17 01:29:52 +02:00
Thomas Wolf
7216486686 Update CharLevelBPE 2020-04-17 01:15:02 +02:00
Anthony MOI
873ac2d9a8 Python - Add missing char_to_word 2020-04-16 18:20:30 -04:00
Anthony MOI
bdfb02f473 Python - Bump to 0.7.0rc6 for release 2020-04-16 14:42:22 -04:00
Anthony MOI
8834508547 Update CHANGELOGs 2020-04-16 14:25:19 -04:00
Anthony MOI
71b7830d1b Rust | Python | Node - Also add char_to_word 2020-04-16 14:23:37 -04:00
Anthony MOI
c5e22c14cb Python - Improve mappings on Encoding 2020-04-16 14:23:37 -04:00
Anthony MOI
c96c4d95bd Update CHANGELOGs 2020-04-16 10:34:34 -04:00
Anthony MOI
81e2cc2fc4 Python - Add offsets trimming to RobertaProcessing 2020-04-15 18:49:38 -04:00
Sławomir Dadas
0865a9ad55 Python - improve compatibility with sentencepiece in the conversion script 2020-04-11 17:35:50 +02:00
Anthony MOI
09104afd07 Python - Bump to 0.7.0-rc5 for release 2020-04-09 11:41:10 -04:00
Anthony MOI
a6c33f5de8 Python - update some dependencies 2020-04-09 10:56:26 -04:00
Anthony MOI
d6326a61c1 Python - Use PyO3 0.9.2 2020-04-09 10:21:05 -04:00
Anthony MOI
3ad1360210 Word indices are None for special tokens 2020-04-09 09:52:02 -04:00
Anthony MOI
1b9ead7ca2 Python - Try PyO3 master to fix build 2020-04-08 16:06:24 -04:00
Anthony MOI
b8daeae24a Python - Force PyO3 to 0.9.0 for now (cf https://github.com/PyO3/pyo3/issues/857) 2020-04-08 15:45:15 -04:00
Anthony MOI
9f3de61f07 Python - revert to PyO3 0.9.0 for now (cf https://github.com/PyO3/pyo3/issues/857) 2020-04-08 15:27:49 -04:00
Anthony MOI
25afbb5fde Python - Bump to 0.7.0-rc4 for release 2020-04-08 14:27:29 -04:00
Anthony MOI
ce637aec63 Python - Update README with new API 2020-04-08 14:27:29 -04:00
Anthony MOI
39999fba14 Update CHANGELOGs before releases 2020-04-08 14:04:26 -04:00
Anthony MOI
4cb77ca64c Python - Tweak BPE constructor + add some tests 2020-04-08 14:04:26 -04:00