Commit Graph

344 Commits

Author SHA1 Message Date
Anthony MOI
cffcbb95fc Rust - serialization fixes + loading/saving methods 2020-05-27 13:07:53 -04:00
Anthony MOI
c800813bbe Python - Add Tokenizer saving capability 2020-05-27 13:07:52 -04:00
Anthony MOI
2b17d4221c Python - Restore custom PyDecoder and PyPreTokenizer 2020-05-27 13:07:52 -04:00
Anthony MOI
07fb3283f4 Python - Disable custom Decoder/PreTokenizer for now 2020-05-27 13:07:52 -04:00
Anthony MOI
400d9545fd Update rust toolchain for now 2020-05-21 19:15:40 -04:00
Anthony MOI
5a01792413 Python - Update CHANGELOGs and bump to 0.8.0-dev for release 2020-05-21 18:57:02 -04:00
Anthony MOI
7ad3bda369 Merge pull request #249 from huggingface/pre-tokenized: Allow pre-tokenized inputs to encode/encode_batch 2020-05-21 18:39:46 -04:00
Anthony MOI
8cb4ca72b6 Python - Update dependencies 2020-05-20 19:55:14 -04:00
Anthony MOI
30216190e5 Python - Improve typings for new encode/encode_batch 2020-05-01 17:11:55 -04:00
Anthony MOI
3fb8033770 Python - Improve tests for new encode/encode_batch 2020-05-01 17:11:55 -04:00
Anthony MOI
efaa6f589a Python - Improve encode/encode_batch 2020-05-01 17:11:54 -04:00
Anthony MOI
dbc8e68c68 Python - Update tests for new encode 2020-05-01 17:11:54 -04:00
Anthony MOI
2e105c4258 Python - Update typings for new encode 2020-05-01 17:11:54 -04:00
Anthony MOI
835f08ab02 Python - Update bindings for new encode 2020-05-01 17:11:54 -04:00
Anthony MOI
02cc97756f Rust - Improve TruncationError 2020-04-24 12:13:17 -04:00
Anthony MOI
7d2b59b0aa Rust - Add len() and is_empty() on Encoding 2020-04-24 11:44:10 -04:00
jaymody
a28fd29204 Python - Fix bug in bert wordpiece example script 2020-04-18 17:50:52 -04:00
Anthony MOI
670f619ab5 Python - bump to 0.7.0 for final release 2020-04-17 12:48:10 -04:00
Anthony MOI
3312ad75d9 Python - Bump to 0.7.0rc6 for release 2020-04-16 19:39:04 -04:00
Anthony MOI
ad0e488998 Python - Update changelog 2020-04-16 19:32:54 -04:00
Anthony MOI
249a282f1d Python - Fix style 2020-04-16 19:31:00 -04:00
Thomas Wolf
77590b9291 style! 2020-04-17 01:29:52 +02:00
Thomas Wolf
7216486686 Update CharLevelBPE 2020-04-17 01:15:02 +02:00
Anthony MOI
873ac2d9a8 Python - Add missing char_to_word 2020-04-16 18:20:30 -04:00
Anthony MOI
bdfb02f473 Python - Bump to 0.7.0rc6 for release 2020-04-16 14:42:22 -04:00
Anthony MOI
8834508547 Update CHANGELOGs 2020-04-16 14:25:19 -04:00
Anthony MOI
71b7830d1b Rust | Python | Node - Also add char_to_word 2020-04-16 14:23:37 -04:00
Anthony MOI
c5e22c14cb Python - Improve mappings on Encoding 2020-04-16 14:23:37 -04:00
Anthony MOI
c96c4d95bd Update CHANGELOGs 2020-04-16 10:34:34 -04:00
Anthony MOI
81e2cc2fc4 Python - Add offsets trimming to RobertaProcessing 2020-04-15 18:49:38 -04:00
Sławomir Dadas
0865a9ad55 Python - improve compatibility with sentencepiece in the conversion script 2020-04-11 17:35:50 +02:00
Anthony MOI
09104afd07 Python - Bump to 0.7.0-rc5 for release 2020-04-09 11:41:10 -04:00
Anthony MOI
a6c33f5de8 Python - update some dependencies 2020-04-09 10:56:26 -04:00
Anthony MOI
d6326a61c1 Python - Use PyO3 0.9.2 2020-04-09 10:21:05 -04:00
Anthony MOI
3ad1360210 Word indices are None for special tokens 2020-04-09 09:52:02 -04:00
Anthony MOI
1b9ead7ca2 Python - Try PyO3 master to fix build 2020-04-08 16:06:24 -04:00
Anthony MOI
b8daeae24a Python - Force PyO3 to 0.9.0 for now (cf https://github.com/PyO3/pyo3/issues/857) 2020-04-08 15:45:15 -04:00
Anthony MOI
9f3de61f07 Python - revert to PyO3 0.9.0 for now (cf https://github.com/PyO3/pyo3/issues/857) 2020-04-08 15:27:49 -04:00
Anthony MOI
25afbb5fde Python - Bump to 0.7.0-rc4 for release 2020-04-08 14:27:29 -04:00
Anthony MOI
ce637aec63 Python - Update README with new API 2020-04-08 14:27:29 -04:00
Anthony MOI
39999fba14 Update CHANGELOGs before releases 2020-04-08 14:04:26 -04:00
Anthony MOI
4cb77ca64c Python - Tweak BPE constructor + add some tests 2020-04-08 14:04:26 -04:00
Anthony MOI
be7b345bcd Require Send for all parts of the tokenizer (#222) 2020-04-08 13:35:06 -04:00
Andre Bogus
550413f00a add Send + Sync on all traits, remove elsewhere 2020-04-08 18:43:23 +02:00
Bjarte Johansen
def8333d45 Python - Update changelog 2020-04-06 21:40:23 +02:00
Bjarte Johansen
fab97475e5 Python - Update examples to use new models API 2020-04-06 21:40:23 +02:00
Bjarte Johansen
823066fea9 Python - Update tests to use new models API - Check that new models have right subclass 2020-04-06 21:40:08 +02:00
Bjarte Johansen
38bc788002 Python - Update implementations to use new API 2020-04-06 21:40:08 +02:00
Bjarte Johansen
69ed81e618 Python - Update types with new models API 2020-04-06 21:40:08 +02:00
Bjarte Johansen
2dc48e56ac Python - Update pyo3 version - Use __new__ instead of static method as model constructors 2020-04-06 21:20:16 +02:00