Commit Graph

330 Commits

Author SHA1 Message Date
Anthony MOI
02cc97756f Rust - Improve TruncationError 2020-04-24 12:13:17 -04:00
Anthony MOI
7d2b59b0aa Rust - Add len() and is_empty() on Encoding 2020-04-24 11:44:10 -04:00
jaymody
a28fd29204 Python - Fix bug in bert wordpiece example script 2020-04-18 17:50:52 -04:00
Anthony MOI
670f619ab5 Python - bump to 0.7.0 for final release 2020-04-17 12:48:10 -04:00
Anthony MOI
3312ad75d9 Python - Bump to 0.7.0rc6 for release 2020-04-16 19:39:04 -04:00
Anthony MOI
ad0e488998 Python - Update changelog 2020-04-16 19:32:54 -04:00
Anthony MOI
249a282f1d Python - Fix style 2020-04-16 19:31:00 -04:00
Thomas Wolf
77590b9291 style! 2020-04-17 01:29:52 +02:00
Thomas Wolf
7216486686 Update CharLevelBPE 2020-04-17 01:15:02 +02:00
Anthony MOI
873ac2d9a8 Python - Add missing char_to_word 2020-04-16 18:20:30 -04:00
Anthony MOI
bdfb02f473 Python - Bump to 0.7.0rc6 for release 2020-04-16 14:42:22 -04:00
Anthony MOI
8834508547 Update CHANGELOGs 2020-04-16 14:25:19 -04:00
Anthony MOI
71b7830d1b Rust | Python | Node - Also add char_to_word 2020-04-16 14:23:37 -04:00
Anthony MOI
c5e22c14cb Python - Improve mappings on Encoding 2020-04-16 14:23:37 -04:00
Anthony MOI
c96c4d95bd Update CHANGELOGs 2020-04-16 10:34:34 -04:00
Anthony MOI
81e2cc2fc4 Python - Add offsets trimming to RobertaProcessing 2020-04-15 18:49:38 -04:00
Sławomir Dadas
0865a9ad55 Python - improve compatibility with sentencepiece in the conversion script 2020-04-11 17:35:50 +02:00
Anthony MOI
09104afd07 Python - Bump to 0.7.0-rc5 for release 2020-04-09 11:41:10 -04:00
Anthony MOI
a6c33f5de8 Python - update some dependencies 2020-04-09 10:56:26 -04:00
Anthony MOI
d6326a61c1 Python - Use PyO3 0.9.2 2020-04-09 10:21:05 -04:00
Anthony MOI
3ad1360210 Word indices are None for special tokens 2020-04-09 09:52:02 -04:00
Anthony MOI
1b9ead7ca2 Python - Try PyO3 master to fix build 2020-04-08 16:06:24 -04:00
Anthony MOI
b8daeae24a Python - Force PyO3 to 0.9.0 for now (cf https://github.com/PyO3/pyo3/issues/857) 2020-04-08 15:45:15 -04:00
Anthony MOI
9f3de61f07 Python - revert to PyO3 0.9.0 for now (cf https://github.com/PyO3/pyo3/issues/857) 2020-04-08 15:27:49 -04:00
Anthony MOI
25afbb5fde Python - Bump to 0.7.0-rc4 for release 2020-04-08 14:27:29 -04:00
Anthony MOI
ce637aec63 Python - Update README with new API 2020-04-08 14:27:29 -04:00
Anthony MOI
39999fba14 Update CHANGELOGs before releases 2020-04-08 14:04:26 -04:00
Anthony MOI
4cb77ca64c Python - Tweak BPE constructor + add some tests 2020-04-08 14:04:26 -04:00
Anthony MOI
be7b345bcd Require Send for all parts of the tokenizer (#222) 2020-04-08 13:35:06 -04:00
Andre Bogus
550413f00a add Send + Sync on all traits, remove elsewhere 2020-04-08 18:43:23 +02:00
Bjarte Johansen
def8333d45 Python - Update changelog 2020-04-06 21:40:23 +02:00
Bjarte Johansen
fab97475e5 Python - Update examples to use new models API 2020-04-06 21:40:23 +02:00
Bjarte Johansen
823066fea9 Python - Update tests to use new models API (Check that new models have right subclass) 2020-04-06 21:40:08 +02:00
Bjarte Johansen
38bc788002 Python - Update implementations to use new API 2020-04-06 21:40:08 +02:00
Bjarte Johansen
69ed81e618 Python - Update types with new models API 2020-04-06 21:40:08 +02:00
Bjarte Johansen
2dc48e56ac Python - Update pyo3 version (Use __new__ instead of static method as model constructors) 2020-04-06 21:20:16 +02:00
Anthony MOI
b03fea1d66 Python - Update workflow and Makefile with tests 2020-04-01 17:36:33 -04:00
Anthony MOI
837791ee1f Python - Test BertWordPieceTokenizer 2020-04-01 17:25:56 -04:00
Anthony MOI
7fd7dfd113 Python - Test CharBPETokenizer 2020-04-01 17:25:56 -04:00
Anthony MOI
dbc23e20a9 Python - Test Models 2020-04-01 17:25:55 -04:00
Anthony MOI
53a7dbdaee Python - Test PostProcessors 2020-04-01 17:25:55 -04:00
Anthony MOI
a9f4c5950a Python - Test Decoders 2020-04-01 17:25:55 -04:00
Anthony MOI
0de9885da8 Python - Test PreTokenizers 2020-04-01 17:25:55 -04:00
Anthony MOI
d6692d4072 Python - Test Normalizers 2020-04-01 17:25:55 -04:00
Anthony MOI
3264ffe235 Python - Improve tests on Tokenizer 2020-04-01 17:25:55 -04:00
Anthony MOI
5ebe687753 Python - Add first implementations tests 2020-04-01 17:25:55 -04:00
Anthony MOI
023566fbbb Python - Add some tests utils 2020-04-01 17:25:55 -04:00
Anthony MOI
477037fd6b Python - Improve AddedToken repr 2020-04-01 17:25:55 -04:00
Anthony MOI
b055b77b54 Python - Add first tests: Tokenizer 2020-04-01 17:25:55 -04:00
Anthony MOI
f15c088cf3 Python - Hotfix typing import 2020-04-01 11:35:07 -04:00