Commit Graph

308 Commits

Author SHA1 Message Date
Anthony MOI
b8daeae24a Python - Force PyO3 to 0.9.0 for now
cf https://github.com/PyO3/pyo3/issues/857
2020-04-08 15:45:15 -04:00
Anthony MOI
9f3de61f07 Python - revert to PyO3 0.9.0 for now
cf https://github.com/PyO3/pyo3/issues/857
2020-04-08 15:27:49 -04:00
Anthony MOI
25afbb5fde Python - Bump to 0.7.0-rc4 for release 2020-04-08 14:27:29 -04:00
Anthony MOI
ce637aec63 Python - Update README with new API 2020-04-08 14:27:29 -04:00
Anthony MOI
39999fba14 Update CHANGELOGs before releases 2020-04-08 14:04:26 -04:00
Anthony MOI
4cb77ca64c Python - Tweak BPE constructor + add some tests 2020-04-08 14:04:26 -04:00
Anthony MOI
be7b345bcd Require Send for all parts of the tokenizer (#222) 2020-04-08 13:35:06 -04:00
Andre Bogus
550413f00a add Send + Sync on all traits, remove elsewhere 2020-04-08 18:43:23 +02:00
Bjarte Johansen
def8333d45 Python - Update changelog 2020-04-06 21:40:23 +02:00
Bjarte Johansen
fab97475e5 Python - Update examples to use new models API 2020-04-06 21:40:23 +02:00
Bjarte Johansen
823066fea9 Python - Update tests to use new models API
- Check that new models have right subclass
2020-04-06 21:40:08 +02:00
Bjarte Johansen
38bc788002 Python - Update implementations to use new API 2020-04-06 21:40:08 +02:00
Bjarte Johansen
69ed81e618 Python - Update types with new models API 2020-04-06 21:40:08 +02:00
Bjarte Johansen
2dc48e56ac Python - Update pyo3 version
* Use __new__ instead of static method as model constructors
2020-04-06 21:20:16 +02:00
Anthony MOI
b03fea1d66 Python - Update workflow and Makefile with tests 2020-04-01 17:36:33 -04:00
Anthony MOI
837791ee1f Python - Test BertWordPieceTokenizer 2020-04-01 17:25:56 -04:00
Anthony MOI
7fd7dfd113 Python - Test CharBPETokenizer 2020-04-01 17:25:56 -04:00
Anthony MOI
dbc23e20a9 Python - Test Models 2020-04-01 17:25:55 -04:00
Anthony MOI
53a7dbdaee Python - Test PostProcessors 2020-04-01 17:25:55 -04:00
Anthony MOI
a9f4c5950a Python - Test Decoders 2020-04-01 17:25:55 -04:00
Anthony MOI
0de9885da8 Python - Test PreTokenizers 2020-04-01 17:25:55 -04:00
Anthony MOI
d6692d4072 Python - Test Normalizers 2020-04-01 17:25:55 -04:00
Anthony MOI
3264ffe235 Python - Improve tests on Tokenizer 2020-04-01 17:25:55 -04:00
Anthony MOI
5ebe687753 Python - Add first implementations tests 2020-04-01 17:25:55 -04:00
Anthony MOI
023566fbbb Python - Add some tests utils 2020-04-01 17:25:55 -04:00
Anthony MOI
477037fd6b Python - Improve AddedToken repr 2020-04-01 17:25:55 -04:00
Anthony MOI
b055b77b54 Python - Add first tests: Tokenizer 2020-04-01 17:25:55 -04:00
Anthony MOI
f15c088cf3 Python - Hotfix typing import 2020-04-01 11:35:07 -04:00
Anthony MOI
2a84ef12cf Python - Add missing get_vocab from BaseTokenizer 2020-04-01 11:32:54 -04:00
Anthony MOI
93a83127ae Bump version for Python release 2020-03-31 14:25:47 -04:00
Morgan Funtowicz
afe9cfe96e Strip should inherit from Normalizer on Python binding.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-31 20:20:09 +02:00
Anthony MOI
279db8537e Update CHANGELOGs 2020-03-27 18:46:20 -04:00
Anthony MOI
5038a7f74e Update CHANGELOGs 2020-03-27 17:49:02 -04:00
Anthony MOI
a2a6d80017 Python - expose get_vocab on Tokenizer 2020-03-27 11:53:18 -04:00
Anthony MOI
e8aec7a624 Bump version for Python release 2020-03-27 09:17:35 -04:00
Morgan Funtowicz
bc46064a61 Added missing imports for AddedToken 2020-03-27 12:32:23 +01:00
Anthony MOI
b132be34af Bump version for Python release 2020-03-26 17:26:14 -04:00
Anthony MOI
6c232886b8 Improve & update CHANGELOGs 2020-03-26 17:08:42 -04:00
Anthony MOI
4341c79d85 Python - last fixes on Encoding bindings/typings 2020-03-26 15:42:45 -04:00
Anthony MOI
14e3ab3787 Python - fix style 2020-03-26 15:42:45 -04:00
Morgan Funtowicz
39958a2f0f TokenizedSequence / TokenizedSequenceWithOffsets needs to be declared in .py files not only .pyi 2020-03-26 15:42:45 -04:00
Morgan Funtowicz
68405a6fae Forward type_id in encode_tokenized/encode_tokenized_batch python binding. 2020-03-26 15:42:45 -04:00
Anthony MOI
9bd9e0b3c1 Expose post_process on the Tokenizer 2020-03-26 15:42:45 -04:00
Anthony MOI
9ce895550b Add some new merging capability on Encoding 2020-03-26 15:42:44 -04:00
Anthony MOI
eec74ca3e6 Python - Add Model.encode_batch and improve typings 2020-03-26 15:42:44 -04:00
Anthony MOI
1150751ab6 Python - Update mappings API 2020-03-26 15:42:44 -04:00
Anthony MOI
a397a1da63 Python - Expose encode method on Model 2020-03-26 15:42:44 -04:00
Anthony MOI
8de6ef5a37 Python - Bind new Encoding's mappings 2020-03-26 15:42:44 -04:00
Anthony MOI
e8925a33da Python - remove add_special_tokens from BertWordPieceTokenizer init 2020-03-26 14:19:37 -04:00
Anthony MOI
f8d54edcdd Python - Fix cases where str expected instead of AddedToken 2020-03-25 19:22:53 -04:00