Commit Graph

300 Commits

Author SHA1 Message Date
Bjarte Johansen
def8333d45 Python - Update changelog 2020-04-06 21:40:23 +02:00
Bjarte Johansen
fab97475e5 Python - Update examples to use new models API 2020-04-06 21:40:23 +02:00
Bjarte Johansen
823066fea9 Python - Update tests to use new models API
- Check that new models have right subclass
2020-04-06 21:40:08 +02:00
Bjarte Johansen
38bc788002 Python - Update implementations to use new API 2020-04-06 21:40:08 +02:00
Bjarte Johansen
69ed81e618 Python - Update types with new models API 2020-04-06 21:40:08 +02:00
Bjarte Johansen
2dc48e56ac Python - Update pyo3 version
* Use __new__ instead of static method as model constructors
2020-04-06 21:20:16 +02:00
Anthony MOI
b03fea1d66 Python - Update workflow and Makefile with tests 2020-04-01 17:36:33 -04:00
Anthony MOI
837791ee1f Python - Test BertWordPieceTokenizer 2020-04-01 17:25:56 -04:00
Anthony MOI
7fd7dfd113 Python - Test CharBPETokenizer 2020-04-01 17:25:56 -04:00
Anthony MOI
dbc23e20a9 Python - Test Models 2020-04-01 17:25:55 -04:00
Anthony MOI
53a7dbdaee Python - Test PostProcessors 2020-04-01 17:25:55 -04:00
Anthony MOI
a9f4c5950a Python - Test Decoders 2020-04-01 17:25:55 -04:00
Anthony MOI
0de9885da8 Python - Test PreTokenizers 2020-04-01 17:25:55 -04:00
Anthony MOI
d6692d4072 Python - Test Normalizers 2020-04-01 17:25:55 -04:00
Anthony MOI
3264ffe235 Python - Improve tests on Tokenizer 2020-04-01 17:25:55 -04:00
Anthony MOI
5ebe687753 Python - Add first implementations tests 2020-04-01 17:25:55 -04:00
Anthony MOI
023566fbbb Python - Add some tests utils 2020-04-01 17:25:55 -04:00
Anthony MOI
477037fd6b Python - Improve AddedToken repr 2020-04-01 17:25:55 -04:00
Anthony MOI
b055b77b54 Python - Add first tests: Tokenizer 2020-04-01 17:25:55 -04:00
Anthony MOI
f15c088cf3 Python - Hotfix typing import 2020-04-01 11:35:07 -04:00
Anthony MOI
2a84ef12cf Python - Add missing get_vocab from BaseTokenizer 2020-04-01 11:32:54 -04:00
Anthony MOI
93a83127ae Bump version for Python release 2020-03-31 14:25:47 -04:00
Morgan Funtowicz
afe9cfe96e Strip should inherit from Normalizer on Python binding.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-31 20:20:09 +02:00
Anthony MOI
279db8537e Update CHANGELOGs 2020-03-27 18:46:20 -04:00
Anthony MOI
5038a7f74e Update CHANGELOGs 2020-03-27 17:49:02 -04:00
Anthony MOI
a2a6d80017 Python - expose get_vocab on Tokenizer 2020-03-27 11:53:18 -04:00
Anthony MOI
e8aec7a624 Bump version for Python release 2020-03-27 09:17:35 -04:00
Morgan Funtowicz
bc46064a61 Added missing imports for AddedToken 2020-03-27 12:32:23 +01:00
Anthony MOI
b132be34af Bump version for Python release 2020-03-26 17:26:14 -04:00
Anthony MOI
6c232886b8 Improve & update CHANGELOGs 2020-03-26 17:08:42 -04:00
Anthony MOI
4341c79d85 Python - last fixes on Encoding bindings/typings 2020-03-26 15:42:45 -04:00
Anthony MOI
14e3ab3787 Python - fix style 2020-03-26 15:42:45 -04:00
Morgan Funtowicz
39958a2f0f TokenizedSequence / TokenizedSequenceWithOffsets needs to be declared in .py files not only .pyi 2020-03-26 15:42:45 -04:00
Morgan Funtowicz
68405a6fae Forward type_id in encode_tokenized/encode_tokenized_batch python binding. 2020-03-26 15:42:45 -04:00
Anthony MOI
9bd9e0b3c1 Expose post_process on the Tokenizer 2020-03-26 15:42:45 -04:00
Anthony MOI
9ce895550b Add some new merging capability on Encoding 2020-03-26 15:42:44 -04:00
Anthony MOI
eec74ca3e6 Python - Add Model.encode_batch and improve typings 2020-03-26 15:42:44 -04:00
Anthony MOI
1150751ab6 Python - Update mappings API 2020-03-26 15:42:44 -04:00
Anthony MOI
a397a1da63 Python - Expose encode method on Model 2020-03-26 15:42:44 -04:00
Anthony MOI
8de6ef5a37 Python - Bind new Encoding's mappings 2020-03-26 15:42:44 -04:00
Anthony MOI
e8925a33da Python - remove add_special_tokens from BertWordPieceTokenizer init 2020-03-26 14:19:37 -04:00
Anthony MOI
f8d54edcdd Python - Fix cases where str expected instead of AddedToken 2020-03-25 19:22:53 -04:00
Anthony MOI
c65d53892d Python - Add bindings for new AddedToken options 2020-03-24 20:58:45 -04:00
Anthony MOI
d953d58cee Rust - Fix offsets when there are added tokens 2020-03-19 12:53:03 -04:00
Anthony MOI
d53de0e2da Python - Expose normalize on BaseTokenizer 2020-03-18 16:44:31 -04:00
Anthony MOI
ae0d330907 Update CHANGELOGs 2020-03-18 16:42:27 -04:00
Anthony MOI
60a4fb35f4 Python - Update bindings 2020-03-16 10:36:42 -04:00
Morgan Funtowicz
505bfbba82 Fix invalid error messages. 2020-03-12 15:38:29 +01:00
Morgan Funtowicz
5ed1f26c71 Throw a more meaningful error when provided python input is None. 2020-03-12 10:59:05 +01:00
Anthony MOI
257360acec Python - encode & encode batch with add_special_tokens 2020-03-10 16:21:10 -04:00