Anthony MOI
|
c205afe7a5
|
Python - Also allow creating Tokenizer from_buffer
|
2020-05-27 13:46:37 -04:00 |
|
Anthony MOI
|
0e890d0d05
|
Update CHANGELOGs
|
2020-05-27 13:46:37 -04:00 |
|
Anthony MOI
|
de9feae0b5
|
Python - Make Encoding pickable
|
2020-05-27 13:46:37 -04:00 |
|
Anthony MOI
|
c5bba91bf4
|
Python - Test and fix classes pickling
|
2020-05-27 13:46:37 -04:00 |
|
Anthony MOI
|
6a70162d78
|
Python - Make all relevant classes pickable
|
2020-05-27 13:46:37 -04:00 |
|
Anthony MOI
|
93bb82c657
|
Update READMEs and CHANGELOGs
|
2020-05-27 13:32:20 -04:00 |
|
Anthony MOI
|
b24904513c
|
Update READMEs and CHANGELOGs
|
2020-05-27 13:12:46 -04:00 |
|
Anthony MOI
|
85c7c94809
|
Python - Add to/from str and files for Tokenizer
|
2020-05-27 13:07:53 -04:00 |
|
Anthony MOI
|
cffcbb95fc
|
Rust - serialization fixes + loading/saving methods
|
2020-05-27 13:07:53 -04:00 |
|
Anthony MOI
|
c800813bbe
|
Python - Add Tokenizer saving capability
|
2020-05-27 13:07:52 -04:00 |
|
Anthony MOI
|
2b17d4221c
|
Python - Restore custom PyDecoder and PyPreTokenizer
|
2020-05-27 13:07:52 -04:00 |
|
Anthony MOI
|
07fb3283f4
|
Python - Disable custom Decoder/PreTokenizer for now
|
2020-05-27 13:07:52 -04:00 |
|
Anthony MOI
|
400d9545fd
|
Update rust toolchain for now
|
2020-05-21 19:15:40 -04:00 |
|
Anthony MOI
|
5a01792413
|
Python - Update CHANGELOGs and bump to 0.8.0-dev for release
|
2020-05-21 18:57:02 -04:00 |
|
Anthony MOI
|
7ad3bda369
|
Merge pull request #249 from huggingface/pre-tokenized
Allow pre-tokenized inputs to encode/encode_batch
|
2020-05-21 18:39:46 -04:00 |
|
Anthony MOI
|
8cb4ca72b6
|
Python - Update dependencies
|
2020-05-20 19:55:14 -04:00 |
|
Anthony MOI
|
30216190e5
|
Python - Improve typings for new encode/encode_batch
|
2020-05-01 17:11:55 -04:00 |
|
Anthony MOI
|
3fb8033770
|
Python - Improve tests for new encode/encode_batch
|
2020-05-01 17:11:55 -04:00 |
|
Anthony MOI
|
efaa6f589a
|
Python - Improve encode/encode_batch
|
2020-05-01 17:11:54 -04:00 |
|
Anthony MOI
|
dbc8e68c68
|
Python - Update tests for new encode
|
2020-05-01 17:11:54 -04:00 |
|
Anthony MOI
|
2e105c4258
|
Python - Update typings for new encode
|
2020-05-01 17:11:54 -04:00 |
|
Anthony MOI
|
835f08ab02
|
Python - Update bindings for new encode
|
2020-05-01 17:11:54 -04:00 |
|
Anthony MOI
|
02cc97756f
|
Rust - Improve TruncationError
|
2020-04-24 12:13:17 -04:00 |
|
Anthony MOI
|
7d2b59b0aa
|
Rust - Add len() and is_empty() on Encoding
|
2020-04-24 11:44:10 -04:00 |
|
jaymody
|
a28fd29204
|
Python - Fix bug in bert wordpiece example script
|
2020-04-18 17:50:52 -04:00 |
|
Anthony MOI
|
670f619ab5
|
Python - bump to 0.7.0 for final release
|
2020-04-17 12:48:10 -04:00 |
|
Anthony MOI
|
3312ad75d9
|
Python - Bump to 0.7.0rc6 for release
|
2020-04-16 19:39:04 -04:00 |
|
Anthony MOI
|
ad0e488998
|
Python - Update changelog
|
2020-04-16 19:32:54 -04:00 |
|
Anthony MOI
|
249a282f1d
|
Python - Fix style
|
2020-04-16 19:31:00 -04:00 |
|
Thomas Wolf
|
77590b9291
|
style!
|
2020-04-17 01:29:52 +02:00 |
|
Thomas Wolf
|
7216486686
|
Update CharLevelBPE
|
2020-04-17 01:15:02 +02:00 |
|
Anthony MOI
|
873ac2d9a8
|
Python - Add missing char_to_word
|
2020-04-16 18:20:30 -04:00 |
|
Anthony MOI
|
bdfb02f473
|
Python - Bump to 0.7.0rc6 for release
|
2020-04-16 14:42:22 -04:00 |
|
Anthony MOI
|
8834508547
|
Update CHANGELOGs
|
2020-04-16 14:25:19 -04:00 |
|
Anthony MOI
|
71b7830d1b
|
Rust | Python | Node - Also add char_to_word
|
2020-04-16 14:23:37 -04:00 |
|
Anthony MOI
|
c5e22c14cb
|
Python - Improve mappings on Encoding
|
2020-04-16 14:23:37 -04:00 |
|
Anthony MOI
|
c96c4d95bd
|
Update CHANGELOGs
|
2020-04-16 10:34:34 -04:00 |
|
Anthony MOI
|
81e2cc2fc4
|
Python - Add offsets trimming to RobertaProcessing
|
2020-04-15 18:49:38 -04:00 |
|
Sławomir Dadas
|
0865a9ad55
|
Python - improve compatibility with sentencepiece in the conversion script
|
2020-04-11 17:35:50 +02:00 |
|
Anthony MOI
|
09104afd07
|
Python - Bump to 0.7.0-rc5 for release
|
2020-04-09 11:41:10 -04:00 |
|
Anthony MOI
|
a6c33f5de8
|
Python - update some dependencies
|
2020-04-09 10:56:26 -04:00 |
|
Anthony MOI
|
d6326a61c1
|
Python - Use PyO3 0.9.2
|
2020-04-09 10:21:05 -04:00 |
|
Anthony MOI
|
3ad1360210
|
Word indices are None for special tokens
|
2020-04-09 09:52:02 -04:00 |
|
Anthony MOI
|
1b9ead7ca2
|
Python - Try PyO3 master to fix build
|
2020-04-08 16:06:24 -04:00 |
|
Anthony MOI
|
b8daeae24a
|
Python - Force PyO3 to 0.9.0 for now
cf https://github.com/PyO3/pyo3/issues/857
|
2020-04-08 15:45:15 -04:00 |
|
Anthony MOI
|
9f3de61f07
|
Python - revert to PyO3 0.9.0 for now
cf https://github.com/PyO3/pyo3/issues/857
|
2020-04-08 15:27:49 -04:00 |
|
Anthony MOI
|
25afbb5fde
|
Python - Bump to 0.7.0-rc4 for release
|
2020-04-08 14:27:29 -04:00 |
|
Anthony MOI
|
ce637aec63
|
Python - Update README with new API
|
2020-04-08 14:27:29 -04:00 |
|
Anthony MOI
|
39999fba14
|
Update CHANGELOGs before releases
|
2020-04-08 14:04:26 -04:00 |
|
Anthony MOI
|
4cb77ca64c
|
Python - Tweak BPE constructor + add some tests
|
2020-04-08 14:04:26 -04:00 |
|