Anthony MOI | c02d4e2202 | Python - Improve AddedToken interface | 2020-06-19 17:53:46 -04:00
Anthony MOI | a14cd7b219 | Python - Bump version to 0.8.0.rc2 for release | 2020-06-19 10:48:53 -04:00
Anthony MOI | 898a4a812e | Python - Make AddedToken pickable | 2020-06-19 10:34:11 -04:00
Anthony MOI | 63edb95130 | Python - Update AddedToken repr | 2020-06-19 10:18:55 -04:00
Anthony MOI | 4c7a0ff4ec | Update CHANGELOGs | 2020-06-18 14:50:16 -04:00
Anthony MOI | fc63d56eab | AddedVocabulary - Add tests, update bindings + various tweaks | 2020-06-18 14:50:16 -04:00
Anthony MOI | c6f633eb1c | Rust - Fix/Tweak AddedVocabulary + Fix python tests | 2020-06-16 14:42:53 -04:00
Anthony MOI | 397cc539da | Rust - Add AddedVocabulary + normalized option on AddedToken | 2020-06-16 14:42:53 -04:00
Anthony MOI | fb964adfdb | Python - Bump version to 0.8.0.rc1 for release | 2020-06-11 14:24:34 -04:00
Anthony MOI | 847651445e | Fix build-wheels.sh script for manylinux wheels. Before this change, we added the `.so` files from previous version in the `.whl` files of later versions. Fix #301 | 2020-06-11 12:43:40 -04:00
Anthony MOI | 433a311887 | Update CHANGELOGs | 2020-06-09 17:33:41 -04:00
Anthony MOI | 794759b56d | Python - Improve truncation/padding management | 2020-06-09 17:33:41 -04:00
Anthony MOI | d00ac60162 | Update changelogs and bump version for python release | 2020-06-03 18:27:49 -04:00
Morgan Funtowicz | fcb4e76d9b | Ensure pad_to_multiple_of is correctly forwarded in base_tokenizer.py | 2020-05-31 10:02:59 +02:00
Anthony MOI | 0934fe5803 | Python - Bindings for pad_to_multiple_of | 2020-05-29 20:34:41 -04:00
Anthony MOI | 2a0f2337db | Python - Update CHANGELOG and bump version to 0.8.0.dev1 for release | 2020-05-27 14:22:00 -04:00
Anthony MOI | c205afe7a5 | Python - Also allow creating Tokenizer from_buffer | 2020-05-27 13:46:37 -04:00
Anthony MOI | 0e890d0d05 | Update CHANGELOGs | 2020-05-27 13:46:37 -04:00
Anthony MOI | de9feae0b5 | Python - Make Encoding pickable | 2020-05-27 13:46:37 -04:00
Anthony MOI | c5bba91bf4 | Python - Test and fix classes pickling | 2020-05-27 13:46:37 -04:00
Anthony MOI | 6a70162d78 | Python - Make all relevant classes pickable | 2020-05-27 13:46:37 -04:00
Anthony MOI | 93bb82c657 | Update READMEs and CHANGELOGs | 2020-05-27 13:32:20 -04:00
Anthony MOI | b24904513c | Update READMEs and CHANGELOGs | 2020-05-27 13:12:46 -04:00
Anthony MOI | 85c7c94809 | Python - Add to/from str and files for Tokenizer | 2020-05-27 13:07:53 -04:00
Anthony MOI | cffcbb95fc | Rust - serialization fixes + loading/saving methods | 2020-05-27 13:07:53 -04:00
Anthony MOI | c800813bbe | Python - Add Tokenizer saving capability | 2020-05-27 13:07:52 -04:00
Anthony MOI | 2b17d4221c | Python - Restore custom PyDecoder and PyPreTokenizer | 2020-05-27 13:07:52 -04:00
Anthony MOI | 07fb3283f4 | Python - Disable custom Decoder/PreTokenizer for now | 2020-05-27 13:07:52 -04:00
Anthony MOI | 400d9545fd | Update rust toolchain for now | 2020-05-21 19:15:40 -04:00
Anthony MOI | 5a01792413 | Python - Update CHANGELOGs and bump to 0.8.0-dev for release | 2020-05-21 18:57:02 -04:00
Anthony MOI | 7ad3bda369 | Merge pull request #249 from huggingface/pre-tokenized: Allow pre-tokenized inputs to encode/encode_batch | 2020-05-21 18:39:46 -04:00
Anthony MOI | 8cb4ca72b6 | Python - Update dependencies | 2020-05-20 19:55:14 -04:00
Anthony MOI | 30216190e5 | Python - Improve typings for new encode/encode_batch | 2020-05-01 17:11:55 -04:00
Anthony MOI | 3fb8033770 | Python - Improve tests for new encode/encode_batch | 2020-05-01 17:11:55 -04:00
Anthony MOI | efaa6f589a | Python - Improve encode/encode_batch | 2020-05-01 17:11:54 -04:00
Anthony MOI | dbc8e68c68 | Python - Update tests for new encode | 2020-05-01 17:11:54 -04:00
Anthony MOI | 2e105c4258 | Python - Update typings for new encode | 2020-05-01 17:11:54 -04:00
Anthony MOI | 835f08ab02 | Python - Update bindings for new encode | 2020-05-01 17:11:54 -04:00
Anthony MOI | 02cc97756f | Rust - Improve TruncationError | 2020-04-24 12:13:17 -04:00
Anthony MOI | 7d2b59b0aa | Rust - Add len() and is_empty() on Encoding | 2020-04-24 11:44:10 -04:00
jaymody | a28fd29204 | Python - Fix bug in bert wordpiece example script | 2020-04-18 17:50:52 -04:00
Anthony MOI | 670f619ab5 | Python - bump to 0.7.0 for final release | 2020-04-17 12:48:10 -04:00
Anthony MOI | 3312ad75d9 | Python - Bump to 0.7.0rc6 for release | 2020-04-16 19:39:04 -04:00
Anthony MOI | ad0e488998 | Python - Update changelog | 2020-04-16 19:32:54 -04:00
Anthony MOI | 249a282f1d | Python - Fix style | 2020-04-16 19:31:00 -04:00
Thomas Wolf | 77590b9291 | style! | 2020-04-17 01:29:52 +02:00
Thomas Wolf | 7216486686 | Update CharLevelBPE | 2020-04-17 01:15:02 +02:00
Anthony MOI | 873ac2d9a8 | Python - Add missing char_to_word | 2020-04-16 18:20:30 -04:00
Anthony MOI | bdfb02f473 | Python - Bump to 0.7.0rc6 for release | 2020-04-16 14:42:22 -04:00
Anthony MOI | 8834508547 | Update CHANGELOGs | 2020-04-16 14:25:19 -04:00