|
c59b216baa
|
Fixing convert/check scripts.
|
2020-09-22 08:21:38 +02:00 |
|
|
b16406c900
|
Moving StripAccents within the normalizer for Albert + XLNet, but it now crashes
in Precompiled. Are the offsets wrong?
|
2020-09-22 08:21:38 +02:00 |
|
|
275ee6d4c4
|
Making convert script machine agnostic.
|
2020-09-22 08:21:38 +02:00 |
|
|
2fd1d9cf06
|
Adding a new convert script that converts all Python Tokenizer code
into a proper Rust Tokenizer format and checks it on a file.
- Also fuse_unks by default in `tokenizers`'s BPE.
|
2020-09-22 08:21:38 +02:00 |
|