Commit Graph

2 Commits

Author SHA1 Message Date
275ee6d4c4 Making convert script machine-agnostic. 2020-09-22 08:21:38 +02:00
2fd1d9cf06 Adding a new convert script that converts all Python Tokenizer code
into a proper Rust Tokenizer format and checks it on a file.

- Also fuse_unks by default in `tokenizers`'s BPE.
2020-09-22 08:21:38 +02:00