2 Commits

Author SHA1 Message Date
da4c7b10e4 Add a way to specify the unknown token in SentencePieceUnigramTokenizer python implem (#762)
* add a way to specify the unknown token in `SentencePieceUnigramTokenizer`

* add test that verifies that an exception is raised for the missing unknown token

* style

* add test tokens
2021-08-12 09:42:44 -04:00
d94fa220b6 Python - Add train_from_iterator to implementations 2021-01-07 09:02:20 -05:00