Commit Graph

2 Commits

Author SHA1 Message Date
1a84958cc8 Fixing bad deserialization following inclusion of a default for Punctuation. (#884)
* Fixing bad deserialization following inclusion of a default for
`Punctuation`.

* don't remove the type now...

* Adding slow test to run on all the tokenizers of the hub.

* `PartialEq` everywhere.

* Forcing `type` to exist on the `pre_tokenizers`.
2022-01-17 22:28:25 +01:00
88556790e7 Fixing a bug where long tokenizer files would be incorrectly deserialized (#459)
* Fixing a bug where long tokenizer files would be incorrectly
deserialized

- Add a bunch of tests to check deserialization behaviour
- One test also confirms current Single deserialization of Sequence.

* Better test locations for Windows + no file dependency in Python binding
Rust side.

* Addressing @n1t0 comments.
2020-10-13 18:44:24 +02:00