Update CHANGELOGs
@@ -28,6 +28,8 @@ whitespaces are part of the actual token.
   It has been added to `ByteLevelBPETokenizer` but it is off by default (`trim_offsets=False`).
 - More alignment mappings on the `Encoding`.
 - `post_process` can be called on the `Tokenizer`
+- [#208]: Ability to retrieve the vocabulary from the `Tokenizer` with
+  `get_vocab(with_added_tokens: bool)`
 
 ### Fixed
 - [#193]: Fix some issues with the offsets being wrong with the `ByteLevel` BPE:
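The hunk above (apparently from the Python bindings changelog) adds `get_vocab(with_added_tokens: bool)` next to the existing `trim_offsets` flag on `ByteLevelBPETokenizer`. A minimal sketch of how the two might be used together; the untrained tokenizer and the `<custom>` token are assumptions for illustration, not from the commit:

```python
from tokenizers import ByteLevelBPETokenizer

# trim_offsets=False (the default) keeps the leading whitespace that is
# part of a byte-level token inside the reported offsets.
tokenizer = ByteLevelBPETokenizer(trim_offsets=False)

# "<custom>" is a hypothetical added token, used only to show the flag.
tokenizer.add_tokens(["<custom>"])

with_added = tokenizer.get_vocab(with_added_tokens=True)
without_added = tokenizer.get_vocab(with_added_tokens=False)
print("<custom>" in with_added)      # True
print("<custom>" in without_added)   # False
```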
@@ -148,6 +150,7 @@ delimiter (Works like `.split(delimiter)`)
 - Fix a bug with the IDs associated with added tokens.
 - Fix a bug that was causing crashes in Python 3.5
 
+[#208]: https://github.com/huggingface/tokenizers/pull/208
 [#197]: https://github.com/huggingface/tokenizers/pull/197
 [#193]: https://github.com/huggingface/tokenizers/pull/193
 [#190]: https://github.com/huggingface/tokenizers/pull/190
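The `Fixed` entries in this hunk are one-liners, but the added-token ID fix is easy to sanity-check. A small sketch (the token names and the untrained tokenizer are assumptions) verifying that added-token IDs round-trip:

```python
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.add_tokens(["<special-1>", "<special-2>"])

# Each added token should resolve to a stable ID, and that ID should
# map back to the same token.
for token in ["<special-1>", "<special-2>"]:
    token_id = tokenizer.token_to_id(token)
    assert tokenizer.id_to_token(token_id) == token
```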
@@ -30,6 +30,7 @@ the unintuitive inclusion of the whitespaces in the produced offsets, even if th
 part of the actual token
 - More alignment mappings on the `Encoding`.
 - `post_process` can be called on the `Tokenizer`
+- [#208]: Ability to retrieve the vocabulary from the `Tokenizer` & `Model`
 
 ### Fixed
 - [#193]: Fix some issues with the offsets being wrong with the `ByteLevel` BPE:
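This hunk appears to target the Rust library's changelog, but the same `post_process` entry point is exposed through the Python bindings. A minimal sketch of calling it directly on the `Tokenizer`; the hand-built toy vocabulary and its IDs are assumptions:

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.processors import BertProcessing

# Toy vocabulary; a real setup would load a trained model instead.
vocab = {"[CLS]": 0, "[SEP]": 1, "hello": 2, "world": 3, "[UNK]": 4}
tokenizer = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
tokenizer.post_processor = BertProcessing(("[SEP]", 1), ("[CLS]", 0))

# Encode without special tokens, then apply post-processing explicitly.
encoding = tokenizer.encode("hello world", add_special_tokens=False)
processed = tokenizer.post_process(encoding)
print(processed.tokens)  # ['[CLS]', 'hello', 'world', '[SEP]']
```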
@@ -53,6 +54,7 @@ advised, but that's not the question)
 split up in multiple bytes
 - [#174]: The `LongestFirst` truncation strategy had a bug
 
+[#208]: https://github.com/huggingface/tokenizers/pull/208
 [#197]: https://github.com/huggingface/tokenizers/pull/197
 [#193]: https://github.com/huggingface/tokenizers/pull/193
 [#190]: https://github.com/huggingface/tokenizers/pull/190
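The `LongestFirst` strategy named in this hunk trims whichever of the two sequences is currently longer, one token at a time, so neither side gets emptied out. A sketch of that behavior through the Python bindings, using a toy word-level vocab (an assumption, not from the changelog):

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace

vocab = {"a": 0, "b": 1, "[UNK]": 2}
tokenizer = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# longest_first repeatedly drops a token from the longer sequence until
# the pair fits within max_length.
tokenizer.enable_truncation(max_length=4, strategy="longest_first")

encoding = tokenizer.encode("a a a a a", "b b")
print(encoding.tokens)  # ['a', 'a', 'b', 'b']
```

Here the five-token first sequence is trimmed down to two tokens before the two-token pair is touched, which is exactly the balancing the strategy is meant to provide.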