Python - Update CHANGELOG

2025-08-22 16:25:30 +00:00 · 2020-11-28 12:42:37 -05:00
parent 49bd055519
commit 5549fc4837
1 changed files with 19 additions and 1 deletions
--- a/bindings/python/CHANGELOG.md
+++ b/bindings/python/CHANGELOG.md
@@ -7,10 +7,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]

 ### Added
+- [#519]: Add a `WordLevelTrainer` used to train a `WordLevel` model
+- [#533]: Add support for conda builds
 - [#542]: Add Split pre-tokenizer to easily split using a pattern
+- [#544]: Ability to train from memory. This also improves the integration with `datasets`

 ### Changed
-- [#530]: The various attributes on each component can be get/set
+- [#509]: The `.pyi` stub files are now generated automatically
+- [#519]: Each `Model` can return its associated `Trainer` with `get_trainer()`
+- [#530]: The various attributes on each component can be get/set (e.g.
+`tokenizer.model.dropout = 0.1`)
+- [#538]: The API Reference has been improved and is now up-to-date.
+
+### Fixed
+- [#519]: During training, the `Model` is now trained in-place. This fixes several bugs that were
+forcing users to reload the `Model` after training.
+- [#539]: Fix the `BaseTokenizer.enable_truncation` docstring

 ## [0.9.4]

@@ -278,8 +290,14 @@ delimiter (Works like `.split(delimiter)`)
 - Fix a bug that was causing crashes in Python 3.5


+[#544]: https://github.com/huggingface/tokenizers/pull/544
 [#542]: https://github.com/huggingface/tokenizers/pull/542
+[#539]: https://github.com/huggingface/tokenizers/pull/539
+[#538]: https://github.com/huggingface/tokenizers/pull/538
+[#533]: https://github.com/huggingface/tokenizers/pull/533
 [#530]: https://github.com/huggingface/tokenizers/pull/530
+[#519]: https://github.com/huggingface/tokenizers/pull/519
+[#509]: https://github.com/huggingface/tokenizers/pull/509
 [#506]: https://github.com/huggingface/tokenizers/pull/506
 [#500]: https://github.com/huggingface/tokenizers/pull/500
 [#498]: https://github.com/huggingface/tokenizers/pull/498