Python - Update CHANGELOG
@@ -7,10 +7,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]

### Added

- [#519]: Add a `WordLevelTrainer` used to train a `WordLevel` model
- [#533]: Add support for conda builds
- [#542]: Add a `Split` pre-tokenizer to easily split using a pattern
- [#544]: Ability to train from memory. This also improves the integration with `datasets` (see the sketch after this list)
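
Taken together, the entries above allow building and training a `WordLevel` model entirely in memory. A minimal sketch, assuming the Python API of this release; the exact keyword arguments shown (`unk_token`, `pattern`, `behavior`, `special_tokens`) are assumptions based on the library's other components:

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Split
from tokenizers.trainers import WordLevelTrainer

# A WordLevel model, pre-tokenized with the new Split pre-tokenizer ([#542]):
# split on single spaces and drop the delimiter itself.
tokenizer = Tokenizer(WordLevel(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Split(pattern=" ", behavior="removed")

# [#519] adds WordLevelTrainer; [#544] lets training consume any in-memory
# iterator of strings, e.g. a text column from a `datasets` dataset.
corpus = ["hello world", "hello tokenizers", "train from memory"]
trainer = WordLevelTrainer(special_tokens=["[UNK]"])
tokenizer.train_from_iterator(corpus, trainer=trainer)

print(tokenizer.encode("hello world").tokens)  # expected: ['hello', 'world']
```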

### Changed

- [#509]: Automatically stubbing the `.pyi` files
- [#519]: Each `Model` can return its associated `Trainer` with `get_trainer()`
- [#530]: The various attributes on each component can be get/set (e.g. `tokenizer.model.dropout = 0.1`); see the sketch after this list
- [#538]: The API Reference has been improved and is now up-to-date.
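
A short sketch of what [#519] and [#530] enable together. The `dropout` attribute on the `BPE` model is the example the entry itself gives; that `get_trainer()` returns a `BpeTrainer` here is an assumption:

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))

# [#530]: component attributes are now readable and writable in place.
tokenizer.model.dropout = 0.1
print(tokenizer.model.dropout)  # 0.1

# [#519]: a Model hands back the Trainer suited to train it, so generic
# code no longer has to pick the right Trainer class by hand.
trainer = tokenizer.model.get_trainer()
print(type(trainer).__name__)  # expected: BpeTrainer
```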

### Fixed

- [#519]: During training, the `Model` is now trained in-place. This fixes several bugs that were forcing a reload of the `Model` after training (see the sketch below).
- [#539]: Fix the `BaseTokenizer` `enable_truncation` docstring
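
Because of the in-place fix in [#519], a trained vocabulary is usable on the same objects right away, with no reload step. A minimal sketch, reusing the assumed training API from above:

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import WordLevelTrainer

tokenizer = Tokenizer(WordLevel(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Training mutates the existing Model in place: no need to rebuild the
# Tokenizer or reload the Model afterwards, as older releases required.
tokenizer.train_from_iterator(["a b", "a c a"],
                              trainer=WordLevelTrainer(special_tokens=["[UNK]"]))
print(tokenizer.get_vocab_size())  # the new vocabulary is live immediately
```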

## [0.9.4]

@@ -278,8 +290,14 @@ delimiter (Works like `.split(delimiter)`)

- Fix a bug that was causing crashes in Python 3.5

[#544]: https://github.com/huggingface/tokenizers/pull/544
[#542]: https://github.com/huggingface/tokenizers/pull/542
[#539]: https://github.com/huggingface/tokenizers/pull/539
[#538]: https://github.com/huggingface/tokenizers/pull/538
[#533]: https://github.com/huggingface/tokenizers/pull/533
[#530]: https://github.com/huggingface/tokenizers/pull/530
[#519]: https://github.com/huggingface/tokenizers/pull/519
[#509]: https://github.com/huggingface/tokenizers/pull/509
[#506]: https://github.com/huggingface/tokenizers/pull/506
[#500]: https://github.com/huggingface/tokenizers/pull/500
[#498]: https://github.com/huggingface/tokenizers/pull/498