Mirror of https://github.com/mii443/tokenizers.git (synced 2025-12-06 20:58:22 +00:00)
Update CHANGELOGs
@@ -3,6 +3,12 @@
## Changes:
- Keep only one progress bar while reading files during training. This is better for use cases with
  a high number of files, as it avoids having too many progress bars on screen.
- The `add_prefix_space` option of the `ByteLevel` `PreTokenizer` has been moved to a `Normalizer`.

## How to migrate:
- Use the `ByteLevel` `Normalizer` with `add_prefix_space=True` in addition to the `PreTokenizer`.
  The `PreTokenizer` no longer handles this option. This fixes some issues where the offsets
  were wrong when this option was enabled.
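
To illustrate why handling the prefix space at normalization time can keep offsets correct, here is a minimal pure-Python sketch. It is a toy model, not the library's implementation: the functions `add_prefix_space`, `pre_tokenize`, and `to_original_offsets` are hypothetical helpers written for this example, and the alignment list is an assumed way of tracking how normalized characters map back to the original string.

```python
import re


def add_prefix_space(text):
    """Toy normalizer (hypothetical): prepend a space if one is missing.

    Returns the normalized text and an alignment list mapping each
    normalized character to its index in the original text, or None
    for the inserted space.
    """
    if text.startswith(" "):
        return text, list(range(len(text)))
    return " " + text, [None] + list(range(len(text)))


def pre_tokenize(normalized):
    """Toy pre-tokenizer (hypothetical): split on spaces, keeping each
    piece's leading space. Offsets refer to the normalized text."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r" ?\S+", normalized)]


def to_original_offsets(pieces, alignment):
    """Map (start, end) offsets from the normalized text back to the
    original text, skipping characters the normalizer inserted."""
    mapped = []
    for piece, start, end in pieces:
        originals = [alignment[i] for i in range(start, end)
                     if alignment[i] is not None]
        mapped.append((piece, originals[0], originals[-1] + 1))
    return mapped


text = "Hello world"
normalized, alignment = add_prefix_space(text)
pieces = pre_tokenize(normalized)
offsets = to_original_offsets(pieces, alignment)

print(pieces)   # offsets in the normalized text
print(offsets)  # offsets in the original text
```

In this sketch, the first piece spans `(0, 6)` in the normalized text, which would reach one character too far if applied to the original string. Because the normalizer records which character it inserted, the span can be mapped back to `(0, 5)`, which selects `Hello` in the original input.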

# v0.6.0