mirror of
https://github.com/mii443/tokenizers.git
synced 2025-12-17 01:28:46 +00:00
Update CHANGELOGs
This commit is contained in:
@@ -1,6 +1,9 @@
|
||||
# v0.6.0 (not published yet)
|
||||
|
||||
Fixes:
|
||||
## Changes:
|
||||
- Big improvements in speed for BPE (Both training and tokenization) ([#165](https://github.com/huggingface/tokenizers/pull/165))
|
||||
|
||||
## Fixes:
|
||||
- Some default tokens were missing from `BertWordPieceTokenizer` (cf [#160](https://github.com/huggingface/tokenizers/issues/160))
|
||||
- There was a bug in ByteLevel PreTokenizer that caused offsets to be wrong if a char got split up
|
||||
in multiple bytes. (cf [#156](https://github.com/huggingface/tokenizers/pull/156))
|
||||
|
||||
Reference in New Issue
Block a user