Fixing broken link. (#1268)
@@ -73,7 +73,7 @@ Training the tokenizer
 
 In this tour, we will build and train a Byte-Pair Encoding (BPE) tokenizer. For more information
 about the different type of tokenizers, check out this `guide
-<https://huggingface.co/transformers/tokenizer_summary.html>`__ in the 🤗 Transformers
+<https://huggingface.co/docs/transformers/main/en/tokenizer_summary#summary-of-the-tokenizers>`__ in the 🤗 Transformers
 documentation. Here, training the tokenizer means it will learn merge rules by:
 
 - Start with all the characters present in the training corpus as tokens.
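For context, the hunk above sits in the quicktour's "Training the tokenizer" walkthrough. As a reference for what that section goes on to build, here is a minimal sketch of training a BPE tokenizer with the library's Python bindings; the corpus file paths and the saved filename are placeholders, not part of the commit.

from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Start from an untrained BPE model; training will learn the merge rules.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

trainer = BpeTrainer(special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"])

# Placeholder corpus: any list of plain-text files works here.
files = ["data/corpus-1.txt", "data/corpus-2.txt"]
tokenizer.train(files, trainer)

tokenizer.save("tokenizer.json")
print(tokenizer.encode("Hello, y'all!").tokens)

Training starts from the individual characters seen in the corpus, as the bullet in the hunk describes, and then iteratively merges the most frequent token pairs into new tokens.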
|