Fix one char super tiny typo (#1137)
* Update pipeline.mdx
* Update pipeline.rst
pipeline.mdx
@@ -558,7 +558,7 @@ If you used a model that added special characters to represent subtokens
 of a given "word" (like the `"##"` in
 WordPiece) you will need to customize the `decoder` to treat
 them properly. If we take our previous `bert_tokenizer` for instance the
-default decoing will give:
+default decoding will give:
 
 <tokenizerslangcontent>
 <python>
pipeline.rst
@@ -497,7 +497,7 @@ remove all special tokens, then join those tokens with spaces:
 
 If you used a model that added special characters to represent subtokens of a given "word" (like
 the :obj:`"##"` in WordPiece) you will need to customize the `decoder` to treat them properly. If we
-take our previous :entity:`bert_tokenizer` for instance the default decoing will give:
+take our previous :entity:`bert_tokenizer` for instance the default decoding will give:
 
 .. only:: python
 
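For context, the passage this commit fixes describes attaching a WordPiece decoder so that "##" continuation markers get merged back into whole words instead of being joined with spaces. A minimal, self-contained sketch of that customization with the Python tokenizers API, using a hypothetical toy vocabulary in place of the docs' trained bert_tokenizer:

from tokenizers import Tokenizer, decoders, pre_tokenizers
from tokenizers.models import WordPiece

# Hypothetical toy vocab, just large enough to tokenize the example sentence.
vocab = {
    "[UNK]": 0, "welcome": 1, "to": 2, "the": 3,
    "token": 4, "##izer": 5, "##s": 6, "library": 7,
}
bert_tokenizer = Tokenizer(WordPiece(vocab, unk_token="[UNK]"))
bert_tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

output = bert_tokenizer.encode("welcome to the tokenizers library")
print(output.tokens)
# ['welcome', 'to', 'the', 'token', '##izer', '##s', 'library']

# With no decoder configured, decoding just joins the tokens with spaces,
# so the "##" subtoken markers leak into the text:
print(bert_tokenizer.decode(output.ids))
# "welcome to the token ##izer ##s library"

# Attaching a WordPiece decoder merges "##"-prefixed subtokens back
# into their original words:
bert_tokenizer.decoder = decoders.WordPiece()
print(bert_tokenizer.decode(output.ids))
# "welcome to the tokenizers library"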