Merge pull request #989 from huggingface/mishig25-patch-2

Update pipeline.mdx
2025-12-03 11:18:29 +00:00 · 2022-04-25 21:03:52 +02:00
parent 0bd4976dba 00132ba836
commit 6533bf0fad
1 changed files with 2 additions and 2 deletions
--- a/docs/source-doc-builder/pipeline.mdx
+++ b/docs/source-doc-builder/pipeline.mdx
@@ -520,7 +520,7 @@ On top of encoding the input texts, a `Tokenizer` also has an API for decoding,
 generated by your model back to a text. This is done by the methods
 `Tokenizer.decode` (for one predicted text) and `Tokenizer.decode_batch` (for a batch of predictions).

-The [decoder]{.title-ref} will first convert the IDs back to tokens
+The `decoder` will first convert the IDs back to tokens
 (using the tokenizer's vocabulary) and remove all special tokens, then
 join those tokens with spaces:

@@ -556,7 +556,7 @@ join those tokens with spaces:

 If you used a model that added special characters to represent subtokens
 of a given "word" (like the `"##"` in
-WordPiece) you will need to customize the [decoder]{.title-ref} to treat
+WordPiece) you will need to customize the `decoder` to treat
 them properly. If we take our previous `bert_tokenizer` for instance the
 default decoding will give: