Update docs for from_pretrained
@@ -3,6 +3,7 @@ from setuptools_rust import Binding, RustExtension
 
 extras = {}
 extras["testing"] = ["pytest", "requests", "numpy", "datasets"]
+extras["docs"] = ["sphinx", "sphinx_rtd_theme", "setuptools_rust"]
 
 setup(
     name="tokenizers",
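For context, this `extras` dict is the standard setuptools extras mechanism. A minimal sketch of how such a dict is typically wired into `setup()` follows; the hunk only shows the dict itself, so the `extras_require` line below is an assumption about the surrounding setup.py, not the verbatim file:

.. code-block:: python

    # Minimal sketch, assuming the extras dict is passed via extras_require;
    # the hunk above does not show the actual setup() call details.
    from setuptools import setup

    extras = {}
    extras["testing"] = ["pytest", "requests", "numpy", "datasets"]
    extras["docs"] = ["sphinx", "sphinx_rtd_theme", "setuptools_rust"]

    setup(
        name="tokenizers",
        extras_require=extras,  # enables e.g. `pip install tokenizers[docs]`
    )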
@@ -706,10 +706,22 @@ In this case, the `attention mask` generated by the tokenizer takes the padding
 .. only:: python
 
     Using a pretrained tokenizer
-    ----------------------------------------------------------------------------------------------------
+    ------------------------------------------------------------------------------------------------
 
-    You can also use a pretrained tokenizer directly in, as long as you have its vocabulary file. For
-    instance, here is how to get the classic pretrained BERT tokenizer:
+    You can load any tokenizer from the Hugging Face Hub as long as a `tokenizer.json` file is
+    available in the repository.
+
+    .. code-block:: python
+
+        from tokenizers import Tokenizer
+
+        tokenizer = Tokenizer.from_pretrained("bert-base-uncased")
+
+    Importing a pretrained tokenizer from legacy vocabulary files
+    ------------------------------------------------------------------------------------------------
+
+    You can also import a pretrained tokenizer directly in, as long as you have its vocabulary file.
+    For instance, here is how to import the classic pretrained BERT tokenizer:
 
     .. code-block:: python
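The new docs stop at loading the tokenizer. As a quick usage sketch (the sample sentence and the printed attributes are illustrative additions, not part of the commit), the object returned by `Tokenizer.from_pretrained` can encode text directly:

.. code-block:: python

    from tokenizers import Tokenizer

    # Fetch the tokenizer.json published for this model on the Hugging Face Hub.
    tokenizer = Tokenizer.from_pretrained("bert-base-uncased")

    # The Encoding exposes the tokens, their vocabulary ids, and the attention mask.
    encoding = tokenizer.encode("Hello, y'all! How are you?")
    print(encoding.tokens)          # ['[CLS]', 'hello', ',', 'y', "'", 'all', ...]
    print(encoding.ids)             # corresponding vocabulary ids
    print(encoding.attention_mask)  # 1 for real tokens, 0 for padding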
@@ -722,8 +734,3 @@ In this case, the `attention mask` generated by the tokenizer takes the padding
 .. code-block:: bash
 
     wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
-
-.. note::
-
-    Better support for pretrained tokenizers is coming in a next release, so expect this API to
-    change soon.
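The Python code block that follows this `wget` in the docs is cut off by the hunk. A plausible sketch of the legacy import it refers to, using the `BertWordPieceTokenizer` helper from the `tokenizers` package (the exact class and arguments used in the docs are an assumption here):

.. code-block:: python

    from tokenizers import BertWordPieceTokenizer

    # Build a BERT WordPiece tokenizer from the downloaded legacy vocab file.
    tokenizer = BertWordPieceTokenizer("bert-base-uncased-vocab.txt", lowercase=True)

    encoding = tokenizer.encode("Hello, y'all!")
    print(encoding.tokens)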