mirror of
https://github.com/mii443/tokenizers.git
synced 2025-08-22 16:25:30 +00:00
Doc - Basic layout - WIP
@@ -27,7 +27,11 @@ Components:

.. toctree::
    :maxdepth: 2
    :caption: Getting Started

    quicktour
    installation
    pipeline
    components

Load an existing tokenizer:

5
docs/source/installation.rst
Normal file
@@ -0,0 +1,5 @@
Installation
====================================================================================================

- How to install using pip
- How to build from source

10
docs/source/pipeline.rst
Normal file
@@ -0,0 +1,10 @@
The tokenization pipeline
====================================================================================================

TODO: Describe the tokenization pipeline:

- Normalization
- Pre-tokenization
- Tokenization
- Post-processing
- Decoding

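The five stages listed in ``pipeline.rst`` can be illustrated with a small, library-free sketch. This is not the implementation used by the ``tokenizers`` library, only a toy under stated assumptions: the vocabulary, the special tokens ``[CLS]``/``[SEP]``/``[UNK]``, and every function name below are hypothetical and chosen for illustration.

```python
import re
import unicodedata

# Hypothetical toy vocabulary; a real tokenizer would learn or load one.
VOCAB = {"[UNK]": 0, "[CLS]": 1, "[SEP]": 2, "hello": 3, "world": 4, "!": 5}
ID_TO_TOKEN = {i: t for t, i in VOCAB.items()}
SPECIAL_TOKENS = {"[CLS]", "[SEP]"}


def normalize(text):
    """Normalization: strip accents (NFD, drop combining marks) and lowercase."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def pre_tokenize(text):
    """Pre-tokenization: split into words and punctuation.

    Each piece is returned with its (start, end) offsets into the
    normalized text, so text[start:end] recovers the piece.
    """
    return [(m.group(), m.span()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def tokenize(pieces):
    """Tokenization: map each piece to an id, falling back to [UNK]."""
    return [VOCAB.get(piece, VOCAB["[UNK]"]) for piece, _ in pieces]


def post_process(ids):
    """Post-processing: wrap the sequence in [CLS] ... [SEP]."""
    return [VOCAB["[CLS]"]] + ids + [VOCAB["[SEP]"]]


def decode(ids):
    """Decoding: map ids back to tokens, skipping the wrapping special tokens."""
    tokens = (ID_TO_TOKEN[i] for i in ids)
    return " ".join(t for t in tokens if t not in SPECIAL_TOKENS)


def run_pipeline(text):
    """Run normalization, pre-tokenization, tokenization and post-processing."""
    pieces = pre_tokenize(normalize(text))
    return post_process(tokenize(pieces))
```

Calling ``run_pipeline`` on a string yields a list of ids, and ``decode`` turns such a list back into text. In this toy, decoding joins tokens with single spaces, so the original spacing and accents are not restored; a real decoder would usually do more.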
4
docs/source/quicktour.rst
Normal file
@@ -0,0 +1,4 @@
Quicktour
====================================================================================================

- How to use a tokenizer: encode, encode_batch, ``Encoding``, offsets, mappings, ...