mirror of
https://github.com/mii443/tokenizers.git
synced 2025-08-22 16:25:30 +00:00
* Fix typos Signed-off-by: tinyboxvk <13696594+tinyboxvk@users.noreply.github.com> * Update docs/source/quicktour.rst * Update docs/source-doc-builder/quicktour.mdx --------- Signed-off-by: tinyboxvk <13696594+tinyboxvk@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
83 lines
1.7 KiB
Plaintext
83 lines
1.7 KiB
Plaintext
# Installation
|
|
|
|
<tokenizerslangcontent>
|
|
<python>
|
|
🤗 Tokenizers is tested on Python 3.5+.
|
|
|
|
You should install 🤗 Tokenizers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're
|
|
unfamiliar with Python virtual environments, check out the [user
|
|
guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
|
|
Create a virtual environment with the version of Python you're going to
|
|
use and activate it.
|
|
|
|
## Installation with pip
|
|
|
|
🤗 Tokenizers can be installed using pip as follows:
|
|
|
|
```bash
|
|
pip install tokenizers
|
|
```
|
|
|
|
## Installation from sources
|
|
|
|
To use this method, you need to have the Rust language installed. You
|
|
can follow [the official
|
|
guide](https://www.rust-lang.org/learn/get-started) for more
|
|
information.
|
|
|
|
If you are using a unix based OS, the installation should be as simple
|
|
as running:
|
|
|
|
```bash
|
|
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
|
|
```
|
|
|
|
Or you can easily update it with the following command:
|
|
|
|
```bash
|
|
rustup update
|
|
```
|
|
|
|
Once rust is installed, we can start retrieving the sources for 🤗
|
|
Tokenizers:
|
|
|
|
```bash
|
|
git clone https://github.com/huggingface/tokenizers
|
|
```
|
|
|
|
Then we go into the python bindings folder:
|
|
|
|
```bash
|
|
cd tokenizers/bindings/python
|
|
```
|
|
|
|
At this point you should have your [virtual environment]() already
|
|
activated. In order to compile 🤗 Tokenizers, you need to:
|
|
|
|
```bash
|
|
pip install -e .
|
|
```
|
|
|
|
</python>
|
|
<rust>
|
|
## Crates.io
|
|
|
|
🤗 Tokenizers is available on [crates.io](https://crates.io/crates/tokenizers).
|
|
|
|
You just need to add it to your `Cargo.toml`:
|
|
|
|
```bash
|
|
cargo add tokenizers
|
|
```
|
|
</rust>
|
|
<node>
|
|
## Installation with npm
|
|
|
|
You can simply install 🤗 Tokenizers with npm using:
|
|
|
|
```bash
|
|
npm install tokenizers
|
|
```
|
|
</node>
|
|
</tokenizerslangcontent>
|