* fix documentation regarding regex: Split() in pre_tokenizers.rs and the normalizers take a regex that must be built with the tokenizer-specific regex module. Clarify this in the documentation.

* Update __init__.pyi

fixed __init__.pyi

* Update bindings/python/py_src/tokenizers/__init__.pyi

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update bindings/python/py_src/tokenizers/__init__.pyi

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Revert "Update bindings/python/py_src/tokenizers/__init__.pyi"

This reverts commit 6e8bdfcddf67bcdd8e3b1a78685fd5ef8f6a153c.

* Revert "Update bindings/python/py_src/tokenizers/__init__.pyi"

This reverts commit 897b0c0de471ad7cb6269b8456347c4e5cff2aaf.

* Revert "Update __init__.pyi"

This reverts commit fbe82310b7728ee7cdb6f8b38fbc2388f9d95771.

* add codeblocks the right way

* add codeblocks with stub.py: ran setup.py install to build, then ran stub.py to regenerate the stubs

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
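
For context, a minimal sketch of the behavior this documentation change clarifies, assuming the public tokenizers Python API (tokenizers.Regex, pre_tokenizers.Split, normalizers.Replace): a pattern meant to act as a regex must be wrapped in tokenizers.Regex; a plain string is matched literally.

```python
# Minimal sketch, assuming the public tokenizers Python API:
# tokenizers.Regex, pre_tokenizers.Split, normalizers.Replace.
from tokenizers import Regex
from tokenizers.pre_tokenizers import Split
from tokenizers.normalizers import Replace

# A regex pattern must be built with tokenizers.Regex,
# not with Python's `re` module.
pre_tokenizer = Split(pattern=Regex(r"\s+"), behavior="removed")

# Normalizers that accept a pattern follow the same rule.
normalizer = Replace(Regex(r"\d"), "0")

# A plain string pattern is matched as a literal string, not as a regex.
literal_split = Split(pattern=" ", behavior="removed")
```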