* Draft functionality of visualization
* Added comments to make code more intelligble
* polish the styles
* Ensure colors are stable and comment the css
* Code clean up
* Made visualizer importable and added some docs
* Fix styling
* implement comments from PR
* Fixed the regex for UNK tokens and examples in notebook
* Converted docs to google format
* Added a notebook showing multiple languages and tokenizers
* Added visual indication of chars that are tokenized with >1 token
* Reorganize things a bit and fix import
* Update docs
Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
* start playing around
* make a first version
* refactor
* apply make format
* add python bindings
* add some python binding tests
* correct pre-tokenizers
* update auto-generated bindings
* lint python bindings
* add code node
* add split to docs
* refactor python binding a bit
* cargo fmt
* clippy and fmt in node
* quick updates and fixes
* Oops
* Update node typings
* Update changelog
Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
* First pass on automatic stubbing our python files.
* And now modifying all rust docs to be visible in Pyi files.
* Better assert fail message.
* Fixing github workflow.
* Removing types not exported anymore.
* Fixing `Tokenizer` signature.
* Disabling auto __init__.py.
* Re-enabling some types.
* Don't overwrite non automated __init__.py
* Automated most __init__.py
* Restubbing after rebase.
* Fixing env for tests.
* Install blakc in the env.
* Use PY35 target in stub.py
Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
which is non trivial for PyClassInitializer. Instead I added a lower
staticmethod that returns raw objects, and the `from_file(s)` methods
are implemented directly in Python.