mirror of
https://github.com/mii443/tokenizers.git
synced 2025-12-03 03:08:21 +00:00
* Draft functionality of visualization
* Added comments to make code more intelligible
* polish the styles
* Ensure colors are stable and comment the css
* Code clean up
* Made visualizer importable and added some docs
* Fix styling
* implement comments from PR
* Fixed the regex for UNK tokens and examples in notebook
* Converted docs to google format
* Added a notebook showing multiple languages and tokenizers
* Added visual indication of chars that are tokenized with >1 token
* Reorganize things a bit and fix import
* Update docs

Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
26 lines
288 B
Plaintext
.DS_Store
*~

.vim
.env
target
.idea
Cargo.lock

/data
tokenizers/data
bindings/python/tests/data
docs/build/
docs/make.bat

__pycache__
pip-wheel-metadata
*.egg-info
*.so
/bindings/python/examples/.ipynb_checkpoints
/bindings/python/build
/bindings/python/dist

.vscode
*.code-workspace