mii/tokenizers
Mirror of https://github.com/mii443/tokenizers.git, synced 2025-08-23 16:49:27 +00:00
1,358 Commits · 1 Branch · 0 Tags
d3d9f2c76b5d43d064fe80007ce3323cf9d0f97d
Commit Graph

13 Commits

Author | SHA1 | Message | Date
Anthony MOI | d3d9f2c76b | words -> word_ids & sequences -> sequence_ids | 2020-11-09 16:02:07 -05:00
Anthony MOI | 57d162b269 | Add an Encoding.sequences to allow masking | 2020-11-06 10:41:56 -05:00
Anthony MOI | 385d25720a | Simplify the API for Encoding.token_to_XXX | 2020-11-06 10:41:56 -05:00
Anthony MOI | a79cc55e08 | Node - Encoding mappings handle sequence_id | 2020-11-06 10:41:56 -05:00
Pierric Cistac | e9a2e63a67 | Node - Fix new linting errors | 2020-07-24 15:44:39 -04:00
Anthony MOI | 4aecd82d07 | Node - Improve mappings on Encoding | 2020-04-16 14:23:37 -04:00
Anthony MOI | 3ad1360210 | Word indices are None for special tokens | 2020-04-09 09:52:02 -04:00
Pierric Cistac | e9667a7b83 | Node - tokenizer.postProcess bindings | 2020-03-26 15:42:45 -04:00
Pierric Cistac | 0408567f23 | Node - Merge encodings | 2020-03-26 15:42:45 -04:00
Pierric Cistac | ce3cf78ea5 | Node - Bindings for Encoding mappings | 2020-03-26 15:42:45 -04:00
Pierric Cistac | 25ef729a5a | Node - Update bindings | 2020-03-18 15:13:29 -04:00
Pierric Cistac | fe49512d37 | node: make WordPiece.fromFiles async | 2020-03-06 16:06:06 -05:00
Pierric Cistac | 917996841d | node: "proxy" raw Encoding with getters | 2020-02-26 18:15:16 -05:00