Input sequences ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ These types represent all the different kinds of sequence that can be used as input of a Tokenizer. Globally, any sequence can be either a string or a list of strings, according to the operating mode of the tokenizer: ``raw text`` vs ``pre-tokenized``. .. autodata:: tokenizers.TextInputSequence .. autodata:: tokenizers.PreTokenizedInputSequence .. autodata:: tokenizers.InputSequence Encode inputs ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ These types represent all the different kinds of input that a :class:`~tokenizers.Tokenizer` accepts when using :meth:`~tokenizers.Tokenizer.encode_batch`. .. autodata:: tokenizers.TextEncodeInput .. autodata:: tokenizers.PreTokenizedEncodeInput .. autodata:: tokenizers.EncodeInput Tokenizer ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autoclass:: tokenizers.Tokenizer :members: