mirror of
https://github.com/mii443/tokenizers.git
synced 2025-12-08 21:58:18 +00:00
Truncate Right (#841)
* feat(tokenizers): add truncate test case
* !feat(tokenizer): truncate right
* refacto(tokenizers): clippy
* feat(bindings): update bindings for truncate()
* fix(tokenizers): remove unsafe code
* refacto(tokenizers): truncate direction
* truncate direction enum
* compute parts ranges beforehand
* 2n space because encoding is dropped at the end of procedure
* update bindings
* add pip install in python bindings' make test
* fix(node): clippy asks to use unwrap_or_else
* fix(node): lint
* refacto(tokenizers): replace Vec<Range<usize>> by Vec<(usize, usize)>
* refacto(bindings): add match syntax
* refacto(tokenizers): use mem::replace instead of mem::swap
* refacto(tokenizers): assign value the normal way
@@ -286,7 +286,7 @@ class Encoding:
             :obj:`List[str]`: The list of tokens
         """
         pass
-    def truncate(self, max_length, stride=0):
+    def truncate(self, max_length, stride=0, direction="right"):
         """
         Truncate the :class:`~tokenizers.Encoding` at the given length
 
@@ -299,6 +299,9 @@ class Encoding:
 
             stride (:obj:`int`, defaults to :obj:`0`):
                 The length of previous content to be included in each overflowing piece
+
+            direction (:obj:`str`, defaults to :obj:`right`)
+                Truncate direction
         """
         pass
     @property
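
To illustrate what the new direction argument means, here is a minimal pure-Python sketch of left versus right truncation on a plain list of token ids. It is not the library's implementation (that lives in the Rust crate and also handles stride and overflowing pieces); truncate_ids is a hypothetical helper written for this example, and raising ValueError on an unknown direction is an assumption of the sketch.

```python
def truncate_ids(ids, max_length, direction="right"):
    """Hypothetical helper: keep at most `max_length` items of `ids`.

    direction="right" drops items from the end (keeps the start);
    direction="left" drops items from the start (keeps the end).
    """
    if direction not in ("left", "right"):
        # Assumption of this sketch: reject unknown directions loudly.
        raise ValueError(f"Invalid truncation direction: {direction!r}")
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    if max_length >= len(ids):
        return list(ids)  # nothing to truncate
    if max_length == 0:
        return []  # avoid ids[-0:], which would return the whole list
    if direction == "right":
        return list(ids[:max_length])
    return list(ids[-max_length:])


ids = [101, 7, 8, 9, 102]
right = truncate_ids(ids, 3)                    # keeps the first three ids
left = truncate_ids(ids, 3, direction="left")   # keeps the last three ids
```

With the bindings from this commit, the equivalent call on an encoding would presumably be encoding.truncate(3, direction="left"), matching the signature shown in the diff above.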