mirror of
https://github.com/mii443/tokenizers.git
synced 2025-08-22 16:25:30 +00:00
Fix typos (#1715)
* Fix typos
* Update docs/source/quicktour.rst
* Update docs/source-doc-builder/quicktour.mdx

Signed-off-by: tinyboxvk <13696594+tinyboxvk@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
@@ -49,7 +49,7 @@ class CustomNormalizer:
     def normalize(self, normalized: NormalizedString):
         # Most of these can be replaced by a `Sequence` combining some provided Normalizer,
         # (ie Sequence([ NFKC(), Replace(Regex("\s+"), " "), Lowercase() ])
-        # and it should be the prefered way. That being said, here is an example of the kind
+        # and it should be the preferred way. That being said, here is an example of the kind
         # of things that can be done here:
         normalized.nfkc()
         normalized.filter(lambda char: not char.isnumeric())
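For context, the diffed `CustomNormalizer.normalize` NFKC-normalizes the text and then drops numeric characters. A minimal stdlib sketch of that same behavior is below; note this is only illustrative, since the real `NormalizedString` in the tokenizers library also tracks alignment back to the original text, which a plain `str` cannot do.

```python
import unicodedata

def normalize(text: str) -> str:
    # NFKC folds compatibility forms (e.g. fullwidth letters) to canonical ones,
    # mirroring normalized.nfkc() in the diffed code.
    text = unicodedata.normalize("NFKC", text)
    # Drop numeric characters, mirroring normalized.filter(...).
    return "".join(ch for ch in text if not ch.isnumeric())

# Fullwidth "hello 2025" folds to ASCII, then the digits are removed.
print(normalize("ｈｅｌｌｏ ２０２５"))
```

The commit's comment points out that this pipeline could instead be built from the library's provided normalizers combined with a `Sequence`, which is the preferred approach; a hand-written `normalize` like this is only needed for logic the built-ins do not cover.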