Add Strip normalizer (#140)

* WIP strip.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Rust StripNormalizer

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Allow specifying the strip direction

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Renamed StripNormalizer to Strip

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added Python binding.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Make the Python Strip binding compatible with a Pythonic constructor.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Run RustFmt

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Clippy next ofc.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Move lstrip and rstrip on NormalizedString

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Implement strip() for normalizer + unittests.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Add some more unittests on edge cases.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* clippy and fmt.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Simplify strip and fix offsets

* Python - Update strip bindings with default values

Co-authored-by: MOI Anthony <xn1t0x@gmail.com>
Funtowicz Morgan
2020-02-17 11:26:40 +01:00
committed by GitHub
parent 3cac26cdb2
commit bb8321ac0d
7 changed files with 157 additions and 0 deletions

@@ -76,6 +76,7 @@ fn normalizers(_py: Python, m: &PyModule) -> PyResult<()> {
     m.add_class::<normalizers::NFKC>()?;
     m.add_class::<normalizers::Sequence>()?;
     m.add_class::<normalizers::Lowercase>()?;
+    m.add_class::<normalizers::Strip>()?;
     Ok(())
 }
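
The hunk above only shows the Python module registration. The other six changed files are not reproduced here. To illustrate what the commit message describes (a strip normalizer with a selectable direction and offsets that stay consistent), here is a minimal sketch. It is not the code from this commit: it works on a plain `&str` rather than the library's `NormalizedString`, and the field names `strip_left` / `strip_right`, the `normalize` signature, and the both-sides default are assumptions made for illustration.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the `Strip` normalizer; the field names are
/// assumptions, not taken from the commit.
#[derive(Debug, Clone, Copy)]
pub struct Strip {
    strip_left: bool,
    strip_right: bool,
}

impl Strip {
    pub fn new(strip_left: bool, strip_right: bool) -> Self {
        Strip { strip_left, strip_right }
    }

    /// Returns the stripped slice together with the byte range it occupied
    /// in `input`, so callers can still map back to the original text.
    pub fn normalize<'a>(&self, input: &'a str) -> (&'a str, Range<usize>) {
        let mut out = input;
        if self.strip_left {
            out = out.trim_start();
        }
        // Bytes removed from the front give the start offset.
        let start = input.len() - out.len();
        if self.strip_right {
            out = out.trim_end();
        }
        let end = start + out.len();
        (out, start..end)
    }
}

impl Default for Strip {
    /// Assumed default: strip both sides.
    fn default() -> Self {
        Strip::new(true, true)
    }
}
```

Calling `Strip::new(true, false).normalize("  hello  ")` in this sketch yields `"hello  "` with the range `2..9`. The real implementation tracks alignments inside `NormalizedString` instead of returning a range.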