33 Commits

Author SHA1 Message Date
5c18ec5ff5 pyo3 v0.18 migration (#1173)
* pyo3 v0.18 migration

* Fix formatting issues of black
2023-03-08 11:27:47 +01:00
fbad581128 Bump derive_builder from 0.9 to 0.12 (#1129) 2022-12-23 23:37:16 +01:00
11bb2e00f2 Add python 3.11 to manylinux buildwheels (#1096)
* Add python 3.11 to manylinux buildwheels

* Fixing clippy.

* Node clippy.

* Python clippy.

* Changelog + version number update.

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-11-07 08:45:04 +01:00
8129dd3309 pyo3: update to 0.17 (#1066)
* python: update bindings to edition 2021

* python: update to pyo3 0.17

* Updating testing.

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-10-05 16:59:01 +02:00
06025e4ca1 Adding Sequence for PostProcessor. (#1052)
* Adding `Sequence` for `PostProcessor`.

* Fixing node? Writing in the dark here, don't have Python2.7

* `undefined` is not accepted.

* Other test.
2022-08-25 14:50:06 +02:00
460bdded80 Modify Processor trait to support chaining. (#1054)
0 modifications yet, everything will consume the vector.
Every test should be green without any modifications.
2022-08-24 19:49:23 +02:00
519cc13be0 Upgrade pyo3 to 0.16 (#956)
* Upgrade pyo3 to 0.15

Rebase-conflicts-fixed-by: H. Vetinari <h.vetinari@gmx.com>

* Upgrade pyo3 to 0.16

Rebase-conflicts-fixed-by: H. Vetinari <h.vetinari@gmx.com>

* Install Python before running cargo clippy

* Fix clippy warnings

* Use `PyArray_Check` instead of downcasting to `PyArray1<u8>`

* Enable `auto-initialize` of pyo3 to fix `cargo test --no-default-features`

* Fix some test cases

Why do they change?

* Refactor and add SAFETY comments to `PyArrayUnicode`

Replace deprecated `PyUnicode_FromUnicode` with `PyUnicode_FromKindAndData`

Co-authored-by: messense <messense@icloud.com>
2022-05-05 15:48:40 +02:00
c1100ec542 Clippy fixes. (#846)
* Clippy fixes.

* Drop support for Python 3.6

* Remove other 3.6

* Re-enabling caches for build (5h+ seems too long and issue seems solved)

https://github.com/actions/virtual-environments/issues/572

* `npm audit fix`.

* Fix yaml ?

* Pyarrow issue fixed: https://github.com/huggingface/datasets/pull/2268

* Installing dev libraries.

* Install python dev elsewhere ?

* Typo.

* No sudo.

* ...

* Testing the GH again.

* Maybe v2 will fix ?

* Fixing tests on MacOS Python 3.8+
2021-12-15 15:55:48 +01:00
56a9196030 Fix clippy warnings 2021-03-16 12:32:06 -04:00
57200144ca Python - Fix ByteLevel instantiation from state (#621) 2021-02-04 10:16:05 -05:00
64441b54b1 Python - Improve documentation for post-processors 2020-11-23 11:52:51 -05:00
352c92ad33 Automatically stubbing the pyi files while keeping inspecting ability (#509)
* First pass on automatic stubbing our python files.

* And now modifying all rust docs to be visible in Pyi files.

* Better assert fail message.

* Fixing github workflow.

* Removing types not exported anymore.

* Fixing `Tokenizer` signature.

* Disabling auto __init__.py.

* Re-enabling some types.

* Don't overwrite non automated __init__.py

* Automated most __init__.py

* Restubbing after rebase.

* Fixing env for tests.

* Install black in the env.

* Use PY35 target in stub.py

Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
2020-11-17 15:13:00 -05:00
1070eb471e Python - Update bindings for TemplateProcessing 2020-09-29 10:09:10 -04:00
5276238b1b Python - Add bindings for PostProcessor.process 2020-09-23 15:50:01 -04:00
940f8bd8fa Update PyO3 (#426) 2020-09-22 12:00:20 -04:00
337fe72b13 Python - Bindings for TemplateProcessing 2020-09-10 15:04:19 -04:00
df827d538f Adding clippy as a linter within the Python binding. (#388)
* Adding clippy as a linter within the Python binding.

* Missing clippy (dropped commit ??)
2020-09-04 09:09:02 -04:00
16f75d9efc Ensure serialization works in all expected ways. 2020-08-04 15:59:33 -04:00
11e86a16c5 Remove Container from PostProcessors, replace with Arc.
* prefix the Python types in Rust with Py.
* remove unsound Container wrappers, replace with Arc.
2020-08-04 15:59:33 -04:00
c5bba91bf4 Python - Test and fix classes pickling 2020-05-27 13:46:37 -04:00
6a70162d78 Python - Make all relevant classes picklable 2020-05-27 13:46:37 -04:00
81e2cc2fc4 Python - Add offsets trimming to RobertaProcessing 2020-04-15 18:49:38 -04:00
be7b345bcd Require Send for all parts of the tokenizer (#222) 2020-04-08 13:35:06 -04:00
550413f00a add Send + Sync on all traits, remove elsewhere 2020-04-08 18:43:23 +02:00
2dc48e56ac Python - Update pyo3 version
* Use __new__ instead of static method as model constructors
2020-04-06 21:20:16 +02:00
efbbfea558 Update ByteLevel PostProcessor 2020-03-10 12:05:04 -04:00
52180a9179 Python - Add ByteLevel PostProcessor 2020-03-06 17:44:44 -05:00
f263d7651f Python - RustFmt 2020-02-18 15:07:34 -05:00
c4bac6aeeb Expose num_added_tokens on Python side (#146)
* Expose num_added_tokens on Python side without the need to pass an Encoding to added_tokens.

This allows computing the max sentence length for single/pair inputs without needing an Encoding structure.
As the number of added tokens is fixed and static during compilation, this allows more flexible usage of the method.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Renamed num_added_tokens to num_special_tokens_to_add.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-02-14 10:55:20 +00:00
f32e0c09fc Implement __new__ for PostProcessors
Allows PostProcessors to be instantiated through the Python class constructor.
2020-02-10 10:43:53 +01:00
6524f09e99 Roberta PostProcessor (#111)
* Added RobertaProcessor on Rust side.

Required to match the double separator token in the middle of pairs.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix typo in RobertaProcessing method declaration

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Correctly include RobertaProcessor in the Python binding

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Roberta doesn't use token_type_ids so let's set everything to 0

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Attempt to make it work on Node side too.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* fix js bindings / `npm run lint`

* Make RustFmt happy.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
2020-02-03 10:39:48 +00:00
3f95248d6d Python - Truncation & padding bindings 2019-12-17 17:24:53 -05:00
93a74aa53a Python - Expose PostProcessors 2019-12-16 18:46:14 -05:00