ac552ff8b9
Update model.rs ( #1166 )
2023-02-28 17:35:57 +01:00
fa66caf0ab
Improved version. ( #1154 )
...
* Improved version.
* Clippy.
2023-01-23 16:35:19 +01:00
d09241fba1
Prevent using `from_pretrained` on invalid ids (better error message). ( #1153 )
2023-01-23 15:38:14 +01:00
b861d48b06
Making `Tokenizer` clone. ( #1152 )
2023-01-23 10:12:35 +01:00
1fcd90b0b7
Update info on environment variable for threading ( #1150 )
...
* Update env var name for threading
2023-01-22 21:24:41 +01:00
33a57e6418
Made dirs optional ( #1148 )
2023-01-18 09:29:15 +01:00
daf8aebd76
Adding python 3.8 for M1 ( #1147 )
2023-01-16 16:40:46 +01:00
5a94a2b6e7
Add missing build targets ( #1145 )
...
* M1 3.11 was not out, and neither was windows amd64.
* python@v4.
* Actually upload.
* Update needs.
* Preparing the actual PR.
2023-01-15 10:18:08 +01:00
fe4ae7dc38
Bump json5 from 2.2.0 to 2.2.3 in /bindings/node ( #1140 )
...
Bumps [json5](https://github.com/json5/json5) from 2.2.0 to 2.2.3.
- [Release notes](https://github.com/json5/json5/releases)
- [Changelog](https://github.com/json5/json5/blob/main/CHANGELOG.md)
- [Commits](https://github.com/json5/json5/compare/v2.2.0...v2.2.3)
---
updated-dependencies:
- dependency-name: json5
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-03 11:50:51 +01:00
c3fedd96b3
Bump json5, copy-webpack-plugin, webpack and webpack-cli ( #1139 )
...
Removes [json5](https://github.com/json5/json5). It's no longer used after updating ancestor dependencies [json5](https://github.com/json5/json5), [copy-webpack-plugin](https://github.com/webpack-contrib/copy-webpack-plugin), [webpack](https://github.com/webpack/webpack) and [webpack-cli](https://github.com/webpack/webpack-cli). These dependencies need to be updated together.
Removes `json5`
Updates `copy-webpack-plugin` from 5.1.2 to 11.0.0
- [Release notes](https://github.com/webpack-contrib/copy-webpack-plugin/releases)
- [Changelog](https://github.com/webpack-contrib/copy-webpack-plugin/blob/master/CHANGELOG.md)
- [Commits](https://github.com/webpack-contrib/copy-webpack-plugin/compare/v5.1.2...v11.0.0)
Updates `webpack` from 4.46.0 to 5.75.0
- [Release notes](https://github.com/webpack/webpack/releases)
- [Commits](https://github.com/webpack/webpack/compare/v4.46.0...v5.75.0)
Updates `webpack-cli` from 3.3.12 to 5.0.1
- [Release notes](https://github.com/webpack/webpack-cli/releases)
- [Changelog](https://github.com/webpack/webpack-cli/blob/master/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack-cli/compare/v3.3.12...webpack-cli@5.0.1)
---
updated-dependencies:
- dependency-name: json5
  dependency-type: indirect
- dependency-name: copy-webpack-plugin
  dependency-type: direct:development
- dependency-name: webpack
  dependency-type: direct:development
- dependency-name: webpack-cli
  dependency-type: direct:development
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-03 10:22:49 +01:00
9b155b5723
[FIX] In CharBPETokenizer, when Vocab or merges is None, unk_token cannot be used. ( #1136 )
...
* [fix] Use unk_token
In SentencePieceBPETokenizer, when Vocab or merges is None, unk_token cannot be used.
* [fix] If unk_token is None, this case is also considered.
* Update bindings/python/py_src/tokenizers/implementations/sentencepiece_bpe.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
* [FIX] In CharBPETokenizer, Use unk_token.
In CharBPETokenizer, when Vocab or merges is None, unk_token cannot be used.
* Update bindings/python/py_src/tokenizers/implementations/char_level_bpe.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
* Update bindings/python/py_src/tokenizers/implementations/char_level_bpe.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-12-27 11:13:52 +01:00
60a00dda44
Fix one char super tiny typo ( #1137 )
...
* Update pipeline.mdx
* Update pipeline.rst
2022-12-26 11:13:38 +01:00
4d520c9664
Ignore Cargo.lock for subfolders ( #1131 )
2022-12-25 11:35:47 +01:00
fbad581128
Bump derive_builder from 0.9 to 0.12 ( #1129 )
2022-12-23 23:37:16 +01:00
2bed678958
Fix broken links in docs ( #1133 )
2022-12-23 23:35:18 +01:00
3e7476de86
Wrap rustdoc html entity in code block ( #1130 )
2022-12-23 23:30:45 +01:00
03ce27d2fa
Bump cached-path from 0.5 to 0.6 ( #1127 )
2022-12-21 18:10:48 +01:00
5886179eee
Bump decode-uri-component in /tokenizers/examples/unstable_wasm/www ( #1125 )
...
Bumps [decode-uri-component](https://github.com/SamVerschueren/decode-uri-component) from 0.2.0 to 0.2.2.
- [Release notes](https://github.com/SamVerschueren/decode-uri-component/releases)
- [Commits](https://github.com/SamVerschueren/decode-uri-component/compare/v0.2.0...v0.2.2)
---
updated-dependencies:
- dependency-name: decode-uri-component
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-19 14:24:24 +01:00
a408b44429
Bump minimatch from 3.0.4 to 3.1.2 in /bindings/node ( #1126 )
...
Bumps [minimatch](https://github.com/isaacs/minimatch) from 3.0.4 to 3.1.2.
- [Release notes](https://github.com/isaacs/minimatch/releases)
- [Changelog](https://github.com/isaacs/minimatch/blob/main/changelog.md)
- [Commits](https://github.com/isaacs/minimatch/compare/v3.0.4...v3.1.2)
---
updated-dependencies:
- dependency-name: minimatch
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-19 14:09:24 +01:00
bfa842e063
Adding stale bot ? ( #1123 )
...
* Adding stale bot ?
* Clippy.
2022-12-19 13:50:48 +01:00
1649d74536
Fixing conda ssl location ( #1124 )
...
* Fixing conda build ?
* Reduce the scope to speedup testing.
* Reduce more.
* Trying to link to conda lib.
* Trying to enable `pkg-config` on the conda env.
* Really publish.
* Update conda builds.
* Remove 3.11
* Putting releases back onto release track.
2022-12-19 13:50:36 +01:00
9a25b2cb8e
[FIX] In SentencePieceBPETokenizer, when Vocab or merges is None, unk_token cannot be used. ( #1120 )
...
* [fix] Use unk_token
In SentencePieceBPETokenizer, when Vocab or merges is None, unk_token cannot be used.
* [fix] If unk_token is None, this case is also considered.
* Update bindings/python/py_src/tokenizers/implementations/sentencepiece_bpe.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-12-19 13:40:04 +01:00
102dfe87a3
Bump decode-uri-component from 0.2.0 to 0.2.2 in /bindings/node ( #1116 )
...
Bumps [decode-uri-component](https://github.com/SamVerschueren/decode-uri-component) from 0.2.0 to 0.2.2.
- [Release notes](https://github.com/SamVerschueren/decode-uri-component/releases)
- [Commits](https://github.com/SamVerschueren/decode-uri-component/compare/v0.2.0...v0.2.2)
---
updated-dependencies:
- dependency-name: decode-uri-component
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-05 18:09:38 +01:00
67080e163a
Include license file in Rust crate ( #1115 )
...
* Include license file in Rust crate
* Ignore security warning.
* Also for python.
* Upgrading ubuntu version.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-11-30 23:17:56 +01:00
c74e9e62f6
Bump loader-utils in /tokenizers/examples/unstable_wasm/www ( #1108 )
...
Bumps [loader-utils](https://github.com/webpack/loader-utils) from 1.4.0 to 1.4.2.
- [Release notes](https://github.com/webpack/loader-utils/releases)
- [Changelog](https://github.com/webpack/loader-utils/blob/v1.4.2/CHANGELOG.md)
- [Commits](https://github.com/webpack/loader-utils/compare/v1.4.0...v1.4.2)
---
updated-dependencies:
- dependency-name: loader-utils
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-16 12:01:25 +01:00
e9529cb02f
Merge pull request #1107 from huggingface/revert-1101-update_doc_pr_actions
...
Revert "Update pr docs actions"
2022-11-16 11:41:51 +01:00
ffcf5a4136
Revert "Update pr docs actions ( #1101 )"
...
This reverts commit 99c06c82e0.
2022-11-16 11:41:38 +01:00
bbae829a72
Adding rust audit. ( #1099 )
...
* Adding rust audit.
* Update clap version + derive_builder (they clashed).
* Ignoring specific CVE which can be ignored
https://github.com/Azure/iot-identity-service/issues/481
* Updating python lock.
* Revert `derive-builder` update.
* Adding back help msg.
2022-11-09 12:59:36 +01:00
99c06c82e0
Update pr docs actions ( #1101 )
2022-11-09 11:09:52 +01:00
b8a4aa6000
Fixing extra wheels memory usage. ( #1098 )
2022-11-07 09:11:18 +01:00
11bb2e00f2
Add python 3.11 to manylinux buildwheels ( #1096 )
...
* Add python 3.11 to manylinux buildwheels
* Fixing clippy.
* Node clippy.
* Python clippy.
* Changelog + version number update.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-11-07 08:45:04 +01:00
96a9e5715c
New version. ( #1082 )
...
* New version.
The actual release will happen *before* PyO3 0.17.2 because
the tests were run before then.
* Manylinux2014 necessary now with Rust 1.64.
2022-10-06 15:45:56 +02:00
4ef0afbeb6
Update old gh actions, remove deprecated doc building. ( #1069 )
2022-10-05 17:59:46 +02:00
8129dd3309
pyo3: update to 0.17 ( #1066 )
...
* python: update bindings to edition 2021
* python: update to pyo3 0.17
* Updating testing.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-10-05 16:59:01 +02:00
6113666624
Updating python formatting. ( #1079 )
...
* Updating python formatting.
* Forgot gh action.
* Skipping isort to prevent circular imports.
* Updating stub.
* Removing `isort` (it contradicts `stub.py`).
* Fixing weird stub black/isort disagreement.
2022-10-05 15:29:33 +02:00
5f6e978452
Fixing roberta type id (everything is zero). ( #1072 )
...
* Fixing roberta type ids (everything is zero).
* We need to fix type_ids for all sequences even when not changing
anything else.
* Fixing tests hopefully better.
2022-09-26 18:00:41 +02:00
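A minimal sketch of what the type-id fix above amounts to, assuming a simplified encoding type. The `Encoding` struct and the `zero_type_ids` helper are hypothetical names used for illustration and are not the crate's actual API. The idea is that every sequence gets its type ids reset to zero, while the token ids are left alone.

```rust
// Hypothetical, simplified stand-in for an encoding (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct Encoding {
    pub ids: Vec<u32>,
    pub type_ids: Vec<u32>,
}

/// Reset the type ids of every sequence to zero.
/// Token ids are not modified.
pub fn zero_type_ids(encodings: &mut [Encoding]) {
    for encoding in encodings.iter_mut() {
        for type_id in encoding.type_ids.iter_mut() {
            *type_id = 0;
        }
    }
}
```

Calling `zero_type_ids` on a pair of encodings leaves both with all-zero type ids, including the second sequence, which a BERT-style processor would otherwise mark with ones.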
6e5569a540
Moving versions numbers to `dev` mode. ( #1067 )
2022-09-22 18:24:07 +02:00
63082c4d11
Enabling static interpreter embedding for manylinux. ( #1064 )
...
* Removing dead file.
* Checking that we can distribute with static python embedding for manylinux
* Many linux embed interpreter.
* Building wheels manylinux with static embedding
* Better script.
* typo.
* Using a dummy feature?
* default features ?
* Back into order.
* Fixing manylinux ??.
* Local dir.
* Missing star.
* Makedir ?
* Monkey coding this.
* extension module ?
* Building with default features `RustExtension`.
* bdist_wheel + rustextension any better ?
* update rust-py version.
* Forcing extension module.
* No default features.
* Remove py37 out of spite
* Revert "Remove py37 out of spite"
This reverts commit 6ab7facd792b59c2e30be82fe42816d24c32cf0d.
* Really extraneous feature.
* Fix build wheels.
* Putting things back in place.
2022-09-21 12:18:46 +02:00
655f4057b7
Removing python3.6 from manylinux, it's not supported anymore. ( #1063 )
2022-09-19 12:22:02 +02:00
7c146d9ce5
Turns out we introduced a regression because of bad code. ( #1060 )
2022-09-16 11:20:59 +02:00
7bfab48979
Preparing rc1 release. ( #1056 )
...
* Preparing rc1 release.
* Fixing test_alignment_methods
* Fixing the overflowing sequence_id issue (LayoutLMv2 tests caught this).
* Adding overly complex overflowing test.
2022-09-12 16:07:06 +02:00
06025e4ca1
Adding `Sequence` for `PostProcessor`. ( #1052 )
...
* Adding `Sequence` for `PostProcessor`.
* Fixing node? Writing in the dark here, don't have Python2.7
* `undefined` is not accepted.
* Other test.
2022-08-25 14:50:06 +02:00
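A rough sketch of how a `Sequence` post-processor can chain others once each processor takes a vector of encodings and returns one. Everything here is an assumption made for illustration: the simplified `Encoding` type, the one-method `PostProcessor` trait, and the `AddPrefix` / `AddSuffix` processors are hypothetical and do not reproduce the crate's real definitions.

```rust
// Hypothetical, simplified encoding: just a list of tokens.
#[derive(Debug, Clone, PartialEq)]
pub struct Encoding {
    pub tokens: Vec<String>,
}

// Hypothetical trait: each processor consumes the encodings and returns
// new ones, which is what makes chaining possible.
pub trait PostProcessor {
    fn process_encodings(&self, encodings: Vec<Encoding>) -> Vec<Encoding>;
}

// Illustrative processor that prepends a token to every encoding.
pub struct AddPrefix(pub String);

impl PostProcessor for AddPrefix {
    fn process_encodings(&self, encodings: Vec<Encoding>) -> Vec<Encoding> {
        encodings
            .into_iter()
            .map(|mut encoding| {
                encoding.tokens.insert(0, self.0.clone());
                encoding
            })
            .collect()
    }
}

// Illustrative processor that appends a token to every encoding.
pub struct AddSuffix(pub String);

impl PostProcessor for AddSuffix {
    fn process_encodings(&self, encodings: Vec<Encoding>) -> Vec<Encoding> {
        encodings
            .into_iter()
            .map(|mut encoding| {
                encoding.tokens.push(self.0.clone());
                encoding
            })
            .collect()
    }
}

// Runs the wrapped processors in order, feeding each one the output
// of the previous one.
pub struct Sequence {
    processors: Vec<Box<dyn PostProcessor>>,
}

impl Sequence {
    pub fn new(processors: Vec<Box<dyn PostProcessor>>) -> Self {
        Sequence { processors }
    }
}

impl PostProcessor for Sequence {
    fn process_encodings(&self, encodings: Vec<Encoding>) -> Vec<Encoding> {
        self.processors
            .iter()
            .fold(encodings, |acc, processor| processor.process_encodings(acc))
    }
}
```

Because `Sequence` itself implements the trait in this sketch, a sequence can be nested inside another sequence or used anywhere a single processor is expected.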
37f7bae0f7
Making `process_encodings` not eat up the encodings any more. ( #1051 )
...
* Making `process_encodings` not eat up the encodings any more.
* Fixing clippy.
2022-08-25 11:49:18 +02:00
c174b5bd34
Adding m1 build to the release process for Python. ( #1055 )
...
* Adding m1 build to the release process for Python.
* typo.
2022-08-25 11:06:03 +02:00
6878ab028d
Bump node-forge and webpack-dev-server ( #1053 )
...
Bumps [node-forge](https://github.com/digitalbazaar/forge) and [webpack-dev-server](https://github.com/webpack/webpack-dev-server). These dependencies needed to be updated together.
Updates `node-forge` from 0.10.0 to 1.3.1
- [Release notes](https://github.com/digitalbazaar/forge/releases)
- [Changelog](https://github.com/digitalbazaar/forge/blob/main/CHANGELOG.md)
- [Commits](https://github.com/digitalbazaar/forge/compare/0.10.0...v1.3.1)
Updates `webpack-dev-server` from 3.11.3 to 4.10.0
- [Release notes](https://github.com/webpack/webpack-dev-server/releases)
- [Changelog](https://github.com/webpack/webpack-dev-server/blob/master/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack-dev-server/compare/v3.11.3...v4.10.0)
---
updated-dependencies:
- dependency-name: node-forge
  dependency-type: indirect
- dependency-name: webpack-dev-server
  dependency-type: direct:development
...
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-24 20:08:46 +02:00
460bdded80
Modify `Processor` trait to support chaining. ( #1054 )
...
No modifications yet, everything will consume the vector.
Every test should be green without any modifications.
2022-08-24 19:49:23 +02:00
b1c9bc68b5
Updating code according to clippy. ( #1048 )
...
- Adding `Eq` where possible
- Denied the ref deref warnings as they were spamming and the solution is not
really better.
2022-08-24 19:45:15 +02:00
67c56adf68
Upgrade macro_rules_attribute to 0.1.2 ( #1038 )
2022-08-08 14:03:19 +02:00
67fb60a33c
Bump terser in /tokenizers/examples/unstable_wasm/www ( #1032 )
...
Bumps [terser](https://github.com/terser/terser) from 4.8.0 to 4.8.1.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)
---
updated-dependencies:
- dependency-name: terser
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-22 09:00:14 +02:00
eb2213842b
Update README.md ( #1019 )
...
* Update README.md
Add reference to normalizer blog post
* Update lib.rs
* Fixing PR + clippy on node.
* Update readme to match docstring.
* Other clippy warning.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-07-19 09:54:29 +02:00