Commit Graph

1729 Commits

Author SHA1 Message Date
a7ace4480d python stub.py 2023-09-05 17:33:14 +00:00
f435af8b71 linting 2023-09-05 16:43:06 +00:00
26fdfc2bc3 style 2023-09-05 16:42:45 +00:00
b57e1c3f5d #[allow(dead_code)] // Suppress the "method is never used" warning 2023-09-05 16:42:22 +00:00
c3fa75fa0e nits 2023-09-05 15:40:13 +00:00
08af8ea9c3 make tests happy 2023-09-05 15:37:09 +00:00
531b06f6db update the get_vocab_size to compute actual length of the get_vocab function 2023-09-05 15:19:50 +00:00
f1da83f358 add support for get_added_tokens_decoder 2023-09-05 14:49:29 +00:00
e5fc051ad2 update 2023-09-05 13:34:43 +00:00
93b37f36dc styling 2023-09-04 20:54:55 +00:00
058e34b421 make special editable as well 2023-09-04 20:54:29 +00:00
2291c89896 python stub.py 2023-09-04 19:49:36 +00:00
b235f85527 clippy 2023-09-04 19:31:48 +00:00
9aab096da8 fmt 2023-09-04 19:31:05 +00:00
a59bb76aa1 update and todo 2023-09-04 19:21:38 +00:00
c599db1421 nits 2023-09-04 19:11:19 +00:00
d4008b0d7a clippy 2023-09-04 19:11:05 +00:00
b117ac7f16 updates 2023-09-04 19:10:22 +00:00
a53dff9bc5 make content writable in python 2023-09-04 18:18:21 +00:00
d9829cdc6e fix more tests 2023-09-04 17:22:27 +00:00
39bd27e673 fix build 2023-09-01 21:22:07 +00:00
9f0c703f03 update init and src for bindings python 2023-09-01 21:07:01 +00:00
587748ab09 clean derive partial eq 2023-09-01 20:50:34 +00:00
fdef4a118b fmt 2023-09-01 20:48:47 +00:00
d1566a9ecc update, // AddedTokens can be updated if value changed 2023-09-01 20:48:36 +00:00
399c6fe852 fix and update tes 2023-09-01 20:40:06 +00:00
2b72017e17 correctly compute the new id: we take the max of the AddedToken + get_vocab_size 2023-09-01 19:03:33 +00:00
db319492f7 clippy 2023-09-01 18:57:39 +00:00
2dca476810 fix some tests 2023-09-01 18:48:50 +00:00
6cca5716af fix one test? 2023-09-01 18:42:30 +00:00
345b4eba96 updates 2023-09-01 18:41:36 +00:00
8e522a38d9 Updating the docs with the new command. (#1333) 2023-08-29 13:15:26 +02:00
d2010d5165 Move to maturin mimicking move for safetensors. + Rewritten node bindings. (#1331)
* Move to maturin mimicking move for `safetensors`.

* Tmp.

* Fix sdist.

* Wat?

* Clippy 1.72

* Remove if.

* Conda sed.

* Fix doc check workflow.

* Moving to maturin AND removing http + openssl mess (smoothing transition
moving to `huggingface_hub`)

* Fix dep

* Black.

* New node bindings.

* Fix docs + node cache ?

* Yarn.

* Working dir.

* Extension module.

* Put back interpreter.

* Remove cache.

* New attempt

* Multi python.

* Remove FromPretrained.

* Remove traces of `fromPretrained`.

* Drop 3.12 for windows?

* Typo.

* Put back the default feature for ignoring links during simple test.

* Fix ?

* x86_64 -> x64.

* Remove warning for windows bindings.

* Exclude aarch.

* Include/exclude.

* Put back workflows in correct states.
2023-08-28 16:24:14 +02:00
f2952020d5 Python 38 arm (#1330) 2023-08-23 16:29:16 +02:00
f08058ab2b Reduce number of different revisions by 1 (#1329) 2023-08-23 15:57:36 +02:00
6c350d88fe Re-using scripts from safetensors. (#1328) 2023-08-23 15:37:38 +02:00
d0bb35d5a6 Merge pull request #1316 from boyleconnor/add-expect-for-no-truncation
Add `expect()` for disabling truncation
2023-08-18 19:30:53 +02:00
540bf2eb01 pyo3: update to 0.19 (#1322)
* Bump pyo3 dependency versions

* Fix deprecation warnings from pyo3

---------

Co-authored-by: Mike Lui <mikelui@meta.com>
2023-08-16 18:40:32 +02:00
9a93c50c25 Fix stride condition. (#1321)
* Release all at once for simplicity.

* rc2
2023-08-14 15:27:55 +02:00
b35d33f981 Release all at once for simplicity. (#1320) 2023-08-14 13:49:45 +02:00
fb292d1eae 0.13.4.rc1 (#1319) 2023-08-14 12:06:43 +02:00
862046ac94 CD backports (#1318)
* CD backports

follow
huggingface/safetensors#317

* fix node bindings?

`cargo check` doesn't work on my local configuration from `tokenizers/bindings/node/native`
I don't think it will be a problem, but I have difficulty telling

* backport #315

* safetensors#317 back ports
2023-08-10 18:52:22 +02:00
748556a9ed Fix code style 2023-08-07 15:17:43 -07:00
d47d3e377c Derive clone for TrainerWrapper (#1317) 2023-08-07 15:15:10 +02:00
a0a8ebe03f Add expect() for disabling truncation 2023-08-06 13:25:50 -07:00
efea6c7246 Handle when precompiled charsmap is empty (#1308)
* Handle when precompiled charsmap is empty

* Black

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-07-31 14:35:24 +02:00
c2664ae13f Give error when initializing tokenizer with too high stride (#1306)
* Split `get_n_added_tokens` into separate method

* Modify `TokenizerImpl.with_truncation()` to raise an error if given bad parameters

* Return Python error if `tokenizer.with_truncation()` fails

* Add dummy variable assignment for `no_truncation()` case

* Unrelated fmt fix.

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-07-28 09:16:44 +02:00
bb38f390a6 Single warning for holes. (#1303)
* Single warning for holes.

* Dummy.
2023-07-25 12:57:23 +02:00
d6326b2b88 feat: Added CITATION.cff. (#1302) 2023-07-25 12:16:09 +02:00
ea4d3f634c Bump word-wrap from 1.2.3 to 1.2.4 in /bindings/node (#1299)
Bumps [word-wrap](https://github.com/jonschlinkert/word-wrap) from 1.2.3 to 1.2.4.
- [Release notes](https://github.com/jonschlinkert/word-wrap/releases)
- [Commits](https://github.com/jonschlinkert/word-wrap/compare/1.2.3...1.2.4)

---
updated-dependencies:
- dependency-name: word-wrap
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-21 08:08:10 +02:00