mirror of
https://github.com/mii443/tokenizers.git
synced 2025-08-22 16:25:30 +00:00
Adding a new document that is the checklist to make (#975)
* Adding a new document that is the checklist to make a new `tokenizers` release. This will help making sure nothing is forgotten.
* Update RELEASE.md
  Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
* Update RELEASE.md
  Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
* Update RELEASE.md
  Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
* Update RELEASE.md
  Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
* Adding running full test suite instructions.

Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
90
RELEASE.md
Normal file
@@ -0,0 +1,90 @@
## How to release

# Before the release

Simple checklist on how to make releases for `tokenizers`.

- Freeze the `master` branch.
- Run all tests (check that CI has run properly).
- If any significant work was done, check the benchmarks:
  - `cd tokenizers && cargo bench` (if it's your first time, also run it on the latest release tag so you have a baseline to measure the difference against)
- Run all `transformers` tests. (`transformers` is a big user of `tokenizers`, so we need
  to make sure we don't break it; testing is one way to make sure nothing unforeseen
  has been done.)
  - Run all fast tests at the VERY least (not just the tokenization tests). (`RUN_PIPELINE_TESTS=1 CUDA_VISIBLE_DEVICES=-1 pytest -sv tests/`)
  - When all *fast* tests pass, it is recommended to also run the whole `transformers`
    test suite.
    - Rebase this [PR](https://github.com/huggingface/transformers/pull/16708).
      This will create new docker images ready to run the test suites with `tokenizers` from the main branch.
    - Wait for the actions to finish.
    - Rebase this [PR](https://github.com/huggingface/transformers/pull/16712).
      This will run the actual full test suite.
    - Check the results.
- **If any breaking change has been made**, make sure the version can safely be increased for `transformers` users (the `tokenizers` version constraint in `transformers` must make sure users don't upgrade `tokenizers` before `transformers` has). [link](https://github.com/huggingface/transformers/blob/main/setup.py#L154)
  For instance, `tokenizers>=0.10,<0.11` lets us safely release `0.11` without impacting
  current users.
- Then start a new PR containing all desired code changes from the following steps.
- You will `Create release` after the code modifications are on `master`.

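To illustrate why a pin such as `tokenizers>=0.10,<0.11` protects current users, here is a minimal sketch of the range check in plain Python. The helper names `parse_version` and `satisfies` are hypothetical, and the parsing is deliberately simplified: it only handles plain dotted numeric versions, not pre-releases or the other forms real packaging tools support.

```python
def parse_version(text, width=3):
    """Turn a plain dotted version such as "0.10.3" into a tuple of ints.

    Short versions are padded with zeros, so "0.10" compares like "0.10.0".
    Simplified on purpose: pre-release or dev suffixes are not handled.
    """
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies(version, lower, upper):
    """Return True when lower <= version < upper (the `>=lower,<upper` pattern)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# The pin quoted in the checklist: tokenizers>=0.10,<0.11
current_pin = ("0.10", "0.11")

# A patch release stays inside the pin, so existing users can pick it up.
patch_ok = satisfies("0.10.3", *current_pin)

# A breaking 0.11.0 release falls outside the pin, so nobody gets it by accident.
breaking_excluded = not satisfies("0.11.0", *current_pin)

print(patch_ok, breaking_excluded)
```

With a pin of this shape, publishing `0.11` only reaches users once `transformers` widens its own constraint.
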
# Rust

- `tokenizers` (rust, python & node) versions don't have to be in sync, but it's
  very common to release all of them at once for new features.
- Edit `Cargo.toml` to reflect the new version.
- Edit `CHANGELOG.md`:
  - Add the relevant PRs that were merged (Python PRs do not belong here, for instance).
  - Add links at the end of the file.
- Go to [Releases](https://github.com/huggingface/tokenizers/releases)
- Create a new Release:
  - Mark it as pre-release.
  - Use the new version name with a new tag (create on publish) `vX.X.X`.
  - Copy-paste the new part of the `CHANGELOG.md`.
- ⚠️ Click on `Publish release`. This will start the whole process of building and uploading
  the new version to `crates.io`; there's no going back after this.
- Go to the [Actions](https://github.com/huggingface/tokenizers/actions) tab and check everything works smoothly.
- If anything fails, you need to fix the CI/CD to make it work again. Since your package was not uploaded to the repository properly, you can try again.

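Before clicking `Publish release`, it can help to confirm that the version written in `Cargo.toml` matches the tag you are about to create. The sketch below is one possible way to do that. The helpers `cargo_package_version` and `rust_tag` are hypothetical, the sample manifest and its version number are made up for illustration, and the line-based parsing assumes a conventionally formatted `[package]` table rather than full TOML.

```python
import re


def cargo_package_version(cargo_toml_text):
    """Return the `version` value from the `[package]` table, or None if absent."""
    in_package = False
    for line in cargo_toml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Entering a new table; only `[package]` is of interest.
            in_package = stripped == "[package]"
            continue
        if in_package:
            match = re.match(r'version\s*=\s*"([^"]+)"', stripped)
            if match:
                return match.group(1)
    return None


def rust_tag(version):
    """Build the Rust release tag in the `vX.X.X` form used in this checklist."""
    return "v" + version


# Hypothetical Cargo.toml content, for illustration only.
example = '''
[package]
name = "tokenizers"
version = "1.2.3"

[dependencies]
serde = { version = "1.0", features = ["derive"] }
'''

print(rust_tag(cargo_package_version(example)))
```

In practice you would read the real file (for example with `pathlib.Path("tokenizers/Cargo.toml").read_text()`, assuming that path) and compare the result with the tag typed into the release form.
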
# Python

- Edit `bindings/python/setup.py` to reflect the new version.
- Edit `bindings/python/py_src/tokenizers/__init__.py` to reflect the new version.
- Edit `CHANGELOG.md`:
  - Add the relevant PRs that were merged (Node PRs do not belong here, for instance).
  - Add links at the end of the file.
- Go to [Releases](https://github.com/huggingface/tokenizers/releases)
- Create a new Release:
  - Mark it as pre-release.
  - Use the new version name with a new tag (create on publish) `python-vX.X.X`.
  - Copy-paste the new part of the `CHANGELOG.md`.
- ⚠️ Click on `Publish release`. This will start the whole process of building and uploading
  the new version to `pypi`; there's no going back after this.
- Go to the [Actions](https://github.com/huggingface/tokenizers/actions) tab and check everything works smoothly.
- If anything fails, you need to fix the CI/CD to make it work again. Since your package was not uploaded to the repository properly, you can try again.
- This CI/CD has 3 distinct builds: `Pypi` (normal), `conda` and `extra`. `Extra` is REALLY slow (~4h); this is normal since it has to rebuild many things, but it makes the wheel available for old Linux distributions.

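Because the Python version lives in two files, it is easy to bump one and forget the other. The following sketch shows one way to detect that. It assumes `setup.py` carries a `version="..."` keyword argument and `__init__.py` a `__version__ = "..."` assignment; the helper names and the sample file contents are hypothetical.

```python
import re


def setup_py_version(text):
    """Extract the value of a `version="..."` keyword argument, or None."""
    match = re.search(r'\bversion\s*=\s*["\']([^"\']+)["\']', text)
    return match.group(1) if match else None


def init_py_version(text):
    """Extract the value of a `__version__ = "..."` assignment, or None."""
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', text, re.MULTILINE
    )
    return match.group(1) if match else None


def versions_in_sync(setup_text, init_text):
    """Return True only when both files declare the same, non-missing version."""
    declared = setup_py_version(setup_text)
    return declared is not None and declared == init_py_version(init_text)


# Hypothetical file contents: setup.py was bumped, __init__.py was forgotten.
setup_text = 'setup(\n    name="tokenizers",\n    version="1.2.3",\n)\n'
init_text = '__version__ = "1.2.2"\n'

print(versions_in_sync(setup_text, init_text))
```

Replacing the two sample strings with the contents of `bindings/python/setup.py` and `bindings/python/py_src/tokenizers/__init__.py` turns this into a quick pre-release check.
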
# Node

- Edit `bindings/node/package.json` to reflect the new version.
- Edit `CHANGELOG.md`:
  - Add the relevant PRs that were merged (Python PRs do not belong here, for instance).
  - Add links at the end of the file.
- Go to [Releases](https://github.com/huggingface/tokenizers/releases)
- Create a new Release:
  - Mark it as pre-release.
  - Use the new version name with a new tag (create on publish) `node-vX.X.X`.
  - Copy-paste the new part of the `CHANGELOG.md`.
- ⚠️ Click on `Publish release`. This will start the whole process of building and uploading
  the new version to `npm`; there's no going back after this.
- Go to the [Actions](https://github.com/huggingface/tokenizers/actions) tab and check everything works smoothly.
- If anything fails, you need to fix the CI/CD to make it work again. Since your package was not uploaded to the repository properly, you can try again.

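As a small illustration of the `node-vX.X.X` naming, the tag can be derived directly from the `version` field of `package.json`, which avoids typing it by hand. This is only a sketch: `node_tag` is a hypothetical helper and the sample JSON is made up.

```python
import json


def node_tag(package_json_text):
    """Build the `node-vX.X.X` tag from the `version` field of a package.json."""
    version = json.loads(package_json_text)["version"]
    return "node-v" + version


# Hypothetical package.json content, for illustration only.
example = '{"name": "tokenizers", "version": "1.2.3"}'

print(node_tag(example))
```
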
# Testing the CI/CD for release

If you want to make modifications to the CI/CD of the release GH actions, you need
to:
- **Comment out the part that uploads the artifacts** to `crates.io`, `PyPI` or `npm`.
- Change the trigger mechanism so it can trigger every time you push to your branch.
- Keep pushing your changes until the artifacts are properly created.