4 Commits

Author SHA1 Message Date
bfd9cdeefb Perf improvement 16% by removing offsets. (#1587)
* [Breaking Change] Perf improvement 16% by removing offsets.

Offsets are always calculated in Python land.
Skipping that calculation cuts the runtime by 16%.

This is not the full gain, because offsets are
still calculated in bytes.

* Required features.

* Remove clippy error.

* Make it non breaking and still show perf improvement.

* Even faster without offsets.

* Update doc.

* Fmt.

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fmt.
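
To illustrate the offsets note in the first item above, here is a minimal sketch of the kind of per-token bookkeeping that offset tracking can involve: mapping byte offsets to character offsets for a UTF-8 string. This is an assumption about where such cost may come from, not the library's actual code; the function name `byte_to_char_offsets` and the sample input are hypothetical.

```python
def byte_to_char_offsets(text, byte_offsets):
    """Map (start, end) byte offsets in UTF-8 `text` to character offsets.

    Hypothetical helper for illustration only; not part of any library.
    """
    # Record, for every character boundary, which byte position it starts at.
    byte_to_char = {}
    byte_pos = 0
    for char_pos, ch in enumerate(text):
        byte_to_char[byte_pos] = char_pos
        byte_pos += len(ch.encode("utf-8"))
    # The end of the string is also a valid boundary.
    byte_to_char[byte_pos] = len(text)

    # Each token pays two lookups on top of the full pass over the text.
    return [(byte_to_char[start], byte_to_char[end]) for start, end in byte_offsets]


text = "héllo wörld"
# Byte spans of the two words: "é" and "ö" each take two bytes in UTF-8.
word_byte_offsets = [(0, 6), (7, 13)]
word_char_offsets = byte_to_char_offsets(text, word_byte_offsets)
```

A caller that only needs token ids can skip this pass entirely, which is the general idea behind dropping offsets for speed.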

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-08 14:56:13 +02:00
1df498a186 Fixing benchmark2. 2024-08-01 15:52:39 +02:00
c6f2c0b057 Fixing the benchmark. (#1583) 2024-08-01 10:36:53 +02:00
35f338a7b8 Add benchmark vs tiktoken (#1582)
* Adding a simple tiktoken benchmark.

* Adding 1 large fused document case.
2024-07-31 17:09:23 +02:00