Overview of quicktok

quicktok is a fast, exact BPE tokenizer written in C++. It is fully compatible with tiktoken and produces byte-identical token IDs. It runs 2 to 3.6 times faster than bpe-openai, the fastest known alternative, and 4 to 11 times faster than tiktoken itself.

Key Features

quicktok supports the cl100k, o200k, GPT-OSS, Llama-3, and Qwen2.5/3 encodings. It uses the same algorithm as bpe-openai, exact backtracking BPE, and adds data-structure optimizations that minimize memory accesses:

  • 2-byte trie: used for fast longest-match walks.
  • Dense caches: used to validate merges.
  • Hand-compiled pretokenizer: a purpose-built scanner that replaces a general-purpose regex engine.
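
To make the longest-match walk concrete, here is a minimal Python sketch of the idea on a plain byte-level trie. It is an illustration only, not quicktok's code: the real implementation is C++ and uses a 2-byte trie, and the names TrieNode, build_trie, longest_match, and the toy vocabulary are all hypothetical. The merge-validation and backtracking steps are not shown.

```python
from typing import Dict, Optional, Tuple


class TrieNode:
    """One node of a byte-level trie; token_id is set when a token ends here."""

    __slots__ = ("children", "token_id")

    def __init__(self) -> None:
        self.children: Dict[int, "TrieNode"] = {}
        self.token_id: Optional[int] = None


def build_trie(vocab: Dict[bytes, int]) -> TrieNode:
    """Insert every vocabulary token into the trie, one byte per edge."""
    root = TrieNode()
    for token, token_id in vocab.items():
        node = root
        for byte in token:
            if byte not in node.children:
                node.children[byte] = TrieNode()
            node = node.children[byte]
        node.token_id = token_id
    return root


def longest_match(root: TrieNode, data: bytes, start: int) -> Optional[Tuple[int, int]]:
    """Return (token_id, length) of the longest token starting at `start`, or None."""
    node = root
    best = None
    for offset in range(start, len(data)):
        node = node.children.get(data[offset])
        if node is None:
            break  # no vocabulary token continues with this byte
        if node.token_id is not None:
            best = (node.token_id, offset - start + 1)  # remember the longest hit so far
    return best


# Hypothetical toy vocabulary, not taken from any real encoding.
TOY_VOCAB = {b"a": 0, b"b": 1, b"ab": 2, b"abc": 3, b"c": 4}
TRIE = build_trie(TOY_VOCAB)

print(longest_match(TRIE, b"abcab", 0))  # (3, 3): "abc" is the longest token at offset 0
print(longest_match(TRIE, b"abcab", 3))  # (2, 2): "ab" is the longest token at offset 3
```

A single walk touches one node per input byte, which is why the layout of the trie in memory matters for throughput. Longest match alone does not guarantee a valid BPE encoding, which is where the merge validation mentioned above comes in.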

Performance Benchmarks

Benchmarks were run on an Apple M1 using a single thread. The table below reports throughput in MB/s on three datasets:

Encoder              The Pile     Code    Common Crawl
quicktok (native)       121.7    139.2            71.3
quicktok (Python)        77.9     83.6            49.7
bpe-openai               36.6     38.7            28.9
rs-bpe                   30.9     34.7            23.5
tiktoken-rs              15.4     13.8            13.3
tiktoken (Python)        13.6     12.8            12.3
TokenDagger              11.1     11.9            10.7

Each encoder is called through its own raw API. To reproduce the results, run make bench-compare in the repository.
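
The speedup factors can be derived from the table by dividing throughputs. The short Python sketch below does that arithmetic for the native build; the THROUGHPUT dictionary and the speedups helper are names made up for this illustration, and the numbers are copied from the table above.

```python
# Throughput in MB/s, copied from the benchmark table.
# Column order: The Pile, Code, Common Crawl.
THROUGHPUT = {
    "quicktok (native)": (121.7, 139.2, 71.3),
    "quicktok (Python)": (77.9, 83.6, 49.7),
    "bpe-openai": (36.6, 38.7, 28.9),
    "tiktoken (Python)": (13.6, 12.8, 12.3),
}


def speedups(fast: str, slow: str) -> list:
    """Per-dataset throughput ratio of `fast` over `slow`, rounded to one decimal."""
    return [round(f / s, 1) for f, s in zip(THROUGHPUT[fast], THROUGHPUT[slow])]


print(speedups("quicktok (native)", "bpe-openai"))         # [3.3, 3.6, 2.5]
print(speedups("quicktok (native)", "tiktoken (Python)"))  # [8.9, 10.9, 5.8]
```

The same helper can be applied to the quicktok (Python) row to see how much of the gain survives the Python binding.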

Conclusion

If you need faster tokenization without giving up tiktoken compatibility, quicktok is worth a look. Install it with pip install quicktok-v1, and explore the project on GitHub: quicktok Repo.