tiktokenx

Crates.io: tiktokenx
lib.rs: tiktokenx
version: 0.1.0
created_at: 2025-08-22 09:23:40.318139+00
updated_at: 2025-08-22 09:23:40.318139+00
description: A high-performance Rust implementation of OpenAI's tiktoken library
homepage: https://github.com/imumesh18/tiktokenx
repository: https://github.com/imumesh18/tiktokenx
max_upload_size:
id: 1806118
size: 131,542
Umesh Yadav (imumesh18)
documentation: https://docs.rs/tiktokenx

README

tiktokenx


Fast Rust implementation of OpenAI's tiktoken tokenizer.

Features

  • Drop-in replacement for Python tiktoken
  • All OpenAI models supported (GPT-4, GPT-5, o1, etc.)
  • Real vocabulary file loading with hash verification
  • Zero-copy operations for optimal performance
  • Comprehensive test suite (24 tests)

Installation

[dependencies]
tiktokenx = "0.1"

Usage

use tiktokenx::{encoding_for_model, get_encoding};

fn main() {
    // Get an encoding by name, then round-trip a string through it
    let enc = get_encoding("cl100k_base").unwrap();
    let tokens = enc.encode("hello world", &[], &[]).unwrap();
    let text = enc.decode(&tokens).unwrap();

    // Look up the encoding for a model and count the tokens in a string
    let enc = encoding_for_model("gpt-4").unwrap();
    let token_count = enc.encode("Hello, world!", &[], &[]).unwrap().len();
}

Supported Models

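| Model Family | Models                             | Encoding                 |
|--------------|------------------------------------|--------------------------|
| GPT-5        | gpt-5                              | o200k_base               |
| GPT-4        | gpt-4, gpt-4-turbo, gpt-4o         | cl100k_base / o200k_base |
| GPT-3.5      | gpt-3.5-turbo                      | cl100k_base              |
| o1           | o1, o1-mini, o1-preview            | o200k_base               |
| Legacy       | text-davinci-003, code-davinci-002 | p50k_base                |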
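
As a rough illustration of the table above, the model-to-encoding mapping can be written as a plain lookup. This is a standalone sketch, not tiktokenx's API: `encoding_name_for_model` is a hypothetical helper. The split of the GPT-4 row (gpt-4 and gpt-4-turbo on cl100k_base, gpt-4o on o200k_base) is an assumption taken from Python tiktoken's mapping, since the table lists both encodings without saying which model uses which.

```rust
/// Hypothetical helper (not part of tiktokenx): returns the encoding name
/// listed for a model in the table, or None if the model is not listed.
fn encoding_name_for_model(model: &str) -> Option<&'static str> {
    match model {
        // Assumption: gpt-4o is the GPT-4 family member on o200k_base.
        "gpt-5" | "gpt-4o" | "o1" | "o1-mini" | "o1-preview" => Some("o200k_base"),
        "gpt-4" | "gpt-4-turbo" | "gpt-3.5-turbo" => Some("cl100k_base"),
        "text-davinci-003" | "code-davinci-002" => Some("p50k_base"),
        _ => None,
    }
}

fn main() {
    for model in ["gpt-5", "gpt-4", "gpt-4o", "text-davinci-003", "unknown-model"] {
        match encoding_name_for_model(model) {
            Some(name) => println!("{model} -> {name}"),
            None => println!("{model} -> (not in table)"),
        }
    }
}
```

In real code, `encoding_for_model` from the Usage section is the call to use; the sketch only makes the table's content explicit.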

Performance

Benchmarks on Apple M1 Pro comparing tiktokenx vs Python tiktoken:

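| Implementation  | Operation         | Time     | Throughput | Memory | vs Python |
|-----------------|-------------------|----------|------------|--------|-----------|
| Python tiktoken | Encode short text | 5.7 μs   | 4.8 MiB/s  | 0.1 MB | 1.0x      |
| tiktokenx       | Encode short text | 4.1 μs   | 6.7 MiB/s  | 0.5 MB | 1.4x      |
| Python tiktoken | Encode long text  | 482.1 μs | 8.9 MiB/s  | 0.1 MB | 1.0x      |
| tiktokenx       | Encode long text  | 175.4 μs | 24.5 MiB/s | 2.0 MB | 2.7x      |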

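Averaged over the two benchmarks above, tiktokenx is about 2.1x faster (1.4x on short text, 2.7x on long text). It uses more memory than Python tiktoken in both cases: 0.5 MB and 2.0 MB versus 0.1 MB.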
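
The "vs Python" column and the average can be reproduced from the Time column. The sketch below only redoes the arithmetic on the numbers in the table; `speedup` and `mean` are illustrative helpers, not part of tiktokenx, and nothing here re-runs the benchmarks.

```rust
/// Speedup of a candidate over a baseline, given both times in the same unit.
fn speedup(baseline_time: f64, candidate_time: f64) -> f64 {
    baseline_time / candidate_time
}

/// Arithmetic mean of a non-empty slice.
fn mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

fn main() {
    // Times in microseconds, copied from the benchmark table.
    let short = speedup(5.7, 4.1);
    let long = speedup(482.1, 175.4);
    let average = mean(&[short, long]);
    println!("short: {short:.1}x, long: {long:.1}x, mean: {average:.1}x");

    // Memory in MB from the same table, as tiktokenx / Python tiktoken.
    let mem_short = 0.5 / 0.1;
    let mem_long = 2.0 / 0.1;
    println!("memory ratio: short {mem_short:.0}x, long {mem_long:.0}x");
}
```

The time ratios round to 1.4x and 2.7x, with a mean of about 2.1x. The memory ratios come out at 5x and 20x in Python tiktoken's favor.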

Development

# Run tests
cargo test

# Run benchmarks
cargo bench

# Check formatting
cargo fmt --check

# Run clippy
cargo clippy -- -D warnings

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

License

MIT
