| Crates.io | chonkier |
| lib.rs | chonkier |
| version | 0.0.2 |
| created_at | 2025-04-14 16:47:22.050875+00 |
| updated_at | 2025-04-24 12:03:34.70752+00 |
| description | 🦛 Chonkie, now in Rust 🦀: No-nonsense, ultra-fast, ultra-light chunking library |
| homepage | |
| repository | |
| max_upload_size | |
| id | 1633065 |
| size | 398,918 |

The no-nonsense, lightweight and fast chunking library that's ready to CHONK your text, in Rust 🦀!
Chonkie just got low-leveled! 🦀 Your favorite Python chunking library is now in Rust: even faster, smaller, and more reliable than ever!
🦀 Rusty & Reliable: Built with Rust for memory safety and performance.
🚀 Feature-rich: All the CHONKs you'd ever need
✨ Easy to use: Add Crate, Use Crate, CHONK
⚡ Blazingly Fast: CHONK at the speed of Rust! zooooom
🪶 Light-weight: No bloat, just CHONK
🦛 Cute CHONK mascot: psst it's a pygmy hippo btw
❤️ Moto Moto's favorite Rust library
ChonkieR is a chunking library that "just works" ✨
To add ChonkieR to your project, run:
cargo add chonkier # Or add it to your Cargo.toml
ChonkieR follows the rule of minimum dependencies. Features can be enabled via Cargo features.
Don't want to think about it? Simply enable all features (not recommended for production binaries unless you actually need them):
# Cargo.toml
[dependencies]
chonkier = { version = "0.0.2", features = ["all"] } # Replace with desired version
Here's a basic example to get you started:
use chonkier::CharacterTokenizer;
use chonkier::RecursiveChunker;
use chonkier::types::RecursiveRules;

fn main() {
    // Initialize the chunker
    let chunker = RecursiveChunker::new(CharacterTokenizer::new(), 512, RecursiveRules::default());

    // Chunk some text (the element type is inferred, so the chunk type needs no import)
    let text = "ChonkieR is the goodest boi! My favorite chunking hippo hehe.";
    let chunks = chunker.chunk(text);

    // Access chunks
    for chunk in chunks {
        println!("Chunk: {}", chunk.text);
        println!("Tokens: {}", chunk.token_count);
    }
}
Check out more usage examples in the examples folder!
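Curious what a recursive chunker does under the hood? Here is a minimal, dependency-free sketch of the general idea. It is not ChonkieR's actual implementation: `Chunk`, `count_tokens`, `recursive_chunk`, and `split_into` are hypothetical names made up for this illustration, and "tokens" are simply counted as Unicode characters, roughly what a character-level tokenizer would report. The text is split on the coarsest separator first, neighbouring pieces are greedily merged while they fit, and anything still too big is split again with the next separator.

```rust
/// A piece of text plus its size in "tokens" (here: Unicode characters).
/// Hypothetical type for this sketch, not the crate's chunk type.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub text: String,
    pub token_count: usize,
}

/// Count tokens as a character-level tokenizer might: one per `char`.
fn count_tokens(text: &str) -> usize {
    text.chars().count()
}

/// Split `text` so that no chunk exceeds `chunk_size` tokens.
/// `separators` must be non-empty strings, ordered from coarsest to finest.
pub fn recursive_chunk(text: &str, chunk_size: usize, separators: &[&str]) -> Vec<Chunk> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut out = Vec::new();
    split_into(text, chunk_size, separators, &mut out);
    out
}

fn split_into(text: &str, chunk_size: usize, separators: &[&str], out: &mut Vec<Chunk>) {
    if text.is_empty() {
        return;
    }
    let tokens = count_tokens(text);
    // Base case: the text already fits, so emit it as a single chunk.
    if tokens <= chunk_size {
        out.push(Chunk { text: text.to_string(), token_count: tokens });
        return;
    }
    match separators.split_first() {
        // No separators left: fall back to fixed-size windows of characters.
        None => {
            let chars: Vec<char> = text.chars().collect();
            for window in chars.chunks(chunk_size) {
                out.push(Chunk {
                    text: window.iter().collect(),
                    token_count: window.len(),
                });
            }
        }
        Some((sep, finer)) => {
            // `split_inclusive` keeps each separator attached to the piece
            // before it, so concatenating all chunks reproduces the input.
            let mut buffer = String::new();
            for piece in text.split_inclusive(*sep) {
                // Flush the buffer when adding this piece would overflow it.
                if !buffer.is_empty()
                    && count_tokens(&buffer) + count_tokens(piece) > chunk_size
                {
                    split_into(&buffer, chunk_size, finer, out);
                    buffer.clear();
                }
                buffer.push_str(piece);
            }
            // Flush whatever is left; oversized leftovers recurse with finer separators.
            split_into(&buffer, chunk_size, finer, out);
        }
    }
}
```

Calling `recursive_chunk(text, 16, &["! ", " "])` on the sample sentence above would first try to break at `"! "` and only fall back to spaces (and finally raw character windows) for pieces that are still longer than 16 characters. The real crate presumably uses richer rules, which is what `RecursiveRules` configures.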
ChonkieR currently supports the following chunkers:
ChonkieR would like to CHONK its way through a special thanks to all the users and contributors who have helped make this library what it is today! Your feedback, issue reports, and improvements have helped make ChonkieR the CHONKIEST it can be.
And of course, special thanks to Moto Moto for endorsing ChonkieR with his famous quote:
"I like them big, I like them chonkieR." ~ Moto Moto (He really said this)
If you use ChonkieR in your research, please cite it as follows:
@software{chonkie2025,
  author = {Minhas, Bhavnick AND Nigam, Shreyash},
  title = {Chonkie: A no-nonsense fast, lightweight, and efficient text chunking library},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/chonkie-inc/chonkie}},
}