| Crates.io | kll-rs |
| lib.rs | kll-rs |
| version | 0.1.4 |
| created_at | 2025-08-29 11:39:49.973101+00 |
| updated_at | 2025-08-29 12:49:50.766676+00 |
| description | KLL quantiles sketch from Apache DataSketches for Rust |
| homepage | https://github.com/homeffjy/kll-rs |
| repository | https://github.com/homeffjy/kll-rs |
| max_upload_size | |
| id | 1815746 |
| size | 93,629 |
A high-performance Rust wrapper for Apache DataSketches KLL (Karp, Luby, Lamport) quantile sketches. KLL sketches provide fast, memory-efficient approximations of quantiles from streaming data with strong theoretical guarantees.
Add this to your Cargo.toml:
[dependencies]
kll-rs = { git = "https://github.com/homeffjy/kll-rs" }
use kll_rs::KllDoubleSketch;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut sketch = KllDoubleSketch::new()?;
// Add streaming data
for i in 1..=100_000 {
sketch.update(i as f64);
}
// Query quantiles
let median = sketch.get_quantile(0.5); // 50th percentile
let p95 = sketch.get_quantile(0.95); // 95th percentile
let p99 = sketch.get_quantile(0.99); // 99th percentile
println!("Median: {:.1}, P95: {:.1}, P99: {:.1}", median, p95, p99);
// Get multiple quantiles efficiently
let quantiles = sketch.get_quantiles(&[0.25, 0.5, 0.75, 0.95, 0.99]);
println!("Quantiles: {:?}", quantiles);
// Get rank (cumulative distribution)
let rank = sketch.get_rank(75000.0);
println!("Rank of 75000: {:.3}", rank);
Ok(())
}
use kll_rs::KllDoubleSketch;
// Create sketch with custom k parameter (higher k = better accuracy, more memory)
let mut sketch = KllDoubleSketch::new_with_k(256)?;
// Process data
for value in data_stream {
sketch.update(value);
}
// Check sketch properties
println!("Processed {} values", sketch.get_n());
println!("Memory usage: {} retained values", sketch.get_num_retained());
println!("Estimation mode: {}", sketch.is_estimation_mode());
let mut sketch1 = KllDoubleSketch::new()?;
let mut sketch2 = KllDoubleSketch::new()?;
// Process data in parallel
sketch1.update(1.0);
sketch2.update(2.0);
// Merge sketches
sketch1.merge(&sketch2)?;
use kll_rs::KllDoubleSketch;
// Create and populate sketch
let mut sketch = KllDoubleSketch::new()?;
for i in 0..10000 {
sketch.update(i as f64);
}
// Serialize to bytes
let bytes = sketch.serialize()?;
// Deserialize from bytes
let restored_sketch = KllDoubleSketch::deserialize(&bytes)?;
// Serde support (JSON, MessagePack, etc.)
let json = serde_json::to_string(&sketch)?;
let from_json: KllDoubleSketch = serde_json::from_str(&json)?;
| Method | Description |
|---|---|
new() |
Create sketch with default parameters (k=200) |
new_with_k(k) |
Create sketch with custom k parameter (k ≥ 8) |
update(value) |
Add a value to the sketch |
merge(other) |
Merge another sketch into this one |
get_quantile(fraction) |
Get quantile for fraction ∈ [0,1] |
get_quantiles(fractions) |
Get multiple quantiles efficiently |
get_rank(value) |
Get rank (CDF) of a value |
get_n() |
Total number of values processed |
get_num_retained() |
Number of values retained in memory |
is_estimation_mode() |
Whether sketch is in estimation mode |
serialize() |
Serialize to bytes |
deserialize(bytes) |
Deserialize from bytes |
This library includes comprehensive benchmarks to evaluate performance characteristics:
# Run all benchmarks
cargo bench --bench kll_double_benchmark
# View detailed HTML reports
open target/criterion/report/index.html
The benchmarks test maximum-scale scenarios to evaluate performance limits:
| Operation | Data Size | Performance |
|---|---|---|
| Sketch Creation | k=256 | ~50ns per sketch |
| Updates | 100K values | ~15ns per update |
| Quantile Query | 100K values | ~100ns per query |
| Multiple Quantiles | 7 quantiles from 100K values | ~500ns |
| Serialization | 100K values | ~50μs |
| Deserialization | 100K values | ~45μs |
| Merge Operation | 2×50K values | ~20μs |
| Clone Operation | 50K values | ~50μs |
KLL sketches use memory efficiently:
KLL sketches provide theoretical guarantees on approximation error:
For most applications, k=200 (default) provides excellent accuracy. Use higher k values for applications requiring maximum precision.
git clone https://github.com/homeffjy/kll-rs.git
cd kll-rs
cargo build --release