| Crates.io | ultraloglog |
| lib.rs | ultraloglog |
| version | 0.1.6 |
| created_at | 2025-01-31 14:32:10.926166+00 |
| updated_at | 2025-10-26 07:26:15.352592+00 |
| description | Rust implementation of the UltraLogLog algorithm |
| homepage | |
| repository | https://github.com/waynexia/ultraloglog |
| max_upload_size | |
| id | 1537567 |
| size | 109,596 |
Rust implementation of the UltraLogLog algorithm. UltraLogLog is more space-efficient than the widely used HyperLogLog, but can be slower. Either the FGRA estimator or the maximum-likelihood (MLE) estimator can be used.
use ultraloglog::{Estimator, MaximumLikelihoodEstimator, OptimalFGRAEstimator, UltraLogLog};
let mut ull = UltraLogLog::new(6).unwrap();
ull.add_value("apple")
    .add_value("banana")
    .add_value("cherry")
    .add_value("033");
let est = ull.get_distinct_count_estimate();
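The constructor argument above is assumed to be the precision p, as the comment in the Python example further down suggests. A rough sketch of how p translates into sketch size follows. It assumes 2^p one-byte registers for UltraLogLog, as described in the paper, and the common 6-bit-register layout for HyperLogLog. The helper names are hypothetical and not part of this crate, and any serialization overhead is ignored.

```python
def register_count(p: int) -> int:
    """Number of registers for precision p (m = 2**p)."""
    if p < 0:
        raise ValueError("precision must be non-negative")
    return 1 << p


def ull_state_bytes(p: int) -> int:
    """Assumed UltraLogLog state size: one 8-bit register per bucket."""
    return register_count(p)


def hll_state_bytes(p: int, bits_per_register: int = 6) -> int:
    """HyperLogLog state size for densely packed registers (6 bits is an assumption)."""
    return (register_count(p) * bits_per_register + 7) // 8


for p in (5, 6, 12):
    print(f"p={p}: ULL {ull_state_bytes(p)} bytes, HLL {hll_state_bytes(p)} bytes")
```

At equal p the UltraLogLog state is larger than a 6-bit HyperLogLog state. The space-efficiency claim concerns the memory needed to reach a given estimation error, not the register count.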
Enable the serde feature to save a sketch to disk and load it back later.
use ultraloglog::{Estimator, MaximumLikelihoodEstimator, OptimalFGRAEstimator, UltraLogLog};
use std::fs::{remove_file, File};
use std::io::{BufReader, BufWriter};
let file_path = "test_ultraloglog.bin";
// Create UltraLogLog and add data
let mut ull = UltraLogLog::new(5).expect("Failed to create ULL");
ull.add(123456789);
ull.add(987654321);
let original_estimate = ull.get_distinct_count_estimate();
// Save to file using writer
let file = File::create(file_path).expect("Failed to create file");
let writer = BufWriter::new(file);
ull.save(writer).expect("Failed to save UltraLogLog");
// Load from file using reader
let file = File::open(file_path).expect("Failed to open file");
let reader = BufReader::new(file);
let loaded_ull = UltraLogLog::load(reader).expect("Failed to load UltraLogLog");
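let loaded_estimate = loaded_ull.get_distinct_count_estimate();
// The loaded sketch should reproduce the original estimate
assert_eq!(original_estimate, loaded_estimate);
// Clean up the temporary file
remove_file(file_path).expect("Failed to remove file");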
This crate also provides Python bindings for the UltraLogLog algorithm using PyO3. See example.py for usage.
import ultraloglog
# Create a new UltraLogLog sketch
ull = ultraloglog.PyUltraLogLog(12) # precision parameter
# Add values
ull.add_str("hello")
ull.add_int(42)
ull.add_float(3.14)
# Get estimated count
print(f"Estimated distinct count: {ull.count()}")
This package is available as ultraloglog in PyPI. You can install it using:
pip install ultraloglog
uv is recommended to manage virtual environments.
To build from source, install Rust and maturin: pip install maturin
Then build and install into the active environment: maturin develop --release
As mentioned in the paper, a high-quality 64-bit hash function is key to the UltraLogLog algorithm. We tested several modern 64-bit hash libraries and found that xxhash-rust (the default) and wyhash-rs worked well. Users can also easily replace the default xxhash-rust with polymurhash, komihash, ahash, t1ha, and others. See the testing section for details.
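To illustrate why hash quality matters, the sketch below compares how a well-mixed 64-bit hash and a deliberately poor one spread keys over registers. It is a minimal illustration under stated assumptions, not this crate's implementation. BLAKE2b truncated to 8 bytes stands in for xxhash-rust, the top p bits of the hash are assumed to select the register (a common convention in HyperLogLog-style sketches), and all function names are hypothetical.

```python
import hashlib


def hash64(value: bytes) -> int:
    """Well-mixed 64-bit hash: BLAKE2b truncated to 8 bytes (stand-in for xxhash)."""
    return int.from_bytes(hashlib.blake2b(value, digest_size=8).digest(), "big")


def identity64(value: int) -> int:
    """Deliberately poor 'hash': the integer itself, masked to 64 bits."""
    return value & 0xFFFFFFFFFFFFFFFF


def register_index(h: int, p: int) -> int:
    """Assumed layout: the top p bits of the hash pick one of 2**p registers."""
    return h >> (64 - p)


def occupied_registers(hashes, p: int) -> int:
    """Count how many distinct registers a stream of hashes touches."""
    return len({register_index(h, p) for h in hashes})


p = 6
keys = range(1000)
good = occupied_registers((hash64(str(k).encode()) for k in keys), p)
bad = occupied_registers((identity64(k) for k in keys), p)
print(f"well-mixed hash touches {good} of {1 << p} registers")
print(f"identity hash touches {bad} of {1 << p} registers")
```

With the identity hash, small integers have all-zero high bits and land in a single register, so the sketch cannot tell them apart from a handful of values. A well-mixed hash spreads the same keys across the registers.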
Ertl, O., 2024. UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting. Proceedings of the VLDB Endowment, 17(7), pp.1655-1668.