Crates.io | hyperbitbit |
lib.rs | hyperbitbit |
version | 0.0.1-alpha.2 |
source | src |
created_at | 2020-12-19 00:38:05.930404 |
updated_at | 2020-12-24 04:20:21.219683 |
description | Implementation of HyperBitBit data structure |
homepage | https://github.com/jettify/hyperbitbit |
repository | https://github.com/jettify/hyperbitbit.git |
max_upload_size | |
id | 324486 |
size | 27,281 |
A native rust implementation of a HyperBitBit algorithm introduced by by Robert Sedgewick in AC11-Cardinality.pdf
Consider HyperLogLog variants for productions usage, sine this data structure extensively studied, merge able and more accurate. HyperBitBit is extremely cheap and fast alternative in expense of accuracy.
This crate is on crates.io and
can be used by adding hyperbitbit
to the dependencies in your
project's Cargo.toml
.
[dependencies]
hyperbitbit = "0.0.1-alpha.1"
If you want serde support, include the feature like this:
[dependencies]
hyperbitbit = { version = "0.0.1-alpha.1", features = ["serde_support"] }
add this to your crate root:
extern crate hyperbitbit;
Create a HyperBitBit, add more data and estimate cardinality
use hyperbitbit::HyperBitBit;
use rand::distributions::Alphanumeric;
use rand::prelude::*;
use std::collections::HashSet;
fn main() {
// create hbb for cardinality estimation
let mut hbb = HyperBitBit::new();
// hash set to measure exact cardinality
let mut items = HashSet::new();
// fixed seed for RNG for reproducibly results
let mut rng = StdRng::seed_from_u64(42);
// put bunch of random strings on hbb and hash set for comparison
let maxn = 10000;
for _ in 1..maxn {
let s = (&mut rng).sample_iter(&Alphanumeric).take(4).collect::<String>();
hbb.insert(&s);
items.insert(s);
}
let expected: i64 = items.len() as i64;
let rel: f64 = (100.0 * (expected - hbb.cardinality() as i64) as f64) / (expected as f64);
println!("Actuals cardinality: {:?}", expected);
println!("Estimated cardinality: {:?}", hbb.cardinality());
println!("Error % cardinality: {:.2}", rel);
}
Licensed under the Apache License, Version 2.0