| Crates.io | hyperloglockless |
| lib.rs | hyperloglockless |
| version | 0.3.1 |
| created_at | 2025-06-10 05:50:39.317565+00 |
| updated_at | 2025-07-26 02:11:57.884478+00 |
| description | High-performance HyperLogLog with bias correction and full concurrency support. |
| homepage | https://github.com/tomtomwombat/hyperloglockless/ |
| repository | https://github.com/tomtomwombat/hyperloglockless/ |
| max_upload_size | |
| id | 1706686 |
| size | 46,497 |
High-performance HyperLogLog with bias correction and full concurrency support. Used for accurate and space-efficient cardinality estimation.
HyperLogLogs are space efficient data structures for the "count-distinct problem", approximating the number of distinct elements in a multiset. Paper.
hyperloglockless offers a lockless concurrent HyperLogLog and a single threaded counterpart. They're simpler, faster, and more accurate than other HyperLogLog implementations:
AtomicHyperLogLog is a drop-in replacement for RwLock<OtherHyperLogLog>: all methods take &self, so you can wrap it in Arc and update it concurrently without &mut.[dependencies]
hyperloglockless = "0.3.0"
A HyperLogLog with precision p uses 2^p bytes of memory and has an error % of roughly 104 / sqrt(2^p).
use hyperloglockless::HyperLogLog;
let precision = hyperloglockless::precision_for_error(0.01); // 1% error
assert_eq!(precision, 14);
let mut hll = HyperLogLog::new(precision);
hll.insert(&'๐ฆ');
hll.insert_all('a'..='z');
let count = hll.count(); // ~27
assert_eq!(hll.len(), 1 << precision); // 16384 bytes
Full concurrency support: AtomicHyperLogLog is a drop-in replacement for RwLock<OtherHyperLogLog>: all methods take &self.
use hyperloglockless::AtomicHyperLogLog;
let hll = AtomicHyperLogLog::new(14);
hll.insert(&'๐ฆ');
hll.insert_all('a'..='z');
Both hyperloglockless::HyperLogLog and hyperloglockless::AtomicHyperLogLog are extremely fast for both insert and count calls.
hyperloglockless::AtomicHyperLogLog does not require any locking and therefore avoids thread contention.
hyperloglockless stays consistently accurate while other implementations break down after millions of items.
rand - Enabled by default, this has the DefaultHasher source its random state using thread_rng() instead of hardware sources. Getting entropy from a user-space source is considerably faster, but requires additional dependencies to achieve this. Disabling this feature by using default-features = false makes DefaultHasher source its entropy using getrandom, which will have a much simpler code footprint at the expense of speed.serde - HyperLogLogs implement Serialize and Deserialize when possible.loom - AtomicHyperLogLogs use loom atomics, making it compatible with loom testing.Licensed under either of
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.