probably

Crates.ioprobably
lib.rsprobably
version0.3.1
sourcesrc
created_at2020-10-31 23:10:58.320421
updated_at2020-12-03 19:17:38.535327
descriptionA library with various approximate computing algorithms
homepagehttps://github.com/aeshirey/probably/
repositoryhttps://github.com/aeshirey/probably/
max_upload_size
id307355
size202,071
Adam Shirey (aeshirey)

documentation

README

Approximate computing

Approximate computing "is a computation technique which returns a possibly inaccurate result rather than a guaranteed accurate result," which "can provide disproportionate gains in performance and energy, while still achieving acceptable result accuracy."

This library is a collection of several approximate computing algorithms written in Rust. The goals are:

  • Ease of use. For maximum adoption, it should be straightforward to make use of AC algorithms with very little effort.
  • Parallelization. When possible, algorithms should be capable of being parallelized (ie, map-reduce of the core algorithm). This includes serialization of datasets.
  • Common dependencies. Minimize the dependency chain to reduce build times and binary sizes.

NOTE that I did not write any of these algorithms - they are all implemented by other talented Rustaceans, the repositories for which are linked below. I have, however, added some functionality (serialization/deserialization, exposing new initialization functions, etc.), and I did implement the Rust version from C#. All source code is MIT-licensed.

Using

To use probably in your Cargo.toml:

probably = "0.2.0"

To include serialization of data structures, include the with_serde feature:

probably = { version = "0.2.0", features = ["with_serde"] }

The algorithms implemented in this library and on the roadmap fall into one of several categories:

Cardinality/frequency

These algorithms approxmate counting the number of items in a set within some approximate acceptable error.

Membership

These algorithms determine if an item n exists in the set N, guaranteeing no false negatives at the cost of some false positives.

  • Bloom Filter -- implemented from bloom_filter
  • Quotient filter -- like Bloom filter, but can be merged and resized more efficiently. Uses 10-25% more space than BF.
  • Cuckoo filter -- implemented from rust-cuckoofilter. like Bloom filter, but can delete items. Can use lower space overhead than BF.

Quantile Estimators

Other

  • MinHash for estimating similarity of two sets

Additional reading

Commit count: 19

cargo fmt