heavykeeper

created_at: 2024-06-08 09:53:53.793189
updated_at: 2024-11-30 14:37:37.189701
description: A library for finding the top-k elements of a stream of data.
repository: https://github.com/pmcgleenon/heavykeeper-rs
Patrick McGleenon (pmcgleenon)

heavykeeper-rs

A Rust implementation of the HeavyKeeper algorithm for finding top-k elephant flows.

This crate is based on the paper "HeavyKeeper: An Accurate Algorithm for Finding Top-k Elephant Flows" by Junzhi Gong, Tong Yang, Haowei Zhang, and Hao Li, Peking University; Steve Uhlig, Queen Mary, University of London; Shigang Chen, University of Florida; Lorna Uden, Staffordshire University; Xiaoming Li, Peking University.
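
The core idea in the paper is count-with-exponential-decay: each bucket stores a fingerprint and a counter, and a colliding item decrements the counter with a probability that shrinks as the counter grows, so large flows are hard to evict. The following is a minimal, self-contained sketch of that idea, not this crate's implementation. `HeavyKeeperSketch`, its methods and the xorshift generator are hypothetical names made up for illustration, the top-k heap is omitted, and the decay probability is taken to be `decay^count` with `decay` between 0 and 1 (an assumption about how the parameter is interpreted).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One bucket: the fingerprint of the flow it tracks and that flow's count.
#[derive(Clone, Copy, Default)]
struct Bucket {
    fingerprint: u64,
    count: u32,
}

/// Illustrative HeavyKeeper-style sketch (hypothetical; not the crate's `TopK`).
struct HeavyKeeperSketch {
    width: usize,
    decay: f64,
    rows: Vec<Vec<Bucket>>,
    rng_state: u64,
}

impl HeavyKeeperSketch {
    fn new(width: usize, depth: usize, decay: f64) -> Self {
        Self {
            width,
            decay,
            rows: vec![vec![Bucket::default(); width]; depth],
            rng_state: 0x9E37_79B9_7F4A_7C15,
        }
    }

    /// Seeded hash: the seed selects the row, or the fingerprint function.
    fn hash(item: &[u8], seed: u64) -> u64 {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        item.hash(&mut h);
        h.finish()
    }

    /// Small xorshift generator returning a float in [0, 1).
    fn next_f64(&mut self) -> f64 {
        let mut x = self.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.rng_state = x;
        (x >> 11) as f64 / (1u64 << 53) as f64
    }

    /// Record one occurrence of `item` and return its estimated count.
    fn add(&mut self, item: &[u8]) -> u32 {
        let fp = Self::hash(item, u64::MAX);
        let mut estimate = 0;
        for row in 0..self.rows.len() {
            let idx = (Self::hash(item, row as u64) % self.width as u64) as usize;
            let mut b = self.rows[row][idx];
            if b.count == 0 {
                // Empty bucket: claim it.
                b = Bucket { fingerprint: fp, count: 1 };
            } else if b.fingerprint == fp {
                // Same flow: plain increment.
                b.count += 1;
            } else if self.next_f64() < self.decay.powi(b.count as i32) {
                // Different flow: decay with probability decay^count,
                // and take over the bucket if the count reaches zero.
                b.count -= 1;
                if b.count == 0 {
                    b = Bucket { fingerprint: fp, count: 1 };
                }
            }
            self.rows[row][idx] = b;
            if b.fingerprint == fp {
                estimate = estimate.max(b.count);
            }
        }
        estimate
    }

    /// Largest count among the buckets that currently hold `item`.
    fn estimate(&self, item: &[u8]) -> u32 {
        let fp = Self::hash(item, u64::MAX);
        self.rows
            .iter()
            .enumerate()
            .map(|(row, buckets)| {
                buckets[(Self::hash(item, row as u64) % self.width as u64) as usize]
            })
            .filter(|b| b.count > 0 && b.fingerprint == fp)
            .map(|b| b.count)
            .max()
            .unwrap_or(0)
    }
}

fn main() {
    // Illustrative parameters: width 1024, depth 2, decay 0.95.
    let mut sketch = HeavyKeeperSketch::new(1024, 2, 0.95);
    for i in 0..10_000u32 {
        sketch.add(b"elephant");
        if i % 100 == 0 {
            sketch.add(format!("mouse-{}", i).as_bytes());
        }
    }
    println!("elephant ~ {}", sketch.estimate(b"elephant"));
}
```

Because a counter only grows when the fingerprint matches, an estimate from this sketch does not exceed the true count unless two items share a fingerprint.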

Example

See ip_files.rs for an example of how to use the library to count the top-k IP flows in a file.

A sample usage is as follows:

    use heavykeeper::TopK;

    // create a new TopK (k, width, depth and decay are chosen by the caller)
    let mut topk = TopK::new(k, width, depth, decay);

    // add some items
    topk.add(item);

    // check the counts
    for node in topk.list() {
        println!("{} {}", String::from_utf8_lossy(&node.item), node.count);
    }

Other Implementations

Name                        Language  GitHub Repo
SegmentIO                   Go        https://github.com/segmentio/topk
Aegis                       Go        https://github.com/go-kratos/aegis/blob/main/topk/heavykeeper.go
Tomasz Kolaj                Go        https://github.com/migotom/heavykeeper
HeavyKeeper Paper           C++       https://github.com/papergitkeeper/heavy-keeper-project
Jigsaw-Sketch               C++       https://github.com/duyang92/jigsaw-sketch-paper/tree/main/CPU/HeavyKeeper
Redis Bloom Heavykeeper     C         https://github.com/RedisBloom/RedisBloom/blob/master/src/topk.c
Count-Min-Sketch            Rust      https://github.com/alecmocatta/streaming_algorithms
Sliding Window HeavyKeeper  Go        https://github.com/keilerkonzept/topk

Running

main.rs contains an example driver program that can be used as a word count program.

Usage:

    cargo build --release
    target/release/heavykeeper -k 10 -w 8192 -d 2 -y 0.95 -f data/war_and_peace.txt
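
To sanity-check the approximate word counts on a small input, an exact count can serve as a baseline. The sketch below is one way to do that with only the standard library. `exact_top_k` is a hypothetical helper, not part of this crate, and splitting on whitespace is an assumption: the driver may tokenise its input differently.

```rust
use std::collections::HashMap;

/// Count every whitespace-separated word exactly and return the k most frequent.
/// Hypothetical helper for comparison purposes; not part of the heavykeeper crate.
fn exact_top_k(text: &str, k: usize) -> Vec<(String, u64)> {
    let mut counts: HashMap<&str, u64> = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }

    let mut entries: Vec<(String, u64)> = counts
        .into_iter()
        .map(|(word, count)| (word.to_string(), count))
        .collect();

    // Sort by descending count, breaking ties alphabetically for stable output.
    entries.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    entries.truncate(k);
    entries
}

fn main() {
    let text = "the cat and the dog and the bird";
    // Prints "the 3" and then "and 2".
    for (word, count) in exact_top_k(text, 2) {
        println!("{} {}", word, count);
    }
}
```

Exact counting keeps one map entry per distinct word, so it is only practical as a check on inputs that fit in memory.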

Building the example

    cargo build --examples --release
    target/release/examples/ip_files

License

This project is dual-licensed under the Apache and MIT licenses.
