Crates.io | heavykeeper |
lib.rs | heavykeeper |
version | 0.2.2 |
source | src |
created_at | 2024-06-08 09:53:53.793189 |
updated_at | 2024-06-19 20:47:53.375902 |
description | A library for finding the top-k elements of a stream of data. |
homepage | |
repository | https://github.com/pmcgleenon/heavykeeper-rs |
max_upload_size | |
id | 1265636 |
size | 64,330 |
Top-K Heavykeeper algorithm for Top-K elephant flows
This is based on the paper "HeavyKeeper: An Accurate Algorithm for Finding Top-k Elephant Flows" by Junzhi Gong, Tong Yang, Haowei Zhang, and Hao Li, Peking University; Steve Uhlig, Queen Mary, University of London; Shigang Chen, University of Florida; Lorna Uden, Staffordshire University; Xiaoming Li, Peking University.
See ip_files.rs for an example of how to use the library to count the top-k IP flows in a file.
A sample usage is as follows:
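// create a new TopK that tracks the k most frequent items;
// width and depth set the size of the sketch, decay is the decay factor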
let mut topk = TopK::new(k, width, depth, decay);
// add some items
topk.add(item);
// check the counts
for node in topk.list() {
println!("{} {}", String::from_utf8_lossy(&node.item), node.count);
}
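To make the roles of width, depth and decay more concrete, here is a minimal, dependency-free sketch of the count-with-exponential-decay update described in the HeavyKeeper paper. It is an illustration only, not the crate's implementation: the `Sketch` and `Bucket` types, the use of `DefaultHasher`, and the small xorshift generator are all assumptions made for this example. It also assumes that decay is a value in (0, 1) and that a colliding bucket is decremented with probability decay^count (the paper writes this as b^-C with b slightly above 1). The min-heap that keeps the k largest items is left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One bucket: a fingerprint of the flow it currently holds, plus a counter.
#[derive(Clone, Copy, Default)]
struct Bucket {
    fingerprint: u64,
    count: u32,
}

/// Hypothetical illustration type; this is NOT the crate's `TopK`.
struct Sketch {
    width: usize,
    decay: f64,
    rows: Vec<Vec<Bucket>>, // `depth` rows of `width` buckets each
    rng_state: u64,         // state of a tiny xorshift64 generator
}

impl Sketch {
    fn new(width: usize, depth: usize, decay: f64) -> Self {
        Sketch {
            width,
            decay,
            rows: vec![vec![Bucket::default(); width]; depth],
            rng_state: 0x9E37_79B9_7F4A_7C15,
        }
    }

    /// Hash `item` together with a seed, so each row gets its own hash function.
    fn hash(item: &[u8], seed: u64) -> u64 {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        item.hash(&mut hasher);
        hasher.finish()
    }

    /// Pseudo-random value in [0, 1) from xorshift64.
    fn next_f64(&mut self) -> f64 {
        let mut x = self.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.rng_state = x;
        (x >> 11) as f64 / (1u64 << 53) as f64
    }

    /// Record one occurrence of `item` and return its estimated count.
    fn add(&mut self, item: &[u8]) -> u32 {
        let fingerprint = Self::hash(item, u64::MAX);
        let mut estimate = 0u32;
        for row in 0..self.rows.len() {
            let idx = (Self::hash(item, row as u64) % self.width as u64) as usize;
            let mut bucket = self.rows[row][idx];
            if bucket.count == 0 {
                // Empty bucket: claim it.
                bucket = Bucket { fingerprint, count: 1 };
            } else if bucket.fingerprint == fingerprint {
                // Same flow: plain increment.
                bucket.count += 1;
            } else if self.next_f64() < self.decay.powi(bucket.count as i32) {
                // Different flow: decay the holder; large counts rarely decay.
                bucket.count -= 1;
                if bucket.count == 0 {
                    bucket = Bucket { fingerprint, count: 1 };
                }
            }
            self.rows[row][idx] = bucket;
            if bucket.fingerprint == fingerprint {
                estimate = estimate.max(bucket.count);
            }
        }
        estimate
    }
}
```

Because the decrement probability shrinks exponentially with the stored count, a bucket held by a heavy flow is almost never taken over by occasional small flows, while small flows evict each other quickly. A larger width or depth reduces collisions at the cost of memory.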
Related work: "Jigsaw-Sketch: a fast and accurate algorithm for finding top-k elephant flows in high-speed networks" by Boyu ZHANG, He HUANG, Yu-E SUN, Yang DU & Dan WANG
An example driver program, which can be used as a word count program, can be found at main.rs.
Usage:
cargo build --release
target/release/heavykeeper -k 10 -w 8192 -d 2 -y 0.95 -f data/war_and_peace.txt
cargo build --examples --release
target/release/examples/ip_files
This project is dual-licensed under the Apache and MIT licenses.