samplr

version: 0.1.0
source: src
created_at: 2021-11-28 19:46:35.246003
updated_at: 2021-11-28 19:46:35.246003
description: CLI tool to randomly sample data
homepage: https://github.com/SteadBytes/samplr
repository: https://github.com/SteadBytes/samplr
id: 488942
size: 19,227
author: Ben Steadman (SteadBytes)

samplr

samplr is a CLI tool for randomly sampling data: it generates a fixed-size sample of input lines, with every line equally likely to be selected.

Installation

Source

Requires Rust to be installed.

git clone https://github.com/SteadBytes/samplr.git
cd samplr
cargo install --path .

Examples

Sample 15 lines from a file:

sample -n 15 things.txt

Sample 15 lines from standard input:

sample -n 15 < things.txt

Sample 15 lines from multiple files:

sample -n 15 things.txt other_things.txt

Sampling Algorithm

samplr uses a reservoir sampling algorithm to generate fixed-size samples from an input stream of unknown length. For more details, see the implementation and the linked blog article.
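To illustrate the idea, here is a minimal sketch of one common reservoir sampling variant (often called Algorithm R). It is not taken from samplr's source, and the tool may use a different variant. The names `reservoir_sample` and `XorShift64` are hypothetical, and the tiny generator exists only so the sketch has no external dependencies; a real tool would more likely use the `rand` crate.

```rust
/// Tiny xorshift64 generator, used only to keep this sketch dependency-free.
/// (Assumption for illustration; not samplr's actual random source.)
struct XorShift64(u64);

impl XorShift64 {
    /// Advance the generator. The seed must be non-zero.
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Roughly uniform integer in `0..bound` (modulo bias ignored for brevity).
    fn below(&mut self, bound: u64) -> u64 {
        self.next_u64() % bound
    }
}

/// Algorithm R: keep the first `n` items, then let item `i` (0-based)
/// replace a random reservoir slot with probability n / (i + 1).
/// The input is consumed in one pass and its length is never needed.
fn reservoir_sample<T, I>(items: I, n: usize, rng: &mut XorShift64) -> Vec<T>
where
    I: IntoIterator<Item = T>,
{
    let mut reservoir = Vec::with_capacity(n);
    for (i, item) in items.into_iter().enumerate() {
        if i < n {
            // Fill phase: the first n items always enter the reservoir.
            reservoir.push(item);
        } else {
            // Pick j uniformly from 0..=i; the item is kept only if j < n.
            let j = rng.below(i as u64 + 1) as usize;
            if j < n {
                reservoir[j] = item;
            }
        }
    }
    reservoir
}

fn main() {
    let mut rng = XorShift64(0x2545_F491_4F6C_DD1D);
    let lines = (1..=1000).map(|i| format!("line {i}"));
    for line in reservoir_sample(lines, 15, &mut rng) {
        println!("{line}");
    }
}
```

Memory use is proportional to the sample size rather than the input size, which is what makes this approach suitable for streams such as standard input. If the input has fewer than `n` items, the sketch simply returns all of them.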

Development

Tests

Run unit tests:

cargo test

Run all tests (including potentially CPU-intensive statistical tests):

cargo test --all-features --release

Format the code:

cargo fmt