| Crates.io | tango-bench |
|---|---|
| lib.rs | tango-bench |
| version | 0.6.0 |
| source | src |
| created_at | 2023-10-29 09:43:26.202851 |
| updated_at | 2024-09-17 08:10:42.635438 |
| description | Tango benchmarking harness |
| homepage | https://github.com/bazhenov/tango |
| repository | https://github.com/bazhenov/tango |
| max_upload_size | |
| id | 1017426 |
| size | 98,403 |
It used to be that benchmarking required a significant amount of time and numerous iterations to arrive at meaningful results, which was particularly arduous when trying to detect subtle changes, such as those within the range of a few percentage points.
Introducing Tango.rs, a novel benchmarking framework that employs paired benchmarking to assess code performance. This approach capitalizes on the fact that it is far more efficient to measure the performance difference between two simultaneously executing functions than between two functions executed one after the other.
Features:

- Compared to traditional pointwise benchmarking, paired benchmarking is significantly more sensitive to changes. This heightened sensitivity enables the early detection of statistically significant performance variations.
- Tango is designed to detect a 1% change in performance within just 1 second in at least 9 out of 10 test runs (a sketch of the underlying idea follows this list).
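To make the paired approach concrete, here is a minimal sketch of the idea (this is not Tango's actual implementation, and timing single calls this way is a simplification): the baseline and the candidate are interleaved, and only the per-pair difference is analyzed.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Time one call of `f` in nanoseconds (simplified; a real harness
/// amortizes timer overhead over many iterations).
fn measure<T>(f: &mut impl FnMut() -> T) -> i64 {
    let start = Instant::now();
    black_box(f());
    start.elapsed().as_nanos() as i64
}

/// Interleave the two functions and average the per-pair differences.
/// Machine-wide noise (scheduling, frequency scaling, cache pressure)
/// hits both halves of a pair similarly, so it largely cancels out of
/// the difference; that is what makes the method so sensitive.
fn paired_diff<T>(
    mut baseline: impl FnMut() -> T,
    mut candidate: impl FnMut() -> T,
    pairs: usize,
) -> f64 {
    let total: i64 = (0..pairs)
        .map(|_| measure(&mut candidate) - measure(&mut baseline))
        .sum();
    total as f64 / pairs as f64
}

fn main() {
    // Two workloads of slightly different size, as in the example below.
    let base = || (1..=500u64).fold(1, u64::wrapping_mul);
    let cand = || (1..=495u64).fold(1, u64::wrapping_mul);
    println!("avg per-pair diff: {:.1} ns", paired_diff(base, cand, 10_000));
}
```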
Make sure you have cargo-export installed. Then add a cargo dependency and create a new benchmark:
```toml
[dev-dependencies]
tango-bench = "0.6"

[[bench]]
name = "factorial"
harness = false
```
(Linux/macOS) Add a build script (`build.rs`) with the following content. It allows rustc to export symbols for dynamic linking from benchmarks:
```rust
fn main() {
    // Export executable symbols so benchmarks can be dynamically linked
    println!("cargo:rustc-link-arg-benches=-rdynamic");
    println!("cargo:rerun-if-changed=build.rs");
}
```
(Windows, nightly required) Add the following code to the cargo config (`.cargo/config`):

```toml
[build]
rustflags = ["-Zexport-executable-symbols"]
```
Add `benches/factorial.rs` with the following content:
```rust
use std::hint::black_box;
use tango_bench::{benchmark_fn, tango_benchmarks, tango_main, IntoBenchmarks};

pub fn factorial(mut n: usize) -> usize {
    let mut result = 1usize;
    while n > 0 {
        result = result.wrapping_mul(black_box(n));
        n -= 1;
    }
    result
}

fn factorial_benchmarks() -> impl IntoBenchmarks {
    [
        benchmark_fn("factorial", |b| b.iter(|| factorial(500))),
    ]
}

tango_benchmarks!(factorial_benchmarks());
tango_main!();
```
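If you want to track several input sizes at once, you can register more than one benchmark in the same file. A sketch (the `factorial/<n>` naming is just a convention that works well with the `-f` glob filter described below):

```rust
fn factorial_benchmarks() -> impl IntoBenchmarks {
    [
        // One entry per input size; a shared prefix lets all of them be
        // selected together with a glob such as `factorial/*`.
        benchmark_fn("factorial/100", |b| b.iter(|| factorial(100))),
        benchmark_fn("factorial/250", |b| b.iter(|| factorial(250))),
        benchmark_fn("factorial/500", |b| b.iter(|| factorial(500))),
    ]
}
```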
Build and export the benchmark to the `target/benchmarks` directory:

```
$ cargo export target/benchmarks -- bench --bench=factorial
```
Now let's try to modify `factorial.rs` and make factorial faster :)
```rust
fn factorial_benchmarks() -> impl IntoBenchmarks {
    [
        benchmark_fn("factorial", |b| b.iter(|| factorial(495))),
    ]
}
```
Now we can compare the new version with the already built one:

```
$ cargo bench -q --bench=factorial -- compare target/benchmarks/factorial
factorial [ 375.5 ns ... 369.0 ns ] -1.58%*
```

The result shows that there is indeed a ~1% difference between `factorial(500)` and `factorial(495)` (the trailing `*` marks the change as statistically significant).
Additional examples are available in the `examples` directory.
To use Tango.rs in an asynchronous setup, follow these steps:
Add `tokio` and `tango-bench` dependencies to your `Cargo.toml`:
```toml
[dev-dependencies]
tango-bench = { version = "0.6", features = ["async-tokio"] }

[[bench]]
name = "async_factorial"
harness = false
```
Create `benches/async_factorial.rs` with the following content:
```rust
use std::hint::black_box;
use tango_bench::{
    async_benchmark_fn, asynchronous::tokio::TokioRuntime, tango_benchmarks, tango_main,
    IntoBenchmarks,
};

pub async fn factorial(mut n: usize) -> usize {
    let mut result = 1usize;
    while n > 0 {
        result = result.wrapping_mul(black_box(n));
        n -= 1;
    }
    result
}

fn benchmarks() -> impl IntoBenchmarks {
    [async_benchmark_fn("async_factorial", TokioRuntime, |b| {
        b.iter(|| async { factorial(500).await })
    })]
}

tango_benchmarks!(benchmarks());
tango_main!();
```
Build and use the benchmarks as you do in the synchronous case:

```
$ cargo bench -q --bench=async_factorial -- compare
```
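For example, mirroring the synchronous flow above, you can export a baseline first and then compare the current code against it:

```
$ cargo export target/benchmarks -- bench --bench=async_factorial
$ cargo bench -q --bench=async_factorial -- compare target/benchmarks/async_factorial
```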
There are several arguments you can pass to the `compare` command to change its behavior (a combined example follows the list):

- `-t`, `--time` – how long to run each benchmark (in seconds)
- `-s`, `--samples` – how many samples to gather from each benchmark
- `-f` – filter benchmarks by name. Glob patterns are supported (eg. `*/bench_name/{2,4,8}/**`)
- `-d [path]` – dump a CSV with raw samples in the given directory
- `--gnuplot` – generate a plot for each benchmark (requires gnuplot to be installed)
- `-o`, `--filter-outliers` – additionally filter outliers
- `-p`, `--parallel` – run base/candidate functions in two different threads instead of interleaving them in a single thread
- `--fail-threshold` – fail if the new version is slower than the baseline by the given percentage
- `--fail-fast` – fail after the first benchmark exceeding the fail threshold instead of after the whole suite
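For instance, to run each benchmark for 3 seconds, select only the factorial benchmarks, and fail the run on a 5% regression (the flag values here are illustrative):

```
$ cargo bench -q --bench=factorial -- compare target/benchmarks/factorial \
    -t 3 -f 'factorial/*' --fail-threshold 5
```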
The project is in its early stages, so any help is appreciated. Here are some ideas you might find interesting: