Benchmark
RUST LIBRARY


Nanosecond-precision benchmarking for development, testing, and production. The core timing path is zero-overhead when disabled, while optional, std-powered collectors and zero-dependency metrics (Watch/Timer and the stopwatch! macro) provide real service observability with percentiles in production. Designed to be embedded in performance-critical code without bloat or footguns.


What is Benchmark?

Benchmark is a comprehensive benchmarking suite with an advanced performance metrics module that functions in two distinct contexts:

1: As a Development Tool:

The Benchmarking Suite is a statistical framework for performance testing and optimization during development. It provides tools for microbenchmarking and comparative analysis, allowing you to detect performance regressions in CI and make data-driven implementation choices.

Benchmarking Suite Features

A lightweight, ergonomic alternative to Criterion for statistical performance testing:

— See Benchmark Documentation for more information.
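
For example, a quick comparison of two candidate implementations might look like the following. This is a minimal sketch built from the benchmark! macro and Collector shown in the Quick Start below; it assumes the Measurement values returned by benchmark! can be recorded into a Collector just like those produced by time_named!.

use benchmark::{benchmark, Collector};

// Two candidate implementations to compare.
fn sum_fold(n: u64) -> u64 {
    (0..n).fold(0, |acc, i| acc + i)
}
fn sum_formula(n: u64) -> u64 {
    n * (n - 1) / 2
}

let collector = Collector::new();

// Run each candidate for an explicit number of iterations and aggregate.
let (_, ms) = benchmark!("sum_fold", 1_000usize, { sum_fold(10_000) });
for m in &ms {
    collector.record(m);
}
let (_, ms) = benchmark!("sum_formula", 1_000usize, { sum_formula(10_000) });
for m in &ms {
    collector.record(m);
}

// Compare mean timings to make a data-driven choice (or fail CI on a regression).
let fold = collector.stats("sum_fold").unwrap();
let formula = collector.stats("sum_formula").unwrap();
println!(
    "fold mean = {} ns, formula mean = {} ns",
    fold.mean.as_nanos(),
    formula.mean.as_nanos()
);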


2: As a Production Tool:

The performance metrics module serves as an advanced, lightweight application performance monitoring (APM) tool that provides seamless production observability. It lets you instrument your application to capture real-time performance metrics for critical operations such as database queries and API response times. Each timing measurement can be thought of as a span, helping you identify bottlenecks in a live system.

Performance Metrics Features

Low-overhead instrumentation for live application monitoring:

— See Metrics Documentation for more information.
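
For example, a service can hold a Watch and instrument each database call with a Timer. This is a minimal sketch (the Repo type and query body are hypothetical); it uses the Watch, Timer, and snapshot APIs shown under Usage below.

use benchmark::{Watch, Timer};

// Hypothetical repository that shares a Watch for live metrics.
struct Repo {
    watch: Watch,
}

impl Repo {
    fn find_user(&self, id: u64) -> Option<String> {
        // Records one "db.query" measurement when the Timer drops,
        // even if the function returns early.
        let _t = Timer::new(self.watch.clone(), "db.query");
        // ... run the real query here ...
        Some(format!("user-{}", id))
    }
}

let repo = Repo { watch: Watch::new() };
repo.find_user(42);

// Read live stats, e.g. from a health or metrics endpoint.
let snapshot = repo.watch.snapshot();
println!("db.query count = {}", snapshot["db.query"].count);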



Features

  • True Zero-Overhead: When disabled via default-features = false, all benchmarking code compiles away completely, adding zero bytes to your binary and zero nanoseconds to execution time.
  • No Dependencies: Built using only the Rust standard library, eliminating dependency conflicts and keeping compilation times fast.
  • Thread-Safe by Design: Core measurement functions are pure and inherently thread-safe, with an optional thread-safe Collector for aggregating measurements across threads.
  • Async Compatible: Works seamlessly with any async runtime (Tokio, async-std, etc.) without special support or additional features - just time any expression, sync or async.
  • Nanosecond Precision: Uses platform-specific high-resolution timers through std::time::Instant, providing nanosecond-precision measurements with monotonic guarantees.
  • Simple Statistics: Provides essential statistics (count, total, min, max, mean) without complex algorithms or memory allocation, keeping the library focused and efficient.
  • Production Metrics (optional): Enable the metrics feature for a thread-safe Watch, Timer (auto record on drop), and stopwatch! macro using a built-in zero-dependency histogram for percentiles.
  • Minimal API Surface: Just four functions and two macros - easy to learn, hard to misuse, and unlikely to ever need breaking changes.
  • Cross-Platform: Consistent behavior across Linux, macOS, Windows, and other platforms supported by Rust's standard library.


Feature Flags

  • none: no features enabled.
  • std (default): uses the Rust standard library (disables no_std).
  • benchmark (default): enables the default benchmark measurement path.
  • metrics (optional): production/live metrics (Watch, Timer, stopwatch!).
  • default: convenience feature equal to std + benchmark.
  • standard: convenience feature equal to std + benchmark + metrics.
  • minimal: minimal build with core timing only (no default features).
  • all: activates all features (std + benchmark + metrics).

— See FEATURES DOCUMENTATION for more information.
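
For example, assuming the convenience features above are exposed exactly as named, enabling everything looks like this:

[dependencies]
benchmark = { version = "0.6.0", features = ["all"] }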




Usage:


Installation

Add this to your Cargo.toml:

[dependencies]
benchmark = "0.6.0"

Standard Features

Enables the standard feature set (std + benchmark + metrics) for both development and production use.

[dependencies]

# Enables Production & Development.
benchmark = { version = "0.6.0", features = ["standard"] }

Production metrics (std + metrics)

Enable production observability using Watch/Timer or the stopwatch! macro.

Cargo features:

[dependencies]
benchmark = { version = "0.6.0", features = ["std", "metrics"] }

Record with Timer (auto-record on drop):

use benchmark::{Watch, Timer};

let watch = Watch::new();
{
    let _t = Timer::new(watch.clone(), "db.query");
    // ... do the work to be measured ...
} // recorded once on drop

let s = &watch.snapshot()["db.query"];
assert!(s.count >= 1);

Or use the stopwatch! macro:

use benchmark::{Watch, stopwatch};

let watch = Watch::new();
stopwatch!(watch, "render", {
    // ... work to measure ...
});
assert!(watch.snapshot()["render"].count >= 1);

Disable Default Features

True zero-overhead core timing only.

[dependencies]
# Disable default features for true zero-overhead
benchmark = { version = "0.6.0", default-features = false }

— See FEATURES DOCUMENTATION for more information.



Quick Start

See also: Async Usage · Disabled Mode Behavior · Production Metrics

Measure a closure

Use measure() to time any closure and get back (result, Duration).

use benchmark::measure;

let (value, duration) = measure(|| 2 + 2);
assert_eq!(value, 4);
println!("took {} ns", duration.as_nanos());

Time an expression with the macro

time! works with any expression and supports async contexts.

use benchmark::time;

let (value, duration) = time!({
    let mut sum = 0;
    for i in 0..10_000 { sum += i; }
    sum
});
assert!(duration.as_nanos() > 0);

Micro-benchmark a code block

Use benchmark_block! to run a block many times and get raw per-iteration durations.

use benchmark::benchmark_block;

// Default 10_000 iterations
let samples = benchmark_block!({
    // hot path
    std::hint::black_box(1 + 1);
});
assert_eq!(samples.len(), 10_000);

// Explicit iterations
let samples = benchmark_block!(1_000usize, {
    std::hint::black_box(2 * 3);
});

Macro-benchmark a named expression

Use benchmark! to run a named expression repeatedly and get (last, Vec<Measurement>).

use benchmark::benchmark;

// Default 10_000 iterations
let (last, ms) = benchmark!("add", { 2 + 3 });
assert_eq!(last, Some(5));
assert_eq!(ms[0].name, "add");

// Explicit iterations
let (_last, ms) = benchmark!("mul", 77usize, { 6 * 7 });
assert_eq!(ms.len(), 77);

Disabled mode (`default-features = false`): `benchmark_block!` runs once and returns `vec![]`; `benchmark!` runs once and returns `(Some(result), vec![])`.

Named timing + Collector aggregation (std + benchmark)

Record a named measurement and aggregate stats with Collector.

use benchmark::{time_named, Collector};

fn work() {
    std::thread::sleep(std::time::Duration::from_millis(1));
}

let collector = Collector::new();
let (_, m) = time_named!("work", work());
collector.record(&m);

let stats = collector.stats("work").unwrap();
println!(
    "count={} total={}ns min={}ns max={}ns mean={}ns",
    stats.count,
    stats.total.as_nanos(),
    stats.min.as_nanos(),
    stats.max.as_nanos(),
    stats.mean.as_nanos()
);
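
Because Collector is thread-safe (see Features above), one instance can aggregate measurements from multiple threads. A minimal sketch, assuming a Collector shared behind an Arc:

use std::sync::Arc;
use benchmark::{time_named, Collector};

let collector = Arc::new(Collector::new());

// Four worker threads each record one named measurement.
let handles: Vec<_> = (0..4)
    .map(|_| {
        let collector = Arc::clone(&collector);
        std::thread::spawn(move || {
            let (_, m) = time_named!("hash", {
                std::hint::black_box((0..1_000u64).sum::<u64>())
            });
            collector.record(&m);
        })
    })
    .collect();

for h in handles {
    h.join().unwrap();
}

let stats = collector.stats("hash").unwrap();
assert_eq!(stats.count, 4);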

Async timing with await

The macros inline Instant timing when features = ["std", "benchmark"], so awaiting inside works seamlessly.

use benchmark::time;

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let sleep_dur = std::time::Duration::from_millis(10);
    let ((), d) = time!(tokio::time::sleep(sleep_dur).await);
    println!("slept ~{} ms", d.as_millis());
}

Benchmarks

This repository includes Criterion benchmarks that measure the overhead of the public API compared to a direct Instant::now() baseline.
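
The shape of such a bench, as a minimal illustrative sketch (not the actual benches/overhead.rs source):

use criterion::{criterion_group, criterion_main, Criterion};
use std::hint::black_box;
use benchmark::measure;

fn overhead(c: &mut Criterion) {
    // Baseline: raw Instant::now().elapsed() around trivial work.
    c.bench_function("instant_now_elapsed", |b| {
        b.iter(|| {
            let start = std::time::Instant::now();
            black_box(black_box(1u64) + black_box(1u64));
            black_box(start.elapsed())
        })
    });

    // Library path: measure() around the same trivial work.
    c.bench_function("measure_closure_add", |b| {
        b.iter(|| black_box(measure(|| black_box(1u64) + black_box(1u64))))
    });
}

criterion_group!(overhead_benches, overhead);
criterion_main!(overhead_benches);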


Safety & Edge Cases

  • Zero durations: Some operations can complete so fast that elapsed() may be 0ns on some platforms. The API preserves this and returns Duration::ZERO. If you need a floor for visualization, clamp in your presentation layer.
  • Saturating conversions: Watch::record_instant() converts as_nanos() to u64 using saturating semantics, avoiding panics on extremely large values.
  • Range clamping: Watch::record() clamps input to histogram bounds, ensuring valid recording and protecting against out-of-range values.
  • Percentile input clamping: Percentile queries (e.g., Histogram::percentile(q)) clamp q to [0.0, 1.0]. Out-of-range inputs map to min/max (e.g., -0.1 → 0.0, 1.2 → 1.0).
  • Drop safety: Timer records exactly once. It guards against double-record by storing Option<Instant> and recording on Drop even during unwinding.
  • Empty datasets: Watch::snapshot() and Collector handle empty sets defensively. Snapshots for empty histograms return zeros; Collector::stats() returns None for missing keys.
  • Overflow protection: Collector uses saturating_add for total accumulation and 128-bit nanosecond storage in Duration to provide ample headroom.
  • Thread safety: All shared structures use RwLock with short hold-times: clone under read lock, compute outside the lock. Methods will panic only if a lock is poisoned by a prior panic.
  • Feature gating: Production metrics are gated behind features=["std","metrics"]. Disable default features to make all timing a no-op for zero-overhead builds.

How to run

cargo bench

Zero-overhead Proof

This project includes automated checks that verify zero bytes and zero time when disabled, and provide assembly and runtime comparison artifacts.

  • Size comparison (CI): Workflow .github/workflows/ci.yml, job Zero Overhead builds the crate twice and compares libbenchmark.rlib sizes with and without benchmark. Any unexpected growth fails the job.
  • Assembly artifacts (CI): Job Assembly Inspection emits objdump/llvm-objdump symbol tables and disassembly for enabled vs disabled. Download the assembly-artifacts artifact from the run to inspect.
  • Runtime comparison (CI): Bench workflow .github/workflows/bench.yml runs the example examples/overhead_compare.rs in both modes and uploads overhead-compare artifacts with raw output.
  • Compile tests (trybuild): tests/trybuild_disabled.rs ensures disabled mode compiles with the macro and function APIs used in a minimal program.
  • Local reproduction:
    • Disabled: cargo run --release --example overhead_compare --no-default-features
    • Benchmark: cargo run --release --example overhead_compare --no-default-features --features "std benchmark"

When features are disabled (default-features = false), timing returns Duration::ZERO and produces no runtime cost (e.g., time_macro_ns=0 on CI-hosted runners), validating the zero-overhead guarantee.

Sample results (illustrative)

Results below are from a recent run on GitHub-hosted Linux runners; your numbers will vary by hardware and load.

Overhead
--------
instant_now_elapsed           ~ 81–89 ns
measure_closure_add           ~ 79–81 ns
time_macro_add                ~ 81–82 ns

Collector Statistics

stats::single/1000        ~ 2.44–2.61 µs
stats::single/10000       ~ 26.7–29.1 µs
stats::all/k10_n1000      ~ 29.4–33.6 µs
stats::all/k50_n1000      ~ 148.6–157.1 µs

Array Baseline (no locks)

stats::array/k1_n10000    ~ 17.15–17.90 µs
stats::array/k10_n1000    ~ 15.03–16.25 µs

Interpretation

  • Macro/function overhead is on par with direct Instant usage for trivial work, as expected.
  • Collector stats scale roughly linearly with the number of samples; costs are dominated by iteration and min/max/accumulate.
  • No-lock array baseline provides a lower bound for aggregation cost; the difference vs Collector indicates lock and map overhead.
  • Use --features std,benchmark to ensure the benchmark timing path is used.

Benchmarks included

  • overhead::instant: Instant::now().elapsed() baseline.
  • overhead::measure: measure(|| expr).
  • overhead::time_macro: time!(expr).
  • Bench source: benches/overhead.rs.
  • Criterion version: 0.5.
  • Note: when features are disabled (default-features = false), measurement returns Duration::ZERO.
  • Tip: use --features std,benchmark to ensure the benchmark path for benches.
  • Environment affects results; run on a quiet system with performance governor if possible.

Sample output (placeholder)

overhead::instant/instant_now_elapsed
                        time:   [X ns  ..  Y ns  ..  Z ns]

overhead::measure/measure_closure_add
                        time:   [X ns  ..  Y ns  ..  Z ns]

overhead::time_macro/time_macro_add
                        time:   [X ns  ..  Y ns  ..  Z ns]


Documentation:


DEVELOPMENT & CONTRIBUTION

We welcome contributions! Please see CONTRIBUTING.md for guidelines.



⚖️ License

Licensed under the Apache License, version 2.0 (the "License"); you may not use this software, including, but not limited to, the source code, media files, ideas, techniques, or any other associated property or concept belonging to, associated with, or otherwise packaged with this software, except in compliance with the License.

You may obtain a copy of the License at: http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

See the LICENSE file included with this project for the specific language governing permissions and limitations under the License.


COPYRIGHT © 2025 JAMES GOBER.
