| Field | Value |
|---|---|
| Crates.io | hft-benchmarks |
| lib.rs | hft-benchmarks |
| version | 0.1.2 |
| created_at | 2025-08-19 04:56:42.333198+00 |
| updated_at | 2025-09-15 14:36:09.478641+00 |
| description | High-precision benchmarking tools for high-frequency trading systems with nanosecond-level timing accuracy |
| homepage | https://github.com/hft-framework/hft-benchmarks |
| repository | https://github.com/hft-framework/hft-benchmarks |
| max_upload_size | |
| id | 1801379 |
| size | 145,430 |
High-precision performance measurement tools for Rust applications requiring nanosecond-level timing accuracy.
Add to your `Cargo.toml`:

```toml
[dependencies]
hft-benchmarks = { path = "../path/to/hft-benchmarks" }
```
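Since version 0.1.2 is published on crates.io (per the metadata above), a registry dependency should work as well:

```toml
[dependencies]
hft-benchmarks = "0.1.2"
```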
Simple benchmark:

```rust
use hft_benchmarks::*;

fn main() {
    quick_calibrate_tsc_frequency();

    SimpleBench::new("my_function")
        .bench(1000, || my_expensive_function())
        .report();
}
```
Output:

```text
my_function: 1000 samples, mean=245ns, p50=230ns, p95=310ns, p99=450ns, p99.9=890ns, std_dev=45.2ns
```
```rust
use hft_benchmarks::*;

// One-time setup (do this once at program start)
calibrate_tsc_frequency();

// Time a single operation
let (result, elapsed_ns) = time_function(|| {
    expensive_computation()
});
println!("Operation took {}ns", elapsed_ns);

// Collect multiple measurements for statistical analysis
let mut results = BenchmarkResults::new("algorithm_comparison".to_string());
for _ in 0..1000 {
    let timer = PrecisionTimer::start();
    your_algorithm();
    results.record(timer.stop());
}

let analysis = results.analyze();
println!("{}", analysis.summary());

// Check if performance meets requirements
if analysis.meets_target(100) { // P99 < 100ns
    println!("✓ Performance target met");
} else {
    println!("✗ Too slow: P99 = {}ns", analysis.p99);
}
```
```rust
use hft_benchmarks::*;

fn main() {
    quick_calibrate_tsc_frequency();

    // Benchmark old implementation
    let old_perf = SimpleBench::new("old_algorithm")
        .bench(5000, || old_implementation())
        .analyze();

    // Benchmark new implementation
    let new_perf = SimpleBench::new("new_algorithm")
        .bench(5000, || new_implementation())
        .analyze();

    // Calculate improvement
    let speedup = old_perf.mean as f64 / new_perf.mean as f64;
    println!("New implementation is {:.1}x faster", speedup);
    println!("Old: {}ns P99, New: {}ns P99", old_perf.p99, new_perf.p99);
}
```
```rust
use hft_benchmarks::*;

fn main() {
    quick_calibrate_tsc_frequency();

    // Run built-in allocation benchmarks
    benchmark_allocations();         // Test different allocation sizes
    benchmark_object_pools();        // Compare pool vs direct allocation
    benchmark_aligned_allocations(); // Test memory alignment impact
}
```
Example output:

```text
Benchmarking memory allocations (10000 iterations per size)...
allocation_64B: 10000 samples, mean=89ns, p50=70ns, p95=120ns, p99=180ns
allocation_1024B: 10000 samples, mean=145ns, p50=130ns, p95=200ns, p99=280ns
Pool allocation: pool_allocation: 10000 samples, mean=65ns, p50=60ns, p95=85ns, p99=110ns
Direct allocation: direct_allocation: 10000 samples, mean=140ns, p50=130ns, p95=180ns, p99=220ns
```
```rust
// Required once at program startup for accurate timing
calibrate_tsc_frequency();       // 1000ms calibration (most accurate)
quick_calibrate_tsc_frequency(); // 100ms calibration (faster, less accurate)
```
Fluent API for quick benchmarking:

```rust
use hft_benchmarks::SimpleBench;

SimpleBench::new("operation_name")
    .bench(iterations, || your_function())
    .report(); // Print results

// Or get analysis object
let analysis = SimpleBench::new("operation_name")
    .bench(iterations, || your_function())
    .analyze();
```
For custom measurement logic:

```rust
use hft_benchmarks::{PrecisionTimer, time_function};

// Time a single operation
let timer = PrecisionTimer::start();
expensive_operation();
let elapsed_ns = timer.stop();

// Time function with return value
let (result, elapsed_ns) = time_function(|| {
    compute_something()
});
```
```rust
use hft_benchmarks::BenchmarkResults;

let mut results = BenchmarkResults::new("test_name".to_string());

// Collect measurements
for _ in 0..1000 {
    let elapsed = time_operation();
    results.record(elapsed);
}

// Analyze results
let analysis = results.analyze();
println!("Mean: {}ns, P99: {}ns", analysis.mean, analysis.p99);

// Check performance target
if analysis.meets_target(500) { // P99 < 500ns
    println!("Performance target met!");
}
```
The benchmark results show the statistical distribution of the timing measurements:

```text
function_name: 1000 samples, mean=245ns, p50=230ns, p95=310ns, p99=450ns, p99.9=890ns, std_dev=45.2ns
```
In performance-critical systems, tail latencies (P95, P99, P99.9) matter more than the mean: rare slow outliers determine worst-case behavior even when the average looks healthy.
Example: a function averaging 100ns but with a P99 of 10ms will cause problems despite its good average performance.
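To make that concrete, here is a small illustrative sketch (not this crate's API; `percentile` is a hypothetical nearest-rank helper) showing how a few slow outliers hide behind a healthy p50 but show up at p99.9:

```rust
/// Hypothetical helper (not from hft-benchmarks): nearest-rank percentile
/// over nanosecond samples, mirroring the p50/p95/p99 fields in the summary.
fn percentile(samples: &mut [u64], p: f64) -> u64 {
    samples.sort_unstable();
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    samples[rank.saturating_sub(1).min(samples.len() - 1)]
}

fn main() {
    // 995 fast 100ns samples plus 5 slow 10ms outliers.
    let mut samples: Vec<u64> = vec![100; 995];
    samples.extend([10_000_000u64; 5]);

    let p50 = percentile(&mut samples, 50.0);
    let p99 = percentile(&mut samples, 99.0);
    let p999 = percentile(&mut samples, 99.9);

    // p50 and p99 still read 100ns; only p99.9 exposes the 10ms stalls.
    println!("p50={}ns p99={}ns p99.9={}ns", p50, p99, p999);
}
```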
Run the benchmark test suite:

```bash
# From project root
cd /path/to/hft-framework/Code
cargo test --package hft-benchmarks -- --nocapture

# Or from benchmark crate directory
cd crates/hft-benchmarks
cargo test --lib -- --nocapture
```
Run example benchmarks:

```bash
cargo run --example simple_benchmark_example
```
Always calibrate before benchmarking:

```rust
// At program start
quick_calibrate_tsc_frequency(); // For development/testing
// OR
calibrate_tsc_frequency(); // For production measurements
```
Use appropriate sample sizes:

```rust
// Quick development check
SimpleBench::new("dev_test").bench(100, || function()).report();

// Production validation
SimpleBench::new("prod_test").bench(10000, || function()).report();
```
Account for cache and branch-predictor warm-up (Rust is compiled ahead of time, so caches, not JIT compilation, are the concern):

```rust
// Warm up
for _ in 0..1000 { function(); }

// Then benchmark
SimpleBench::new("warmed_up").bench(5000, || function()).report();
```
Always benchmark in release mode (`cargo run --release`):

```rust
use hft_benchmarks::*;

fn main() {
    quick_calibrate_tsc_frequency();

    SimpleBench::new("new_feature")
        .bench(1000, || my_new_function())
        .report();
}
```
```rust
use hft_benchmarks::*;

fn compare_algorithms() {
    quick_calibrate_tsc_frequency();

    println!("=== Algorithm Comparison ===");

    let results_a = SimpleBench::new("algorithm_a")
        .bench(5000, || algorithm_a())
        .analyze();

    let results_b = SimpleBench::new("algorithm_b")
        .bench(5000, || algorithm_b())
        .analyze();

    println!("Algorithm A: {}ns P99", results_a.p99);
    println!("Algorithm B: {}ns P99", results_b.p99);

    if results_b.p99 < results_a.p99 {
        let improvement = (results_a.p99 as f64 / results_b.p99 as f64 - 1.0) * 100.0;
        println!("Algorithm B is {:.1}% faster (P99)", improvement);
    }
}
```
```rust
use hft_benchmarks::*;

fn validate_performance() {
    calibrate_tsc_frequency(); // Full calibration for accuracy

    let analysis = SimpleBench::new("critical_path")
        .bench(10000, || critical_trading_function())
        .analyze();

    // Ensure P99 latency meets requirements
    const MAX_P99_NS: u64 = 500;
    assert!(
        analysis.meets_target(MAX_P99_NS),
        "Performance regression: P99 = {}ns (max allowed: {}ns)",
        analysis.p99,
        MAX_P99_NS
    );

    println!("✓ Performance validation passed");
    println!("  Mean: {}ns, P99: {}ns, P99.9: {}ns",
        analysis.mean, analysis.p99, analysis.p999);
}
```
```rust
use hft_benchmarks::*;

fn optimize_memory_usage() {
    quick_calibrate_tsc_frequency();

    println!("=== Memory Allocation Comparison ===");

    // Test stack allocation
    SimpleBench::new("stack_alloc")
        .bench(10000, || {
            let data = [0u64; 64]; // Stack allocated
            std::hint::black_box(data);
        })
        .report();

    // Test heap allocation
    SimpleBench::new("heap_alloc")
        .bench(10000, || {
            let data = vec![0u64; 64]; // Heap allocated
            std::hint::black_box(data);
        })
        .report();

    // Use built-in memory benchmarks
    benchmark_object_pools();
}
```
## Running Complete Benchmark Suite
### Memory Allocation Analysis
```bash
cargo run --example simple_benchmark_example
```
Output:

```text
=== Vector Allocation Benchmark ===
vec_allocation: 1000 samples, mean=185ns, p50=170ns, p95=220ns, p99=992ns

=== Implementation Comparison ===
Old: 90ns P99, New: 50ns P99
Improvement: 166.7% faster
```
```rust
use hft_benchmarks::*;

fn main() {
    calibrate_tsc_frequency();

    // Benchmark your trading algorithm
    SimpleBench::new("order_processing")
        .bench(10000, || process_market_order())
        .report();

    // Memory-intensive operations
    benchmark_allocations();
    benchmark_object_pools();
}
```
This library uses CPU timestamp counters (TSC) for nanosecond-precision timing:
- Timestamps are read directly from the CPU via the `_rdtsc()` instruction
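For illustration, a minimal sketch of the underlying mechanism, assuming x86_64 (this is not the crate's API; converting cycles to nanoseconds requires the TSC frequency, which is exactly what `calibrate_tsc_frequency()` measures):

```rust
// Illustration only: raw TSC timing on x86_64. The result is in CPU cycles;
// dividing by the measured TSC frequency (in GHz) yields nanoseconds.
#[cfg(target_arch = "x86_64")]
fn elapsed_cycles<F: FnOnce()>(f: F) -> u64 {
    // SAFETY: _rdtsc is available on all x86_64 CPUs.
    let start = unsafe { core::arch::x86_64::_rdtsc() };
    f();
    let end = unsafe { core::arch::x86_64::_rdtsc() };
    end.wrapping_sub(start)
}
```

Real measurement code often adds serializing fences around the reads so out-of-order execution does not skew very short intervals.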
The benchmark tools themselves have minimal impact:

- `PrecisionTimer` overhead: ~35ns
- Function call overhead: ~37ns
- Statistical calculation: <1μs for 10k samples
- Memory allocation test: ~100-500ns per iteration
Use this library alongside other profiling tools for comprehensive analysis. To keep the compiler from optimizing away benchmarked work, pass inputs and results through `std::hint::black_box`.
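A minimal sketch of that pattern (the multiply here is an arbitrary stand-in for your own workload):

```rust
use std::hint::black_box;

fn main() {
    let input = 42u64;
    // black_box(input) keeps the compiler from constant-folding the input;
    // black_box(result) marks the output as observed so the computation is
    // not eliminated as dead code.
    let result = black_box(input).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    black_box(result);
}
```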
This library excels at microbenchmarks and latency-critical code paths where nanosecond precision matters.