| Field | Value |
|-------|-------|
| Crates.io | torsh-benches |
| lib.rs | torsh-benches |
| version | 0.1.0-alpha.2 |
| created_at | 2025-09-30 04:13:03.556509+00 |
| updated_at | 2025-12-22 05:35:29.928438+00 |
| description | Benchmarking suite for ToRSh |
| homepage | https://github.com/cool-japan/torsh/ |
| repository | https://github.com/cool-japan/torsh/ |
| max_upload_size | |
| id | 1860584 |
| size | 1,967,449 |
This crate provides comprehensive benchmarking utilities for ToRSh, designed to measure performance across different operations, compare with other tensor libraries, and track performance regressions.
Run all benchmarks:

```bash
cargo bench
```

Run specific benchmark suites:

```bash
cargo bench tensor_operations
cargo bench neural_networks
cargo bench memory_operations
```
Define and run a benchmark with the `benchmark!` macro:

```rust
use torsh_benches::prelude::*;
use torsh_tensor::creation::*;

// Create a simple benchmark
let mut runner = BenchRunner::new();
let config = BenchConfig::new("my_benchmark")
    .with_sizes(vec![64, 128, 256])
    .with_dtypes(vec![DType::F32]);

let bench = benchmark!(
    "tensor_addition",
    |size| {
        let a = rand::<f32>(&[size, size]);
        let b = rand::<f32>(&[size, size]);
        (a, b)
    },
    |(a, b)| a.add(b).unwrap()
);

runner.run_benchmark(bench, &config);
```
Enable external library comparisons:
```bash
cargo bench --features compare-external
```
Collect runtime metrics around a block of code:

```rust
use torsh_benches::metrics::*;

let mut collector = MetricsCollector::new();
collector.start();
// Run your code here
let metrics = collector.stop();

println!("Peak memory usage: {:.2} MB", metrics.memory_stats.peak_usage_mb);
println!("CPU utilization: {:.1}%", metrics.cpu_stats.average_usage_percent);
```
Profile named events and export a Chrome trace:

```rust
use torsh_benches::metrics::*;

let mut profiler = PerformanceProfiler::new();
profiler.begin_event("matrix_multiply");
// Your code here
profiler.end_event("matrix_multiply");

let report = profiler.generate_report();
profiler.export_chrome_trace("profile.json").unwrap();
```
The enhanced analysis framework provides deep insights into benchmark performance:
```rust
use torsh_benches::prelude::*;

// 1. Collect comprehensive system information
let mut system_collector = SystemInfoCollector::new();
let system_info = system_collector.collect();

// Check if the system is optimized for benchmarking
let (is_optimized, recommendations) = is_system_optimized_for_benchmarking();
if !is_optimized {
    println!("System optimization recommendations:");
    for rec in recommendations {
        println!("  {}", rec);
    }
}

// 2. Run benchmarks and collect results
let mut runner = BenchRunner::new();
// ... run your benchmarks ...
let results = runner.results();

// 3. Perform advanced analysis
let mut analyzer = BenchmarkAnalyzer::new();
let analyses = analyzer.analyze_results(&results);

// 4. Generate comprehensive reports
analyzer.generate_analysis_report(&analyses, "analysis_report.md")?;
analyzer.export_detailed_csv(&analyses, "detailed_results.csv")?;
system_collector.generate_system_report(&system_info, "system_info.md")?;

// 5. Get optimization recommendations
for analysis in &analyses {
    println!("Benchmark: {}", analysis.benchmark_name);
    println!("Performance: {:?}", analysis.performance_rating);
    println!(
        "Bottleneck: {}",
        if analysis.bottleneck_analysis.memory_bound {
            "Memory-bound"
        } else if analysis.bottleneck_analysis.compute_bound {
            "Compute-bound"
        } else {
            "Mixed"
        }
    );
    for rec in &analysis.recommendations {
        println!("  💡 {}", rec);
    }
}
```
```bash
cargo run --example enhanced_analysis_demo --release
```

This runs the complete workflow above: it collects system information, executes the benchmarks, and writes the analysis, CSV, and system reports.
## Benchmark Categories
### Tensor Operations
- **Creation**: `zeros()`, `ones()`, `rand()`, `eye()`
- **Arithmetic**: element-wise operations, broadcasting
- **Linear Algebra**: matrix multiplication, decompositions
- **Reductions**: sum, mean, min, max, norm
- **Indexing**: slicing, gathering, scattering
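
For instance, a reduction benchmark can reuse the `benchmark!` macro from the quick-start example above; `sum()` returning a `Result` is an assumption mirroring the `add()` call shown there:

```rust
use torsh_benches::prelude::*;
use torsh_tensor::creation::*;

// Sketch: benchmark a full-tensor reduction. `sum()` is assumed to follow
// the same Result-returning convention as `add()` in the quick start.
let bench = benchmark!(
    "tensor_sum",
    |size| rand::<f32>(&[size, size]), // setup: one random matrix per size
    |a| a.sum().unwrap()               // measured: the reduction itself
);
```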
### Neural Network Operations
- **Layers**: Linear, Conv2d, BatchNorm, Dropout
- **Activations**: ReLU, Sigmoid, Tanh, Softmax
- **Loss Functions**: MSE, Cross-entropy, BCE
- **Optimizers**: SGD, Adam, RMSprop
### Memory Operations
- **Allocation**: buffer creation and management
- **Transfer**: host-to-device, device-to-host
- **Synchronization**: device synchronization overhead
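
A minimal sketch of a memory-focused setup, combining the `with_memory_measurement()` option and the `MetricsCollector` shown earlier (the allocation body is a placeholder):

```rust
use torsh_benches::prelude::*;
use torsh_benches::metrics::*;

// Configure a benchmark that records memory statistics;
// pass `config` to a BenchRunner as in the quick start.
let config = BenchConfig::new("buffer_allocation")
    .with_sizes(vec![1024, 4096])
    .with_memory_measurement();

// Measure peak memory around a block of allocation-heavy code.
let mut collector = MetricsCollector::new();
collector.start();
// ... allocate and transfer buffers here ...
let metrics = collector.stop();
println!("Peak memory usage: {:.2} MB", metrics.memory_stats.peak_usage_mb);
```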
### Data Loading
- **Dataset**: reading and preprocessing
- **DataLoader**: batching and shuffling
- **Transforms**: image and tensor transformations
## Configuration
Benchmark behavior can be customized through `BenchConfig`:
```rust
use std::time::Duration;

let config = BenchConfig::new("custom_benchmark")
    .with_sizes(vec![32, 64, 128, 256, 512, 1024])
    .with_dtypes(vec![DType::F16, DType::F32, DType::F64])
    .with_timing(
        Duration::from_millis(500), // warmup time
        Duration::from_secs(2),     // measurement time
    )
    .with_memory_measurement()
    .with_metadata("device", "cpu");
```
Benchmark results can be exported in multiple formats.
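The original list of formats is not preserved here, but the analysis helpers documented above cover at least Markdown reports and CSV; a brief sketch:

```rust
use torsh_benches::prelude::*;

// Sketch reusing the analysis helpers shown earlier; file names are
// illustrative.
let mut runner = BenchRunner::new();
// ... run benchmarks as in the quick start ...
let results = runner.results();

let mut analyzer = BenchmarkAnalyzer::new();
let analyses = analyzer.analyze_results(&results);
analyzer.generate_analysis_report(&analyses, "analysis_report.md")?; // Markdown report
analyzer.export_detailed_csv(&analyses, "detailed_results.csv")?;    // CSV export
```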
When the `compare-external` feature is enabled, benchmarks also run against external libraries (for example, `ndarray`).
Results show relative performance:
| Operation | Library | Size | Time (μs) | Speedup vs ToRSh |
|-----------|---------|------|-----------|------------------|
| MatMul | torsh | 512 | 234.5 | 1.00x |
| MatMul | ndarray | 512 | 289.1 | 0.81x |
Track performance over time:
```rust
let mut detector = RegressionDetector::new(0.1); // 10% slowdown threshold
detector.load_baseline("baseline_results.json").unwrap();

// `current_results` comes from a `BenchRunner`, as shown above.
let regressions = detector.check_regression(&current_results);
for regression in &regressions {
    println!(
        "Regression detected in {}: {:.2}x slower",
        regression.benchmark, regression.slowdown_factor
    );
}
```
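After a clean run, the passing results would typically become the next baseline. Only `load_baseline` appears in this README, so the `save_baseline` call below is hypothetical:

```rust
// Hypothetical persistence step: `save_baseline` is assumed, not documented.
if regressions.is_empty() {
    detector.save_baseline("baseline_results.json").unwrap();
}
```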
For consistent benchmarking results:
```rust
use torsh_benches::utils::Environment;

Environment::setup_for_benchmarking();
// Run benchmarks
Environment::restore_environment();
```
This pair of calls configures the process environment for stable measurements and then restores the previous settings.
Example GitHub Actions workflow:
```yaml
- name: Run benchmarks
  run: cargo bench --features compare-external
- name: Upload benchmark results
  uses: benchmark-action/github-action-benchmark@v1
  with:
    tool: 'criterion'
    output-file-path: target/criterion/reports/index.html
```
To add new benchmarks:

1. Implement the `Benchmarkable` trait for your operation.
2. Use `black_box()` to prevent the compiler from optimizing away the measured work.

See existing benchmarks in `benches/` for examples. A sketch of step 1 follows.
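
A minimal sketch of what an implementation might look like; the actual `Benchmarkable` definition is not shown in this README, so the method names below (`name`, `run`) are assumptions:

```rust
use std::hint::black_box;
use torsh_benches::prelude::*;
use torsh_tensor::creation::*;

struct TensorAddBench;

// Hypothetical trait shape: the real `Benchmarkable` trait lives in
// torsh-benches and may differ.
impl Benchmarkable for TensorAddBench {
    fn name(&self) -> &str {
        "tensor_addition"
    }

    fn run(&self, size: usize) {
        let a = rand::<f32>(&[size, size]);
        let b = rand::<f32>(&[size, size]);
        // black_box keeps the compiler from optimizing the work away.
        black_box(a.add(b).unwrap());
    }
}
```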