| Crates.io | axonml-profile |
| lib.rs | axonml-profile |
| version | 0.2.4 |
| created_at | 2026-01-19 21:48:27.98061+00 |
| updated_at | 2026-01-25 22:39:46.158318+00 |
| description | Profiling tools for the Axonml ML framework |
| homepage | https://github.com/automatanexus/axonml |
| repository | https://github.com/automatanexus/axonml |
| max_upload_size | |
| id | 2055408 |
| size | 97,091 |
axonml-profile provides comprehensive profiling capabilities for neural network training and inference. It includes memory tracking, compute profiling, timeline recording, and automatic bottleneck detection to help identify and resolve performance issues.

| Module | Description |
|---|---|
| `memory` | Memory profiler for tracking allocations, peak usage, and leak detection |
| `compute` | Compute profiler for measuring operation times, FLOPS, and bandwidth |
| `timeline` | Timeline profiler for recording events with timestamps and metadata |
| `bottleneck` | Bottleneck analyzer for detecting performance issues with severity ratings |
| `report` | Report generation in multiple formats (Text, JSON, Markdown, HTML) |
| `error` | Error types and Result alias for profiling operations |

Add this to your Cargo.toml:
```toml
[dependencies]
axonml-profile = "0.2.4"
```
```rust
use axonml_profile::Profiler;

// Create a profiler
let profiler = Profiler::new();

// Profile an operation
profiler.start("forward_pass");
let output = model.forward(&input);
profiler.stop("forward_pass");

// Check timing
let total = profiler.total_time("forward_pass");
let avg = profiler.avg_time("forward_pass");
println!("Total: {:?}, Average: {:?}", total, avg);
```
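The calls above go through a shared, non-`mut` `profiler`, which suggests the timing state lives behind interior mutability. As a rough illustration only, here is a std-only sketch of how start/stop pairs might accumulate into total and average times. `SketchTimer` and its fields are hypothetical names invented for this example; they are not axonml-profile's implementation.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a timing profiler (not the crate's real type).
#[derive(Default)]
pub struct SketchTimer {
    // Interior mutability lets `start` and `stop` take `&self`.
    inner: Mutex<TimerState>,
}

#[derive(Default)]
struct TimerState {
    // Operations that have been started but not yet stopped.
    running: HashMap<String, Instant>,
    // name -> (accumulated time, number of completed start/stop pairs)
    totals: HashMap<String, (Duration, u32)>,
}

impl SketchTimer {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn start(&self, name: &str) {
        let mut state = self.inner.lock().unwrap();
        state.running.insert(name.to_string(), Instant::now());
    }

    pub fn stop(&self, name: &str) {
        let mut state = self.inner.lock().unwrap();
        let begin = state.running.remove(name);
        // A `stop` without a matching `start` is silently ignored.
        if let Some(begin) = begin {
            let entry = state
                .totals
                .entry(name.to_string())
                .or_insert((Duration::ZERO, 0));
            entry.0 += begin.elapsed();
            entry.1 += 1;
        }
    }

    pub fn total_time(&self, name: &str) -> Duration {
        let state = self.inner.lock().unwrap();
        state.totals.get(name).map(|e| e.0).unwrap_or(Duration::ZERO)
    }

    pub fn avg_time(&self, name: &str) -> Duration {
        let state = self.inner.lock().unwrap();
        match state.totals.get(name) {
            Some(&(total, calls)) if calls > 0 => total / calls,
            _ => Duration::ZERO,
        }
    }
}
```

In this sketch an unknown operation name reports a zero duration instead of an error; a real profiler may choose differently.
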
```rust
use axonml_profile::Profiler;

let profiler = Profiler::new();

// Track allocations
profiler.record_alloc("weights", 4 * 1024 * 1024); // 4 MB
profiler.record_alloc("activations", 2 * 1024 * 1024); // 2 MB

// Check memory usage
println!("Current: {} bytes", profiler.current_memory());
println!("Peak: {} bytes", profiler.peak_memory());

// Record deallocation
profiler.record_free("activations", 2 * 1024 * 1024);
```
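To make the difference between current and peak memory concrete, the following std-only sketch keeps a small allocation ledger. `SketchMemory` and its methods are hypothetical and only mirror the names used above; how axonml-profile tracks memory internally is not documented here.

```rust
use std::collections::HashMap;

/// Hypothetical allocation ledger; names and behavior are assumptions.
#[derive(Default)]
pub struct SketchMemory {
    live: HashMap<String, usize>, // bytes currently held per label
    current: usize,               // sum of all live bytes
    peak: usize,                  // highest value `current` has reached
}

impl SketchMemory {
    pub fn record_alloc(&mut self, name: &str, bytes: usize) {
        *self.live.entry(name.to_string()).or_insert(0) += bytes;
        self.current += bytes;
        self.peak = self.peak.max(self.current);
    }

    pub fn record_free(&mut self, name: &str, bytes: usize) {
        let held = self.live.get(name).copied().unwrap_or(0);
        // Never free more than the label currently holds.
        let freed = bytes.min(held);
        self.current -= freed;
        if held == freed {
            self.live.remove(name);
        } else {
            self.live.insert(name.to_string(), held - freed);
        }
    }

    pub fn current_memory(&self) -> usize {
        self.current
    }

    pub fn peak_memory(&self) -> usize {
        self.peak
    }

    /// Labels that still hold bytes, sorted by name: leak candidates.
    pub fn outstanding(&self) -> Vec<(String, usize)> {
        let mut left: Vec<_> = self.live.iter().map(|(k, v)| (k.clone(), *v)).collect();
        left.sort();
        left
    }
}
```

With the numbers from the example above, the peak stays at 6 MiB after the 2 MiB of activations are freed, while current usage drops to 4 MiB.
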
```rust
use axonml_profile::{Profiler, ProfileGuard, profile_scope};

let profiler = Profiler::new();

// Using ProfileGuard directly
{
    let _guard = ProfileGuard::new(&profiler, "expensive_operation");
    // Operation automatically timed from guard creation to drop
    perform_computation();
}

// Or using the macro
{
    profile_scope!(&profiler, "another_operation");
    perform_another_computation();
}
```
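The guard pattern above relies on Rust's `Drop` trait: timing starts when the guard is created and is recorded when it goes out of scope. A minimal std-only sketch of that idea follows. `SketchSink` and `SketchGuard` are hypothetical types made up for illustration and say nothing about how `ProfileGuard` is actually implemented.

```rust
use std::cell::RefCell;
use std::time::{Duration, Instant};

/// Hypothetical sink that collects (name, elapsed) pairs.
#[derive(Default)]
pub struct SketchSink {
    pub records: RefCell<Vec<(String, Duration)>>,
}

/// Hypothetical guard: starts timing on creation, records on drop.
pub struct SketchGuard<'a> {
    sink: &'a SketchSink,
    name: &'a str,
    begin: Instant,
}

impl<'a> SketchGuard<'a> {
    pub fn new(sink: &'a SketchSink, name: &'a str) -> Self {
        SketchGuard { sink, name, begin: Instant::now() }
    }
}

impl Drop for SketchGuard<'_> {
    fn drop(&mut self) {
        // Runs when the guard leaves scope, including on early return.
        self.sink
            .records
            .borrow_mut()
            .push((self.name.to_string(), self.begin.elapsed()));
    }
}
```

One detail worth noting for any guard of this kind: binding it to `_guard` keeps it alive until the end of the block, whereas `let _ = ...` drops it immediately and would time nothing.
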
```rust
use axonml_profile::Profiler;

let profiler = Profiler::new();

// Profile various operations
profiler.start("matmul");
// ... matrix multiplication
profiler.stop("matmul");

profiler.start("relu");
// ... activation
profiler.stop("relu");

// Analyze for bottlenecks
let bottlenecks = profiler.analyze_bottlenecks();
for b in &bottlenecks {
    println!("[{:?}] {}: {}", b.severity, b.name, b.description);
    println!("  Suggestion: {}", b.suggestion);
}
```
```rust
use axonml_profile::{Profiler, ReportFormat};
use std::path::Path;

let profiler = Profiler::new();
// ... profiling operations ...

// Generate and print summary
let report = profiler.summary();
println!("{}", report);

// Export to file
// Note: `?` requires an enclosing function that returns a compatible `Result`.
report.export(Path::new("profile.html"), ReportFormat::Html)?;
report.export(Path::new("profile.json"), ReportFormat::Json)?;
report.export(Path::new("profile.md"), ReportFormat::Markdown)?;
```
```rust
use axonml_profile::{global_profiler, start, stop, record_alloc};

// Use global profiler functions
start("global_operation");
// ... operation ...
stop("global_operation");

record_alloc("global_tensor", 1024);

// Access the global profiler instance
let profiler = global_profiler();
profiler.print_summary();
```
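A process-wide profiler like the one above is commonly built on a lazily initialized static. The sketch below shows one way to do that with the standard library's `OnceLock` (Rust 1.70 or newer). `SketchCounters`, `sketch_global`, and `sketch_record_alloc` are hypothetical names, and nothing here is taken from the crate's source.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical shared state behind the global accessor.
#[derive(Default)]
pub struct SketchCounters {
    pub allocated_bytes: usize,
    pub events: Vec<String>,
}

/// Returns the single process-wide instance, creating it on first use.
pub fn sketch_global() -> &'static Mutex<SketchCounters> {
    static GLOBAL: OnceLock<Mutex<SketchCounters>> = OnceLock::new();
    GLOBAL.get_or_init(|| Mutex::new(SketchCounters::default()))
}

/// Free function that forwards to the global instance.
pub fn sketch_record_alloc(name: &str, bytes: usize) {
    let mut guard = sketch_global().lock().unwrap();
    guard.allocated_bytes += bytes;
    guard.events.push(format!("alloc {name} {bytes}"));
}
```

The trade-off of this design is convenience against isolation: every caller shares one set of counters, so separate `Profiler` instances remain the better fit when measurements must not mix.
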
```rust
use axonml_profile::ComputeProfiler;

let mut profiler = ComputeProfiler::new();

// Profile with FLOPS tracking
profiler.start_with_flops("matmul", 2.0 * 1024.0 * 1024.0 * 1024.0); // ~2 GFLOP (operation count for this call)
// ... operation ...
profiler.stop("matmul");

// Get top operations
for op in profiler.top_by_time(5) {
    println!("{}: {} calls, {:?} total", op.name, op.call_count, op.total_time());
}
```
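The value `2.0 * 1024.0 * 1024.0 * 1024.0` matches the usual operation count for multiplying two 1024 x 1024 matrices: one multiply and one add per inner-product term. The helpers below work through that arithmetic and convert a count plus elapsed time into GFLOPS. Both function names are hypothetical and are not part of `ComputeProfiler`.

```rust
use std::time::Duration;

/// Floating-point operations for an (m x k) by (k x n) matrix multiply,
/// counting one multiply and one add per inner-product term.
pub fn matmul_flops(m: u64, k: u64, n: u64) -> f64 {
    2.0 * (m as f64) * (k as f64) * (n as f64)
}

/// Achieved throughput in GFLOPS (1e9 floating-point operations per second).
pub fn gflops(flops: f64, elapsed: Duration) -> f64 {
    flops / elapsed.as_secs_f64() / 1e9
}
```

For example, if that 1024-cube multiply took 500 ms, the achieved rate would be roughly 4.3 GFLOPS.
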

| Type | Description | Detection Threshold |
|---|---|---|
| `SlowOperation` | Operation taking disproportionate time | >20% of total time |
| `HighCallCount` | Operation called too frequently | >10,000 calls |
| `MemoryHotspot` | Large memory allocation | >30% of peak memory |
| `MemoryLeak` | Memory not freed | >5% of allocations |
| `LowThroughput` | Low computational throughput | <1 GFLOPS |

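
To illustrate how thresholds like these can be applied, the sketch below checks two of them (more than 20% of total time, more than 10,000 calls) against per-operation statistics. `SketchBottleneck`, `OpStats`, and `detect` are hypothetical; the crate's own analyzer, including its severity ratings and suggestions, is not reproduced here.

```rust
/// Hypothetical subset of bottleneck kinds from the table.
#[derive(Debug, PartialEq)]
pub enum SketchBottleneck {
    SlowOperation,
    HighCallCount,
}

/// Hypothetical per-operation statistics.
pub struct OpStats {
    pub name: &'static str,
    pub seconds: f64,
    pub calls: u64,
}

/// Flags operations that exceed the time-share or call-count thresholds.
pub fn detect(ops: &[OpStats]) -> Vec<(&'static str, SketchBottleneck)> {
    let total: f64 = ops.iter().map(|o| o.seconds).sum();
    let mut found = Vec::new();
    for op in ops {
        // Strictly more than 20% of the total measured time.
        if total > 0.0 && op.seconds / total > 0.20 {
            found.push((op.name, SketchBottleneck::SlowOperation));
        }
        // Strictly more than 10,000 calls.
        if op.calls > 10_000 {
            found.push((op.name, SketchBottleneck::HighCallCount));
        }
    }
    found
}
```

Given a `matmul` taking 8 s of a 10 s total and a `relu` called 20,000 times, this sketch would flag the first as a slow operation and the second for its call count.
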
| Format | Description | Use Case |
|---|---|---|
| Text | Plain text with ASCII tables | Console output |
| JSON | Structured JSON | Programmatic analysis |
| Markdown | GitHub-flavored Markdown | Documentation |
| HTML | Styled HTML page | Browser viewing |
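
As a rough picture of what rendering the same data in several formats involves, the sketch below formats a list of (operation, milliseconds) rows as text, Markdown, or JSON. `SketchFormat` and `render` are hypothetical and unrelated to the crate's `ReportFormat`; the JSON branch does no string escaping, so it assumes operation names without quotes or backslashes.

```rust
/// Hypothetical output formats for the sketch.
pub enum SketchFormat {
    Text,
    Markdown,
    Json,
}

/// Renders (operation name, milliseconds) rows in the chosen format.
pub fn render(rows: &[(&str, f64)], format: SketchFormat) -> String {
    match format {
        SketchFormat::Text => rows
            .iter()
            .map(|(name, ms)| format!("{name}: {ms:.1} ms"))
            .collect::<Vec<_>>()
            .join("\n"),
        SketchFormat::Markdown => {
            let mut out = String::from("| Operation | Time (ms) |\n|---|---|\n");
            for (name, ms) in rows {
                out.push_str(&format!("| {name} | {ms:.1} |\n"));
            }
            out
        }
        SketchFormat::Json => {
            // No escaping: assumes names contain no quotes or backslashes.
            let items: Vec<String> = rows
                .iter()
                .map(|(name, ms)| format!("{{\"name\":\"{name}\",\"ms\":{ms:.1}}}"))
                .collect();
            format!("[{}]", items.join(","))
        }
    }
}
```

Keeping the data separate from the renderer, as here, is what lets one summary feed a console, a documentation page, and a script without re-profiling.
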
Run the test suite:
```shell
cargo test -p axonml-profile
```
Licensed under either of:
at your option.