| Crates.io | temporal-compare |
| lib.rs | temporal-compare |
| version | 0.5.0 |
| created_at | 2025-09-27 17:46:01.296088+00 |
| updated_at | 2025-09-27 19:48:09.284082+00 |
| description | High-performance framework for benchmarking temporal prediction algorithms inspired by Time-R1 |
| homepage | https://github.com/ruvnet/sublinear-time-solver |
| repository | https://github.com/ruvnet/sublinear-time-solver |
| max_upload_size | |
| id | 1857511 |
| size | 184,082 |
Ultra-fast Rust framework for temporal prediction with 6x speedup via SIMD and 3.69x compression via INT8 quantization.
Imagine trying to predict the next word you'll type, the next stock price movement, or the next frame in a video. These are temporal prediction tasks - predicting future states from historical sequences. Temporal-Compare provides a testing ground to compare different approaches to this fundamental problem.
This crate implements a clean, extensible framework for comparing prediction backends along the following pipeline:
```
┌─────────────────────────────────────────────────────────┐
│                    Input Time Series                     │
│                [t-31, t-30, ..., t-1, t]                 │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│                   Feature Engineering                    │
│  • Window: 32 timesteps                                  │
│  • Regime indicators                                     │
│  • Temporal features (time-of-day)                       │
└────────────────┬────────────────────────────────────────┘
                 │
       ┌─────────┴────┬────────────┬────────────┬──────────┐
       ▼              ▼            ▼            ▼          ▼
┌──────────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│   Baseline   │ │   MLP    │ │ MLP-Opt  │ │MLP-Ultra │ │ RUV-FANN │
│  Predictor   │ │  Simple  │ │   Adam   │ │   SIMD   │ │ Network  │
│              │ │          │ │          │ │          │ │          │
│  Last value  │ │  Basic   │ │ Backprop │ │   AVX2   │ │  Rprop   │
└──────┬───────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘
       │              │            │            │            │
       └──────────────┴────────────┼────────────┴────────────┘
                                   │
                                   ▼
                      ┌─────────────────────┐
                      │       Outputs       │
                      │ • Regression (MSE)  │
                      │ • Classification    │
                      │   (3-class: ↓/→/↑)  │
                      └─────────────────────┘
```
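Conceptually, the feature-engineering stage slides a fixed window over the series and appends extra indicator features. A hypothetical sketch (the function name and the time-of-day encoding are illustrative, not the crate's actual API):

```rust
/// Build the input vector for timestep `t`: the last `window` raw values plus
/// a simple time-of-day feature (a 24-step period is assumed for illustration).
fn make_features(series: &[f32], t: usize, window: usize) -> Vec<f32> {
    assert!(t >= window, "need at least `window` past values");
    let mut features = series[t - window..t].to_vec(); // last `window` values
    features.push((t % 24) as f32 / 24.0);             // cyclical temporal feature
    features
}
```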
The synthetic time series follows a first-order autoregressive process with regime switching, Gaussian noise, and periodic impulses:

```
x(t) = 0.8 * x(t-1) + drift(regime) + N(0, 0.3) + impulse(t)
```

where:
- regime ∈ {0, 1} switches with probability 0.02 per timestep
- drift = +0.02 if regime = 0, else -0.015
- impulse = +0.9 every 37 timesteps
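A minimal sketch of that generator (illustrative, not the crate's actual code; assumes a `rand` 0.8-style API and treats 0.3 as the noise standard deviation):

```rust
use rand::Rng; // rand = "0.8" assumed

/// Generate `n` points of the regime-switching AR(1) process described above.
fn generate_series(n: usize, rng: &mut impl Rng) -> Vec<f64> {
    let mut x = vec![0.0f64; n];
    let mut regime = 0u8;
    for t in 1..n {
        // Regime flips with probability 0.02 per timestep.
        if rng.gen::<f64>() < 0.02 {
            regime ^= 1;
        }
        let drift = if regime == 0 { 0.02 } else { -0.015 };
        // N(0, 0.3) noise via Box-Muller (0.3 taken as the std deviation here).
        let (u1, u2): (f64, f64) = (rng.gen(), rng.gen());
        let noise = 0.3
            * (-2.0 * (1.0 - u1).ln()).sqrt()
            * (std::f64::consts::TAU * u2).cos();
        // Impulse of +0.9 every 37 timesteps.
        let impulse = if t % 37 == 0 { 0.9 } else { 0.0 };
        x[t] = 0.8 * x[t - 1] + drift + noise + impulse;
    }
    x
}
```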
| Backend | Accuracy | Train Time | Model Size | Key Innovation |
|---|---|---|---|---|
| MLP-Classifier | 65.2% | 1.9s | 120KB | BatchNorm + Dropout |
| Baseline | 64.3% | 0.0s | N/A | Analytical solution |
| MLP-Ultra | 64.0% | 0.5s | 100KB | AVX2 SIMD (6x speedup) |
| MLP-Quantized | 63.6% | 0.5s | 2.6KB | INT8 quantization (3.69x) |
| MLP-AVX512 | 62.0% | 0.4s | 100KB | AVX-512 (16 floats/cycle) |
| Ensemble | 59.5% | 8.2s | 400KB | 4-model weighted voting |
| Boosted | 58.0% | 10s | 200KB | AdaBoost-style iteration |
| Reservoir | 55.8% | 0.8s | 50KB | Echo state, no backprop |
| Quantum | 53.2% | 1.0s | 60KB | Quantum interference patterns |
| Fourier | 48.7% | 0.3s | 200KB | Random RBF kernel features |
| Sparse | 40.1% | 5.0s | 10KB | 91% weights pruned |
| Lottery | 38.5% | 15s | 5KB | Iterative magnitude pruning |
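The MLP-Quantized row gets its 3.69x size reduction by storing weights as `i8` plus a per-tensor `f32` scale. A minimal symmetric-quantization sketch (illustrative; the crate's actual scheme may differ):

```rust
/// Symmetric per-tensor INT8 quantization: weight ≈ q * scale.
fn quantize_int8(weights: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = weights.iter().fold(0.0f32, |m, &w| m.max(w.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = weights.iter().map(|&w| (w / scale).round() as i8).collect();
    (q, scale)
}

/// Recover approximate f32 weights for inference.
fn dequantize_int8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}
```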
```bash
# Clone the repository
git clone https://github.com/ruvnet/sublinear-time-solver.git
cd sublinear-time-solver/temporal-compare

# Build with standard features
cargo build --release

# Build with RUV-FANN backend support
cargo build --release --features ruv-fann

# Build with SIMD optimizations (recommended)
RUSTFLAGS="-C target-cpu=native" cargo build --release
```
```bash
# Baseline predictor
cargo run --release -- --backend baseline --n 5000

# Simple MLP
cargo run --release -- --backend mlp --n 5000 --epochs 20 --lr 0.001

# Optimized MLP with Adam optimizer
cargo run --release -- --backend mlp-opt --n 5000 --epochs 20 --lr 0.001

# Ultra-fast SIMD MLP (recommended for performance)
RUSTFLAGS="-C target-cpu=native" cargo run --release -- --backend mlp-ultra --n 5000 --epochs 20

# RUV-FANN backend (requires feature flag)
cargo run --release --features ruv-fann -- --backend ruv-fann --n 5000

# 3-class trend prediction (down/neutral/up)
cargo run --release -- --backend mlp --classify --n 5000 --epochs 15

# Compare against baseline
cargo run --release -- --backend baseline --classify --n 5000

# Custom window size and seed
cargo run --release -- --backend mlp --window 64 --seed 12345 --n 10000

# Full parameter control
cargo run --release -- \
    --backend mlp \
    --window 48 \
    --hidden 256 \
    --epochs 50 \
    --lr 0.0005 \
    --n 20000 \
    --seed 42
```
```bash
# Run complete comparison with timing
for backend in baseline mlp mlp-opt mlp-ultra; do
    echo "Testing $backend..."
    time cargo run --release -- --backend $backend --n 10000 --epochs 25
done

# With RUV-FANN included
cargo build --release --features ruv-fann
for backend in baseline mlp mlp-opt mlp-ultra ruv-fann; do
    echo "Testing $backend..."
    time cargo run --release --features ruv-fann -- --backend $backend --n 10000 --epochs 25
done
```
| Backend | MSE | Training Time | Speedup |
|---|---|---|---|
| Baseline | 0.112 | N/A | - |
| MLP | 0.128 | 3.057s | 1.0x |
| MLP-Opt | 0.238 | 2.100s | 1.5x |
| MLP-Ultra | 0.108 | 0.500s | **6.1x (best)** |
| RUV-FANN | 0.115 | 1.200s | 2.5x |
| Backend | Accuracy | Notes |
|---|---|---|
| Baseline | 64.7% | Simple threshold-based |
| MLP | 37.0% | Limited by numerical gradients |
| MLP-Opt | 42.3% | Improved with backprop |
| MLP-Ultra | 45.0% | SIMD-accelerated |
| RUV-FANN | 62.0% | Close to baseline |
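The MLP-Ultra numbers above come from vectorizing the hot inner loop (dot products) with AVX2. A minimal sketch of such a kernel using `std::arch` intrinsics (illustrative, not the crate's actual code; the caller must verify AVX2 and FMA support, e.g. with `is_x86_feature_detected!`):

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2", enable = "fma")]
unsafe fn dot_avx2(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::x86_64::*;
    debug_assert_eq!(a.len(), b.len());
    let mut acc = _mm256_setzero_ps();
    let chunks = a.len() / 8;
    for i in 0..chunks {
        let va = _mm256_loadu_ps(a.as_ptr().add(i * 8));
        let vb = _mm256_loadu_ps(b.as_ptr().add(i * 8));
        acc = _mm256_fmadd_ps(va, vb, acc); // 8 fused multiply-adds per step
    }
    // Horizontal sum of the 8 lanes, then the scalar tail.
    let mut lanes = [0.0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    let mut sum: f32 = lanes.iter().sum();
    for i in chunks * 8..a.len() {
        sum += a[i] * b[i];
    }
    sum
}
```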
```rust
// Reuse allocations across predictions
let pool = TensorPool::new();
let tensor = pool.acquire(size);
// ... use tensor ...
pool.release(tensor);
```
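A minimal pool matching the `acquire`/`release` calls above might look like this (a sketch assuming plain `Vec<f32>` buffers, not the crate's actual type):

```rust
use std::cell::RefCell;

/// Hypothetical Vec-backed tensor pool matching the sketch above.
pub struct TensorPool {
    free: RefCell<Vec<Vec<f32>>>,
}

impl TensorPool {
    pub fn new() -> Self {
        Self { free: RefCell::new(Vec::new()) }
    }

    /// Hand out a zeroed buffer, reusing a released allocation when possible.
    pub fn acquire(&self, size: usize) -> Vec<f32> {
        let mut buf = self.free.borrow_mut().pop().unwrap_or_default();
        buf.clear();
        buf.resize(size, 0.0);
        buf
    }

    /// Return a buffer so its heap allocation can be reused.
    pub fn release(&self, tensor: Vec<f32>) {
        self.free.borrow_mut().push(tensor);
    }
}
```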
```rust
// Parallelize batch processing with rayon's parallel iterator
use rayon::prelude::*;

batches.par_iter().for_each(|batch| {
    process_batch(batch);
});
```
```rust
// Compute in FP16, accumulate in FP32
let fp16_weights = weights.to_f16();
let result = fp16_matmul(fp16_weights, input);
```
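With the `half` crate (an assumption; the crate does not ship this yet), the compute-in-FP16 / accumulate-in-FP32 pattern could look like:

```rust
use half::f16; // half = "2" assumed in Cargo.toml

/// Multiply in f16, accumulate in f32 to limit rounding error.
fn mixed_precision_dot(weights: &[f16], input: &[f16]) -> f32 {
    weights
        .iter()
        .zip(input)
        .map(|(&w, &x)| (w * x).to_f32()) // product computed in f16
        .sum::<f32>()                     // running sum kept in f32
}
```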
```toml
burn = "0.13"
burn-wgpu = "0.13"          # WebGPU backend
candle-core = "0.3"
candle-transformers = "0.3"
```
```rust
// Compile computation graph
let graph = ComputeGraph::from_model(&model);
graph.optimize()   // Fusion, CSE, layout optimization
    .compile()     // Generate optimized code
    .execute(input);
```
```rust
#[wasm_bindgen]
pub fn predict_wasm(input: &[f32]) -> Vec<f32> {
    // Run in browser at near-native speed
    todo!("forward pass over `input`")
}
```
```rust
let best_architecture = NAS::evolve()
    .population(100)
    .generations(50)
    .optimize_for(Metric::Accuracy, Constraint::Latency(1.0))
    .run();
```
```rust
// Multi-node training with MPI
let trainer = DistributedTrainer::new();
trainer.all_reduce_gradients(&mut gradients);
```
```cuda
__global__ void quantized_matmul_int8(
    const int8_t* __restrict__ A,
    const int8_t* __restrict__ B,
    float* __restrict__ C,
    float scale_a, float scale_b
) {
    // Tensor Core INT8 operations
}
```
| Optimization | Effort | Speedup | Size Reduction | Status |
|---|---|---|---|---|
| INT8 Quantization | Low | 1x | 3.69x | ✅ Done |
| AVX2 SIMD | Low | 6x | 1x | ✅ Done |
| Memory Pooling | Low | 1.15x | 1x | ⬜ TODO |
| OpenMP | Low | 2-4x | 1x | ⬜ TODO |
| FP16 | Medium | 2x | 2x | ⬜ TODO |
| GPU (Burn) | Medium | 10-50x | 1x | ⬜ TODO |
| WASM | Medium | 0.9x | 1x | ⬜ TODO |
| NAS | High | 1.1x | Variable | ⬜ TODO |
| Distributed | High | 10-100x | 1x | ⬜ TODO |
Contributions welcome! The optimization roadmap above lists the main areas of interest.
@ruvnet - Architecture, implementation, and optimization. Pioneering work in temporal consciousness mathematics and sublinear algorithms.
MIT License - See LICENSE file for details