| Crates.io | npu-rs |
| lib.rs | npu-rs |
| version | 0.1.2 |
| created_at | 2025-10-20 10:29:35.214678+00 |
| updated_at | 2025-10-22 06:50:27.232433+00 |
| description | An NPU driver for RISC-V boards |
| homepage | https://github.com/KushalMeghani1644/NPU-rs |
| repository | https://github.com/KushalMeghani1644/NPU-rs |
| max_upload_size | |
| id | 1891724 |
| size | 132,143 |
A simulated Rust driver for neural processing units (NPUs) on RISC-V boards, modeling 20 TOPS of peak performance.
NOTE: *This crate is a simulator.* Real hardware integration requires a HAL implementation and Linux kernel module support.
NOTE: I don't own a RISC-V board, so this code has not been tested on real RISC-V hardware; use at your own risk.
Core Compute
Memory Management
Power Management
Performance Analysis
Model Optimization
Device Management
tensor: Tensor operations (add, sub, mul, div, relu, sigmoid)
device: Device driver and state management
memory: Memory allocation and tracking
compute: Matrix multiplication and convolution units
execution: Operation execution and scheduling
power: DVFS and thermal management
model: Neural network model definitions
quantization: INT8 quantization and calibration
optimizer: Graph optimization
profiler: Performance profiling
perf_monitor: Real-time metrics
error: Error handling
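As a standalone illustration of what the `quantization` module's INT8 quantization and calibration involve, here is a minimal sketch in plain Rust. It is independent of the crate's actual API: `quantize` and `dequantize` are hypothetical stand-ins showing symmetric per-tensor quantization, where calibration picks a scale so the largest magnitude maps to 127.

```rust
// Symmetric INT8 quantization sketch (NOT the crate's API):
// calibrate a scale from the data, map f32 -> i8, and back.
fn quantize(xs: &[f32]) -> (Vec<i8>, f32) {
    // Calibration: the largest absolute value maps to the i8 extreme 127.
    let max_abs = xs.iter().fold(0.0_f32, |m, &x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = xs
        .iter()
        .map(|&x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

fn dequantize(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}

fn main() {
    let xs = [0.5_f32, -1.27, 1.27];
    let (q, scale) = quantize(&xs);
    println!("quantized: {:?}, scale: {}", q, scale); // [50, -127, 127]
    println!("restored: {:?}", dequantize(&q, scale));
}
```

The round trip loses at most half a quantization step per element, which is the usual accuracy/size trade-off INT8 inference accepts.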
cargo install npu-rs
cargo build --release
NOTE: THIS CODE RUNS ON CPU ONLY; NO REAL HARDWARE EXECUTION
cargo run # Full demo
cargo run --example full_inference_pipeline # Example pipeline
Peak Throughput - 20 TOPS
Memory - 512 MB
Compute Units - 4
Frequency - 400-1000 MHz (via DVFS)
Power TDP - 1.2-5.0 W
Thermal Limit - 90 °C
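A quick back-of-the-envelope check of what these simulated specs imply: at the stated 20 TOPS peak with 4 compute units clocked at the 1000 MHz DVFS ceiling, each unit must retire 5,000 operations per cycle.

```rust
fn main() {
    let peak_tops = 20.0_f64;      // 20 TOPS peak, from the spec list above
    let units = 4.0_f64;           // 4 compute units
    let peak_hz = 1.0e9_f64;       // 1000 MHz top DVFS frequency
    // Operations each unit must complete per cycle to reach the peak figure.
    let ops_per_unit_per_cycle = peak_tops * 1e12 / (units * peak_hz);
    println!("{} ops/unit/cycle", ops_per_unit_per_cycle); // 5000
}
```

That magnitude is typical of a wide MAC array (e.g. a systolic-style matrix engine) rather than scalar lanes, which is consistent with the `compute` module exposing matrix multiplication and convolution units.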
use npu_rs::{NpuDevice, Tensor, ExecutionContext};
use std::sync::Arc;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create and initialize the simulated device.
    let device = Arc::new(NpuDevice::new());
    device.initialize()?;

    // Bind an execution context to the device.
    let ctx = ExecutionContext::new(device);

    // Random input tensors: [4, 8] x [8, 6] -> [4, 6].
    let a = Tensor::random(&[4, 8]);
    let b = Tensor::random(&[8, 6]);
    let result = ctx.execute_matmul(&a.data, &b.data)?;
    println!("Result: {:?}", result.shape());
    Ok(())
}