| Crates.io | mielin-tensor |
| lib.rs | mielin-tensor |
| version | 0.1.0-rc.1 |
| created_at | 2026-01-18 02:27:47.370638+00 |
| updated_at | 2026-01-18 02:27:47.370638+00 |
| description | Kernel-level tensor operations with hardware accelerator support (Arm SVE2/SME, CUDA, Metal, NPU) |
| homepage | |
| repository | https://github.com/cool-japan/mielin |
| max_upload_size | |
| id | 2051624 |
| size | 861,974 |
# TensorLogic - Kernel-Level Tensor Operations

Kernel integration for hardware-accelerated tensor operations leveraging Arm SVE2/SME and other accelerators.

MielinTensor provides kernel-level awareness of tensor and matrix operations, enabling efficient scheduling and resource allocation for AI workloads on MielinOS.
## Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
mielin-tensor = { path = "../mielin-tensor" }
```
## Quick Start

```rust
use mielin_tensor::TensorRuntime;
use mielin_hal::capabilities::HardwareCapabilities;

// Create a runtime with the detected capabilities
let caps = HardwareCapabilities::SVE2 | HardwareCapabilities::NEON;
let runtime = TensorRuntime::new(caps);

// Check for specific features
if runtime.supports_sve2() {
    println!("SVE2 available for vector operations");
}
if runtime.supports_sme() {
    println!("SME available for matrix operations");
}
```
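Rather than hard-coding the flags, the capabilities can be derived from runtime CPU feature detection. Below is a minimal sketch assuming `HardwareCapabilities` is a bitflags-style type with an `empty()` constructor and `|=` (an assumption about mielin-hal; `detect_capabilities` is a hypothetical helper). The feature strings `"neon"` and `"sve2"` are the names `std::arch::is_aarch64_feature_detected!` accepts on aarch64:

```rust
use mielin_hal::capabilities::HardwareCapabilities;
use mielin_tensor::TensorRuntime;

#[cfg(target_arch = "aarch64")]
fn detect_capabilities() -> HardwareCapabilities {
    let mut caps = HardwareCapabilities::empty();
    // Query the OS/CPU for each feature at runtime.
    if std::arch::is_aarch64_feature_detected!("neon") {
        caps |= HardwareCapabilities::NEON;
    }
    if std::arch::is_aarch64_feature_detected!("sve2") {
        caps |= HardwareCapabilities::SVE2;
    }
    caps
}

#[cfg(not(target_arch = "aarch64"))]
fn detect_capabilities() -> HardwareCapabilities {
    // Off Arm, report no vector extensions and run scalar code.
    HardwareCapabilities::empty()
}

let runtime = TensorRuntime::new(detect_capabilities());
```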
## API

### TensorRuntime

```rust
pub struct TensorRuntime {
    capabilities: HardwareCapabilities,
}

impl TensorRuntime {
    pub fn new(capabilities: HardwareCapabilities) -> Self;
    pub fn supports_sve2(&self) -> bool;
    pub fn supports_sme(&self) -> bool;
}
```

### Accelerator

```rust
pub struct Accelerator {
    pub name: &'static str,
}

impl Accelerator {
    pub const fn new(name: &'static str) -> Self;
}
```

### TensorScheduler

```rust
pub struct TensorScheduler {}

impl TensorScheduler {
    pub fn new() -> Self;
    pub fn schedule_matmul(&self, size: usize);
}
```
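The published `schedule_matmul` takes only a size, so backend selection is internal. A hypothetical sketch of the kind of decision it could make, using only the SVE2/NEON flags shown in the Quick Start (`Backend` and `pick_backend` are illustrative names, not part of this crate's API):

```rust
use mielin_hal::capabilities::HardwareCapabilities;

#[derive(Debug)]
enum Backend {
    Sve2,
    Neon,
    Scalar,
}

fn pick_backend(caps: HardwareCapabilities, size: usize) -> Backend {
    // Tiny problems rarely amortize vector setup and dispatch overhead.
    if size < 64 {
        return Backend::Scalar;
    }
    // Prefer the widest available vector extension.
    if caps.contains(HardwareCapabilities::SVE2) {
        Backend::Sve2
    } else if caps.contains(HardwareCapabilities::NEON) {
        Backend::Neon
    } else {
        Backend::Scalar
    }
}
```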
## Examples

### Capability-based dispatch

```rust
use mielin_tensor::TensorRuntime;
use mielin_hal::capabilities::HardwareProfile;

let hw = HardwareProfile::detect();
let runtime = TensorRuntime::new(hw.capabilities);

if runtime.supports_sve2() {
    // Use SVE2 for the hot path
    perform_sve2_matmul();
} else {
    // Fall back to scalar code
    perform_scalar_matmul();
}
```
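The example elides the two matmul functions. A minimal scalar fallback of the kind `perform_scalar_matmul` stands for (hypothetical signature; row-major `f32` matrices, `c` pre-zeroed by the caller):

```rust
/// C += A * B for square row-major matrices of dimension `n`.
fn scalar_matmul(a: &[f32], b: &[f32], c: &mut [f32], n: usize) {
    assert_eq!(a.len(), n * n);
    assert_eq!(b.len(), n * n);
    assert_eq!(c.len(), n * n);
    for i in 0..n {
        for k in 0..n {
            let aik = a[i * n + k];
            // i-k-j loop order keeps both B and C accesses sequential,
            // which is friendlier to the cache than the naive i-j-k order.
            for j in 0..n {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
}
```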
### Scheduling a matmul

```rust
use mielin_tensor::scheduler::TensorScheduler;

let scheduler = TensorScheduler::new();

// Schedule a 1024x1024 matrix multiplication
scheduler.schedule_matmul(1024);
```
## Performance

Estimated times for a 1024x1024 matrix multiplication (projections, not measured benchmarks):

| Platform | Time | Speedup vs. scalar |
|---|---|---|
| Scalar (no SIMD) | ~1000 ms | 1x (baseline) |
| NEON (128-bit) | ~100 ms | 10x |
| SVE2 (256-bit) | ~50 ms | 20x |
| SVE2 (512-bit) | ~25 ms | 40x |
| SME | ~10 ms | 100x |
| NPU | ~5 ms | 200x |
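As a sanity check on these estimates, a dense n x n matmul performs about 2n^3 floating-point operations, so the times above imply the sustained throughputs computed below (a worked example, not output of this crate):

```rust
// 2 * 1024^3 = ~2.15 GFLOP per 1024x1024 matmul.
let n = 1024u64;
let flops = 2 * n * n * n;

// Convert each estimated time into implied GFLOP/s.
for (platform, ms) in [("Scalar", 1000.0), ("SVE2 512-bit", 25.0), ("NPU", 5.0)] {
    let gflops = flops as f64 / (ms / 1000.0) / 1e9;
    println!("{platform}: ~{gflops:.0} GFLOP/s");
    // Scalar: ~2 GFLOP/s; SVE2 512-bit: ~86 GFLOP/s; NPU: ~430 GFLOP/s.
}
```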
## Kernel Integration

TensorLogic integrates with the MielinOS kernel:

```rust
use mielin_kernel::tensor::TensorContext;

// Create a context with a memory budget
let ctx = TensorContext::new(10 * 1024 * 1024); // 10 MB

// The context provides:
// - Memory allocation tracking
// - Resource limits
// - Priority scheduling
```
### Memory budgets

```rust
use mielin_kernel::tensor::TensorContext;

let ctx = TensorContext::new(memory_budget);
// The kernel ensures tensor ops don't exceed the budget
```
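A minimal sketch of the accounting a budgeted context has to perform; this standalone type is illustrative, not the kernel's implementation:

```rust
/// Tracks bytes reserved against a fixed limit.
struct MemoryBudget {
    limit: usize,
    used: usize,
}

/// Returned when a reservation would exceed the budget.
struct BudgetExceeded;

impl MemoryBudget {
    fn new(limit: usize) -> Self {
        Self { limit, used: 0 }
    }

    /// Reserve `bytes` for a tensor, failing instead of overshooting.
    fn reserve(&mut self, bytes: usize) -> Result<(), BudgetExceeded> {
        if self.used.saturating_add(bytes) > self.limit {
            return Err(BudgetExceeded);
        }
        self.used += bytes;
        Ok(())
    }

    /// Return `bytes` to the budget when a tensor is freed.
    fn release(&mut self, bytes: usize) {
        self.used = self.used.saturating_sub(bytes);
    }
}
```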
A kernel-level scheduler for tensor operations is planned as future work.
## Testing

```sh
cargo test -p mielin-tensor
```
## Status

The current implementation is foundational: this crate provides the structure for future TensorLogic integration, and several areas remain under active research.
## License

MIT OR Apache-2.0