| Crates.io | tenrso-kernels |
| lib.rs | tenrso-kernels |
| version | 0.1.0-alpha.2 |
| created_at | 2025-11-08 08:47:35.797006+00 |
| updated_at | 2025-12-17 06:33:23.126591+00 |
| description | Tensor kernel operations: Khatri-Rao, Kronecker, Hadamard, n-mode products, MTTKRP |
| homepage | https://github.com/cool-japan/tenrso |
| repository | https://github.com/cool-japan/tenrso |
| id | 1922662 |
| size | 640,258 |
Tensor kernel operations: Khatri-Rao, Kronecker, Hadamard, n-mode products, and MTTKRP.
tenrso-kernels provides high-performance tensor operation kernels that are fundamental building blocks for tensor decompositions and contractions: the Khatri-Rao, Kronecker, and Hadamard products, n-mode products, and MTTKRP.
All kernels are optimized for performance with SIMD acceleration and parallel execution.
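Add to your Cargo.toml (use one of the two dependency lines, not both):
[dependencies]
# 0.1.0-alpha.2 is a pre-release, which a bare "0.1" requirement does not match
tenrso-kernels = "0.1.0-alpha.2"
# Or, with parallel execution
tenrso-kernels = { version = "0.1.0-alpha.2", features = ["parallel"] }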
use tenrso_kernels::khatri_rao;
use scirs2_core::ndarray_ext::Array2;
let a = Array2::from_shape_fn((100, 10), |(i, j)| (i + j) as f64);
let b = Array2::from_shape_fn((50, 10), |(i, j)| (i * j) as f64);
// Column-wise Kronecker product: (100*50) × 10
let kr = khatri_rao(&a.view(), &b.view());
assert_eq!(kr.shape(), &[5000, 10]);
use tenrso_kernels::kronecker;
use scirs2_core::ndarray_ext::Array2;
let a = Array2::from_elem((2, 3), 2.0);
let b = Array2::from_elem((4, 5), 3.0);
// Tensor product: (2*4) × (3*5)
let kron = kronecker(&a.view(), &b.view());
assert_eq!(kron.shape(), &[8, 15]);
use tenrso_kernels::hadamard;
use scirs2_core::ndarray_ext::Array;
let a = Array::from_elem(vec![10, 20, 30], 2.0);
let b = Array::from_elem(vec![10, 20, 30], 3.0);
// Element-wise multiplication
let result = hadamard(&a, &b)?;
assert_eq!(result[[5, 10, 15]], 6.0);
use tenrso_kernels::nmode_product;
use scirs2_core::ndarray_ext::{Array, Array2};
// Tensor: 10 × 20 × 30
let tensor = Array::zeros(vec![10, 20, 30]);
// Matrix: 15 × 20 (contracts along mode 1)
let matrix = Array2::zeros((15, 20));
// Result: 10 × 15 × 30
let result = nmode_product(&tensor, &matrix, 1)?;
assert_eq!(result.shape(), &[10, 15, 30]);
use tenrso_kernels::mttkrp;
use scirs2_core::ndarray_ext::{Array, Array2};
// Tensor: 100 × 200 × 300
let tensor = Array::zeros(vec![100, 200, 300]);
// Factor matrices for CP decomposition
let u = Array2::zeros((200, 64)); // Mode 1 factors
let v = Array2::zeros((300, 64)); // Mode 2 factors
// MTTKRP along mode 0: (100, 64)
let result = mttkrp(&tensor, &[&u, &v], 0)?;
assert_eq!(result.shape(), &[100, 64]);
pub fn khatri_rao<T>(
a: &ArrayView2<T>,
b: &ArrayView2<T>,
) -> Array2<T>
where
T: LinalgScalar
Computes the column-wise Kronecker product. Output shape: (a.nrows() * b.nrows(), a.ncols()).
Complexity: O(I × J × K) where a is I×K and b is J×K.
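To make the output layout concrete, here is a minimal reference sketch of the Khatri-Rao product in plain Rust. It is not the crate's implementation: the name khatri_rao_ref and the Vec<Vec<f64>> representation are assumptions made for illustration, and the row ordering (row i of a paired with every row of b, with b's index varying fastest) is the usual Kronecker convention, assumed to match the crate.

```rust
/// Reference Khatri-Rao product on row-major `Vec<Vec<f64>>` matrices.
/// `a` is I×K and `b` is J×K; the result is (I*J)×K with
/// out[i * J + j][k] = a[i][k] * b[j][k].
/// (Hypothetical helper for illustration, not part of tenrso-kernels.)
fn khatri_rao_ref(a: &[Vec<f64>], b: &[Vec<f64>]) -> Vec<Vec<f64>> {
    // Take the column count from whichever input has a first row.
    let k = a.first().or(b.first()).map_or(0, |row| row.len());
    // Both inputs must have the same number of columns.
    assert!(
        a.iter().chain(b.iter()).all(|row| row.len() == k),
        "column counts must match"
    );
    let mut out = Vec::with_capacity(a.len() * b.len());
    for a_row in a {
        for b_row in b {
            // Each output row is the element-wise product of one row
            // of `a` with one row of `b`.
            out.push(a_row.iter().zip(b_row).map(|(x, y)| x * y).collect());
        }
    }
    out
}
```

The triple nesting (rows of a, rows of b, columns) is where the O(I × J × K) cost comes from.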
pub fn kronecker<T>(
a: &ArrayView2<T>,
b: &ArrayView2<T>,
) -> Array2<T>
where
T: LinalgScalar
Computes the Kronecker (tensor) product of two matrices. Output shape: (a.nrows() * b.nrows(), a.ncols() * b.ncols()).
Complexity: O(I × J × K × L) where a is I×K and b is J×L.
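The block structure of the Kronecker product can be sketched in plain Rust as follows. This is an illustrative reference, not the crate's code; kronecker_ref and the Vec<Vec<f64>> representation are assumptions.

```rust
/// Reference Kronecker product on row-major `Vec<Vec<f64>>` matrices.
/// `a` is I×K and `b` is J×L; the result is (I*J)×(K*L) with
/// out[i * J + j][k * L + l] = a[i][k] * b[j][l].
/// (Hypothetical helper for illustration, not part of tenrso-kernels.)
fn kronecker_ref(a: &[Vec<f64>], b: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let mut out = Vec::with_capacity(a.len() * b.len());
    for a_row in a {
        for b_row in b {
            let mut row = Vec::with_capacity(a_row.len() * b_row.len());
            // Every entry of `a_row` scales a full copy of `b_row`.
            for &x in a_row {
                row.extend(b_row.iter().map(|&y| x * y));
            }
            out.push(row);
        }
    }
    out
}
```

Each entry of a scales a complete copy of b, which is why the output has I·J rows and K·L columns.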
pub fn hadamard<T, D>(
a: &ArrayBase<S, D>,
b: &ArrayBase<S, D>,
) -> Result<Array<T, D>>
where
T: LinalgScalar,
D: Dimension
Element-wise multiplication. Shapes must match exactly.
Complexity: O(N) where N is total elements.
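The shape check that precedes the element-wise product might look like the sketch below. It uses a (shape, flat data) pair instead of the crate's array types, and both the name hadamard_ref and the String error type are assumptions for illustration only.

```rust
/// Element-wise product of two tensors given as (shape, flat data).
/// Returns an error when the shapes differ or when the data length
/// does not match the shape.
/// (Hypothetical helper for illustration, not part of tenrso-kernels.)
fn hadamard_ref(
    a_shape: &[usize],
    a: &[f64],
    b_shape: &[usize],
    b: &[f64],
) -> Result<Vec<f64>, String> {
    if a_shape != b_shape {
        return Err(format!("shape mismatch: {:?} vs {:?}", a_shape, b_shape));
    }
    let n: usize = a_shape.iter().product();
    if a.len() != n || b.len() != n {
        return Err(format!("data length does not match shape {:?}", a_shape));
    }
    // A single pass over all N elements.
    Ok(a.iter().zip(b).map(|(x, y)| x * y).collect())
}
```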
pub fn nmode_product<T>(
tensor: &Array<T, IxDyn>,
matrix: &Array2<T>,
mode: usize,
) -> Result<Array<T, IxDyn>>
where
T: LinalgScalar
Contracts the tensor with the matrix along the specified mode m; the size of that mode changes from Iₘ to J.
Complexity: O(I₀ × ... × Iₙ × J) where the tensor is I₀×...×Iₙ and the matrix is J×Iₘ.
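As a worked illustration of the contraction, the following plain-Rust sketch computes the n-mode product for a 3-way tensor stored as a row-major flat slice. The restriction to three modes, the name nmode_product_ref, and the storage layout are all assumptions chosen to keep the example short; the crate itself operates on dynamic-dimension arrays.

```rust
/// Reference n-mode product for a 3-way tensor in row-major flat storage.
/// `tensor` has shape `dims`; `matrix` is J×dims[mode].
/// Returns (new_dims, data) where new_dims[mode] == J and
/// out[.., r, ..] = Σ_i matrix[r][i] * tensor[.., i, ..] along `mode`.
/// (Hypothetical helper for illustration, not part of tenrso-kernels.)
fn nmode_product_ref(
    tensor: &[f64],
    dims: [usize; 3],
    matrix: &[Vec<f64>],
    mode: usize,
) -> ([usize; 3], Vec<f64>) {
    assert!(mode < 3, "mode out of range");
    assert_eq!(tensor.len(), dims[0] * dims[1] * dims[2], "data length must match dims");
    assert!(
        matrix.iter().all(|row| row.len() == dims[mode]),
        "matrix columns must equal the contracted dimension"
    );
    let mut out_dims = dims;
    out_dims[mode] = matrix.len();
    let mut out = vec![0.0; out_dims[0] * out_dims[1] * out_dims[2]];
    for i0 in 0..dims[0] {
        for i1 in 0..dims[1] {
            for i2 in 0..dims[2] {
                let idx = [i0, i1, i2];
                let x = tensor[(i0 * dims[1] + i1) * dims[2] + i2];
                // Scatter this element into every output slice along `mode`.
                for (r, row) in matrix.iter().enumerate() {
                    let mut o = idx;
                    o[mode] = r;
                    out[(o[0] * out_dims[1] + o[1]) * out_dims[2] + o[2]] += row[idx[mode]] * x;
                }
            }
        }
    }
    (out_dims, out)
}
```

Every tensor element is visited once and combined with J matrix entries, matching the stated complexity.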
pub fn mttkrp<T>(
tensor: &Array<T, IxDyn>,
factors: &[&Array2<T>],
mode: usize,
) -> Result<Array2<T>>
where
T: LinalgScalar
Matricized Tensor Times Khatri-Rao Product. Core operation for CP-ALS.
Complexity: O(I₀ × ... × Iₙ × R) where tensor is I₀×...×Iₙ, rank is R.
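For a 3-way tensor, MTTKRP along one mode can be written directly as a sum over the other two modes, without forming the Khatri-Rao product explicitly. The sketch below is a plain-Rust reference under several assumptions: three modes only, row-major flat storage, and the two factor matrices passed in increasing mode order with the target mode's own factor omitted (as in the usage example earlier in this README). The name mttkrp_ref is hypothetical.

```rust
/// Reference MTTKRP for a 3-way tensor in row-major flat storage.
/// `factor_p` and `factor_q` belong to the two modes other than `mode`,
/// in increasing mode order; each has shape dims[m]×R.
/// out[i_mode][r] = Σ tensor[i0,i1,i2] * factor_p[i_p][r] * factor_q[i_q][r].
/// (Hypothetical helper for illustration, not part of tenrso-kernels.)
fn mttkrp_ref(
    tensor: &[f64],
    dims: [usize; 3],
    factor_p: &[Vec<f64>],
    factor_q: &[Vec<f64>],
    mode: usize,
) -> Vec<Vec<f64>> {
    assert!(mode < 3, "mode out of range");
    assert_eq!(tensor.len(), dims[0] * dims[1] * dims[2], "data length must match dims");
    // The two modes that are summed away.
    let others: Vec<usize> = (0..3).filter(|&m| m != mode).collect();
    let (p, q) = (others[0], others[1]);
    assert_eq!(factor_p.len(), dims[p], "factor_p rows must equal dims[p]");
    assert_eq!(factor_q.len(), dims[q], "factor_q rows must equal dims[q]");
    let rank = factor_p.first().map_or(0, |row| row.len());
    let mut out = vec![vec![0.0; rank]; dims[mode]];
    for i0 in 0..dims[0] {
        for i1 in 0..dims[1] {
            for i2 in 0..dims[2] {
                let idx = [i0, i1, i2];
                let x = tensor[(i0 * dims[1] + i1) * dims[2] + i2];
                let (fp, fq) = (&factor_p[idx[p]], &factor_q[idx[q]]);
                // Accumulate the rank-R contribution of this element.
                for r in 0..rank {
                    out[idx[mode]][r] += x * fp[r] * fq[r];
                }
            }
        }
    }
    out
}
```

Each tensor element contributes R multiply-adds, which gives the O(I₀ × ... × Iₙ × R) cost.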
Kernels use scirs2_core SIMD operations when available.
Enable the parallel feature for multi-threaded execution (it is part of the default feature set):
[dependencies]
tenrso-kernels = { version = "0.1.0-alpha.2", features = ["parallel"] }
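To give a feel for how a kernel such as Khatri-Rao parallelizes, here is a dependency-free sketch that splits the rows of a across scoped standard-library threads. The crate's parallel feature uses Rayon; std::thread::scope is swapped in here only so the example needs no external crates, and khatri_rao_parallel with its threads parameter is a hypothetical helper.

```rust
use std::thread;

/// Khatri-Rao product with the rows of `a` split across scoped threads.
/// Output rows keep the sequential order: out[i * J + j] = a[i] .* b[j].
/// (Hypothetical helper; tenrso-kernels itself uses Rayon, not std threads.)
fn khatri_rao_parallel(a: &[Vec<f64>], b: &[Vec<f64>], threads: usize) -> Vec<Vec<f64>> {
    let threads = threads.max(1);
    // Rows of `a` per worker, rounded up and never zero.
    let chunk = ((a.len() + threads - 1) / threads).max(1);
    thread::scope(|s| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = a
            .chunks(chunk)
            .map(|a_chunk| {
                s.spawn(move || {
                    let mut rows = Vec::with_capacity(a_chunk.len() * b.len());
                    for a_row in a_chunk {
                        for b_row in b {
                            rows.push(
                                a_row.iter().zip(b_row).map(|(x, y)| x * y).collect::<Vec<f64>>(),
                            );
                        }
                    }
                    rows
                })
            })
            .collect();
        // Joining in spawn order preserves the row order of the result.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Because each worker owns a disjoint block of output rows, no synchronization is needed beyond the final join.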
Large matrices use blocked algorithms to improve cache locality.
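The general idea of cache blocking can be illustrated with a tiled matrix multiply, shown below. Matrix multiplication is used here as a stand-in because it is the textbook case; the crate's actual blocking strategy and tile sizes are not described in this README, so matmul_blocked and its block parameter are purely illustrative assumptions.

```rust
/// Cache-blocked matrix multiply C = A·B on row-major flat slices.
/// `a` is m×k, `b` is k×n, and `block` is the tile edge length
/// (a tuning knob; a good value depends on the cache size).
/// (Hypothetical helper for illustration, not part of tenrso-kernels.)
fn matmul_blocked(a: &[f64], b: &[f64], m: usize, k: usize, n: usize, block: usize) -> Vec<f64> {
    assert!(block > 0, "block size must be positive");
    assert_eq!(a.len(), m * k, "a must be m×k");
    assert_eq!(b.len(), k * n, "b must be k×n");
    let mut c = vec![0.0; m * n];
    // Walk the iteration space tile by tile so that the working set
    // of each inner triple loop stays small.
    for i0 in (0..m).step_by(block) {
        for p0 in (0..k).step_by(block) {
            for j0 in (0..n).step_by(block) {
                let i1 = (i0 + block).min(m);
                let p1 = (p0 + block).min(k);
                let j1 = (j0 + block).min(n);
                for i in i0..i1 {
                    for p in p0..p1 {
                        let a_ip = a[i * k + p];
                        for j in j0..j1 {
                            c[i * n + j] += a_ip * b[p * n + j];
                        }
                    }
                }
            }
        }
    }
    c
}
```

The arithmetic is identical to the untiled triple loop; only the order of memory accesses changes.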
Run performance benchmarks:
cargo bench --features parallel
# Compare with baseline
cargo bench --bench khatri_rao -- --baseline main
Expected performance (16-core CPU):
# Unit tests
cargo test
# Property tests
cargo test --test properties
# With all features
cargo test --all-features
See examples/ directory:
khatri_rao.rs - Basic usage and benchmarking (TODO)
mttkrp_cp.rs - MTTKRP in CP decomposition context (TODO)
nmode_tucker.rs - N-mode product for Tucker (TODO)
Feature flags:
default = ["parallel"] - Parallel execution enabled
parallel - Use Rayon for multi-threading
See ../../CONTRIBUTING.md for development guidelines.
Apache-2.0