rstsr-openblas

version: 0.6.2
created_at: 2025-02-08 05:02:34.225133+00
updated_at: 2025-11-14 04:55:00.087734+00
description: An n-Dimension Rust Tensor Toolkit
repository: https://github.com/RESTGroup/rstsr
id: 1547769
size: 359,938
ajz34 (ajz34)


README

RSTSR OpenBLAS device

This crate provides the OpenBLAS device for RSTSR.

For more information on OpenBLAS and its usage, see the documentation of rstsr-openblas-ffi.

Usage

use rstsr_core::prelude::*;
use rstsr_openblas::DeviceOpenBLAS;

// create a device that uses 16 threads
let device = DeviceOpenBLAS::new(16);
// if you want to use the default number of threads, use the following line
// let device = DeviceOpenBLAS::default();

let a = rt::linspace((0.0, 1.0, 1048576, &device)).into_shape([16, 256, 256]);
let b = rt::linspace((1.0, 2.0, 1048576, &device)).into_shape([16, 256, 256]);

// matrix multiplication (`%`) is dispatched to optimized BLAS, so it is very fast
let c = &a % &b;

// mean of all elements is also performed in parallel
let c_mean = c.mean_all();

println!("{:?}", c_mean);
// compare by absolute difference, so a result that is too small also fails
assert!((c_mean - 213.2503660477036).abs() < 1e-6);

Important Notes

  • This crate does not link OpenBLAS automatically:

    • Please add -l openblas to RUSTFLAGS, or cargo:rustc-link-lib=openblas in build.rs, or something similar, to your project. We do not use the external FFI crates blas or blas-sys, and do not automatically search for the OpenBLAS library to link against.
    • If the openmp feature is activated, please add -l gomp or -l omp to RUSTFLAGS, or cargo:rustc-link-lib=gomp or cargo:rustc-link-lib=omp in build.rs, or something similar, to your project. We do not use the external FFI crate openmp-sys, and do not automatically search for an OpenMP library to link against.
  • If your OpenBLAS is compiled with OpenMP, please enable the openmp feature on either this crate or rstsr-openblas-ffi.

    • In our testing, OpenBLAS built with OpenMP appears to be more efficient than the pthreads build. However, we have decided not to make openmp a default feature for now.
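
The linkage notes above can be sketched as a build.rs in the downstream project. This is only a minimal sketch under stated assumptions, not something shipped by this crate: it assumes your own crate declares a feature named openmp (Cargo then sets CARGO_FEATURE_OPENMP for the build script), that the GNU OpenMP runtime gomp is the one your OpenBLAS was built against (use omp for the LLVM runtime), and that the libraries are already on the linker search path. The helper name link_directives is hypothetical.

```rust
// build.rs (sketch): emit the link directives described in the notes above.
use std::env;

/// Returns the cargo directives to print.
/// `link_directives` is a hypothetical helper, not part of rstsr-openblas.
fn link_directives(use_openmp: bool) -> Vec<String> {
    // Always link against OpenBLAS itself.
    let mut directives = vec!["cargo:rustc-link-lib=openblas".to_string()];
    if use_openmp {
        // Assumption: GNU OpenMP runtime; replace `gomp` with `omp` for LLVM.
        directives.push("cargo:rustc-link-lib=gomp".to_string());
    }
    directives
}

fn main() {
    // Cargo sets CARGO_FEATURE_<NAME> for the build script when the
    // corresponding feature of *this* package is enabled (assumed: `openmp`).
    let use_openmp = env::var_os("CARGO_FEATURE_OPENMP").is_some();
    for directive in link_directives(use_openmp) {
        println!("{directive}");
    }
}
```

If OpenBLAS is installed in a non-standard location, a cargo:rustc-link-search=native=<dir> directive can be printed in the same way before the link-lib lines.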