rstsr-openblas

Crates.io: rstsr-openblas
lib.rs: rstsr-openblas
version: 0.5.1
created_at: 2025-02-08 05:02:34.225133+00
updated_at: 2025-09-01 08:13:13.402859+00
description: An n-Dimension Rust Tensor Toolkit
repository: https://github.com/RESTGroup/rstsr
id: 1547769
size: 351,918
Zhenyu Zhu ajz34 (ajz34)

RSTSR OpenBLAS device

This crate enables the OpenBLAS device for RSTSR.

For more information on OpenBLAS and its usage, see the documentation of rstsr-openblas-ffi.

Usage

use rstsr_core::prelude::*;
use rstsr_openblas::DeviceOpenBLAS;

// create a device that uses 16 threads
let device = DeviceOpenBLAS::new(16);
// if you want to use the default number of threads, use the following line
// let device = DeviceOpenBLAS::default();

let a = rt::linspace((0.0, 1.0, 1048576, &device)).into_shape([16, 256, 256]);
let b = rt::linspace((1.0, 2.0, 1048576, &device)).into_shape([16, 256, 256]);

// `%` is matrix multiplication; it is dispatched to optimized BLAS, so it is very fast
let c = &a % &b;

// the mean over all elements is also computed in parallel
let c_mean = c.mean_all();

println!("{:?}", c_mean);
assert!((c_mean - 213.2503660477036).abs() < 1e-6);

Important Notes

  • This crate does not link OpenBLAS automatically:

    • Please add -l openblas to RUSTFLAGS, or emit cargo:rustc-link-lib=openblas from build.rs, or use an equivalent mechanism in your project. This crate does not depend on the external FFI crates blas or blas-sys, and does not search for the OpenBLAS library automatically.
    • If the openmp feature is enabled, please also add -l gomp or -l omp to RUSTFLAGS, or emit cargo:rustc-link-lib=gomp or cargo:rustc-link-lib=omp from build.rs, or use an equivalent mechanism. This crate does not depend on the external FFI crate openmp-sys, and does not search for an OpenMP library automatically.
  • If your OpenBLAS is compiled with OpenMP, please enable the openmp feature of either this crate or rstsr-openblas-ffi.

    • In our testing, OpenBLAS with OpenMP is probably more efficient than OpenBLAS with pthreads. However, we have currently decided not to make openmp a default feature.
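The linkage notes above can be sketched as a build.rs in the downstream project. This is a minimal sketch under stated assumptions, not something shipped by this crate: the helper name link_directives is hypothetical, and it assumes your own crate declares an openmp feature (for example one that forwards to rstsr-openblas/openmp), which Cargo then exposes to the build script as the CARGO_FEATURE_OPENMP environment variable. It also assumes GNU OpenMP (gomp); substitute omp for LLVM's runtime.

```rust
// build.rs (sketch): emit link directives for OpenBLAS and, optionally, OpenMP.
use std::env;

/// Build the list of `cargo:rustc-link-lib` directives (hypothetical helper).
/// `omp_lib` is `Some("gomp")` or `Some("omp")` when OpenMP is needed, `None` otherwise.
fn link_directives(omp_lib: Option<&str>) -> Vec<String> {
    // OpenBLAS always has to be linked by the downstream project.
    let mut lines = vec!["cargo:rustc-link-lib=openblas".to_string()];
    if let Some(lib) = omp_lib {
        lines.push(format!("cargo:rustc-link-lib={lib}"));
    }
    lines
}

fn main() {
    // Assumption: the downstream crate has its own `openmp` feature; Cargo sets
    // CARGO_FEATURE_OPENMP for the build script when that feature is enabled.
    let omp_lib = env::var_os("CARGO_FEATURE_OPENMP").map(|_| "gomp");
    for line in link_directives(omp_lib) {
        // Cargo reads these directives from the build script's stdout.
        println!("{line}");
    }
}
```

If OpenBLAS is installed outside the linker's default search path, a cargo:rustc-link-search=native=<dir> directive pointing at the directory that contains the library may also be needed.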