Crates.io | gemm-benchmark |
lib.rs | gemm-benchmark |
version | 0.1.3 |
source | src |
created_at | 2021-07-06 18:03:12.077771 |
updated_at | 2022-08-04 06:13:08.4457 |
description | Simple BLAS [sd]gemm benchmark |
homepage | https://github.com/danieldk/gemm-benchmark |
id | 419605 |
size | 36,795 |
[sd]gemm benchmark

This is a small [sd]gemm benchmark, similar to ACES DGEMM, implemented
in Rust. It supports the following BLAS libraries, each selected with a
Cargo feature at install time: Accelerate (macOS), BLIS, Intel MKL, and
OpenBLAS.
To build the benchmark with macOS Accelerate, use:
$ cargo install gemm-benchmark --features accelerate
To build the benchmark with BLIS, use:
$ cargo install gemm-benchmark --features blis
To build the benchmark with Intel MKL statically linked, use:
$ cargo install gemm-benchmark --features intel-mkl
Intel MKL uses Zen-specific [sd]gemm kernels on AMD Zen CPUs. However,
these kernels are slower on many Zen CPUs than the AVX2 kernels. You can
build the benchmark to override Intel CPU detection, so that MKL uses
AVX2 kernels on Zen CPUs as well. This does require dynamic linking,
since it is not permitted to modify MKL binaries. To enable this
override, use the intel-mkl-amd feature:
$ cargo install gemm-benchmark --features intel-mkl-amd
To build the benchmark with OpenBLAS, use:
$ cargo install gemm-benchmark --features openblas
Set OPENBLAS_NUM_THREADS=1 before running.
By default, sgemm is benchmarked using 256 x 256 matrices, for 1,000
iterations and 1 thread. The dimensionality (-d), number of iterations
(-i), and the number of threads (-t) can be set with command-line flags.
For example:
$ gemm-benchmark -d 1024 -i 2000 -t 4
This runs the benchmark using 1024 x 1024 matrices, for 2,000 iterations,
and 4 threads. It is also possible to benchmark dgemm, using the --dgemm
option:
$ gemm-benchmark -d 1024 -i 2000 -t 4 --dgemm
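As a rough illustration of what a run like the ones above measures, the sketch below times repeated square matrix multiplications and converts the elapsed time into GFLOPS. It assumes the conventional count of 2n^3 floating-point operations per n x n product; whether gemm-benchmark counts operations exactly this way is an assumption. The names `naive_sgemm`, `total_flops`, and `gflops` are hypothetical and are not part of the crate, and the naive loop stands in for a real BLAS sgemm call.

```rust
use std::time::Instant;

/// Naive single-precision C = A * B for square, row-major n x n matrices.
/// Hypothetical stand-in for a BLAS sgemm call; far slower than real BLAS.
fn naive_sgemm(n: usize, a: &[f32], b: &[f32], c: &mut [f32]) {
    for i in 0..n {
        for j in 0..n {
            let mut acc = 0.0f32;
            for k in 0..n {
                acc += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}

/// Assumed operation count: each n x n product needs n^3 multiplies
/// and n^3 additions, so 2 * n^3 per iteration.
fn total_flops(n: usize, iterations: usize) -> f64 {
    2.0 * (n as f64).powi(3) * iterations as f64
}

/// Convert an elapsed time in seconds into GFLOPS (10^9 operations per second).
fn gflops(n: usize, iterations: usize, elapsed_secs: f64) -> f64 {
    total_flops(n, iterations) / elapsed_secs / 1e9
}

fn main() {
    // Small sizes so the naive loop finishes quickly.
    let (n, iterations) = (64, 10);
    let a = vec![1.0f32; n * n];
    let b = vec![2.0f32; n * n];
    let mut c = vec![0.0f32; n * n];

    let start = Instant::now();
    for _ in 0..iterations {
        naive_sgemm(n, &a, &b, &mut c);
    }
    let elapsed = start.elapsed().as_secs_f64();

    println!("checksum: {}", c[0]);
    println!("{:.2} GFLOPS", gflops(n, iterations, elapsed));
}
```

Under this assumption, the `-d 1024 -i 2000` run above performs about 4.3 x 10^12 floating-point operations in total, so a run that took 2 seconds would correspond to roughly 2147 GFLOPS.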
The following table shows GFLOPS for various CPUs using 1 to 16 threads with 768 x 768 matrices.
Threads | M1 Accelerate | M1 Pro Accelerate | M1 Ultra Accelerate | Ryzen 3700X MKL | Ryzen 5900X MKL |
---|---|---|---|---|---|
1 | 1340 | 2061 | 2177 | 134 | 148 |
2 | 1226 | 2583 | 3427 | 262 | 284 |
4 | 1102 | 2685 | 3788 | 513 | 558 |
8 | 1253 | 2381 | 4344 | 924 | 1106 |
12 | 1225 | 2248 | 4261 | 989 | 1555 |
16 | 1217 | 2254 | 4376 | 850 | 1390 |
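One way to read the table is in terms of scaling: how much faster each thread count is than the single-thread run, and how much of the ideal linear speedup is retained. The sketch below computes both for the Ryzen 5900X MKL column. The helper `scaling` is a hypothetical name introduced for illustration, and it assumes the first entry is the 1-thread measurement.

```rust
/// For each thread count, return (threads, speedup, efficiency), where
/// speedup is relative to the first entry and efficiency is speedup
/// divided by the thread count. Assumes the first entry is the 1-thread run.
fn scaling(threads: &[u32], gflops: &[f64]) -> Vec<(u32, f64, f64)> {
    let base = gflops[0];
    threads
        .iter()
        .zip(gflops)
        .map(|(&t, &g)| {
            let speedup = g / base;
            (t, speedup, speedup / t as f64)
        })
        .collect()
}

fn main() {
    // Ryzen 5900X MKL column, copied from the table above.
    let threads = [1, 2, 4, 8, 12, 16];
    let gflops = [148.0, 284.0, 558.0, 1106.0, 1555.0, 1390.0];

    for (t, speedup, efficiency) in scaling(&threads, &gflops) {
        println!(
            "{:>2} threads: {:.2}x speedup, {:.0}% efficiency",
            t,
            speedup,
            efficiency * 100.0
        );
    }
}
```

The same helper can be applied to any other column by swapping in its values.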