tract-linalg

Crates.io: tract-linalg
lib.rs: tract-linalg
created_at: 2019-01-16 15:41:28.476283
updated_at: 2024-04-03 17:36:16.560717
description: Tiny, no-nonsense, self contained, TensorFlow and ONNX inference
homepage:
repository: https://github.com/snipsco/tract
max_upload_size:
id: 108938
Mathieu Poumeyrol

documentation

README

# tract-linalg

linalg stands for "linear algebra". This is a misnomer. This crate contains
low-level, architecture-dependent optimisations used by tract-core.

# Functions

* MatMatMul: Extended matrix*matrix product:
    * inspired by the GotoBLAS and BLIS micro-kernel approach
    * extended for convolution-friendly addressing (fused img2col)
    * fused output pipeline (min, max, and a few more simple, fast ops)
    * f32*f32 -> f32 (à la sgemm)
    * i8*i8 -> i32 accumulator -> i32 storage
    * i8*i8 -> i32 accumulator -> i8 (with channel zeropoint and scale, and
      re-quantization pipeline)
* f32 sigmoid and f32 tanh: at f32 precision, by a rational function (no
  exponentiation)
* byte-to-byte lookup table

# Implementations

|                   | generic fallback | armv6, vfp | armv7 neon | armv8 simd | x64 FMA |
|-------------------|------------------|------------|------------|------------|---------|
| MatMatMul f32     |                  | 4x4        | 8x4        | 8x8        | 16x6    |
| MatMatMul i8->i8  |                  |            | 8x4        |            | 8x8     |
| MatMatMul i8->i32 |                  |            |            |            | 8x8     |
| sigmoid f32       |                  |            | 4n         | 4n         |         |
| tanh f32          |                  |            | 4n         | 4n         |         |
| byte lookup       |                  |            |            |            |         |
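To make the `i8*i8 -> i32 accumulator -> i8` path more concrete, here is a minimal scalar sketch in plain Rust. It is not tract-linalg's API: the function names (`matmul_i8_i32`, `requantize`, `matmul_i8_i8`) are hypothetical, the layout is assumed to be row-major, and a single scale and zero point are used for the whole output instead of per-channel values. It only illustrates the idea of accumulating in i32 and then re-quantizing to i8 with rounding and saturation.

```rust
/// Naive i8 x i8 matrix product with i32 accumulation.
/// `a` is m x k and `b` is k x n, both row-major (an assumption of this sketch).
/// Hypothetical helper, not part of tract-linalg.
fn matmul_i8_i32(a: &[i8], b: &[i8], m: usize, k: usize, n: usize) -> Vec<i32> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    let mut c = vec![0i32; m * n];
    for row in 0..m {
        for col in 0..n {
            // Widen each operand to i32 before multiplying so the
            // products and their sum do not overflow the i8 range.
            let mut acc = 0i32;
            for i in 0..k {
                acc += a[row * k + i] as i32 * b[i * n + col] as i32;
            }
            c[row * n + col] = acc;
        }
    }
    c
}

/// Re-quantize one i32 accumulator to i8:
/// scale, round to nearest, add the zero point, then saturate.
/// A single scale and zero point are assumed here for simplicity.
fn requantize(acc: i32, scale: f32, zero_point: i32) -> i8 {
    let scaled = (acc as f32 * scale).round() as i32 + zero_point;
    scaled.clamp(i8::MIN as i32, i8::MAX as i32) as i8
}

/// i8 x i8 -> i32 accumulator -> i8 storage, built from the two helpers above.
fn matmul_i8_i8(
    a: &[i8],
    b: &[i8],
    m: usize,
    k: usize,
    n: usize,
    scale: f32,
    zero_point: i32,
) -> Vec<i8> {
    matmul_i8_i32(a, b, m, k, n)
        .into_iter()
        .map(|acc| requantize(acc, scale, zero_point))
        .collect()
}
```

For example, calling `matmul_i8_i8(&[1, 2, 3, 4], &[5, 6, 7, 8], 2, 2, 2, 0.5, 1)` multiplies two 2x2 matrices, halves each accumulator, and shifts it by a zero point of 1. An optimised kernel would compute the same result over small register tiles (such as the 8x8 tile listed in the table above) and fuse the re-quantization into the output pipeline rather than making a second pass.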
Commit count: 5734
