Crates.io | strength_reduce |
lib.rs | strength_reduce |
version | 0.2.4 |
source | src |
created_at | 2019-01-03 09:25:27.861719 |
updated_at | 2022-11-08 03:46:49.572569 |
description | Faster integer division and modulus operations |
homepage | |
repository | http://github.com/ejmahler/strength_reduce |
max_upload_size | |
id | 105196 |
size | 80,457 |
strength_reduce implements integer division and modulo via "arithmetic strength reduction".
Modern processors can do multiplication and shifts much faster than division, and "arithmetic strength reduction" is an algorithm to transform divisions into multiplications and shifts. Compilers already perform this optimization for divisors that are known at compile time; this library enables this optimization for divisors that are only known at runtime.
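To make the transformation concrete, here is a minimal sketch of the idea for 32-bit values. The `ReducedU32` type and its methods are hypothetical names chosen for illustration, not the crate's API, and the crate's real implementation may differ in its details. The sketch precomputes `ceil(2^64 / divisor)` once, then replaces each division with a widening multiply and a shift; power-of-two divisors are handled with a plain shift.

```rust
/// Hypothetical illustration type; not part of the strength_reduce API.
#[derive(Clone, Copy)]
struct ReducedU32 {
    divisor: u32,
    /// ceil(2^64 / divisor), or 0 to mark a power-of-two divisor.
    multiplier: u64,
}

impl ReducedU32 {
    /// One-time setup: this is the only place a hardware division happens.
    fn new(divisor: u32) -> Self {
        assert!(divisor > 0, "divisor must be nonzero");
        let multiplier = if divisor.is_power_of_two() {
            0
        } else {
            // For a divisor that is not a power of two this equals ceil(2^64 / divisor).
            u64::MAX / u64::from(divisor) + 1
        };
        ReducedU32 { divisor, multiplier }
    }

    /// Computes n / divisor using a multiply and a shift instead of a division.
    fn div(self, n: u32) -> u32 {
        if self.multiplier == 0 {
            n >> self.divisor.trailing_zeros()
        } else {
            // The high 64 bits of the 128-bit product are the quotient.
            ((u128::from(n) * u128::from(self.multiplier)) >> 64) as u32
        }
    }

    /// Computes n % divisor from the quotient.
    fn rem(self, n: u32) -> u32 {
        n - self.div(n) * self.divisor
    }
}
```

With this sketch, `ReducedU32::new(7).div(100)` yields 14 and `ReducedU32::new(7).rem(100)` yields 2, matching `100 / 7` and `100 % 7`.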
Benchmarking shows a 5-10x speedup on integer division and modulo operations.
This library is intended for hot loops like the example below, where a division is repeated many times in a loop with the divisor remaining unchanged. There is a setup cost associated with creating strength-reduced division instances, so using strength-reduced division for 1-2 divisions is not worth the setup cost. The break-even point differs by use case, but is typically low: benchmarking has shown that it takes 3 to 4 repeated divisions with the same StrengthReduced## instance to be worth it.
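The trade-off between setup cost and per-division savings can be sketched as follows. The `divide_all` helper and the `BREAK_EVEN` constant are hypothetical and only illustrate the reasoning; the cutoff of 4 is an assumption taken from the figure quoted above and should be measured for a real workload. The reduced path here uses a hand-rolled multiplier rather than the crate's types so that the snippet stands alone.

```rust
/// Assumed cutoff, based on the "3 to 4 divisions" figure; measure before relying on it.
const BREAK_EVEN: usize = 4;

/// Hypothetical helper: divides every element of `values` by `divisor` in place.
fn divide_all(values: &mut [u32], divisor: u32) {
    assert!(divisor > 0, "divisor must be nonzero");

    if values.len() < BREAK_EVEN {
        // Too few divisions to amortize the setup: use plain division.
        for v in values.iter_mut() {
            *v /= divisor;
        }
    } else if divisor.is_power_of_two() {
        // Power-of-two divisors reduce to a shift.
        let shift = divisor.trailing_zeros();
        for v in values.iter_mut() {
            *v >>= shift;
        }
    } else {
        // Setup cost: one 64-bit division, paid once for the whole slice.
        let multiplier = u64::MAX / u64::from(divisor) + 1;
        for v in values.iter_mut() {
            // Per-element cost: one widening multiply and one shift.
            *v = ((u128::from(*v) * u128::from(multiplier)) >> 64) as u32;
        }
    }
}
```

All three branches produce the same results; only the cost profile differs.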
strength_reduce is #![no_std]
See the API Documentation for more details.
use strength_reduce::StrengthReducedU64;
let mut my_array: Vec<u64> = (0..500).collect();
let divisor = 3;
let modulo = 14;
// slow naive division and modulo
for element in &mut my_array {
    *element = (*element / divisor) % modulo;
}
// fast strength-reduced division and modulo
let reduced_divisor = StrengthReducedU64::new(divisor);
let reduced_modulo = StrengthReducedU64::new(modulo);
for element in &mut my_array {
    *element = (*element / reduced_divisor) % reduced_modulo;
}
strength_reduce uses proptest to generate test cases. In addition, the u8 and u16 problem spaces are small enough that we can exhaustively test every possible combination of numerator and divisor. However, the u16 exhaustive test takes several minutes to run, so it is marked #[ignore]. Before submitting pull requests, please test with cargo test -- --ignored at least once.
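As a rough illustration of what exhaustive testing over the u8 space looks like, the sketch below checks a hand-rolled multiply-and-shift division against native division for every nonzero divisor and every numerator. Both function names are hypothetical, and the reduced division here is a stand-in written for this example, not the crate's own code or its actual test suite.

```rust
/// Hypothetical stand-in for a strength-reduced u8 division.
fn reduced_div_u8(n: u8, d: u8) -> u8 {
    assert!(d > 0, "divisor must be nonzero");
    if d.is_power_of_two() {
        n >> d.trailing_zeros()
    } else {
        // ceil(2^16 / d), valid because d is not a power of two.
        let multiplier = u16::MAX / u16::from(d) + 1;
        // The high 16 bits of the 32-bit product are the quotient.
        ((u32::from(n) * u32::from(multiplier)) >> 16) as u8
    }
}

/// Compares every (numerator, divisor) pair against native division.
/// Returns the number of pairs checked; panics on the first mismatch.
fn exhaustive_u8_check() -> usize {
    let mut checked = 0;
    for d in 1..=u8::MAX {
        for n in 0..=u8::MAX {
            assert_eq!(reduced_div_u8(n, d), n / d, "n = {}, d = {}", n, d);
            checked += 1;
        }
    }
    checked
}
```

The u8 space has only 255 × 256 = 65,280 pairs, so this finishes almost instantly. The u16 space has roughly 4.3 billion pairs, which is consistent with that test taking minutes and being marked #[ignore].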
The strength_reduce crate requires rustc 1.26 or greater.
Licensed under either of
at your option.