# `unroll` Unroll for loops with integer literal bounds using an attribute-like macro. [![On crates.io](https://img.shields.io/crates/v/unroll.svg)](https://crates.io/crates/unroll) [![On docs.rs](https://docs.rs/unroll/badge.svg)](https://docs.rs/unroll/) [![Build status](https://gitlab.com/elrnv/unroll/badges/master/pipeline.svg)](https://gitlab.com/elrnv/unroll/pipelines) This crate provides a attribute-like macro `unroll_for_loops` that can be applied to functions. This macro looks for loops to unroll and unrolls them at compile time. ## Why a procedural macro? There is already a crate called [Crunchy](https://github.com/Vurich/crunchy) that unrolls loops with a `macro_rules!` macro. This crate provides an alternative solution for unrolling loops using an attribute-like macro instead. Some benefits of an item based attribute-like macro for unrolling loops are: - Macro invocation doesn't pollute the algorithm with annotations. Only one annotation is needed for each function. - Can use variables defined outside of the macro invocation. - Can be extended to do partial loop unrolling ## Usage Just add `#[unroll_for_loops]` above the function whose for loops you would like to unroll. Currently all for loops with integer literal bounds will be unrolled, although this macro currently can't see inside complex code (e.g. for loops within closures). ## Examples The following example demonstrates the kinds of code that can be unrolled automatically. ```rust #[unroll_for_loops] fn main() { println!("matrix = "); for i in 0..10 { for j in 0..10 { print!("({:?}, {:?})", i, j); } println!(""); } } ``` ### Good Uses Of course not all code benefits from unrolling. For instance the code above is bound by I/O, and hence likely wont benefit from unrolling. On the other hand, computational tasks, such as matrix multiplication, can benefit significantly as demonstrated by this simple example of a matrix-vector product: ```rust #[unroll_for_loops] fn mtx_vec_mul(mtx: &[[f64; 5]; 5], vec: &[f64; 5]) -> [f64; 5] { let mut out = [0.0; 5]; for col in 0..5 { for row in 0..5 { out[row] += mtx[col][row] * vec[col]; } } out } ``` A quick [benchmark](./benches/matrix_vector_product.rs) of this code generated by [criterion](https://crates.io/crates/criterion) shows a 3x performance increase with loop unrolling: ``` Matrix-Vector Product/Ordinary time: [2.7254 ns 2.7657 ns 2.8153 ns] Matrix-Vector Product/Unrolled time: [878.71 ps 882.96 ps 888.01 ps] ``` The relative performance difference can be visualized in the violin plot:

Violin plot for Matrix-Vector product benchmark

### Bad Uses There are also places where this macro has no benefit where one might be expected. For instance, when computing a sum of elements in a large array, one might break the computation into independent blocks to allow the CPU to perform out of order execution optimizations. For example the following code: ```rust fn sum(data: &[f64]) -> f64 { let mut sum = 0.0; for i in 0..data.len() { sum += data[i]; } sum } ``` can get a substantial boost in performance when computed in chunks as: ```rust fn sum(data: &[f64]) -> f64 { let mut sum = [0.0; 32]; for i in (0..data.len()).step_by(32) { for j in 0..32 { sum[j] += data[i + j]; } } sum.iter().sum() } ``` However this example is unlikely to benefit from the `unroll_for_loops` macro. A [benchmark](./benches/unroll_sum.rs) for unrolling sums is included for verification. Having said this, a macro that can partially unroll loops as shown above can be very useful and is left for future work. Contributions are welcome! # Acknowledgements I would like to thank the author of [Crunchy](https://github.com/Vurich/crunchy) for the initial great solution to this problem. # License This repository is licensed under either of * Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or https://www.apache.org/licenses/LICENSE-2.0) * MIT License ([LICENSE-MIT](LICENSE-MIT) or https://opensource.org/licenses/MIT) at your option.