--- id: optimisations title: Optimisations --- bfc includes an extensive range of optimisations. This page discusses the techniques used, and gives examples of BF programs that are transformed in each case. See also [BF compliance](compliance.md) for implementation decisions that affect BF program portability. By default, bfc will run all optimisations. This can be controlled by CLI options: ``` --opt=0 # no optimisations --opt=1 # peephole only --opt=2 # peephole and speculative execution ``` ## Peephole Optimisations Peephole optimisations operate on small sequences of BF instructions. When applied together they provide significant speedups. ### Combining Instructions We combine successive increments/decrements: ``` Compile Combine +++ => Increment 1 => Increment 3 Increment 1 Increment 1 ``` If increments/decrements cancel out, we remove them entirely. ``` Compile Combine +- => Increment 1 => # nothing! Increment -1 ``` We combine pointer increments: ``` Compile Combine >> => PointerIncrement 1 => PointerIncrement 2 PointerIncrement 1 ``` We do the same thing for successive sets: ``` Combine Set 1 => Set 2 Set 2 ``` We combine sets and increments too: ``` Compile Known zero Combine + => Increment 1 => Set 0 => Set 1 Increment 1 ``` We remove increments when there's a set immediately after: ``` Combine Increment 1 => Set 2 Set 2 ``` We remove both increments and sets if there's a read immediately after: ``` Combine Increment 1 => Read Read ``` We track the current cell position in straight-line code. If we can determine the last instruction to modify the current cell, it doesn't need to be immediately previous. For example, `+>-<,`: ``` Combine Increment 1 => PointerIncrement 1 PointerIncrement 1 Increment -1 Increment -1 PointerIncrement -1 PointerIncrement -1 Read Read ``` ### Loop Simplification `[-]` is a common BF idiom for zeroing cells. We replace that with `Set`, enabling further instruction combination. ``` Compile Simplify [-] => Loop => Set 0 Increment -1 ``` ### Dead Code Elimination We remove loops that we know are dead. For example, loops at the beginning of a program: ``` Compile Known zero DCE [>] => Loop => Set 0 => Set 0 DataIncrement 1 Loop DataIncrement ``` Loops following another loop (one BF technique for comments is `[-][this, is+a comment.]`). ``` Compile Annotate DCE [>][>] => Loop => Loop => Loop DataIncrement 1 DataIncrement 1 DataIncrement 1 Loop Set 0 Set 0 DataIncrement 1 Loop DataIncrement 1 ``` Loops where the cell has previously been set to zero: ``` Compile Simplify DCE [-]>+<[] => Loop => Set 0 => Set 0 Increment -1 DataIncrement 1 DataIncrement 1 DataIncrement 1 Increment 1 Increment 1 Increment 1 DataIncrement -1 DataIncrement -1 DataIncrement -1 Loop Loop ``` We remove redundant set commands after loops (often generated by loop annotation as above). ``` Remove redundant set Loop => Loop Increment -1 Increment -1 Set 0 ``` We also remove dead code at the end of a program. ``` Remove pure code Write => Write Increment 1 ``` Finally, we remove cell modifications that are immediately overwritten by reads, e.g. `+,` is equivalent to `,`. ### Reorder with offsets Given a sequence of instructions without loops or I/O, we can safely reorder them to have the same effect (we assume no out-of-bound cell access). This enables us to combine pointer operations: ``` Compile Reorder >+> => PointerIncrement 1 => Increment 1 (offset 1) Increment 1 PointerIncrement 2 PointerIncrement 1 ``` We also ensure we modify cells in a consistent order, to aid cache locality. For example, `>+<+>>+` writes to cell #1, then cell #0, then cell #2. We reorder these instructions to obtain: ``` Increment 1 (offset 0) Increment 1 (offset 1) Increment 1 (offset 2) PointerIncrement 2 ``` ### Multiply-move loops bfc can detect loops that perform multiplication and converts them to multiply instructions. This works for simple cases like `[->++<]` (multiply by two into the next cell) as well as more complex cases like `[>-<->>+++<<]`. ## Cell Bounds Analysis bfc provides programs with [up to 100,000 cells](/docs/compliance), all of which must be zero-initialised. However, most programs don't use the whole range. bfc uses static analysis to work out how many cells a BF program may use, so it doesn't need to allocate or zero-initialise more memory than necessary. ``` >><< only uses three cells ``` ``` [>><<] uses three cells at most ``` ``` [>><<]>>> uses four cells at most ``` ``` [>] may use any number of cells, so we must assume 100,000 ``` ## Speculative Execution bfc executes as much as it can at compile time. For some programs (such as hello_world.bf) this optimises away the entire program to just writing to stdout. bfc doesn't even need to allocate memory for cells in this situation. ``` $ cargo run -- sample_programs/hello_world.bf --dump-llvm @known_outputs = constant [13 x i8] c"Hello World!\0A" declare i32 @write(i32, i8*, i32) define i32 @main() { entry: %0 = call i32 @write(i32 0, i8* getelementptr inbounds ([13 x i8]* @known_outputs, i32 0, i32 0), i32 13) ret i32 0 } ``` ### Ensuring Termination bfc sets a maximum number of execution steps, avoiding infinite loops hanging the compiler. As a result `+[]` will have `+` executed (so our initial cell value is `1` and `[]` will be in the compiled output. ### Handling Unknown Values If a program reads from data from stdin, speculation execution stops. As a result, `>,` will have `>` executed (setting the initial cell pointer to 1) and `,` will be in the compiled output. ### Partial Loop Evaluation If loops can be entirely executed at compile time, they will be removed from the resulting binary. Partially executed loops will be included in the output, but runtime execution can begin at an arbitrary position in the loop. For example, consider `+[-]+[+,]`. We can execute `+[-]+` entirely, but `[+,]` depends on runtime values. The compiled output contains `[+,]`, but we start execution at the `,` (continuing execution from where compile time execution had to stop). ## Further Reading Interested readers may also enjoy my blog posts: * [An Optimising BF Compiler](http://www.wilfred.me.uk/blog/2015/08/29/an-optimising-bf-compiler/) * [Even More BF Optimisations](http://www.wilfred.me.uk/blog/2015/10/18/even-more-bf-optimisations/) * [An Industrial Grade BF Compiler](http://www.wilfred.me.uk/blog/2016/02/07/an-industrial-grade-bf-compiler)