Crates.io | thumb2-stack-size |
lib.rs | thumb2-stack-size |
version | 0.1.1 |
source | src |
created_at | 2018-04-08 20:27:53.469215 |
updated_at | 2018-04-08 20:31:06.42002 |
description | estimates stack space requirements of thumb2 programs |
homepage | |
repository | https://git.eno.space/thumb2-stack-size.git/ |
max_upload_size | |
id | 59624 |
size | 86,682 |
A tool for estimating Thumb/Thumb2 binary stack usage.
Microcontroller development often requires explicit management of stacks. Microcontrollers by and large do not have virtual addressing capabilities, thus their stacks must be pre-allocated entirely in advance. They also mostly feature only small quantities of RAM, thus stacks on microcontrollers are usually small. This present a bit of a dilemma: enough stack space must be allocated to ensure that no overruns occur, but as little memory as possible should be wasted on stacks.
This tool can estimate how much stack space is used by a given Thumb or Thumb2 binary.
thumb2-stack-size supports only 32 bit statically linked ARM executable ELF files at this time. The ELF files must have symbols information, debug information is not necessary. The symbol table are used to reconstruct functions from the input binary.
The input file is read and split into functions using the symbol table to
locate functions and their sizes. Each function, at this point a stream of
instructions, is then parsed instruction for instruction. If any instructions
are encountered that are not defined by the ARM ISA an error is raised and
estimation is aborted. ("Defined" includes the permanently undefined
instructions udf
and udf.w
).
The resulting functions are run through a virtual machine that tracks stacks manipulations done by each instruction. Stack manipulations can be either static (eg push, pop, reserving large stack chunks at once) or dynamic (eg alloca). Traces can also be static or dynamic, but may also be divergent: if two traces lead to the same instruction, but reach that instruction with different stack pointer values, the function cannot be analyzed fully. The tool will attempt a best-effort analysis and calculate a lower bound on the functions stack requirements.
All possible traces through the function are simulated to give three possible results for a given function:
alloca
in a loop.This tool notable does not support analysis of table branches (ie the tbb
and
tbh
instructions). Table branches are often emitted by the Rust formatting
code. Whenever such a branch is discovered a warning is issued. The binary can
be recompile with -C llvm-args=-min-jump-table-entries=99999
and reanalyzed;
the -min-jump-table-entries
option is set to such a high value here that LLVM
will not allowed to emit table branches. The resulting binary is functionally
identical with the same stack properties, but likely larger.
When run without options the tool will output all warnings, followed by a table that contains one line for each function. The table lists
Each warning is annotated with the (demangled) name of the symbol that caused the warning and its address. The address is useful to disambiguate functions that have the same demangled name, as is common for generic functions that are called with different type signatures.
A small program that contains indirect branches (in this case calls through a function pointer) could produce the following output:
warning: function __reset @ 8000040 cannot be statically analyzed
* indirect branches found
warning: function hcl::timer::TimerSet::tick @ 80001cc cannot be statically analyzed
* indirect branches found
warning: function timer::systick @ 8000080 cannot be statically analyzed
* calls to dynamic functions found: 80001cc,
8000040 0 dynamic __reset
8000080 48 dynamic timer::systick
8000084 128 static timer::main
8000134 136 static __hcl_entry
800013e 0 static hcl::qlist::QlistData::unlink
800015c 0 static hcl::qlist::QlistData::pop_front
8000168 0 static <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
80001cc 48 dynamic hcl::timer::TimerSet::tick
800023c 32 static hcl::timer::TimerSet::add_stub
8000272 0 static hcl::timer::invoke_indirect
8000296 16 static hcl::timer::invoke_indirect
80002b6 32 static hcl::timer::invoke_indirect
Additionally to the table the tool can also output the call graph of all functions. The call graph may be used to track down indirect function calls if the developer has knowledge about certain functions that is not available to the estimation algorithm.
The call graph is split into newline-separated sections, one section for each function in the input. Each function is traversed in depth-first order, all callees of the function are listed recursively in the process. Mutually recursive functions are also printed, but the call graph is truncated with an ellipsis marker at the first function that appears twice in the graph.
Each line of a call graph contains the address of the called function, beginning with the function for which the call graph is printed. For each function the analysis result is printed as well, followed by the function name.
The same binary as above produces the following call graph:
8000040 0 dynamic __reset
8000080 48 dynamic timer::systick
80001cc 48 dynamic └ hcl::timer::TimerSet::tick
800015c 0 static ├ hcl::qlist::QlistData::pop_front
800013e 0 static │ └ hcl::qlist::QlistData::unlink
8000168 0 static └ <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
8000084 128 static timer::main
800023c 32 static └ hcl::timer::TimerSet::add_stub
8000168 0 static └ <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
8000134 136 static __hcl_entry
8000084 128 static └ timer::main
800023c 32 static └ hcl::timer::TimerSet::add_stub
8000168 0 static └ <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
800013e 0 static hcl::qlist::QlistData::unlink
800015c 0 static hcl::qlist::QlistData::pop_front
800013e 0 static └ hcl::qlist::QlistData::unlink
8000168 0 static <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
80001cc 48 dynamic hcl::timer::TimerSet::tick
800015c 0 static ├ hcl::qlist::QlistData::pop_front
800013e 0 static │ └ hcl::qlist::QlistData::unlink
8000168 0 static └ <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
800023c 32 static hcl::timer::TimerSet::add_stub
8000168 0 static └ <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
8000272 0 static hcl::timer::invoke_indirect
8000296 16 static hcl::timer::invoke_indirect
800013e 0 static └ hcl::qlist::QlistData::unlink
80002b6 32 static hcl::timer::invoke_indirect
800013e 0 static ├ hcl::qlist::QlistData::unlink
8000168 0 static └ <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
8000040 0 dynamic __reset
8000080 48 dynamic timer::systick
8000084 128 static timer::main
8000134 136 static __hcl_entry
800013e 0 static hcl::qlist::QlistData::unlink
800015c 0 static hcl::qlist::QlistData::pop_front
8000168 0 static <hcl::qlist::QlistLock<'q, A>>::unsafe_insert_before
80001cc 48 dynamic hcl::timer::TimerSet::tick
800023c 32 static hcl::timer::TimerSet::add_stub
8000272 0 static hcl::timer::invoke_indirect
8000296 16 static hcl::timer::invoke_indirect
80002b6 32 static hcl::timer::invoke_indirect
Note that hcl::timer::invoke_indirect
appears multiple times, bit with different addresses.
This view may be used to deduce more information about stack usage. In the
example case it is known that systick
calls hcl::timer::TimerSet::tick
,
which in turn may call any of the hcl::timer::invoke_indirect
functions. Here
tick
uses at least 48 bytes, plus at most 32 bytes for any of the
invoke_indirect
functions. With this we can deduce that tick
will not
require more than 80 bytes of stack space.