`perf focus` is an in-progress tool that can be used with Linux
`perf` to answer queries about runtimes. Note that this tool is under
heavy development and you should not expect stable usage at this
point. Moreover, this documentation is incomplete; run
`perf-focus --help` for a complete list of options (or, of course, you
can read the source).

### Install and gather data

The first step is to install the tool:

```bash
> cargo install perf-focus
```

Next, you will want to gather some data by running `perf`. Use the
following settings to get a good backtrace:

```bash
> perf record -F 99 -g <some-command-here>
```

**Note:** if you are profiling a Rust program, enable debuginfo to
ensure that we can get a backtrace.

### Critical idea: a matcher

When you use the tool, you must specify a matcher. The matcher lets
you filter the samples to find those that take place in a particular
function of interest; this lets you easily analyze cross-cutting parts
of the code that are invoked from many places. Here is a simple
example:

```
> perf focus '{^middle::traits}..{^je_}'
```

This call will analyze the `perf.data` file generated by `perf record`
and report what percentage of samples were taken when some function
whose name begins with `middle::traits` was on the stack and it
invoked (transitively) a function whose name began with `je_`. In the
query syntax, `{<regex>}` matches a single function whose name is
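given by the embedded regular expression. The `,` operator first
matches its left-hand side and then requires its right-hand side to
match the frames that immediately follow, so `M,N` matches `M`
calling `N` directly. The `..M` prefix skips over any number of frames
before matching `M`. It can also be used as a binary operator, so that
`M..N` is equivalent to `M,..N`.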

```
> perf focus '{^a$},{^b$}'
```

Reports how often a function named `a` directly called a function
named `b`.

```
> perf focus '{^a$},!..{^b$}'
```

Reports how often a function named `a` was found on the stack
*without* having (transitively) called a function named `b`.
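
To make the operators concrete, here is a small Python model of how a
matcher could be evaluated against a single stack. This is only an
illustrative sketch, not the tool's actual implementation: the helper
names (`func`, `then`, `skip`, `no`, `matches`) are made up for this
example, and it assumes that frames are listed from the outermost
caller to the innermost callee and that a query may start matching at
any frame.

```python
import re

# Each matcher is a function (frames, i) -> list of positions where the
# match can end when it starts at frame index i. Empty list = no match.

def func(pattern):
    """Model of `{<regex>}`: match exactly one frame by name."""
    rx = re.compile(pattern)
    def m(frames, i):
        return [i + 1] if i < len(frames) and rx.search(frames[i]) else []
    return m

def then(a, b):
    """Model of `A,B`: match A, then B on the frames right after it."""
    def m(frames, i):
        return [k for j in a(frames, i) for k in b(frames, j)]
    return m

def skip(a):
    """Model of `..A`: skip any number of frames, then match A."""
    def m(frames, i):
        return [k for j in range(i, len(frames) + 1) for k in a(frames, j)]
    return m

def no(a):
    """Model of `!A`: succeed (consuming nothing) only if A fails here."""
    def m(frames, i):
        return [] if a(frames, i) else [i]
    return m

def matches(matcher, frames):
    """Assumption: the query may begin at any frame of the stack."""
    return bool(skip(matcher)(frames, 0))

stack = ["main", "a", "x", "b"]                       # hypothetical sample
direct = then(func("^a$"), func("^b$"))               # {^a$},{^b$}
transitive = then(func("^a$"), skip(func("^b$")))     # {^a$}..{^b$}
without_b = then(func("^a$"), no(skip(func("^b$"))))  # {^a$},!..{^b$}

print(matches(direct, stack))      # False: `a` calls `x`, not `b`
print(matches(transitive, stack))  # True: `a` reaches `b` through `x`
print(matches(without_b, stack))   # False: `b` is below `a` on the stack
```

In this model, the reported percentage would simply be the number of
samples whose stack matches, divided by the total number of samples.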

### Profiling rustc queries

If you pass `--rustc-query`, the tool will apply a filter that strips
out everything but [rustc queries][q]. This is useful for figuring
out where rustc compilation time goes. I suggest using this in
conjunction with `--tree-callees`.

[q]: https://rust-lang-nursery.github.io/rustc-guide/query.html

Example:

```
> perf focus '{main}' --rustc-query --tree-callees | head
Matcher    : {main}
Matches    : 2690
Not Matches: 0
Percentage : 100%

Tree
| matched `{main}` (100% total, 25% self)
: | mir_borrowck<'tcx>> (35% total, 35% self)
: : | normalize_projection_ty<'tcx>> (0% total, 0% self)
: | typeck_item_bodies<'tcx>> (11% total, 0% self)
```

### Trees

You can generate callee and caller trees by passing one of the following
options.

```
--tree-callees
--tree-callers
```

This will give output showing each function that was called, along
with the percentage of time spent in that subtree ("total") as well as
in that actual function ("self"). (As always, these are percentages of
total program execution.) You can customize these with
`--tree-max-depth` and `--tree-min-percent`, which are useful for
culling uninteresting things.

Example output:

```bash
> perf focus '{add_drop_live_constraint}' --tree-callees --tree-max-depth 3 --tree-min-percent 3
Matcher    : {add_drop_live_constraint}
Matches    : 708
Not Matches: 1982
Percentage : 26%

Tree
| matched `{add_drop_live_constraint}` (26% total, 0% self)
: | rustc_mir::borrow_check::nll::type_check::TypeChecker::fully_perform_op (25% total, 0% self)
: : | rustc::infer::InferCtxt::commit_if_ok (23% total, 0% self)
: : : | <std::collections::hash::set::HashSet<T, S>>::insert (8% total, 0% self) [...]
...
```
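
If the difference between "total" and "self" is unclear, the following
Python sketch shows one way such numbers could be derived from a list
of sampled stacks. It is an illustration only and not how the tool is
implemented; the function name `tree_percentages` and the sample data
are hypothetical, and each stack is assumed to be listed from caller
to callee.

```python
from collections import Counter

def tree_percentages(samples):
    """Map each call path to (total %, self %) of all samples."""
    total, own = Counter(), Counter()
    for stack in samples:
        # Every prefix of the stack is a tree node that was "on the stack".
        for depth in range(1, len(stack) + 1):
            total[tuple(stack[:depth])] += 1
        # Only the full stack names the function that was actually running.
        own[tuple(stack)] += 1
    n = len(samples)
    return {path: (100 * total[path] / n, 100 * own[path] / n)
            for path in total}

# Four hypothetical samples.
samples = [
    ["main", "a", "b"],
    ["main", "a"],
    ["main", "c"],
    ["main", "a", "b"],
]
result = tree_percentages(samples)
print(result[("main", "a")])  # (75.0, 25.0): 3 of 4 samples under `a`, 1 in `a` itself
```

Note that both numbers are divided by the total number of samples, so
a node's "total" is always at least as large as its "self".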

### Graphs

You can also generate call graphs by passing one of the following
options:

```
--graph
--graph-callers
--graph-callees
```

The first option will generate call graphs including all the frames in
every matching sample. The `--graph-callers` option generates a call
graph including only those frames that invoked the matched
code. `--graph-callees` only includes those frames that were called by
the matched code.

In the graph, each node and edge is labeled with a percentage,
indicating the percentage of samples in which it appeared. This
percentage is **always** an absolute percentage across all samples in
the run (it is not a percentage of the matching samples, in
particular).
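
As a rough illustration of what "absolute" means here, the Python
sketch below counts how often each function appears in matching
samples but divides by the number of *all* samples. The names
`absolute_node_percentages` and `is_match`, and the sample data, are
assumptions made for this example; the tool's real bookkeeping may
differ.

```python
from collections import Counter

def absolute_node_percentages(samples, is_match):
    """Percentage of ALL samples in which each function appears,
    counting only samples accepted by `is_match`."""
    counts = Counter()
    for stack in samples:
        if is_match(stack):
            for name in set(stack):  # count a function once per sample
                counts[name] += 1
    return {name: 100 * c / len(samples) for name, c in counts.items()}

# Four hypothetical samples; only the first two contain `a`.
samples = [
    ["main", "a", "b"],
    ["main", "a"],
    ["main", "c"],
    ["main", "c"],
]
pct = absolute_node_percentages(samples, lambda stack: "a" in stack)
print(pct["a"])  # 50.0, even though `a` is in every matching sample
```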

By default, the graph includes the top 22 most significant functions
(and edges between them). You can include more or fewer by passing
`--threshold N` (to include the top N functions).

You can use `--rename <regex> <match>` to munge the names of functions
that appear in the graph. This can be useful for stripping parts
of the function name, or coalescing functions:

```
// convert things like `middle::traits::select::foo::bar` to
// `middle::traits::select`:
--rename '(^middle::traits::[a-zA-Z0-9_]+)::.*' '$1'

// strip the last `...::XXX` suffix:
--rename '::[a-zA-Z0-9_]+$' ''
```
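
To see what these two patterns do to a name, you can try them outside
the tool. The Python sketch below assumes that `--rename` behaves like
an ordinary regex search-and-replace on each function name; note that
Python's `re.sub` spells the capture-group reference `\1`, whereas the
examples above use `$1`.

```python
import re

name = "middle::traits::select::foo::bar"

# Keep only the `middle::traits::<module>` prefix.
coalesced = re.sub(r"(^middle::traits::[a-zA-Z0-9_]+)::.*", r"\1", name)
print(coalesced)  # middle::traits::select

# Strip the last `::XXX` path segment.
stripped = re.sub(r"::[a-zA-Z0-9_]+$", "", name)
print(stripped)   # middle::traits::select::foo
```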

### Histograms

Instead of a graph, you can use the histogram options to just dump out the most common
functions and the percentage of samples in which they appeared (as above, all percentages
are absolute):

```
--hist
--hist-callers
--hist-callees
```