Crates.io | cargo-pgo |
lib.rs | cargo-pgo |
version | 0.2.8 |
source | src |
created_at | 2022-08-03 18:59:41.613581 |
updated_at | 2024-04-02 07:28:28.808845 |
description | Cargo subcommand for optimizing Rust binaries with PGO and BOLT. |
repository | https://github.com/kobzol/cargo-pgo |
id | 638277 |
size | 84,054 |
cargo-pgo
Cargo subcommand that makes it easier to use PGO and BOLT to optimize Rust binaries.
For an example of how to use cargo-pgo to optimize a binary on GitHub Actions CI, see this workflow.
$ cargo install cargo-pgo
You will also need the llvm-profdata binary for PGO and the llvm-bolt and merge-fdata binaries for BOLT.
You can install the PGO helper binary by adding the llvm-tools-preview component to your toolchain with rustup:
$ rustup component add llvm-tools-preview
For BOLT, installation is unfortunately more complicated. See the BOLT installation guide below.
BOLT support is currently experimental.
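As a quick pre-flight check, the sketch below reports which of the helper binaries named above cannot be found through PATH. The function name missing_tools is made up for this example and is not part of cargo-pgo. Keep in mind that binaries installed by the llvm-tools-preview component live inside the rustup toolchain directory rather than on PATH, so a tool listed here is not necessarily unavailable to cargo-pgo.

```shell
#!/bin/sh
# Hypothetical helper (not part of cargo-pgo): print every given tool
# that cannot be found through PATH, one name per line.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Usage: check the binaries needed for PGO and BOLT.
# missing_tools llvm-profdata llvm-bolt merge-fdata
```

An empty result means all listed tools were found on PATH.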
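It is important to understand the workflow of using feedback-directed optimizations. Put simply, it consists of three general steps:
1. Build the binary with instrumentation.
2. Run the instrumented binary on some workloads to gather profiles.
3. Build an optimized binary using the gathered profiles.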
Before you start to optimize your binaries, you should first check if your environment is set up correctly, at least for PGO (BOLT is more complicated). You can do that using the info command:
$ cargo pgo info
cargo-pgo provides subcommands that wrap common Cargo commands. It will automatically add --release to wrapped commands where it is applicable, since it doesn't really make sense to perform PGO on debug builds.
First, you need to generate the PGO profiles by performing an instrumented build.
You can currently do that in several ways. The most generic command for creating an instrumented artifact is cargo pgo instrument:
$ cargo pgo instrument [<command>] -- [cargo-args]
The command argument specifies which command will be executed by cargo. It is optional and defaults to build. You can pass additional arguments for cargo after --.
There are several ways of producing the profiles:
Building a binary
$ cargo pgo build
# or
$ cargo pgo instrument build
This is the simplest and recommended approach. You build an instrumented binary and then run it on some workloads. Note that the binary will be located at <target-dir>/<target-triple>/release/<binary-name>.
Running an instrumented program
$ cargo pgo run
# or
$ cargo pgo instrument run
You can also directly execute an instrumented binary with the cargo pgo run command, which is a shortcut for cargo pgo instrument run. This command will instrument the binary and then execute it right away.
Run instrumented tests
$ cargo pgo test
# or
$ cargo pgo instrument test
This command will generate profiles by executing tests. Note that unless your test suite is really comprehensive, it might be better to create a binary and run it on some specific workloads instead.
Run instrumented benchmarks
$ cargo pgo bench
# or
$ cargo pgo instrument bench
This command will generate profiles by executing benchmarks.
Once you have generated some profiles, you can execute cargo pgo optimize to build an optimized version of your binary.
If you want, you can also pass a command to cargo pgo optimize, e.g. to run PGO-optimized benchmarks or tests:
$ cargo pgo optimize bench
$ cargo pgo optimize test
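The three steps (instrumented build, profile gathering, optimized build) can be chained in a small script. The following is only a sketch: the helper names run and pgo_cycle, the DRY_RUN switch, and the binary path in the usage note are assumptions made for illustration, not something cargo-pgo provides.

```shell
#!/bin/sh
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper.
# $1: path to the instrumented binary; remaining args: workload arguments.
pgo_cycle() {
    binary="$1"
    shift
    # 1. Build the instrumented binary.
    run cargo pgo build || return 1
    # 2. Run it on a workload so that it writes PGO profiles.
    run "$binary" "$@" || return 1
    # 3. Rebuild the binary using the gathered profiles.
    run cargo pgo optimize
}
```

For example, with a hypothetical binary called myapp built for x86_64-unknown-linux-gnu, `DRY_RUN=1 pgo_cycle ./target/x86_64-unknown-linux-gnu/release/myapp input.txt` prints the three commands without executing them. In practice you would usually run the instrumented binary on several workloads before optimizing.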
You can analyze gathered PGO profiles using the llvm-profdata binary:
$ llvm-profdata show <profile>.profdata
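If llvm-profdata was installed through the llvm-tools-preview component, it is typically not on PATH; rustup places it under the toolchain sysroot in lib/rustlib/<host-triple>/bin. The sketch below builds that path. The function name profdata_path is hypothetical, and on Windows the file name would additionally carry an .exe suffix.

```shell
#!/bin/sh
# Hypothetical helper: build the expected llvm-profdata location from
# a toolchain sysroot ($1) and a host triple ($2).
profdata_path() {
    printf '%s/lib/rustlib/%s/bin/llvm-profdata\n' "$1" "$2"
}

# Resolving both values from the active toolchain (needs rustc on PATH):
# sysroot="$(rustc --print sysroot)"
# host="$(rustc -vV | sed -n 's/^host: //p')"
# "$(profdata_path "$sysroot" "$host")" show <profile>.profdata
```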
Using BOLT with cargo-pgo is similar to using PGO; however, you either have to build BOLT manually or download it from the GitHub releases archive (for LLVM 16+). Support for BOLT is currently in an experimental stage.
BOLT is not supported directly by rustc, so the instrumentation and optimization commands are not directly applied to binaries built by rustc. Instead, cargo-pgo creates additional binaries that you have to use for gathering profiles and executing the optimized code.
First, you need to generate the BOLT profiles. To do that, execute the following command:
$ cargo pgo bolt build
The instrumented binary will be located at <target-dir>/<target-triple>/release/<binary-name>-bolt-instrumented.
Execute it on several workloads to gather as much data as possible.
Note that for BOLT, the profile gathering step is optional. You can also simply run the optimization step (see below) without any profiles, although it will probably not have a large effect.
Once you have generated some profiles, you can execute cargo pgo bolt optimize to build an optimized version of your binary. The optimized binary will be named <binary-name>-bolt-optimized.
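Because BOLT produces separate files, scripts need to refer to the suffixed names rather than the regular release binary. The sketch below derives both names from the path of the regular binary. The function name bolt_artifacts is hypothetical, and it assumes that the optimized binary is written next to the instrumented one, which the text above does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: given the path of the regular release binary,
# print the paths of its BOLT-instrumented and BOLT-optimized variants.
# Assumption: both variants are written to the same directory.
bolt_artifacts() {
    printf '%s-bolt-instrumented\n%s-bolt-optimized\n' "$1" "$1"
}

# Usage, with a hypothetical binary named myapp:
# bolt_artifacts ./target/x86_64-unknown-linux-gnu/release/myapp
```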
Yes, BOLT and PGO can even be combined :) To do that, you should first generate PGO profiles and then use BOLT on already PGO optimized binaries. You can do that using the --with-pgo flag:
# Build PGO instrumented binary
$ cargo pgo build
# Run binary to gather PGO profiles
$ ./target/.../<binary>
# Build BOLT instrumented binary using PGO profiles
$ cargo pgo bolt build --with-pgo
# Run binary to gather BOLT profiles
$ ./target/.../<binary>-bolt-instrumented
# Optimize a PGO-optimized binary with BOLT
$ cargo pgo bolt optimize --with-pgo
Do not strip symbols from your release binary when using BOLT! If you do it, you might encounter linker errors.
Here's a short guide on how to compile LLVM with BOLT manually. You will need a recent compiler, CMake and ninja.
Note: LLVM BOLT is slowly getting into package repositories, although it's not fully working out of the box yet. You can find more details here if you're interested.
$ git clone https://github.com/llvm/llvm-project
$ cd llvm-project
$ git checkout llvmorg-14.0.5
Note that BOLT is being actively fixed, so a trunk version of LLVM might actually work better.
$ cmake -S llvm -B build -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=${PWD}/llvm-install \
-DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt;bolt"
$ cd build
$ ninja
$ ninja install
The built files should be located at <llvm-dir>/llvm-install/bin. You should add this directory to $PATH to make BOLT usable with cargo-pgo.
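One way to do this for the current shell session is sketched below. The helper name path_with is hypothetical; it only avoids adding the same directory twice. <llvm-dir> remains a placeholder for wherever you cloned llvm-project.

```shell
#!/bin/sh
# Hypothetical helper: print a PATH value with directory $1 prepended,
# unless PATH already contains that directory.
path_with() {
    case ":$PATH:" in
        *":$1:"*) printf '%s\n' "$PATH" ;;
        *)        printf '%s\n' "$1:$PATH" ;;
    esac
}

# Usage (replace <llvm-dir> with the real location first):
# PATH="$(path_with "<llvm-dir>/llvm-install/bin")"
# export PATH
```

To make the change permanent, the same export would go into your shell's startup file.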