gpu-trace-perf

Crates.io: gpu-trace-perf
lib.rs: gpu-trace-perf
version: 1.4.0
created_at: 2020-03-26 00:09:59.460187+00
updated_at: 2025-09-23 05:21:00.658304+00
description: Plays a collection of GPU traces under different environments to evaluate driver changes on performance
repository: https://gitlab.freedesktop.org/anholt/gpu-trace-perf
id: 222842
size: 4,566,763
Emma Anholt (anholt)


gpu-trace-perf

This is a Rust rewrite of some tooling I built for comparing performance between different graphics driver settings on graphics traces. The goal is for a driver developer to be able to quickly experiment and see how their changes affect the performance of actual rendering.

Right now apitrace, renderdoc, gfxretrace, and angle_trace_tests are supported. You pass the tool a collection of GPU traces (or the path to the angle_trace_tests binary, which will then enumerate its tests), and it will run through all of them, rerunning to get better stats over time, and give you an estimate of the change in FPS from your driver change.

Installing

apt-get install cargo
cargo install gpu-trace-perf

For apitrace traces (*.trace), you also need apitrace installed. I recommend having apitrace's waffle backend enabled, and WAFFLE_PLATFORM=gbm set in the environment to not flicker windows on the screen constantly.

For renderdoc traces (*.rdc), you need:

  • python3
  • renderdoc installed (sudo apt-get install renderdoc)
  • renderdoc's python module findable from python3.

Example usage

gpu-trace-perf run --traces $HOME/src/traces-db beforedriver afterdriver

This command will find all the traces in traces-db and run them in a loop printing stats until you feel ready to hit ^C.

The beforedriver and afterdriver arguments are scripts in your PATH that set up the environment for the driver being tested and then run the command they are given. For your new driver, that looks like this:

#!/bin/sh

export LD_LIBRARY_PATH=$HOME/src/prefix/lib
"$@"

Since a traces db may be large and a change being tested may only affect a subset of the traces, you can filter which traces are replayed repeatedly down to only those whose stderr output is changed by some debug environment variables:

    # Only re-run traces that had their shader compiler or command stream output changed on NVK.
    gpu-trace-perf run --traces $HOME/src/traces-db beforedriver afterdriver \
        --debug-filter "NVK_DEBUG=push_dump" --debug-filter "NAK_DEBUG=print"
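
The idea behind the filter can be sketched in plain shell: run a replay once under each wrapper, capture stderr, and keep the trace only if the two outputs differ. This is only an illustration of the concept, not how gpu-trace-perf implements it. stderr_changed is a hypothetical helper name, and the sketch assumes the debug variables are already exported in the calling environment.

```sh
#!/bin/sh
# Hypothetical helper, for illustration only.
# Usage: stderr_changed BEFORE_WRAPPER AFTER_WRAPPER COMMAND [ARGS...]
# Returns 0 if COMMAND's stderr differs between the two wrappers, 1 if it is identical.
stderr_changed() {
    before=$1
    after=$2
    shift 2
    tmp=$(mktemp -d) || return 2
    # Discard stdout and keep only stderr from each environment.
    "$before" "$@" >/dev/null 2>"$tmp/before.err"
    "$after" "$@" >/dev/null 2>"$tmp/after.err"
    if cmp -s "$tmp/before.err" "$tmp/after.err"; then
        changed=1   # identical stderr: the change did not touch this trace
    else
        changed=0   # stderr differs: worth replaying this trace repeatedly
    fi
    rm -rf "$tmp"
    return "$changed"
}
```

A trace whose debug output is byte-for-byte identical under both wrappers was presumably not affected by the change, so skipping it saves replay time.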

Running ANGLE traces

To include ANGLE traces in your test list, include the angle_trace_tests binary in the -t list (or the traces directory). You can also select a specific ANGLE trace with (for example):

gpu-trace-perf -t <path>/angle_trace_tests/TraceTest.minetest

utrace timings

Normally, the timings provided by the trace tool are used for the FPS result. However, sometimes you only care about a specific subset of your command buffers, and watching just those can reduce the noise in the timings. If the driver supports utrace, you can pass "--utrace <comma_separated_utrace_events>" to use those events instead of the trace tool's timings. This can be a particular help with renderdoc traces, which have high CPU overhead for per-renderpass setup on Vulkan. Note that if your driver has a nonstandard prefix for start/end events, you may need to add it to u_trace::Frame::event_times(). You can also specify "drawcalls" as the event, which expands to a list of common draw and compute events across several drivers.

Shader heuristic analysis

Sometimes as a developer, you want to select between two modes of compiling or running a shader based on some heuristic. This tool lets you generate the A/B times per shader once, then iterate on your heuristic in the quick-to-run rust code instead of doing lengthy trace replay runs for each idea you come up with.

The requirements are:

  • The driver has utrace events for the start/end of draw call times
  • The draw events include the hashes of the shaders for each stage involved in the draw.
  • The change to the shader is represented in the shader hash.
  • check_debug_filter() includes the shader dumping env vars for your driver.
  • Your before/after scripts only append to any shader debug env vars also involved in shader dumping.
  • capture_draw_times() has the utrace event names and the shader stage names for your driver.
  • shader_parser.rs supports parsing your driver's shader outputs.
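
As a sketch of the "only append" requirement above, a wrapper script can extend a debug variable rather than overwrite it, so that any flags already set by the caller survive. MY_DRIVER_DEBUG and myflag are hypothetical placeholders for your driver's real variable and option, and the comma separator is an assumption about how that variable is parsed.

```sh
#!/bin/sh
# Hypothetical afterdriver wrapper: appends to a debug variable instead of replacing it.
# MY_DRIVER_DEBUG and "myflag" are placeholders, not real driver settings.

export LD_LIBRARY_PATH=$HOME/src/prefix/lib

# ${VAR:+$VAR,} expands to "<existing value>," only when VAR is set and non-empty,
# so flags already present in the environment are kept.
export MY_DRIVER_DEBUG="${MY_DRIVER_DEBUG:+$MY_DRIVER_DEBUG,}myflag"

"$@"
```

A plain assignment such as MY_DRIVER_DEBUG=myflag would instead discard whatever the variable already contained.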

Currently turnip is supported; some code exists for other drivers, but it is incomplete. If you meet all the requirements, running looks like:

    gpu-trace-perf run append beforedriver afterdriver --output results/ \
        --capture-shaders --traces $HOME/src/traces-db

    gpu-trace-perf shader-analyze results/

For any traces with shaders that changed modes, it will dump the per-trace per-shader times between the before and after environments, and a table showing the overall effect on trace times between the available heuristics (always choose driver B, always choose optimally, and a dummy heuristic as an example). Then, edit shader_analyze.rs to replace the dummy heuristic with your own, potentially adding multiple to the table.
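
To make the comparison concrete, here is a small sketch of the arithmetic behind such a table, using awk on per-shader A/B times. It is not the tool's code or output format: summarize is a hypothetical helper, the "always A" row is added here only as a baseline, and the shader names and times are made up for illustration.

```sh
#!/bin/sh
# Illustration only: the input rows below are invented, not measured.
# Input columns: shader name, time under A (before), time under B (after).
summarize() {
    awk '
        {
            a += $2                        # total if every shader uses mode A
            b += $3                        # total if every shader uses mode B
            best += ($2 < $3 ? $2 : $3)    # total if each shader uses its faster mode
        }
        END {
            printf "always A: %.1f\nalways B: %.1f\noptimal: %.1f\n", a, b, best
        }
    '
}

summarize <<EOF
shader1 4.0 3.0
shader2 2.0 2.5
shader3 6.0 5.0
EOF
```

The gap between the "always B" and "optimal" totals is the most a per-shader heuristic could gain over simply switching everything to the new mode.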

Cross building for your embedded device

Add the following to ~/.cargo/config.toml (older Cargo releases read ~/.cargo/config):

[target.armv7-unknown-linux-gnueabihf]
linker = "arm-linux-gnueabihf-gcc"

[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"

And set up the new toolchain and build:

rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu
scp target/aarch64-unknown-linux-gnu/release/gpu-trace-perf device:bin/

License

Licensed under the MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
