autoperf vastly simplifies the instrumentation of programs with performance counters on Intel machines. Rather than learning how to measure every event and manually programming event values into counter registers or perf, you can use autoperf, which repeatedly runs your program until it has measured every performance event on your machine. autoperf computes a near-optimal schedule that maximizes the number of events measured per run, minimizes the total number of runs, and avoids multiplexing of events on counters.
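To give a flavor of the scheduling problem, here is a minimal sketch of a greedy first-fit packing of events onto counters. This is an illustration only, not autoperf's actual algorithm, and the event names and counter constraints are made up:

# Illustration only: greedy first-fit packing of events into runs so that
# every event gets a dedicated counter (no multiplexing). This is NOT
# autoperf's actual scheduler; events and constraints below are made up.
def first_fit_schedule(events):
    runs = []  # each run maps a counter index to an event name
    for name, allowed_counters in events:
        for run in runs:
            free = [c for c in allowed_counters if c not in run]
            if free:
                run[free[0]] = name
                break
        else:  # no existing run has a usable free counter
            runs.append({allowed_counters[0]: name})
    return runs

# Hypothetical events; some are restricted to specific counters.
events = [
    ("INST_RETIRED.ANY", [0, 1, 2, 3]),
    ("CPU_CLK_UNHALTED", [0, 1, 2, 3]),
    ("L1D.REPLACEMENT", [0, 1]),
    ("OFFCORE_RESPONSE", [0]),
    ("BR_MISP_RETIRED", [2, 3]),
]
for i, run in enumerate(first_fit_schedule(events)):
    print(f"run {i}: {run}")

Fewer runs mean less total measurement time, and pinning each event to its own counter keeps the measurements free of multiplexing error.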
autoperf is known to work with Ubuntu 18.04 on Skylake and IvyBridge/SandyBridge architectures. All Intel architectures should work; please file a bug report if they don't. autoperf builds on perf from the Linux project and a few other libraries, which can be installed using:
$ sudo apt-get install likwid cpuid hwloc numactl util-linux
To run the example analysis scripts, you'll need these python3 libraries:
$ pip3 install ascii_graph matplotlib pandas numpy
You'll also need Rust, which is best installed using rustup:
$ curl https://sh.rustup.rs -sSf | sh -s -- -y
$ source $HOME/.cargo/env
autoperf is published on crates.io, so once you have Rust and cargo, you can install it like this:
$ cargo install autoperf
Or clone and build the repository yourself:
$ git clone https://github.com/gz/autoperf.git
$ cd autoperf
$ cargo build --release
$ ./target/release/autoperf --help
autoperf uses perf internally to interface with Linux and the performance counter hardware. perf recommends disabling the kernel pointer restriction, the NMI watchdog, and perf_event paranoia, so autoperf checks these settings and refuses to start if they are not set as below:
sudo sh -c 'echo 0 > /proc/sys/kernel/kptr_restrict'
sudo sh -c 'echo 0 > /proc/sys/kernel/nmi_watchdog'
sudo sh -c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'
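If you want to verify these settings yourself before launching autoperf, a short sketch like this does the trick (the expected values simply mirror the commands above):

# Check the kernel settings autoperf expects; the expected values
# mirror the echo commands above. Reading /proc needs no privileges.
expected = {
    "/proc/sys/kernel/kptr_restrict": "0",
    "/proc/sys/kernel/nmi_watchdog": "0",
    "/proc/sys/kernel/perf_event_paranoid": "-1",
}
for path, want in expected.items():
    got = open(path).read().strip()
    print(f"{path}: {got}" + ("" if got == want else f" (expected {want})"))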
autoperf has a few commands; use --help to get an overview of all the options.
The profile command instruments a single program by running it multiple times
until every performance event is measured. For example,
$ autoperf profile sleep 2
will repeatedly run sleep 2 while measuring events with performance counters. After it is done, you will find an out folder with many CSV files that contain the measurements from individual runs.
To combine all those runs into a single CSV result file you can use the
aggregate command:
$ autoperf aggregate ./out
This will do some sanity checking and produce a results.csv file which contains all the measured data.
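You can also explore results.csv directly with pandas. The sketch below assumes hypothetical column names (EVENT_NAME, SAMPLE_VALUE); inspect the header of your own results.csv and adjust accordingly:

# Explore the aggregated data with pandas. The column names used here
# (EVENT_NAME, SAMPLE_VALUE) are assumptions for illustration only;
# check the header of your results.csv and adjust them.
import pandas as pd

df = pd.read_csv("out/results.csv")
print(df.columns.tolist())  # see what was actually measured

# Example question: which events have the largest average counts?
top = (df.groupby("EVENT_NAME")["SAMPLE_VALUE"]
         .mean()
         .sort_values(ascending=False)
         .head(10))
print(top)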
Now you have all the data and can start asking questions. As an example, the following script tells you which events were correlated while your program was running:
$ python3 analyze/profile/correlation.py ./out
$ open out/correlation_heatmap.png
For example, visualizing event correlation for the profiled sleep 2 command above looks like this (every dot represents the correlation between two measured performance events, on a Skylake machine that had around 1.7k non-zero event measurements):
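Under the hood, such a heatmap boils down to a pairwise correlation matrix. Here is a simplified sketch of the idea (not the actual correlation.py script); it assumes results.csv is a wide table with one numeric column per event:

# Simplified sketch of the correlation heatmap idea; this is NOT the
# actual analyze/profile/correlation.py script. Assumes one numeric
# column per event in results.csv.
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render to a file, no display needed
import matplotlib.pyplot as plt

df = pd.read_csv("out/results.csv")
corr = df.select_dtypes("number").corr()  # pairwise event correlations

plt.figure(figsize=(8, 8))
plt.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
plt.colorbar(label="correlation")
plt.title("Pairwise event correlation")
plt.savefig("out/my_correlation_heatmap.png")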
You can look at individual events too:
$ python3 analyze/profile/event_detail.py --resultdir ./out --features AVG.OFFCORE_RESPONSE.ALL_RFO.L3_MISS.REMOTE_HIT_FORWARD
There are more scripts in the analyze folder that help you work with the captured data sets. Have a look.
autoperf allows you to quickly gather lots of performance (or training) data and reason about it quantitatively. For example, we initially developed autoperf to build ML classifiers that the Barrelfish scheduler could use to detect application slowdown and make better scheduling decisions. With autoperf we needed no domain knowledge about the events: aside from knowing how to measure them, they could be treated as a black box.
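As a flavor of what that looks like in practice, here is a minimal sketch of training such a black-box classifier with scikit-learn (pip3 install scikit-learn). This is not our actual pipeline; the file layout and labels are assumptions:

# Minimal sketch of the black-box classifier idea, NOT our actual
# pipeline. Assumes two aggregated datasets were produced, one from a
# "fast" and one from a "slow" configuration, with matching event
# columns and several measurement rows each.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

frames = []
for label, path in [("fast", "fast/results.csv"), ("slow", "slow/results.csv")]:
    df = pd.read_csv(path).select_dtypes("number")
    df["label"] = label
    frames.append(df)
data = pd.concat(frames, join="inner", ignore_index=True)
X, y = data.drop(columns="label"), data["label"]

# Events are treated as opaque features; no domain knowledge required.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=3).mean())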
You can read more about our experiments here:
But we found that autoperf can greatly simplify your life in many other scenarios too:
* Finding out which performance events are relevant for your workload
* Analyzing and finding performance issues in your code
* Building classifiers to detect hardware exploits (side channels, Spectre, Meltdown, etc.)
* ...