Resource Control Benchmarks

Resource control aims to control compute resource distribution to improve reliability and utilization of a system. resctl-bench is a collection of whole-system benchmarks to evaluate resource control and hardware behaviors using realistic simulated workloads.

Comprehensive resource control involves the whole system - kernel subsystems such as cgroup2, memory management, file system and block layer, userspace system components and even the SSD. Furthermore, testing resource control end-to-end requires scenarios involving realistic workloads and monitoring their interactions. The combination makes benchmarking resource control challenging and error-prone. It's easy to slip up on a configuration and testing with real workloads can be tedious and unreliable.

resctl-bench encapsulates the whole process so that resource control benchmarks can be performed easily and reliably. It verifies and updates system configurations, reproduces resource contention scenarios with a realistic latency-sensitive workload simulator and other secondary workloads, analyzes the resulting system and workload behaviors, and generates easily understandable reports.

resctl-bench is a part of resctl-demo suite, which gives a guided tour of various resource control strategies using live scenarios built on the same components. The benchmarks implemented in resctl-bench involve concepts and components which are documented in resctl-demo in depth. For more information on resctl-demo, visit:

https://github.com/facebookexperimental/resctl-demo

Premade System Images

Comprehensive resource control has many requirements, some of which can be difficult to configure on an existing system. resctl-demo provides premade images to help getting started. Visit the following page for details:

https://facebookmicrosites.github.io/resctl-demo-website

For other installation options, visit:

https://github.com/facebookexperimental/resctl-demo

An Example Session

Let's say we want to see how well iocost can protect rd-hashd and designed a bench sequence like the following:

  1. Run hashd-params to determine hashd parameters.
  2. Run protection with iocost disabled to establish the baseline.
  3. Run iocost-params to determine the iocost parameters.
  4. Run protection with iocost enabled and compare the results.

Assuming the root device is nvme0n1 with the device number 259:0, this maps to:

$ echo '259:0 enable=0' > /sys/fs/cgroup/io.cost.qos $ echo 0 > /sys/block/nvme0n1/queue/wbt_lat_usec $ resctl-bench -r result.json run \ hashd-params:passive=io \ protection:id=iocost-off,passive=io \ iocost-params \ protection:id=iocost-on

Here are the example outputs:

Let's look at the result of the first benchmark - hashd-params.

``` [hashd-params result] 2021-06-22 17:23:03 - 17:43:47

System info: kernel="5.6.13-0fbk165756gdcbe47195163" nrcpus=36 memory=63.9G swap=32.0G swappiness=60 zswap mem_profile=16 (avail=57.4G share=12.0G target=11.0G) passive=io

IO info: dev=nvme0n1(259:0) model="WDC CL SN720 SDAQNTW-512G-1020" size=477G iosched=mq-deadline wbt=off iocost=off other=off

Params: log_bps=1.0M

Result: hashsize=1.2M rpsmax=1029 memactual=16.1G chunkpages=25 ```

After the header, the following three blocks are showing the system and bench configurations followed by the result.

Let's now take a look at the first next result. Partial header:

[protection result] "iocost-off" 2021-06-22 19:13:37 - 19:30:25 ... IO info: dev=nvme0n1(259:0) model="WDC CL SN720 SDAQNTW-512G-1020" size=477G iosched=mq-deadline wbt=off iocost=off other=off

shows that this is protection result with ID iocost-off. Skipping over to the result:

``` Memory Hog Summary ==================

IO Latency: R p50=885u:3.7m/49.5m p90=4.7m:12.7m/150m p99=13.1m:25.1m/350m max=30.4m:65.4m/750m W p50=5.0m:16.3m/99.5m p90=17.6m:28.3m/250m p99=29.0m:38.8m/450m max=48.9m:87.0m/850m

Isolation and Request Latency Impact Distributions:

             min   p01   p05   p10   p25   p50   p75   p90   p95   p99   max  mean stdev

isol% 0 0.49 1.65 2.24 13.12 50.90 72.52 82.12 88.56 100.0 100.0 45.50 30.72 lat-imp% 0 0 0 0 4.69 17.00 40.54 75.06 121.9 380.3 882.5 39.42 81.53

Result: isol=45.50:30.72% latimp=39.42%:81.53 workcsv=100.0% missing=0.26% ```

For brevity, let's just focus on the isol=45.50:30.72% on the last line, which is indicating that the isolation factor - how well the RPS of the rd-hashd could be protected against interferences from memory hogs - averaged 45.5% with the standard deviation of 30.72%. Roughly speaking, our main workload's RPS halved while the system was experiencing memory shortage. For more information on the output format:

So, we now know that without iocost, the protection isn't great. The next iocost-params benchmark determines the parameters so that we can enable it. The result:

iocost model: rbps=1348822120 rseqiops=235687 rrandiops=218614 wbps=601694170 wseqiops=133453 wrandiops=69308 iocost QoS: rpct=95.00 rlat=19562 wpct=95.00 wlat=65667 min=60.00 max=100.00

iocost-params automatically applies the determined parameters for the subsequent benchmarks. The QoS parameters determined here are very naive but should do for our purpose. For determining more accurate QoS parameters and evaluating storage devices comprehensively, see the iocost-tune benchmark.

Let's see whether the protection result is any better with iocost on:

[protection result] "iocost-on" 2021-06-22 19:38:53 - 20:02:27 ... IO info: dev=nvme0n1(259:0) model="WDC CL SN720 SDAQNTW-512G-1020" size=477G iosched=mq-deadline wbt=off iocost=on other=off iocost model: rbps=1348822120 rseqiops=235687 rrandiops=218614 wbps=601694170 wseqiops=133453 wrandiops=69308 iocost QoS: rpct=95.00 rlat=19562 wpct=95.00 wlat=65667 min=60.00 max=100.00

The header confirms that we are testing the correct configuration. The result:

``` Memory Hog Summary ==================

IO Latency: R p50=164u:42.2u/415u p90=915u:827u/17.5m p99=3.4m:4.5m/97.5m max=8.8m:10.3m/250m W p50=158u:1.7m/41.5m p90=2.3m:9.1m/95.5m p99=5.1m:14.3m/97.5m max=8.8m:21.7m/350m

Isolation and Request Latency Impact Distributions:

             min   p01   p05   p10   p25   p50   p75   p90   p95   p99   max  mean stdev

isol% 0 0 88.34 90.57 93.78 97.30 100.0 100.0 100.0 100.0 100.0 95.18 11.06 lat-imp% 0 0 0.96 2.20 3.79 6.49 10.22 15.63 18.32 29.55 263.0 8.14 9.99

Result: isol=95.18:11.06% latimp=8.14%:9.99 workcsv=42.89% missing=0.21% ```

The isolation factor average is now 95.18% with the standard deviation of 11.06%, a significant improvement over 45.5% without iocost.

This example shows that testing resource control behaviors using scenarios that exercise every layer of the tall stack realistically is easy and reliable with resctl-bench. For more information, explore the doc pages: