Run fio with Haura
Important❗
To actually run the engine you first need to compile betree_storage_stack in release mode; for this, execute cargo build --release in betree/.
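For example, assuming you start from the repository root:

$ cd betree
$ cargo build --release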
Furthermore, your system must be able to find the compiled library. For that, source the given environment file in fio-haura/.
$ source ./env.sh
Additionally, since Haura's configuration is more complex than what fio clients provide, a configuration has to be loaded. The path to this configuration has to be stored in the environment under BETREE_CONFIG. See the bectl Basic Usage chapter for more information.
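For example, assuming the configuration file lives at a path like the one below (the path and file name are only placeholders), it can be exported before running fio:

$ export BETREE_CONFIG=$(pwd)/haura-config.json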
Running fio
fio can be configured with CLI options and jobfiles; both have the same capabilities, so for brevity we will use CLI options here. You can find multiple jobfiles, which specify these options in a more manageable way, in fio-haura/jobfiles.
For example, to perform a simple IOPS test, you can use:
$ fio \
--direct=1 \
--rw=randwrite \
--random_distribution=zipf \
--bs=4k \
--ioengine=external:src/fio-engine-haura.o \
--numjobs=1 \
--runtime=30 \
--time_based \
--group_reporting \
--name=iops-test-job \
--eta-newline=1 \
--size=4G \
--io_size=2G
This starts an IO benchmark using --direct access in a --rw=randwrite pattern with a block size of --bs=4k for each access. Furthermore, Haura is specified as the engine via --ioengine=external:src/fio-engine-haura.o and the benchmark runs for --runtime=30 seconds with --numjobs=1. The total size of IO operations for each thread is --io_size=2G, which acts as the upper limit if the runtime is not reached.
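The same options can also be collected in a jobfile. As a rough sketch, assuming a hypothetical file name iops-test-job.fio (the jobfiles shipped in fio-haura/jobfiles may differ in detail):

$ cat > iops-test-job.fio <<'EOF'
[iops-test-job]
ioengine=external:src/fio-engine-haura.o
direct=1
rw=randwrite
random_distribution=zipf
bs=4k
numjobs=1
runtime=30
time_based
group_reporting
size=4G
io_size=2G
EOF
$ fio --eta-newline=1 iops-test-job.fio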
❗ Random Workloads Caution
When using random workloads which surpass the size of the internal cache, or when explicitly syncing to disk, extensive fragmentation might appear. This leads to situations where, even though enough space is theoretically available, no contiguous space can be allocated, resulting in out-of-space errors.
To counteract this, it is advised to:
- Increase the size of the cache
- Increase the underlying block size while retaining the same io_size
- Choose a random distribution with a higher skew towards specific regions (e.g. zipf) to avoid frequent evictions of nodes from the internal cache
- Reduce the number of jobs; more jobs put more pressure on the cache, leading to more frequent evictions, which in turn cause more writeback operations and worsen fragmentation
As a general rule, this boils down to two things: reduce the number of write operations and enlarge the allocation space. The sketch below applies this to the example command from above.
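For instance, raising the block size while keeping the same io_size (the value 256k is only an illustrative choice):

$ fio --direct=1 --rw=randwrite --random_distribution=zipf \
      --bs=256k --size=4G --io_size=2G \
      --ioengine=external:src/fio-engine-haura.o \
      --numjobs=1 --runtime=30 --time_based \
      --group_reporting --name=iops-test-job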
fio prints a summary of the results at the end, which should look similar to this output:
fio --direct=1 --rw=randwrite --bs=4k --ioengine=external:src/fio-engine-haura.o --runtime=10 --numjobs=4 --time_based --group_reporting --name=iops-test-job --eta-newline=1 --size=4G --thread
iops-test-job: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=haura, iodepth=1
...
fio-3.30
Starting 4 threads
Jobs: 4 (f=4): [w(4)][30.0%][w=12.3MiB/s][w=3147 IOPS][eta 00m:07s]
Jobs: 4 (f=4): [w(4)][54.5%][w=12.7MiB/s][w=3244 IOPS][eta 00m:05s]
Jobs: 4 (f=4): [w(4)][72.7%][w=12.8MiB/s][w=3283 IOPS][eta 00m:03s]
Jobs: 4 (f=4): [w(4)][90.9%][eta 00m:01s]
Jobs: 4 (f=4): [w(4)][100.0%][w=6422KiB/s][w=1605 IOPS][eta 00m:00s]
iops-test-job: (groupid=0, jobs=4): err= 0: pid=46232: Wed Mar 1 13:42:32 2023
write: IOPS=2182, BW=8729KiB/s (8938kB/s)(85.4MiB/10024msec); 0 zone resets
clat (nsec): min=12, max=30256, avg=184.83, stdev=460.93
lat (nsec): min=1122, max=1629.4M, avg=1828274.84, stdev=26162931.82
clat percentiles (nsec):
| 1.00th=[ 14], 5.00th=[ 17], 10.00th=[ 57], 20.00th=[ 60],
| 30.00th=[ 62], 40.00th=[ 64], 50.00th=[ 66], 60.00th=[ 68],
| 70.00th=[ 99], 80.00th=[ 179], 90.00th=[ 692], 95.00th=[ 812],
| 99.00th=[ 1272], 99.50th=[ 1672], 99.90th=[ 3088], 99.95th=[ 3568],
| 99.99th=[23424]
bw ( KiB/s): min= 1664, max=24480, per=100.00%, avg=10387.06, stdev=1608.07, samples=67
iops : min= 416, max= 6120, avg=2596.72, stdev=402.01, samples=67
lat (nsec) : 20=5.94%, 50=2.40%, 100=61.71%, 250=14.51%, 500=4.29%
lat (nsec) : 750=1.49%, 1000=8.25%
lat (usec) : 2=1.07%, 4=0.30%, 10=0.01%, 20=0.01%, 50=0.01%
cpu : usr=16.99%, sys=7.73%, ctx=15964, majf=0, minf=383252
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,21874,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=8729KiB/s (8938kB/s), 8729KiB/s-8729KiB/s (8938kB/s-8938kB/s), io=85.4MiB (89.6MB), run=10024-10024msec
Haura-specific flags
The implemented engine comes with some additional flags to modify the configuration of the started Haura instance. These flags only deactivate the translation of certain conditions usually created in fio benchmarks to Haura itself, which can be useful, for example, when using tiered storage setups that cannot be described with fio.
--disrespect-fio-files
Avoid transferring the fio file configuration to Haura. Can be used to benchmark specific disks regardless of the fio specification.
--disrespect-fio-direct
Use direct mode only as specified in the Haura configuration.
--disrespect-fio-options
Disregard all fio options in Haura. This only uses the I/O workflow as executed by fio. Take care to ensure comparability with results of other engines.
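For example, to let the Haura configuration alone determine the backing storage and direct-I/O behaviour, the flags can be appended to the fio invocation from above (a sketch):

$ fio --direct=1 --rw=randwrite --bs=4k --size=4G --io_size=2G \
      --ioengine=external:src/fio-engine-haura.o \
      --numjobs=1 --runtime=30 --time_based \
      --group_reporting --name=iops-test-job \
      --disrespect-fio-files --disrespect-fio-direct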
More examples
Have a look at the examples directory of fio for more usage examples and jobfiles.
When performing read-only benchmarks, note that they include some prepopulation which, depending on the storage medium, might take some time to complete.
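As a sketch, a read-only variant of the example above might look like this (the job name is only illustrative):

$ fio --direct=1 --rw=randread --bs=4k --size=4G --io_size=2G \
      --ioengine=external:src/fio-engine-haura.o --numjobs=1 \
      --runtime=30 --time_based --group_reporting --name=read-test-job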