{ "cells": [ { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "# benchmark.ipynb\n", "# Part of the aflplusplus project, requires an ipynb (Jupyter) editor or viewer.\n", "# Author: Chris Ball \n", "import json\n", "import pandas as pd\n", "with open(\"benchmark-results.jsonl\") as f:\n", " lines = f.read().splitlines()\n", "json_lines = [json.loads(line) for line in lines]\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Translate the JSON Lines entries into a single pandas DataFrame\n", "\n", "We have JSON Lines in [benchmark-results.jsonl](benchmark-results.jsonl) that look like this:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"config\": {\n", " \"afl_persistent_config\": true,\n", " \"afl_system_config\": true,\n", " \"afl_version\": \"++4.09a\",\n", " \"comment\": \"i9-9900k, 16GB DDR4-3000, Arch Linux\",\n", " \"compiler\": \"clang version 15.0.7\",\n", " \"target_arch\": \"x86_64-pc-linux-gnu\"\n", " },\n", " \"hardware\": {\n", " \"cpu_fastest_core_mhz\": 4999.879,\n", " \"cpu_model\": \"Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz\",\n", " \"cpu_threads\": 16\n", " },\n", " \"targets\": {\n", " \"test-instr\": {\n", " \"multicore\": {\n", " \"afl_execs_per_sec\": 11025.88,\n", " \"afl_execs_total\": 519670,\n", " \"fuzzers_used\": 1,\n", " \"run_end\": \"2023-09-24 01:18:19.516294\",\n", " \"run_start\": \"2023-09-24 01:17:55.982600\",\n", " \"total_execs_per_sec\": 11019.3,\n", " \"total_run_time\": 47.16\n", " }\n", " },\n", " \"test-instr-persist-shmem\": {\n", " \"multicore\": {\n", " \"afl_execs_per_sec\": 134423.5,\n", " \"afl_execs_total\": 519670,\n", " \"fuzzers_used\": 1,\n", " \"run_end\": \"2023-09-24 01:17:32.262373\",\n", " \"run_start\": \"2023-09-24 01:17:30.328037\",\n", " \"total_execs_per_sec\": 133591.26,\n", " \"total_run_time\": 3.89\n", " }\n", " }\n", " }\n", "}\n" ] } ], 
"source": [ "print(json.dumps(json.loads(lines[0]), indent=2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [pd.json_normalize()](https://pandas.pydata.org/docs/reference/api/pandas.json_normalize.html]) method translates this into a flat table that we can perform queries against:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>config.afl_persistent_config</th><th>config.afl_system_config</th><th>config.afl_version</th><th>config.comment</th><th>config.compiler</th><th>config.target_arch</th><th>hardware.cpu_fastest_core_mhz</th><th>hardware.cpu_model</th><th>hardware.cpu_threads</th><th>targets.test-instr.multicore.afl_execs_per_sec</th><th>...</th><th>targets.test-instr.singlecore.run_start</th><th>targets.test-instr.singlecore.total_execs_per_sec</th><th>targets.test-instr.singlecore.total_run_time</th><th>targets.test-instr-persist-shmem.singlecore.afl_execs_per_sec</th><th>targets.test-instr-persist-shmem.singlecore.afl_execs_total</th><th>targets.test-instr-persist-shmem.singlecore.fuzzers_used</th><th>targets.test-instr-persist-shmem.singlecore.run_end</th><th>targets.test-instr-persist-shmem.singlecore.run_start</th><th>targets.test-instr-persist-shmem.singlecore.total_execs_per_sec</th><th>targets.test-instr-persist-shmem.singlecore.total_run_time</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>True</td><td>True</td><td>++4.09a</td><td>i9-9900k, 16GB DDR4-3000, Arch Linux</td><td>clang version 15.0.7</td><td>x86_64-pc-linux-gnu</td><td>4999.879</td><td>Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz</td><td>16</td><td>11025.88</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>1</th><td>True</td><td>True</td><td>++4.09a</td><td>i9-9900k, 16GB DDR4-3000, Arch Linux</td><td>clang version 15.0.7</td><td>x86_64-pc-linux-gnu</td><td>4998.794</td><td>Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz</td><td>16</td><td>21139.64</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>2</th><td>True</td><td>True</td><td>++4.09a</td><td>i9-9900k, 16GB DDR4-3000, Arch Linux</td><td>clang version 15.0.7</td><td>x86_64-pc-linux-gnu</td><td>4998.859</td><td>Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz</td><td>16</td><td>30618.28</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>3</th><td>True</td><td>True</td><td>++4.09a</td><td>i9-9900k, 16GB DDR4-3000, Arch Linux</td><td>clang version 15.0.7</td><td>x86_64-pc-linux-gnu</td><td>5000.078</td><td>Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz</td><td>16</td><td>39125.92</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>4</th><td>True</td><td>True</td><td>++4.09a</td><td>i9-9900k, 16GB DDR4-3000, Arch Linux</td><td>clang version 15.0.7</td><td>x86_64-pc-linux-gnu</td><td>4996.885</td><td>Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz</td><td>16</td><td>47861.04</td><td>...</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>5 rows × 37 columns</p>\n",
"</div>
" ], "text/plain": [ " config.afl_persistent_config config.afl_system_config config.afl_version \\\n", "0 True True ++4.09a \n", "1 True True ++4.09a \n", "2 True True ++4.09a \n", "3 True True ++4.09a \n", "4 True True ++4.09a \n", "\n", " config.comment config.compiler \\\n", "0 i9-9900k, 16GB DDR4-3000, Arch Linux clang version 15.0.7 \n", "1 i9-9900k, 16GB DDR4-3000, Arch Linux clang version 15.0.7 \n", "2 i9-9900k, 16GB DDR4-3000, Arch Linux clang version 15.0.7 \n", "3 i9-9900k, 16GB DDR4-3000, Arch Linux clang version 15.0.7 \n", "4 i9-9900k, 16GB DDR4-3000, Arch Linux clang version 15.0.7 \n", "\n", " config.target_arch hardware.cpu_fastest_core_mhz \\\n", "0 x86_64-pc-linux-gnu 4999.879 \n", "1 x86_64-pc-linux-gnu 4998.794 \n", "2 x86_64-pc-linux-gnu 4998.859 \n", "3 x86_64-pc-linux-gnu 5000.078 \n", "4 x86_64-pc-linux-gnu 4996.885 \n", "\n", " hardware.cpu_model hardware.cpu_threads \\\n", "0 Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz 16 \n", "1 Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz 16 \n", "2 Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz 16 \n", "3 Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz 16 \n", "4 Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz 16 \n", "\n", " targets.test-instr.multicore.afl_execs_per_sec ... \\\n", "0 11025.88 ... \n", "1 21139.64 ... \n", "2 30618.28 ... \n", "3 39125.92 ... \n", "4 47861.04 ... 
\n", "\n", " targets.test-instr.singlecore.run_start \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr.singlecore.total_execs_per_sec \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr.singlecore.total_run_time \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr-persist-shmem.singlecore.afl_execs_per_sec \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr-persist-shmem.singlecore.afl_execs_total \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr-persist-shmem.singlecore.fuzzers_used \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr-persist-shmem.singlecore.run_end \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr-persist-shmem.singlecore.run_start \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr-persist-shmem.singlecore.total_execs_per_sec \\\n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " targets.test-instr-persist-shmem.singlecore.total_run_time \n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", "[5 rows x 37 columns]" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "df = pd.json_normalize(json_lines)\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Graph prep\n", "\n", "We're looking for a line graph showing lines for each fuzz target, in both singlecore and multicore modes, in each config setting -- where the x-axis is number of cores, and the y-axis is either afl_execs_per_sec or total_execs_per_sec (I'm not yet sure which is a better metric to use).\n", "\n", "But first, a mini test harness by checking that the number of rows matched what we'd intuitively expect:" ] }, { "cell_type": "code", 
"execution_count": 43, "metadata": {}, "outputs": [], "source": [ "i7 = df.query(\"`config.comment` == 'i9-9900k, 16GB DDR4-3000, Arch Linux'\")\n", "assert len(i7) == 148\n", "multicore = i7.query(\"`targets.test-instr-persist-shmem.multicore.total_execs_per_sec` > 0 or `targets.test-instr.multicore.total_execs_per_sec` > 0\")\n", "assert len(multicore) == 144 # 36 cores * 4 states * 1 run (containing two targets)\n", "singlecore = i7.query(\"`targets.test-instr-persist-shmem.singlecore.total_execs_per_sec` > 0 or `targets.test-instr.singlecore.total_execs_per_sec` > 0\")\n", "assert len(singlecore) == 4 # 1 core * 4 states * 1 run (containing two targets)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [], "source": [ "def build_graphdf_from_query(query: pd.DataFrame):\n", " \"\"\"Build a table suitable for graphing from a subset of the dataframe.\"\"\"\n", " graphdata = []\n", " max_fuzzers = int(query[[\"targets.test-instr-persist-shmem.multicore.fuzzers_used\", \"targets.test-instr.multicore.fuzzers_used\"]].max(axis=1).max(axis=0))\n", " for _, row in query.iterrows():\n", " for target in [\"test-instr-persist-shmem\", \"test-instr\"]:\n", " for mode in [\"multicore\", \"singlecore\"]:\n", " label = \"\"\n", " if not row[f\"targets.{target}.{mode}.total_execs_per_sec\"] > 0:\n", " continue\n", " execs_per_sec = row[f\"targets.{target}.{mode}.total_execs_per_sec\"]\n", " afl_execs_per_sec = row[f\"targets.{target}.{mode}.afl_execs_per_sec\"]\n", " parallel_fuzzers = row[f\"targets.{target}.{mode}.fuzzers_used\"]\n", " afl_persistent_config = row[\"config.afl_persistent_config\"]\n", " afl_system_config = row[\"config.afl_system_config\"]\n", " if target == \"test-instr-persist-shmem\":\n", " label += \"shmem\"\n", " else:\n", " label += \"base\"\n", " if mode == \"multicore\":\n", " label += \"-multicore\"\n", " else:\n", " label += \"-singlecore\"\n", " if afl_persistent_config:\n", " label += \"+persist-conf\"\n", " if 
afl_system_config:\n", " label += \"+system-conf\"\n", " \n", " if label == \"shmem-multicore+persist-conf+system-conf\":\n", " graphdata.append({\"execs_per_sec\": execs_per_sec, \"parallel_fuzzers\": parallel_fuzzers, \"afl_persistent_config\": afl_persistent_config, \"afl_system_config\": afl_system_config, \"label\": \"Multicore: Persistent mode/shared memory + kernel config\"})\n", " graphdata.append({\"execs_per_sec\": afl_execs_per_sec, \"parallel_fuzzers\": parallel_fuzzers, \"afl_persistent_config\": afl_persistent_config, \"afl_system_config\": afl_system_config, \"label\": \"Multicore: afl_execs: Persistent mode/shared memory + kernel config\"})\n", " if label == \"shmem-multicore\":\n", " graphdata.append({\"execs_per_sec\": execs_per_sec, \"parallel_fuzzers\": parallel_fuzzers, \"afl_persistent_config\": afl_persistent_config, \"afl_system_config\": afl_system_config, \"label\": \"Multicore: Persistent mode/shared memory without kernel config\"})\n", " if label == \"base-multicore+persist-conf+system-conf\":\n", " graphdata.append({\"execs_per_sec\": execs_per_sec, \"parallel_fuzzers\": parallel_fuzzers, \"afl_persistent_config\": afl_persistent_config, \"afl_system_config\": afl_system_config, \"label\": \"Multicore: Non-persistent mode + kernel config\"})\n", " if label == \"shmem-singlecore+persist-conf+system-conf\":\n", " for i in range(1, max_fuzzers + 1):\n", " graphdata.append({\"execs_per_sec\": execs_per_sec, \"parallel_fuzzers\": float(i), \"afl_persistent_config\": afl_persistent_config, \"afl_system_config\": afl_system_config, \"label\": \"Singlecore: Persistent mode/shared memory + kernel config\"})\n", " if label == \"base-singlecore+persist-conf+system-conf\":\n", " for i in range(1, max_fuzzers + 1):\n", " graphdata.append({\"execs_per_sec\": execs_per_sec, \"parallel_fuzzers\": float(i), \"afl_persistent_config\": afl_persistent_config, \"afl_system_config\": afl_system_config, \"label\": \"Singlecore: Non-persistent mode + kernel 
config\"})\n", " return pd.DataFrame.from_records(graphdata).sort_values(\"label\", ascending=False)\n", "\n", "graphdf = build_graphdf_from_query(i7)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "1234567891011121314151617181920212223242526272829303132333435361x26x51x75x100x125xConfigurationMulticore: Non-persistent mode + kernel configMulticore: Persistent mode/shared memory + kernel configMulticore: Persistent mode/shared memory without kernel configMulticore: afl_execs: Persistent mode/shared memory + kernel configSinglecore: Non-persistent mode + kernel configSinglecore: Persistent mode/shared memory + kernel configFuzzer performanceNumber of parallel fuzzersFuzz target executions per second" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import numpy as np\n", "pd.options.plotting.backend = \"plotly\"\n", "\n", "# Right now our table has absolute values of execs per sec, but it's more useful\n", "# to show relative perf (vs 1.0x baseline)\n", "pivotdf = graphdf.pivot(index=\"parallel_fuzzers\", columns=\"label\", values=\"execs_per_sec\")\n", "fig = pivotdf.plot(\n", " title=\"Fuzzer performance\",\n", " labels={\n", " \"label\": \"Configuration\",\n", " \"parallel_fuzzers\": \"Number of parallel fuzzers\",\n", " \"value\": \"Fuzz target executions per second\"\n", " }\n", ")\n", "\n", "# Compute tick values and their labels for the primary Y-axis\n", "tickvals = np.linspace(graphdf['execs_per_sec'].min(), graphdf['execs_per_sec'].max(), 6)\n", "ticktext = [f\"{val:.0f}x\" for val in tickvals / graphdf['execs_per_sec'].min()]\n", "# Update the primary Y-axis with custom tick labels\n", "fig.update_yaxes(tickvals=tickvals, ticktext=ticktext)\n", "fig.update_xaxes(tickvals=list(range(1,36+1)))\n", "fig.update_layout(width=1200, height=400)\n", "fig.show(\"svg\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's what the table that produced this graph 
looks like:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th>label</th><th>Multicore: Non-persistent mode + kernel config</th><th>Multicore: Persistent mode/shared memory + kernel config</th><th>Multicore: Persistent mode/shared memory without kernel config</th><th>Multicore: afl_execs: Persistent mode/shared memory + kernel config</th><th>Singlecore: Non-persistent mode + kernel config</th><th>Singlecore: Persistent mode/shared memory + kernel config</th></tr>\n",
"    <tr><th>parallel_fuzzers</th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1.0</th><td>11019.30</td><td>133591.26</td><td>90851.40</td><td>134423.50</td><td>11038.96</td><td>135613.26</td></tr>\n",
"    <tr><th>2.0</th><td>21111.92</td><td>255995.07</td><td>176159.32</td><td>258490.04</td><td>11038.96</td><td>135613.26</td></tr>\n",
"    <tr><th>3.0</th><td>30568.82</td><td>380246.34</td><td>260268.78</td><td>383777.45</td><td>11038.96</td><td>135613.26</td></tr>\n",
"    <tr><th>4.0</th><td>38963.07</td><td>490254.72</td><td>336355.99</td><td>496249.48</td><td>11038.96</td><td>135613.26</td></tr>\n",
"    <tr><th>5.0</th><td>47693.65</td><td>598698.16</td><td>413750.00</td><td>613089.31</td><td>11038.96</td><td>135613.26</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "label Multicore: Non-persistent mode + kernel config \\\n", "parallel_fuzzers \n", "1.0 11019.30 \n", "2.0 21111.92 \n", "3.0 30568.82 \n", "4.0 38963.07 \n", "5.0 47693.65 \n", "\n", "label Multicore: Persistent mode/shared memory + kernel config \\\n", "parallel_fuzzers \n", "1.0 133591.26 \n", "2.0 255995.07 \n", "3.0 380246.34 \n", "4.0 490254.72 \n", "5.0 598698.16 \n", "\n", "label Multicore: Persistent mode/shared memory without kernel config \\\n", "parallel_fuzzers \n", "1.0 90851.40 \n", "2.0 176159.32 \n", "3.0 260268.78 \n", "4.0 336355.99 \n", "5.0 413750.00 \n", "\n", "label Multicore: afl_execs: Persistent mode/shared memory + kernel config \\\n", "parallel_fuzzers \n", "1.0 134423.50 \n", "2.0 258490.04 \n", "3.0 383777.45 \n", "4.0 496249.48 \n", "5.0 613089.31 \n", "\n", "label Singlecore: Non-persistent mode + kernel config \\\n", "parallel_fuzzers \n", "1.0 11038.96 \n", "2.0 11038.96 \n", "3.0 11038.96 \n", "4.0 11038.96 \n", "5.0 11038.96 \n", "\n", "label Singlecore: Persistent mode/shared memory + kernel config \n", "parallel_fuzzers \n", "1.0 135613.26 \n", "2.0 135613.26 \n", "3.0 135613.26 \n", "4.0 135613.26 \n", "5.0 135613.26 " ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pivotdf.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can totally ignore the code cell directly below (unless you're curious). It's just preparing Markdown for the block below it to render. Jupyter Notebooks aren't able to use code variables inside Markdown blocks, so I have to do this instead." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "\n", "### Line graph analysis\n", "Here are a few things that jump out from the graph above. Let's start at the bottom of the graph.\n", "\n", "#### test-instr vs. 
test-instr-persist-shmem\n",
"\n",
"This graph is scaled so that the single-core, non-persistent-mode performance (11038 execs per second) is\n",
"represented as **1.0x**. If you build and run a fuzzer without creating a persistent mode harness for it, and without running fuzzers in parallel, this is the performance\n",
"you get on this machine.\n",
"\n",
"#### Multicore test-instr\n",
"\n",
"By running as many parallel fuzzers as there are CPU threads, we can reach 103765 execs per second, which is **9.4x** that base speed.\n",
"\n",
"#### Persistent mode + shared memory\n",
"\n",
"##### Singlecore\n",
"\n",
"By modifying the harness to use persistent mode with shared memory as described [here](https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.persistent_mode.md#4-persistent-mode),\n",
"we end up with **12.3x** base speed. So -- perhaps counter-intuitively -- if you have a choice between switching to using multiple cores or rewriting\n",
"the harness to use persistent mode on a single core, it is better (at least on this machine) to use persistent mode on a single core, than to use non-persistent mode on all cores.\n",
"\n",
"##### Multicore\n",
"\n",
"By scaling up that persistent mode with shared memory harness across cores, and with kernel mitigations still turned on (see next section), we get to\n",
"**75.6x** base speed.\n",
"\n",
"#### Kernel config\n",
"\n",
"By \"kernel config\", I'm referring to booting the Linux kernel with `mitigations=off`, which is a meta-parameter, introduced in Linux v5.2, for disabling *all* hardware vulnerability mitigations (such as those for Spectre,\n",
"Meltdown, Retbleed, etc.). Disabling these results in a `total_execs_per_sec` increase of 368476 execs -- the difference between\n",
"109.0x (mitigations off) and 75.6x (mitigations on) base speed.
Turning on mitigations\n", "reduced the overall performance by 31%!\n", "\n", "One way to think about this is that the mitigations turn this 16-thread CPU into a 7-thread CPU, since the number of execs reached with 16 threads and mitigations on is around the same\n", "number of execs reached with 7 threads and mitigations off.\n", "\n", "Or if we want to think in terms of cores, then the average number of execs gained per core in the initial eight is 115588 execs per sec, but the loss due to\n", "mitigations is 368476 execs per sec, which is the averaged performance of 3.2 cores.\n", "\n", "With kernel mitigations turned off, we reach our highest available total_execs_per_sec speed on this machine, which is **109.0x** higher\n", "than where we started from.\n", "\n", "#### afl_execs_per_sec vs. total_execs_per_sec\n", "\n", "* The purple line at the top is measuring `afl_execs_per_sec`. This is afl's own measurement of the speed of each fuzzer process, from the `out/fuzzer/fuzzer_stats` file.\n", " * It peaks at 23 fuzzers running in parallel, on this 8-core (16-thread) CPU.\n", " * In contrast, `total_execs_per_sec` shows large drops in performance as we pass 8 (cores) and 16 (threads) fuzzers.\n", " * I'm inclined to trust `total_execs_per_sec` `(total_execs / (end time - start time))` more, so we'll use that from now on.\n", "\n", "#### How many parallel fuzzers should we use on this machine?\n", "\n", "* The drops in performance after 8/16 fuzzers are profound.\n", " * Using 9-12 fuzzers is *worse* than using 8 fuzzers on this 8C/16T system, but using 13-16 is better than 8.\n", " * And using >16 is worse than using 16. Makes sense.\n", " * We should use the number of CPUs in /proc/cpuinfo (threads) to get the best performance. But if we did halve the number of\n", " fuzzers, we would surprisingly only lose 21%\n", " of performance. 
This could be a good tradeoff in terms of cost.\n" ], "text/plain": [ "" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# (Ignore this code cell.)\n", "from IPython.display import Markdown as md\n", "singlecore_base_execs = pivotdf.iloc[0][\"Singlecore: Non-persistent mode + kernel config\"]\n", "singlecore_persist_execs = pivotdf.iloc[0][\"Singlecore: Persistent mode/shared memory + kernel config\"]\n", "multicore_fuzzers_with_afl_max_execs = int(pivotdf[\"Multicore: afl_execs: Persistent mode/shared memory + kernel config\"].idxmax())\n", "multicore_fuzzers_with_total_max_execs = int(pivotdf[\"Multicore: Persistent mode/shared memory + kernel config\"].idxmax())\n", "multicore_base_max_execs = pivotdf[\"Multicore: Non-persistent mode + kernel config\"].max()\n", "factor_for_execs = lambda execs: round(execs / singlecore_base_execs, 1)\n", "\n", "multicore_persistent_without_mitigations_label = \"Multicore: Persistent mode/shared memory + kernel config\"\n", "multicore_max_execs_mitigations_off = pivotdf[multicore_persistent_without_mitigations_label].max()\n", "multicore_max_execs_mitigations_off_only_cores = pivotdf.loc[multicore_fuzzers_with_total_max_execs / 2][multicore_persistent_without_mitigations_label]\n", "multicore_max_execs_mitigations_on = pivotdf[\"Multicore: Persistent mode/shared memory without kernel config\"].max()\n", "multicore_avg_gain_per_core = pivotdf.loc[pivotdf.index <= 8][\"Multicore: Persistent mode/shared memory + kernel config\"].diff().dropna().mean()\n", "mitigations_off_increase = int(multicore_max_execs_mitigations_off - multicore_max_execs_mitigations_on)\n", "\n", "md(f\"\"\"\n", "### Line graph analysis\n", "Here are a few things that jump out from the graph above. Let's start at the bottom of the graph.\n", "\n", "#### test-instr vs. 
test-instr-persist-shmem\n",
"\n",
"This graph is scaled so that the single-core, non-persistent-mode performance ({int(singlecore_base_execs)} execs per second) is\n",
"represented as **1.0x**. If you build and run a fuzzer without creating a persistent mode harness for it, and without running fuzzers in parallel, this is the performance\n",
"you get on this machine.\n",
"\n",
"#### Multicore test-instr\n",
"\n",
"By running as many parallel fuzzers as there are CPU threads, we can reach {int(multicore_base_max_execs)} execs per second, which is **{factor_for_execs(multicore_base_max_execs)}x** that base speed.\n",
"\n",
"#### Persistent mode + shared memory\n",
"\n",
"##### Singlecore\n",
"\n",
"By modifying the harness to use persistent mode with shared memory as described [here](https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.persistent_mode.md#4-persistent-mode),\n",
"we end up with **{factor_for_execs(singlecore_persist_execs)}x** base speed. So -- perhaps counter-intuitively -- if you have a choice between switching to using multiple cores or rewriting\n",
"the harness to use persistent mode on a single core, it is better (at least on this machine) to use persistent mode on a single core, than to use non-persistent mode on all cores.\n",
"\n",
"##### Multicore\n",
"\n",
"By scaling up that persistent mode with shared memory harness across cores, and with kernel mitigations still turned on (see next section), we get to\n",
"**{factor_for_execs(multicore_max_execs_mitigations_on)}x** base speed.\n",
"\n",
"#### Kernel config\n",
"\n",
"By \"kernel config\", I'm referring to booting the Linux kernel with `mitigations=off`, which is a meta-parameter, introduced in Linux v5.2, for disabling *all* hardware vulnerability mitigations (such as those for Spectre,\n",
"Meltdown, Retbleed, etc.).
Disabling these results in a `total_execs_per_sec` increase of {mitigations_off_increase} execs -- the difference between\n", "{factor_for_execs(multicore_max_execs_mitigations_off)}x (mitigations off) and {factor_for_execs(multicore_max_execs_mitigations_on)}x (mitigations on) base speed. Turning on mitigations\n", "reduced the overall performance by {abs(round(((multicore_max_execs_mitigations_on - multicore_max_execs_mitigations_off) / multicore_max_execs_mitigations_off) * 100))}%!\n", "\n", "One way to think about this is that the mitigations turn this 16-thread CPU into a 7-thread CPU, since the number of execs reached with 16 threads and mitigations on is around the same\n", "number of execs reached with 7 threads and mitigations off.\n", "\n", "Or if we want to think in terms of cores, then the average number of execs gained per core in the initial eight is {int(multicore_avg_gain_per_core)} execs per sec, but the loss due to\n", "mitigations is {mitigations_off_increase} execs per sec, which is the averaged performance of {round(mitigations_off_increase / multicore_avg_gain_per_core, 1)} cores.\n", "\n", "With kernel mitigations turned off, we reach our highest available total_execs_per_sec speed on this machine, which is **{factor_for_execs(multicore_max_execs_mitigations_off)}x** higher\n", "than where we started from.\n", "\n", "#### afl_execs_per_sec vs. total_execs_per_sec\n", "\n", "* The purple line at the top is measuring `afl_execs_per_sec`. 
This is afl's own measurement of the speed of each fuzzer process, from the `out/fuzzer/fuzzer_stats` file.\n", " * It peaks at {multicore_fuzzers_with_afl_max_execs} fuzzers running in parallel, on this 8-core (16-thread) CPU.\n", " * In contrast, `total_execs_per_sec` shows large drops in performance as we pass 8 (cores) and 16 (threads) fuzzers.\n", " * I'm inclined to trust `total_execs_per_sec` `(total_execs / (end time - start time))` more, so we'll use that from now on.\n", "\n", "#### How many parallel fuzzers should we use on this machine?\n", "\n", "* The drops in performance after 8/16 fuzzers are profound.\n", " * Using 9-12 fuzzers is *worse* than using 8 fuzzers on this 8C/16T system, but using 13-16 is better than 8.\n", " * And using >16 is worse than using 16. Makes sense.\n", " * We should use the number of CPUs in /proc/cpuinfo (threads) to get the best performance. But if we did halve the number of\n", " fuzzers, we would surprisingly only lose {abs(int(((multicore_max_execs_mitigations_off_only_cores - multicore_max_execs_mitigations_off) / multicore_max_execs_mitigations_off) * 100))}%\n", " of performance. This could be a good tradeoff in terms of cost.\n", "\"\"\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example with more cores\n", "\n", "While there was some nuance here, the answer was pretty straightforward -- use the number of CPU threads you have access to. What if there were more threads? Here the experiment is repeated on an AWS EC2 \"r6a.48xlarge\" spot instance with 192 vCPUs, and the answer calls the conclusion we just made above into question:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>config.afl_persistent_config</th><th>config.afl_system_config</th><th>config.afl_version</th><th>config.comment</th><th>config.compiler</th><th>config.target_arch</th><th>hardware.cpu_fastest_core_mhz</th><th>hardware.cpu_model</th><th>hardware.cpu_threads</th><th>targets.test-instr-persist-shmem.multicore.afl_execs_per_sec</th><th>targets.test-instr-persist-shmem.multicore.afl_execs_total</th><th>targets.test-instr-persist-shmem.multicore.fuzzers_used</th><th>targets.test-instr-persist-shmem.multicore.run_end</th><th>targets.test-instr-persist-shmem.multicore.run_start</th><th>targets.test-instr-persist-shmem.multicore.total_execs_per_sec</th><th>targets.test-instr-persist-shmem.multicore.total_run_time</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>148</th><td>False</td><td>True</td><td>++4.09a</td><td>AWS EC2 r6a.48xlarge spot instance</td><td>clang version 15.0.7 (Amazon Linux 15.0.7-3.am...</td><td>x86_64-amazon-linux-gnu</td><td>3599.314</td><td>AMD EPYC 7R13 Processor</td><td>192</td><td>85586.47</td><td>519670.0</td><td>1.0</td><td>2023-09-30 07:42:00.479418</td><td>2023-09-30 07:41:57.396293</td><td>84636.81</td><td>6.14</td></tr>\n",
"    <tr><th>149</th><td>False</td><td>True</td><td>++4.09a</td><td>AWS EC2 r6a.48xlarge spot instance</td><td>clang version 15.0.7 (Amazon Linux 15.0.7-3.am...</td><td>x86_64-amazon-linux-gnu</td><td>3599.425</td><td>AMD EPYC 7R13 Processor</td><td>192</td><td>171655.96</td><td>1039340.0</td><td>2.0</td><td>2023-09-30 07:42:06.853436</td><td>2023-09-30 07:42:03.776562</td><td>168998.37</td><td>6.15</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " config.afl_persistent_config config.afl_system_config \\\n", "148 False True \n", "149 False True \n", "\n", " config.afl_version config.comment \\\n", "148 ++4.09a AWS EC2 r6a.48xlarge spot instance \n", "149 ++4.09a AWS EC2 r6a.48xlarge spot instance \n", "\n", " config.compiler \\\n", "148 clang version 15.0.7 (Amazon Linux 15.0.7-3.am... \n", "149 clang version 15.0.7 (Amazon Linux 15.0.7-3.am... \n", "\n", " config.target_arch hardware.cpu_fastest_core_mhz \\\n", "148 x86_64-amazon-linux-gnu 3599.314 \n", "149 x86_64-amazon-linux-gnu 3599.425 \n", "\n", " hardware.cpu_model hardware.cpu_threads \\\n", "148 AMD EPYC 7R13 Processor 192 \n", "149 AMD EPYC 7R13 Processor 192 \n", "\n", " targets.test-instr-persist-shmem.multicore.afl_execs_per_sec \\\n", "148 85586.47 \n", "149 171655.96 \n", "\n", " targets.test-instr-persist-shmem.multicore.afl_execs_total \\\n", "148 519670.0 \n", "149 1039340.0 \n", "\n", " targets.test-instr-persist-shmem.multicore.fuzzers_used \\\n", "148 1.0 \n", "149 2.0 \n", "\n", " targets.test-instr-persist-shmem.multicore.run_end \\\n", "148 2023-09-30 07:42:00.479418 \n", "149 2023-09-30 07:42:06.853436 \n", "\n", " targets.test-instr-persist-shmem.multicore.run_start \\\n", "148 2023-09-30 07:41:57.396293 \n", "149 2023-09-30 07:42:03.776562 \n", "\n", " targets.test-instr-persist-shmem.multicore.total_execs_per_sec \\\n", "148 84636.81 \n", "149 168998.37 \n", "\n", " targets.test-instr-persist-shmem.multicore.total_run_time \n", "148 6.14 \n", "149 6.15 " ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "r6a = df.query(\"`config.comment` == 'AWS EC2 r6a.48xlarge spot instance'\")\n", "r6a.head(2).dropna(axis=1)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>execs_per_sec</th><th>parallel_fuzzers</th><th>afl_persistent_config</th><th>afl_system_config</th><th>label</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>399</th><td>331957.22</td><td>200.0</td><td>True</td><td>True</td><td>Multicore: afl_execs: Persistent mode/shared m...</td></tr>\n",
"    <tr><th>153</th><td>1026766.44</td><td>77.0</td><td>True</td><td>True</td><td>Multicore: afl_execs: Persistent mode/shared m...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " execs_per_sec parallel_fuzzers afl_persistent_config \\\n", "399 331957.22 200.0 True \n", "153 1026766.44 77.0 True \n", "\n", " afl_system_config label \n", "399 True Multicore: afl_execs: Persistent mode/shared m... \n", "153 True Multicore: afl_execs: Persistent mode/shared m... " ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "r6a_graphdf = build_graphdf_from_query(r6a)\n", "r6a_graphdf.head(2)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "510152025303540455055606570758085909510010511011512012513013514014515015516016517017518018519019520010x36x62x89x115x141xConfigurationMulticore: Persistent mode/shared memory + kernel configMulticore: afl_execs: Persistent mode/shared memory + kernel configFuzzer performanceNumber of parallel fuzzersFuzz target executions per second" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "r6a_pivotdf = r6a_graphdf.pivot(index=\"parallel_fuzzers\", columns=\"label\", values=\"execs_per_sec\")\n", "r6a_fig = r6a_pivotdf.plot(\n", " title=\"Fuzzer performance\",\n", " labels={\n", " \"label\": \"Configuration\",\n", " \"parallel_fuzzers\": \"Number of parallel fuzzers\",\n", " \"value\": \"Fuzz target executions per second\"\n", " }\n", ")\n", "\n", "# Compute tick values and their labels for the primary Y-axis\n", "tickvals = np.linspace(r6a_graphdf['execs_per_sec'].min(), r6a_graphdf['execs_per_sec'].max(), 6)\n", "ticktext = [f\"{val:.0f}x\" for val in tickvals / graphdf['execs_per_sec'].min()]\n", "# Update the primary Y-axis with custom tick labels\n", "r6a_fig.update_yaxes(tickvals=tickvals, ticktext=ticktext)\n", "r6a_fig.update_xaxes(tickvals=list(range(0,200+1, 5)))\n", "r6a_fig.update_layout(width=1200, height=400)\n", "r6a_fig.show(\"svg\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Line graph analysis\n", "\n", "This is a shocking result for a 192 
vCPU machine -- whether you count `afl_execs` or `total_execs`, our optimal number of parallel fuzzers was 16!\n", "\n", "Does this mean that AFL++ is a bad fuzzer, or that AWS tricked us and gave us a 16-thread machine instead of a 192-thread one?\n", "\n", "No, probably not -- the most likely causes here are a problem with our Python harness, or potentially that we're already saturating the Linux kernel's ability to service system calls, although we're definitely hitting such a limit way earlier than expected. A good way to test this theory would be to run more system-call-servicers (read: kernels!) at once on this machine; one way to do that is to use hardware virtualization with KVM. " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }