From e2eedefc65bec1a04605f117a11ca8bdf9d80323 Mon Sep 17 00:00:00 2001 From: Andrea Fioraldi Date: Mon, 3 Feb 2020 13:02:16 +0100 Subject: docs to md --- QuickStartGuide.md | 1 + QuickStartGuide.txt | 1 - README.md | 4 +- afl-whatsup | 2 +- docs/QuickStartGuide.md | 2 +- docs/life_pro_tips.txt | 2 +- docs/parallel_fuzzing.md | 228 ++++++++++++++++++++++++++++++++++++++++ docs/parallel_fuzzing.txt | 223 --------------------------------------- docs/perf_tips.txt | 2 +- docs/status_screen.txt | 4 +- docs/technical_details.txt | 2 +- experimental/README.experiments | 2 +- src/afl-fuzz-init.c | 2 +- src/afl-fuzz.c | 2 +- src/afl-gotcpu.c | 2 +- 15 files changed, 242 insertions(+), 237 deletions(-) create mode 120000 QuickStartGuide.md delete mode 120000 QuickStartGuide.txt create mode 100644 docs/parallel_fuzzing.md delete mode 100644 docs/parallel_fuzzing.txt diff --git a/QuickStartGuide.md b/QuickStartGuide.md new file mode 120000 index 00000000..8136d85e --- /dev/null +++ b/QuickStartGuide.md @@ -0,0 +1 @@ +docs/QuickStartGuide.md \ No newline at end of file diff --git a/QuickStartGuide.txt b/QuickStartGuide.txt deleted file mode 120000 index e1687eb5..00000000 --- a/QuickStartGuide.txt +++ /dev/null @@ -1 +0,0 @@ -docs/QuickStartGuide.txt \ No newline at end of file diff --git a/README.md b/README.md index e2e073ac..00ae599c 100644 --- a/README.md +++ b/README.md @@ -433,11 +433,11 @@ see [http://lcamtuf.coredump.cx/afl/plot/](http://lcamtuf.coredump.cx/afl/plot/) Every instance of afl-fuzz takes up roughly one core. This means that on multi-core systems, parallelization is necessary to fully utilize the hardware. For tips on how to fuzz a common target on multiple cores or multiple networked -machines, please refer to [docs/parallel_fuzzing.txt](docs/parallel_fuzzing.txt). +machines, please refer to [docs/parallel_fuzzing.md](docs/parallel_fuzzing.md). 
The parallel fuzzing mode also offers a simple way for interfacing AFL to other fuzzers, to symbolic or concolic execution engines, and so forth; again, see the -last section of [docs/parallel_fuzzing.txt](docs/parallel_fuzzing.txt) for tips. +last section of [docs/parallel_fuzzing.md](docs/parallel_fuzzing.md) for tips. ## 10) Fuzzer dictionaries diff --git a/afl-whatsup b/afl-whatsup index 6a8c5669..6156ba11 100755 --- a/afl-whatsup +++ b/afl-whatsup @@ -45,7 +45,7 @@ if [ "$DIR" = "" ]; then echo "Usage: $0 [ -s ] afl_sync_dir" 1>&2 echo 1>&2 echo "The -s option causes the tool to skip all the per-fuzzer trivia and show" 1>&2 - echo "just the summary results. See docs/parallel_fuzzing.txt for additional tips." 1>&2 + echo "just the summary results. See docs/parallel_fuzzing.md for additional tips." 1>&2 echo 1>&2 exit 1 diff --git a/docs/QuickStartGuide.md b/docs/QuickStartGuide.md index 1e89a6ad..d5ad303e 100644 --- a/docs/QuickStartGuide.md +++ b/docs/QuickStartGuide.md @@ -51,4 +51,4 @@ following files: - README.md - A general introduction to AFL, - docs/perf_tips.txt - Simple tips on how to fuzz more quickly, - docs/status_screen.txt - An explanation of the tidbits shown in the UI, - - docs/parallel_fuzzing.txt - Advice on running AFL on multiple cores. + - docs/parallel_fuzzing.md - Advice on running AFL on multiple cores. diff --git a/docs/life_pro_tips.txt b/docs/life_pro_tips.txt index c8c47636..27c70592 100644 --- a/docs/life_pro_tips.txt +++ b/docs/life_pro_tips.txt @@ -14,7 +14,7 @@ See dictionaries/README.dictionaries to learn how. % You can get the most out of your hardware by parallelizing AFL jobs. -See docs/parallel_fuzzing.txt for step-by-step tips. +See docs/parallel_fuzzing.md for step-by-step tips. 
% diff --git a/docs/parallel_fuzzing.md b/docs/parallel_fuzzing.md new file mode 100644 index 00000000..51fa3986 --- /dev/null +++ b/docs/parallel_fuzzing.md @@ -0,0 +1,228 @@ +# Tips for parallel fuzzing + + This document talks about synchronizing afl-fuzz jobs on a single machine + or across a fleet of systems. See README for the general instruction manual. + +## 1) Introduction + +Every copy of afl-fuzz will take up one CPU core. This means that on an +n-core system, you can almost always run around n concurrent fuzzing jobs with +virtually no performance hit (you can use the afl-gotcpu tool to make sure). + +In fact, if you rely on just a single job on a multi-core system, you will +be underutilizing the hardware. So, parallelization is usually the right +way to go. + +When targeting multiple unrelated binaries or using the tool in "dumb" (-n) +mode, it is perfectly fine to just start up several fully separate instances +of afl-fuzz. The picture gets more complicated when you want to have multiple +fuzzers hammering a common target: if a hard-to-hit but interesting test case +is synthesized by one fuzzer, the remaining instances will not be able to use +that input to guide their work. + +To help with this problem, afl-fuzz offers a simple way to synchronize test +cases on the fly. + +Note that afl++ has AFLfast's power schedules implemented. +It is therefore a good idea to use different power schedules if you run +several instances in parallel. See docs/power_schedules.txt + +Alternatively running other AFL spinoffs in parallel can be of value, +e.g. Angora (https://github.com/AngoraFuzzer/Angora/) + +## 2) Single-system parallelization + +If you wish to parallelize a single job across multiple cores on a local +system, simply create a new, empty output directory ("sync dir") that will be +shared by all the instances of afl-fuzz; and then come up with a naming scheme +for every instance - say, "fuzzer01", "fuzzer02", etc. 
+ +Run the first one ("master", -M) like this: + +``` +$ ./afl-fuzz -i testcase_dir -o sync_dir -M fuzzer01 [...other stuff...] +``` + +...and then, start up secondary (-S) instances like this: + +``` +$ ./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer02 [...other stuff...] +$ ./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer03 [...other stuff...] +``` + +Each fuzzer will keep its state in a separate subdirectory, like so: + + /path/to/sync_dir/fuzzer01/ + +Each instance will also periodically rescan the top-level sync directory +for any test cases found by other fuzzers - and will incorporate them into +its own fuzzing when they are deemed interesting enough. + +The difference between the -M and -S modes is that the master instance will +still perform deterministic checks; while the secondary instances will +proceed straight to random tweaks. If you don't want to do deterministic +fuzzing at all, it's OK to run all instances with -S. With very slow or complex +targets, or when running heavily parallelized jobs, this is usually a good plan. + +Note that running multiple -M instances is wasteful, although there is an +experimental support for parallelizing the deterministic checks. To leverage +that, you need to create -M instances like so: + +``` +$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterA:1/3 [...] +$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterB:2/3 [...] +$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterC:3/3 [...] +``` + +...where the first value after ':' is the sequential ID of a particular master +instance (starting at 1), and the second value is the total number of fuzzers to +distribute the deterministic fuzzing across. Note that if you boot up fewer +fuzzers than indicated by the second number passed to -M, you may end up with +poor coverage. + +You can also monitor the progress of your jobs from the command line with the +provided afl-whatsup tool. When the instances are no longer finding new paths, +it's probably time to stop. 
+ +WARNING: Exercise caution when explicitly specifying the -f option. Each fuzzer +must use a separate temporary file; otherwise, things will go south. One safe +example may be: + +``` +$ ./afl-fuzz [...] -S fuzzer10 -f file10.txt ./fuzzed/binary @@ +$ ./afl-fuzz [...] -S fuzzer11 -f file11.txt ./fuzzed/binary @@ +$ ./afl-fuzz [...] -S fuzzer12 -f file12.txt ./fuzzed/binary @@ +``` + +This is not a concern if you use @@ without -f and let afl-fuzz come up with the +file name. + +## 3) Multi-system parallelization + +The basic operating principle for multi-system parallelization is similar to +the mechanism explained in section 2. The key difference is that you need to +write a simple script that performs two actions: + + - Uses SSH with authorized_keys to connect to every machine and retrieve + a tar archive of the /path/to/sync_dir//queue/ directories for + every local to the machine. It's best to use a naming scheme + that includes host name in the fuzzer ID, so that you can do something + like: + + ```sh + for s in {1..10}; do + ssh user@host${s} "tar -czf - sync/host${s}_fuzzid*/[qf]*" >host${s}.tgz + done + ``` + + - Distributes and unpacks these files on all the remaining machines, e.g.: + + ```sh + for s in {1..10}; do + for d in {1..10}; do + test "$s" = "$d" && continue + ssh user@host${d} 'tar -kxzf -' /queue/* and writing their own finds to sequentially + numbered id:nnnnnn files in out_dir//queue/*. + + - Running some of the synchronized fuzzers with different (but related) + target binaries. For example, simultaneously stress-testing several + different JPEG parsers (say, IJG jpeg and libjpeg-turbo) while sharing + the discovered test cases can have synergistic effects and improve the + overall coverage. + + (In this case, running one -M instance per each binary is a good plan.) + + - Having some of the fuzzers invoke the binary in different ways. 
+ For example, 'djpeg' supports several DCT modes, configurable with + a command-line flag, while 'dwebp' supports incremental and one-shot + decoding. In some scenarios, going after multiple distinct modes and then + pooling test cases will improve coverage. + + - Much less convincingly, running the synchronized fuzzers with different + starting test cases (e.g., progressive and standard JPEG) or dictionaries. + The synchronization mechanism ensures that the test sets will get fairly + homogeneous over time, but it introduces some initial variability. diff --git a/docs/parallel_fuzzing.txt b/docs/parallel_fuzzing.txt deleted file mode 100644 index 1e65c01f..00000000 --- a/docs/parallel_fuzzing.txt +++ /dev/null @@ -1,223 +0,0 @@ -========================= -Tips for parallel fuzzing -========================= - - This document talks about synchronizing afl-fuzz jobs on a single machine - or across a fleet of systems. See README for the general instruction manual. - -1) Introduction ---------------- - -Every copy of afl-fuzz will take up one CPU core. This means that on an -n-core system, you can almost always run around n concurrent fuzzing jobs with -virtually no performance hit (you can use the afl-gotcpu tool to make sure). - -In fact, if you rely on just a single job on a multi-core system, you will -be underutilizing the hardware. So, parallelization is usually the right -way to go. - -When targeting multiple unrelated binaries or using the tool in "dumb" (-n) -mode, it is perfectly fine to just start up several fully separate instances -of afl-fuzz. The picture gets more complicated when you want to have multiple -fuzzers hammering a common target: if a hard-to-hit but interesting test case -is synthesized by one fuzzer, the remaining instances will not be able to use -that input to guide their work. - -To help with this problem, afl-fuzz offers a simple way to synchronize test -cases on the fly. - -Note that afl++ has AFLfast's power schedules implemented. 
-It is therefore a good idea to use different power schedules if you run -several instances in parallel. See docs/power_schedules.txt - -Alternatively running other AFL spinoffs in parallel can be of value, -e.g. Angora (https://github.com/AngoraFuzzer/Angora/) - -2) Single-system parallelization --------------------------------- - -If you wish to parallelize a single job across multiple cores on a local -system, simply create a new, empty output directory ("sync dir") that will be -shared by all the instances of afl-fuzz; and then come up with a naming scheme -for every instance - say, "fuzzer01", "fuzzer02", etc. - -Run the first one ("master", -M) like this: - -$ ./afl-fuzz -i testcase_dir -o sync_dir -M fuzzer01 [...other stuff...] - -...and then, start up secondary (-S) instances like this: - -$ ./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer02 [...other stuff...] -$ ./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer03 [...other stuff...] - -Each fuzzer will keep its state in a separate subdirectory, like so: - - /path/to/sync_dir/fuzzer01/ - -Each instance will also periodically rescan the top-level sync directory -for any test cases found by other fuzzers - and will incorporate them into -its own fuzzing when they are deemed interesting enough. - -The difference between the -M and -S modes is that the master instance will -still perform deterministic checks; while the secondary instances will -proceed straight to random tweaks. If you don't want to do deterministic -fuzzing at all, it's OK to run all instances with -S. With very slow or complex -targets, or when running heavily parallelized jobs, this is usually a good plan. - -Note that running multiple -M instances is wasteful, although there is an -experimental support for parallelizing the deterministic checks. To leverage -that, you need to create -M instances like so: - -$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterA:1/3 [...] -$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterB:2/3 [...] 
-$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterC:3/3 [...] - -...where the first value after ':' is the sequential ID of a particular master -instance (starting at 1), and the second value is the total number of fuzzers to -distribute the deterministic fuzzing across. Note that if you boot up fewer -fuzzers than indicated by the second number passed to -M, you may end up with -poor coverage. - -You can also monitor the progress of your jobs from the command line with the -provided afl-whatsup tool. When the instances are no longer finding new paths, -it's probably time to stop. - -WARNING: Exercise caution when explicitly specifying the -f option. Each fuzzer -must use a separate temporary file; otherwise, things will go south. One safe -example may be: - -$ ./afl-fuzz [...] -S fuzzer10 -f file10.txt ./fuzzed/binary @@ -$ ./afl-fuzz [...] -S fuzzer11 -f file11.txt ./fuzzed/binary @@ -$ ./afl-fuzz [...] -S fuzzer12 -f file12.txt ./fuzzed/binary @@ - -This is not a concern if you use @@ without -f and let afl-fuzz come up with the -file name. - -3) Multi-system parallelization -------------------------------- - -The basic operating principle for multi-system parallelization is similar to -the mechanism explained in section 2. The key difference is that you need to -write a simple script that performs two actions: - - - Uses SSH with authorized_keys to connect to every machine and retrieve - a tar archive of the /path/to/sync_dir//queue/ directories for - every local to the machine. 
It's best to use a naming scheme - that includes host name in the fuzzer ID, so that you can do something - like: - - for s in {1..10}; do - ssh user@host${s} "tar -czf - sync/host${s}_fuzzid*/[qf]*" >host${s}.tgz - done - - - Distributes and unpacks these files on all the remaining machines, e.g.: - - for s in {1..10}; do - for d in {1..10}; do - test "$s" = "$d" && continue - ssh user@host${d} 'tar -kxzf -' /queue/* and writing their own finds to sequentially - numbered id:nnnnnn files in out_dir//queue/*. - - - Running some of the synchronized fuzzers with different (but related) - target binaries. For example, simultaneously stress-testing several - different JPEG parsers (say, IJG jpeg and libjpeg-turbo) while sharing - the discovered test cases can have synergistic effects and improve the - overall coverage. - - (In this case, running one -M instance per each binary is a good plan.) - - - Having some of the fuzzers invoke the binary in different ways. - For example, 'djpeg' supports several DCT modes, configurable with - a command-line flag, while 'dwebp' supports incremental and one-shot - decoding. In some scenarios, going after multiple distinct modes and then - pooling test cases will improve coverage. - - - Much less convincingly, running the synchronized fuzzers with different - starting test cases (e.g., progressive and standard JPEG) or dictionaries. - The synchronization mechanism ensures that the test sets will get fairly - homogeneous over time, but it introduces some initial variability. diff --git a/docs/perf_tips.txt b/docs/perf_tips.txt index 0cac8f7b..b4a8893d 100644 --- a/docs/perf_tips.txt +++ b/docs/perf_tips.txt @@ -140,7 +140,7 @@ options to use non-instrumented system-wide copies instead. The fuzzer is designed to need ~1 core per job. This means that on a, say, 4-core system, you can easily run four parallel fuzzing jobs with relatively -little performance hit. For tips on how to do that, see parallel_fuzzing.txt. 
+little performance hit. For tips on how to do that, see parallel_fuzzing.md. The afl-gotcpu utility can help you understand if you still have idle CPU capacity on your system. (It won't tell you about memory bandwidth, cache diff --git a/docs/status_screen.txt b/docs/status_screen.txt index c6f9f791..ef27bc76 100644 --- a/docs/status_screen.txt +++ b/docs/status_screen.txt @@ -218,7 +218,7 @@ now. It tells you about the current stage, which can be any of: splices together two random inputs from the queue at some arbitrarily selected midpoint. - - sync - a stage used only when -M or -S is set (see parallel_fuzzing.txt). + - sync - a stage used only when -M or -S is set (see parallel_fuzzing.md). No real fuzzing is involved, but the tool scans the output from other fuzzers and imports test cases as necessary. The first time this is done, it may take several minutes or so. @@ -370,7 +370,7 @@ comparing it to the number of logical cores on the system. If the value is shown in green, you are using fewer CPU cores than available on your system and can probably parallelize to improve performance; for tips on -how to do that, see parallel_fuzzing.txt. +how to do that, see parallel_fuzzing.md. If the value is shown in red, your CPU is *possibly* oversubscribed, and running additional fuzzers may not give you any benefits. diff --git a/docs/technical_details.txt b/docs/technical_details.txt index 1604c4d0..734512a2 100644 --- a/docs/technical_details.txt +++ b/docs/technical_details.txt @@ -485,7 +485,7 @@ This allows for extreme flexibility in fuzzer setup, including running synced instances against different parsers of a common data format, often with synergistic effects. -For more information about this design, see parallel_fuzzing.txt. +For more information about this design, see parallel_fuzzing.md. 
12) Binary-only instrumentation ------------------------------- diff --git a/experimental/README.experiments b/experimental/README.experiments index 543c078c..06f22ee1 100644 --- a/experimental/README.experiments +++ b/experimental/README.experiments @@ -20,7 +20,7 @@ Here's a quick overview of the stuff you can find in this directory: with additional gdb metadata. - distributed_fuzzing - a sample script for synchronizing fuzzer instances - across multiple machines (see parallel_fuzzing.txt). + across multiple machines (see parallel_fuzzing.md). - libpng_no_checksum - a sample patch for removing CRC checks in libpng. diff --git a/src/afl-fuzz-init.c b/src/afl-fuzz-init.c index c4a02698..e39480da 100644 --- a/src/afl-fuzz-init.c +++ b/src/afl-fuzz-init.c @@ -1719,7 +1719,7 @@ void get_core_count(void) { } else if (cur_runnable + 1 <= cpu_core_count) { - OKF("Try parallel jobs - see %s/parallel_fuzzing.txt.", doc_path); + OKF("Try parallel jobs - see %s/parallel_fuzzing.md.", doc_path); } diff --git a/src/afl-fuzz.c b/src/afl-fuzz.c index eae4ba1f..4957a8bf 100644 --- a/src/afl-fuzz.c +++ b/src/afl-fuzz.c @@ -133,7 +133,7 @@ static void usage(u8* argv0) { "Other stuff:\n" " -T text - text banner to show on the screen\n" - " -M / -S id - distributed mode (see parallel_fuzzing.txt)\n" + " -M / -S id - distributed mode (see parallel_fuzzing.md)\n" " -I command - execute this command/script when a new crash is " "found\n" " -B bitmap.txt - mutate a specific test case, use the out/fuzz_bitmap " diff --git a/src/afl-gotcpu.c b/src/afl-gotcpu.c index 5be30238..214862a9 100644 --- a/src/afl-gotcpu.c +++ b/src/afl-gotcpu.c @@ -19,7 +19,7 @@ This tool provides a fairly accurate measurement of CPU preemption rate. It is meant to complement the quick-and-dirty load average widget shown - in the afl-fuzz UI. See docs/parallel_fuzzing.txt for more info. + in the afl-fuzz UI. See docs/parallel_fuzzing.md for more info. 
For some work loads, the tool may actually suggest running more instances than you have CPU cores. This can happen if the tested program is spending -- cgit 1.4.1
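Note (not part of the patch): this commit renames docs/parallel_fuzzing.txt to docs/parallel_fuzzing.md and updates the references to it across the tree. One way to confirm that no reference was missed is to search for the old name after applying the patch. The sketch below is only an illustration: the function name `find_stale_refs` is hypothetical, and it assumes a grep that supports `-r` (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: list files that still mention
# the old file name parallel_fuzzing.txt.
# Assumption: grep supports -r (recursive search), as GNU and BSD grep do.
find_stale_refs() {
  # $1: root of the tree to scan (defaults to the current directory)
  # -r  recurse into subdirectories
  # -l  print only the names of matching files
  # -F  treat the pattern as a literal string, so '.' is not a wildcard
  grep -rlF 'parallel_fuzzing.txt' "${1:-.}"
}
```

Run from the repository root, for example `find_stale_refs . || echo "no stale references"`. The function returns grep's exit status, so it is non-zero when nothing matches. Hits inside `.git` or in files that intentionally mention the old name would need to be ignored by hand.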