From f367728c4435670caf2e9cc5acad257e7766cc65 Mon Sep 17 00:00:00 2001
From: van Hauser
Date: Tue, 28 May 2019 16:40:24 +0200
Subject: afl++ 2.52c initial commit

---
 docs/parallel_fuzzing.txt | 216 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 216 insertions(+)
 create mode 100644 docs/parallel_fuzzing.txt

(limited to 'docs/parallel_fuzzing.txt')

diff --git a/docs/parallel_fuzzing.txt b/docs/parallel_fuzzing.txt
new file mode 100644
index 00000000..58f8d2f4
--- /dev/null
+++ b/docs/parallel_fuzzing.txt
@@ -0,0 +1,216 @@
+=========================
+Tips for parallel fuzzing
+=========================
+
+  This document talks about synchronizing afl-fuzz jobs on a single machine
+  or across a fleet of systems. See README for the general instruction manual.
+
+1) Introduction
+---------------
+
+Every copy of afl-fuzz will take up one CPU core. This means that on an
+n-core system, you can almost always run around n concurrent fuzzing jobs with
+virtually no performance hit (you can use the afl-gotcpu tool to make sure).
+
+In fact, if you rely on just a single job on a multi-core system, you will
+be underutilizing the hardware. So, parallelization is usually the right
+way to go.
+
+When targeting multiple unrelated binaries or using the tool in "dumb" (-n)
+mode, it is perfectly fine to just start up several fully separate instances
+of afl-fuzz. The picture gets more complicated when you want to have multiple
+fuzzers hammering a common target: if a hard-to-hit but interesting test case
+is synthesized by one fuzzer, the remaining instances will not be able to use
+that input to guide their work.
+
+To help with this problem, afl-fuzz offers a simple way to synchronize test
+cases on the fly.
+
+2) Single-system parallelization
+--------------------------------
+
+If you wish to parallelize a single job across multiple cores on a local
+system, simply create a new, empty output directory ("sync dir") that will be
+shared by all the instances of afl-fuzz; and then come up with a naming scheme
+for every instance - say, "fuzzer01", "fuzzer02", etc.
+
+Run the first one ("master", -M) like this:
+
+$ ./afl-fuzz -i testcase_dir -o sync_dir -M fuzzer01 [...other stuff...]
+
+...and then, start up secondary (-S) instances like this:
+
+$ ./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer02 [...other stuff...]
+$ ./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer03 [...other stuff...]
+
+Each fuzzer will keep its state in a separate subdirectory, like so:
+
+  /path/to/sync_dir/fuzzer01/
+
+Each instance will also periodically rescan the top-level sync directory
+for any test cases found by other fuzzers - and will incorporate them into
+its own fuzzing when they are deemed interesting enough.
+
+The difference between the -M and -S modes is that the master instance will
+still perform deterministic checks; while the secondary instances will
+proceed straight to random tweaks. If you don't want to do deterministic
+fuzzing at all, it's OK to run all instances with -S. With very slow or complex
+targets, or when running heavily parallelized jobs, this is usually a good plan.
+
+Note that running multiple -M instances is wasteful, although there is an
+experimental support for parallelizing the deterministic checks. To leverage
+that, you need to create -M instances like so:
+
+$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterA:1/3 [...]
+$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterB:2/3 [...]
+$ ./afl-fuzz -i testcase_dir -o sync_dir -M masterC:3/3 [...]
+
+...where the first value after ':' is the sequential ID of a particular master
+instance (starting at 1), and the second value is the total number of fuzzers to
+distribute the deterministic fuzzing across. Note that if you boot up fewer
+fuzzers than indicated by the second number passed to -M, you may end up with
+poor coverage.
+
+You can also monitor the progress of your jobs from the command line with the
+provided afl-whatsup tool. When the instances are no longer finding new paths,
+it's probably time to stop.
+
+WARNING: Exercise caution when explicitly specifying the -f option. Each fuzzer
+must use a separate temporary file; otherwise, things will go south. One safe
+example may be:
+
+$ ./afl-fuzz [...] -S fuzzer10 -f file10.txt ./fuzzed/binary @@
+$ ./afl-fuzz [...] -S fuzzer11 -f file11.txt ./fuzzed/binary @@
+$ ./afl-fuzz [...] -S fuzzer12 -f file12.txt ./fuzzed/binary @@
+
+This is not a concern if you use @@ without -f and let afl-fuzz come up with the
+file name.
+
+3) Multi-system parallelization
+-------------------------------
+
+The basic operating principle for multi-system parallelization is similar to
+the mechanism explained in section 2. The key difference is that you need to
+write a simple script that performs two actions:
+
+  - Uses SSH with authorized_keys to connect to every machine and retrieve
+    a tar archive of the /path/to/sync_dir//queue/ directories for
+    every local to the machine. It's best to use a naming scheme
+    that includes host name in the fuzzer ID, so that you can do something
+    like:
+
+    for s in {1..10}; do
+      ssh user@host${s} "tar -czf - sync/host${s}_fuzzid*/[qf]*" >host${s}.tgz
+    done
+
+  - Distributes and unpacks these files on all the remaining machines, e.g.:
+
+    for s in {1..10}; do
+      for d in {1..10}; do
+        test "$s" = "$d" && continue
+        ssh user@host${d} 'tar -kxzf -' /queue/* and writing their own finds to sequentially
+    numbered id:nnnnnn files in out_dir//queue/*.
+
+  - Running some of the synchronized fuzzers with different (but related)
+    target binaries. For example, simultaneously stress-testing several
+    different JPEG parsers (say, IJG jpeg and libjpeg-turbo) while sharing
+    the discovered test cases can have synergistic effects and improve the
+    overall coverage.
+
+    (In this case, running one -M instance per each binary is a good plan.)
+
+  - Having some of the fuzzers invoke the binary in different ways.
+    For example, 'djpeg' supports several DCT modes, configurable with
+    a command-line flag, while 'dwebp' supports incremental and one-shot
+    decoding. In some scenarios, going after multiple distinct modes and then
+    pooling test cases will improve coverage.
+
+  - Much less convincingly, running the synchronized fuzzers with different
+    starting test cases (e.g., progressive and standard JPEG) or dictionaries.
+    The synchronization mechanism ensures that the test sets will get fairly
+    homogeneous over time, but it introduces some initial variability.
--
cgit 1.4.1
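
The single-system setup from section 2 of the patched document (one -M
instance plus several -S instances sharing a sync directory) could be scripted
roughly as follows. This is only a sketch, not part of the commit: the helper
name afl_parallel_cmds is hypothetical, and it prints the command lines
instead of launching anything, so it runs without afl-fuzz installed. The
-i, -o, -M and -S flags and the fuzzerNN naming scheme come from the text
above.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of afl-fuzz): print one afl-fuzz
# command line per instance. The first instance is the master (-M); every
# other instance is a secondary (-S). All share the same sync directory.
#
# usage: afl_parallel_cmds COUNT INPUT_DIR SYNC_DIR TARGET [TARGET_ARGS...]
afl_parallel_cmds() {
    n="$1"; in_dir="$2"; sync_dir="$3"
    shift 3
    i=1
    while [ "$i" -le "$n" ]; do
        # Zero-padded IDs: fuzzer01, fuzzer02, ...
        id=$(printf 'fuzzer%02d' "$i")
        if [ "$i" -eq 1 ]; then role="-M"; else role="-S"; fi
        echo "./afl-fuzz -i $in_dir -o $sync_dir $role $id $*"
        i=$((i + 1))
    done
}

# Example: three instances fuzzing ./fuzzed/binary with @@ as the input file.
afl_parallel_cmds 3 testcase_dir sync_dir ./fuzzed/binary @@
```

Each printed line can then be started in its own terminal or screen session.
Because no -f option is emitted, afl-fuzz picks the temporary file name
itself, which avoids the shared-file problem described in the WARNING above.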