author | llzmb <46303940+llzmb@users.noreply.github.com> | 2021-11-22 19:56:39 +0100 |
---|---|---|
committer | llzmb <46303940+llzmb@users.noreply.github.com> | 2021-11-22 19:56:39 +0100 |
commit | 8b5eafe7c504e68e710244ae7e58b1809e6584d9 (patch) | |
tree | f244196da8c39b2d1c24833207cdd42341f0b485 | |
parent | c31f4646cbd00f591dad3258c08ff8e56aa94420 (diff) | |
download | afl++-8b5eafe7c504e68e710244ae7e58b1809e6584d9.tar.gz |
Clean up docs folder
-rw-r--r-- | docs/afl-fuzz_approach.md | 24 |
-rw-r--r-- | docs/features.md | 96 |
-rw-r--r-- | docs/fuzzing_binary-only_targets.md | 99 |
-rw-r--r-- | docs/limitations.md | 53 |
-rw-r--r-- | docs/parallel_fuzzing.md | 256 |
-rw-r--r-- | docs/technical_details.md | 550 |
-rw-r--r-- | docs/third_party_tools.md | 68 |
-rw-r--r-- | docs/tutorials.md | 14 |
8 files changed, 204 insertions, 956 deletions
diff --git a/docs/afl-fuzz_approach.md b/docs/afl-fuzz_approach.md index 57a275d9..e0d5a1c9 100644 --- a/docs/afl-fuzz_approach.md +++ b/docs/afl-fuzz_approach.md @@ -37,9 +37,10 @@ superior to blind fuzzing or coverage-only tools. ## Understanding the status screen -This document provides an overview of the status screen - plus tips for -troubleshooting any warnings and red text shown in the UI. See -[README.md](../README.md) for the general instruction manual. +This chapter provides an overview of the status screen - plus tips for +troubleshooting any warnings and red text shown in the UI. + +For the general instruction manual, see [README.md](../README.md). ### A note about colors @@ -47,7 +48,7 @@ The status screen and error messages use colors to keep things readable and attract your attention to the most important details. For example, red almost always means "consult this doc" :-) -Unfortunately, the UI will render correctly only if your terminal is using +Unfortunately, the UI will only render correctly if your terminal is using traditional un*x palette (white text on black background) or something close to that. @@ -61,7 +62,7 @@ If you are using inverse video, you may want to change your settings, say: Alternatively, if you really like your current colors, you can edit config.h to comment out USE_COLORS, then do `make clean all`. -I'm not aware of any other simple way to make this work without causing other +We are not aware of any other simple way to make this work without causing other side effects - sorry about that. With that out of the way, let's talk about what's actually on the screen... @@ -103,8 +104,8 @@ will be allowed to run for months. 
There's one important thing to watch out for: if the tool is not finding new paths within several minutes of starting, you're probably not invoking the target binary correctly and it never gets to parse the input files we're -throwing at it; another possible explanations are that the default memory limit -(`-m`) is too restrictive, and the program exits after failing to allocate a +throwing at it; other possible explanations are that the default memory limit +(`-m`) is too restrictive and the program exits after failing to allocate a buffer very early on; or that the input files are patently invalid and always fail a basic header check. @@ -124,9 +125,9 @@ red warning in this section, too :-) The first field in this section gives you the count of queue passes done so far - that is, the number of times the fuzzer went over all the interesting test -cases discovered so far, fuzzed them, and looped back to the very beginning. -Every fuzzing session should be allowed to complete at least one cycle; and -ideally, should run much longer than that. + cases discovered so far, fuzzed them, and looped back to the very beginning. + Every fuzzing session should be allowed to complete at least one cycle; and + ideally, should run much longer than that. As noted earlier, the first pass can take a day or longer, so sit back and relax. @@ -140,7 +141,8 @@ while. The remaining fields in this part of the screen should be pretty obvious: there's the number of test cases ("paths") discovered so far, and the number of unique faults. The test cases, crashes, and hangs can be explored in real-time -by browsing the output directory, as discussed in [README.md](../README.md). +by browsing the output directory, see +[#interpreting-output](#interpreting-output). 
### Cycle progress diff --git a/docs/features.md b/docs/features.md index 05670e6f..35a869a9 100644 --- a/docs/features.md +++ b/docs/features.md @@ -1,49 +1,61 @@ # Important features of AFL++ - AFL++ supports llvm from 3.8 up to version 12, very fast binary fuzzing with QEMU 5.1 - with laf-intel and redqueen, frida mode, unicorn mode, gcc plugin, full *BSD, - Mac OS, Solaris and Android support and much, much, much more. +AFL++ supports llvm from 3.8 up to version 12, very fast binary fuzzing with +QEMU 5.1 with laf-intel and redqueen, frida mode, unicorn mode, gcc plugin, full +*BSD, Mac OS, Solaris and Android support and much, much, much more. - | Feature/Instrumentation | afl-gcc | llvm | gcc_plugin | frida_mode(9) | qemu_mode(10) |unicorn_mode(10) |coresight_mode(11)| - | -------------------------|:-------:|:---------:|:----------:|:----------------:|:----------------:|:----------------:|:----------------:| - | Threadsafe counters | | x(3) | | | | | | - | NeverZero | x86[_64]| x(1) | x | x | x | x | | - | Persistent Mode | | x | x | x86[_64]/arm64 | x86[_64]/arm[64] | x | | - | LAF-Intel / CompCov | | x | | | x86[_64]/arm[64] | x86[_64]/arm[64] | | - | CmpLog | | x | | x86[_64]/arm64 | x86[_64]/arm[64] | | | - | Selective Instrumentation| | x | x | x | x | | | - | Non-Colliding Coverage | | x(4) | | | (x)(5) | | | - | Ngram prev_loc Coverage | | x(6) | | | | | | - | Context Coverage | | x(6) | | | | | | - | Auto Dictionary | | x(7) | | | | | | - | Snapshot LKM Support | | (x)(8) | (x)(8) | | (x)(5) | | | - | Shared Memory Test cases | | x | x | x86[_64]/arm64 | x | x | | +| Feature/Instrumentation | afl-gcc | llvm | gcc_plugin | frida_mode(9) | qemu_mode(10) |unicorn_mode(10) |coresight_mode(11)| +| -------------------------|:-------:|:---------:|:----------:|:----------------:|:----------------:|:----------------:|:----------------:| +| Threadsafe counters | | x(3) | | | | | | +| NeverZero | x86[_64]| x(1) | x | x | x | x | | +| Persistent Mode | | x | x | 
x86[_64]/arm64 | x86[_64]/arm[64] | x | | +| LAF-Intel / CompCov | | x | | | x86[_64]/arm[64] | x86[_64]/arm[64] | | +| CmpLog | | x | | x86[_64]/arm64 | x86[_64]/arm[64] | | | +| Selective Instrumentation| | x | x | x | x | | | +| Non-Colliding Coverage | | x(4) | | | (x)(5) | | | +| Ngram prev_loc Coverage | | x(6) | | | | | | +| Context Coverage | | x(6) | | | | | | +| Auto Dictionary | | x(7) | | | | | | +| Snapshot LKM Support | | (x)(8) | (x)(8) | | (x)(5) | | | +| Shared Memory Test cases | | x | x | x86[_64]/arm64 | x | x | | - 1. default for LLVM >= 9.0, env var for older version due an efficiency bug in previous llvm versions - 2. GCC creates non-performant code, hence it is disabled in gcc_plugin - 3. with `AFL_LLVM_THREADSAFE_INST`, disables NeverZero - 4. with pcguard mode and LTO mode for LLVM 11 and newer - 5. upcoming, development in the branch - 6. not compatible with LTO instrumentation and needs at least LLVM v4.1 - 7. automatic in LTO mode with LLVM 11 and newer, an extra pass for all LLVM versions that write to a file to use with afl-fuzz' `-x` - 8. the snapshot LKM is currently unmaintained due to too many kernel changes coming too fast :-( - 9. frida mode is supported on Linux and MacOS for Intel and ARM - 10. QEMU/Unicorn is only supported on Linux - 11. Coresight mode is only available on AARCH64 Linux with a CPU with Coresight extension +1. default for LLVM >= 9.0, env var for older version due an efficiency bug in + previous llvm versions +2. GCC creates non-performant code, hence it is disabled in gcc_plugin +3. with `AFL_LLVM_THREADSAFE_INST`, disables NeverZero +4. with pcguard mode and LTO mode for LLVM 11 and newer +5. upcoming, development in the branch +6. not compatible with LTO instrumentation and needs at least LLVM v4.1 +7. automatic in LTO mode with LLVM 11 and newer, an extra pass for all LLVM + versions that write to a file to use with afl-fuzz' `-x` +8. 
the snapshot LKM is currently unmaintained due to too many kernel changes + coming too fast :-( +9. frida mode is supported on Linux and MacOS for Intel and ARM +10. QEMU/Unicorn is only supported on Linux +11. Coresight mode is only available on AARCH64 Linux with a CPU with Coresight + extension - Among others, the following features and patches have been integrated: +Among others, the following features and patches have been integrated: - * NeverZero patch for afl-gcc, instrumentation, qemu_mode and unicorn_mode which prevents a wrapping map value to zero, increases coverage - * Persistent mode, deferred forkserver and in-memory fuzzing for qemu_mode - * Unicorn mode which allows fuzzing of binaries from completely different platforms (integration provided by domenukk) - * The new CmpLog instrumentation for LLVM and QEMU inspired by [Redqueen](https://www.syssec.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2018/12/17/NDSS19-Redqueen.pdf) - * Win32 PE binary-only fuzzing with QEMU and Wine - * AFLfast's power schedules by Marcel Böhme: [https://github.com/mboehme/aflfast](https://github.com/mboehme/aflfast) - * The MOpt mutator: [https://github.com/puppet-meteor/MOpt-AFL](https://github.com/puppet-meteor/MOpt-AFL) - * LLVM mode Ngram coverage by Adrian Herrera [https://github.com/adrianherrera/afl-ngram-pass](https://github.com/adrianherrera/afl-ngram-pass) - * LAF-Intel/CompCov support for instrumentation, qemu_mode and unicorn_mode (with enhanced capabilities) - * Radamsa and honggfuzz mutators (as custom mutators). - * QBDI mode to fuzz android native libraries via Quarkslab's [QBDI](https://github.com/QBDI/QBDI) framework - * Frida and ptrace mode to fuzz binary-only libraries, etc. 
+* NeverZero patch for afl-gcc, instrumentation, qemu_mode and unicorn_mode which + prevents a wrapping map value to zero, increases coverage +* Persistent mode, deferred forkserver and in-memory fuzzing for qemu_mode +* Unicorn mode which allows fuzzing of binaries from completely different + platforms (integration provided by domenukk) +* The new CmpLog instrumentation for LLVM and QEMU inspired by + [Redqueen](https://www.syssec.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2018/12/17/NDSS19-Redqueen.pdf) +* Win32 PE binary-only fuzzing with QEMU and Wine +* AFLfast's power schedules by Marcel Böhme: + [https://github.com/mboehme/aflfast](https://github.com/mboehme/aflfast) +* The MOpt mutator: + [https://github.com/puppet-meteor/MOpt-AFL](https://github.com/puppet-meteor/MOpt-AFL) +* LLVM mode Ngram coverage by Adrian Herrera + [https://github.com/adrianherrera/afl-ngram-pass](https://github.com/adrianherrera/afl-ngram-pass) +* LAF-Intel/CompCov support for instrumentation, qemu_mode and unicorn_mode + (with enhanced capabilities) +* Radamsa and honggfuzz mutators (as custom mutators). +* QBDI mode to fuzz android native libraries via Quarkslab's + [QBDI](https://github.com/QBDI/QBDI) framework +* Frida and ptrace mode to fuzz binary-only libraries, etc. - So all in all this is the best-of AFL that is out there :-) \ No newline at end of file +So all in all this is the best-of AFL that is out there :-) \ No newline at end of file diff --git a/docs/fuzzing_binary-only_targets.md b/docs/fuzzing_binary-only_targets.md index 0b39042f..4490660d 100644 --- a/docs/fuzzing_binary-only_targets.md +++ b/docs/fuzzing_binary-only_targets.md @@ -84,6 +84,8 @@ Wine, python3, and the pefile python package installed. It is included in AFL++. +For more information, see [qemu_mode/README.wine.md](../qemu_mode/README.wine.md). + ### Frida_mode In frida_mode, you can fuzz binary-only targets as easily as with QEMU. 
@@ -99,11 +101,13 @@ make ``` For additional instructions and caveats, see -[frida_mode/README.md](../frida_mode/README.md). If possible, you should use the -persistent mode, see [qemu_frida/README.md](../qemu_frida/README.md). The mode -is approximately 2-5x slower than compile-time instrumentation, and is less -conducive to parallelization. But for binary-only fuzzing, it gives a huge speed -improvement if it is possible to use. +[frida_mode/README.md](../frida_mode/README.md). + +If possible, you should use the persistent mode, see +[qemu_frida/README.md](../qemu_frida/README.md). The mode is approximately 2-5x +slower than compile-time instrumentation, and is less conducive to +parallelization. But for binary-only fuzzing, it gives a huge speed improvement +if it is possible to use. If you want to fuzz a binary-only library, then you can fuzz it with frida-gum via frida_mode/. You will have to write a harness to call the target function in @@ -154,8 +158,6 @@ and use afl-untracer.c as a template. It is slower than frida_mode. For more information, see [utils/afl_untracer/README.md](../utils/afl_untracer/README.md). -## Binary rewriters - ### Coresight Coresight is ARM's answer to Intel's PT. With AFL++ v3.15, there is a coresight @@ -163,6 +165,35 @@ tracer implementation available in `coresight_mode/` which is faster than QEMU, however, cannot run in parallel. Currently, only one process can be traced, it is WIP. +Fore more information, see +[coresight_mode/README.md](../coresight_mode/README.md). + +## Binary rewriters + +An alternative solution are binary rewriters. They are faster then the solutions native to AFL++ but don't always work. + +### ZAFL +ZAFL is a static rewriting platform supporting x86-64 C/C++, +stripped/unstripped, and PIE/non-PIE binaries. Beyond conventional +instrumentation, ZAFL's API enables transformation passes (e.g., laf-Intel, +context sensitivity, InsTrim, etc.). 
+ +Its baseline instrumentation speed typically averages 90-95% of +afl-clang-fast's. + +[https://git.zephyr-software.com/opensrc/zafl](https://git.zephyr-software.com/opensrc/zafl) + +### RetroWrite + +If you have an x86/x86_64 binary that still has its symbols, is compiled with +position independent code (PIC/PIE), and does not use most of the C++ features, +then the RetroWrite solution might be for you. It decompiles to ASM files which +can then be instrumented with afl-gcc. + +It is at about 80-85% performance. + +[https://github.com/HexHive/retrowrite](https://github.com/HexHive/retrowrite) + ### Dyninst Dyninst is a binary instrumentation framework similar to Pintool and DynamoRIO. @@ -183,27 +214,6 @@ with afl-dyninst. [https://github.com/vanhauser-thc/afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst) -### Intel PT - -If you have a newer Intel CPU, you can make use of Intel's processor trace. The -big issue with Intel's PT is the small buffer size and the complex encoding of -the debug information collected through PT. This makes the decoding very CPU -intensive and hence slow. As a result, the overall speed decrease is about -70-90% (depending on the implementation and other factors). - -There are two AFL intel-pt implementations: - -1. [https://github.com/junxzm1990/afl-pt](https://github.com/junxzm1990/afl-pt) - => This needs Ubuntu 14.04.05 without any updates and the 4.4 kernel. - -2. [https://github.com/hunter-ht-2018/ptfuzzer](https://github.com/hunter-ht-2018/ptfuzzer) - => This needs a 4.14 or 4.15 kernel. The "nopti" kernel boot option must be - used. This one is faster than the other. - -Note that there is also honggfuzz: -[https://github.com/google/honggfuzz](https://github.com/google/honggfuzz). But -its IPT performance is just 6%! - ### Mcsema Theoretically, you can also decompile to llvm IR with mcsema, and then use @@ -211,6 +221,8 @@ llvm_mode to instrument the binary. Good luck with that. 
[https://github.com/lifting-bits/mcsema](https://github.com/lifting-bits/mcsema) +## Binary tracers + ### Pintool & DynamoRIO Pintool and DynamoRIO are dynamic instrumentation engines. They can be used for @@ -236,27 +248,26 @@ Pintool solutions: * [https://github.com/spinpx/afl_pin_mode](https://github.com/spinpx/afl_pin_mode) <= only old Pintool version supported -### RetroWrite - -If you have an x86/x86_64 binary that still has its symbols, is compiled with -position independent code (PIC/PIE), and does not use most of the C++ features, -then the RetroWrite solution might be for you. It decompiles to ASM files which -can then be instrumented with afl-gcc. +### Intel PT -It is at about 80-85% performance. +If you have a newer Intel CPU, you can make use of Intel's processor trace. The +big issue with Intel's PT is the small buffer size and the complex encoding of +the debug information collected through PT. This makes the decoding very CPU +intensive and hence slow. As a result, the overall speed decrease is about +70-90% (depending on the implementation and other factors). -[https://github.com/HexHive/retrowrite](https://github.com/HexHive/retrowrite) +There are two AFL intel-pt implementations: -### ZAFL -ZAFL is a static rewriting platform supporting x86-64 C/C++, -stripped/unstripped, and PIE/non-PIE binaries. Beyond conventional -instrumentation, ZAFL's API enables transformation passes (e.g., laf-Intel, -context sensitivity, InsTrim, etc.). +1. [https://github.com/junxzm1990/afl-pt](https://github.com/junxzm1990/afl-pt) + => This needs Ubuntu 14.04.05 without any updates and the 4.4 kernel. -Its baseline instrumentation speed typically averages 90-95% of -afl-clang-fast's. +2. [https://github.com/hunter-ht-2018/ptfuzzer](https://github.com/hunter-ht-2018/ptfuzzer) + => This needs a 4.14 or 4.15 kernel. The "nopti" kernel boot option must be + used. This one is faster than the other. 
-[https://git.zephyr-software.com/opensrc/zafl](https://git.zephyr-software.com/opensrc/zafl) +Note that there is also honggfuzz: +[https://github.com/google/honggfuzz](https://github.com/google/honggfuzz). But +its IPT performance is just 6%! ## Non-AFL++ solutions diff --git a/docs/limitations.md b/docs/limitations.md index a68c0a85..8172a902 100644 --- a/docs/limitations.md +++ b/docs/limitations.md @@ -1,36 +1,37 @@ # Known limitations & areas for improvement -Here are some of the most important caveats for AFL: +Here are some of the most important caveats for AFL++: - - AFL++ detects faults by checking for the first spawned process dying due to - a signal (SIGSEGV, SIGABRT, etc). Programs that install custom handlers for - these signals may need to have the relevant code commented out. In the same - vein, faults in child processes spawned by the fuzzed target may evade - detection unless you manually add some code to catch that. +- AFL++ detects faults by checking for the first spawned process dying due to a + signal (SIGSEGV, SIGABRT, etc). Programs that install custom handlers for + these signals may need to have the relevant code commented out. In the same + vein, faults in child processes spawned by the fuzzed target may evade + detection unless you manually add some code to catch that. - - As with any other brute-force tool, the fuzzer offers limited coverage if - encryption, checksums, cryptographic signatures, or compression are used to - wholly wrap the actual data format to be tested. +- As with any other brute-force tool, the fuzzer offers limited coverage if + encryption, checksums, cryptographic signatures, or compression are used to + wholly wrap the actual data format to be tested. - To work around this, you can comment out the relevant checks (see - utils/libpng_no_checksum/ for inspiration); if this is not possible, - you can also write a postprocessor, one of the hooks of custom mutators. 
- See [custom_mutators.md](custom_mutators.md) on how to use - `AFL_CUSTOM_MUTATOR_LIBRARY` +To work around this, you can comment out the relevant checks (see +utils/libpng_no_checksum/ for inspiration); if this is not possible, you can +also write a postprocessor, one of the hooks of custom mutators. See +[custom_mutators.md](custom_mutators.md) on how to use +`AFL_CUSTOM_MUTATOR_LIBRARY`. - - There are some unfortunate trade-offs with ASAN and 64-bit binaries. This - isn't due to any specific fault of afl-fuzz. +- There are some unfortunate trade-offs with ASAN and 64-bit binaries. This + isn't due to any specific fault of afl-fuzz. - - There is no direct support for fuzzing network services, background - daemons, or interactive apps that require UI interaction to work. You may - need to make simple code changes to make them behave in a more traditional - way. Preeny may offer a relatively simple option, too - see: - [https://github.com/zardus/preeny](https://github.com/zardus/preeny) +- There is no direct support for fuzzing network services, background daemons, + or interactive apps that require UI interaction to work. You may need to make + simple code changes to make them behave in a more traditional way. Preeny may + offer a relatively simple option, too - see: + [https://github.com/zardus/preeny](https://github.com/zardus/preeny) - Some useful tips for modifying network-based services can be also found at: - [https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop](https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop) +Some useful tips for modifying network-based services can be also found at: +[https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop](https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop) - - Occasionally, sentient machines rise against their creators. If this - happens to you, please consult [https://lcamtuf.coredump.cx/prep/](https://lcamtuf.coredump.cx/prep/). 
+- Occasionally, sentient machines rise against their creators. If this happens + to you, please consult + [https://lcamtuf.coredump.cx/prep/](https://lcamtuf.coredump.cx/prep/). -Beyond this, see [INSTALL.md](INSTALL.md) for platform-specific tips. +Beyond this, see [INSTALL.md](INSTALL.md) for platform-specific tips. \ No newline at end of file diff --git a/docs/parallel_fuzzing.md b/docs/parallel_fuzzing.md deleted file mode 100644 index 130cb3ce..00000000 --- a/docs/parallel_fuzzing.md +++ /dev/null @@ -1,256 +0,0 @@ -# Tips for parallel fuzzing - -This document talks about synchronizing afl-fuzz jobs on a single machine or -across a fleet of systems. See README.md for the general instruction manual. - -Note that this document is rather outdated. please refer to the main document -section on multiple core usage -[fuzzing_in_depth.md:b) Using multiple cores](fuzzing_in_depth.md#b-using-multiple-cores) -for up to date strategies! - -## 1) Introduction - -Every copy of afl-fuzz will take up one CPU core. This means that on an n-core -system, you can almost always run around n concurrent fuzzing jobs with -virtually no performance hit (you can use the afl-gotcpu tool to make sure). - -In fact, if you rely on just a single job on a multi-core system, you will be -underutilizing the hardware. So, parallelization is always the right way to go. - -When targeting multiple unrelated binaries or using the tool in -"non-instrumented" (-n) mode, it is perfectly fine to just start up several -fully separate instances of afl-fuzz. The picture gets more complicated when you -want to have multiple fuzzers hammering a common target: if a hard-to-hit but -interesting test case is synthesized by one fuzzer, the remaining instances will -not be able to use that input to guide their work. - -To help with this problem, afl-fuzz offers a simple way to synchronize test -cases on the fly. 
- -It is a good idea to use different power schedules if you run several instances -in parallel (`-p` option). - -Alternatively running other AFL spinoffs in parallel can be of value, e.g. -Angora (https://github.com/AngoraFuzzer/Angora/) - -## 2) Single-system parallelization - -If you wish to parallelize a single job across multiple cores on a local system, -simply create a new, empty output directory ("sync dir") that will be shared by -all the instances of afl-fuzz; and then come up with a naming scheme for every -instance - say, "fuzzer01", "fuzzer02", etc. - -Run the first one ("main node", -M) like this: - -``` -./afl-fuzz -i testcase_dir -o sync_dir -M fuzzer01 [...other stuff...] -``` - -...and then, start up secondary (-S) instances like this: - -``` -./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer02 [...other stuff...] -./afl-fuzz -i testcase_dir -o sync_dir -S fuzzer03 [...other stuff...] -``` - -Each fuzzer will keep its state in a separate subdirectory, like so: - - /path/to/sync_dir/fuzzer01/ - -Each instance will also periodically rescan the top-level sync directory for any -test cases found by other fuzzers - and will incorporate them into its own -fuzzing when they are deemed interesting enough. For performance reasons only -M -main node syncs the queue with everyone, the -S secondary nodes will only sync -from the main node. - -The difference between the -M and -S modes is that the main instance will still -perform deterministic checks; while the secondary instances will proceed -straight to random tweaks. - -Note that you must always have one -M main instance! Running multiple -M -instances is wasteful! - -You can also monitor the progress of your jobs from the command line with the -provided afl-whatsup tool. When the instances are no longer finding new paths, -it's probably time to stop. - -WARNING: Exercise caution when explicitly specifying the -f option. Each fuzzer -must use a separate temporary file; otherwise, things will go south. 
One safe -example may be: - -``` -./afl-fuzz [...] -S fuzzer10 -f file10.txt ./fuzzed/binary @@ -./afl-fuzz [...] -S fuzzer11 -f file11.txt ./fuzzed/binary @@ -./afl-fuzz [...] -S fuzzer12 -f file12.txt ./fuzzed/binary @@ -``` - -This is not a concern if you use @@ without -f and let afl-fuzz come up with the -file name. - -## 3) Multiple -M mains - - -There is support for parallelizing the deterministic checks. This is only needed -where - - 1. many new paths are found fast over a long time and it looks unlikely that - main node will ever catch up, and - 2. deterministic fuzzing is actively helping path discovery (you can see this - in the main node for the first for lines in the "fuzzing strategy yields" - section. If the ration `found/attempts` is high, then it is effective. It - most commonly isn't.) - -Only if both are true it is beneficial to have more than one main. You can -leverage this by creating -M instances like so: - -``` -./afl-fuzz -i testcase_dir -o sync_dir -M mainA:1/3 [...] -./afl-fuzz -i testcase_dir -o sync_dir -M mainB:2/3 [...] -./afl-fuzz -i testcase_dir -o sync_dir -M mainC:3/3 [...] -``` - -... where the first value after ':' is the sequential ID of a particular main -instance (starting at 1), and the second value is the total number of fuzzers to -distribute the deterministic fuzzing across. Note that if you boot up fewer -fuzzers than indicated by the second number passed to -M, you may end up with -poor coverage. - -## 4) Syncing with non-AFL fuzzers or independent instances - -A -M main node can be told with the `-F other_fuzzer_queue_directory` option to -sync results from other fuzzers, e.g. libfuzzer or honggfuzz. - -Only the specified directory will by synced into afl, not subdirectories. The -specified directory does not need to exist yet at the start of afl. - -The `-F` option can be passed to the main node several times. 
- -## 5) Multi-system parallelization - -The basic operating principle for multi-system parallelization is similar to the -mechanism explained in section 2. The key difference is that you need to write a -simple script that performs two actions: - - - Uses SSH with authorized_keys to connect to every machine and retrieve a tar - archive of the /path/to/sync_dir/<main_node(s)> directory local to the - machine. It is best to use a naming scheme that includes host name and it's - being a main node (e.g. main1, main2) in the fuzzer ID, so that you can do - something like: - - ```sh - for host in `cat HOSTLIST`; do - ssh user@$host "tar -czf - sync/$host_main*/" > $host.tgz - done - ``` - - - Distributes and unpacks these files on all the remaining machines, e.g.: - - ```sh - for srchost in `cat HOSTLIST`; do - for dsthost in `cat HOSTLIST`; do - test "$srchost" = "$dsthost" && continue - ssh user@$srchost 'tar -kxzf -' < $dsthost.tgz - done - done - ``` - -There is an example of such a script in utils/distributed_fuzzing/. - -There are other (older) more featured, experimental tools: - * https://github.com/richo/roving - * https://github.com/MartijnB/disfuzz-afl - -However these do not support syncing just main nodes (yet). - -When developing custom test case sync code, there are several optimizations to -keep in mind: - - - The synchronization does not have to happen very often; running the task - every 60 minutes or even less often at later fuzzing stages is fine - - - There is no need to synchronize crashes/ or hangs/; you only need to copy - over queue/* (and ideally, also fuzzer_stats). - - - It is not necessary (and not advisable!) to overwrite existing files; the -k - option in tar is a good way to avoid that. - - - There is no need to fetch directories for fuzzers that are not running - locally on a particular machine, and were simply copied over onto that - system during earlier runs. 
- - - For large fleets, you will want to consolidate tarballs for each host, as - this will let you use n SSH connections for sync, rather than n*(n-1). - - You may also want to implement staged synchronization. For example, you - could have 10 groups of systems, with group 1 pushing test cases only to - group 2; group 2 pushing them only to group 3; and so on, with group - eventually 10 feeding back to group 1. - - This arrangement would allow test interesting cases to propagate across the - fleet without having to copy every fuzzer queue to every single host. - - - You do not want a "main" instance of afl-fuzz on every system; you should - run them all with -S, and just designate a single process somewhere within - the fleet to run with -M. - - - Syncing is only necessary for the main nodes on a system. It is possible to - run main-less with only secondaries. However then you need to find out which - secondary took over the temporary role to be the main node. Look for the - `is_main_node` file in the fuzzer directories, eg. - `sync-dir/hostname-*/is_main_node` - -It is *not* advisable to skip the synchronization script and run the fuzzers -directly on a network filesystem; unexpected latency and unkillable processes in -I/O wait state can mess things up. - -## 6) Remote monitoring and data collection - -You can use screen, nohup, tmux, or something equivalent to run remote instances -of afl-fuzz. If you redirect the program's output to a file, it will -automatically switch from a fancy UI to more limited status reports. There is -also basic machine-readable information which is always written to the -fuzzer_stats file in the output directory. Locally, that information can be -interpreted with afl-whatsup. - -In principle, you can use the status screen of the main (-M) instance to monitor -the overall fuzzing progress and decide when to stop. In this mode, the most -important signal is just that no new paths are being found for a longer while. 
-If you do not have a main instance, just pick any single secondary instance to -watch and go by that. - -You can also rely on that instance's output directory to collect the synthesized -corpus that covers all the noteworthy paths discovered anywhere within the -fleet. Secondary (-S) instances do not require any special monitoring, other -than just making sure that they are up. - -Keep in mind that crashing inputs are *not* automatically propagated to the main -instance, so you may still want to monitor for crashes fleet-wide from within -your synchronization or health checking scripts (see afl-whatsup). - -## 7) Asymmetric setups - -It is perhaps worth noting that all of the following is permitted: - - - Running afl-fuzz with conjunction with other guided tools that can extend - coverage (e.g., via concolic execution). Third-party tools simply need to - follow the protocol described above for pulling new test cases from - out_dir/<fuzzer_id>/queue/* and writing their own finds to sequentially - numbered id:nnnnnn files in out_dir/<ext_tool_id>/queue/*. - - - Running some of the synchronized fuzzers with different (but related) target - binaries. For example, simultaneously stress-testing several different JPEG - parsers (say, IJG jpeg and libjpeg-turbo) while sharing the discovered test - cases can have synergistic effects and improve the overall coverage. - - (In this case, running one -M instance per target is necessary.) - - - Having some of the fuzzers invoke the binary in different ways. For example, - 'djpeg' supports several DCT modes, configurable with a command-line flag, - while 'dwebp' supports incremental and one-shot decoding. In some scenarios, - going after multiple distinct modes and then pooling test cases will improve - coverage. - - - Much less convincingly, running the synchronized fuzzers with different - starting test cases (e.g., progressive and standard JPEG) or dictionaries. 
-    The synchronization mechanism ensures that the test sets will get fairly
-    homogeneous over time, but it introduces some initial variability.
\ No newline at end of file
diff --git a/docs/technical_details.md b/docs/technical_details.md
deleted file mode 100644
index 994ffe9f..00000000
--- a/docs/technical_details.md
+++ /dev/null
@@ -1,550 +0,0 @@
-# Technical "whitepaper" for afl-fuzz
-
-
-NOTE: this document is mostly outdated!
-
-
-This document provides a quick overview of the guts of American Fuzzy Lop.
-See README.md for the general instruction manual; and for a discussion of
-motivations and design goals behind AFL, see historical_notes.md.
-
-## 0. Design statement
-
-American Fuzzy Lop does its best not to focus on any singular principle of
-operation and not be a proof-of-concept for any specific theory. The tool can
-be thought of as a collection of hacks that have been tested in practice,
-found to be surprisingly effective, and have been implemented in the simplest,
-most robust way I could think of at the time.
-
-Many of the resulting features are made possible thanks to the availability of
-lightweight instrumentation that served as a foundation for the tool, but this
-mechanism should be thought of merely as a means to an end. The only true
-governing principles are speed, reliability, and ease of use.
-
-## 1. Coverage measurements
-
-The instrumentation injected into compiled programs captures branch (edge)
-coverage, along with coarse branch-taken hit counts. The code injected at
-branch points is essentially equivalent to:
-
-```c
-  cur_location = <COMPILE_TIME_RANDOM>;
-  shared_mem[cur_location ^ prev_location]++;
-  prev_location = cur_location >> 1;
-```
-
-The `cur_location` value is generated randomly to simplify the process of
-linking complex projects and keep the XOR output distributed uniformly.
-
-The `shared_mem[]` array is a 64 kB SHM region passed to the instrumented binary
-by the caller. Every byte set in the output map can be thought of as a hit for
-a particular (`branch_src`, `branch_dst`) tuple in the instrumented code.
-
-The size of the map is chosen so that collisions are sporadic with almost all
-of the intended targets, which usually sport between 2k and 10k discoverable
-branch points:
-
-```
-   Branch cnt | Colliding tuples | Example targets
-  ------------+------------------+-----------------
-        1,000 | 0.75%            | giflib, lzo
-        2,000 | 1.5%             | zlib, tar, xz
-        5,000 | 3.5%             | libpng, libwebp
-       10,000 | 7%               | libxml
-       20,000 | 14%              | sqlite
-       50,000 | 30%              | -
-```
-
-At the same time, its size is small enough to allow the map to be analyzed
-in a matter of microseconds on the receiving end, and to effortlessly fit
-within L2 cache.
-
-This form of coverage provides considerably more insight into the execution
-path of the program than simple block coverage. In particular, it trivially
-distinguishes between the following execution traces:
-
-```
-  A -> B -> C -> D -> E (tuples: AB, BC, CD, DE)
-  A -> B -> D -> C -> E (tuples: AB, BD, DC, CE)
-```
-
-This aids the discovery of subtle fault conditions in the underlying code,
-because security vulnerabilities are more often associated with unexpected
-or incorrect state transitions than with merely reaching a new basic block.
-
-The reason for the shift operation in the last line of the pseudocode shown
-earlier in this section is to preserve the directionality of tuples (without
-this, A ^ B would be indistinguishable from B ^ A) and to retain the identity
-of tight loops (otherwise, A ^ A would be obviously equal to B ^ B).
-
-The absence of simple saturating arithmetic opcodes on Intel CPUs means that
-the hit counters can sometimes wrap around to zero. Since this is a fairly
-unlikely and localized event, it's seen as an acceptable performance trade-off.
-
-### 2. Detecting new behaviors
-
-The fuzzer maintains a global map of tuples seen in previous executions; this
-data can be rapidly compared with individual traces and updated in just a couple
-of dword- or qword-wide instructions and a simple loop.
-
-When a mutated input produces an execution trace containing new tuples, the
-corresponding input file is preserved and routed for additional processing
-later on (see section #3). Inputs that do not trigger new local-scale state
-transitions in the execution trace (i.e., produce no new tuples) are discarded,
-even if their overall control flow sequence is unique.
-
-This approach allows for a very fine-grained and long-term exploration of
-program state while not having to perform any computationally intensive and
-fragile global comparisons of complex execution traces, and while avoiding the
-scourge of path explosion.
-
-To illustrate the properties of the algorithm, consider that the second trace
-shown below would be considered substantially new because of the presence of
-new tuples (CA, AE):
-
-```
-  #1: A -> B -> C -> D -> E
-  #2: A -> B -> C -> A -> E
-```
-
-At the same time, with #2 processed, the following pattern will not be seen
-as unique, despite having a markedly different overall execution path:
-
-```
-  #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E
-```
-
-In addition to detecting new tuples, the fuzzer also considers coarse tuple
-hit counts. These are divided into several buckets:
-
-```
-  1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+
-```
-
-To some extent, the number of buckets is an implementation artifact: it allows
-an in-place mapping of an 8-bit counter generated by the instrumentation to
-an 8-position bitmap relied on by the fuzzer executable to keep track of the
-already-seen execution counts for each tuple.
-
-Changes within the range of a single bucket are ignored; transition from one
-bucket to another is flagged as an interesting change in program control flow,
-and is routed to the evolutionary process outlined in the section below.
-
-The hit count behavior provides a way to distinguish between potentially
-interesting control flow changes, such as a block of code being executed
-twice when it was normally hit only once. At the same time, it is fairly
-insensitive to empirically less notable changes, such as a loop going from
-47 cycles to 48. The counters also provide some degree of "accidental"
-immunity against tuple collisions in dense trace maps.
-
-The execution is policed fairly heavily through memory and execution time
-limits; by default, the timeout is set at 5x the initially-calibrated
-execution speed, rounded up to 20 ms. The aggressive timeouts are meant to
-prevent dramatic fuzzer performance degradation by descending into tarpits
-that, say, improve coverage by 1% while being 100x slower; we pragmatically
-reject them and hope that the fuzzer will find a less expensive way to reach
-the same code. Empirical testing strongly suggests that more generous time
-limits are not worth the cost.
-
-## 3. Evolving the input queue
-
-Mutated test cases that produced new state transitions within the program are
-added to the input queue and used as a starting point for future rounds of
-fuzzing. They supplement, but do not automatically replace, existing finds.
-
-In contrast to more greedy genetic algorithms, this approach allows the tool
-to progressively explore various disjoint and possibly mutually incompatible
-features of the underlying data format, as shown in this image:
-
-
-
-Several practical examples of the results of this algorithm are discussed
-here:
-
-  https://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html
-  https://lcamtuf.blogspot.com/2014/11/afl-fuzz-nobody-expects-cdata-sections.html
-
-The synthetic corpus produced by this process is essentially a compact
-collection of "hmm, this does something new!" input files, and can be used to
-seed any other testing processes down the line (for example, to manually
-stress-test resource-intensive desktop apps).
-
-With this approach, the queue for most targets grows to somewhere between 1k
-and 10k entries; approximately 10-30% of this is attributable to the discovery
-of new tuples, and the remainder is associated with changes in hit counts.
-
-The following table compares the relative ability to discover file syntax and
-explore program states when using several different approaches to guided
-fuzzing. The instrumented target was GNU patch 2.7k.3 compiled with `-O3` and
-seeded with a dummy text file; the session consisted of a single pass over the
-input queue with afl-fuzz:
-
-```
-    Fuzzer guidance | Blocks  | Edges   | Edge hit | Highest-coverage
-      strategy used | reached | reached | cnt var  | test case generated
-  ------------------+---------+---------+----------+---------------------------
-     (Initial file) | 156     | 163     | 1.00     | (none)
-                    |         |         |          |
-    Blind fuzzing S | 182     | 205     | 2.23     | First 2 B of RCS diff
-    Blind fuzzing L | 228     | 265     | 2.23     | First 4 B of -c mode diff
-     Block coverage | 855     | 1,130   | 1.57     | Almost-valid RCS diff
-      Edge coverage | 1,452   | 2,070   | 2.18     | One-chunk -c mode diff
-          AFL model | 1,765   | 2,597   | 4.99     | Four-chunk -c mode diff
-```
-
-The first entry for blind fuzzing ("S") corresponds to executing just a single
-round of testing; the second set of figures ("L") shows the fuzzer running in a
-loop for a number of execution cycles comparable with that of the instrumented
-runs, which required more time to fully process the growing queue.
-
-Roughly similar results have been obtained in a separate experiment where the
-fuzzer was modified to compile out all the random fuzzing stages and leave just
-a series of rudimentary, sequential operations such as walking bit flips.
-Because this mode would be incapable of altering the size of the input file,
-the sessions were seeded with a valid unified diff:
-
-```
-    Queue extension | Blocks  | Edges   | Edge hit | Number of unique
-      strategy used | reached | reached | cnt var  | crashes found
-  ------------------+---------+---------+----------+------------------
-     (Initial file) | 624     | 717     | 1.00     | -
-                    |         |         |          |
-      Blind fuzzing | 1,101   | 1,409   | 1.60     | 0
-     Block coverage | 1,255   | 1,649   | 1.48     | 0
-      Edge coverage | 1,259   | 1,734   | 1.72     | 0
-          AFL model | 1,452   | 2,040   | 3.16     | 1
-```
-
-At noted earlier on, some of the prior work on genetic fuzzing relied on
-maintaining a single test case and evolving it to maximize coverage. At least
-in the tests described above, this "greedy" approach appears to confer no
-substantial benefits over blind fuzzing strategies.
-
-### 4. Culling the corpus
-
-The progressive state exploration approach outlined above means that some of
-the test cases synthesized later on in the game may have edge coverage that
-is a strict superset of the coverage provided by their ancestors.
-
-To optimize the fuzzing effort, AFL periodically re-evaluates the queue using a
-fast algorithm that selects a smaller subset of test cases that still cover
-every tuple seen so far, and whose characteristics make them particularly
-favorable to the tool.
-
-The algorithm works by assigning every queue entry a score proportional to its
-execution latency and file size; and then selecting lowest-scoring candidates
-for each tuple.
-
-The tuples are then processed sequentially using a simple workflow:
-
-  1) Find next tuple not yet in the temporary working set,
-  2) Locate the winning queue entry for this tuple,
-  3) Register *all* tuples present in that entry's trace in the working set,
-  4) Go to #1 if there are any missing tuples in the set.
-
-The generated corpus of "favored" entries is usually 5-10x smaller than the
-starting data set. Non-favored entries are not discarded, but they are skipped
-with varying probabilities when encountered in the queue:
-
-  - If there are new, yet-to-be-fuzzed favorites present in the queue, 99%
-    of non-favored entries will be skipped to get to the favored ones.
-  - If there are no new favorites:
-    * If the current non-favored entry was fuzzed before, it will be skipped
-      95% of the time.
-    * If it hasn't gone through any fuzzing rounds yet, the odds of skipping
-      drop down to 75%.
-
-Based on empirical testing, this provides a reasonable balance between queue
-cycling speed and test case diversity.
-
-Slightly more sophisticated but much slower culling can be performed on input
-or output corpora with `afl-cmin`. This tool permanently discards the redundant
-entries and produces a smaller corpus suitable for use with `afl-fuzz` or
-external tools.
-
-## 5. Trimming input files
-
-File size has a dramatic impact on fuzzing performance, both because large
-files make the target binary slower, and because they reduce the likelihood
-that a mutation would touch important format control structures, rather than
-redundant data blocks. This is discussed in more detail in perf_tips.md.
-
-The possibility that the user will provide a low-quality starting corpus aside,
-some types of mutations can have the effect of iteratively increasing the size
-of the generated files, so it is important to counter this trend.
-
-Luckily, the instrumentation feedback provides a simple way to automatically
-trim down input files while ensuring that the changes made to the files have no
-impact on the execution path.
-
-The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data
-with variable length and stepover; any deletion that doesn't affect the checksum
-of the trace map is committed to disk. The trimmer is not designed to be
-particularly thorough; instead, it tries to strike a balance between precision
-and the number of `execve()` calls spent on the process, selecting the block size
-and stepover to match. The average per-file gains are around 5-20%.
-
-The standalone `afl-tmin` tool uses a more exhaustive, iterative algorithm, and
-also attempts to perform alphabet normalization on the trimmed files. The
-operation of `afl-tmin` is as follows.
-
-First, the tool automatically selects the operating mode. If the initial input
-crashes the target binary, afl-tmin will run in non-instrumented mode, simply
-keeping any tweaks that produce a simpler file but still crash the target.
-The same mode is used for hangs, if `-H` (hang mode) is specified.
-If the target is non-crashing, the tool uses an instrumented mode and keeps only
-the tweaks that produce exactly the same execution path.
-
-The actual minimization algorithm is:
-
-  1) Attempt to zero large blocks of data with large stepovers. Empirically,
-     this is shown to reduce the number of execs by preempting finer-grained
-     efforts later on.
-  2) Perform a block deletion pass with decreasing block sizes and stepovers,
-     binary-search-style.
-  3) Perform alphabet normalization by counting unique characters and trying
-     to bulk-replace each with a zero value.
-  4) As a last result, perform byte-by-byte normalization on non-zero bytes.
-
-Instead of zeroing with a 0x00 byte, `afl-tmin` uses the ASCII digit '0'. This
-is done because such a modification is much less likely to interfere with
-text parsing, so it is more likely to result in successful minimization of
-text files.
-
-The algorithm used here is less involved than some other test case
-minimization approaches proposed in academic work, but requires far fewer
-executions and tends to produce comparable results in most real-world
-applications.
-
-## 6. Fuzzing strategies
-
-The feedback provided by the instrumentation makes it easy to understand the
-value of various fuzzing strategies and optimize their parameters so that they
-work equally well across a wide range of file types. The strategies used by
-afl-fuzz are generally format-agnostic and are discussed in more detail here:
-
-  https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html
-
-It is somewhat notable that especially early on, most of the work done by
-`afl-fuzz` is actually highly deterministic, and progresses to random stacked
-modifications and test case splicing only at a later stage. The deterministic
-strategies include:
-
-  - Sequential bit flips with varying lengths and stepovers,
-  - Sequential addition and subtraction of small integers,
-  - Sequential insertion of known interesting integers (`0`, `1`, `INT_MAX`, etc),
-
-The purpose of opening with deterministic steps is related to their tendency to
-produce compact test cases and small diffs between the non-crashing and crashing
-inputs.
-
-With deterministic fuzzing out of the way, the non-deterministic steps include
-stacked bit flips, insertions, deletions, arithmetics, and splicing of different
-test cases.
-
-The relative yields and `execve()` costs of all these strategies have been
-investigated and are discussed in the aforementioned blog post.
-
-For the reasons discussed in historical_notes.md (chiefly, performance,
-simplicity, and reliability), AFL generally does not try to reason about the
-relationship between specific mutations and program states; the fuzzing steps
-are nominally blind, and are guided only by the evolutionary design of the
-input queue.
-
-That said, there is one (trivial) exception to this rule: when a new queue
-entry goes through the initial set of deterministic fuzzing steps, and tweaks to
-some regions in the file are observed to have no effect on the checksum of the
-execution path, they may be excluded from the remaining phases of
-deterministic fuzzing - and the fuzzer may proceed straight to random tweaks.
-Especially for verbose, human-readable data formats, this can reduce the number
-of execs by 10-40% or so without an appreciable drop in coverage. In extreme
-cases, such as normally block-aligned tar archives, the gains can be as high as
-90%.
-
-Because the underlying "effector maps" are local every queue entry and remain
-in force only during deterministic stages that do not alter the size or the
-general layout of the underlying file, this mechanism appears to work very
-reliably and proved to be simple to implement.
-
-## 7. Dictionaries
-
-The feedback provided by the instrumentation makes it easy to automatically
-identify syntax tokens in some types of input files, and to detect that certain
-combinations of predefined or auto-detected dictionary terms constitute a
-valid grammar for the tested parser.
-
-A discussion of how these features are implemented within afl-fuzz can be found
-here:
-
-  https://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html
-
-In essence, when basic, typically easily-obtained syntax tokens are combined
-together in a purely random manner, the instrumentation and the evolutionary
-design of the queue together provide a feedback mechanism to differentiate
-between meaningless mutations and ones that trigger new behaviors in the
-instrumented code - and to incrementally build more complex syntax on top of
-this discovery.
-
-The dictionaries have been shown to enable the fuzzer to rapidly reconstruct
-the grammar of highly verbose and complex languages such as JavaScript, SQL,
-or XML; several examples of generated SQL statements are given in the blog
-post mentioned above.
-
-Interestingly, the AFL instrumentation also allows the fuzzer to automatically
-isolate syntax tokens already present in an input file. It can do so by looking
-for run of bytes that, when flipped, produce a consistent change to the
-program's execution path; this is suggestive of an underlying atomic comparison
-to a predefined value baked into the code. The fuzzer relies on this signal
-to build compact "auto dictionaries" that are then used in conjunction with
-other fuzzing strategies.
-
-## 8. De-duping crashes
-
-De-duplication of crashes is one of the more important problems for any
-competent fuzzing tool. Many of the naive approaches run into problems; in
-particular, looking just at the faulting address may lead to completely
-unrelated issues being clustered together if the fault happens in a common
-library function (say, `strcmp`, `strcpy`); while checksumming call stack
-backtraces can lead to extreme crash count inflation if the fault can be
-reached through a number of different, possibly recursive code paths.
-
-The solution implemented in `afl-fuzz` considers a crash unique if any of two
-conditions are met:
-
-  - The crash trace includes a tuple not seen in any of the previous crashes,
-  - The crash trace is missing a tuple that was always present in earlier
-    faults.
-
-The approach is vulnerable to some path count inflation early on, but exhibits
-a very strong self-limiting effect, similar to the execution path analysis
-logic that is the cornerstone of `afl-fuzz`.
-
-## 9. Investigating crashes
-
-The exploitability of many types of crashes can be ambiguous; afl-fuzz tries
-to address this by providing a crash exploration mode where a known-faulting
-test case is fuzzed in a manner very similar to the normal operation of the
-fuzzer, but with a constraint that causes any non-crashing mutations to be
-thrown away.
-
-A detailed discussion of the value of this approach can be found here:
-
-  https://lcamtuf.blogspot.com/2014/11/afl-fuzz-crash-exploration-mode.html
-
-The method uses instrumentation feedback to explore the state of the crashing
-program to get past the ambiguous faulting condition and then isolate the
-newly-found inputs for human review.
-
-On the subject of crashes, it is worth noting that in contrast to normal
-queue entries, crashing inputs are *not* trimmed; they are kept exactly as
-discovered to make it easier to compare them to the parent, non-crashing entry
-in the queue. That said, `afl-tmin` can be used to shrink them at will.
-
-## 10 The fork server
-
-To improve performance, `afl-fuzz` uses a "fork server", where the fuzzed process
-goes through `execve()`, linking, and libc initialization only once, and is then
-cloned from a stopped process image by leveraging copy-on-write. The
-implementation is described in more detail here:
-
-  https://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html
-
-The fork server is an integral aspect of the injected instrumentation and
-simply stops at the first instrumented function to await commands from
-`afl-fuzz`.
-
-With fast targets, the fork server can offer considerable performance gains,
-usually between 1.5x and 2x. It is also possible to:
-
-  - Use the fork server in manual ("deferred") mode, skipping over larger,
-    user-selected chunks of initialization code. It requires very modest
-    code changes to the targeted program, and With some targets, can
-    produce 10x+ performance gains.
-  - Enable "persistent" mode, where a single process is used to try out
-    multiple inputs, greatly limiting the overhead of repetitive `fork()`
-    calls. This generally requires some code changes to the targeted program,
-    but can improve the performance of fast targets by a factor of 5 or more - approximating the benefits of in-process fuzzing jobs while still
-    maintaining very robust isolation between the fuzzer process and the
-    targeted binary.
-
-## 11. Parallelization
-
-The parallelization mechanism relies on periodically examining the queues
-produced by independently-running instances on other CPU cores or on remote
-machines, and then selectively pulling in the test cases that, when tried
-out locally, produce behaviors not yet seen by the fuzzer at hand.
-
-This allows for extreme flexibility in fuzzer setup, including running synced
-instances against different parsers of a common data format, often with
-synergistic effects.
-
-For more information about this design, see parallel_fuzzing.md.
-
-## 12. Binary-only instrumentation
-
-Instrumentation of black-box, binary-only targets is accomplished with the
-help of a separately-built version of QEMU in "user emulation" mode. This also
-allows the execution of cross-architecture code - say, ARM binaries on x86.
-
-QEMU uses basic blocks as translation units; the instrumentation is implemented
-on top of this and uses a model roughly analogous to the compile-time hooks:
-
-```c
-  if (block_address > elf_text_start && block_address < elf_text_end) {
-
-    cur_location = (block_address >> 4) ^ (block_address << 8);
-    shared_mem[cur_location ^ prev_location]++;
-    prev_location = cur_location >> 1;
-
-  }
-```
-
-The shift-and-XOR-based scrambling in the second line is used to mask the
-effects of instruction alignment.
-
-The start-up of binary translators such as QEMU, DynamoRIO, and PIN is fairly
-slow; to counter this, the QEMU mode leverages a fork server similar to that
-used for compiler-instrumented code, effectively spawning copies of an
-already-initialized process paused at `_start`.
-
-First-time translation of a new basic block also incurs substantial latency. To
-eliminate this problem, the AFL fork server is extended by providing a channel
-between the running emulator and the parent process. The channel is used
-to notify the parent about the addresses of any newly-encountered blocks and to
-add them to the translation cache that will be replicated for future child
-processes.
-
-As a result of these two optimizations, the overhead of the QEMU mode is
-roughly 2-5x, compared to 100x+ for PIN.
-
-## 13. The `afl-analyze` tool
-
-The file format analyzer is a simple extension of the minimization algorithm
-discussed earlier on; instead of attempting to remove no-op blocks, the tool
-performs a series of walking byte flips and then annotates runs of bytes
-in the input file.
-
-It uses the following classification scheme:
-
-  - "No-op blocks" - segments where bit flips cause no apparent changes to
-    control flow. Common examples may be comment sections, pixel data within
-    a bitmap file, etc.
-  - "Superficial content" - segments where some, but not all, bitflips
-    produce some control flow changes. Examples may include strings in rich
-    documents (e.g., XML, RTF).
-  - "Critical stream" - a sequence of bytes where all bit flips alter control
-    flow in different but correlated ways. This may be compressed data,
-    non-atomically compared keywords or magic values, etc.
-  - "Suspected length field" - small, atomic integer that, when touched in
-    any way, causes a consistent change to program control flow, suggestive
-    of a failed length check.
-  - "Suspected cksum or magic int" - an integer that behaves similarly to a
-    length field, but has a numerical value that makes the length explanation
-    unlikely. This is suggestive of a checksum or other "magic" integer.
-  - "Suspected checksummed block" - a long block of data where any change
-    always triggers the same new execution path. Likely caused by failing
-    a checksum or a similar integrity check before any subsequent parsing
-    takes place.
-  - "Magic value section" - a generic token where changes cause the type
-    of binary behavior outlined earlier, but that doesn't meet any of the
-    other criteria. May be an atomically compared keyword or so.
diff --git a/docs/third_party_tools.md b/docs/third_party_tools.md
index 446d373c..92229e84 100644
--- a/docs/third_party_tools.md
+++ b/docs/third_party_tools.md
@@ -1,33 +1,57 @@
 # Tools that help fuzzing with AFL++

 Speeding up fuzzing:
- * [libfiowrapper](https://github.com/marekzmyslowski/libfiowrapper) - if the function you want to fuzz requires loading a file, this allows using the shared memory test case feature :-) - recommended.
+* [libfiowrapper](https://github.com/marekzmyslowski/libfiowrapper) - if the
+  function you want to fuzz requires loading a file, this allows using the
+  shared memory test case feature :-) - recommended.

 Minimization of test cases:
- * [afl-pytmin](https://github.com/ilsani/afl-pytmin) - a wrapper for afl-tmin that tries to speed up the process of minimization of a single test case by using many CPU cores.
- * [afl-ddmin-mod](https://github.com/MarkusTeufelberger/afl-ddmin-mod) - a variation of afl-tmin based on the ddmin algorithm.
- * [halfempty](https://github.com/googleprojectzero/halfempty) - is a fast utility for minimizing test cases by Tavis Ormandy based on parallelization.
+* [afl-pytmin](https://github.com/ilsani/afl-pytmin) - a wrapper for afl-tmin
+  that tries to speed up the process of minimization of a single test case by
+  using many CPU cores.
+* [afl-ddmin-mod](https://github.com/MarkusTeufelberger/afl-ddmin-mod) - a
+  variation of afl-tmin based on the ddmin algorithm.
+* [halfempty](https://github.com/googleprojectzero/halfempty) - is a fast
+  utility for minimizing test cases by Tavis Ormandy based on parallelization.

 Distributed execution:
- * [disfuzz-afl](https://github.com/MartijnB/disfuzz-afl) - distributed fuzzing for AFL.
- * [AFLDFF](https://github.com/quantumvm/AFLDFF) - AFL distributed fuzzing framework.
- * [afl-launch](https://github.com/bnagy/afl-launch) - a tool for the execution of many AFL instances.
- * [afl-mothership](https://github.com/afl-mothership/afl-mothership) - management and execution of many synchronized AFL fuzzers on AWS cloud.
- * [afl-in-the-cloud](https://github.com/abhisek/afl-in-the-cloud) - another script for running AFL in AWS.
+* [disfuzz-afl](https://github.com/MartijnB/disfuzz-afl) - distributed fuzzing
+  for AFL.
+* [AFLDFF](https://github.com/quantumvm/AFLDFF) - AFL distributed fuzzing
+  framework.
+* [afl-launch](https://github.com/bnagy/afl-launch) - a tool for the execution
+  of many AFL instances.
+* [afl-mothership](https://github.com/afl-mothership/afl-mothership) -
+  management and execution of many synchronized AFL fuzzers on AWS cloud.
+* [afl-in-the-cloud](https://github.com/abhisek/afl-in-the-cloud) - another
+  script for running AFL in AWS.

 Deployment, management, monitoring, reporting
- * [afl-utils](https://gitlab.com/rc0r/afl-utils) - a set of utilities for automatic processing/analysis of crashes and reducing the number of test cases.
- * [afl-other-arch](https://github.com/shellphish/afl-other-arch) - is a set of patches and scripts for easily adding support for various non-x86 architectures for AFL.
- * [afl-trivia](https://github.com/bnagy/afl-trivia) - a few small scripts to simplify the management of AFL.
- * [afl-monitor](https://github.com/reflare/afl-monitor) - a script for monitoring AFL.
- * [afl-manager](https://github.com/zx1340/afl-manager) - a web server on Python for managing multi-afl.
- * [afl-remote](https://github.com/block8437/afl-remote) - a web server for the remote management of AFL instances.
- * [afl-extras](https://github.com/fekir/afl-extras) - shell scripts to parallelize afl-tmin, startup, and data collection.
+* [afl-utils](https://gitlab.com/rc0r/afl-utils) - a set of utilities for
+  automatic processing/analysis of crashes and reducing the number of test
+  cases.
+* [afl-other-arch](https://github.com/shellphish/afl-other-arch) - is a set of
+  patches and scripts for easily adding support for various non-x86
+  architectures for AFL.
+* [afl-trivia](https://github.com/bnagy/afl-trivia) - a few small scripts to
+  simplify the management of AFL.
+* [afl-monitor](https://github.com/reflare/afl-monitor) - a script for
+  monitoring AFL.
+* [afl-manager](https://github.com/zx1340/afl-manager) - a web server on Python
+  for managing multi-afl.
+* [afl-remote](https://github.com/block8437/afl-remote) - a web server for the
+  remote management of AFL instances.
+* [afl-extras](https://github.com/fekir/afl-extras) - shell scripts to
+  parallelize afl-tmin, startup, and data collection.

 Crash processing
- * [afl-crash-analyzer](https://github.com/floyd-fuh/afl-crash-analyzer) - another crash analyzer for AFL.
- * [fuzzer-utils](https://github.com/ThePatrickStar/fuzzer-utils) - a set of scripts for the analysis of results.
- * [atriage](https://github.com/Ayrx/atriage) - a simple triage tool.
- * [afl-kit](https://github.com/kcwu/afl-kit) - afl-cmin on Python.
- * [AFLize](https://github.com/d33tah/aflize) - a tool that automatically generates builds of debian packages suitable for AFL.
- * [afl-fid](https://github.com/FoRTE-Research/afl-fid) - a set of tools for working with input data.
\ No newline at end of file
+* [afl-crash-analyzer](https://github.com/floyd-fuh/afl-crash-analyzer) -
+  another crash analyzer for AFL.
+* [fuzzer-utils](https://github.com/ThePatrickStar/fuzzer-utils) - a set of
+  scripts for the analysis of results.
+* [atriage](https://github.com/Ayrx/atriage) - a simple triage tool.
+* [afl-kit](https://github.com/kcwu/afl-kit) - afl-cmin on Python.
+* [AFLize](https://github.com/d33tah/aflize) - a tool that automatically
+  generates builds of debian packages suitable for AFL.
+* [afl-fid](https://github.com/FoRTE-Research/afl-fid) - a set of tools for
+  working with input data.
\ No newline at end of file
diff --git a/docs/tutorials.md b/docs/tutorials.md
index cc7ed130..ed8a7eec 100644
--- a/docs/tutorials.md
+++ b/docs/tutorials.md
@@ -1,6 +1,6 @@
 # Tutorials

-Here are some good writeups to show how to effectively use AFL++:
+Here are some good write-ups to show how to effectively use AFL++:

 * [https://aflplus.plus/docs/tutorials/libxml2_tutorial/](https://aflplus.plus/docs/tutorials/libxml2_tutorial/)
 * [https://bananamafia.dev/post/gb-fuzz/](https://bananamafia.dev/post/gb-fuzz/)
@@ -18,9 +18,13 @@ training, then we can highly recommend the following:
 If you are interested in fuzzing structured data (where you define what the
 structure is), these links have you covered:

-* Superion for AFL++: [https://github.com/adrian-rt/superion-mutator](https://github.com/adrian-rt/superion-mutator)
-* libprotobuf for AFL++: [https://github.com/P1umer/AFLplusplus-protobuf-mutator](https://github.com/P1umer/AFLplusplus-protobuf-mutator)
-* libprotobuf raw: [https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator](https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator)
-* libprotobuf for old AFL++ API: [https://github.com/thebabush/afl-libprotobuf-mutator](https://github.com/thebabush/afl-libprotobuf-mutator)
+* Superion for AFL++:
+  [https://github.com/adrian-rt/superion-mutator](https://github.com/adrian-rt/superion-mutator)
+* libprotobuf for AFL++:
+  [https://github.com/P1umer/AFLplusplus-protobuf-mutator](https://github.com/P1umer/AFLplusplus-protobuf-mutator)
+* libprotobuf raw:
+  [https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator](https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator)
+* libprotobuf for old AFL++ API:
+  [https://github.com/thebabush/afl-libprotobuf-mutator](https://github.com/thebabush/afl-libprotobuf-mutator)

 If you find other good ones, please send them to us :-)
\ No newline at end of file