diff options
Diffstat (limited to 'docs')
-rw-r--r-- | docs/INSTALL.md (renamed from docs/INSTALL) | 97 | ||||
-rw-r--r-- | docs/PATCHES.md (renamed from docs/PATCHES) | 13 | ||||
-rw-r--r-- | docs/README.MOpt.md (renamed from docs/README.MOpt) | 0 | ||||
-rw-r--r-- | docs/historical_notes.md (renamed from docs/historical_notes.txt) | 24 | ||||
-rw-r--r-- | docs/life_pro_tips.md | 90 | ||||
-rw-r--r-- | docs/life_pro_tips.txt | 128 | ||||
-rw-r--r-- | docs/perf_tips.md (renamed from docs/perf_tips.txt) | 127 | ||||
-rw-r--r-- | docs/power_schedules.md (renamed from docs/power_schedules.txt) | 7 | ||||
-rw-r--r-- | docs/sister_projects.md | 318 | ||||
-rw-r--r-- | docs/sister_projects.txt | 360 | ||||
-rw-r--r-- | docs/status_screen.md (renamed from docs/status_screen.txt) | 180 | ||||
-rw-r--r-- | docs/technical_details.md (renamed from docs/technical_details.txt) | 148 |
12 files changed, 689 insertions, 803 deletions
diff --git a/docs/INSTALL b/docs/INSTALL.md index 2e24724f..0f9673ad 100644 --- a/docs/INSTALL +++ b/docs/INSTALL.md @@ -1,28 +1,30 @@ -========================= -Installation instructions -========================= +# Installation instructions This document provides basic installation instructions and discusses known - issues for a variety of platforms. See README for the general instruction + issues for a variety of platforms. See README.md for the general instruction manual. -1) Linux on x86 +## 1) Linux on x86 --------------- This platform is expected to work well. Compile the program with: -$ make +```bash +make +``` You can start using the fuzzer without installation, but it is also possible to install it with: -# make install +```bash +make install +``` There are no special dependencies to speak of; you will need GNU make and a working compiler (gcc or clang). Some of the optional scripts bundled with the program may depend on bash, gdb, and similar basic tools. -If you are using clang, please review llvm_mode/README.llvm; the LLVM +If you are using clang, please review llvm_mode/README.md; the LLVM integration mode can offer substantial performance gains compared to the traditional approach. @@ -30,27 +32,30 @@ You may have to change several settings to get optimal results (most notably, disable crash reporting utilities and switch to a different CPU governor), but afl-fuzz will guide you through that if necessary. -2) OpenBSD, FreeBSD, NetBSD on x86 ----------------------------------- +## OpenBSD, FreeBSD, NetBSD on x86 Similarly to Linux, these platforms are expected to work well and are regularly tested. Compile everything with GNU make: -$ gmake +```bash +gmake +``` Note that BSD make will *not* work; if you do not have gmake on your system, please install it first. As on Linux, you can use the fuzzer itself without installation, or install it with: -# gmake install +``` +gmake install +``` Keep in mind that if you are using csh as your shell, the syntax of some of the -shell commands given in the README and other docs will be different. +shell commands given in the README.md and other docs will be different. -The llvm_mode requires a dynamically linked, fully-operational installation of +The `llvm_mode` requires a dynamically linked, fully-operational installation of clang. At least on FreeBSD, the clang binaries are static and do not include some of the essential tools, so if you want to make it work, you may need to -follow the instructions in llvm_mode/README.llvm. +follow the instructions in llvm_mode/README.md. Beyond that, everything should work as advertised. @@ -58,8 +63,7 @@ The QEMU mode is currently supported only on Linux. I think it's just a QEMU problem, I couldn't get a vanilla copy of user-mode emulation support working correctly on BSD at all. -3) MacOS X on x86 ------------------ +## 3. MacOS X on x86 MacOS X should work, but there are some gotchas due to the idiosyncrasies of the platform. On top of this, I have limited release testing capabilities @@ -69,8 +73,8 @@ To build AFL, install Xcode and follow the general instructions for Linux. The Xcode 'gcc' tool is just a wrapper for clang, so be sure to use afl-clang to compile any instrumented binaries; afl-gcc will fail unless you have GCC -installed from another source (in which case, please specify AFL_CC and -AFL_CXX to point to the "real" GCC binaries). +installed from another source (in which case, please specify `AFL_CC` and +`AFL_CXX` to point to the "real" GCC binaries). Only 64-bit compilation will work on the platform; porting the 32-bit instrumentation would require a fair amount of work due to the way OS X @@ -80,47 +84,45 @@ The crash reporting daemon that comes by default with MacOS X will cause problems with fuzzing. You need to turn it off by following the instructions provided here: http://goo.gl/CCcd5u -The fork() semantics on OS X are a bit unusual compared to other unix systems +The `fork()` semantics on OS X are a bit unusual compared to other unix systems and definitely don't look POSIX-compliant. This means two things: - Fuzzing will be probably slower than on Linux. In fact, some folks report considerable performance gains by running the jobs inside a Linux VM on MacOS X. - - Some non-portable, platform-specific code may be incompatible with the - AFL forkserver. If you run into any problems, set AFL_NO_FORKSRV=1 in the + AFL forkserver. If you run into any problems, set `AFL_NO_FORKSRV=1` in the environment before starting afl-fuzz. User emulation mode of QEMU does not appear to be supported on MacOS X, so -black-box instrumentation mode (-Q) will not work. +black-box instrumentation mode (`-Q`) will not work. The llvm_mode requires a fully-operational installation of clang. The one that comes with Xcode is missing some of the essential headers and helper tools. -See llvm_mode/README.llvm for advice on how to build the compiler from scratch. +See llvm_mode/README.md for advice on how to build the compiler from scratch. -4) Linux or *BSD on non-x86 systems ------------------------------------ +## 4. Linux or *BSD on non-x86 systems Standard build will fail on non-x86 systems, but you should be able to leverage two other options: - - The LLVM mode (see llvm_mode/README.llvm), which does not rely on + - The LLVM mode (see llvm_mode/README.md), which does not rely on x86-specific assembly shims. It's fast and robust, but requires a complete installation of clang. - - - The QEMU mode (see qemu_mode/README.qemu), which can be also used for + - The QEMU mode (see qemu_mode/README.md), which can be also used for fuzzing cross-platform binaries. It's slower and more fragile, but can be used even when you don't have the source for the tested app. If you're not sure what you need, you need the LLVM mode. To get it, try: -$ AFL_NO_X86=1 gmake && gmake -C llvm_mode +```bash +AFL_NO_X86=1 gmake && gmake -C llvm_mode +``` ...and compile your target program with afl-clang-fast or afl-clang-fast++ instead of the traditional afl-gcc or afl-clang wrappers. -5) Solaris on x86 ------------------ +## 5. Solaris on x86 The fuzzer reportedly works on Solaris, but I have not tested this first-hand, and the user base is fairly small, so I don't have a lot of feedback. @@ -128,36 +130,39 @@ and the user base is fairly small, so I don't have a lot of feedback. To get the ball rolling, you will need to use GNU make and GCC or clang. I'm being told that the stock version of GCC that comes with the platform does not work properly due to its reliance on a hardcoded location for 'as' (completely -ignoring the -B parameter or $PATH). +ignoring the `-B` parameter or `$PATH`). To fix this, you may want to build stock GCC from the source, like so: -$ ./configure --prefix=$HOME/gcc --with-gnu-as --with-gnu-ld \ +```sh +./configure --prefix=$HOME/gcc --with-gnu-as --with-gnu-ld \ --with-gmp-include=/usr/include/gmp --with-mpfr-include=/usr/include/mpfr -$ make -$ sudo make install +make +sudo make install +``` -Do *not* specify --with-as=/usr/gnu/bin/as - this will produce a GCC binary that -ignores the -B flag and you will be back to square one. +Do *not* specify `--with-as=/usr/gnu/bin/as` - this will produce a GCC binary that +ignores the `-B` flag and you will be back to square one. Note that Solaris reportedly comes with crash reporting enabled, which causes problems with crashes being misinterpreted as hangs, similarly to the gotchas for Linux and MacOS X. AFL does not auto-detect crash reporting on this particular platform, but you may need to run the following command: -$ coreadm -d global -d global-setid -d process -d proc-setid \ +```sh +coreadm -d global -d global-setid -d process -d proc-setid \ -d kzone -d log +``` User emulation mode of QEMU is not available on Solaris, so black-box -instrumentation mode (-Q) will not work. +instrumentation mode (`-Q`) will not work. -6) Everything else ------------------- +## 6. Everything else You're on your own. On POSIX-compliant systems, you may be able to compile and run the fuzzer; and the LLVM mode may offer a way to instrument non-x86 code. -The fuzzer will not run on Windows. It will also not work under Cygwin. It +The fuzzer will run on Windows in WSL only. It will not work under Cygwin on in the normal Windows world. It could be ported to the latter platform fairly easily, but it's a pretty bad idea, because Cygwin is extremely slow. It makes much more sense to use VirtualBox or so to run a hardware-accelerated Linux VM; it will run around @@ -171,13 +176,15 @@ It's possible that all you need is this workaround: https://github.com/pelya/android-shmem Joshua J. Drake notes that the Android linker adds a shim that automatically -intercepts SIGSEGV and related signals. To fix this issue and be able to see +intercepts `SIGSEGV` and related signals. To fix this issue and be able to see crashes, you need to put this at the beginning of the fuzzed program: +```sh signal(SIGILL, SIG_DFL); signal(SIGABRT, SIG_DFL); signal(SIGBUS, SIG_DFL); signal(SIGFPE, SIG_DFL); signal(SIGSEGV, SIG_DFL); +``` -You may need to #include <signal.h> first. +You may need to `#include <signal.h>` first. diff --git a/docs/PATCHES b/docs/PATCHES.md index 50bcb32f..1dfb6622 100644 --- a/docs/PATCHES +++ b/docs/PATCHES.md @@ -1,9 +1,11 @@ +# Applied Patches + The following patches from https://github.com/vanhauser-thc/afl-patches have been installed or not installed: -INSTALLED -========= +## INSTALLED +``` afl-llvm-fix.diff by kcwu(at)csie(dot)org afl-sort-all_uniq-fix.diff by legarrec(dot)vincent(at)gmail(dot)com laf-intel.diff by heiko(dot)eissfeldt(at)hexco(dot)de @@ -16,6 +18,7 @@ afl-qemu-ppc64.diff by william(dot)barsse(at)airbus(dot)com afl-qemu-optimize-entrypoint.diff by mh(at)mh-sec(dot)de afl-qemu-speed.diff by abiondo on github afl-qemu-optimize-map.diff by mh(at)mh-sec(dot)de +``` + Custom mutator (native library) (by kyakdan) + unicorn_mode (modernized and updated by domenukk) @@ -28,10 +31,12 @@ afl-qemu-optimize-map.diff by mh(at)mh-sec(dot)de + forkserver patch for afl-tmin (github.com/nccgroup/TriforceAFL) -NOT INSTALLED -============= +## NOT INSTALLED + +``` afl-fuzz-context_sensitive.diff - changes too much of the behaviour afl-tmpfs.diff - same as afl-fuzz-tmpdir.diff but more complex afl-cmin-reduce-dataset.diff - unsure of the impact afl-llvm-fix2.diff - not needed with the other patches +``` diff --git a/docs/README.MOpt b/docs/README.MOpt.md index 94e63959..94e63959 100644 --- a/docs/README.MOpt +++ b/docs/README.MOpt.md diff --git a/docs/historical_notes.txt b/docs/historical_notes.md index 741fd925..2079698b 100644 --- a/docs/historical_notes.txt +++ b/docs/historical_notes.md @@ -1,17 +1,14 @@ -================ -Historical notes -================ +# Historical notes This doc talks about the rationale of some of the high-level design decisions for American Fuzzy Lop. It's adopted from a discussion with Rob Graham. - See README for the general instruction manual, and technical_details.txt for + See README.md for the general instruction manual, and technical_details.md for additional implementation-level insights. -1) Influences -------------- +## 1) Influences -In short, afl-fuzz is inspired chiefly by the work done by Tavis Ormandy back -in 2007. Tavis did some very persuasive experiments using gcov block coverage +In short, `afl-fuzz` is inspired chiefly by the work done by Tavis Ormandy back +in 2007. Tavis did some very persuasive experiments using `gcov` block coverage to select optimal test cases out of a large corpus of data, and then using them as a starting point for traditional fuzzing workflows. @@ -22,7 +19,7 @@ In parallel to this, both Tavis and I were interested in evolutionary fuzzing. Tavis had his experiments, and I was working on a tool called bunny-the-fuzzer, released somewhere in 2007. -Bunny used a generational algorithm not much different from afl-fuzz, but +Bunny used a generational algorithm not much different from `afl-fuzz`, but also tried to reason about the relationship between various input bits and the internal state of the program, with hopes of deriving some additional value from that. The reasoning / correlation part was probably in part inspired by @@ -75,8 +72,7 @@ But I digress; ultimately, attribution is hard, and glorying the fundamental concepts behind AFL is probably a waste of time. The devil is very much in the often-overlooked details, which brings us to... -2) Design goals for afl-fuzz ----------------------------- +## 2. Design goals for afl-fuzz In short, I believe that the current implementation of afl-fuzz takes care of several itches that seemed impossible to scratch with other tools: @@ -86,7 +82,7 @@ several itches that seemed impossible to scratch with other tools: likely to find a bug, but runs 100x slower, your users are getting a bad deal. - To avoid starting with a handicap, afl-fuzz is meant to let you fuzz most of + To avoid starting with a handicap, `afl-fuzz` is meant to let you fuzz most of the intended targets at roughly their native speed - so even if it doesn't add value, you do not lose much. @@ -107,7 +103,7 @@ several itches that seemed impossible to scratch with other tools: them strictly worse than "dumb" tools, and such degradation can be difficult for less experienced users to notice and correct. - In contrast, afl-fuzz is designed to be rock solid, chiefly by keeping it + In contrast, `afl-fuzz` is designed to be rock solid, chiefly by keeping it simple. In fact, at its core, it's designed to be just a very good traditional fuzzer with a wide range of interesting, well-researched strategies to go by. The fancy parts just help it focus the effort in @@ -143,5 +139,5 @@ of small, complementary methods that were shown to reliably yields results better than chance. The use of instrumentation is a part of that toolkit, but is far from being the most important one. -Ultimately, what matters is that afl-fuzz is designed to find cool bugs - and +Ultimately, what matters is that `afl-fuzz` is designed to find cool bugs - and has a pretty robust track record of doing just that. diff --git a/docs/life_pro_tips.md b/docs/life_pro_tips.md new file mode 100644 index 00000000..b79c6ab9 --- /dev/null +++ b/docs/life_pro_tips.md @@ -0,0 +1,90 @@ +# AFL "Life Pro Tips" + +Bite-sized advice for those who understand the basics, but can't be bothered +to read or memorize every other piece of documentation for AFL. + +## Get more bang for your buck by using fuzzing dictionaries. + +See dictionaries/README.md to learn how. + +## You can get the most out of your hardware by parallelizing AFL jobs. + +See docs/parallel_fuzzing.md for step-by-step tips. + +## Improve the odds of spotting memory corruption bugs with libdislocator.so! + +It's easy. Consult libdislocator/README.md for usage tips. + +## Want to understand how your target parses a particular input file? + +Try the bundled `afl-analyze` tool; it's got colors and all! + +## You can visually monitor the progress of your fuzzing jobs. + +Run the bundled `afl-plot` utility to generate browser-friendly graphs. + +## Need to monitor AFL jobs programmatically? +Check out the `fuzzer_stats` file in the AFL output dir or try `afl-whatsup`. + +## Puzzled by something showing up in red or purple in the AFL UI? +It could be important - consult docs/status_screen.md right away! + +## Know your target? Convert it to persistent mode for a huge performance gain! +Consult section #5 in llvm_mode/README.md for tips. + +## Using clang? +Check out llvm_mode/ for a faster alternative to afl-gcc! + +## Did you know that AFL can fuzz closed-source or cross-platform binaries? +Check out qemu_mode/README.md and unicorn_mode/README.md for more. + +## Did you know that afl-fuzz can minimize any test case for you? +Try the bundled `afl-tmin` tool - and get small repro files fast! + +## Not sure if a crash is exploitable? AFL can help you figure it out. Specify +`-C` to enable the peruvian were-rabbit mode. + +## Trouble dealing with a machine uprising? Relax, we've all been there. + +Find essential survival tips at http://lcamtuf.coredump.cx/prep/. + +## Want to automatically spot non-crashing memory handling bugs? + +Try running an AFL-generated corpus through ASAN, MSAN, or Valgrind. + +## Good selection of input files is critical to a successful fuzzing job. + +See docs/perf_tips.md for pro tips. + +## You can improve the odds of automatically spotting stack corruption issues. + +Specify `AFL_HARDEN=1` in the environment to enable hardening flags. + +## Bumping into problems with non-reproducible crashes? +It happens, but usually +isn't hard to diagnose. See section #7 in README for tips. + +## Fuzzing is not just about memory corruption issues in the codebase. +Add some +sanity-checking `assert()` / `abort()` statements to effortlessly catch logic bugs. + +## Hey kid... pssst... want to figure out how AFL really works? + +Check out docs/technical_details.md for all the gory details in one place! + +## There's a ton of third-party helper tools designed to work with AFL! + +Be sure to check out docs/sister_projects.md before writing your own. + +## Need to fuzz the command-line arguments of a particular program? + +You can find a simple solution in experimental/argv_fuzzing. + +## Attacking a format that uses checksums? + +Remove the checksum-checking code or +use a postprocessor! See experimental/post_library/ for more. + +## Dealing with a very slow target or hoping for instant results? + +Specify `-d` when calling afl-fuzz! \ No newline at end of file diff --git a/docs/life_pro_tips.txt b/docs/life_pro_tips.txt deleted file mode 100644 index 27c70592..00000000 --- a/docs/life_pro_tips.txt +++ /dev/null @@ -1,128 +0,0 @@ -# =================== -# AFL "Life Pro Tips" -# =================== -# -# Bite-sized advice for those who understand the basics, but can't be bothered -# to read or memorize every other piece of documentation for AFL. -# - -% - -Get more bang for your buck by using fuzzing dictionaries. -See dictionaries/README.dictionaries to learn how. - -% - -You can get the most out of your hardware by parallelizing AFL jobs. -See docs/parallel_fuzzing.md for step-by-step tips. - -% - -Improve the odds of spotting memory corruption bugs with libdislocator.so! -It's easy. Consult libdislocator/README.dislocator for usage tips. - -% - -Want to understand how your target parses a particular input file? -Try the bundled afl-analyze tool; it's got colors and all! - -% - -You can visually monitor the progress of your fuzzing jobs. -Run the bundled afl-plot utility to generate browser-friendly graphs. - -% - -Need to monitor AFL jobs programmatically? Check out the fuzzer_stats file -in the AFL output dir or try afl-whatsup. - -% - -Puzzled by something showing up in red or purple in the AFL UI? -It could be important - consult docs/status_screen.txt right away! - -% - -Know your target? Convert it to persistent mode for a huge performance gain! -Consult section #5 in llvm_mode/README.llvm for tips. - -% - -Using clang? Check out llvm_mode/ for a faster alternative to afl-gcc! - -% - -Did you know that AFL can fuzz closed-source or cross-platform binaries? -Check out qemu_mode/README.qemu for more. - -% - -Did you know that afl-fuzz can minimize any test case for you? -Try the bundled afl-tmin tool - and get small repro files fast! - -% - -Not sure if a crash is exploitable? AFL can help you figure it out. Specify --C to enable the peruvian were-rabbit mode. See section #10 in README for more. - -% - -Trouble dealing with a machine uprising? Relax, we've all been there. -Find essential survival tips at http://lcamtuf.coredump.cx/prep/. - -% - -AFL-generated corpora can be used to power other testing processes. -See section #2 in README for inspiration - it tends to pay off! - -% - -Want to automatically spot non-crashing memory handling bugs? -Try running an AFL-generated corpus through ASAN, MSAN, or Valgrind. - -% - -Good selection of input files is critical to a successful fuzzing job. -See section #5 in README (or docs/perf_tips.txt) for pro tips. - -% - -You can improve the odds of automatically spotting stack corruption issues. -Specify AFL_HARDEN=1 in the environment to enable hardening flags. - -% - -Bumping into problems with non-reproducible crashes? It happens, but usually -isn't hard to diagnose. See section #7 in README for tips. - -% - -Fuzzing is not just about memory corruption issues in the codebase. Add some -sanity-checking assert() / abort() statements to effortlessly catch logic bugs. - -% - -Hey kid... pssst... want to figure out how AFL really works? -Check out docs/technical_details.txt for all the gory details in one place! - -% - -There's a ton of third-party helper tools designed to work with AFL! -Be sure to check out docs/sister_projects.txt before writing your own. - -% - -Need to fuzz the command-line arguments of a particular program? -You can find a simple solution in experimental/argv_fuzzing. - -% - -Attacking a format that uses checksums? Remove the checksum-checking code or -use a postprocessor! See experimental/post_library/ for more. - -% - -Dealing with a very slow target or hoping for instant results? Specify -d -when calling afl-fuzz! - -% diff --git a/docs/perf_tips.txt b/docs/perf_tips.md index b4a8893d..41d74447 100644 --- a/docs/perf_tips.txt +++ b/docs/perf_tips.md @@ -1,12 +1,9 @@ -================================= -Tips for performance optimization -================================= +## Tips for performance optimization This file provides tips for troubleshooting slow or wasteful fuzzing jobs. See README for the general instruction manual. -1) Keep your test cases small ------------------------------ +## 1. Keep your test cases small This is probably the single most important step to take! Large test cases do not merely take more time and memory to be parsed by the tested binary, but @@ -29,22 +26,20 @@ as high as 500x or so. In practice, this means that you shouldn't fuzz image parsers with your vacation photos. Generate a tiny 16x16 picture instead, and run it through -jpegtran or pngcrunch for good measure. The same goes for most other types +`jpegtran` or `pngcrunch` for good measure. The same goes for most other types of documents. -There's plenty of small starting test cases in ../testcases/* - try them out +There's plenty of small starting test cases in ../testcases/ - try them out or submit new ones! -If you want to start with a larger, third-party corpus, run afl-cmin with an +If you want to start with a larger, third-party corpus, run `afl-cmin` with an aggressive timeout on that data set first. -2) Use a simpler target ------------------------ +## 2. Use a simpler target Consider using a simpler target binary in your fuzzing work. For example, for -image formats, bundled utilities such as djpeg, readpng, or gifhisto are -considerably (10-20x) faster than the convert tool from ImageMagick - all while -exercising roughly the same library-level image parsing code. +image formats, bundled utilities such as `djpeg`, `readpng`, or `gifhisto` are +considerably (10-20x) faster than the convert tool from ImageMagick - all while exercising roughly the same library-level image parsing code. Even if you don't have a lightweight harness for a particular target, remember that you can always use another, related library to generate a corpus that will @@ -53,11 +48,10 @@ be then manually fed to a more resource-hungry program later on. Also note that reading the fuzzing input via stdin is faster than reading from a file. -3) Use LLVM instrumentation ---------------------------- +## 3. Use LLVM instrumentation When fuzzing slow targets, you can gain 20-100% performance improvement by -using the LLVM-based instrumentation mode described in llvm_mode/README.llvm. +using the LLVM-based instrumentation mode described in [the llvm_mode README](../llvm_mode/README.md). Note that this mode requires the use of clang and will not work with GCC. The LLVM mode also offers a "persistent", in-process fuzzing mode that can @@ -67,25 +61,26 @@ that can offer huge benefits for programs with high startup overhead. Both modes require you to edit the source code of the fuzzed program, but the changes often amount to just strategically placing a single line or two. -If there are important data comparisons performed (e.g. strcmp(ptr, MAGIC_HDR) -then using laf-intel (see llvm_mode/README.laf-intel) will help afl-fuzz a lot +If there are important data comparisons performed (e.g. `strcmp(ptr, MAGIC_HDR)`) +then using laf-intel (see llvm_mode/README.laf-intel.md) will help `afl-fuzz` a lot to get to the important parts in the code. -If you are only intested in specific parts of the code being fuzzed, you can +If you are only interested in specific parts of the code being fuzzed, you can whitelist the files that are actually relevant. This improves the speed and -accuracy of afl. See llvm_mode/README.whitelist +accuracy of afl. See llvm_mode/README.whitelist.md Also use the InsTrim mode on larger binaries, this improves performance and coverage a lot. -4) Profile and optimize the binary ----------------------------------- +## 4. Profile and optimize the binary Check for any parameters or settings that obviously improve performance. For example, the djpeg utility that comes with IJG jpeg and libjpeg-turbo can be called with: +```bash -dct fast -nosmooth -onepass -dither none -scale 1/4 +``` ...and that will speed things up. There is a corresponding drop in the quality of decoded images, but it's probably not something you care about. @@ -98,134 +93,132 @@ With some laid-back parsers, enabling "strict" mode (i.e., bailing out after first error) may result in smaller files and improved run time without sacrificing coverage; for example, for sqlite, you may want to specify -bail. -If the program is still too slow, you can use strace -tt or an equivalent +If the program is still too slow, you can use `strace -tt` or an equivalent profiling tool to see if the targeted binary is doing anything silly. -Sometimes, you can speed things up simply by specifying /dev/null as the +Sometimes, you can speed things up simply by specifying `/dev/null` as the config file, or disabling some compile-time features that aren't really needed -for the job (try ./configure --help). One of the notoriously resource-consuming -things would be calling other utilities via exec*(), popen(), system(), or +for the job (try `./configure --help`). One of the notoriously resource-consuming +things would be calling other utilities via `exec*()`, `popen()`, `system()`, or equivalent calls; for example, tar can invoke external decompression tools when it decides that the input file is a compressed archive. -Some programs may also intentionally call sleep(), usleep(), or nanosleep(); -vim is a good example of that. Other programs may attempt fsync() and so on. +Some programs may also intentionally call `sleep()`, `usleep()`, or `nanosleep()`; +vim is a good example of that. Other programs may attempt `fsync()` and so on. There are third-party libraries that make it easy to get rid of such code, e.g.: https://launchpad.net/libeatmydata In programs that are slow due to unavoidable initialization overhead, you may -want to try the LLVM deferred forkserver mode (see llvm_mode/README.llvm), +want to try the LLVM deferred forkserver mode (see llvm_mode/README.md), which can give you speed gains up to 10x, as mentioned above. Last but not least, if you are using ASAN and the performance is unacceptable, consider turning it off for now, and manually examining the generated corpus with an ASAN-enabled binary later on. -5) Instrument just what you need --------------------------------- +## 5. Instrument just what you need Instrument just the libraries you actually want to stress-test right now, one at a time. Let the program use system-wide, non-instrumented libraries for any functionality you don't actually want to fuzz. For example, in most -cases, it doesn't make to instrument libgmp just because you're testing a +cases, it doesn't make to instrument `libgmp` just because you're testing a crypto app that relies on it for bignum math. Beware of programs that come with oddball third-party libraries bundled with -their source code (Spidermonkey is a good example of this). Check ./configure +their source code (Spidermonkey is a good example of this). Check `./configure` options to use non-instrumented system-wide copies instead. -6) Parallelize your fuzzers ---------------------------- +## 6. Parallelize your fuzzers The fuzzer is designed to need ~1 core per job. This means that on a, say, 4-core system, you can easily run four parallel fuzzing jobs with relatively little performance hit. For tips on how to do that, see parallel_fuzzing.md. -The afl-gotcpu utility can help you understand if you still have idle CPU +The `afl-gotcpu` utility can help you understand if you still have idle CPU capacity on your system. (It won't tell you about memory bandwidth, cache misses, or similar factors, but they are less likely to be a concern.) -7) Keep memory use and timeouts in check ----------------------------------------- +## 7. Keep memory use and timeouts in check -If you have increased the -m or -t limits more than truly necessary, consider +If you have increased the `-m` or `-t` limits more than truly necessary, consider dialing them back down. For programs that are nominally very fast, but get sluggish for some inputs, -you can also try setting -t values that are more punishing than what afl-fuzz -dares to use on its own. On fast and idle machines, going down to -t 5 may be +you can also try setting `-t` values that are more punishing than what `afl-fuzz` +dares to use on its own. On fast and idle machines, going down to `-t 5` may be a viable plan. -The -m parameter is worth looking at, too. Some programs can end up spending +The `-m` parameter is worth looking at, too. Some programs can end up spending a fair amount of time allocating and initializing megabytes of memory when -presented with pathological inputs. Low -m values can make them give up sooner +presented with pathological inputs. Low `-m` values can make them give up sooner and not waste CPU time. -8) Check OS configuration -------------------------- +## 8. Check OS configuration There are several OS-level factors that may affect fuzzing speed: - If you have no risk of power loss then run your fuzzing on a tmpfs partition. This increases the performance noticably. - Alternatively you can use AFL_TMPDIR to point to a tmpfs location to + Alternatively you can use `AFL_TMPDIR` to point to a tmpfs location to just write the input file to a tmpfs. - - High system load. Use idle machines where possible. Kill any non-essential CPU hogs (idle browser windows, media players, complex screensavers, etc). - - Network filesystems, either used for fuzzer input / output, or accessed by the fuzzed binary to read configuration files (pay special attention to the home directory - many programs search it for dot-files). - - - On-demand CPU scaling. The Linux 'ondemand' governor performs its analysis + - On-demand CPU scaling. The Linux `ondemand` governor performs its analysis on a particular schedule and is known to underestimate the needs of - short-lived processes spawned by afl-fuzz (or any other fuzzer). On Linux, + short-lived processes spawned by `afl-fuzz` (or any other fuzzer). On Linux, this can be fixed with: +``` bash cd /sys/devices/system/cpu echo performance | tee cpu*/cpufreq/scaling_governor +``` On other systems, the impact of CPU scaling will be different; when fuzzing, use OS-specific tools to find out if all cores are running at full speed. - - - Transparent huge pages. Some allocators, such as jemalloc, can incur a + - Transparent huge pages. Some allocators, such as `jemalloc`, can incur a heavy fuzzing penalty when transparent huge pages (THP) are enabled in the kernel. You can disable this via: +```bash echo never > /sys/kernel/mm/transparent_hugepage/enabled +``` - Suboptimal scheduling strategies. The significance of this will vary from one target to another, but on Linux, you may want to make sure that the following options are set: +```bash echo 1 >/proc/sys/kernel/sched_child_runs_first echo 1 >/proc/sys/kernel/sched_autogroup_enabled +``` Setting a different scheduling policy for the fuzzer process - say - SCHED_RR - can usually speed things up, too, but needs to be done with + `SCHED_RR` - can usually speed things up, too, but needs to be done with care. - - - Use the afl-system-config script to set all proc/sys settings above - + - Use the `afl-system-config` script to set all proc/sys settings above in one go. - Disable all the spectre, meltdown etc. security countermeasures in the kernel if your machine is properly separated: - "ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off - no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable - nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off - spectre_v2=off stf_barrier=off" - In most Linux distributions you can put this into a /etc/default/grub + +``` +ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off +no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable +nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off +spectre_v2=off stf_barrier=off +``` + In most Linux distributions you can put this into a `/etc/default/grub` variable. -9) If all other options fail, use -d ------------------------------------- +## 9. If all other options fail, use `-d` For programs that are genuinely slow, in cases where you really can't escape using huge input files, or when you simply want to get quick and dirty results -early on, you can always resort to the -d mode. +early on, you can always resort to the `-d` mode. -The mode causes afl-fuzz to skip all the deterministic fuzzing steps, which +The mode causes `afl-fuzz` to skip all the deterministic fuzzing steps, which makes output a lot less neat and can ultimately make the testing a bit less in-depth, but it will give you an experience more familiar from other fuzzing -tools. +tools. \ No newline at end of file diff --git a/docs/power_schedules.txt b/docs/power_schedules.md index 7b9d34c4..4026aedf 100644 --- a/docs/power_schedules.txt +++ b/docs/power_schedules.md @@ -1,8 +1,9 @@ -afl++'s power schedules based on AFLfast +# afl++'s power schedules based on AFLfast -<a href="https://comp.nus.edu.sg/~mboehme/paper/CCS16.pdf"><img src="https://comp.nus.edu.sg/~mboehme/paper/CCS16.png" align="right" width="250"></a> +<a href="https://mboehme.github.io/paper/CCS16.pdf"><img src="https://mboehme.github.io/paper/CCS16.png" align="right" width="250"></a> Power schedules implemented by Marcel Böhme \<marcel.boehme@acm.org\>. -AFLFast is an extension of AFL which was written by Michal Zalewski. +AFLFast is an extension of AFL which is written and maintained by +Michal Zalewski \<lcamtuf@google.com\>. AFLfast has helped in the success of Team Codejitsu at the finals of the DARPA Cyber Grand Challenge where their bot Galactica took **2nd place** in terms of #POVs proven (see red bar at https://www.cybergrandchallenge.com/event#results). AFLFast exposed several previously unreported CVEs that could not be exposed by AFL in 24 hours and otherwise exposed vulnerabilities significantly faster than AFL while generating orders of magnitude more unique crashes. diff --git a/docs/sister_projects.md b/docs/sister_projects.md new file mode 100644 index 00000000..ecc3b924 --- /dev/null +++ b/docs/sister_projects.md @@ -0,0 +1,318 @@ +# Sister projects + +This doc lists some of the projects that are inspired by, derived from, +designed for, or meant to integrate with AFL. See README for the general +instruction manual. + +!!! +!!! This list is outdated and needs an update, missing: e.g. Angora, FairFuzz +!!! + +## Support for other languages / environments: + +### Python AFL (Jakub Wilk) + +Allows fuzz-testing of Python programs. Uses custom instrumentation and its +own forkserver. + +http://jwilk.net/software/python-afl + +### Go-fuzz (Dmitry Vyukov) + +AFL-inspired guided fuzzing approach for Go targets: + +https://github.com/dvyukov/go-fuzz + +### afl.rs (Keegan McAllister) + +Allows Rust features to be easily fuzzed with AFL (using the LLVM mode). + +https://github.com/kmcallister/afl.rs + +### OCaml support (KC Sivaramakrishnan) + +Adds AFL-compatible instrumentation to OCaml programs. + +https://github.com/ocamllabs/opam-repo-dev/pull/23 +http://canopy.mirage.io/Posts/Fuzzing + +### AFL for GCJ Java and other GCC frontends (-) + +GCC Java programs are actually supported out of the box - simply rename +afl-gcc to afl-gcj. Unfortunately, by default, unhandled exceptions in GCJ do +not result in abort() being called, so you will need to manually add a +top-level exception handler that exits with SIGABRT or something equivalent. + +Other GCC-supported languages should be fairly easy to get working, but may +face similar problems. See https://gcc.gnu.org/frontends.html for a list of +options. + +## AFL-style in-process fuzzer for LLVM (Kostya Serebryany) + +Provides an evolutionary instrumentation-guided fuzzing harness that allows +some programs to be fuzzed without the fork / execve overhead. (Similar +functionality is now available as the "persistent" feature described in +[the llvm_mode readme](../llvm_mode/README.md)) + +http://llvm.org/docs/LibFuzzer.html + +## AFL fixup shim (Ben Nagy) + +Allows AFL_POST_LIBRARY postprocessors to be written in arbitrary languages +that don't have C / .so bindings. Includes examples in Go. + +https://github.com/bnagy/aflfix + +## TriforceAFL (Tim Newsham and Jesse Hertz) + +Leverages QEMU full system emulation mode to allow AFL to target operating +systems and other alien worlds: + +https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2016/june/project-triforce-run-afl-on-everything/ + +## WinAFL (Ivan Fratric) + +As the name implies, allows you to fuzz Windows binaries (using DynamoRio). + +https://github.com/ivanfratric/winafl + +Another Windows alternative may be: + +https://github.com/carlosgprado/BrundleFuzz/ + +## Network fuzzing + +### Preeny (Yan Shoshitaishvili) + +Provides a fairly simple way to convince dynamically linked network-centric +programs to read from a file or not fork. Not AFL-specific, but described as +useful by many users. Some assembly required. + +https://github.com/zardus/preeny + +## Distributed fuzzing and related automation + +### roving (Richo Healey) + +A client-server architecture for effortlessly orchestrating AFL runs across +a fleet of machines. You don't want to use this on systems that face the +Internet or live in other untrusted environments. + +https://github.com/richo/roving + +### Distfuzz-AFL (Martijn Bogaard) + +Simplifies the management of afl-fuzz instances on remote machines. The +author notes that the current implementation isn't secure and should not +be exposed on the Internet. + +https://github.com/MartijnB/disfuzz-afl + +### AFLDFF (quantumvm) + +A nice GUI for managing AFL jobs. + +https://github.com/quantumvm/AFLDFF + +### afl-launch (Ben Nagy) + +Batch AFL launcher utility with a simple CLI. + +https://github.com/bnagy/afl-launch + +### AFL Utils (rc0r) + +Simplifies the triage of discovered crashes, start parallel instances, etc. + +https://github.com/rc0r/afl-utils + +Another crash triage tool: + +https://github.com/floyd-fuh/afl-crash-analyzer + +### afl-fuzzing-scripts (Tobias Ospelt) + +Simplifies starting up multiple parallel AFL jobs. + +https://github.com/floyd-fuh/afl-fuzzing-scripts/ + +### afl-sid (Jacek Wielemborek) + +Allows users to more conveniently build and deploy AFL via Docker. + +https://github.com/d33tah/afl-sid + +Another Docker-related project: + +https://github.com/ozzyjohnson/docker-afl + +### afl-monitor (Paul S. Ziegler) + +Provides more detailed and versatile statistics about your running AFL jobs. + +https://github.com/reflare/afl-monitor + +### FEXM (Security in Telecommunications) + +Fully automated fuzzing framework, based on AFL + +https://github.com/fgsect/fexm + +## Crash triage, coverage analysis, and other companion tools: + +### afl-crash-analyzer (Tobias Ospelt) + +Makes it easier to navigate and annotate crashing test cases. + +https://github.com/floyd-fuh/afl-crash-analyzer/ + +### Crashwalk (Ben Nagy) + +AFL-aware tool to annotate and sort through crashing test cases. + +https://github.com/bnagy/crashwalk + +### afl-cov (Michael Rash) + +Produces human-readable coverage data based on the output queue of afl-fuzz. + +https://github.com/mrash/afl-cov + +### afl-sancov (Bhargava Shastry) + +Similar to afl-cov, but uses clang sanitizer instrumentation. + +https://github.com/bshastry/afl-sancov + +### RecidiVM (Jakub Wilk) + +Makes it easy to estimate memory usage limits when fuzzing with ASAN or MSAN. + +http://jwilk.net/software/recidivm + +### aflize (Jacek Wielemborek) + +Automatically build AFL-enabled versions of Debian packages. + +https://github.com/d33tah/aflize + +### afl-ddmin-mod (Markus Teufelberger) + +A variant of afl-tmin that uses a more sophisticated (but slower) +minimization algorithm. + +https://github.com/MarkusTeufelberger/afl-ddmin-mod + +### afl-kit (Kuang-che Wu) + +Replacements for afl-cmin and afl-tmin with additional features, such +as the ability to filter crashes based on stderr patterns. + +https://github.com/kcwu/afl-kit + +## Narrow-purpose or experimental: + +### Cygwin support (Ali Rizvi-Santiago) + +Pretty self-explanatory. As per the author, this "mostly" ports AFL to +Windows. Field reports welcome! + +https://github.com/arizvisa/afl-cygwin + +### Pause and resume scripts (Ben Nagy) + +Simple automation to suspend and resume groups of fuzzing jobs. + +https://github.com/bnagy/afl-trivia + +### Static binary-only instrumentation (Aleksandar Nikolich) + +Allows black-box binaries to be instrumented statically (i.e., by modifying +the binary ahead of the time, rather than translating it on the run). Author +reports better performance compared to QEMU, but occasional translation +errors with stripped binaries. + +https://github.com/vanhauser-thc/afl-dyninst + +### AFL PIN (Parker Thompson) + +Early-stage Intel PIN instrumentation support (from before we settled on +faster-running QEMU). + +https://github.com/mothran/aflpin + +### AFL-style instrumentation in llvm (Kostya Serebryany) + +Allows AFL-equivalent instrumentation to be injected at compiler level. +This is currently not supported by AFL as-is, but may be useful in other +projects. + +https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Coverage_counters + +### AFL JS (Han Choongwoo) + +One-off optimizations to speed up the fuzzing of JavaScriptCore (now likely +superseded by LLVM deferred forkserver init - see llvm_mode/README.llvm). + +https://github.com/tunz/afl-fuzz-js + +### AFL harness for fwknop (Michael Rash) + +An example of a fairly involved integration with AFL. + +https://github.com/mrash/fwknop/tree/master/test/afl + +### Building harnesses for DNS servers (Jonathan Foote, Ron Bowes) + +Two articles outlining the general principles and showing some example code. + +https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop +https://goo.gl/j9EgFf + +### Fuzzer shell for SQLite (Richard Hipp) + +A simple SQL shell designed specifically for fuzzing the underlying library. + +http://www.sqlite.org/src/artifact/9e7e273da2030371 + +### Support for Python mutation modules (Christian Holler) + +now integrated in AFL++, originally from here +https://github.com/choller/afl/blob/master/docs/mozilla/python_modules.txt + +### Support for selective instrumentation (Christian Holler) + +now integrated in AFL++, originally from here +https://github.com/choller/afl/blob/master/docs/mozilla/partial_instrumentation.txt + +### Syzkaller (Dmitry Vyukov) + +A similar guided approach as applied to fuzzing syscalls: + +https://github.com/google/syzkaller/wiki/Found-Bugs +https://github.com/dvyukov/linux/commit/33787098ffaaa83b8a7ccf519913ac5fd6125931 +http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf + + +### Kernel Snapshot Fuzzing using Unicornafl (Security in Telecommunications) + +https://github.com/fgsect/unicorefuzz + +### Android support (ele7enxxh) + +Based on a somewhat dated version of AFL: + +https://github.com/ele7enxxh/android-afl + +### CGI wrapper (floyd) + +Facilitates the testing of CGI scripts. + +https://github.com/floyd-fuh/afl-cgi-wrapper + +### Fuzzing difficulty estimation (Marcel Boehme) + +A fork of AFL that tries to quantify the likelihood of finding additional +paths or crashes at any point in a fuzzing job. + +https://github.com/mboehme/pythia diff --git a/docs/sister_projects.txt b/docs/sister_projects.txt deleted file mode 100644 index 25e5560c..00000000 --- a/docs/sister_projects.txt +++ /dev/null @@ -1,360 +0,0 @@ -=============== -Sister projects -=============== - - This doc lists some of the projects that are inspired by, derived from, - designed for, or meant to integrate with AFL. See README for the general - instruction manual. - -!!! -!!! This list is outdated and needs an update, missing: e.g. Angora, FairFuzz -!!! - -------------------------------------------- -Support for other languages / environments: -------------------------------------------- - -Python AFL (Jakub Wilk) ------------------------ - - Allows fuzz-testing of Python programs. Uses custom instrumentation and its - own forkserver. - - http://jwilk.net/software/python-afl - -Go-fuzz (Dmitry Vyukov) ------------------------ - - AFL-inspired guided fuzzing approach for Go targets: - - https://github.com/dvyukov/go-fuzz - -afl.rs (Keegan McAllister) --------------------------- - - Allows Rust features to be easily fuzzed with AFL (using the LLVM mode). - - https://github.com/kmcallister/afl.rs - -OCaml support (KC Sivaramakrishnan) ------------------------------------ - - Adds AFL-compatible instrumentation to OCaml programs. - - https://github.com/ocamllabs/opam-repo-dev/pull/23 - http://canopy.mirage.io/Posts/Fuzzing - -AFL for GCJ Java and other GCC frontends (-) --------------------------------------------- - - GCC Java programs are actually supported out of the box - simply rename - afl-gcc to afl-gcj. Unfortunately, by default, unhandled exceptions in GCJ do - not result in abort() being called, so you will need to manually add a - top-level exception handler that exits with SIGABRT or something equivalent. - - Other GCC-supported languages should be fairly easy to get working, but may - face similar problems. See https://gcc.gnu.org/frontends.html for a list of - options. - -AFL-style in-process fuzzer for LLVM (Kostya Serebryany) --------------------------------------------------------- - - Provides an evolutionary instrumentation-guided fuzzing harness that allows - some programs to be fuzzed without the fork / execve overhead. (Similar - functionality is now available as the "persistent" feature described in - ../llvm_mode/README.llvm.) - - http://llvm.org/docs/LibFuzzer.html - -AFL fixup shim (Ben Nagy) -------------------------- - - Allows AFL_POST_LIBRARY postprocessors to be written in arbitrary languages - that don't have C / .so bindings. Includes examples in Go. - - https://github.com/bnagy/aflfix - -TriforceAFL (Tim Newsham and Jesse Hertz) ------------------------------------------ - - Leverages QEMU full system emulation mode to allow AFL to target operating - systems and other alien worlds: - - https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2016/june/project-triforce-run-afl-on-everything/ - -WinAFL (Ivan Fratric) ---------------------- - - As the name implies, allows you to fuzz Windows binaries (using DynamoRio). - - https://github.com/ivanfratric/winafl - - Another Windows alternative may be: - - https://github.com/carlosgprado/BrundleFuzz/ - ----------------- -Network fuzzing: ----------------- - -Preeny (Yan Shoshitaishvili) ----------------------------- - - Provides a fairly simple way to convince dynamically linked network-centric - programs to read from a file or not fork. Not AFL-specific, but described as - useful by many users. Some assembly required. - - https://github.com/zardus/preeny - -------------------------------------------- -Distributed fuzzing and related automation: -------------------------------------------- - -roving (Richo Healey) ---------------------- - - A client-server architecture for effortlessly orchestrating AFL runs across - a fleet of machines. You don't want to use this on systems that face the - Internet or live in other untrusted environments. - - https://github.com/richo/roving - -Distfuzz-AFL (Martijn Bogaard) ------------------------------- - - Simplifies the management of afl-fuzz instances on remote machines. The - author notes that the current implementation isn't secure and should not - be exposed on the Internet. - - https://github.com/MartijnB/disfuzz-afl - -AFLDFF (quantumvm) ------------------- - - A nice GUI for managing AFL jobs. - - https://github.com/quantumvm/AFLDFF - -afl-launch (Ben Nagy) ---------------------- - - Batch AFL launcher utility with a simple CLI. - - https://github.com/bnagy/afl-launch - -AFL Utils (rc0r) ----------------- - - Simplifies the triage of discovered crashes, start parallel instances, etc. - - https://github.com/rc0r/afl-utils - - Another crash triage tool: - - https://github.com/floyd-fuh/afl-crash-analyzer - -afl-fuzzing-scripts (Tobias Ospelt) ------------------------------------ - - Simplifies starting up multiple parallel AFL jobs. - - https://github.com/floyd-fuh/afl-fuzzing-scripts/ - -afl-sid (Jacek Wielemborek) ---------------------------- - - Allows users to more conveniently build and deploy AFL via Docker. - - https://github.com/d33tah/afl-sid - - Another Docker-related project: - - https://github.com/ozzyjohnson/docker-afl - -afl-monitor (Paul S. Ziegler) ------------------------------ - - Provides more detailed and versatile statistics about your running AFL jobs. - - https://github.com/reflare/afl-monitor - ------------------------------------------------------------ -Crash triage, coverage analysis, and other companion tools: ------------------------------------------------------------ - -afl-crash-analyzer (Tobias Ospelt) ----------------------------------- - - Makes it easier to navigate and annotate crashing test cases. - - https://github.com/floyd-fuh/afl-crash-analyzer/ - -Crashwalk (Ben Nagy) --------------------- - - AFL-aware tool to annotate and sort through crashing test cases. - - https://github.com/bnagy/crashwalk - -afl-cov (Michael Rash) ----------------------- - - Produces human-readable coverage data based on the output queue of afl-fuzz. - - https://github.com/mrash/afl-cov - -afl-sancov (Bhargava Shastry) ------------------------------ - - Similar to afl-cov, but uses clang sanitizer instrumentation. - - https://github.com/bshastry/afl-sancov - -RecidiVM (Jakub Wilk) ---------------------- - - Makes it easy to estimate memory usage limits when fuzzing with ASAN or MSAN. - - http://jwilk.net/software/recidivm - -aflize (Jacek Wielemborek) --------------------------- - - Automatically build AFL-enabled versions of Debian packages. - - https://github.com/d33tah/aflize - -afl-ddmin-mod (Markus Teufelberger) ------------------------------------ - - A variant of afl-tmin that uses a more sophisticated (but slower) - minimization algorithm. - - https://github.com/MarkusTeufelberger/afl-ddmin-mod - -afl-kit (Kuang-che Wu) ----------------------- - - Replacements for afl-cmin and afl-tmin with additional features, such - as the ability to filter crashes based on stderr patterns. - - https://github.com/kcwu/afl-kit - -------------------------------- -Narrow-purpose or experimental: -------------------------------- - -Cygwin support (Ali Rizvi-Santiago) ------------------------------------ - - Pretty self-explanatory. As per the author, this "mostly" ports AFL to - Windows. Field reports welcome! - - https://github.com/arizvisa/afl-cygwin - -Pause and resume scripts (Ben Nagy) ------------------------------------ - - Simple automation to suspend and resume groups of fuzzing jobs. - - https://github.com/bnagy/afl-trivia - -Static binary-only instrumentation (Aleksandar Nikolich) --------------------------------------------------------- - - Allows black-box binaries to be instrumented statically (i.e., by modifying - the binary ahead of the time, rather than translating it on the run). Author - reports better performance compared to QEMU, but occasional translation - errors with stripped binaries. - - https://github.com/vanhauser-thc/afl-dyninst - -AFL PIN (Parker Thompson) -------------------------- - - Early-stage Intel PIN instrumentation support (from before we settled on - faster-running QEMU). - - https://github.com/mothran/aflpin - -AFL-style instrumentation in llvm (Kostya Serebryany) ------------------------------------------------------ - - Allows AFL-equivalent instrumentation to be injected at compiler level. - This is currently not supported by AFL as-is, but may be useful in other - projects. - - https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Coverage_counters - -AFL JS (Han Choongwoo) ----------------------- - - One-off optimizations to speed up the fuzzing of JavaScriptCore (now likely - superseded by LLVM deferred forkserver init - see llvm_mode/README.llvm). - - https://github.com/tunz/afl-fuzz-js - -AFL harness for fwknop (Michael Rash) -------------------------------------- - - An example of a fairly involved integration with AFL. - - https://github.com/mrash/fwknop/tree/master/test/afl - -Building harnesses for DNS servers (Jonathan Foote, Ron Bowes) --------------------------------------------------------------- - - Two articles outlining the general principles and showing some example code. - - https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop - https://goo.gl/j9EgFf - -Fuzzer shell for SQLite (Richard Hipp) --------------------------------------- - - A simple SQL shell designed specifically for fuzzing the underlying library. - - http://www.sqlite.org/src/artifact/9e7e273da2030371 - -Support for Python mutation modules (Christian Holler) ------------------------------------------------------- - -now integrated in AFL++, originally from here - https://github.com/choller/afl/blob/master/docs/mozilla/python_modules.txt - -Support for selective instrumentation (Christian Holler) --------------------------------------------------------- - -now integrated in AFL++, originally from here - https://github.com/choller/afl/blob/master/docs/mozilla/partial_instrumentation.txt - -Kernel fuzzing (Dmitry Vyukov) ------------------------------- - - A similar guided approach as applied to fuzzing syscalls: - - https://github.com/google/syzkaller/wiki/Found-Bugs - https://github.com/dvyukov/linux/commit/33787098ffaaa83b8a7ccf519913ac5fd6125931 - http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf - -Android support (ele7enxxh) ---------------------------- - - Based on a somewhat dated version of AFL: - - https://github.com/ele7enxxh/android-afl - -CGI wrapper (floyd) -------------------- - - Facilitates the testing of CGI scripts. - - https://github.com/floyd-fuh/afl-cgi-wrapper - -Fuzzing difficulty estimation (Marcel Boehme) ---------------------------------------------- - - A fork of AFL that tries to quantify the likelihood of finding additional - paths or crashes at any point in a fuzzing job. - - https://github.com/mboehme/pythia diff --git a/docs/status_screen.txt b/docs/status_screen.md index ef27bc76..dd726c1d 100644 --- a/docs/status_screen.txt +++ b/docs/status_screen.md @@ -1,13 +1,10 @@ -=============================== -Understanding the status screen -=============================== +# Understanding the status screen - This document provides an overview of the status screen - plus tips for - troubleshooting any warnings and red text shown in the UI. See README for - the general instruction manual. +This document provides an overview of the status screen - plus tips for +troubleshooting any warnings and red text shown in the UI. See README for +the general instruction manual. -0) A note about colors ----------------------- +## A note about colors The status screen and error messages use colors to keep things readable and attract your attention to the most important details. For example, red almost @@ -19,21 +16,18 @@ to that. If you are using inverse video, you may want to change your settings, say: - - For GNOME Terminal, go to Edit > Profile preferences, select the "colors" - tab, and from the list of built-in schemes, choose "white on black". - - - For the MacOS X Terminal app, open a new window using the "Pro" scheme via - the Shell > New Window menu (or make "Pro" your default). +- For GNOME Terminal, go to `Edit > Profile` preferences, select the "colors" tab, and from the list of built-in schemes, choose "white on black". +- For the MacOS X Terminal app, open a new window using the "Pro" scheme via the `Shell > New Window` menu (or make "Pro" your default). Alternatively, if you really like your current colors, you can edit config.h -to comment out USE_COLORS, then do 'make clean all'. +to comment out USE_COLORS, then do `make clean all`. I'm not aware of any other simple way to make this work without causing other side effects - sorry about that. With that out of the way, let's talk about what's actually on the screen... -0) The status bar +### The status bar The top line shows you which mode afl-fuzz is running in (normal: "american fuzy lop", crash exploration mode: "peruvian rabbit mode") @@ -43,15 +37,16 @@ either show the binary name being fuzzed, or the -M/-S master/slave name for parallel fuzzing. Finally, the last item is the power schedule mode being run (default: explore). -1) Process timing ------------------ +### Process timing +``` +----------------------------------------------------+ | run time : 0 days, 8 hrs, 32 min, 43 sec | | last new path : 0 days, 0 hrs, 6 min, 40 sec | | last uniq crash : none seen yet | | last uniq hang : 0 days, 1 hrs, 24 min, 32 sec | +----------------------------------------------------+ +``` This section is fairly self-explanatory: it tells you how long the fuzzer has been running and how much time has elapsed since its most recent finds. This is @@ -67,36 +62,36 @@ There's one important thing to watch out for: if the tool is not finding new paths within several minutes of starting, you're probably not invoking the target binary correctly and it never gets to parse the input files we're throwing at it; another possible explanations are that the default memory limit -(-m) is too restrictive, and the program exits after failing to allocate a +(`-m`) is too restrictive, and the program exits after failing to allocate a buffer very early on; or that the input files are patently invalid and always fail a basic header check. If there are no new paths showing up for a while, you will eventually see a big red warning in this section, too :-) -2) Overall results ------------------- +### Overall results +``` +-----------------------+ | cycles done : 0 | | total paths : 2095 | | uniq crashes : 0 | | uniq hangs : 19 | +-----------------------+ +``` -The first field in this section gives you the count of queue passes done so far -- that is, the number of times the fuzzer went over all the interesting test +The first field in this section gives you the count of queue passes done so far - that is, the number of times the fuzzer went over all the interesting test cases discovered so far, fuzzed them, and looped back to the very beginning. Every fuzzing session should be allowed to complete at least one cycle; and ideally, should run much longer than that. As noted earlier, the first pass can take a day or longer, so sit back and relax. If you want to get broader but more shallow coverage right away, try -the -d option - it gives you a more familiar experience by skipping the +the `-d` option - it gives you a more familiar experience by skipping the deterministic fuzzing steps. It is, however, inferior to the standard mode in a couple of subtle ways. -To help make the call on when to hit Ctrl-C, the cycle counter is color-coded. +To help make the call on when to hit `Ctrl-C`, the cycle counter is color-coded. It is shown in magenta during the first pass, progresses to yellow if new finds are still being made in subsequent rounds, then blue when that ends - and finally, turns green after the fuzzer hasn't been seeing any action for a @@ -105,33 +100,35 @@ longer while. The remaining fields in this part of the screen should be pretty obvious: there's the number of test cases ("paths") discovered so far, and the number of unique faults. The test cases, crashes, and hangs can be explored in real-time -by browsing the output directory, as discussed in the README. +by browsing the output directory, as discussed in README.md. -3) Cycle progress ------------------ +### Cycle progress +``` +-------------------------------------+ | now processing : 1296 (61.86%) | | paths timed out : 0 (0.00%) | +-------------------------------------+ +``` This box tells you how far along the fuzzer is with the current queue cycle: it shows the ID of the test case it is currently working on, plus the number of inputs it decided to ditch because they were persistently timing out. The "*" suffix sometimes shown in the first line means that the currently -processed path is not "favored" (a property discussed later on, in section 6). +processed path is not "favored" (a property discussed later on). If you feel that the fuzzer is progressing too slowly, see the note about the --d option in section 2 of this doc. +`-d` option in this doc. -4) Map coverage ---------------- +### Map coverage +``` +--------------------------------------+ | map density : 10.15% / 29.07% | | count coverage : 4.03 bits/tuple | +--------------------------------------+ +``` The section provides some trivia about the coverage observed by the instrumentation embedded in the target binary. @@ -148,37 +145,35 @@ Be wary of extremes: due to being linked against a non-instrumented copy of the target library); or that it is bailing out prematurely on your input test cases. The fuzzer will try to mark this in pink, just to make you aware. - - Percentages over 70% may very rarely happen with very complex programs that make heavy use of template-generated code. - Because high bitmap density makes it harder for the fuzzer to reliably discern new program states, I recommend recompiling the binary with - AFL_INST_RATIO=10 or so and trying again (see env_variables.txt). - + `AFL_INST_RATIO=10` or so and trying again (see env_variables.md). The fuzzer will flag high percentages in red. Chances are, you will never see that unless you're fuzzing extremely hairy software (say, v8, perl, ffmpeg). The other line deals with the variability in tuple hit counts seen in the binary. In essence, if every taken branch is always taken a fixed number of -times for all the inputs we have tried, this will read "1.00". As we manage +times for all the inputs we have tried, this will read `1.00`. As we manage to trigger other hit counts for every branch, the needle will start to move -toward "8.00" (every bit in the 8-bit map hit), but will probably never +toward `8.00` (every bit in the 8-bit map hit), but will probably never reach that extreme. Together, the values can be useful for comparing the coverage of several different fuzzing jobs that rely on the same instrumented binary. -5) Stage progress ------------------ +### Stage progress +``` +-------------------------------------+ | now trying : interest 32/8 | | stage execs : 3996/34.4k (11.62%) | | total execs : 27.4M | | exec speed : 891.7/sec | +-------------------------------------+ +``` This part gives you an in-depth peek at what the fuzzer is actually doing right now. It tells you about the current stage, which can be any of: @@ -186,39 +181,31 @@ now. It tells you about the current stage, which can be any of: - calibration - a pre-fuzzing stage where the execution path is examined to detect anomalies, establish baseline execution speed, and so on. Executed very briefly whenever a new find is being made. - - trim L/S - another pre-fuzzing stage where the test case is trimmed to the shortest form that still produces the same execution path. The length (L) and stepover (S) are chosen in general relationship to file size. - - bitflip L/S - deterministic bit flips. There are L bits toggled at any given time, walking the input file with S-bit increments. The current L/S variants - are: 1/1, 2/1, 4/1, 8/8, 16/8, 32/8. - + are: `1/1`, `2/1`, `4/1`, `8/8`, `16/8`, `32/8`. - arith L/8 - deterministic arithmetics. The fuzzer tries to subtract or add small integers to 8-, 16-, and 32-bit values. The stepover is always 8 bits. - - interest L/8 - deterministic value overwrite. The fuzzer has a list of known "interesting" 8-, 16-, and 32-bit values to try. The stepover is 8 bits. - - extras - deterministic injection of dictionary terms. This can be shown as "user" or "auto", depending on whether the fuzzer is using a user-supplied - dictionary (-x) or an auto-created one. You will also see "over" or "insert", + dictionary (`-x`) or an auto-created one. You will also see "over" or "insert", depending on whether the dictionary words overwrite existing data or are inserted by offsetting the remaining data to accommodate their length. - - havoc - a sort-of-fixed-length cycle with stacked random tweaks. The operations attempted during this stage include bit flips, overwrites with random and "interesting" integers, block deletion, block duplication, plus assorted dictionary-related operations (if a dictionary is supplied in the first place). - - splice - a last-resort strategy that kicks in after the first full queue cycle with no new paths. It is equivalent to 'havoc', except that it first splices together two random inputs from the queue at some arbitrarily selected midpoint. - - - sync - a stage used only when -M or -S is set (see parallel_fuzzing.md). + - sync - a stage used only when `-M` or `-S` is set (see parallel_fuzzing.md). No real fuzzing is involved, but the tool scans the output from other fuzzers and imports test cases as necessary. The first time this is done, it may take several minutes or so. @@ -234,15 +221,16 @@ The fuzzer will explicitly warn you about slow targets, too. If this happens, see the perf_tips.txt file included with the fuzzer for ideas on how to speed things up. -6) Findings in depth --------------------- +### Findings in depth +``` +--------------------------------------+ | favored paths : 879 (41.96%) | | new edges on : 423 (20.19%) | | total crashes : 0 (0 unique) | | total tmouts : 24 (19 unique) | +--------------------------------------+ +``` This gives you several metrics that are of interest mostly to complete nerds. The section includes the number of paths that the fuzzer likes the most based @@ -255,9 +243,9 @@ Note that the timeout counter is somewhat different from the hang counter; this one includes all test cases that exceeded the timeout, even if they did not exceed it by a margin sufficient to be classified as hangs. -7) Fuzzing strategy yields --------------------------- +### Fuzzing strategy yields +``` +-----------------------------------------------------+ | bit flips : 57/289k, 18/289k, 18/288k | | byte flips : 0/36.2k, 4/35.7k, 7/34.6k | @@ -267,6 +255,7 @@ exceed it by a margin sufficient to be classified as hangs. | havoc : 1903/20.0M, 0/0 | | trim : 20.31%/9201, 17.05% | +-----------------------------------------------------+ +``` This is just another nerd-targeted section keeping track of how many paths we have netted, in proportion to the number of execs attempted, for each of the @@ -280,9 +269,9 @@ goal. Finally, the third number shows the proportion of bytes that, although not possible to remove, were deemed to have no effect and were excluded from some of the more expensive deterministic fuzzing steps. -8) Path geometry ----------------- +### Path geometry +``` +---------------------+ | levels : 5 | | pending : 1570 | @@ -291,6 +280,7 @@ some of the more expensive deterministic fuzzing steps. | imported : 0 | | stability : 100.00% | +---------------------+ +``` The first field in this section tracks the path depth reached through the guided fuzzing process. In essence: the initial test cases supplied by the @@ -323,46 +313,40 @@ there are several things to look at: - The use of uninitialized memory in conjunction with some intrinsic sources of entropy in the tested binary. Harmless to AFL, but could be indicative of a security bug. - - Attempts to manipulate persistent resources, such as left over temporary files or shared memory objects. This is usually harmless, but you may want to double-check to make sure the program isn't bailing out prematurely. Running out of disk space, SHM handles, or other global resources can trigger this, too. - - Hitting some functionality that is actually designed to behave randomly. Generally harmless. For example, when fuzzing sqlite, an input like - 'select random();' will trigger a variable execution path. - + `select random();` will trigger a variable execution path. - Multiple threads executing at once in semi-random order. This is harmless when the 'stability' metric stays over 90% or so, but can become an issue if not. Here's what to try: - - - Use afl-clang-fast from llvm_mode/ - it uses a thread-local tracking + * Use afl-clang-fast from [llvm_mode](../llvm_mode/) - it uses a thread-local tracking model that is less prone to concurrency issues, - - - See if the target can be compiled or run without threads. Common - ./configure options include --without-threads, --disable-pthreads, or - --disable-openmp. - - - Replace pthreads with GNU Pth (https://www.gnu.org/software/pth/), which + * See if the target can be compiled or run without threads. Common + `./configure` options include `--without-threads`, `--disable-pthreads`, or + `--disable-openmp`. + * Replace pthreads with GNU Pth (https://www.gnu.org/software/pth/), which allows you to use a deterministic scheduler. - - In persistent mode, minor drops in the "stability" metric can be normal, because not all the code behaves identically when re-entered; but major - dips may signify that the code within __AFL_LOOP() is not behaving + dips may signify that the code within `__AFL_LOOP()` is not behaving correctly on subsequent iterations (e.g., due to incomplete clean-up or reinitialization of the state) and that most of the fuzzing effort goes to waste. The paths where variable behavior is detected are marked with a matching entry -in the <out_dir>/queue/.state/variable_behavior/ directory, so you can look +in the `<out_dir>/queue/.state/variable_behavior/` directory, so you can look them up easily. -9) CPU load ------------ +### CPU load +``` [cpu: 25%] +``` This tiny widget shows the apparent CPU utilization on the local system. It is calculated by taking the number of processes in the "runnable" state, and then @@ -380,39 +364,37 @@ are ready to run, but not how resource-hungry they may be. It also doesn't distinguish between physical cores, logical cores, and virtualized CPUs; the performance characteristics of each of these will differ quite a bit. -If you want a more accurate measurement, you can run the afl-gotcpu utility -from the command line. +If you want a more accurate measurement, you can run the `afl-gotcpu` utility from the command line. -10) Addendum: status and plot files ------------------------------------ +### Addendum: status and plot files For unattended operation, some of the key status screen information can be also found in a machine-readable format in the fuzzer_stats file in the output directory. This includes: - - start_time - unix time indicating the start time of afl-fuzz - - last_update - unix time corresponding to the last update of this file - - fuzzer_pid - PID of the fuzzer process - - cycles_done - queue cycles completed so far - - execs_done - number of execve() calls attempted - - execs_per_sec - current number of execs per second - - paths_total - total number of entries in the queue - - paths_found - number of entries discovered through local fuzzing - - paths_imported - number of entries imported from other instances - - max_depth - number of levels in the generated data set - - cur_path - currently processed entry number - - pending_favs - number of favored entries still waiting to be fuzzed - - pending_total - number of all entries waiting to be fuzzed - - stability - percentage of bitmap bytes that behave consistently - - variable_paths - number of test cases showing variable behavior - - unique_crashes - number of unique crashes recorded - - unique_hangs - number of unique hangs encountered - - command_line - full command line used for the fuzzing session - - slowest_exec_ms- real time of the slowest execution in seconds - - peak_rss_mb - max rss usage reached during fuzzing in MB + - `start_time` - unix time indicating the start time of afl-fuzz + - `last_update` - unix time corresponding to the last update of this file + - `fuzzer_pid` - PID of the fuzzer process + - `cycles_done` - queue cycles completed so far + - `execs_done` - number of execve() calls attempted + - `execs_per_sec` - current number of execs per second + - `paths_total` - total number of entries in the queue + - `paths_found` - number of entries discovered through local fuzzing + - `paths_imported` - number of entries imported from other instances + - `max_depth` - number of levels in the generated data set + - `cur_path` - currently processed entry number + - `pending_favs` - number of favored entries still waiting to be fuzzed + - `pending_total` - number of all entries waiting to be fuzzed + - `stability - percentage of bitmap bytes that behave consistently + - `variable_paths` - number of test cases showing variable behavior + - `unique_crashes` - number of unique crashes recorded + - `unique_hangs` - number of unique hangs encountered + - `command_line` - full command line used for the fuzzing session + - `slowest_exec_ms`- real time of the slowest execution in seconds + - `peak_rss_mb` - max rss usage reached during fuzzing in MB Most of these map directly to the UI elements discussed earlier on. -On top of that, you can also find an entry called 'plot_data', containing a +On top of that, you can also find an entry called `plot_data`, containing a plottable history for most of these fields. If you have gnuplot installed, you -can turn this into a nice progress report with the included 'afl-plot' tool. +can turn this into a nice progress report with the included `afl-plot` tool. diff --git a/docs/technical_details.txt b/docs/technical_details.md index 734512a2..d53b30e3 100644 --- a/docs/technical_details.txt +++ b/docs/technical_details.md @@ -1,13 +1,10 @@ -=================================== -Technical "whitepaper" for afl-fuzz -=================================== +# Technical "whitepaper" for afl-fuzz - This document provides a quick overview of the guts of American Fuzzy Lop. - See README for the general instruction manual; and for a discussion of - motivations and design goals behind AFL, see historical_notes.txt. +This document provides a quick overview of the guts of American Fuzzy Lop. +See README for the general instruction manual; and for a discussion of +motivations and design goals behind AFL, see historical_notes.md. -0) Design statement -------------------- +## 0. Design statement American Fuzzy Lop does its best not to focus on any singular principle of operation and not be a proof-of-concept for any specific theory. The tool can @@ -20,28 +17,30 @@ lightweight instrumentation that served as a foundation for the tool, but this mechanism should be thought of merely as a means to an end. The only true governing principles are speed, reliability, and ease of use. -1) Coverage measurements ------------------------- +## 1. Coverage measurements The instrumentation injected into compiled programs captures branch (edge) coverage, along with coarse branch-taken hit counts. The code injected at branch points is essentially equivalent to: +```c cur_location = <COMPILE_TIME_RANDOM>; shared_mem[cur_location ^ prev_location]++; prev_location = cur_location >> 1; +``` -The cur_location value is generated randomly to simplify the process of +The `cur_location` value is generated randomly to simplify the process of linking complex projects and keep the XOR output distributed uniformly. -The shared_mem[] array is a 64 kB SHM region passed to the instrumented binary +The `shared_mem[]` array is a 64 kB SHM region passed to the instrumented binary by the caller. Every byte set in the output map can be thought of as a hit for -a particular (branch_src, branch_dst) tuple in the instrumented code. +a particular (`branch_src`, `branch_dst`) tuple in the instrumented code. The size of the map is chosen so that collisions are sporadic with almost all of the intended targets, which usually sport between 2k and 10k discoverable branch points: +``` Branch cnt | Colliding tuples | Example targets ------------+------------------+----------------- 1,000 | 0.75% | giflib, lzo @@ -50,6 +49,7 @@ branch points: 10,000 | 7% | libxml 20,000 | 14% | sqlite 50,000 | 30% | - +``` At the same time, its size is small enough to allow the map to be analyzed in a matter of microseconds on the receiving end, and to effortlessly fit @@ -59,8 +59,10 @@ This form of coverage provides considerably more insight into the execution path of the program than simple block coverage. In particular, it trivially distinguishes between the following execution traces: +``` A -> B -> C -> D -> E (tuples: AB, BC, CD, DE) A -> B -> D -> C -> E (tuples: AB, BD, DC, CE) +``` This aids the discovery of subtle fault conditions in the underlying code, because security vulnerabilities are more often associated with unexpected @@ -75,8 +77,7 @@ The absence of simple saturating arithmetic opcodes on Intel CPUs means that the hit counters can sometimes wrap around to zero. Since this is a fairly unlikely and localized event, it's seen as an acceptable performance trade-off. -2) Detecting new behaviors --------------------------- +### 2. Detecting new behaviors The fuzzer maintains a global map of tuples seen in previous executions; this data can be rapidly compared with individual traces and updated in just a couple @@ -97,18 +98,24 @@ To illustrate the properties of the algorithm, consider that the second trace shown below would be considered substantially new because of the presence of new tuples (CA, AE): +``` #1: A -> B -> C -> D -> E #2: A -> B -> C -> A -> E +``` At the same time, with #2 processed, the following pattern will not be seen as unique, despite having a markedly different overall execution path: +``` #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E +``` In addition to detecting new tuples, the fuzzer also considers coarse tuple hit counts. These are divided into several buckets: +``` 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+ +``` To some extent, the number of buckets is an implementation artifact: it allows an in-place mapping of an 8-bit counter generated by the instrumentation to @@ -135,8 +142,7 @@ reject them and hope that the fuzzer will find a less expensive way to reach the same code. Empirical testing strongly suggests that more generous time limits are not worth the cost. -3) Evolving the input queue ---------------------------- +## 3. Evolving the input queue Mutated test cases that produced new state transitions within the program are added to the input queue and used as a starting point for future rounds of @@ -146,7 +152,7 @@ In contrast to more greedy genetic algorithms, this approach allows the tool to progressively explore various disjoint and possibly mutually incompatible features of the underlying data format, as shown in this image: - http://lcamtuf.coredump.cx/afl/afl_gzip.png +  Several practical examples of the results of this algorithm are discussed here: @@ -165,10 +171,11 @@ of new tuples, and the remainder is associated with changes in hit counts. The following table compares the relative ability to discover file syntax and explore program states when using several different approaches to guided -fuzzing. The instrumented target was GNU patch 2.7k.3 compiled with -O3 and +fuzzing. The instrumented target was GNU patch 2.7k.3 compiled with `-O3` and seeded with a dummy text file; the session consisted of a single pass over the input queue with afl-fuzz: +``` Fuzzer guidance | Blocks | Edges | Edge hit | Highest-coverage strategy used | reached | reached | cnt var | test case generated ------------------+---------+---------+----------+--------------------------- @@ -179,6 +186,7 @@ input queue with afl-fuzz: Block coverage | 855 | 1,130 | 1.57 | Almost-valid RCS diff Edge coverage | 1,452 | 2,070 | 2.18 | One-chunk -c mode diff AFL model | 1,765 | 2,597 | 4.99 | Four-chunk -c mode diff +``` The first entry for blind fuzzing ("S") corresponds to executing just a single round of testing; the second set of figures ("L") shows the fuzzer running in a @@ -191,6 +199,7 @@ a series of rudimentary, sequential operations such as walking bit flips. Because this mode would be incapable of altering the size of the input file, the sessions were seeded with a valid unified diff: +``` Queue extension | Blocks | Edges | Edge hit | Number of unique strategy used | reached | reached | cnt var | crashes found ------------------+---------+---------+----------+------------------ @@ -200,14 +209,14 @@ the sessions were seeded with a valid unified diff: Block coverage | 1,255 | 1,649 | 1.48 | 0 Edge coverage | 1,259 | 1,734 | 1.72 | 0 AFL model | 1,452 | 2,040 | 3.16 | 1 +``` At noted earlier on, some of the prior work on genetic fuzzing relied on maintaining a single test case and evolving it to maximize coverage. At least in the tests described above, this "greedy" approach appears to confer no substantial benefits over blind fuzzing strategies. -4) Culling the corpus ---------------------- +### 4. Culling the corpus The progressive state exploration approach outlined above means that some of the test cases synthesized later on in the game may have edge coverage that @@ -225,11 +234,8 @@ for each tuple. The tuples are then processed sequentially using a simple workflow: 1) Find next tuple not yet in the temporary working set, - 2) Locate the winning queue entry for this tuple, - 3) Register *all* tuples present in that entry's trace in the working set, - 4) Go to #1 if there are any missing tuples in the set. The generated corpus of "favored" entries is usually 5-10x smaller than the @@ -238,30 +244,26 @@ with varying probabilities when encountered in the queue: - If there are new, yet-to-be-fuzzed favorites present in the queue, 99% of non-favored entries will be skipped to get to the favored ones. - - If there are no new favorites: - - - If the current non-favored entry was fuzzed before, it will be skipped + * If the current non-favored entry was fuzzed before, it will be skipped 95% of the time. - - - If it hasn't gone through any fuzzing rounds yet, the odds of skipping + * If it hasn't gone through any fuzzing rounds yet, the odds of skipping drop down to 75%. Based on empirical testing, this provides a reasonable balance between queue cycling speed and test case diversity. Slightly more sophisticated but much slower culling can be performed on input -or output corpora with afl-cmin. This tool permanently discards the redundant -entries and produces a smaller corpus suitable for use with afl-fuzz or +or output corpora with `afl-cmin`. This tool permanently discards the redundant +entries and produces a smaller corpus suitable for use with `afl-fuzz` or external tools. -5) Trimming input files ------------------------ +## 5. Trimming input files File size has a dramatic impact on fuzzing performance, both because large files make the target binary slower, and because they reduce the likelihood that a mutation would touch important format control structures, rather than -redundant data blocks. This is discussed in more detail in perf_tips.txt. +redundant data blocks. This is discussed in more detail in perf_tips.md. The possibility that the user will provide a low-quality starting corpus aside, some types of mutations can have the effect of iteratively increasing the size @@ -275,12 +277,12 @@ The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data with variable length and stepover; any deletion that doesn't affect the checksum of the trace map is committed to disk. The trimmer is not designed to be particularly thorough; instead, it tries to strike a balance between precision -and the number of execve() calls spent on the process, selecting the block size +and the number of `execve()` calls spent on the process, selecting the block size and stepover to match. The average per-file gains are around 5-20%. -The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and +The standalone `afl-tmin` tool uses a more exhaustive, iterative algorithm, and also attempts to perform alphabet normalization on the trimmed files. The -operation of afl-tmin is as follows. +operation of `afl-tmin` is as follows. First, the tool automatically selects the operating mode. If the initial input crashes the target binary, afl-tmin will run in non-instrumented mode, simply @@ -293,16 +295,13 @@ The actual minimization algorithm is: 1) Attempt to zero large blocks of data with large stepovers. Empirically, this is shown to reduce the number of execs by preempting finer-grained efforts later on. - 2) Perform a block deletion pass with decreasing block sizes and stepovers, binary-search-style. - 3) Perform alphabet normalization by counting unique characters and trying to bulk-replace each with a zero value. - 4) As a last result, perform byte-by-byte normalization on non-zero bytes. -Instead of zeroing with a 0x00 byte, afl-tmin uses the ASCII digit '0'. This +Instead of zeroing with a 0x00 byte, `afl-tmin` uses the ASCII digit '0'. This is done because such a modification is much less likely to interfere with text parsing, so it is more likely to result in successful minimization of text files. @@ -312,8 +311,7 @@ minimization approaches proposed in academic work, but requires far fewer executions and tends to produce comparable results in most real-world applications. -6) Fuzzing strategies ---------------------- +## 6. Fuzzing strategies The feedback provided by the instrumentation makes it easy to understand the value of various fuzzing strategies and optimize their parameters so that they @@ -323,15 +321,13 @@ afl-fuzz are generally format-agnostic and are discussed in more detail here: http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html It is somewhat notable that especially early on, most of the work done by -afl-fuzz is actually highly deterministic, and progresses to random stacked +`afl-fuzz` is actually highly deterministic, and progresses to random stacked modifications and test case splicing only at a later stage. The deterministic strategies include: - Sequential bit flips with varying lengths and stepovers, - - Sequential addition and subtraction of small integers, - - - Sequential insertion of known interesting integers (0, 1, INT_MAX, etc), + - Sequential insertion of known interesting integers (`0`, `1`, `INT_MAX`, etc), The purpose of opening with deterministic steps is related to their tendency to produce compact test cases and small diffs between the non-crashing and crashing @@ -341,10 +337,10 @@ With deterministic fuzzing out of the way, the non-deterministic steps include stacked bit flips, insertions, deletions, arithmetics, and splicing of different test cases. -The relative yields and execve() costs of all these strategies have been +The relative yields and `execve()` costs of all these strategies have been investigated and are discussed in the aforementioned blog post. -For the reasons discussed in historical_notes.txt (chiefly, performance, +For the reasons discussed in historical_notes.md (chiefly, performance, simplicity, and reliability), AFL generally does not try to reason about the relationship between specific mutations and program states; the fuzzing steps are nominally blind, and are guided only by the evolutionary design of the @@ -365,8 +361,7 @@ in force only during deterministic stages that do not alter the size or the general layout of the underlying file, this mechanism appears to work very reliably and proved to be simple to implement. -7) Dictionaries ---------------- +## 7. Dictionaries The feedback provided by the instrumentation makes it easy to automatically identify syntax tokens in some types of input files, and to detect that certain @@ -398,31 +393,28 @@ to a predefined value baked into the code. The fuzzer relies on this signal to build compact "auto dictionaries" that are then used in conjunction with other fuzzing strategies. -8) De-duping crashes --------------------- +## 8. De-duping crashes De-duplication of crashes is one of the more important problems for any competent fuzzing tool. Many of the naive approaches run into problems; in particular, looking just at the faulting address may lead to completely unrelated issues being clustered together if the fault happens in a common -library function (say, strcmp, strcpy); while checksumming call stack +library function (say, `strcmp`, `strcpy`); while checksumming call stack backtraces can lead to extreme crash count inflation if the fault can be reached through a number of different, possibly recursive code paths. -The solution implemented in afl-fuzz considers a crash unique if any of two +The solution implemented in `afl-fuzz` considers a crash unique if any of two conditions are met: - The crash trace includes a tuple not seen in any of the previous crashes, - - The crash trace is missing a tuple that was always present in earlier faults. The approach is vulnerable to some path count inflation early on, but exhibits a very strong self-limiting effect, similar to the execution path analysis -logic that is the cornerstone of afl-fuzz. +logic that is the cornerstone of `afl-fuzz`. -9) Investigating crashes ------------------------- +## 9. Investigating crashes The exploitability of many types of crashes can be ambiguous; afl-fuzz tries to address this by providing a crash exploration mode where a known-faulting @@ -441,13 +433,12 @@ newly-found inputs for human review. On the subject of crashes, it is worth noting that in contrast to normal queue entries, crashing inputs are *not* trimmed; they are kept exactly as discovered to make it easier to compare them to the parent, non-crashing entry -in the queue. That said, afl-tmin can be used to shrink them at will. +in the queue. That said, `afl-tmin` can be used to shrink them at will. -10) The fork server -------------------- +## 10 The fork server -To improve performance, afl-fuzz uses a "fork server", where the fuzzed process -goes through execve(), linking, and libc initialization only once, and is then +To improve performance, `afl-fuzz` uses a "fork server", where the fuzzed process +goes through `execve()`, linking, and libc initialization only once, and is then cloned from a stopped process image by leveraging copy-on-write. The implementation is described in more detail here: @@ -455,7 +446,7 @@ implementation is described in more detail here: The fork server is an integral aspect of the injected instrumentation and simply stops at the first instrumented function to await commands from -afl-fuzz. +`afl-fuzz`. With fast targets, the fork server can offer considerable performance gains, usually between 1.5x and 2x. It is also possible to: @@ -464,17 +455,14 @@ usually between 1.5x and 2x. It is also possible to: user-selected chunks of initialization code. It requires very modest code changes to the targeted program, and With some targets, can produce 10x+ performance gains. - - Enable "persistent" mode, where a single process is used to try out - multiple inputs, greatly limiting the overhead of repetitive fork() + multiple inputs, greatly limiting the overhead of repetitive `fork()` calls. This generally requires some code changes to the targeted program, - but can improve the performance of fast targets by a factor of 5 or more - - approximating the benefits of in-process fuzzing jobs while still + but can improve the performance of fast targets by a factor of 5 or more - approximating the benefits of in-process fuzzing jobs while still maintaining very robust isolation between the fuzzer process and the targeted binary. -11) Parallelization -------------------- +## 11. Parallelization The parallelization mechanism relies on periodically examining the queues produced by independently-running instances on other CPU cores or on remote @@ -487,8 +475,7 @@ synergistic effects. For more information about this design, see parallel_fuzzing.md. -12) Binary-only instrumentation -------------------------------- +## 12. Binary-only instrumentation Instrumentation of black-box, binary-only targets is accomplished with the help of a separately-built version of QEMU in "user emulation" mode. This also @@ -497,6 +484,7 @@ allows the execution of cross-architecture code - say, ARM binaries on x86. QEMU uses basic blocks as translation units; the instrumentation is implemented on top of this and uses a model roughly analogous to the compile-time hooks: +```c if (block_address > elf_text_start && block_address < elf_text_end) { cur_location = (block_address >> 4) ^ (block_address << 8); @@ -504,6 +492,7 @@ on top of this and uses a model roughly analogous to the compile-time hooks: prev_location = cur_location >> 1; } +``` The shift-and-XOR-based scrambling in the second line is used to mask the effects of instruction alignment. @@ -511,7 +500,7 @@ effects of instruction alignment. The start-up of binary translators such as QEMU, DynamoRIO, and PIN is fairly slow; to counter this, the QEMU mode leverages a fork server similar to that used for compiler-instrumented code, effectively spawning copies of an -already-initialized process paused at _start. +already-initialized process paused at `_start`. First-time translation of a new basic block also incurs substantial latency. To eliminate this problem, the AFL fork server is extended by providing a channel @@ -523,8 +512,7 @@ processes. As a result of these two optimizations, the overhead of the QEMU mode is roughly 2-5x, compared to 100x+ for PIN. -13) The afl-analyze tool ------------------------- +## 13. The `afl-analyze` tool The file format analyzer is a simple extension of the minimization algorithm discussed earlier on; instead of attempting to remove no-op blocks, the tool @@ -536,28 +524,22 @@ It uses the following classification scheme: - "No-op blocks" - segments where bit flips cause no apparent changes to control flow. Common examples may be comment sections, pixel data within a bitmap file, etc. - - "Superficial content" - segments where some, but not all, bitflips produce some control flow changes. Examples may include strings in rich documents (e.g., XML, RTF). - - "Critical stream" - a sequence of bytes where all bit flips alter control flow in different but correlated ways. This may be compressed data, non-atomically compared keywords or magic values, etc. - - "Suspected length field" - small, atomic integer that, when touched in any way, causes a consistent change to program control flow, suggestive of a failed length check. - - "Suspected cksum or magic int" - an integer that behaves similarly to a length field, but has a numerical value that makes the length explanation unlikely. This is suggestive of a checksum or other "magic" integer. - - "Suspected checksummed block" - a long block of data where any change always triggers the same new execution path. Likely caused by failing a checksum or a similar integrity check before any subsequent parsing takes place. - - "Magic value section" - a generic token where changes cause the type of binary behavior outlined earlier, but that doesn't meet any of the - other criteria. May be an atomically compared keyword or so. + other criteria. May be an atomically compared keyword or so. \ No newline at end of file |