From e66402485342088e6fcaecfe2abbba291a48bda5 Mon Sep 17 00:00:00 2001 From: van Hauser Date: Sun, 14 Jul 2019 10:50:13 +0200 Subject: whitelist features works now --- llvm_mode/README.llvm | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) (limited to 'llvm_mode/README.llvm') diff --git a/llvm_mode/README.llvm b/llvm_mode/README.llvm index dc860e97..b4e05a7a 100644 --- a/llvm_mode/README.llvm +++ b/llvm_mode/README.llvm @@ -38,8 +38,8 @@ co-exists with the original code. The idea and much of the implementation comes from Laszlo Szekeres. -2) How to use -------------- +2) How to use this +------------------ In order to leverage this mechanism, you need to have clang installed on your system. You should also make sure that the llvm-config tool is in your path @@ -69,8 +69,10 @@ operating mode of AFL, e.g.: Be sure to also include CXX set to afl-clang-fast++ for C++ code. The tool honors roughly the same environmental variables as afl-gcc (see -../docs/env_variables.txt). This includes AFL_INST_RATIO, AFL_USE_ASAN, -AFL_HARDEN, and AFL_DONT_OPTIMIZE. +../docs/env_variables.txt). This includes AFL_USE_ASAN, +AFL_HARDEN, and AFL_DONT_OPTIMIZE. However AFL_INST_RATIO is not honored +as it does not serve a good purpose with the more effective instrim CFG +analysis. 
Note: if you want the LLVM helper to be installed on your system for all users, you need to build it before issuing 'make install' in the parent -- cgit 1.4.1 From 013a1731d590eaa1f3e4c58c69985f89b7a3d2f9 Mon Sep 17 00:00:00 2001 From: van Hauser Date: Sun, 14 Jul 2019 19:48:28 +0200 Subject: set instrim as default and updated documentation --- docs/env_variables.txt | 13 ++++++++----- llvm_mode/LLVMInsTrim.so.cc | 6 +++--- llvm_mode/README.llvm | 27 +++++++++++++++++++++++---- 3 files changed, 34 insertions(+), 12 deletions(-) (limited to 'llvm_mode/README.llvm') diff --git a/docs/env_variables.txt b/docs/env_variables.txt index d854ea8d..8e2723d7 100644 --- a/docs/env_variables.txt +++ b/docs/env_variables.txt @@ -82,6 +82,9 @@ discussed in section #1, with the exception of: - TMPDIR and AFL_KEEP_ASSEMBLY, since no temporary assembly files are created. + - AFL_INST_RATIO, as we switched for instrim instrumentation which + is more effective but makes not much sense together with this option. + Then there are a few specific features that are only available in llvm_mode: LAF-INTEL @@ -108,16 +111,16 @@ Then there are a few specific features that are only available in llvm_mode: OTHER ===== - - Setting export AFL_LLVM_NOT_ZERO=1 during compilation will use counters + - Setting LOOPHEAD=1 optimized loops. afl-fuzz will only be able to + see the path the loop took, but not how many times it was called + (unless its a complex loop). + + - Setting AFL_LLVM_NOT_ZERO=1 during compilation will use counters that skip zero on overflow. This is the default for llvm >= 9, however for llvm versions below that this will increase an unnecessary slowdown due a performance issue that is only fixed in llvm 9+. This feature increases path discovery by a little bit. -Note that AFL_INST_RATIO will behave a bit differently than for afl-gcc, -because functions are *not* instrumented unconditionally - so low values -will have a more striking effect. For this tool, 0 is not a valid choice. 
- 3) Settings for afl-fuzz ------------------------ diff --git a/llvm_mode/LLVMInsTrim.so.cc b/llvm_mode/LLVMInsTrim.so.cc index 51640870..8e9f7667 100644 --- a/llvm_mode/LLVMInsTrim.so.cc +++ b/llvm_mode/LLVMInsTrim.so.cc @@ -98,10 +98,10 @@ namespace { if (getenv("LOOPHEAD")) { LoopHeadOpt = true; - MarkSetOpt = true; - } else if (getenv("MARKSET")) { - MarkSetOpt = true; } + + // this is our default + MarkSetOpt = true; /* // I dont think this makes sense to port into LLVMInsTrim char* inst_ratio_str = getenv("AFL_INST_RATIO"); diff --git a/llvm_mode/README.llvm b/llvm_mode/README.llvm index b4e05a7a..77c406f8 100644 --- a/llvm_mode/README.llvm +++ b/llvm_mode/README.llvm @@ -78,13 +78,32 @@ Note: if you want the LLVM helper to be installed on your system for all users, you need to build it before issuing 'make install' in the parent directory. -3) Gotchas, feedback, bugs +3) Options + +Several options are present to make llvm_mode faster or help it rearrange +the code to make afl-fuzz path discovery easier. + +If you need just to instrument specific parts of the code, you can whitelist +which C/C++ files to actually intrument. See README.whitelist + +For splitting memcmp, strncmp, etc. please see README.laf-intel + +As the original afl llvm_mode implementation has been replaced with +then much more effective instrim (https://github.com/csienslab/instrim/) +there is an option for optimizing loops. This optimization shows which +part of the loop has been selected, but not how many time a loop has been +called in a row (unless its a complex loop and a block inside was +instrumented). If you want to enable this set the environment variable +LOOPHEAD=1 + + +4) Gotchas, feedback, bugs -------------------------- This is an early-stage mechanism, so field reports are welcome. You can send bug reports to . 
-4) Bonus feature #1: deferred instrumentation +5) Bonus feature #1: deferred instrumentation --------------------------------------------- AFL tries to optimize performance by executing the targeted binary just once, @@ -131,7 +150,7 @@ will keep working normally when compiled with a tool other than afl-clang-fast. Finally, recompile the program with afl-clang-fast (afl-gcc or afl-clang will *not* generate a deferred-initialization binary) - and you should be all set! -5) Bonus feature #2: persistent mode +6) Bonus feature #2: persistent mode ------------------------------------ Some libraries provide APIs that are stateless, or whose state can be reset in @@ -171,7 +190,7 @@ PS. Because there are task switches still involved, the mode isn't as fast as faster than the normal fork() model, and compared to in-process fuzzing, should be a lot more robust. -6) Bonus feature #3: new 'trace-pc-guard' mode +8) Bonus feature #3: new 'trace-pc-guard' mode ---------------------------------------------- Recent versions of LLVM are shipping with a built-in execution tracing feature -- cgit 1.4.1 From 32525238238e96ec0ce64a36f70558f76bc90ff5 Mon Sep 17 00:00:00 2001 From: van Hauser Date: Mon, 15 Jul 2019 11:22:54 +0200 Subject: fixing commit fuckup --- Makefile | 2 +- docs/ChangeLog | 8 +++----- docs/README | 3 +-- docs/env_variables.txt | 22 +++++++++++++++++----- llvm_mode/LLVMInsTrim.so.cc | 2 +- llvm_mode/Makefile | 13 ++++++++----- llvm_mode/README.llvm | 19 ++++++++++++------- llvm_mode/afl-clang-fast.c | 16 +++++++++------- 8 files changed, 52 insertions(+), 33 deletions(-) (limited to 'llvm_mode/README.llvm') diff --git a/Makefile b/Makefile index 60dfde18..6b580381 100644 --- a/Makefile +++ b/Makefile @@ -194,7 +194,7 @@ install: all rm -f $${DESTDIR}$(BIN_PATH)/afl-as if [ -f afl-qemu-trace ]; then install -m 755 afl-qemu-trace $${DESTDIR}$(BIN_PATH); fi ifndef AFL_TRACE_PC - if [ -f afl-clang-fast -a -f libLLVMInsTrim.so -a -f afl-llvm-rt.o ]; then set -e; install 
-m 755 afl-clang-fast $${DESTDIR}$(BIN_PATH); ln -sf afl-clang-fast $${DESTDIR}$(BIN_PATH)/afl-clang-fast++; install -m 755 libLLVMInsTrim.so afl-llvm-rt.o $${DESTDIR}$(HELPER_PATH); fi + if [ -f afl-clang-fast -a -f libLLVMInsTrim.so -a -f afl-llvm-rt.o ]; then set -e; install -m 755 afl-clang-fast $${DESTDIR}$(BIN_PATH); ln -sf afl-clang-fast $${DESTDIR}$(BIN_PATH)/afl-clang-fast++; install -m 755 libLLVMInsTrim.so afl-llvm-pass.so afl-llvm-rt.o $${DESTDIR}$(HELPER_PATH); fi else if [ -f afl-clang-fast -a -f afl-llvm-rt.o ]; then set -e; install -m 755 afl-clang-fast $${DESTDIR}$(BIN_PATH); ln -sf afl-clang-fast $${DESTDIR}$(BIN_PATH)/afl-clang-fast++; install -m 755 afl-llvm-rt.o $${DESTDIR}$(HELPER_PATH); fi endif diff --git a/docs/ChangeLog b/docs/ChangeLog index 9cdca49b..116029ea 100644 --- a/docs/ChangeLog +++ b/docs/ChangeLog @@ -17,9 +17,9 @@ sending a mail to . Version ++2.52d (tbd): ----------------------------- - - added instrim a much better llvm_mode instrumentation - (https://github.com/csienslab/instrim) - - added MOpt (github.com/puppet-meteor/MOpt-AFL) mode + - added instrim, a much faster llvm_mode instrumentation at the cost of + path discovery. See llvm_mode/README.instrim (https://github.com/csienslab/instrim) + - added MOpt (github.com/puppet-meteor/MOpt-AFL) mode, see docs/README.MOpt - added code to make it more portable to other platforms than Intel Linux - added never zero counters for afl-gcc and optional (because of an optimization issue in llvm < 9) for llvm_mode (AFL_LLVM_NEVER_ZERO=1) @@ -41,8 +41,6 @@ Version ++2.52d (tbd): tests as the random numbers are deterministic then - llvm_mode LAF_... env variables can now be specified as AFL_LLVM_LAF_... that is longer but in line with other llvm specific env vars - - ... your idea or patch? 
- ----------------------------- diff --git a/docs/README b/docs/README index 54e3e4a4..3a6c2921 100644 --- a/docs/README +++ b/docs/README @@ -23,8 +23,7 @@ american fuzzy lop plus plus https://github.com/puppet-meteor/MOpt-AFL Also newly integrated is instrim, a very effective CFG llvm_mode - instrumentation implementation which replaced the original afl one and is - from https://github.com/csienslab/instrim + instrumentation implementation from https://github.com/csienslab/instrim A more thorough list is available in the PATCHES file. diff --git a/docs/env_variables.txt b/docs/env_variables.txt index 8e2723d7..e58327b4 100644 --- a/docs/env_variables.txt +++ b/docs/env_variables.txt @@ -109,11 +109,21 @@ Then there are a few specific features that are only available in llvm_mode: See llvm_mode/README.whitelist for more information. - OTHER - ===== - - Setting LOOPHEAD=1 optimized loops. afl-fuzz will only be able to - see the path the loop took, but not how many times it was called - (unless its a complex loop). + INSTRIM + ======= + This feature increases the speed by whopping 20% but at the cost of a + lower path discovery and thefore coverage. + + - Setting AFL_LLVM_INSTRIM activates this mode + + - Setting AFL_LLVM_INSTRIM LOOPHEAD=1 expands on INSTRIM to optimize loops. + afl-fuzz will only be able to see the path the loop took, but not how + many times it was called (unless its a complex loop). + + See llvm_mode/README.instrim + + NOT_ZERO + ======== - Setting AFL_LLVM_NOT_ZERO=1 during compilation will use counters that skip zero on overflow. This is the default for llvm >= 9, @@ -121,6 +131,8 @@ Then there are a few specific features that are only available in llvm_mode: slowdown due a performance issue that is only fixed in llvm 9+. This feature increases path discovery by a little bit. 
+ See llvm_mode/README.neverzero + 3) Settings for afl-fuzz ------------------------ diff --git a/llvm_mode/LLVMInsTrim.so.cc b/llvm_mode/LLVMInsTrim.so.cc index 8e9f7667..81cf98c4 100644 --- a/llvm_mode/LLVMInsTrim.so.cc +++ b/llvm_mode/LLVMInsTrim.so.cc @@ -96,7 +96,7 @@ namespace { OKF("LLVM neverZero activated (by hexcoder)\n"); #endif - if (getenv("LOOPHEAD")) { + if (getenv("AFL_LLVM_INSTRIM_LOOPHEAD") != NULL || getenv("LOOPHEAD") != NULL) { LoopHeadOpt = true; } diff --git a/llvm_mode/Makefile b/llvm_mode/Makefile index d0d4b690..2b685ddc 100644 --- a/llvm_mode/Makefile +++ b/llvm_mode/Makefile @@ -94,7 +94,7 @@ endif ifndef AFL_TRACE_PC - PROGS = ../afl-clang-fast ../libLLVMInsTrim.so ../afl-llvm-rt.o ../afl-llvm-rt-32.o ../afl-llvm-rt-64.o ../compare-transform-pass.so ../split-compares-pass.so ../split-switches-pass.so + PROGS = ../afl-clang-fast ../afl-llvm-pass.so ../libLLVMInsTrim.so ../afl-llvm-rt.o ../afl-llvm-rt-32.o ../afl-llvm-rt-64.o ../compare-transform-pass.so ../split-compares-pass.so ../split-switches-pass.so else PROGS = ../afl-clang-fast ../afl-llvm-rt.o ../afl-llvm-rt-32.o ../afl-llvm-rt-64.o ../compare-transform-pass.so ../split-compares-pass.so ../split-switches-pass.so endif @@ -104,7 +104,7 @@ ifneq "$(CLANGVER)" "$(LLVMVER)" CXX = $(shell llvm-config --bindir)/clang++ endif -all: test_deps test_shm $(PROGS) test_build all_done +all: test_shm test_deps $(PROGS) test_build all_done ifeq "$(SHMAT_OK)" "1" @@ -132,10 +132,10 @@ endif @which $(CC) >/dev/null 2>&1 || ( echo "[-] Oops, can't find '$(CC)'. Make sure that it's in your \$$PATH (or set \$$CC and \$$CXX)."; exit 1 ) @echo "[*] Checking for matching versions of '$(CC)' and '$(LLVM_CONFIG)'" ifneq "$(CLANGVER)" "$(LLVMVER)" - @echo "WARNING: we have llvm-config version $(LLVMVER) and a clang version $(CLANGVER)" - @echo "Retrying with the clang compiler from llvm: CC=`llvm-config --bindir`/clang" + @echo "[!] 
WARNING: we have llvm-config version $(LLVMVER) and a clang version $(CLANGVER)" + @echo "[!] Retrying with the clang compiler from llvm: CC=`llvm-config --bindir`/clang" else - @echo "we have llvm-config version $(LLVMVER) with a clang version $(CLANGVER), good." + @echo "[*] We have llvm-config version $(LLVMVER) with a clang version $(CLANGVER), good." endif @echo "[*] Checking for '../afl-showmap'..." @test -f ../afl-showmap || ( echo "[-] Oops, can't find '../afl-showmap'. Be sure to compile AFL first."; exit 1 ) @@ -148,6 +148,9 @@ endif ../libLLVMInsTrim.so: LLVMInsTrim.so.cc MarkNodes.cc | test_deps $(CXX) $(CLANG_CFL) -DLLVMInsTrim_EXPORTS -fno-rtti -fPIC -std=gnu++11 -shared $< MarkNodes.cc -o $@ $(CLANG_LFL) +../afl-llvm-pass.so: afl-llvm-pass.so.cc | test_deps + $(CXX) $(CLANG_CFL) -DLLVMInsTrim_EXPORTS -fno-rtti -fPIC -std=gnu++11 -shared $< -o $@ $(CLANG_LFL) + # laf ../split-switches-pass.so: split-switches-pass.so.cc | test_deps $(CXX) $(CLANG_CFL) -shared $< -o $@ $(CLANG_LFL) diff --git a/llvm_mode/README.llvm b/llvm_mode/README.llvm index 77c406f8..779ff47c 100644 --- a/llvm_mode/README.llvm +++ b/llvm_mode/README.llvm @@ -88,13 +88,18 @@ which C/C++ files to actually intrument. See README.whitelist For splitting memcmp, strncmp, etc. please see README.laf-intel -As the original afl llvm_mode implementation has been replaced with -then much more effective instrim (https://github.com/csienslab/instrim/) -there is an option for optimizing loops. This optimization shows which -part of the loop has been selected, but not how many time a loop has been -called in a row (unless its a complex loop and a block inside was -instrumented). If you want to enable this set the environment variable -LOOPHEAD=1 +Then there is an optimized instrumentation strategy that uses CFGs and +markers to just instrument what is needed. This increases speed by 20-25% +however has a lower path discovery. 
+If you want to use this, set AFL_LLVM_INSTRIM=1 +See README.instrim + +Finally if your llvm version is 8 or lower, you can activate a mode that +prevents that a counter overflow result in a 0 value. This is good for +path discovery, but the llvm implementation for intel for this functionality +is not optimal and was only fixed in llvm 9. +You can set this with AFL_LLVM_NOT_ZERO=1 +See README.neverzero 4) Gotchas, feedback, bugs diff --git a/llvm_mode/afl-clang-fast.c b/llvm_mode/afl-clang-fast.c index 249eea7d..19bad86c 100644 --- a/llvm_mode/afl-clang-fast.c +++ b/llvm_mode/afl-clang-fast.c @@ -88,7 +88,7 @@ static void find_obj(u8* argv0) { return; } - FATAL("Unable to find 'afl-llvm-rt.o' or 'libLLVMInsTrim.so'. Please set AFL_PATH"); + FATAL("Unable to find 'afl-llvm-rt.o' or 'afl-llvm-pass.so.cc'. Please set AFL_PATH"); } @@ -113,11 +113,11 @@ static void edit_params(u32 argc, char** argv) { cc_params[0] = alt_cc ? alt_cc : (u8*)"clang"; } - /* There are two ways to compile afl-clang-fast. In the traditional mode, we - use libLLVMInsTrim.so to inject instrumentation. In the experimental + /* There are three ways to compile with afl-clang-fast. In the traditional + mode, we use afl-llvm-pass.so, then there is libLLVMInsTrim.so which is + much faster but has less coverage. Finally tere is the experimental 'trace-pc-guard' mode, we use native LLVM instrumentation callbacks - instead. The latter is a very recent addition - see: - + instead. 
For trace-pc-guard see: http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards */ // laf @@ -151,8 +151,10 @@ static void edit_params(u32 argc, char** argv) { cc_params[cc_par_cnt++] = "-Xclang"; cc_params[cc_par_cnt++] = "-load"; cc_params[cc_par_cnt++] = "-Xclang"; - cc_params[cc_par_cnt++] = alloc_printf("%s/libLLVMInsTrim.so", obj_path); -// cc_params[cc_par_cnt++] = alloc_printf("%s/afl-llvm-pass.so", obj_path); + if (getenv("AFL_LLVM_INSTRIM") != NULL || getenv("INSTRIM_LIB") != NULL) + cc_params[cc_par_cnt++] = alloc_printf("%s/libLLVMInsTrim.so", obj_path); + else + cc_params[cc_par_cnt++] = alloc_printf("%s/afl-llvm-pass.so", obj_path); #endif /* ^USE_TRACE_PC */ cc_params[cc_par_cnt++] = "-Qunused-arguments"; -- cgit 1.4.1 From 995eb0cd7972e2179ea9fe727d3c89d0b552c111 Mon Sep 17 00:00:00 2001 From: van Hauser Date: Tue, 16 Jul 2019 08:51:00 +0200 Subject: deprecate afl-gcc --- afl-gcc.c | 2 ++ docs/ChangeLog | 4 +++- llvm_mode/README.llvm | 3 ++- types.h | 9 +++++++-- 4 files changed, 14 insertions(+), 4 deletions(-) (limited to 'llvm_mode/README.llvm') diff --git a/afl-gcc.c b/afl-gcc.c index 467a9bc1..2e3c4f76 100644 --- a/afl-gcc.c +++ b/afl-gcc.c @@ -311,6 +311,8 @@ int main(int argc, char** argv) { } else be_quiet = 1; + SAYF(cYEL "[!] " cBRI "WARNING: " cRST "afl-gcc is deprecated, gcc_plugin is faster, llvm_mode even faster\n"); + if (argc < 2) { SAYF("\n" diff --git a/docs/ChangeLog b/docs/ChangeLog index 8c1aa994..735653c0 100644 --- a/docs/ChangeLog +++ b/docs/ChangeLog @@ -17,10 +17,12 @@ sending a mail to . Version ++2.52d (tbd): ----------------------------- + - Using the old ineffective afl-gcc will now show a deprecation warning - if llvm_mode was compiled, afl-clang/afl-clang++ will point to these instead of afl-gcc - added gcc_plugin which is like llvm_mode but for gcc. This version - supports gcc version 5 to 8. See gcc_plugin/README (https://github.com/T12z/afl) + supports gcc version 5 to 8. 
See gcc_plugin/README.gcc + (https://github.com/T12z/afl) - added instrim, a much faster llvm_mode instrumentation at the cost of path discovery. See llvm_mode/README.instrim (https://github.com/csienslab/instrim) - added MOpt (github.com/puppet-meteor/MOpt-AFL) mode, see docs/README.MOpt diff --git a/llvm_mode/README.llvm b/llvm_mode/README.llvm index 779ff47c..aaa7b81f 100644 --- a/llvm_mode/README.llvm +++ b/llvm_mode/README.llvm @@ -3,6 +3,7 @@ Fast LLVM-based instrumentation for afl-fuzz ============================================ (See ../docs/README for the general instruction manual.) + (See ../gcc_plugin/README.gcc for the GCC-based instrumentation.) 1) Introduction --------------- @@ -30,7 +31,7 @@ several interesting properties: - The instrumentation can cope a bit better with multi-threaded targets. - Because the feature relies on the internals of LLVM, it is clang-specific - and will *not* work with GCC. + and will *not* work with GCC (see ../gcc_plugin/ for an alternative). Once this implementation is shown to be sufficiently robust and portable, it will probably replace afl-clang. For now, it can be built separately and diff --git a/types.h b/types.h index 784d3a7a..3497bb2b 100644 --- a/types.h +++ b/types.h @@ -80,7 +80,12 @@ typedef int64_t s64; #define MEM_BARRIER() \ asm volatile("" ::: "memory") -#define likely(_x) __builtin_expect(!!(_x), 1) -#define unlikely(_x) __builtin_expect(!!(_x), 0) +#if __GNUC__ < 6 + #define likely(_x) (_x) + #define unlikely(_x) (_x) +#else + #define likely(_x) __builtin_expect(!!(_x), 1) + #define unlikely(_x) __builtin_expect(!!(_x), 0) +#endif #endif /* ! 
_HAVE_TYPES_H */ -- cgit 1.4.1 From fe084b9866c5cb01408e3155078f092f64650edf Mon Sep 17 00:00:00 2001 From: Heiko Eißfeldt Date: Fri, 19 Jul 2019 11:17:30 +0200 Subject: several documentation fixes --- docs/ChangeLog | 8 ++++---- docs/README.MOpt | 30 +++++++++++++++++++----------- docs/binaryonly_fuzzing.txt | 27 ++++++++++++++------------- docs/env_variables.txt | 9 +++++---- gcc_plugin/README.gcc | 4 ++-- llvm_mode/README.llvm | 4 ++-- 6 files changed, 46 insertions(+), 36 deletions(-) (limited to 'llvm_mode/README.llvm') diff --git a/docs/ChangeLog b/docs/ChangeLog index f2709877..be50215c 100644 --- a/docs/ChangeLog +++ b/docs/ChangeLog @@ -30,7 +30,7 @@ Version ++2.52d (tbd): path discovery. See llvm_mode/README.instrim (https://github.com/csienslab/instrim) - added MOpt (github.com/puppet-meteor/MOpt-AFL) mode, see docs/README.MOpt - added code to make it more portable to other platforms than Intel Linux - - added never zero counters for afl-gcc and optional (because of an + - added never zero counters for afl-gcc and optionally (because of an optimization issue in llvm < 9) for llvm_mode (AFL_LLVM_NEVER_ZERO=1) - added a new doc about binary only fuzzing: docs/binaryonly_fuzzing.txt - more cpu power for afl-system-config @@ -45,8 +45,8 @@ Version ++2.52d (tbd): debugging - added -V time and -E execs option to better comparison runs, runs afl-fuzz for a specific time/executions. - - added a -s seed switch to allow afl run with a fixed initial - seed that is not updated. this is good for performance and path discovery + - added a -s seed switch to allow afl run with a fixed initial + seed that is not updated. This is good for performance and path discovery tests as the random numbers are deterministic then - llvm_mode LAF_... env variables can now be specified as AFL_LLVM_LAF_... that is longer but in line with other llvm specific env vars @@ -59,7 +59,7 @@ Version ++2.52c (2019-06-05): - Applied community patches. See docs/PATCHES for the full list. 
LLVM and Qemu modes are now faster. Important changes: - afl-fuzz: -e EXTENSION commandline option + afl-fuzz: -e EXTENSION commandline option llvm_mode: LAF-intel performance (needs activation, see llvm/README.laf-intel) a few new environment variables for afl-fuzz, llvm and qemu, see docs/env_variables.txt - Added the power schedules of AFLfast by Marcel Boehme, but set the default diff --git a/docs/README.MOpt b/docs/README.MOpt index 5575189c..94e63959 100644 --- a/docs/README.MOpt +++ b/docs/README.MOpt @@ -17,7 +17,8 @@ We open source all the seed sets used in the paper ### 4. Experiment Results The experiment results can be found in -https://drive.google.com/drive/folders/184GOzkZGls1H2NuLuUfSp9gfqp1E2-lL?usp=sharing. We only open source the crash files since the space is limited. +https://drive.google.com/drive/folders/184GOzkZGls1H2NuLuUfSp9gfqp1E2-lL?usp=sharing. +We only open source the crash files since the space is limited. ### 5. Technical Report MOpt_TechReport.pdf is the technical report of the paper @@ -26,18 +27,25 @@ MOpt_TechReport.pdf is the technical report of the paper ### 6. Parameter Introduction Most important, you must add the parameter `-L` (e.g., `-L 0`) to launch the MOpt scheme. -
`-L` controls the time to move on to the pacemaker fuzzing mode. -
`-L t:` when MOpt-AFL finishes the mutation of one input, if it has not -discovered any new unique crash or path for more than t min, MOpt-AFL will + +Option '-L' controls the time to move on to the pacemaker fuzzing mode. +'-L t': when MOpt-AFL finishes the mutation of one input, if it has not +discovered any new unique crash or path for more than t minutes, MOpt-AFL will enter the pacemaker fuzzing mode. -
Setting 0 will enter the pacemaker fuzzing mode at first, which is + +Setting 0 will enter the pacemaker fuzzing mode at first, which is recommended in a short time-scale evaluation. Other important parameters can be found in afl-fuzz.c, for instance, -
`swarm_num:` the number of the PSO swarms used in the fuzzing process. -
`period_pilot:` how many times MOpt-AFL will execute the target program in the pilot fuzzing module, then it will enter the core fuzzing module. -
`period_core:` how many times MOpt-AFL will execute the target program in the core fuzzing module, then it will enter the PSO updating module. -
`limit_time_bound:` control how many interesting test cases need to be found before MOpt-AFL quits the pacemaker fuzzing mode and reuses the deterministic stage. -0 < `limit_time_bound` < 1, MOpt-AFL-tmp. `limit_time_bound` >= 1, MOpt-AFL-ever. -Having fun with MOpt in AFL! +'swarm_num': the number of the PSO swarms used in the fuzzing process. +'period_pilot': how many times MOpt-AFL will execute the target program + in the pilot fuzzing module, then it will enter the core fuzzing module. +'period_core': how many times MOpt-AFL will execute the target program in the + core fuzzing module, then it will enter the PSO updating module. +'limit_time_bound': control how many interesting test cases need to be found + before MOpt-AFL quits the pacemaker fuzzing mode and reuses the deterministic stage. + 0 < 'limit_time_bound' < 1, MOpt-AFL-tmp. + 'limit_time_bound' >= 1, MOpt-AFL-ever. + +Have fun with MOpt in AFL! diff --git a/docs/binaryonly_fuzzing.txt b/docs/binaryonly_fuzzing.txt index f370ec74..ae5269f0 100644 --- a/docs/binaryonly_fuzzing.txt +++ b/docs/binaryonly_fuzzing.txt @@ -11,7 +11,7 @@ then standard afl++ (dumb mode) is not effective. The following is a description of how these can be fuzzed with afl++ !!!!! -DTLR: try DYNINST with afl-dyninst. If it produces too many crashes then +TL;DR: try DYNINST with afl-dyninst. If it produces too many crashes then use afl -Q qemu_mode. !!!!! @@ -22,7 +22,7 @@ Qemu is the "native" solution to the program. It is available in the ./qemu_mode/ directory and once compiled it can be accessed by the afl-fuzz -Q command line option. The speed decrease is at about 50% -It the easiest to use alternative and even works for cross-platform binaries. +It is the easiest to use alternative and even works for cross-platform binaries. As it is included in afl++ this needs no URL. @@ -30,7 +30,7 @@ As it is included in afl++ this needs no URL. 
DYNINST ------- Dyninst is a binary instrumentation framework similar to Pintool and Dynamorio -(see far below). Howver whereas Pintool and Dynamorio work at runtime, dyninst +(see far below). However whereas Pintool and Dynamorio work at runtime, dyninst instruments the target at load time, and then let it run. This is great for some things, e.g. fuzzing, and not so effective for others, e.g. malware analysis. @@ -38,15 +38,15 @@ e.g. malware analysis. So what we can do with dyninst is taking every basic block, and put afl's instrumention code in there - and then save the binary. Afterwards we can just fuzz the newly saved target binary with afl-fuzz. -Sounds great? It is. The issue though - this is a non-trivial problem to -insert instructions, which changes addresses in the process space and that -everything still works afterwards. Hence more often than not binaries -crash when they are run. +Sounds great? It is. The issue though - it is a non-trivial problem to +insert instructions, which change addresses in the process space, so +everything is still working afterwards. Hence more often than not binaries +crash when they are run (because of instrumentation). The speed decrease is about 15-35%, depending on the optimization options used with afl-dyninst. -So if dyninst works, its the best option available. Otherwise it just doesn't +So if dyninst works, it is the best option available. Otherwise it just doesn't work well. https://github.com/vanhauser-thc/afl-dyninst @@ -54,13 +54,14 @@ https://github.com/vanhauser-thc/afl-dyninst INTEL-PT -------- +If you have a newer Intel CPU, you can make use of Intels processor trace. The big issue with Intel's PT is the small buffer size and the complex encoding of the debug information collected through PT. This makes the decoding very CPU intensive and hence slow. As a result, the overall speed decrease is about 70-90% (depending on -the implementation and other factors) +the implementation and other factors). 
-there are two afl intel-pt implementations: +There are two afl intel-pt implementations: 1. https://github.com/junxzm1990/afl-pt => this needs Ubuntu 14.04.05 without any updates and the 4.4 kernel. @@ -73,13 +74,13 @@ there are two afl intel-pt implementations: CORESIGHT --------- -Coresight is the ARM answer to Intel's PT. +Coresight is ARM's answer to Intel's PT. There is no implementation so far which handle coresight and getting -it working on an ARM Linux is very difficult due custom kernel building +it working on an ARM Linux is very difficult due to custom kernel building on embedded systems is difficult. And finding one that has coresight in the ARM chip is difficult too. My guess is that it is slower than Qemu, but faster than Intel PT. -If anyone finds any coresight implemention for afl please ping me: +If anyone finds any coresight implementation for afl please ping me: vh@thc.org diff --git a/docs/env_variables.txt b/docs/env_variables.txt index 338df36f..1703a947 100644 --- a/docs/env_variables.txt +++ b/docs/env_variables.txt @@ -90,7 +90,8 @@ Then there are a few specific features that are only available in llvm_mode: LAF-INTEL ========= This great feature will split compares to series of single byte comparisons - to allow afl-fuzz to find otherwise rather impossible paths. + to allow afl-fuzz to find otherwise rather impossible paths. It is not + restricted to Intel CPUs ;-) - Setting AFL_LLVM_LAF_SPLIT_SWITCHES will split switch()es @@ -105,20 +106,20 @@ Then there are a few specific features that are only available in llvm_mode: This feature allows selectively instrumentation of the source - Setting AFL_LLVM_WHITELIST with a filename will only instrument those - files that match these names. + files that match the names listed in this file. See llvm_mode/README.whitelist for more information. INSTRIM ======= This feature increases the speed by whopping 20% but at the cost of a - lower path discovery and thefore coverage. 
+ lower path discovery and therefore coverage. - Setting AFL_LLVM_INSTRIM activates this mode - Setting AFL_LLVM_INSTRIM_LOOPHEAD=1 expands on INSTRIM to optimize loops. afl-fuzz will only be able to see the path the loop took, but not how - many times it was called (unless its a complex loop). + many times it was called (unless it is a complex loop). See llvm_mode/README.instrim diff --git a/gcc_plugin/README.gcc b/gcc_plugin/README.gcc index b3e9c853..fe62020b 100644 --- a/gcc_plugin/README.gcc +++ b/gcc_plugin/README.gcc @@ -65,8 +65,8 @@ directory. This is an early-stage mechanism, so field reports are welcome. You can send bug reports to . -4) Bonus feature #1: deferred instrumentation ---------------------------------------------- +4) Bonus feature #1: deferred initialization +-------------------------------------------- AFL tries to optimize performance by executing the targeted binary just once, stopping it just before main(), and then cloning this "master" process to get diff --git a/llvm_mode/README.llvm b/llvm_mode/README.llvm index aaa7b81f..00528a46 100644 --- a/llvm_mode/README.llvm +++ b/llvm_mode/README.llvm @@ -109,8 +109,8 @@ See README.neverzero This is an early-stage mechanism, so field reports are welcome. You can send bug reports to . 
-5) Bonus feature #1: deferred instrumentation ---------------------------------------------- +5) Bonus feature #1: deferred initialization +-------------------------------------------- AFL tries to optimize performance by executing the targeted binary just once, stopping it just before main(), and then cloning this "master" process to get -- cgit 1.4.1 From 8f4f45c524d217236a2e64be0d95d0a6de11df9c Mon Sep 17 00:00:00 2001 From: van Hauser Date: Fri, 26 Jul 2019 10:35:58 +0200 Subject: incorporated most of the 2.53b changes --- README | 1 - README.md | 607 ++++++++++++++++++++++++++++++++++++++++++++++++++ afl-fuzz.c | 2 + docs/README | 592 ------------------------------------------------ llvm_mode/README.llvm | 6 +- types.h | 2 +- 6 files changed, 612 insertions(+), 598 deletions(-) delete mode 120000 README create mode 100644 README.md delete mode 100644 docs/README (limited to 'llvm_mode/README.llvm') diff --git a/README b/README deleted file mode 120000 index a90f4af9..00000000 --- a/README +++ /dev/null @@ -1 +0,0 @@ -docs/README \ No newline at end of file diff --git a/README.md b/README.md new file mode 100644 index 00000000..e1371175 --- /dev/null +++ b/README.md @@ -0,0 +1,607 @@ +# american fuzzy lop plus plus (afl++) + + Originally developed by Michal "lcamtuf" Zalewski. + + Repository: [https://github.com/vanhauser-thc/AFLplusplus](https://github.com/vanhauser-thc/AFLplusplus) + + afl++ is maintained by Marc Heuse , Heiko Eissfeldt + and Andrea Fioraldi . + +## The enhancements compared to the original stock afl + + Many improvements were made over the official afl release - which did not + get any improvements since November 2017. + + Among others afl++ has, e.g. more performant llvm_mode, supporting + llvm up to version 8, Qemu 3.1, more speed and crashfixes for Qemu, + laf-intel feature for Qemu (with libcompcov) and more. 
+ + Additionally the following patches have been integrated: + + * AFLfast's power schedules by Marcel Boehme: [https://github.com/mboehme/aflfast](https://github.com/mboehme/aflfast) + + * C. Hollers afl-fuzz Python mutator module and llvm_mode whitelist support: [https://github.com/choller/afl](https://github.com/choller/afl) + + * the new excellent MOpt mutator: [https://github.com/puppet-meteor/MOpt-AFL](https://github.com/puppet-meteor/MOpt-AFL) + + * instrim, a very effective CFG llvm_mode instrumentation implementation for large targets: [https://github.com/csienslab/instrim](https://github.com/csienslab/instrim) + + * unicorn_mode which allows fuzzing of binaries from completely different platforms (integration provided by domenukk) + + A more thorough list is available in the PATCHES file. + + So all in all this is the best-of AFL that is currently out there :-) + + For new versions and additional information, check out: + [https://github.com/vanhauser-thc/AFLplusplus](https://github.com/vanhauser-thc/AFLplusplus) + + To compare notes with other users or get notified about major new features, + send a mail to . + + See [docs/QuickStartGuide.txt](docs/QuickStartGuide.txt) if you don't have time to + read this file. + + +## 1) Challenges of guided fuzzing +------------------------------- + +Fuzzing is one of the most powerful and proven strategies for identifying +security issues in real-world software; it is responsible for the vast +majority of remote code execution and privilege escalation bugs found to date +in security-critical software. + +Unfortunately, fuzzing is also relatively shallow; blind, random mutations +make it very unlikely to reach certain code paths in the tested code, leaving +some vulnerabilities firmly outside the reach of this technique. + +There have been numerous attempts to solve this problem. One of the early +approaches - pioneered by Tavis Ormandy - is corpus distillation. 
The method +relies on coverage signals to select a subset of interesting seeds from a +massive, high-quality corpus of candidate files, and then fuzz them by +traditional means. The approach works exceptionally well, but requires such +a corpus to be readily available. In addition, block coverage measurements +provide only a very simplistic understanding of program state, and are less +useful for guiding the fuzzing effort in the long haul. + +Other, more sophisticated research has focused on techniques such as program +flow analysis ("concolic execution"), symbolic execution, or static analysis. +All these methods are extremely promising in experimental settings, but tend +to suffer from reliability and performance problems in practical uses - and +currently do not offer a viable alternative to "dumb" fuzzing techniques. + + +## 2) The afl-fuzz approach + +American Fuzzy Lop is a brute-force fuzzer coupled with an exceedingly simple +but rock-solid instrumentation-guided genetic algorithm. It uses a modified +form of edge coverage to effortlessly pick up subtle, local-scale changes to +program control flow. + +Simplifying a bit, the overall algorithm can be summed up as: + + 1) Load user-supplied initial test cases into the queue, + + 2) Take next input file from the queue, + + 3) Attempt to trim the test case to the smallest size that doesn't alter + the measured behavior of the program, + + 4) Repeatedly mutate the file using a balanced and well-researched variety + of traditional fuzzing strategies, + + 5) If any of the generated mutations resulted in a new state transition + recorded by the instrumentation, add mutated output as a new entry in the + queue. + + 6) Go to 2. + +The discovered test cases are also periodically culled to eliminate ones that +have been obsoleted by newer, higher-coverage finds; and undergo several other +instrumentation-driven effort minimization steps. 
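The loop described above can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not afl-fuzz's actual implementation: `toy_target`, `mutate`, and `fuzz` are hypothetical names, the "instrumentation" is a hand-written function returning labels, the trimming step (3) is skipped, and a single byte-replacement mutation stands in for the real mutation strategies.

```python
import random


def toy_target(data: bytes) -> frozenset:
    """Hypothetical stand-in for an instrumented program.

    Returns the set of 'state transitions' (plain labels here) that the
    input exercised.
    """
    seen = {"start"}
    if data[:1] == b"F":
        seen.add("F")
        if data[1:2] == b"U":
            seen.add("FU")
            if data[2:3] == b"Z":
                seen.add("FUZ")
    return frozenset(seen)


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one random byte (a single, deliberately simple strategy)."""
    if not data:
        return bytes([rng.randrange(256)])
    pos = rng.randrange(len(data))
    return data[:pos] + bytes([rng.randrange(256)]) + data[pos + 1:]


def fuzz(seeds, target, rounds=20000, seed=0):
    rng = random.Random(seed)
    queue = list(seeds)                    # step 1: load initial test cases
    covered = set()
    for entry in queue:
        covered |= target(entry)
    for i in range(rounds):                # step 6: go back to step 2
        parent = queue[i % len(queue)]     # step 2: take next queue entry
        child = mutate(parent, rng)        # step 4: mutate (step 3 omitted)
        trace = target(child)
        if not trace <= covered:           # step 5: new transition seen?
            covered |= trace
            queue.append(child)            # keep the mutated input
    return queue, covered


queue, covered = fuzz([b"AAAA"], toy_target)
```

Only inputs that reach a previously unseen label are kept, so the queue stays small while `covered` grows; that is the property the numbered steps rely on.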
+ +As a side result of the fuzzing process, the tool creates a small, +self-contained corpus of interesting test cases. These are extremely useful +for seeding other, labor- or resource-intensive testing regimes - for example, +for stress-testing browsers, office applications, graphics suites, or +closed-source tools. + +The fuzzer is thoroughly tested to deliver out-of-the-box performance far +superior to blind fuzzing or coverage-only tools. + + +## 3) Instrumenting programs for use with AFL + +PLEASE NOTE: llvm_mode compilation with afl-clang-fast/afl-clang-fast++ +instead of afl-gcc/afl-g++ is much faster and has a few cool features. +See llvm_mode/ - however, some code does not compile with llvm. +We support llvm versions 4.0 to 8. + +When source code is available, instrumentation can be injected by a companion +tool that works as a drop-in replacement for gcc or clang in any standard build +process for third-party code. + +The instrumentation has a fairly modest performance impact; in conjunction with +other optimizations implemented by afl-fuzz, most programs can be fuzzed as fast +or even faster than possible with traditional tools. + +The correct way to recompile the target program may vary depending on the +specifics of the build process, but a nearly-universal approach would be: + +```shell +$ CC=/path/to/afl/afl-gcc ./configure +$ make clean all +``` + +For C++ programs, you would also want to set `CXX=/path/to/afl/afl-g++`. + +The clang wrappers (afl-clang and afl-clang++) can be used in the same way; +clang users may also opt to leverage a higher-performance instrumentation mode, +as described in [llvm_mode/README.llvm](llvm_mode/README.llvm). +Clang/LLVM has much better performance and works with LLVM versions 4.0 to 8.
+ +Using the LAF Intel performance enhancements is also recommended, see +[llvm_mode/README.laf-intel](llvm_mode/README.laf-intel) + +Using partial instrumentation is also recommended, see +[llvm_mode/README.whitelist](llvm_mode/README.whitelist) + +When testing libraries, you need to find or write a simple program that reads +data from stdin or from a file and passes it to the tested library. In such a +case, it is essential to link this executable against a static version of the +instrumented library, or to make sure that the correct .so file is loaded at +runtime (usually by setting `LD_LIBRARY_PATH`). The simplest option is a static +build, usually possible via: + +```shell +$ CC=/path/to/afl/afl-gcc ./configure --disable-shared +``` + +Setting `AFL_HARDEN=1` when calling 'make' will cause the CC wrapper to +automatically enable code hardening options that make it easier to detect +simple memory bugs. Libdislocator, a helper library included with AFL (see +[libdislocator/README.dislocator](libdislocator/README.dislocator)) can help uncover heap corruption issues, too. + +PS. ASAN users are advised to review the [docs/notes_for_asan.txt](docs/notes_for_asan.txt) +file for important caveats. + + +## 4) Instrumenting binary-only apps +--------------------------------- + +When source code is *NOT* available, the fuzzer offers experimental support for +fast, on-the-fly instrumentation of black-box binaries. This is accomplished +with a version of QEMU running in the lesser-known "user space emulation" mode. + +QEMU is a project separate from AFL, but you can conveniently build the +feature by doing: + +```shell +$ cd qemu_mode +$ ./build_qemu_support.sh +``` + +For additional instructions and caveats, see [qemu_mode/README.qemu](qemu_mode/README.qemu). + +The mode is approximately 2-5x slower than compile-time instrumentation, is +less conducive to parallelization, and may have some other quirks.
+ +If [afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst) works for +your binary, then you can use afl-fuzz normally and it will have twice +the speed compared to qemu_mode. + +A more comprehensive description of these and other options can be found in +[docs/binaryonly_fuzzing.txt](docs/binaryonly_fuzzing.txt) + + +## 5) Power schedules +------------------ + +The power schedules were copied from Marcel Böhme's excellent AFLfast +implementation and expand the ability to discover new paths and +therefore the coverage. + +The available schedules are: + + - explore (default) + - fast + - coe + - quad + - lin + - exploit + +In parallel mode (-M/-S, several instances with shared queue), we suggest +running the master using the exploit schedule (-p exploit) and the slaves with a +combination of cut-off-exponential (-p coe), exponential (-p fast), +and explore (-p explore) schedules. + +In single mode, using -p fast is usually more beneficial than the default +explore mode. +(We don't want to change the default behaviour of afl, so "fast" has not been +made the default mode). + +More details can be found in the paper published at the 23rd ACM Conference on +Computer and Communications Security (CCS'16): + + [https://www.sigsac.org/ccs/CCS2016/accepted-papers/](https://www.sigsac.org/ccs/CCS2016/accepted-papers/) + + +## 6) Choosing initial test cases +------------------------------ + +To operate correctly, the fuzzer requires one or more starting files that +contain good examples of the input data normally expected by the targeted +application. There are two basic rules: + + - Keep the files small. Under 1 kB is ideal, although not strictly necessary. + For a discussion of why size matters, see [perf_tips.txt](docs/perf_tips.txt). + + - Use multiple test cases only if they are functionally different from + each other. There is no point in using fifty different vacation photos + to fuzz an image library.
+ +You can find many good examples of starting files in the testcases/ subdirectory +that comes with this tool. + +PS. If a large corpus of data is available for screening, you may want to use +the afl-cmin utility to identify a subset of functionally distinct files that +exercise different code paths in the target binary. + + +## 7) Fuzzing binaries +------------------- + +The fuzzing process itself is carried out by the afl-fuzz utility. This program +requires a read-only directory with initial test cases, a separate place to +store its findings, plus a path to the binary to test. + +For target binaries that accept input directly from stdin, the usual syntax is: + +```shell +$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program [...params...] +``` + +For programs that take input from a file, use '@@' to mark the location in +the target's command line where the input file name should be placed. The +fuzzer will substitute this for you: + +```shell +$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@ +``` + +You can also use the -f option to have the mutated data written to a specific +file. This is useful if the program expects a particular file extension or so. + +Non-instrumented binaries can be fuzzed in the QEMU mode (add -Q in the command +line) or in a traditional, blind-fuzzer mode (specify -n). + +You can use -t and -m to override the default timeout and memory limit for the +executed process; rare examples of targets that may need these settings touched +include compilers and video decoders. + +Tips for optimizing fuzzing performance are discussed in [perf_tips.txt](docs/perf_tips.txt). + +Note that afl-fuzz starts by performing an array of deterministic fuzzing +steps, which can take several days, but tend to produce neat test cases. If you +want quick & dirty results right away - akin to zzuf and other traditional +fuzzers - add the -d option to the command line. 
+ + +## 8) Interpreting output +---------------------- + +See the [docs/status_screen.txt](docs/status_screen.txt) file for information on +how to interpret the displayed stats and monitor the health of the process. Be +sure to consult this file especially if any UI elements are highlighted in red. + +The fuzzing process will continue until you press Ctrl-C. At minimum, you want +to allow the fuzzer to complete one queue cycle, which may take anywhere from a +couple of hours to a week or so. + +There are three subdirectories created within the output directory and updated +in real time: + + - queue/ - test cases for every distinctive execution path, plus all the + starting files given by the user. This is the synthesized corpus + mentioned in section 2. + + Before using this corpus for any other purposes, you can shrink + it to a smaller size using the afl-cmin tool. The tool will find + a smaller subset of files offering equivalent edge coverage. + + - crashes/ - unique test cases that cause the tested program to receive a + fatal signal (e.g., SIGSEGV, SIGILL, SIGABRT). The entries are + grouped by the received signal. + + - hangs/ - unique test cases that cause the tested program to time out. The + default time limit before something is classified as a hang is + the larger of 1 second and the value of the -t parameter. + The value can be fine-tuned by setting AFL_HANG_TMOUT, but this + is rarely necessary. + +Crashes and hangs are considered "unique" if the associated execution paths +involve any state transitions not seen in previously-recorded faults. If a +single bug can be reached in multiple ways, there will be some count inflation +early in the process, but this should quickly taper off. + +The file names for crashes and hangs are correlated with parent, non-faulting +queue entries. This should help with debugging. 
+ +When you can't reproduce a crash found by afl-fuzz, the most likely cause is +that you are not setting the same memory limit as used by the tool. Try: + +```shell +$ LIMIT_MB=50 +$ ( ulimit -Sv $[LIMIT_MB << 10]; /path/to/tested_binary ... ) +``` + +Change LIMIT_MB to match the -m parameter passed to afl-fuzz. On OpenBSD, +also change -Sv to -Sd. + +Any existing output directory can be also used to resume aborted jobs; try: + +```shell +$ ./afl-fuzz -i- -o existing_output_dir [...etc...] +``` + +If you have gnuplot installed, you can also generate some pretty graphs for any +active fuzzing task using afl-plot. For an example of how this looks like, +see [http://lcamtuf.coredump.cx/afl/plot/](http://lcamtuf.coredump.cx/afl/plot/). + + +## 9) Parallelized fuzzing +----------------------- + +Every instance of afl-fuzz takes up roughly one core. This means that on +multi-core systems, parallelization is necessary to fully utilize the hardware. +For tips on how to fuzz a common target on multiple cores or multiple networked +machines, please refer to [parallel_fuzzing.txt](docs/parallel_fuzzing.txt). + +The parallel fuzzing mode also offers a simple way for interfacing AFL to other +fuzzers, to symbolic or concolic execution engines, and so forth; again, see the +last section of [parallel_fuzzing.txt](docs/parallel_fuzzing.txt) for tips. + + +## 10) Fuzzer dictionaries +---------------------- + +By default, afl-fuzz mutation engine is optimized for compact data formats - +say, images, multimedia, compressed data, regular expression syntax, or shell +scripts. It is somewhat less suited for languages with particularly verbose and +redundant verbiage - notably including HTML, SQL, or JavaScript. 
+ +To avoid the hassle of building syntax-aware tools, afl-fuzz provides a way to +seed the fuzzing process with an optional dictionary of language keywords, +magic headers, or other special tokens associated with the targeted data type +-- and use that to reconstruct the underlying grammar on the go: + + [http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html](http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html) + +To use this feature, you first need to create a dictionary in one of the two +formats discussed in [dictionaries/README.dictionaries](dictionaries/README.dictionaries); +and then point the fuzzer to it via the -x option in the command line. + +(Several common dictionaries are already provided in that subdirectory, too.) + +There is no way to provide more structured descriptions of the underlying +syntax, but the fuzzer will likely figure out some of this based on the +instrumentation feedback alone. This actually works in practice, say: + + [http://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html](http://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html) + +PS. Even when no explicit dictionary is given, afl-fuzz will try to extract +existing syntax tokens in the input corpus by watching the instrumentation +very closely during deterministic byte flips. This works for some types of +parsers and grammars, but isn't nearly as good as the -x mode. + +If a dictionary is really hard to come by, another option is to let AFL run +for a while, and then use the token capture library that comes as a companion +utility with AFL. For that, see [libtokencap/README.tokencap](libtokencap/README.tokencap). + + +## 11) Crash triage +---------------- + +The coverage-based grouping of crashes usually produces a small data set that +can be quickly triaged manually or with a very simple GDB or Valgrind script.
+Every crash is also traceable to its parent non-crashing test case in the +queue, making it easier to diagnose faults. + +Having said that, it's important to acknowledge that some fuzzing crashes can be +difficult to quickly evaluate for exploitability without a lot of debugging and +code analysis work. To assist with this task, afl-fuzz supports a very unique +"crash exploration" mode enabled with the -C flag. + +In this mode, the fuzzer takes one or more crashing test cases as the input, +and uses its feedback-driven fuzzing strategies to very quickly enumerate all +code paths that can be reached in the program while keeping it in the +crashing state. + +Mutations that do not result in a crash are rejected; so are any changes that +do not affect the execution path. + +The output is a small corpus of files that can be very rapidly examined to see +what degree of control the attacker has over the faulting address, or whether +it is possible to get past an initial out-of-bounds read - and see what lies +beneath. + +Oh, one more thing: for test case minimization, give afl-tmin a try. The tool +can be operated in a very simple way: + +```shell +$ ./afl-tmin -i test_case -o minimized_result -- /path/to/program [...] +``` + +The tool works with crashing and non-crashing test cases alike. In the crash +mode, it will happily accept instrumented and non-instrumented binaries. In the +non-crashing mode, the minimizer relies on standard AFL instrumentation to make +the file simpler without altering the execution path. + +The minimizer accepts the -m, -t, -f and @@ syntax in a manner compatible with +afl-fuzz. + +Another recent addition to AFL is the afl-analyze tool. It takes an input +file, attempts to sequentially flip bytes, and observes the behavior of the +tested program. It then color-codes the input based on which sections appear to +be critical, and which are not; while not bulletproof, it can often offer quick +insights into complex file formats. 
More info about its operation can be found +near the end of [docs/technical_details.txt](docs/technical_details.txt). + + +## 12) Going beyond crashes +------------------------ + +Fuzzing is a wonderful and underutilized technique for discovering non-crashing +design and implementation errors, too. Quite a few interesting bugs have been +found by modifying the target programs to call abort() when, say: + + - Two bignum libraries produce different outputs when given the same + fuzzer-generated input, + + - An image library produces different outputs when asked to decode the same + input image several times in a row, + + - A serialization / deserialization library fails to produce stable outputs + when iteratively serializing and deserializing fuzzer-supplied data, + + - A compression library produces an output inconsistent with the input file + when asked to compress and then decompress a particular blob. + +Implementing these or similar sanity checks usually takes very little time; +if you are the maintainer of a particular package, you can make this code +conditional with `#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` (a flag also +shared with libfuzzer) or `#ifdef __AFL_COMPILER` (this one is just for AFL). + + +## 13) Common-sense risks +---------------------- + +Please keep in mind that, similarly to many other computationally-intensive +tasks, fuzzing may put strain on your hardware and on the OS. In particular: + + - Your CPU will run hot and will need adequate cooling. In most cases, if + cooling is insufficient or stops working properly, CPU speeds will be + automatically throttled. That said, especially when fuzzing on less + suitable hardware (laptops, smartphones, etc), it's not entirely impossible + for something to blow up. + + - Targeted programs may end up erratically grabbing gigabytes of memory or + filling up disk space with junk files. AFL tries to enforce basic memory + limits, but can't prevent each and every possible mishap. 
The bottom line + is that you shouldn't be fuzzing on systems where the prospect of data loss + is not an acceptable risk. + + - Fuzzing involves billions of reads and writes to the filesystem. On modern + systems, this is usually heavily cached, resulting in fairly modest + "physical" I/O - but there are many factors that may alter this equation. + It is your responsibility to monitor for potential trouble; with very heavy + I/O, the lifespan of many HDDs and SSDs may be reduced. + + A good way to monitor disk I/O on Linux is the 'iostat' command: + +```shell + $ iostat -d 3 -x -k [...optional disk ID...] +``` + + +## 14) Known limitations & areas for improvement +--------------------------------------------- + +Here are some of the most important caveats for AFL: + + - AFL detects faults by checking for the first spawned process dying due to + a signal (SIGSEGV, SIGABRT, etc). Programs that install custom handlers for + these signals may need to have the relevant code commented out. In the same + vein, faults in child processes spawned by the fuzzed target may evade + detection unless you manually add some code to catch that. + + - As with any other brute-force tool, the fuzzer offers limited coverage if + encryption, checksums, cryptographic signatures, or compression are used to + wholly wrap the actual data format to be tested. + + To work around this, you can comment out the relevant checks (see + experimental/libpng_no_checksum/ for inspiration); if this is not possible, + you can also write a postprocessor, as explained in + experimental/post_library/ (with AFL_POST_LIBRARY). + + - There are some unfortunate trade-offs with ASAN and 64-bit binaries. This + isn't due to any specific fault of afl-fuzz; see [docs/notes_for_asan.txt](docs/notes_for_asan.txt) + for tips. + + - There is no direct support for fuzzing network services, background + daemons, or interactive apps that require UI interaction to work.
You may + need to make simple code changes to make them behave in a more traditional + way. Preeny may offer a relatively simple option, too - see: + [https://github.com/zardus/preeny](https://github.com/zardus/preeny) + + Some useful tips for modifying network-based services can be also found at: + [https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop](https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop) + + - AFL doesn't output human-readable coverage data. If you want to monitor + coverage, use afl-cov from Michael Rash: [https://github.com/mrash/afl-cov](https://github.com/mrash/afl-cov) + + - Occasionally, sentient machines rise against their creators. If this + happens to you, please consult [http://lcamtuf.coredump.cx/prep/](http://lcamtuf.coredump.cx/prep/). + +Beyond this, see INSTALL for platform-specific tips. + + +## 15) Special thanks +------------------ + +Many of the improvements to the original afl wouldn't be possible without +feedback, bug reports, or patches from: + +``` + Jann Horn Hanno Boeck + Felix Groebert Jakub Wilk + Richard W. M. Jones Alexander Cherepanov + Tom Ritter Hovik Manucharyan + Sebastian Roschke Eberhard Mattes + Padraig Brady Ben Laurie + @dronesec Luca Barbato + Tobias Ospelt Thomas Jarosch + Martin Carpenter Mudge Zatko + Joe Zbiciak Ryan Govostes + Michael Rash William Robinet + Jonathan Gray Filipe Cabecinhas + Nico Weber Jodie Cunningham + Andrew Griffiths Parker Thompson + Jonathan Neuschfer Tyler Nighswander + Ben Nagy Samir Aguiar + Aidan Thornton Aleksandar Nikolich + Sam Hakim Laszlo Szekeres + David A. 
Wheeler Turo Lamminen + Andreas Stieger Richard Godbee + Louis Dassy teor2345 + Alex Moneger Dmitry Vyukov + Keegan McAllister Kostya Serebryany + Richo Healey Martijn Bogaard + rc0r Jonathan Foote + Christian Holler Dominique Pelle + Jacek Wielemborek Leo Barnes + Jeremy Barnes Jeff Trull + Guillaume Endignoux ilovezfs + Daniel Godas-Lopez Franjo Ivancic + Austin Seipp Daniel Komaromy + Daniel Binderman Jonathan Metzman + Vegard Nossum Jan Kneschke + Kurt Roeckx Marcel Bohme + Van-Thuan Pham Abhik Roychoudhury + Joshua J. Drake Toby Hutton + Rene Freingruber Sergey Davidoff + Sami Liedes Craig Young + Andrzej Jackowski Daniel Hodson +``` + +Thank you! + + +## 16) Contact +----------- + +Questions? Concerns? Bug reports? The contributors can be reached via +[https://github.com/vanhauser-thc/AFLplusplus](https://github.com/vanhauser-thc/AFLplusplus) + +There is also a mailing list for the afl project; to join, send a mail to +. Or, if you prefer to browse +archives first, try: [https://groups.google.com/group/afl-users](https://groups.google.com/group/afl-users) diff --git a/afl-fuzz.c b/afl-fuzz.c index 8f4e1344..f974268f 100644 --- a/afl-fuzz.c +++ b/afl-fuzz.c @@ -23,7 +23,9 @@ #define AFL_MAIN #define MESSAGES_TO_STDOUT +#ifndef _GNU_SOURCE #define _GNU_SOURCE +#endif #define _FILE_OFFSET_BITS 64 #include "config.h" diff --git a/docs/README b/docs/README deleted file mode 100644 index c2c93f38..00000000 --- a/docs/README +++ /dev/null @@ -1,592 +0,0 @@ -============================ -american fuzzy lop plus plus -============================ - - Originally written by Michal Zalewski - - Repository: https://github.com/vanhauser-thc/AFLplusplus - - afl++ is maintained by Marc Heuse , Heiko Eissfeldt - and Andrea Fioraldi as - there have been no updates to afl since November 2017. - - - Many improvements were made, e.g. 
more performant llvm_mode, supporting - llvm up to version 8, Qemu 3.1, more speed and crashfixes for Qemu, - laf-intel feature for Qemu (with libcompcov) etc. - - Additionally AFLfast's power schedules by Marcel Boehme from - https://github.com/mboehme/aflfast have been incorporated. - - C. Hollers afl-fuzz Python mutator module and llvm_mode whitelist support - was added too (https://github.com/choller/afl) - - New is the excellent MOpt mutator from - https://github.com/puppet-meteor/MOpt-AFL - - Also newly integrated is instrim, a very effective CFG llvm_mode - instrumentation implementation from https://github.com/csienslab/instrim - - And finally the newest addition is the unicorn_mode which allows fuzzing - of binaries from completely different platforms - provided by domenukk! - The unicorn afl mode is not the stock version but like afl++ contains - various patches from forks that make it better :) - - A more thorough list is available in the PATCHES file. - - So all in all this is the best-of AFL that is currently out there :-) - - - Copyright 2013, 2014, 2015, 2016 Google Inc. All rights reserved. - Released under terms and conditions of Apache License, Version 2.0. - - For new versions and additional information, check out: - https://github.com/vanhauser-thc/AFLplusplus - - To compare notes with other users or get notified about major new features, - send a mail to . - - ** See QuickStartGuide.txt if you don't have time to read this file. ** - - -1) Challenges of guided fuzzing -------------------------------- - -Fuzzing is one of the most powerful and proven strategies for identifying -security issues in real-world software; it is responsible for the vast -majority of remote code execution and privilege escalation bugs found to date -in security-critical software. 
- -Unfortunately, fuzzing is also relatively shallow; blind, random mutations -make it very unlikely to reach certain code paths in the tested code, leaving -some vulnerabilities firmly outside the reach of this technique. - -There have been numerous attempts to solve this problem. One of the early -approaches - pioneered by Tavis Ormandy - is corpus distillation. The method -relies on coverage signals to select a subset of interesting seeds from a -massive, high-quality corpus of candidate files, and then fuzz them by -traditional means. The approach works exceptionally well, but requires such -a corpus to be readily available. In addition, block coverage measurements -provide only a very simplistic understanding of program state, and are less -useful for guiding the fuzzing effort in the long haul. - -Other, more sophisticated research has focused on techniques such as program -flow analysis ("concolic execution"), symbolic execution, or static analysis. -All these methods are extremely promising in experimental settings, but tend -to suffer from reliability and performance problems in practical uses - and -currently do not offer a viable alternative to "dumb" fuzzing techniques. - - -2) The afl-fuzz approach ------------------------- - -American Fuzzy Lop is a brute-force fuzzer coupled with an exceedingly simple -but rock-solid instrumentation-guided genetic algorithm. It uses a modified -form of edge coverage to effortlessly pick up subtle, local-scale changes to -program control flow. 
- -Simplifying a bit, the overall algorithm can be summed up as: - - 1) Load user-supplied initial test cases into the queue, - - 2) Take next input file from the queue, - - 3) Attempt to trim the test case to the smallest size that doesn't alter - the measured behavior of the program, - - 4) Repeatedly mutate the file using a balanced and well-researched variety - of traditional fuzzing strategies, - - 5) If any of the generated mutations resulted in a new state transition - recorded by the instrumentation, add mutated output as a new entry in the - queue. - - 6) Go to 2. - -The discovered test cases are also periodically culled to eliminate ones that -have been obsoleted by newer, higher-coverage finds; and undergo several other -instrumentation-driven effort minimization steps. - -As a side result of the fuzzing process, the tool creates a small, -self-contained corpus of interesting test cases. These are extremely useful -for seeding other, labor- or resource-intensive testing regimes - for example, -for stress-testing browsers, office applications, graphics suites, or -closed-source tools. - -The fuzzer is thoroughly tested to deliver out-of-the-box performance far -superior to blind fuzzing or coverage-only tools. - - -3) Instrumenting programs for use with AFL ------------------------------------------- - -PLEASE NOTE: llvm_mode compilation with afl-clang-fast/afl-clang-fast++ -instead of afl-gcc/afl-g++ is much faster and has a few cool features. -See llvm_mode/ - however few code does not compile with llvm. -We support llvm versions 4.0 to 8. - -When source code is available, instrumentation can be injected by a companion -tool that works as a drop-in replacement for gcc or clang in any standard build -process for third-party code. - -The instrumentation has a fairly modest performance impact; in conjunction with -other optimizations implemented by afl-fuzz, most programs can be fuzzed as fast -or even faster than possible with traditional tools. 
- -The correct way to recompile the target program may vary depending on the -specifics of the build process, but a nearly-universal approach would be: - -$ CC=/path/to/afl/afl-gcc ./configure -$ make clean all - -For C++ programs, you'd would also want to set CXX=/path/to/afl/afl-g++. - -The clang wrappers (afl-clang and afl-clang++) can be used in the same way; -clang users may also opt to leverage a higher-performance instrumentation mode, -as described in llvm_mode/README.llvm. -Clang/LLVM has a much better performance and works from LLVM version 4.0 to 8. -Using the LAF Intel performance enhancements are also recommended, see -llvm_mode/README.laf-intel -Using partial instrumentation is also recommended, see -llvm_mode/README.whitelist - -When testing libraries, you need to find or write a simple program that reads -data from stdin or from a file and passes it to the tested library. In such a -case, it is essential to link this executable against a static version of the -instrumented library, or to make sure that the correct .so file is loaded at -runtime (usually by setting LD_LIBRARY_PATH). The simplest option is a static -build, usually possible via: - -$ CC=/path/to/afl/afl-gcc ./configure --disable-shared - -Setting AFL_HARDEN=1 when calling 'make' will cause the CC wrapper to -automatically enable code hardening options that make it easier to detect -simple memory bugs. Libdislocator, a helper library included with AFL (see -libdislocator/README.dislocator) can help uncover heap corruption issues, too. - -PS. ASAN users are advised to docs/review notes_for_asan.txt file for -important caveats. - - -4) Instrumenting binary-only apps ---------------------------------- - -When source code is *NOT* available, the fuzzer offers experimental support for -fast, on-the-fly instrumentation of black-box binaries. This is accomplished -with a version of QEMU running in the lesser-known "user space emulation" mode. 
- -QEMU is a project separate from AFL, but you can conveniently build the -feature by doing: - -$ cd qemu_mode -$ ./build_qemu_support.sh - -For additional instructions and caveats, see qemu_mode/README.qemu. - -The mode is approximately 2-5x slower than compile-time instrumentation, is -less conducive to parallelization, and may have some other quirks. - -If [afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst) works for -your binary, then you can use afl-fuzz normally and it will have twice -the speed compared to qemu_mode. - - -5) Power schedules ------------------- - -The power schedules were copied from Marcel Böhme's excellent AFLfast -implementation and expand on the ability to discover new paths and -therefore the coverage. - -The available schedules are: - - - explore (default) - - fast - - coe - - quad - - lin - - exploit - -In parallel mode (-M/-S, several instances with shared queue), we suggest -running the master using the exploit schedule (-p exploit) and the slaves with a -combination of cut-off-exponential (-p coe), exponential (-p fast), -and explore (-p explore; default) schedules. - -In single mode, using -p fast is usually more beneficial than the default -explore mode. -(We don't want to change the default behaviour of afl, so "fast" has not been -made the default mode). - -More details can be found in the paper published at the 23rd ACM Conference on -Computer and Communications Security (CCS'16): - - https://www.sigsac.org/ccs/CCS2016/accepted-papers/ - -6) Choosing initial test cases ------------------------------- - -To operate correctly, the fuzzer requires one or more starting files that -contain a good example of the input data normally expected by the targeted -application. There are two basic rules: - - - Keep the files small. Under 1 kB is ideal, although not strictly necessary. - For a discussion of why size matters, see perf_tips.txt. - - - Use multiple test cases only if they are functionally different from - each other.
There is no point in using fifty different vacation photos - to fuzz an image library. - -You can find many good examples of starting files in the testcases/ subdirectory -that comes with this tool. - -PS. If a large corpus of data is available for screening, you may want to use -the afl-cmin utility to identify a subset of functionally distinct files that -exercise different code paths in the target binary. - - -7) Fuzzing binaries -------------------- - -The fuzzing process itself is carried out by the afl-fuzz utility. This program -requires a read-only directory with initial test cases, a separate place to -store its findings, plus a path to the binary to test. - -For target binaries that accept input directly from stdin, the usual syntax is: - -$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program [...params...] - -For programs that take input from a file, use '@@' to mark the location in -the target's command line where the input file name should be placed. The -fuzzer will substitute this for you: - -$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@ - -You can also use the -f option to have the mutated data written to a specific -file. This is useful if the program expects a particular file extension or so. - -Non-instrumented binaries can be fuzzed in the QEMU mode (add -Q in the command -line) or in a traditional, blind-fuzzer mode (specify -n). - -You can use -t and -m to override the default timeout and memory limit for the -executed process; rare examples of targets that may need these settings touched -include compilers and video decoders. - -Tips for optimizing fuzzing performance are discussed in perf_tips.txt. - -Note that afl-fuzz starts by performing an array of deterministic fuzzing -steps, which can take several days, but tend to produce neat test cases. If you -want quick & dirty results right away - akin to zzuf and other traditional -fuzzers - add the -d option to the command line. 
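Both invocation styles shown in this section (input on stdin, or a file name
substituted for '@@') can be served by the same target if its entry point
treats the file name as optional. The following is a minimal sketch under
that assumption; open_fuzz_input and read_fuzz_input are hypothetical helper
names, not part of AFL.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: use argv[1] as the input file when it is present
   (this is where afl-fuzz places the name substituted for '@@'),
   otherwise fall back to stdin. Returns NULL if the file cannot be opened. */
static FILE *open_fuzz_input(int argc, char **argv) {
  if (argc > 1) return fopen(argv[1], "rb");
  return stdin;
}

/* Hypothetical helper: read at most 'cap' bytes into 'buf' and return the
   number of bytes actually read. */
static size_t read_fuzz_input(FILE *in, unsigned char *buf, size_t cap) {
  return fread(buf, 1, cap, in);
}
```

A main() built on these helpers would open the input, bail out if the result
is NULL, read the data into a fixed-size buffer, and pass that buffer to the
library function under test.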
- - -8) Interpreting output ----------------------- - -See the status_screen.txt file for information on how to interpret the -displayed stats and monitor the health of the process. Be sure to consult this -file especially if any UI elements are highlighted in red. - -The fuzzing process will continue until you press Ctrl-C. At minimum, you want -to allow the fuzzer to complete one queue cycle, which may take anywhere from a -couple of hours to a week or so. - -There are three subdirectories created within the output directory and updated -in real time: - - - queue/ - test cases for every distinctive execution path, plus all the - starting files given by the user. This is the synthesized corpus - mentioned in section 2. - - Before using this corpus for any other purposes, you can shrink - it to a smaller size using the afl-cmin tool. The tool will find - a smaller subset of files offering equivalent edge coverage. - - - crashes/ - unique test cases that cause the tested program to receive a - fatal signal (e.g., SIGSEGV, SIGILL, SIGABRT). The entries are - grouped by the received signal. - - - hangs/ - unique test cases that cause the tested program to time out. The - default time limit before something is classified as a hang is - the larger of 1 second and the value of the -t parameter. - The value can be fine-tuned by setting AFL_HANG_TMOUT, but this - is rarely necessary. - -Crashes and hangs are considered "unique" if the associated execution paths -involve any state transitions not seen in previously-recorded faults. If a -single bug can be reached in multiple ways, there will be some count inflation -early in the process, but this should quickly taper off. - -The file names for crashes and hangs are correlated with parent, non-faulting -queue entries. This should help with debugging. - -When you can't reproduce a crash found by afl-fuzz, the most likely cause is -that you are not setting the same memory limit as used by the tool. 
Try: - -$ LIMIT_MB=50 -$ ( ulimit -Sv $[LIMIT_MB << 10]; /path/to/tested_binary ... ) - -Change LIMIT_MB to match the -m parameter passed to afl-fuzz. On OpenBSD, -also change -Sv to -Sd. - -Any existing output directory can also be used to resume aborted jobs; try: - -$ ./afl-fuzz -i- -o existing_output_dir [...etc...] - -If you have gnuplot installed, you can also generate some pretty graphs for any -active fuzzing task using afl-plot. For an example of what this looks like, -see http://lcamtuf.coredump.cx/afl/plot/. - - -9) Parallelized fuzzing ------------------------ - -Every instance of afl-fuzz takes up roughly one core. This means that on -multi-core systems, parallelization is necessary to fully utilize the hardware. -For tips on how to fuzz a common target on multiple cores or multiple networked -machines, please refer to parallel_fuzzing.txt. - -The parallel fuzzing mode also offers a simple way for interfacing AFL to other -fuzzers, to symbolic or concolic execution engines, and so forth; again, see the -last section of parallel_fuzzing.txt for tips. - - -10) Fuzzer dictionaries ----------------------- - -By default, the afl-fuzz mutation engine is optimized for compact data formats - -say, images, multimedia, compressed data, regular expression syntax, or shell -scripts. It is somewhat less suited for languages with particularly verbose and -redundant verbiage - notably including HTML, SQL, or JavaScript.
- -To avoid the hassle of building syntax-aware tools, afl-fuzz provides a way to -seed the fuzzing process with an optional dictionary of language keywords, -magic headers, or other special tokens associated with the targeted data type -- and use that to reconstruct the underlying grammar on the go: - - http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html - -To use this feature, you first need to create a dictionary in one of the two -formats discussed in dictionaries/README.dictionaries; and then point the fuzzer -to it via the -x option in the command line. - -(Several common dictionaries are already provided in that subdirectory, too.) - -There is no way to provide more structured descriptions of the underlying -syntax, but the fuzzer will likely figure out some of this based on the -instrumentation feedback alone. This actually works in practice, say: - - http://lcamtuf.blogspot.com/2015/04/finding-bugs-in-sqlite-easy-way.html - -PS. Even when no explicit dictionary is given, afl-fuzz will try to extract -existing syntax tokens in the input corpus by watching the instrumentation -very closely during deterministic byte flips. This works for some types of -parsers and grammars, but isn't nearly as good as the -x mode. - -If a dictionary is really hard to come by, another option is to let AFL run -for a while, and then use the token capture library that comes as a companion -utility with AFL. For that, see libtokencap/README.tokencap. - - -11) Crash triage ----------------- - -The coverage-based grouping of crashes usually produces a small data set that -can be quickly triaged manually or with a very simple GDB or Valgrind script. -Every crash is also traceable to its parent non-crashing test case in the -queue, making it easier to diagnose faults. - -Having said that, it's important to acknowledge that some fuzzing crashes can be -difficult to quickly evaluate for exploitability without a lot of debugging and -code analysis work. 
To assist with this task, afl-fuzz supports a unique -"crash exploration" mode enabled with the -C flag. - -In this mode, the fuzzer takes one or more crashing test cases as the input, -and uses its feedback-driven fuzzing strategies to very quickly enumerate all -code paths that can be reached in the program while keeping it in the -crashing state. - -Mutations that do not result in a crash are rejected; so are any changes that -do not affect the execution path. - -The output is a small corpus of files that can be very rapidly examined to see -what degree of control the attacker has over the faulting address, or whether -it is possible to get past an initial out-of-bounds read - and see what lies -beneath. - -Oh, one more thing: for test case minimization, give afl-tmin a try. The tool -can be operated in a very simple way: - -$ ./afl-tmin -i test_case -o minimized_result -- /path/to/program [...] - -The tool works with crashing and non-crashing test cases alike. In the crash -mode, it will happily accept instrumented and non-instrumented binaries. In the -non-crashing mode, the minimizer relies on standard AFL instrumentation to make -the file simpler without altering the execution path. - -The minimizer accepts the -m, -t, -f and @@ syntax in a manner compatible with -afl-fuzz. - -Another recent addition to AFL is the afl-analyze tool. It takes an input -file, attempts to sequentially flip bytes, and observes the behavior of the -tested program. It then color-codes the input based on which sections appear to -be critical, and which are not; while not bulletproof, it can often offer quick -insights into complex file formats. More info about its operation can be found -near the end of technical_details.txt. - - -12) Going beyond crashes ------------------------- - -Fuzzing is a wonderful and underutilized technique for discovering non-crashing -design and implementation errors, too.
Quite a few interesting bugs have been -found by modifying the target programs to call abort() when, say: - - - Two bignum libraries produce different outputs when given the same - fuzzer-generated input, - - - An image library produces different outputs when asked to decode the same - input image several times in a row, - - - A serialization / deserialization library fails to produce stable outputs - when iteratively serializing and deserializing fuzzer-supplied data, - - - A compression library produces an output inconsistent with the input file - when asked to compress and then decompress a particular blob. - -Implementing these or similar sanity checks usually takes very little time; -if you are the maintainer of a particular package, you can make this code -conditional with #ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION (a flag also -shared with libfuzzer) or #ifdef __AFL_COMPILER (this one is just for AFL). - - -13) Common-sense risks ----------------------- - -Please keep in mind that, similarly to many other computationally-intensive -tasks, fuzzing may put strain on your hardware and on the OS. In particular: - - - Your CPU will run hot and will need adequate cooling. In most cases, if - cooling is insufficient or stops working properly, CPU speeds will be - automatically throttled. That said, especially when fuzzing on less - suitable hardware (laptops, smartphones, etc), it's not entirely impossible - for something to blow up. - - - Targeted programs may end up erratically grabbing gigabytes of memory or - filling up disk space with junk files. AFL tries to enforce basic memory - limits, but can't prevent each and every possible mishap. The bottom line - is that you shouldn't be fuzzing on systems where the prospect of data loss - is not an acceptable risk. - - - Fuzzing involves billions of reads and writes to the filesystem. 
On modern - systems, this will usually be heavily cached, resulting in fairly modest - "physical" I/O - but there are many factors that may alter this equation. - It is your responsibility to monitor for potential trouble; with very heavy - I/O, the lifespan of many HDDs and SSDs may be reduced. - - A good way to monitor disk I/O on Linux is the 'iostat' command: - - $ iostat -d 3 -x -k [...optional disk ID...] - - -14) Known limitations & areas for improvement ---------------------------------------------- - -Here are some of the most important caveats for AFL: - - - AFL detects faults by checking for the first spawned process dying due to - a signal (SIGSEGV, SIGABRT, etc). Programs that install custom handlers for - these signals may need to have the relevant code commented out. In the same - vein, faults in child processes spawned by the fuzzed target may evade - detection unless you manually add some code to catch that. - - - As with any other brute-force tool, the fuzzer offers limited coverage if - encryption, checksums, cryptographic signatures, or compression are used to - wholly wrap the actual data format to be tested. - - To work around this, you can comment out the relevant checks (see - experimental/libpng_no_checksum/ for inspiration); if this is not possible, - you can also write a postprocessor, as explained in - experimental/post_library/ (with AFL_POST_LIBRARY). - - - There are some unfortunate trade-offs with ASAN and 64-bit binaries. This - isn't due to any specific fault of afl-fuzz; see notes_for_asan.txt for - tips. - - - There is no direct support for fuzzing network services, background - daemons, or interactive apps that require UI interaction to work. You may - need to make simple code changes to make them behave in a more traditional - way.
Preeny may offer a relatively simple option, too - see: - https://github.com/zardus/preeny - - Some useful tips for modifying network-based services can also be found at: - https://www.fastly.com/blog/how-to-fuzz-server-american-fuzzy-lop - - - AFL doesn't output human-readable coverage data. If you want to monitor - coverage, use afl-cov from Michael Rash: https://github.com/mrash/afl-cov - - - Occasionally, sentient machines rise against their creators. If this - happens to you, please consult http://lcamtuf.coredump.cx/prep/. - -Beyond this, see INSTALL for platform-specific tips. - - -15) Special thanks ------------------- - -Many of the improvements to afl-fuzz wouldn't be possible without feedback, -bug reports, or patches from: - - Jann Horn Hanno Boeck - Felix Groebert Jakub Wilk - Richard W. M. Jones Alexander Cherepanov - Tom Ritter Hovik Manucharyan - Sebastian Roschke Eberhard Mattes - Padraig Brady Ben Laurie - @dronesec Luca Barbato - Tobias Ospelt Thomas Jarosch - Martin Carpenter Mudge Zatko - Joe Zbiciak Ryan Govostes - Michael Rash William Robinet - Jonathan Gray Filipe Cabecinhas - Nico Weber Jodie Cunningham - Andrew Griffiths Parker Thompson - Jonathan Neuschäfer Tyler Nighswander - Ben Nagy Samir Aguiar - Aidan Thornton Aleksandar Nikolich - Sam Hakim Laszlo Szekeres - David A. Wheeler Turo Lamminen - Andreas Stieger Richard Godbee - Louis Dassy teor2345 - Alex Moneger Dmitry Vyukov - Keegan McAllister Kostya Serebryany - Richo Healey Martijn Bogaard - rc0r Jonathan Foote - Christian Holler Dominique Pelle - Jacek Wielemborek Leo Barnes - Jeremy Barnes Jeff Trull - Guillaume Endignoux ilovezfs - Daniel Godas-Lopez Franjo Ivancic - Austin Seipp Daniel Komaromy - Daniel Binderman Jonathan Metzman - Vegard Nossum Jan Kneschke - Kurt Roeckx Marcel Bohme - Van-Thuan Pham Abhik Roychoudhury - Joshua J. Drake Toby Hutton - Rene Freingruber Sergey Davidoff - Sami Liedes Craig Young - Andrzej Jackowski Daniel Hodson - -Thank you!
- - -16) Contact ------------ - -Questions? Concerns? Bug reports? The contributors can be reached via -https://github.com/vanhauser-thc/AFLplusplus - -There is also a mailing list for the afl project; to join, send a mail to -. Or, if you prefer to browse -archives first, try: - - https://groups.google.com/group/afl-users diff --git a/llvm_mode/README.llvm b/llvm_mode/README.llvm index 00528a46..a0c40211 100644 --- a/llvm_mode/README.llvm +++ b/llvm_mode/README.llvm @@ -205,10 +205,8 @@ post-process the assembly or install any compiler plugins. See: http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards -As of this writing, the feature is only available on SVN trunk, and is yet to -make it to an official release of LLVM. Nevertheless, if you have a -sufficiently recent compiler and want to give it a try, build afl-clang-fast -this way: +If you have a sufficiently recent compiler and want to give it a try, build +afl-clang-fast this way: AFL_TRACE_PC=1 make clean all diff --git a/types.h b/types.h index 3497bb2b..7606d4ed 100644 --- a/types.h +++ b/types.h @@ -78,7 +78,7 @@ typedef int64_t s64; #define STRINGIFY(x) STRINGIFY_INTERNAL(x) #define MEM_BARRIER() \ - asm volatile("" ::: "memory") + __asm__ volatile("" ::: "memory") #if __GNUC__ < 6 #define likely(_x) (_x) -- cgit 1.4.1