Diffstat (limited to 'docs/fuzzing_binary-only_targets.md')
-rw-r--r-- | docs/fuzzing_binary-only_targets.md | 289 |
1 files changed, 244 insertions, 45 deletions
diff --git a/docs/fuzzing_binary-only_targets.md b/docs/fuzzing_binary-only_targets.md
index ea262f6e..0b39042f 100644
--- a/docs/fuzzing_binary-only_targets.md
+++ b/docs/fuzzing_binary-only_targets.md
@@ -1,83 +1,282 @@
 # Fuzzing binary-only targets

-When source code is *NOT* available, AFL++ offers various support for fast,
-on-the-fly instrumentation of black-box binaries.
+AFL++, libfuzzer, and other fuzzers are great if you have the source code of the
+target. This allows for very fast and coverage guided fuzzing.

-If you do not have to use Unicorn the following setup is recommended to use
-qemu_mode:
-  * run 1 afl-fuzz -Q instance with CMPLOG (`-c 0` + `AFL_COMPCOV_LEVEL=2`)
-  * run 1 afl-fuzz -Q instance with QASAN (`AFL_USE_QASAN=1`)
-  * run 1 afl-fuzz -Q instance with LAF (`AFL_PRELOAD=libcmpcov.so` + `AFL_COMPCOV_LEVEL=2`)
-Alternatively you can use frida_mode, just switch `-Q` with `-O` and remove the
-LAF instance.
+However, if there is only the binary program and no source code available, then
+standard `afl-fuzz -n` (non-instrumented mode) is not effective.

-Then run as many instances as you have cores left with either -Q mode or - better -
-use a binary rewriter like afl-dyninst, retrowrite, zafl, etc.
+For fast, on-the-fly instrumentation of black-box binaries, AFL++ still offers
+various support. The following is a description of how these binaries can be
+fuzzed with AFL++.

-For Qemu and Frida mode, check out the persistent mode, it gives a huge speed
-improvement if it is possible to use.
+## TL;DR:
+
+Qemu_mode in persistent mode is the fastest - if the stability is high enough.
+Otherwise, try RetroWrite, Dyninst, and if these fail, too, then try standard
+qemu_mode with AFL_ENTRYPOINT to where you need it.
+
+If your target is a library, then use frida_mode.
+
+If your target is non-linux, then use unicorn_mode.
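The three-instance qemu_mode setup named in this file (CMPLOG, QASAN, LAF) can be sketched as a dry run that only prints the command lines, so it runs without AFL++ installed. The `-Q`, `-c 0`, `AFL_COMPCOV_LEVEL=2`, `AFL_USE_QASAN=1` and `AFL_PRELOAD=libcmpcov.so` settings come from the text. The paths `in`, `out` and `./target`, the instance names, and the use of `-S`, `-i`, `-o` and `@@` are assumptions made for illustration.

```shell
# Dry run: print the three recommended afl-fuzz -Q command lines without
# executing them. IN, OUT and TARGET are hypothetical placeholders.
IN="${IN:-in}"
OUT="${OUT:-out}"
TARGET="${TARGET:-./target}"

print_qemu_setup() {
    # Options shared by all instances (assumed, not taken from the text).
    common="-i $IN -o $OUT"
    # 1) CMPLOG instance: -c 0 plus AFL_COMPCOV_LEVEL=2
    printf '%s\n' "AFL_COMPCOV_LEVEL=2 afl-fuzz -Q -c 0 -S cmplog $common -- $TARGET @@"
    # 2) QASAN instance: AFL_USE_QASAN=1
    printf '%s\n' "AFL_USE_QASAN=1 afl-fuzz -Q -S qasan $common -- $TARGET @@"
    # 3) LAF instance: preload libcmpcov.so plus AFL_COMPCOV_LEVEL=2
    printf '%s\n' "AFL_PRELOAD=libcmpcov.so AFL_COMPCOV_LEVEL=2 afl-fuzz -Q -S laf $common -- $TARGET @@"
}

print_qemu_setup
```

Each instance is shown as a secondary (`-S`); how the main instance is configured is outside this sketch. For frida_mode, the text says to switch `-Q` with `-O` and drop the LAF line.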
-### QEMU
+## Fuzzing binary-only targets with AFL++
+### Qemu_mode

-For linux programs and its libraries this is accomplished with a version of
-QEMU running in the lesser-known "user space emulation" mode.
-QEMU is a project separate from AFL, but you can conveniently build the
-feature by doing:
+Qemu_mode is the "native" solution to the program. It is available in the
+./qemu_mode/ directory and, once compiled, it can be accessed by the afl-fuzz -Q
+command line option. It is the easiest to use alternative and even works for
+cross-platform binaries.
+
+For linux programs and its libraries, this is accomplished with a version of
+QEMU running in the lesser-known "user space emulation" mode. QEMU is a project
+separate from AFL++, but you can conveniently build the feature by doing:

 ```shell
 cd qemu_mode
 ./build_qemu_support.sh
 ```

-For additional instructions and caveats, see [qemu_mode/README.md](../qemu_mode/README.md).
-If possible you should use the persistent mode, see [qemu_mode/README.persistent.md](../qemu_mode/README.persistent.md).
-The mode is approximately 2-5x slower than compile-time instrumentation, and is
-less conducive to parallelization.
+The following setup to use qemu_mode is recommended:
+* run 1 afl-fuzz -Q instance with CMPLOG (`-c 0` + `AFL_COMPCOV_LEVEL=2`)
+* run 1 afl-fuzz -Q instance with QASAN (`AFL_USE_QASAN=1`)
+* run 1 afl-fuzz -Q instance with LAF (`AFL_PRELOAD=libcmpcov.so` +
+  `AFL_COMPCOV_LEVEL=2`), alternatively you can use frida_mode, just switch `-Q`
+  with `-O` and remove the LAF instance
+
+Then run as many instances as you have cores left with either -Q mode or - even
+better - use a binary rewriter like Dyninst, RetroWrite, ZAFL, etc.
+
+If [afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst) works for your
+binary, then you can use afl-fuzz normally and it will have twice the speed
+compared to qemu_mode (but slower than qemu persistent mode). Note that several
+other binary rewriters exist, all with their advantages and caveats.
+
+The speed decrease of qemu_mode is at about 50%. However, various options exist
+to increase the speed:
+- using AFL_ENTRYPOINT to move the forkserver entry to a later basic block in
+  the binary (+5-10% speed)
+- using persistent mode
+  [qemu_mode/README.persistent.md](../qemu_mode/README.persistent.md) this will
+  result in a 150-300% overall speed increase - so 3-8x the original qemu_mode
+  speed!
+- using AFL_CODE_START/AFL_CODE_END to only instrument specific parts
+
+For additional instructions and caveats, see
+[qemu_mode/README.md](../qemu_mode/README.md). If possible, you should use the
+persistent mode, see
+[qemu_mode/README.persistent.md](../qemu_mode/README.persistent.md). The mode is
+approximately 2-5x slower than compile-time instrumentation, and is less
+conducive to parallelization.
+
+Note that there is also honggfuzz:
+[https://github.com/google/honggfuzz](https://github.com/google/honggfuzz) which
+now has a qemu_mode, but its performance is just 1.5% ...
+
+If you like to code a customized fuzzer without much work, we highly recommend
+to check out our sister project libafl which will support QEMU, too:
+[https://github.com/AFLplusplus/LibAFL](https://github.com/AFLplusplus/LibAFL)
+
+### WINE+QEMU
+
+Wine mode can run Win32 PE binaries with the QEMU instrumentation. It needs
+Wine, python3, and the pefile python package installed.
+
+It is included in AFL++.

-If [afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst) works for
-your binary, then you can use afl-fuzz normally and it will have twice
-the speed compared to qemu_mode (but slower than qemu persistent mode).
-Note that several other binary rewriters exist, all with their advantages and
-caveats.
+### Frida_mode

-### Frida
+In frida_mode, you can fuzz binary-only targets as easily as with QEMU.
+Frida_mode is sometimes faster and sometimes slower than Qemu_mode. It is also
+newer, lacks COMPCOV, and has the advantage that it works on MacOS (both intel
+and M1).

-Frida mode is sometimes faster and sometimes slower than Qemu mode.
-It is also newer, lacks COMPCOV, but supports MacOS.
+To build frida_mode:

 ```shell
 cd frida_mode
 make
 ```

-For additional instructions and caveats, see [frida_mode/README.md](../frida_mode/README.md).
-If possible you should use the persistent mode, see [qemu_frida/README.md](../qemu_frida/README.md).
-The mode is approximately 2-5x slower than compile-time instrumentation, and is
-less conducive to parallelization.
+For additional instructions and caveats, see
+[frida_mode/README.md](../frida_mode/README.md). If possible, you should use the
+persistent mode, see [qemu_frida/README.md](../qemu_frida/README.md). The mode
+is approximately 2-5x slower than compile-time instrumentation, and is less
+conducive to parallelization. But for binary-only fuzzing, it gives a huge speed
+improvement if it is possible to use.
+
+If you want to fuzz a binary-only library, then you can fuzz it with frida-gum
+via frida_mode/. You will have to write a harness to call the target function in
+the library, use afl-frida.c as a template.
+
+You can also perform remote fuzzing with frida, e.g. if you want to fuzz on
+iPhone or Android devices, for this you can use
+[https://github.com/ttdennis/fpicker/](https://github.com/ttdennis/fpicker/) as
+an intermediate that uses AFL++ for fuzzing.
+
+If you like to code a customized fuzzer without much work, we highly recommend
+to check out our sister project libafl which supports Frida, too:
+[https://github.com/AFLplusplus/LibAFL](https://github.com/AFLplusplus/LibAFL).
+Working examples already exist :-)

 ### Unicorn

-For non-Linux binaries you can use AFL++'s unicorn mode which can emulate
+Unicorn is a fork of QEMU. The instrumentation is, therefore, very similar. In
+contrast to QEMU, Unicorn does not offer a full system or even userland
+emulation. Runtime environment and/or loaders have to be written from scratch,
+if needed. On top, block chaining has been removed. This means the speed boost
+introduced in the patched QEMU Mode of AFL++ cannot simply be ported over to
+Unicorn.
+
+For non-Linux binaries, you can use AFL++'s unicorn_mode which can emulate
 anything you want - for the price of speed and user written scripts.
-See [unicorn_mode/README.md](../unicorn_mode/README.md).
-It can be easily built by:
+To build unicorn_mode:
+
 ```shell
 cd unicorn_mode
 ./build_unicorn_support.sh
 ```

+For further information, check out
+[unicorn_mode/README.md](../unicorn_mode/README.md).
+
 ### Shared libraries

-If the goal is to fuzz a dynamic library then there are two options available.
-For both you need to write a small harness that loads and calls the library.
-Then you fuzz this with either frida_mode or qemu_mode, and either use
+If the goal is to fuzz a dynamic library, then there are two options available.
+For both, you need to write a small harness that loads and calls the library.
+Then you fuzz this with either frida_mode or qemu_mode and either use
 `AFL_INST_LIBS=1` or `AFL_QEMU/FRIDA_INST_RANGES`.

-Another, less precise and slower option is using ptrace with debugger interrupt
-instrumentation: [utils/afl_untracer/README.md](../utils/afl_untracer/README.md).
+Another, less precise and slower option is to fuzz it with utils/afl_untracer/
+and use afl-untracer.c as a template. It is slower than frida_mode.
+
+For more information, see
+[utils/afl_untracer/README.md](../utils/afl_untracer/README.md).
+
+## Binary rewriters
+
+### Coresight
+
+Coresight is ARM's answer to Intel's PT. With AFL++ v3.15, there is a coresight
+tracer implementation available in `coresight_mode/` which is faster than QEMU,
+however, cannot run in parallel. Currently, only one process can be traced, it
+is WIP.
+
+### Dyninst
+
+Dyninst is a binary instrumentation framework similar to Pintool and DynamoRIO.
+However, whereas Pintool and DynamoRIO work at runtime, Dyninst instruments the
+target at load time and then let it run - or save the binary with the changes.
+This is great for some things, e.g. fuzzing, and not so effective for others,
+e.g. malware analysis.
+
+So, what we can do with Dyninst is taking every basic block and put AFL++'s
+instrumentation code in there - and then save the binary. Afterwards, we can
+just fuzz the newly saved target binary with afl-fuzz. Sounds great? It is. The
+issue though - it is a non-trivial problem to insert instructions, which change
+addresses in the process space, so that everything is still working afterwards.
+Hence, more often than not binaries crash when they are run.
+
+The speed decrease is about 15-35%, depending on the optimization options used
+with afl-dyninst.
+
+[https://github.com/vanhauser-thc/afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst)
+
+### Intel PT
+
+If you have a newer Intel CPU, you can make use of Intel's processor trace. The
+big issue with Intel's PT is the small buffer size and the complex encoding of
+the debug information collected through PT. This makes the decoding very CPU
+intensive and hence slow. As a result, the overall speed decrease is about
+70-90% (depending on the implementation and other factors).
+
+There are two AFL intel-pt implementations:
+
+1. [https://github.com/junxzm1990/afl-pt](https://github.com/junxzm1990/afl-pt)
+   => This needs Ubuntu 14.04.05 without any updates and the 4.4 kernel.
+
+2. [https://github.com/hunter-ht-2018/ptfuzzer](https://github.com/hunter-ht-2018/ptfuzzer)
+   => This needs a 4.14 or 4.15 kernel. The "nopti" kernel boot option must be
+   used. This one is faster than the other.
+
+Note that there is also honggfuzz:
+[https://github.com/google/honggfuzz](https://github.com/google/honggfuzz). But
+its IPT performance is just 6%!
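To make the percentages concrete, the following sketch converts a "speed decrease" into remaining throughput. The baseline of 1000 executions per second and the helper name `remaining_speed` are hypothetical and chosen only for the arithmetic. The 70% and 90% figures are the Intel PT range quoted above.

```shell
# remaining_speed BASE PCT: executions per second left after a PCT% slowdown.
# Integer arithmetic only; BASE is a hypothetical native exec/s figure.
remaining_speed() {
    echo $(( $1 * (100 - $2) / 100 ))
}

BASE=1000
echo "Intel PT at 70% decrease: $(remaining_speed "$BASE" 70) exec/s"
echo "Intel PT at 90% decrease: $(remaining_speed "$BASE" 90) exec/s"
```

The same helper applies to the other figures in this file, for example the 15-35% decrease quoted for afl-dyninst.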
+
+### Mcsema
+
+Theoretically, you can also decompile to llvm IR with mcsema, and then use
+llvm_mode to instrument the binary. Good luck with that.
+
+[https://github.com/lifting-bits/mcsema](https://github.com/lifting-bits/mcsema)
+
+### Pintool & DynamoRIO
+
+Pintool and DynamoRIO are dynamic instrumentation engines. They can be used for
+getting basic block information at runtime. Pintool is only available for Intel
+x32/x64 on Linux, Mac OS, and Windows, whereas DynamoRIO is additionally
+available for ARM and AARCH64. DynamoRIO is also 10x faster than Pintool.
+
+The big issue with DynamoRIO (and therefore Pintool, too) is speed. DynamoRIO
+has a speed decrease of 98-99%, Pintool has a speed decrease of 99.5%.
+
+Hence, DynamoRIO is the option to go for if everything else fails and Pintool
+only if DynamoRIO fails, too.
+
+DynamoRIO solutions:
+* [https://github.com/vanhauser-thc/afl-dynamorio](https://github.com/vanhauser-thc/afl-dynamorio)
+* [https://github.com/mxmssh/drAFL](https://github.com/mxmssh/drAFL)
+* [https://github.com/googleprojectzero/winafl/](https://github.com/googleprojectzero/winafl/)
+  <= very good but windows only
+
+Pintool solutions:
+* [https://github.com/vanhauser-thc/afl-pin](https://github.com/vanhauser-thc/afl-pin)
+* [https://github.com/mothran/aflpin](https://github.com/mothran/aflpin)
+* [https://github.com/spinpx/afl_pin_mode](https://github.com/spinpx/afl_pin_mode)
+  <= only old Pintool version supported
+
+### RetroWrite
+
+If you have an x86/x86_64 binary that still has its symbols, is compiled with
+position independent code (PIC/PIE), and does not use most of the C++ features,
+then the RetroWrite solution might be for you. It decompiles to ASM files which
+can then be instrumented with afl-gcc.
+
+It is at about 80-85% performance.
+
+[https://github.com/HexHive/retrowrite](https://github.com/HexHive/retrowrite)
+
+### ZAFL
+ZAFL is a static rewriting platform supporting x86-64 C/C++,
+stripped/unstripped, and PIE/non-PIE binaries. Beyond conventional
+instrumentation, ZAFL's API enables transformation passes (e.g., laf-Intel,
+context sensitivity, InsTrim, etc.).
+
+Its baseline instrumentation speed typically averages 90-95% of
+afl-clang-fast's.
+
+[https://git.zephyr-software.com/opensrc/zafl](https://git.zephyr-software.com/opensrc/zafl)
+
+## Non-AFL++ solutions
+
+There are many binary-only fuzzing frameworks. Some are great for CTFs but don't
+work with large binaries, others are very slow but have good path discovery,
+some are very hard to set-up...
+
+
+* Jackalope:
+  [https://github.com/googleprojectzero/Jackalope](https://github.com/googleprojectzero/Jackalope)
+* Manticore:
+  [https://github.com/trailofbits/manticore](https://github.com/trailofbits/manticore)
+* QSYM:
+  [https://github.com/sslab-gatech/qsym](https://github.com/sslab-gatech/qsym)
+* S2E: [https://github.com/S2E](https://github.com/S2E)
+* TinyInst:
+  [https://github.com/googleprojectzero/TinyInst](https://github.com/googleprojectzero/TinyInst)
+  (Mac/Windows only)
+* ... please send me any missing that are good

-### More
+## Closing words

-A more comprehensive description of these and other options can be found in
-[binaryonly_fuzzing.md](binaryonly_fuzzing.md).
\ No newline at end of file
+That's it! News, corrections, updates? Send an email to vh@thc.org.
\ No newline at end of file