Diffstat (limited to 'docs')
-rw-r--r-- docs/binaryonly_fuzzing.txt | 27
-rw-r--r-- docs/unicorn_mode.txt | 107
2 files changed, 118 insertions, 16 deletions
diff --git a/docs/binaryonly_fuzzing.txt b/docs/binaryonly_fuzzing.txt
index 0fb12b2b..04e449c0 100644
--- a/docs/binaryonly_fuzzing.txt
+++ b/docs/binaryonly_fuzzing.txt
@@ -12,7 +12,7 @@ The following is a description of how these can be fuzzed with afl++
 !!!!!
 TL;DR: try DYNINST with afl-dyninst. If it produces too many crashes then
-       use afl -Q qemu_mode, or better: use both in parallel
+       use afl -Q qemu_mode.
 !!!!!
@@ -27,6 +27,16 @@ It is the easiest to use alternative and even works for cross-platform binaries.
 As it is included in afl++ this needs no URL.
+UNICORN
+-------
+Unicorn is a fork of QEMU. The instrumentation is, therefore, very similar.
+In contrast to QEMU, Unicorn does not offer a full system or even userland emulation.
+Runtime environment and/or loaders have to be written from scratch, if needed.
+On top, block chaining has been removed. This means the speed boost introduced
+into the patched QEMU mode of afl++ cannot simply be ported over to Unicorn.
+For further information, check out ./unicorn_mode.txt.
+
+
 DYNINST
 -------
 Dyninst is a binary instrumentation framework similar to Pintool and Dynamorio
@@ -111,21 +121,6 @@ Pintool solutions:
  https://github.com/spinpx/afl_pin_mode <= only old Pintool version supported
-Non-AFL solutions
------------------
-
-There are many binary-only fuzzing frameworks. Some are great for CTFs but don't
-work with large binaries, other are very slow but have good path discovery,
-some are very hard to set-up ...
-
-QSYM: https://github.com/sslab-gatech/qsym
-Manticore: https://github.com/trailofbits/manticore
-S2E: https://github.com/S2E
-<please send me any missing that are good>
-
-
-
 That's it! News, corrections, updates? Email vh@thc.org
-
diff --git a/docs/unicorn_mode.txt b/docs/unicorn_mode.txt
new file mode 100644
index 00000000..ae6a2bde
--- /dev/null
+++ b/docs/unicorn_mode.txt
@@ -0,0 +1,107 @@
+=========================================================
+Unicorn-based binary-only instrumentation for afl-fuzz
+=========================================================
+
+1) Introduction
+---------------
+
+The code in ./unicorn_mode allows you to build a standalone feature that
+leverages the Unicorn Engine and allows callers to obtain instrumentation
+output for black-box, closed-source binary code snippets. This mechanism
+can then be used by afl-fuzz to stress-test targets that couldn't be built
+with afl-gcc or used in QEMU mode, or with other extensions such as
+TriforceAFL.
+
+There is a significant performance penalty compared to native AFL,
+but at least we're able to use AFL on these binaries, right?
+
+The idea and much of the implementation come from Nathan Voss <njvoss299@gmail.com>.
+
+2) How to use
+-------------
+
+*** Building AFL's Unicorn Mode ***
+
+First, make afl as usual.
+Once that completes successfully you need to build and add in the Unicorn Mode
+features:
+
+  $ cd unicorn_mode
+  $ ./build_unicorn_support.sh
+
+NOTE: This script downloads a recent Unicorn Engine commit that has been tested
+and is stable-ish from the Unicorn GitHub page. If you are offline, you'll need
+to hack up this script a little bit and supply your own copy of Unicorn's latest
+stable release. It's not very hard, just check out the beginning of the
+build_unicorn_support.sh script and adjust as necessary.
+
+Building Unicorn will take a little bit (~5-10 minutes). Once it completes
+it automatically compiles a sample application and verifies that it works.
+
+*** Fuzzing with Unicorn Mode ***
+
+To really use unicorn-mode effectively you need to prepare the following:
+
+  * Relevant binary code to be fuzzed
+  * Knowledge of the memory map and good starting state
+  * Folder containing sample inputs to start fuzzing with
+    - Same ideas as any other AFL inputs
+    - Quality/speed of results will depend greatly on quality of starting
+      samples
+    - See AFL's guidance on how to create a sample corpus
+  * Unicorn-based test harness which:
+    - Adds memory map regions
+    - Loads binary code into memory
+    - Emulates at least one instruction*
+      - Yeah, this is lame. See 'Gotchas' section below for more info
+    - Loads and verifies data to fuzz from a command-line specified file
+      - AFL will provide mutated inputs by changing the file passed to
+        the test harness
+      - Presumably the data to be fuzzed is at a fixed buffer address
+      - If input constraints (size, invalid bytes, etc.) are known they
+        should be checked after the file is loaded. If a constraint
+        fails, just exit the test harness. AFL will treat the input as
+        'uninteresting' and move on.
+    - Sets up registers and memory state for beginning of test
+    - Emulates the code of interest from beginning to end
+    - If a crash is detected, the test harness must 'crash' by
+      throwing a signal (SIGSEGV, SIGKILL, SIGABRT, etc.)
+
+Once you have all those things ready to go you just need to run afl-fuzz in
+'unicorn-mode' by passing in the '-U' flag:
+
+  $ afl-fuzz -U -m none -i /path/to/inputs -o /path/to/results -- ./test_harness @@
+
+The normal afl-fuzz command line format applies to everything here. Refer to
+AFL's main documentation for more info about how to use afl-fuzz effectively.
+
+For a much clearer vision of what all of this looks like, please refer to the
+sample provided in the 'unicorn_mode/samples' directory. There is also a blog
+post that goes over the basics at:
+
+https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf
+
+The 'helper_scripts' directory also contains several helper scripts that allow you
+to dump context from a running process, load it, and hook heap allocations. For details
+on how to use these, check out the follow-up blog post to the one linked above.
+
+An example use of AFL-Unicorn mode is discussed in the paper Unicorefuzz:
+https://www.usenix.org/conference/woot19/presentation/maier
+
+3) Gotchas, feedback, bugs
+--------------------------
+
+To make sure that AFL's fork server starts up correctly the Unicorn test
+harness script must emulate at least one instruction before loading the
+data that will be fuzzed from the input file. It doesn't matter what the
+instruction is, nor if it is valid. This is an artifact of how the fork-server
+is started and could likely be fixed with some clever re-arranging of the
+patches applied to Unicorn.
+
+Running the build script builds Unicorn and its Python bindings and installs
+them on your system. This installation will supersede any existing Unicorn
+installation with the patched afl-unicorn version.
+
+Refer to the unicorn_mode/samples/arm_example/arm_tester.c for an example
+of how to do this properly! If you don't get this right, AFL will not
+load any mutated inputs and your fuzzing will be useless!
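
Editor's sketch (not part of the commit): the harness requirements listed in docs/unicorn_mode.txt include loading the input file named on the command line, checking any known constraints, exiting quietly when a constraint fails, and dying from a signal when a crash is detected. The Python below is a minimal illustration of just that control flow, using only the standard library. The names MAX_INPUT_SIZE, load_input, force_crash and run_harness are hypothetical, and the emulate callback stands in for whatever Unicorn setup and emulation a real harness performs; it is assumed here to raise an exception when emulation faults.

```python
import os
import signal

# Hypothetical size limit, for illustration only. A real harness derives its
# constraints from the code being emulated.
MAX_INPUT_SIZE = 4096


def load_input(path):
    """Read the file afl-fuzz passes in; return None if a constraint fails."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) == 0 or len(data) > MAX_INPUT_SIZE:
        return None
    return data


def force_crash():
    """Make the harness die from a signal so afl-fuzz records a crash."""
    os.kill(os.getpid(), signal.SIGSEGV)


def run_harness(path, emulate, on_crash=force_crash):
    """Load and check the input, then emulate it.

    `emulate` is a placeholder for the real emulation step and is assumed to
    raise an exception on a fault. Returns False if the input was rejected.
    """
    data = load_input(path)
    if data is None:
        # Constraint failed: return normally so the input looks uninteresting.
        return False
    try:
        emulate(data)
    except Exception:
        on_crash()
    return True
```

A real harness would call run_harness with the path from its command line (the @@ argument in the afl-fuzz invocation) and exit normally afterwards. Passing on_crash as a parameter is only a convenience so the crash path can be exercised without killing the calling process.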