Merge branch 'master-upstream' into custom_mutator_docs

# Conflicts: # afl-fuzz.c
author: Khaled Yakdan <yakdan@code-intelligence.de> 2019-09-04 23:20:18 +0200
committer: Khaled Yakdan <yakdan@code-intelligence.de> 2019-09-04 23:20:18 +0200
commit: b31dff6beec6a7aa17da6f7f8a2eef198c263ccc (patch)
tree: c039aeed3572b171c2b7108cd650a0ee53c1b0f6 /unicorn_mode/README.md
parent: 1b3f9713309d27c49b153f9b3af12d208076e93c (diff)
parent: abf61ecc8f1b4ea3de59f818d859139637b29f32 (diff)
download: afl++-b31dff6beec6a7aa17da6f7f8a2eef198c263ccc.tar.gz
1 files changed, 113 insertions, 17 deletions
diff --git a/unicorn_mode/README.md b/unicorn_mode/README.md
index 9ee975ef..ea3e3c9b 100644
--- a/unicorn_mode/README.md
+++ b/unicorn_mode/README.md
@@ -1,23 +1,119 @@
-```
-        __ _                 _                      
-  __ _ / _| |    _   _ _ __ (_) ___ ___  _ __ _ __  
- / _` | |_| |___| | | | '_ \| |/ __/ _ \| '__| '_ \ 
-| (_| |  _| |___| |_| | | | | | (_| (_) | |  | | | |
- \__,_|_| |_|    \__,_|_| |_|_|\___\___/|_|  |_| |_|
-                                                      
-```
+# Unicorn-based binary-only instrumentation for afl-fuzz
 
-afl-unicorn lets you fuzz any piece of binary that can be emulated by
-[Unicorn Engine](http://www.unicorn-engine.org/). 
+The idea and much of the original implementation come from Nathan Voss <njvoss299@gmail.com>.
 
-Requirements: Python2
+The port to afl++ is by Dominik Maier <mail@dmnk.co>.
 
-For the full readme please see docs/unicorn_mode.txt
+The CompareCoverage and NeverZero counter features are by Andrea Fioraldi <andreafioraldi@gmail.com>.
 
-For an in-depth description of what this is, how to install it, and how to use
-it check out this [blog post](https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf).
+## 1) Introduction
 
-For general help with AFL, please refer to the documents in the ./docs/ directory.
+The code in ./unicorn_mode allows you to build a standalone feature that
+leverages the Unicorn Engine and allows callers to obtain instrumentation 
+output for black-box, closed-source binary code snippets. This mechanism 
+can then be used by afl-fuzz to stress-test targets that couldn't be built 
+with afl-gcc or used in QEMU mode, or with other extensions such as 
+TriforceAFL.
 
-Created by Nathan Voss, originally funded by
-[Battelle](https://www.battelle.org/cyber).
+There is a significant performance penalty compared to native AFL,
+but at least we're able to use AFL on these binaries, right?
+
+## 2) How to use
+
+Requirements: you need an installed Python 2 environment.
+
+### Building AFL's Unicorn Mode
+
+First, make afl++ as usual.
+Once that completes successfully you need to build and add in the Unicorn Mode 
+features:
+
+  $ cd unicorn_mode
+  $ ./build_unicorn_support.sh
+
+NOTE: This script downloads a Unicorn Engine commit that has been tested 
+and is stable-ish from the Unicorn github page. If you are offline, you'll need 
+to hack up this script a little bit and supply your own copy of Unicorn's latest 
+stable release. It's not very hard, just check out the beginning of the 
+build_unicorn_support.sh script and adjust as necessary.
+
+Building Unicorn will take a little while (~5-10 minutes). Once it completes, 
+it automatically compiles a sample application and verifies that it works.
+
+### Fuzzing with Unicorn Mode
+
+To really use unicorn-mode effectively you need to prepare the following:
+
+	* Relevant binary code to be fuzzed
+	* Knowledge of the memory map and good starting state
+	* Folder containing sample inputs to start fuzzing with
+		+ Same ideas as any other AFL inputs
+		+ Quality/speed of results will depend greatly on quality of starting 
+		  samples
+		+ See AFL's guidance on how to create a sample corpus
+	* Unicorn-based test harness which:
+		+ Adds memory map regions
+		+ Loads binary code into memory		
+		+ Emulates at least one instruction*
+			+ Yeah, this is lame. See 'Gotchas' section below for more info		
+		+ Loads and verifies data to fuzz from a command-line specified file
+			+ AFL will provide mutated inputs by changing the file passed to 
+			  the test harness
+			+ Presumably the data to be fuzzed is at a fixed buffer address
+			+ If input constraints (size, invalid bytes, etc.) are known they 
+			  should be checked after the file is loaded. If a constraint 
+			  fails, just exit the test harness. AFL will treat the input as 
+			  'uninteresting' and move on.
+		+ Sets up registers and memory state for beginning of test
+		+ Emulates the code of interest from beginning to end
+		+ If a crash is detected, the test harness must 'crash' by 
+		  raising a signal (SIGSEGV, SIGKILL, SIGABRT, etc.)
+
+Once you have all those things ready to go you just need to run afl-fuzz in
+'unicorn-mode' by passing in the '-U' flag:
+
+	$ afl-fuzz -U -m none -i /path/to/inputs -o /path/to/results -- ./test_harness @@
+
+The normal afl-fuzz command line format applies to everything here. Refer to
+AFL's main documentation for more info about how to use afl-fuzz effectively.
+
+For a much clearer vision of what all of this looks like, please refer to the
+sample provided in the 'unicorn_mode/samples' directory. There is also a blog
+post that goes over the basics at:
+
+https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf
+
+The 'helper_scripts' directory also contains several helper scripts that allow you 
+to dump context from a running process, load it, and hook heap allocations. For details
+on how to use these scripts, check out the follow-up blog post to the one linked above.
+
+An example use of AFL-Unicorn mode is discussed in the paper Unicorefuzz:
+https://www.usenix.org/conference/woot19/presentation/maier
+
+## 3) Options
+
+As with the QEMU-based instrumentation, the afl-unicorn twist of afl++
+comes with a sub-instruction based instrumentation similar in purpose to laf-intel.
+
+The options that enable Unicorn CompareCoverage are the same as those used for QEMU.
+AFL_COMPCOV_LEVEL=1 instruments comparisons with only immediate
+values. AFL_COMPCOV_LEVEL=2 instruments all
+comparison instructions. Comparison instructions are currently instrumented only
+on the x86 and x86_64 targets.
+
+## 4) Gotchas, feedback, bugs
+
+To make sure that AFL's fork server starts up correctly the Unicorn test 
+harness script must emulate at least one instruction before loading the
+data that will be fuzzed from the input file. It doesn't matter what the
+instruction is, nor if it is valid. This is an artifact of how the fork-server
+is started and could likely be fixed with some clever re-arranging of the
+patches applied to Unicorn.
+
+Running the build script builds Unicorn and its python bindings and installs 
+them on your system. This installation will supersede any existing Unicorn
+installation with the patched afl-unicorn version.
+
+Refer to unicorn_mode/samples/arm_example/arm_tester.c for an example
+of how to emulate an instruction before loading the input file. If you don't
+get this right, AFL will not load any mutated inputs and your fuzzing will be useless!
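
Note on the harness requirements described in the README above: the input-handling and crash-signalling steps can be sketched roughly as follows. This is a minimal illustration and not the code from unicorn_mode/samples. It uses only the Python standard library and leaves out the Unicorn calls entirely. The names `MAX_INPUT_SIZE`, `load_input`, `force_crash`, and `run_case`, the 4096-byte limit, and the "reject empty input" rule are all assumptions made for this example. The `emulate` argument stands in for whatever routine maps memory, sets registers, and runs the target; a real harness must also emulate at least one instruction before `load_input` is called.

```python
import os
import signal

# Hypothetical size limit, for illustration only; use your target's real constraints.
MAX_INPUT_SIZE = 4096


def load_input(path, max_size=MAX_INPUT_SIZE):
    """Read the file afl-fuzz passed in; return None if a known constraint fails."""
    with open(path, "rb") as f:
        data = f.read()
    # Assumed constraints: input must be non-empty and fit in the fuzzed buffer.
    if len(data) == 0 or len(data) > max_size:
        return None
    return data


def force_crash():
    """Terminate the harness with a signal so afl-fuzz records the case as a crash."""
    os.kill(os.getpid(), signal.SIGSEGV)


def run_case(path, emulate):
    """Run one fuzz case.

    `emulate` is a caller-supplied callable (hypothetical) that takes the input
    bytes, runs the emulated code, and returns False if the emulation crashed.
    """
    data = load_input(path)
    if data is None:
        # Constraint failed: return normally so AFL treats the input as uninteresting.
        return "skipped"
    if not emulate(data):
        force_crash()
    return "ok"
```

In a real harness, `run_case` would be called with the path that afl-fuzz substitutes for `@@`, and the process would exit with status 0 when the case is skipped or completes without a crash.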