4 files changed, 220 insertions, 212 deletions
diff --git a/docs/custom_mutator.md b/docs/custom_mutator.md
deleted file mode 100644
index dff32c1d..00000000
--- a/docs/custom_mutator.md
+++ /dev/null
@@ -1,45 +0,0 @@
-# Adding custom mutators to AFL
-
-This file describes how you can implement custom mutations to be used in AFL.
-
-Implemented by Khaled Yakdan from Code Intelligence <yakdan@code-intelligence.de>
-
-## 1) Description
-
-Custom mutator libraries can be passed to afl-fuzz to perform custom mutations
-on test cases beyond those available in AFL - for example, to enable
-structure-aware fuzzing by using libraries that perform mutations according to
-a given grammar.
-
-The custom mutator library is passed to afl-fuzz via the
-AFL_CUSTOM_MUTATOR_LIBRARY environment variable. The library must export
-the afl_custom_mutator() function and must be compiled as a shared object.
-For example:
-```
-$CC -shared -Wall -O3 <lib-name>.c -o <lib-name>.so
-```
-Note: unless AFL_CUSTOM_MUTATOR_ONLY is set, it is a state mutator like any
-other, so it will be used for some test cases, and other mutators for others.
-
-Only if AFL_CUSTOM_MUTATOR_ONLY is set the afl_custom_mutator() function will
-be called every time it needs to mutate a test case.
-
-For some cases, the format of the mutated data returned from the custom
-mutator is not suitable to directly execute the target with this input.
-For example, when using libprotobuf-mutator, the data returned is in a
-protobuf format which corresponds to a given grammar.
-In order to execute the target, the protobuf data must be converted to the
-plain-text format expected by the target.
-In such scenarios, the user can define the afl_pre_save_handler() function.
-This function is then transforms the data into the format expected by the
-API before executing the target.
-afl_pre_save_handler is optional and does not have to be implemented if its
-functionality is not needed.
-
-## 2) Example
-
-A simple example is provided in ../examples/custom_mutators/
-
-There is also a libprotobuf example available at [https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator](https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator)
-Another implementation can be found at [https://github.com/thebabush/afl-libprotobuf-mutator](https://github.com/thebabush/afl-libprotobuf-mutator)
-
diff --git a/docs/custom_mutators.md b/docs/custom_mutators.md
new file mode 100644
index 00000000..4deb07e1
--- /dev/null
+++ b/docs/custom_mutators.md
@@ -0,0 +1,201 @@
+# Custom Mutators in AFL++
+
+This file describes how you can implement custom mutations to be used in AFL.
+For now, we support C/C++ libraries and Python modules, collectively referred to
+as custom mutators.
+
+Implemented by
+- C/C++ library (`*.so`): Khaled Yakdan from Code Intelligence (<yakdan@code-intelligence.de>)
+- Python module: Christian Holler from Mozilla (<choller@mozilla.com>)
+
+## 1) Introduction
+
+Custom mutators can be passed to `afl-fuzz` to perform custom mutations on test
+cases beyond those available in AFL, for example to enable structure-aware
+fuzzing by using libraries that perform mutations according to a given grammar.
+
+The custom mutator is passed to `afl-fuzz` via the `AFL_CUSTOM_MUTATOR_LIBRARY`
+or `AFL_PYTHON_MODULE` environment variable, and must export a fuzz function.
+Please see [APIs](#2-apis) and [Usage](#3-usage) for details.
+
+The custom mutation stage is set to be the first non-deterministic stage (right before the havoc stage).
+
+Note: If `AFL_CUSTOM_MUTATOR_ONLY` is set, all mutations will solely be
+performed with the custom mutator.
+
+## 2) APIs
+
+C/C++:
+```c
+void afl_custom_init(unsigned int seed);
+size_t afl_custom_fuzz(u8* buf, size_t buf_size,
+                       u8* add_buf, size_t add_buf_size,
+                       u8* mutated_out, size_t max_size);
+size_t afl_custom_pre_save(u8* buf, size_t buf_size, u8** out_buf);
+u32 afl_custom_init_trim(u8* buf, size_t buf_size);
+void afl_custom_trim(u8** out_buf, size_t* out_buf_size);
+u32 afl_custom_post_trim(u8 success);
+```
+
+Python:
+```python
+def init(seed):
+    pass
+
+def fuzz(buf, add_buf, max_size):
+    return mutated_out
+
+def pre_save(buf):
+    return out_buf
+
+def init_trim(buf):
+    return cnt
+
+def trim():
+    return out_buf
+
+def post_trim(success):
+    return next_index
+```
+
+### Custom Mutation
+
+- `init` (optional):
+
+    This method is called when AFL++ starts up and is used to seed the RNG.
+
+- `fuzz` (required):
+
+    This method performs custom mutations on a given input. It also accepts an
+    additional test case.
+
+- `pre_save` (optional):
+
+    For some cases, the format of the mutated data returned from the custom
+    mutator is not suitable to directly execute the target with this input.
+    For example, when using libprotobuf-mutator, the data returned is in a
+    protobuf format which corresponds to a given grammar. In order to execute
+    the target, the protobuf data must be converted to the plain-text format expected by the target. In such scenarios, the user can define the
+    `pre_save` function. This function then transforms the data into the
+    format expected by the API before executing the target.
+
+
+### Trimming Support
+
+The generic trimming routines implemented in AFL++ can easily destroy the
+structure of complex formats, possibly leading to a point where you have a lot
+of test cases in the queue that your Python module cannot process anymore but
+your target application still accepts. This is especially the case when your
+target can process a part of the input (causing coverage) and then errors out
+on the remaining input.
+
+In such cases, it makes sense to implement a custom trimming routine. The API
+consists of multiple methods because after each trimming step, we have to go
+back into the C code to check if the coverage bitmap is still the same for the
+trimmed input. Here's a quick API description:
+
+- `init_trim` (optional):
+
+    This method is called at the start of each trimming operation and receives
+    the initial buffer. It should return the amount of iteration steps possible
+    on this input (e.g. if your input has n elements and you want to remove them
+    one by one, return n, if you do a binary search, return log(n), and so on).
+
+    If your trimming algorithm doesn't allow you to determine the amount of
+    (remaining) steps easily (esp. while running), then you can alternatively
+    return 1 here and always return 0 in `post_trim` until you are finished and
+    no steps remain. In that case, returning 1 in `post_trim` will end the
+    trimming routine. The whole current index/max iterations stuff is only used
+    to show progress.
+
+- `trim` (optional):
+
+    This method is called for each trimming operation. It doesn't have any
+    arguments because we already have the initial buffer from `init_trim` and we
+    can memorize the current state in global variables. This can also save
+    reparsing steps for each iteration. It should return the trimmed input
+    buffer, where the returned data must not exceed the initial input data in
+    length. Returning anything that is larger than the original data (passed to
+    `init_trim`) will result in a fatal abort of AFL++.
+
+- `post_trim` (optional):
+
+    This method is called after each trim operation to inform you if your
+    trimming step was successful or not (in terms of coverage). If you receive
+    a failure here, you should reset your input to the last known good state.
+    In any case, this method must return the next trim iteration index (from 0
+    to the maximum amount of steps you returned in `init_trim`).
+
+Omitting any of the three methods will cause trimming to be disabled and trigger
+a fallback to the builtin default trimming routine.
+
+### Environment Variables
+
+Optionally, the following environment variables are supported:
+
+- `AFL_PYTHON_ONLY`
+
+    Disable all other mutation stages. This can prevent broken test cases
+    (those that your Python module can't work with anymore) from filling up your
+    queue. Best combined with a custom trimming routine (see above) because
+    trimming can cause the same test breakage as havoc and splice.
+
+- `AFL_DEBUG`
+
+    When combined with `AFL_NO_UI`, this causes the C trimming code to emit additional messages about the performance and actions of your custom trimmer. Use this to see if it works :)
+
+## 3) Usage
+
+### Prerequisite
+
+For Python mutators, the Python 3 or 2 development package is required. On
+Debian/Ubuntu/Kali it can be installed with:
+
+```bash
+sudo apt install python3-dev
+# or
+sudo apt install python-dev
+```
+
+Then, AFL++ can be compiled with Python support. The AFL++ Makefile detects
+Python 2 and 3 through `python-config` if it is in the PATH and compiles
+`afl-fuzz` with the feature if available.
+
+Note: for some distributions, you might also need the package `python[23]-apt`.
+In case your setup is different, set the necessary variables like this:
+`PYTHON_INCLUDE=/path/to/python/include LDFLAGS=-L/path/to/python/lib make`.
+
+### Custom Mutator Preparation
+
+For a C/C++ mutator, the source code must be compiled as a shared object:
+```bash
+gcc -shared -Wall -O3 example.c -o example.so
+```
+
+### Run
+
+C/C++
+```bash
+export AFL_CUSTOM_MUTATOR_LIBRARY=/full/path/to/example.so
+afl-fuzz /path/to/program
+```
+
+Python
+```bash
+export PYTHONPATH=`dirname /full/path/to/example.py`
+export AFL_PYTHON_MODULE=example
+afl-fuzz /path/to/program
+```
+
+## 4) Example
+
+Please see [example.c](../examples/custom_mutators/example.c) and
+[example.py](../examples/custom_mutators/example.py)
+
+## 5) Other Resources
+
+- AFL libprotobuf mutator
+    - [bruce30262/libprotobuf-mutator_fuzzing_learning](https://github.com/bruce30262/libprotobuf-mutator_fuzzing_learning/tree/master/4_libprotobuf_aflpp_custom_mutator)
+    - [thebabush/afl-libprotobuf-mutator](https://github.com/thebabush/afl-libprotobuf-mutator)
+- [XML Fuzzing@NullCon 2017](https://www.agarri.fr/docs/XML_Fuzzing-NullCon2017-PUBLIC.pdf)
+    - [A bug detected by AFL + XML-aware mutators](https://bugs.chromium.org/p/chromium/issues/detail?id=930663)
diff --git a/docs/env_variables.md b/docs/env_variables.md
index 527f1c1b..d1cf6977 100644
--- a/docs/env_variables.md
+++ b/docs/env_variables.md
@@ -104,7 +104,7 @@ Then there are a few specific features that are only available in llvm_mode:
     - Setting AFL_LLVM_LAF_SPLIT_COMPARES will split all floating point and
       64, 32 and 16 bit integer CMP instructions
 
-    See llvm_mode/README.laf-intel.md for more information. 
+    See llvm_mode/README.laf-intel.md for more information.
 
 ### WHITELIST
 
@@ -192,7 +192,7 @@ checks or alter some of the more exotic semantics of the tool:
     deciding if a particular test case is a "hang". The default is 1 second
     or the value of the -t parameter, whichever is larger. Dialing the value
     down can be useful if you are very concerned about slow inputs, or if you
-    don't want AFL to spend too much time classifying that stuff and just 
+    don't want AFL to spend too much time classifying that stuff and just
     rapidly put all timeouts in that bin.
 
   - AFL_NO_ARITH causes AFL to skip most of the deterministic arithmetics.
@@ -223,15 +223,15 @@ checks or alter some of the more exotic semantics of the tool:
     for more.
 
   - Setting AFL_CUSTOM_MUTATOR_LIBRARY to a shared library with
-    afl_custom_mutator() creates additional mutations through this library.
+    afl_custom_fuzz() creates additional mutations through this library.
+    If afl-fuzz is compiled with Python (which is autodetected during building
+    afl-fuzz), setting AFL_PYTHON_MODULE to a Python module can also provide
+    additional mutations.
     If AFL_CUSTOM_MUTATOR_ONLY is also set, all mutations will solely be
-    performed with/from the library. See [custom_mutator.md](custom_mutator.md)
-
-  - For AFL_PYTHON_MODULE and AFL_PYTHON_ONLY - they require afl-fuzz to
-    be compiled with Python (which is autodetected during builing afl-fuzz).
-    Please see [python_mutators.md](python_mutators.md).
-    This feature allows to configure custom mutators which can be very helpful
-    in e.g. fuzzing XML or other highly flexible structured input.
+    performed with the custom mutator.
+    This feature allows configuring custom mutators, which can be very helpful,
+    e.g. when fuzzing XML or other highly flexible structured input.
+    Please see [custom_mutators.md](custom_mutators.md).
 
   - AFL_FAST_CAL keeps the calibration stage about 2.5x faster (albeit less
     precise), which can help when starting a session against a slow target.
@@ -283,7 +283,7 @@ The QEMU wrapper used to instrument binary-only code supports several settings:
 
   - Setting AFL_INST_LIBS causes the translator to also instrument the code
     inside any dynamically linked libraries (notably including glibc).
-  
+
   - Setting AFL_COMPCOV_LEVEL enables the CompareCoverage tracing of all cmp
     and sub in x86 and x86_64 and memory comparions functions (e.g. strcmp,
     memcmp, ...) when libcompcov is preloaded using AFL_PRELOAD.
@@ -292,7 +292,7 @@ The QEMU wrapper used to instrument binary-only code supports several settings:
     only comparisons with immediate values / read-only memory and
     AFL_COMPCOV_LEVEL=2 that instruments all the comparions. Level 2 is more
     accurate but may need a larger shared memory.
-  
+
   - Setting AFL_QEMU_COMPCOV enables the CompareCoverage tracing of all
     cmp and sub in x86 and x86_64.
     This is an alias of AFL_COMPCOV_LEVEL=1 when AFL_COMPCOV_LEVEL is
@@ -304,25 +304,25 @@ The QEMU wrapper used to instrument binary-only code supports several settings:
 
   - AFL_DEBUG will print the found entrypoint for the binary to stderr.
     Use this if you are unsure if the entrypoint might be wrong - but
-    use it directly, e.g. afl-qemu-trace ./program 
+    use it directly, e.g. afl-qemu-trace ./program
 
   - AFL_ENTRYPOINT allows you to specify a specific entrypoint into the
     binary (this can be very good for the performance!).
     The entrypoint is specified as hex address, e.g. 0x4004110
     Note that the address must be the address of a basic block.
-  
+
   - When the target is i386/x86_64 you can specify the address of the function
     that has to be the body of the persistent loop using
     AFL_QEMU_PERSISTENT_ADDR=`start addr`.
-  
+
   - Another modality to execute the persistent loop is to specify also the
     AFL_QEMU_PERSISTENT_RET=`end addr` env variable.
     With this variable assigned, instead of patching the return address, the
     specified instruction is transformed to a jump towards `start addr`.
-    
+
   - AFL_QEMU_PERSISTENT_GPR=1 QEMU will save the original value of general
     purpose registers and restore them in each persistent cycle.
-  
+
   - With AFL_QEMU_PERSISTENT_RETADDR_OFFSET you can specify the offset from the
     stack pointer in which QEMU can find the return address when `start addr` is
     hitted.
@@ -376,7 +376,7 @@ The library honors these environmental variables:
   - AFL_LD_NO_CALLOC_OVER inhibits abort() on calloc() overflows. Most
     of the common allocators check for that internally and return NULL, so
     it's a security risk only in more exotic setups.
-  
+
   - AFL_ALIGNED_ALLOC=1 will force the alignment of the allocation size to
     max_align_t to be compliant with the C standard.
 
@@ -410,7 +410,7 @@ optimal values if not already present in the environment:
 
   - In the same vein, by default, MSAN_OPTIONS are set to:
 
-    exit_code=86 (required for legacy reasons)    
+    exit_code=86 (required for legacy reasons)
     abort_on_error=1
     symbolize=0
     msan_track_origins=0
diff --git a/docs/python_mutators.md b/docs/python_mutators.md
deleted file mode 100644
index a7e2c7de..00000000
--- a/docs/python_mutators.md
+++ /dev/null
@@ -1,148 +0,0 @@
-# Adding custom mutators to AFL using Python modules
-
-  This file describes how you can utilize the external Python API to write
-  your own custom mutation routines.
-
-  Note: This feature is highly experimental. Use at your own risk.
-
-  Implemented by Christian Holler (:decoder) <choller@mozilla.com>.
-
-  NOTE: Only cPython 2.7, 3.7 and above are supported, although others may work.
-  Depending on with which version afl-fuzz was compiled against, you must use
-  python2 or python3 syntax in your scripts!
-  After a major version upgrade (e.g. 3.7 -> 3.8), a recompilation of afl-fuzz may be needed.
-
-  For an example and a template see ../examples/python_mutators/
-
-
-## 1) Description and purpose
-
-While AFLFuzz comes with a good selection of generic deterministic and
-non-deterministic mutation operations, it sometimes might make sense to extend
-these to implement strategies more specific to the target you are fuzzing.
-
-For simplicity and in order to allow people without C knowledge to extend
-AFLFuzz, I implemented a "Python" stage that can make use of an external
-module (written in Python) that implements a custom mutation stage.
-
-The main motivation behind this is to lower the barrier for people
-experimenting with this tool. Hopefully, someone will be able to do useful
-things with this extension.
-
-If you find it useful, have questions or need additional features added to the
-interface, feel free to send a mail to <choller@mozilla.com>.
-
-See the following information to get a better pictures:
-  https://www.agarri.fr/docs/XML_Fuzzing-NullCon2017-PUBLIC.pdf
-  https://bugs.chromium.org/p/chromium/issues/detail?id=930663
-
-
-## 2) How the Python module looks like
-
-You can find a simple example in pymodules/example.py including documentation
-explaining each function. In the same directory, you can find another simple
-module that performs simple mutations.
-
-Right now, "init" is called at program startup and can be used to perform any
-kinds of one-time initializations while "fuzz" is called each time a mutation
-is requested.
-
-There is also optional support for a trimming API, see the section below for
-further information about this feature.
-
-
-## 3) How to compile AFLFuzz with Python support
-
-You must install the python 3 or 2 development package of your Linux
-distribution before this will work. On Debian/Ubuntu/Kali this can be done
-with either:
-  apt install python3-dev
-or
-  apt install python-dev
-Note that for some distributions you might also need the package python[23]-apt
-
-A prerequisite for using this mode is to compile AFLFuzz with Python support.
-
-The AFL++ Makefile detects Python 3 and 2 through `python-config` if is is in the PATH
-and compiles afl-fuzz with the feature if available.
-
-In case your setup is different set the necessary variables like this:
-PYTHON_INCLUDE=/path/to/python/include LDFLAGS=-L/path/to/python/lib make
-
-
-## 4) How to run AFLFuzz with your custom module
-
-You must pass the module name inside the env variable AFL_PYTHON_MODULE.
-
-In addition, if you are trying to load the module from the local directory,
-you must adjust your PYTHONPATH to reflect this circumstance. The following
-command should work if you are inside the aflfuzz directory:
-
-$ AFL_PYTHON_MODULE="pymodules.test" PYTHONPATH=. ./afl-fuzz
-
-Optionally, the following environment variables are supported:
-
-AFL_PYTHON_ONLY - Disable all other mutation stages. This can prevent broken
-                  testcases (those that your Python module can't work with
-                  anymore) to fill up your queue. Best combined with a custom
-                  trimming routine (see below) because trimming can cause the
-                  same test breakage like havoc and splice.
-
-AFL_DEBUG       - When combined with AFL_NO_UI, this causes the C trimming code
-                  to emit additional messages about the performance and actions
-                  of your custom Python trimmer. Use this to see if it works :)
-
-
-## 5) Order and statistics
-
-The Python stage is set to be the first non-deterministic stage (right before
-the havoc stage). In the statistics however, it shows up as the third number
-under "havoc". That's because I'm lazy and I didn't want to mess with the UI
-too much ;)
-
-
-## 6) Trimming support
-
-The generic trimming routines implemented in AFLFuzz can easily destroy the
-structure of complex formats, possibly leading to a point where you have a lot
-of testcases in the queue that your Python module cannot process anymore but
-your target application still accepts. This is especially the case when your
-target can process a part of the input (causing coverage) and then errors out
-on the remaining input.
-
-In such cases, it makes sense to implement a custom trimming routine in Python.
-The API consists of multiple methods because after each trimming step, we have
-to go back into the C code to check if the coverage bitmap is still the same
-for the trimmed input. Here's a quick API description:
-
-init_trim: This method is called at the start of each trimming operation
-           and receives the initial buffer. It should return the amount
-           of iteration steps possible on this input (e.g. if your input
-           has n elements and you want to remove them one by one, return n,
-           if you do a binary search, return log(n), and so on...).
-
-           If your trimming algorithm doesn't allow you to determine the
-           amount of (remaining) steps easily (esp. while running), then you
-           can alternatively return 1 here and always return 0 in post_trim
-           until you are finished and no steps remain. In that case,
-           returning 1 in post_trim will end the trimming routine. The whole
-           current index/max iterations stuff is only used to show progress.
-
-trim:      This method is called for each trimming operation. It doesn't
-           have any arguments because we already have the initial buffer
-           from init_trim and we can memorize the current state in global
-           variables. This can also save reparsing steps for each iteration.
-           It should return the trimmed input buffer, where the returned data
-           must not exceed the initial input data in length. Returning anything
-           that is larger than the original data (passed to init_trim) will
-           result in a fatal abort of AFLFuzz.
-
-post_trim: This method is called after each trim operation to inform you
-           if your trimming step was successful or not (in terms of coverage).
-           If you receive a failure here, you should reset your input to the
-           last known good state.
-           In any case, this method must return the next trim iteration index
-           (from 0 to the maximum amount of steps you returned in init_trim).
-
-Omitting any of the methods will cause Python trimming to be disabled and
-trigger a fallback to the builtin default trimming routine.