Diffstat (limited to 'docs/python_mutators.md')
-rw-r--r-- | docs/python_mutators.md | 148 |
1 files changed, 148 insertions, 0 deletions
diff --git a/docs/python_mutators.md b/docs/python_mutators.md
new file mode 100644
index 00000000..a7e2c7de
--- /dev/null
+++ b/docs/python_mutators.md
@@ -0,0 +1,148 @@
+# Adding custom mutators to AFL using Python modules
+
+  This file describes how you can utilize the external Python API to write
+  your own custom mutation routines.
+
+  Note: This feature is highly experimental. Use at your own risk.
+
+  Implemented by Christian Holler (:decoder) <choller@mozilla.com>.
+
+  NOTE: Only CPython 2.7, 3.7 and above are supported, although others may work.
+  Depending on which version afl-fuzz was compiled against, you must use
+  python2 or python3 syntax in your scripts!
+  After a major version upgrade (e.g. 3.7 -> 3.8), a recompilation of afl-fuzz may be needed.
+
+  For an example and a template see ../examples/python_mutators/
+
+
+## 1) Description and purpose
+
+While AFLFuzz comes with a good selection of generic deterministic and
+non-deterministic mutation operations, it sometimes might make sense to extend
+these to implement strategies more specific to the target you are fuzzing.
+
+For simplicity and in order to allow people without C knowledge to extend
+AFLFuzz, I implemented a "Python" stage that can make use of an external
+module (written in Python) that implements a custom mutation stage.
+
+The main motivation behind this is to lower the barrier for people
+experimenting with this tool. Hopefully, someone will be able to do useful
+things with this extension.
+
+If you find it useful, have questions or need additional features added to the
+interface, feel free to send a mail to <choller@mozilla.com>.
+
+See the following information to get a better picture:
+  https://www.agarri.fr/docs/XML_Fuzzing-NullCon2017-PUBLIC.pdf
+  https://bugs.chromium.org/p/chromium/issues/detail?id=930663
+
+
+## 2) What the Python module looks like
+
+You can find a simple example in pymodules/example.py including documentation
+explaining each function. In the same directory, you can find another simple
+module that performs simple mutations.
+
+Right now, "init" is called at program startup and can be used to perform any
+kind of one-time initialization, while "fuzz" is called each time a mutation
+is requested.
+
+There is also optional support for a trimming API; see the section below for
+further information about this feature.
+
+
+## 3) How to compile AFLFuzz with Python support
+
+You must install the Python 3 or 2 development package of your Linux
+distribution before this will work. On Debian/Ubuntu/Kali this can be done
+with either:
+  apt install python3-dev
+or
+  apt install python-dev
+Note that for some distributions you might also need the package python[23]-apt.
+
+A prerequisite for using this mode is to compile AFLFuzz with Python support.
+
+The AFL++ Makefile detects Python 3 and 2 through `python-config` if it is in the PATH
+and compiles afl-fuzz with the feature if available.
+
+In case your setup is different, set the necessary variables like this:
+PYTHON_INCLUDE=/path/to/python/include LDFLAGS=-L/path/to/python/lib make
+
+
+## 4) How to run AFLFuzz with your custom module
+
+You must pass the module name inside the env variable AFL_PYTHON_MODULE.
+
+In addition, if you are trying to load the module from the local directory,
+you must adjust your PYTHONPATH to reflect this circumstance. The following
+command should work if you are inside the aflfuzz directory:
+
+$ AFL_PYTHON_MODULE="pymodules.test" PYTHONPATH=. ./afl-fuzz
+
+Optionally, the following environment variables are supported:
+
+AFL_PYTHON_ONLY - Disable all other mutation stages. This can prevent broken
+                  testcases (those that your Python module can't work with
+                  anymore) from filling up your queue. Best combined with a custom
+                  trimming routine (see below) because trimming can cause the
+                  same test breakage as havoc and splice.
+
+AFL_DEBUG - When combined with AFL_NO_UI, this causes the C trimming code
+            to emit additional messages about the performance and actions
+            of your custom Python trimmer. Use this to see if it works :)
+
+
+## 5) Order and statistics
+
+The Python stage is set to be the first non-deterministic stage (right before
+the havoc stage). In the statistics, however, it shows up as the third number
+under "havoc". That's because I'm lazy and I didn't want to mess with the UI
+too much ;)
+
+
+## 6) Trimming support
+
+The generic trimming routines implemented in AFLFuzz can easily destroy the
+structure of complex formats, possibly leading to a point where you have a lot
+of testcases in the queue that your Python module cannot process anymore but
+your target application still accepts. This is especially the case when your
+target can process a part of the input (causing coverage) and then errors out
+on the remaining input.
+
+In such cases, it makes sense to implement a custom trimming routine in Python.
+The API consists of multiple methods because after each trimming step, we have
+to go back into the C code to check if the coverage bitmap is still the same
+for the trimmed input. Here's a quick API description:
+
+init_trim: This method is called at the start of each trimming operation
+           and receives the initial buffer. It should return the number
+           of iteration steps possible on this input (e.g. if your input
+           has n elements and you want to remove them one by one, return n,
+           if you do a binary search, return log(n), and so on...).
+
+           If your trimming algorithm doesn't allow you to determine the
+           number of (remaining) steps easily (esp. while running), then you
+           can alternatively return 1 here and always return 0 in post_trim
+           until you are finished and no steps remain. In that case,
+           returning 1 in post_trim will end the trimming routine. The whole
+           current index/max iterations stuff is only used to show progress.
+
+trim: This method is called for each trimming operation. It doesn't
+      have any arguments because we already have the initial buffer
+      from init_trim and we can memorize the current state in global
+      variables. This can also save reparsing steps for each iteration.
+      It should return the trimmed input buffer, where the returned data
+      must not exceed the initial input data in length. Returning anything
+      that is larger than the original data (passed to init_trim) will
+      result in a fatal abort of AFLFuzz.
+
+post_trim: This method is called after each trim operation to inform you
+           if your trimming step was successful or not (in terms of coverage).
+           If you receive a failure here, you should reset your input to the
+           last known good state.
+           In any case, this method must return the next trim iteration index
+           (from 0 to the maximum number of steps you returned in init_trim).
+
+Omitting any of the methods will cause Python trimming to be disabled and
+trigger a fallback to the built-in default trimming routine.