diff options
-rw-r--r-- | docs/fuzzing_in_depth.md | 98 |
1 files changed, 58 insertions, 40 deletions
diff --git a/docs/fuzzing_in_depth.md b/docs/fuzzing_in_depth.md index c926d59e..8f4eff79 100644 --- a/docs/fuzzing_in_depth.md +++ b/docs/fuzzing_in_depth.md @@ -55,7 +55,7 @@ tasks, fuzzing may put a strain on your hardware and on the OS. In particular: ### a) Selecting the best AFL++ compiler for instrumenting the target AFL++ comes with a central compiler `afl-cc` that incorporates various different -kinds of compiler targets and and instrumentation options. The following +kinds of compiler targets and instrumentation options. The following evaluation flow will help you to select the best possible. It is highly recommended to have the newest llvm version possible installed, @@ -101,7 +101,7 @@ You can select the mode for the afl-cc compiler by one of the following methods: afl-clang-fast, afl-clang-fast++, afl-clang-lto, afl-clang-lto++, afl-gcc-fast, afl-g++-fast (recommended!). * Using the environment variable `AFL_CC_COMPILER` with `MODE`. -* Passing --afl-MODE command line options to the compiler via +* Passing --afl-`MODE` command line options to the compiler via `CFLAGS`/`CXXFLAGS`/`CPPFLAGS`. `MODE` can be one of the following: @@ -141,20 +141,35 @@ options are available: [instrumentation/README.cmplog.md](../instrumentation/README.cmplog.md). If you use LTO, LLVM, or GCC_PLUGIN mode -(afl-clang-fast/afl-clang-lto/afl-gcc-fast), you have the option to selectively -only instrument parts of the target that you are interested in: - -* To instrument only those parts of the target that you are interested in, - create a file with all the filenames of the source code that should be - instrumented. For afl-clang-lto and afl-gcc-fast - or afl-clang-fast if a mode - other than DEFAULT/PCGUARD is used or you have llvm > 10.0.0 - just put one - filename or function per line (no directory information necessary for - filenames), and either set `export AFL_LLVM_ALLOWLIST=allowlist.txt` **or** - `export AFL_LLVM_DENYLIST=denylist.txt` - depending on if you want per default - to instrument unless noted (DENYLIST) or not perform instrumentation unless - requested (ALLOWLIST). **NOTE:** During optimization functions might be - inlined and then would not match! See - [instrumentation/README.instrument_list.md](../instrumentation/README.instrument_list.md). +(afl-clang-fast/afl-clang-lto/afl-gcc-fast), you might have the option to +selectively instrument _parts_ of the target that you are interested in. +For afl-clang-fast you have to use an llvm version > 10.0.0 or a mode +other than DEFAULT/PCGUARD. + +This step can be done either by explicitly including parts to be instrumented +or by explicitly excluding parts from instrumentation. + +* To instrument _only specified parts_ + create a file (eg `allowlist.txt`) with all the filenames of the source code + that should be instrumented and then + + 1. just put one filename or function per line (no directory information necessary for + filenames) in the file `allowlist.txt`, and + + 2. set `export AFL_LLVM_ALLOWLIST=allowlist.txt` to enable selective instrumentation. + +* Similarly to _exclude_ specified parts from instrumentation + create a file (eg `denylist.txt`) with all the filenames of the source code + that should be skipped during instrumentation and then + + 1. just put one filename or function per line (no directory information necessary for + filenames) in the file `denylist.txt`, and + + 2. set `export AFL_LLVM_DENYLIST=denylist.txt` to enable selective instrumentation. + +**NOTE:** During optimization functions might be +inlined and then would not match the list! See +[instrumentation/README.instrument_list.md](../instrumentation/README.instrument_list.md). There are many more options and modes available, however, these are most of the time less effective. See: @@ -172,26 +187,26 @@ It is possible to use sanitizers when instrumenting targets for fuzzing, which allows you to find bugs that would not necessarily result in a crash. Note that sanitizers have a huge impact on CPU (= less executions per second) -and RAM usage. Also, you should only run one afl-fuzz instance per sanitizer -type. This is enough because a use-after-free bug will be picked up, e.g., by -ASAN (address sanitizer) anyway when syncing to other fuzzing instances, so not -all fuzzing instances need to be instrumented with ASAN. +and RAM usage. Also, you should only run one afl-fuzz target per sanitizer +type. This is enough because e.g. a use-after-free bug will be picked up by +ASAN (address sanitizer) anyway after syncing test cases from other fuzzing +instances, so running more than one address sanitized target would be a waste. The following sanitizers have built-in support in AFL++: * ASAN = Address SANitizer, finds memory corruption vulnerabilities like use-after-free, NULL pointer dereference, buffer overruns, etc. Enabled with `export AFL_USE_ASAN=1` before compiling. -* MSAN = Memory SANitizer, finds read access to uninitialized memory, e.g., a +* MSAN = Memory SANitizer, finds read accesses to uninitialized memory, e.g., a local variable that is defined and read before it is even set. Enabled with `export AFL_USE_MSAN=1` before compiling. * UBSAN = Undefined Behavior SANitizer, finds instances where - by the C and C++ standards - undefined behavior happens, e.g., adding two signed integers - together where the result is larger than a signed integer can hold. Enabled + where the result is larger than what a signed integer can hold. Enabled with `export AFL_USE_UBSAN=1` before compiling. * CFISAN = Control Flow Integrity SANitizer, finds instances where the control flow is found to be illegal. Originally this was rather to prevent return - oriented programming exploit chains from functioning. In fuzzing, this is + oriented programming (rop) exploit chains from functioning. In fuzzing, this is mostly reduced to detecting type confusion vulnerabilities - which is, however, one of the most important and dangerous C++ memory corruption classes! Enabled with `export AFL_USE_CFISAN=1` before compiling. @@ -264,7 +279,8 @@ Then build the target. (Usually with `make`.) 3. In case the configure/build system complains about AFL++'s compiler and aborts, then set `export AFL_NOOPT=1` which will then just behave like the - real compiler. This option has to be unset again before building the target! + real compiler and run the configure step separately. + For building the target afterwards this option should be unset again! #### configure @@ -377,21 +393,22 @@ produce a new path/coverage in the target: 1. Put all files from [step a](#a-collecting-inputs) into one directory, e.g., INPUTS. 2. Run afl-cmin: * If the target program is to be called by fuzzing as `bin/target - INPUTFILE`, set the INPUTFILE argument that the target program would read - from as `@@`: + INPUTFILE`, replace the INPUTFILE argument that the target program would read + from with `@@`: ``` afl-cmin -i INPUTS -o INPUTS_UNIQUE -- bin/target -someopt @@ ``` - * If the target reads from stdin instead, just omit the `@@` as this is the - default: + * If the target reads from stdin (standard input) instead, just omit the `@@` + as this is the default: ``` afl-cmin -i INPUTS -o INPUTS_UNIQUE -- bin/target -someopt ``` -This step is highly recommended! +This step is highly recommended, because afterwards the testcase corpus is not +bloated with duplicates anymore, which would slow down the fuzzing progress! ### c) Minimizing all corpus files @@ -424,7 +441,7 @@ In this final step, fuzz the target. There are not that many important options to run the target - unless you want to use many CPU cores/threads for the fuzzing, which will make the fuzzing much more useful. -If you just use one CPU for fuzzing, then you are fuzzing just for fun and not +If you just use one instance for fuzzing, then you are fuzzing just for fun and not seriously :-) ### a) Running afl-fuzz @@ -458,7 +475,7 @@ afl-fuzz -i input -o output -- bin/target -someopt @@ Note that the directory specified with `-o` will be created if it does not exist. -It can be valuable to run afl-fuzz in a screen or tmux shell so you can log off, +It can be valuable to run afl-fuzz in a `screen` or `tmux` shell so you can log off, or afl-fuzz is not aborted if you are running it in a remote ssh session where the connection fails in between. Only do that though once you have verified that your fuzzing setup works! Run it like `screen -dmS afl-main -- afl-fuzz -M @@ -484,19 +501,20 @@ to do nothing except to use afl-clang-lto as the compiler. With `afl-clang-fast` you can set `AFL_LLVM_DICT2FILE=/full/path/to/new/file.dic` to automatically generate a dictionary during target compilation. -You also have the option to generate a dictionary yourself, see -[utils/libtokencap/README.md](../utils/libtokencap/README.md). +You also have the option to generate a dictionary yourself during a run of the +target, see [utils/libtokencap/README.md](../utils/libtokencap/README.md). +Finally you can also write a dictionary file manually, of course. afl-fuzz has a variety of options that help to workaround target quirks like -specific locations for the input file (`-f`), performing deterministic fuzzing -(`-D`) and many more. Check out `afl-fuzz -h`. +specific names/locations for the input file (`-f`), performing deterministic +fuzzing (`-D`) and many more. Check out `afl-fuzz -h`. We highly recommend that you set a memory limit for running the target with `-m` which defines the maximum memory in MB. This prevents a potential out-of-memory problem for your system plus helps you detect missing `malloc()` failure handling in the target. Play around with various `-m` values until you find one that safely works for all your input seeds (if you have good ones and then -double or quadruple that. +double or quadruple that). By default, afl-fuzz never stops fuzzing. To terminate AFL++, press Control-C or send a signal SIGINT. You can limit the number of executions or approximate @@ -609,7 +627,7 @@ Now there are three strategies on how you can sync between the servers: * regularly (~4h): this ensures that all fuzzing campaigns on the servers "see" the same thing. It is like fuzzing on a huge server. * in intervals of 1/10th of the overall expected runtime of the fuzzing you - sync. This tries a bit to combine both. have some individuality of the paths + sync. This tries a bit to combine both. Have some individuality of the paths each campaign on a server explores, on the other hand if one gets stuck where another found progress this is handed over making it unstuck. @@ -852,8 +870,8 @@ normal fuzzing campaigns as these are much shorter runnings. 1. Always: * LTO has a much longer compile time which is diametrical to short fuzzing - hence use afl-clang-fast instead. - * If you compile with CMPLOG, then you can save fuzzing time and reuse that - compiled target for both the `-c` option and the main fuzz target. This + * If you compile with CMPLOG, then you can save compilation time and reuse that + compiled target with the `-c` option and as the main fuzz target. This will impact the speed by ~15% though. * `AFL_FAST_CAL` - enables fast calibration, this halves the time the saturated corpus needs to be loaded. @@ -884,7 +902,7 @@ and ## The End -Check out the [FAQ](FAQ.md) if it maybe answers your question (that you might +Check out the [FAQ](FAQ.md). Maybe it answers your question (that you might not even have known you had ;-) ). This is basically all you need to know to professionally run fuzzing campaigns. |