From 996986bed5f2dd97a3d76f584d8eddc1203f8396 Mon Sep 17 00:00:00 2001 From: vanhauser-thc Date: Sat, 5 Sep 2020 12:11:48 +0200 Subject: first batch of changes --- instrumentation/README.gcc_plugin.md | 158 +++++++++++++++++++++++++++++++++++ 1 file changed, 158 insertions(+) create mode 100644 instrumentation/README.gcc_plugin.md (limited to 'instrumentation/README.gcc_plugin.md') diff --git a/instrumentation/README.gcc_plugin.md b/instrumentation/README.gcc_plugin.md new file mode 100644 index 00000000..9d6bc200 --- /dev/null +++ b/instrumentation/README.gcc_plugin.md @@ -0,0 +1,158 @@ +# GCC-based instrumentation for afl-fuzz + + (See [../README.md](../README.md) for the general instruction manual.) + (See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation.) + +!!! TODO items are: +!!! => inline instrumentation has to work! +!!! + + +## 1) Introduction + +The code in this directory allows you to instrument programs for AFL using +true compiler-level instrumentation, instead of the more crude +assembly-level rewriting approach taken by afl-gcc and afl-clang. This has +several interesting properties: + + - The compiler can make many optimizations that are hard to pull off when + manually inserting assembly. As a result, some slow, CPU-bound programs will + run noticeably faster. + + The gains are less pronounced for fast binaries, where the speed is limited + chiefly by the cost of creating new processes. In such cases, the gain will + probably stay within 10%. + + - The instrumentation is CPU-independent. At least in principle, you should + be able to rely on it to fuzz programs on non-x86 architectures (after + building afl-fuzz with AFL_NOX86=1). + + - Because the feature relies on the internals of GCC, it is gcc-specific + and will *not* work with LLVM (see ../llvm_mode for an alternative). + +Once this implementation is shown to be sufficiently robust and portable, it +will probably replace afl-gcc. 
For now, it can be built separately and +co-exists with the original code. + +The idea and much of the implementation comes from Laszlo Szekeres. + +## 2) How to use + +In order to leverage this mechanism, you need to have modern enough GCC +(>= version 4.5.0) and the plugin headers installed on your system. That +should be all you need. On Debian machines, these headers can be acquired by +installing the `gcc-<VERSION>-plugin-dev` packages. + +To build the instrumentation itself, type 'make'. This will generate binaries +called afl-gcc-fast and afl-g++-fast in the parent directory. +If the CC/CXX have been overridden, those compilers will be used from +those wrappers without using AFL_CXX/AFL_CC settings. +Once this is done, you can instrument third-party code in a way similar to the +standard operating mode of AFL, e.g.: + + CC=/path/to/afl/afl-gcc-fast ./configure [...options...] + make + +Be sure to also include CXX set to afl-g++-fast for C++ code. + +The tool honors roughly the same environmental variables as afl-gcc (see +[env_variables.md](../docs/env_variables.md). This includes AFL_INST_RATIO, AFL_USE_ASAN, +AFL_HARDEN, and AFL_DONT_OPTIMIZE. + +Note: if you want the GCC plugin to be installed on your system for all +users, you need to build it before issuing 'make install' in the parent +directory. + +## 3) Gotchas, feedback, bugs + +This is an early-stage mechanism, so field reports are welcome. You can send bug +reports to . + +## 4) Bonus feature #1: deferred initialization + +AFL tries to optimize performance by executing the targeted binary just once, +stopping it just before main(), and then cloning this "main" process to get +a steady supply of targets to fuzz. + +Although this approach eliminates much of the OS-, linker- and libc-level +costs of executing the program, it does not always help with binaries that +perform other time-consuming initialization steps - say, parsing a large config +file before getting to the fuzzed data. 
+ +In such cases, it's beneficial to initialize the forkserver a bit later, once +most of the initialization work is already done, but before the binary attempts +to read the fuzzed input and parse it; in some cases, this can offer a 10x+ +performance gain. You can implement delayed initialization in GCC plugin mode in a +fairly simple way. + +First, locate a suitable location in the code where the delayed cloning can +take place. This needs to be done with *extreme* care to avoid breaking the +binary. In particular, the program will probably malfunction if you select +a location after: + + - The creation of any vital threads or child processes - since the forkserver + can't clone them easily. + + - The initialization of timers via setitimer() or equivalent calls. + + - The creation of temporary files, network sockets, offset-sensitive file + descriptors, and similar shared-state resources - but only provided that + their state meaningfully influences the behavior of the program later on. + + - Any access to the fuzzed input, including reading the metadata about its + size. + +With the location selected, add this code in the appropriate spot: + +``` +#ifdef __AFL_HAVE_MANUAL_CONTROL + __AFL_INIT(); +#endif +``` + +You don't need the #ifdef guards, but they will make the program still work as +usual when compiled with a tool other than afl-gcc-fast/afl-clang-fast. + +Finally, recompile the program with afl-gcc-fast (afl-gcc or afl-clang will +*not* generate a deferred-initialization binary) - and you should be all set! + +## 5) Bonus feature #2: persistent mode + +Some libraries provide APIs that are stateless, or whose state can be reset in +between processing different input files. When such a reset is performed, a +single long-lived process can be reused to try out multiple test cases, +eliminating the need for repeated fork() calls and the associated OS overhead. 
+ +The basic structure of the program that does this would be: + +``` + while (__AFL_LOOP(1000)) { + + /* Read input data. */ + /* Call library code to be fuzzed. */ + /* Reset state. */ + + } + + /* Exit normally */ +``` + +The numerical value specified within the loop controls the maximum number +of iterations before AFL will restart the process from scratch. This minimizes +the impact of memory leaks and similar glitches; 1000 is a good starting point. + +A more detailed template is shown in ../examples/persistent_demo/. +Similarly to the previous mode, the feature works only with afl-gcc-fast or +afl-clang-fast; #ifdef guards can be used to suppress it when using other +compilers. + +Note that as with the previous mode, the feature is easy to misuse; if you +do not reset the critical state fully, you may end up with false positives or +waste a whole lot of CPU power doing nothing useful at all. Be particularly +wary of memory leaks and the state of file descriptors. + +When running in this mode, the execution paths will inherently vary a bit +depending on whether the input loop is being entered for the first time or +executed again. To avoid spurious warnings, the feature implies +AFL_NO_VAR_CHECK and hides the "variable path" warnings in the UI. + -- cgit 1.4.1 From 454a860020048c5531f518b5691c92949bdc8017 Mon Sep 17 00:00:00 2001 From: van Hauser Date: Wed, 9 Sep 2020 23:25:01 +0200 Subject: update gcc readme --- instrumentation/README.gcc_plugin.md | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-) (limited to 'instrumentation/README.gcc_plugin.md') diff --git a/instrumentation/README.gcc_plugin.md b/instrumentation/README.gcc_plugin.md index 9d6bc200..53519b90 100644 --- a/instrumentation/README.gcc_plugin.md +++ b/instrumentation/README.gcc_plugin.md @@ -1,12 +1,7 @@ # GCC-based instrumentation for afl-fuzz - (See [../README.md](../README.md) for the general instruction manual.) 
- (See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation.) - -!!! TODO items are: -!!! => inline instrumentation has to work! -!!! - +See [../README.md](../README.md) for the general instruction manual. +See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation. ## 1) Introduction @@ -41,7 +36,7 @@ The idea and much of the implementation comes from Laszlo Szekeres. In order to leverage this mechanism, you need to have modern enough GCC (>= version 4.5.0) and the plugin headers installed on your system. That should be all you need. On Debian machines, these headers can be acquired by -installing the `gcc-<VERSION>-plugin-dev` packages. +installing the `gcc-VERSION-plugin-dev` packages. To build the instrumentation itself, type 'make'. This will generate binaries called afl-gcc-fast and afl-g++-fast in the parent directory. @@ -56,8 +51,8 @@ standard operating mode of AFL, e.g.: Be sure to also include CXX set to afl-g++-fast for C++ code. The tool honors roughly the same environmental variables as afl-gcc (see -[env_variables.md](../docs/env_variables.md). This includes AFL_INST_RATIO, AFL_USE_ASAN, -AFL_HARDEN, and AFL_DONT_OPTIMIZE. +[env_variables.md](../docs/env_variables.md). This includes AFL_INST_RATIO, +AFL_USE_ASAN, AFL_HARDEN, and AFL_DONT_OPTIMIZE. Note: if you want the GCC plugin to be installed on your system for all users, you need to build it before issuing 'make install' in the parent @@ -66,7 +61,7 @@ directory. ## 3) Gotchas, feedback, bugs This is an early-stage mechanism, so field reports are welcome. You can send bug -reports to . 
+reports to afl@aflplus.plus ## 4) Bonus feature #1: deferred initialization -- cgit 1.4.1 From fdb0452245672db94be0832288f1335e905a2fc8 Mon Sep 17 00:00:00 2001 From: van Hauser Date: Thu, 10 Sep 2020 08:54:57 +0200 Subject: update documentation --- README.md | 12 +++++------- instrumentation/README.gcc_plugin.md | 11 +++++++++++ 2 files changed, 16 insertions(+), 7 deletions(-) (limited to 'instrumentation/README.gcc_plugin.md') diff --git a/README.md b/README.md index fb59835c..2fc9d807 100644 --- a/README.md +++ b/README.md @@ -41,7 +41,7 @@ behaviours: * When instrumenting targets, afl-cc will not supersede optimizations. This allows to fuzz targets as same as they are built for debug or release. * afl-fuzz' `-i` option now descends into subdirectories. - * afl-fuzz will skip over empty dictionaries and too large test cases instead + * afl-fuzz will skip over empty dictionaries and too-large test cases instead of failing. ## Contents @@ -63,20 +63,20 @@ behaviours: | Feature/Instrumentation | afl-gcc | llvm | gcc_plugin | qemu_mode | unicorn_mode | | -------------------------|:-------:|:---------:|:----------:|:----------------:|:------------:| - | NeverZero | x86[_64]| x(1) | (2) | x | x | + | NeverZero | x86[_64]| x(1) | x | x | x | | Persistent Mode | | x | x | x86[_64]/arm[64] | x | | LAF-Intel / CompCov | | x | | x86[_64]/arm[64] | x86[_64]/arm | | CmpLog | | x | | x86[_64]/arm[64] | | - | Selective Instrumentation| | x | x | (x)(3) | | + | Selective Instrumentation| | x | x | x | | | Non-Colliding Coverage | | x(4) | | (x)(5) | | | Ngram prev_loc Coverage | | x(6) | | | | | Context Coverage | | x(6) | | | | | Auto Dictionary | | x(7) | | | | - | Snapshot LKM Support | | x | | (x)(5) | | + | Snapshot LKM Support | | x | x | (x)(5) | | 1. default for LLVM >= 9.0, env var for older version due an efficiency bug in llvm <= 8 2. GCC creates non-performant code, hence it is disabled in gcc_plugin - 3. partially via AFL_CODE_START/AFL_CODE_END + 3. 
(currently unassigned) 4. with pcguard mode and LTO mode for LLVM >= 11 5. upcoming, development in the branch 6. not compatible with LTO instrumentation and needs at least LLVM >= 4.1 @@ -92,8 +92,6 @@ behaviours: * AFLfast's power schedules by Marcel Böhme: [https://github.com/mboehme/aflfast](https://github.com/mboehme/aflfast) * The MOpt mutator: [https://github.com/puppet-meteor/MOpt-AFL](https://github.com/puppet-meteor/MOpt-AFL) * LLVM mode Ngram coverage by Adrian Herrera [https://github.com/adrianherrera/afl-ngram-pass](https://github.com/adrianherrera/afl-ngram-pass) - * C. Holler's afl-fuzz Python mutator module: [https://github.com/choller/afl](https://github.com/choller/afl) - * Custom mutator by a library (instead of Python) by kyakdan * LAF-Intel/CompCov support for instrumentation, qemu_mode and unicorn_mode (with enhanced capabilities) * Radamsa and honggfuzz mutators (as custom mutators). * QBDI mode to fuzz android native libraries via Quarkslab's [QBDI](https://github.com/QBDI/QBDI) framework diff --git a/instrumentation/README.gcc_plugin.md b/instrumentation/README.gcc_plugin.md index 53519b90..919801d1 100644 --- a/instrumentation/README.gcc_plugin.md +++ b/instrumentation/README.gcc_plugin.md @@ -3,6 +3,13 @@ See [../README.md](../README.md) for the general instruction manual. See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation. +TLDR: + * `apt-get install gcc-VERSION-plugin-dev` + * `make` + * gcc and g++ must point to the gcc-VERSION, or you have to set AFL_CC/AFL_CXX + to point to these! + * just use afl-gcc-fast/afl-g++-fast normally like you would afl-clang-fast + ## 1) Introduction The code in this directory allows you to instrument programs for AFL using @@ -40,8 +47,12 @@ installing the `gcc-VERSION-plugin-dev` packages. To build the instrumentation itself, type 'make'. This will generate binaries called afl-gcc-fast and afl-g++-fast in the parent directory. 
+ +The gcc and g++ compiler links have to point to gcc-VERSION - or set these +by pointing the environment variables AFL_CC/AFL_CXX to them. If the CC/CXX have been overridden, those compilers will be used from those wrappers without using AFL_CXX/AFL_CC settings. + Once this is done, you can instrument third-party code in a way similar to the standard operating mode of AFL, e.g.: -- cgit 1.4.1
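
The README added by this series shows the deferred-initialization and persistent-mode snippets separately. The sketch below is one possible way to combine them in a single harness; it is not part of the patch. The names `load_config`, `reset_state`, `target_parse`, and `run_harness` are hypothetical placeholders for the code under test, and the fallback macro definitions are an assumption made only so the file still builds with a compiler other than afl-gcc-fast/afl-clang-fast.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Assumed fallbacks: with afl-gcc-fast/afl-clang-fast these macros are
   provided by the wrapper; with any other compiler the loop body runs
   exactly once and __AFL_INIT() does nothing. */
#ifndef __AFL_HAVE_MANUAL_CONTROL
static int fallback_once = 1;
#define __AFL_INIT() ((void)0)
#define __AFL_LOOP(max_iters) (fallback_once-- > 0)
#endif

/* Hypothetical library state that has to be reset between test cases. */
static int config_loaded;
static size_t bytes_seen;

/* Hypothetical stand-in for expensive one-time setup (e.g. config parsing). */
static void load_config(void) { config_loaded = 1; }

/* Hypothetical reset hook: clears everything a test case may have touched. */
static void reset_state(void) { bytes_seen = 0; }

/* Hypothetical target: returns 1 if the input starts with "AFL", else 0. */
static int target_parse(const unsigned char *buf, size_t len) {
  bytes_seen += len;
  return len >= 3 && memcmp(buf, "AFL", 3) == 0;
}

/* Harness body; call this from main() in a real fuzz target. */
int run_harness(void) {
  unsigned char buf[4096];

  load_config();          /* slow setup happens before the forkserver starts */
  assert(config_loaded);  /* setup must be complete at this point */
  __AFL_INIT();           /* deferred init: placed before any input access */

  while (__AFL_LOOP(1000)) {
    ssize_t len = read(0, buf, sizeof(buf)); /* read the test case from stdin */
    if (len > 0) target_parse(buf, (size_t)len);
    reset_state();        /* leave no state behind for the next iteration */
  }
  return 0;
}
```

Placing `__AFL_INIT()` after `load_config()` but before the first `read()` follows the placement rules listed in section 4, and the `reset_state()` call at the end of each iteration reflects the warning in section 5 about fully resetting critical state.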