From 996986bed5f2dd97a3d76f584d8eddc1203f8396 Mon Sep 17 00:00:00 2001
From: vanhauser-thc <vh@thc.org>
Date: Sat, 5 Sep 2020 12:11:48 +0200
Subject: first batch of changes

---
 instrumentation/README.gcc_plugin.md | 158 +++++++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)
 create mode 100644 instrumentation/README.gcc_plugin.md

(limited to 'instrumentation/README.gcc_plugin.md')
diff --git a/instrumentation/README.gcc_plugin.md b/instrumentation/README.gcc_plugin.md
new file mode 100644
index 00000000..9d6bc200
--- /dev/null
+++ b/instrumentation/README.gcc_plugin.md
@@ -0,0 +1,158 @@
+# GCC-based instrumentation for afl-fuzz
+
+  (See [../README.md](../README.md) for the general instruction manual.)
+  (See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation.)
+
+!!! TODO items are:
+!!!  => inline instrumentation has to work!
+!!!
+
+
+## 1) Introduction
+
+The code in this directory allows you to instrument programs for AFL using
+true compiler-level instrumentation, instead of the more crude
+assembly-level rewriting approach taken by afl-gcc and afl-clang. This has
+several interesting properties:
+
+  - The compiler can make many optimizations that are hard to pull off when
+    manually inserting assembly. As a result, some slow, CPU-bound programs will
+    run noticeably faster.
+
+    The gains are less pronounced for fast binaries, where the speed is limited
+    chiefly by the cost of creating new processes. In such cases, the gain will
+    probably stay within 10%.
+
+  - The instrumentation is CPU-independent. At least in principle, you should
+    be able to rely on it to fuzz programs on non-x86 architectures (after
+    building afl-fuzz with AFL_NOX86=1).
+
+  - Because the feature relies on the internals of GCC, it is gcc-specific
+    and will *not* work with LLVM (see README.llvm.md for an alternative).
+
+Once this implementation is shown to be sufficiently robust and portable, it
+will probably replace afl-gcc. For now, it can be built separately and
+co-exists with the original code.
+
+The idea and much of the implementation comes from Laszlo Szekeres.
+
+## 2) How to use
+
+In order to leverage this mechanism, you need to have modern enough GCC
+(>= version 4.5.0) and the plugin headers installed on your system. That
+should be all you need. On Debian machines, these headers can be acquired by
+installing the `gcc-<VERSION>-plugin-dev` packages.
+
+To build the instrumentation itself, type 'make'. This will generate binaries
+called afl-gcc-fast and afl-g++-fast in the parent directory. 
+If the CC/CXX have been overridden, those compilers will be used from
+those wrappers without using AFL_CXX/AFL_CC settings.
+Once this is done, you can instrument third-party code in a way similar to the
+standard operating mode of AFL, e.g.:
+
+  CC=/path/to/afl/afl-gcc-fast ./configure [...options...]
+  make
+
+Be sure to also include CXX set to afl-g++-fast for C++ code.
+
+The tool honors roughly the same environmental variables as afl-gcc (see
+[env_variables.md](../docs/env_variables.md). This includes AFL_INST_RATIO, AFL_USE_ASAN,
+AFL_HARDEN, and AFL_DONT_OPTIMIZE.
+
+Note: if you want the GCC plugin to be installed on your system for all
+users, you need to build it before issuing 'make install' in the parent
+directory.
+
+## 3) Gotchas, feedback, bugs
+
+This is an early-stage mechanism, so field reports are welcome. You can send bug
+reports to <hexcoder-@github.com>.
+
+## 4) Bonus feature #1: deferred initialization
+
+AFL tries to optimize performance by executing the targeted binary just once,
+stopping it just before main(), and then cloning this "main" process to get
+a steady supply of targets to fuzz.
+
+Although this approach eliminates much of the OS-, linker- and libc-level
+costs of executing the program, it does not always help with binaries that
+perform other time-consuming initialization steps - say, parsing a large config
+file before getting to the fuzzed data.
+
+In such cases, it's beneficial to initialize the forkserver a bit later, once
+most of the initialization work is already done, but before the binary attempts
+to read the fuzzed input and parse it; in some cases, this can offer a 10x+
+performance gain. You can implement delayed initialization with the GCC plugin
+in a fairly simple way.
+
+First, locate a suitable location in the code where the delayed cloning can
+take place. This needs to be done with *extreme* care to avoid breaking the
+binary. In particular, the program will probably malfunction if you select
+a location after:
+
+  - The creation of any vital threads or child processes - since the forkserver
+    can't clone them easily.
+
+  - The initialization of timers via setitimer() or equivalent calls.
+
+  - The creation of temporary files, network sockets, offset-sensitive file
+    descriptors, and similar shared-state resources - but only provided that
+    their state meaningfully influences the behavior of the program later on.
+
+  - Any access to the fuzzed input, including reading the metadata about its
+    size.
+
+With the location selected, add this code in the appropriate spot:
+
+```
+#ifdef __AFL_HAVE_MANUAL_CONTROL
+  __AFL_INIT();
+#endif
+```
+
+You don't need the #ifdef guards, but they will make the program still work as
+usual when compiled with a tool other than afl-gcc-fast/afl-clang-fast.
+
+Finally, recompile the program with afl-gcc-fast (afl-gcc or afl-clang will
+*not* generate a deferred-initialization binary) - and you should be all set!
+
+## 5) Bonus feature #2: persistent mode
+
+Some libraries provide APIs that are stateless, or whose state can be reset in
+between processing different input files. When such a reset is performed, a
+single long-lived process can be reused to try out multiple test cases,
+eliminating the need for repeated fork() calls and the associated OS overhead.
+
+The basic structure of the program that does this would be:
+
+```
+  while (__AFL_LOOP(1000)) {
+
+    /* Read input data. */
+    /* Call library code to be fuzzed. */
+    /* Reset state. */
+
+  }
+
+  /* Exit normally */
+```
+
+The numerical value specified within the loop controls the maximum number
+of iterations before AFL will restart the process from scratch. This minimizes
+the impact of memory leaks and similar glitches; 1000 is a good starting point.
+
+A more detailed template is shown in ../examples/persistent_demo/.
+Similarly to the previous mode, the feature works only with afl-gcc-fast or
+afl-clang-fast; #ifdef guards can be used to suppress it when using other
+compilers.
+
+Note that as with the previous mode, the feature is easy to misuse; if you
+do not reset the critical state fully, you may end up with false positives or
+waste a whole lot of CPU power doing nothing useful at all. Be particularly
+wary of memory leaks and the state of file descriptors.
+
+When running in this mode, the execution paths will inherently vary a bit
+depending on whether the input loop is being entered for the first time or
+executed again. To avoid spurious warnings, the feature implies
+AFL_NO_VAR_CHECK and hides the "variable path" warnings in the UI.
+
-- 
cgit 1.4.1


From 454a860020048c5531f518b5691c92949bdc8017 Mon Sep 17 00:00:00 2001
From: van Hauser <vh@thc.org>
Date: Wed, 9 Sep 2020 23:25:01 +0200
Subject: update gcc readme

---
 instrumentation/README.gcc_plugin.md | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

(limited to 'instrumentation/README.gcc_plugin.md')

diff --git a/instrumentation/README.gcc_plugin.md b/instrumentation/README.gcc_plugin.md
index 9d6bc200..53519b90 100644
--- a/instrumentation/README.gcc_plugin.md
+++ b/instrumentation/README.gcc_plugin.md
@@ -1,12 +1,7 @@
 # GCC-based instrumentation for afl-fuzz
 
-  (See [../README.md](../README.md) for the general instruction manual.)
-  (See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation.)
-
-!!! TODO items are:
-!!!  => inline instrumentation has to work!
-!!!
-
+See [../README.md](../README.md) for the general instruction manual.
+See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation.
 
 ## 1) Introduction
 
@@ -41,7 +36,7 @@ The idea and much of the implementation comes from Laszlo Szekeres.
 In order to leverage this mechanism, you need to have modern enough GCC
 (>= version 4.5.0) and the plugin headers installed on your system. That
 should be all you need. On Debian machines, these headers can be acquired by
-installing the `gcc-<VERSION>-plugin-dev` packages.
+installing the `gcc-VERSION-plugin-dev` packages.
 
 To build the instrumentation itself, type 'make'. This will generate binaries
 called afl-gcc-fast and afl-g++-fast in the parent directory. 
@@ -56,8 +51,8 @@ standard operating mode of AFL, e.g.:
 Be sure to also include CXX set to afl-g++-fast for C++ code.
 
 The tool honors roughly the same environmental variables as afl-gcc (see
-[env_variables.md](../docs/env_variables.md). This includes AFL_INST_RATIO, AFL_USE_ASAN,
-AFL_HARDEN, and AFL_DONT_OPTIMIZE.
+[env_variables.md](../docs/env_variables.md)). This includes AFL_INST_RATIO,
+AFL_USE_ASAN, AFL_HARDEN, and AFL_DONT_OPTIMIZE.
 
 Note: if you want the GCC plugin to be installed on your system for all
 users, you need to build it before issuing 'make install' in the parent
@@ -66,7 +61,7 @@ directory.
 ## 3) Gotchas, feedback, bugs
 
 This is an early-stage mechanism, so field reports are welcome. You can send bug
-reports to <hexcoder-@github.com>.
+reports to <afl@aflplus.plus>.
 
 ## 4) Bonus feature #1: deferred initialization
 
-- 
cgit 1.4.1


From fdb0452245672db94be0832288f1335e905a2fc8 Mon Sep 17 00:00:00 2001
From: van Hauser <vh@thc.org>
Date: Thu, 10 Sep 2020 08:54:57 +0200
Subject: update documentation

---
 README.md                            | 12 +++++-------
 instrumentation/README.gcc_plugin.md | 11 +++++++++++
 2 files changed, 16 insertions(+), 7 deletions(-)

(limited to 'instrumentation/README.gcc_plugin.md')

diff --git a/README.md b/README.md
index fb59835c..2fc9d807 100644
--- a/README.md
+++ b/README.md
@@ -41,7 +41,7 @@ behaviours:
   * When instrumenting targets, afl-cc will not supersede optimizations. This
     allows to fuzz targets as same as they are built for debug or release.
   * afl-fuzz' `-i` option now descends into subdirectories.
-  * afl-fuzz will skip over empty dictionaries and too large test cases instead
+  * afl-fuzz will skip over empty dictionaries and too-large test cases instead
     of failing.
 
 ## Contents
@@ -63,20 +63,20 @@ behaviours:
 
   | Feature/Instrumentation  | afl-gcc | llvm      | gcc_plugin | qemu_mode        | unicorn_mode |
   | -------------------------|:-------:|:---------:|:----------:|:----------------:|:------------:|
-  | NeverZero                | x86[_64]|     x(1)  |      (2)   |         x        |       x      |
+  | NeverZero                | x86[_64]|     x(1)  |     x      |         x        |       x      |
   | Persistent Mode          |         |     x     |     x      | x86[_64]/arm[64] |       x      |
   | LAF-Intel / CompCov      |         |     x     |            | x86[_64]/arm[64] | x86[_64]/arm |
   | CmpLog                   |         |     x     |            | x86[_64]/arm[64] |              |
-  | Selective Instrumentation|         |     x     |     x      |        (x)(3)    |              |
+  | Selective Instrumentation|         |     x     |     x      |         x        |              |
   | Non-Colliding Coverage   |         |     x(4)  |            |        (x)(5)    |              |
   | Ngram prev_loc Coverage  |         |     x(6)  |            |                  |              |
   | Context Coverage         |         |     x(6)  |            |                  |              |
   | Auto Dictionary          |         |     x(7)  |            |                  |              |
-  | Snapshot LKM Support     |         |     x     |            |        (x)(5)    |              |
+  | Snapshot LKM Support     |         |     x     |     x      |        (x)(5)    |              |
 
   1. default for LLVM >= 9.0, env var for older version due an efficiency bug in llvm <= 8
   2. GCC creates non-performant code, hence it is disabled in gcc_plugin
-  3. partially via AFL_CODE_START/AFL_CODE_END
+  3. (currently unassigned)
   4. with pcguard mode and LTO mode for LLVM >= 11
   5. upcoming, development in the branch
   6. not compatible with LTO instrumentation and needs at least LLVM >= 4.1
@@ -92,8 +92,6 @@ behaviours:
   * AFLfast's power schedules by Marcel Böhme: [https://github.com/mboehme/aflfast](https://github.com/mboehme/aflfast)
   * The MOpt mutator: [https://github.com/puppet-meteor/MOpt-AFL](https://github.com/puppet-meteor/MOpt-AFL)
   * LLVM mode Ngram coverage by Adrian Herrera [https://github.com/adrianherrera/afl-ngram-pass](https://github.com/adrianherrera/afl-ngram-pass)
-  * C. Holler's afl-fuzz Python mutator module: [https://github.com/choller/afl](https://github.com/choller/afl)
-  * Custom mutator by a library (instead of Python) by kyakdan
   * LAF-Intel/CompCov support for instrumentation, qemu_mode and unicorn_mode (with enhanced capabilities)
   * Radamsa and honggfuzz mutators (as custom mutators).
   * QBDI mode to fuzz android native libraries via Quarkslab's [QBDI](https://github.com/QBDI/QBDI) framework
diff --git a/instrumentation/README.gcc_plugin.md b/instrumentation/README.gcc_plugin.md
index 53519b90..919801d1 100644
--- a/instrumentation/README.gcc_plugin.md
+++ b/instrumentation/README.gcc_plugin.md
@@ -3,6 +3,13 @@
 See [../README.md](../README.md) for the general instruction manual.
 See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation.
 
+TLDR:
+  * `apt-get install gcc-VERSION-plugin-dev`
+  * `make`
+  * gcc and g++ must point to gcc-VERSION, or you have to set AFL_CC/AFL_CXX
+    to point to these!
+  * just use afl-gcc-fast/afl-g++-fast normally like you would afl-clang-fast
+
 ## 1) Introduction
 
 The code in this directory allows you to instrument programs for AFL using
@@ -40,8 +47,12 @@ installing the `gcc-VERSION-plugin-dev` packages.
 
 To build the instrumentation itself, type 'make'. This will generate binaries
 called afl-gcc-fast and afl-g++-fast in the parent directory. 
+
+The gcc and g++ compiler links have to point to gcc-VERSION. Alternatively,
+point the environment variables AFL_CC/AFL_CXX to the matching compilers.
 If the CC/CXX have been overridden, those compilers will be used from
 those wrappers without using AFL_CXX/AFL_CC settings.
+
 Once this is done, you can instrument third-party code in a way similar to the
 standard operating mode of AFL, e.g.:
 
-- 
cgit 1.4.1