From 437efe795aa251d3beede18e2efd2f584884f060 Mon Sep 17 00:00:00 2001
From: Andrea Fioraldi <andreafioraldi@gmail.com>
Date: Sat, 1 Feb 2020 20:20:41 +0100
Subject: adjust a bit readmes

---
 docs/ChangeLog                 |   2 +-
 gcc_plugin/README.gcc.md       | 160 -----------------------------------------
 gcc_plugin/README.md           | 158 ++++++++++++++++++++++++++++++++++++++++
 libtokencap/README.md          |  64 +++++++++++++++++
 libtokencap/README.tokencap.md |  64 -----------------
 testcases/README.md            |  17 +++++
 testcases/README.testcases     |  19 -----
 7 files changed, 240 insertions(+), 244 deletions(-)
 delete mode 100644 gcc_plugin/README.gcc.md
 create mode 100644 gcc_plugin/README.md
 create mode 100644 libtokencap/README.md
 delete mode 100644 libtokencap/README.tokencap.md
 create mode 100644 testcases/README.md
 delete mode 100644 testcases/README.testcases
diff --git a/docs/ChangeLog b/docs/ChangeLog
index aef3ae7c..a559f6f2 100644
--- a/docs/ChangeLog
+++ b/docs/ChangeLog
@@ -34,7 +34,7 @@ Version ++2.60d (develop):
     the original script is still present as afl-cmin.bash
   - added blacklist and whitelisting function check in all modules of llvm_mode
   - added fix from Debian project to compile libdislocator and libtokencap
-  - libdislocator: AFL_ALIGNED_ALLOC to force size alignment to sizeof(void*)
+  - libdislocator: AFL_ALIGNED_ALLOC to force size alignment to max_align_t
 
 
 --------------------------
diff --git a/gcc_plugin/README.gcc.md b/gcc_plugin/README.gcc.md
deleted file mode 100644
index 80fccfb6..00000000
--- a/gcc_plugin/README.gcc.md
+++ /dev/null
@@ -1,160 +0,0 @@
-===========================================
-GCC-based instrumentation for afl-fuzz
-======================================
-
-  (See ../docs/README.md for the general instruction manual.)
-  (See ../llvm_mode/README.md for the LLVM-based instrumentation.)
-
-!!! TODO items are:
-!!!  => inline instrumentation has to work!
-!!!
-
-
-## 1) Introduction
-
-The code in this directory allows you to instrument programs for AFL using
-true compiler-level instrumentation, instead of the more crude
-assembly-level rewriting approach taken by afl-gcc and afl-clang. This has
-several interesting properties:
-
-  - The compiler can make many optimizations that are hard to pull off when
-    manually inserting assembly. As a result, some slow, CPU-bound programs will
-    run up to around faster.
-
-    The gains are less pronounced for fast binaries, where the speed is limited
-    chiefly by the cost of creating new processes. In such cases, the gain will
-    probably stay within 10%.
-
-  - The instrumentation is CPU-independent. At least in principle, you should
-    be able to rely on it to fuzz programs on non-x86 architectures (after
-    building afl-fuzz with AFL_NOX86=1).
-
-  - Because the feature relies on the internals of GCC, it is gcc-specific
-    and will *not* work with LLVM (see ../llvm_mode for an alternative).
-
-Once this implementation is shown to be sufficiently robust and portable, it
-will probably replace afl-gcc. For now, it can be built separately and
-co-exists with the original code.
-
-The idea and much of the implementation comes from Laszlo Szekeres.
-
-## 2) How to use
-
-In order to leverage this mechanism, you need to have modern enough GCC
-(>= version 4.5.0) and the plugin headers installed on your system. That
-should be all you need. On Debian machines, these headers can be acquired by
-installing the `gcc-<VERSION>-plugin-dev` packages.
-
-To build the instrumentation itself, type 'make'. This will generate binaries
-called afl-gcc-fast and afl-g++-fast in the parent directory. 
-If the CC/CXX have been overridden, those compilers will be used from
-those wrappers without using AFL_CXX/AFL_CC settings.
-Once this is done, you can instrument third-party code in a way similar to the
-standard operating mode of AFL, e.g.:
-
-  CC=/path/to/afl/afl-gcc-fast ./configure [...options...]
-  make
-
-Be sure to also include CXX set to afl-g++-fast for C++ code.
-
-The tool honors roughly the same environmental variables as afl-gcc (see
-../docs/env_variables.txt). This includes AFL_INST_RATIO, AFL_USE_ASAN,
-AFL_HARDEN, and AFL_DONT_OPTIMIZE.
-
-Note: if you want the GCC plugin to be installed on your system for all
-users, you need to build it before issuing 'make install' in the parent
-directory.
-
-## 3) Gotchas, feedback, bugs
-
-This is an early-stage mechanism, so field reports are welcome. You can send bug
-reports to <hexcoder-@github.com>.
-
-## 4) Bonus feature #1: deferred initialization
-
-AFL tries to optimize performance by executing the targeted binary just once,
-stopping it just before main(), and then cloning this "master" process to get
-a steady supply of targets to fuzz.
-
-Although this approach eliminates much of the OS-, linker- and libc-level
-costs of executing the program, it does not always help with binaries that
-perform other time-consuming initialization steps - say, parsing a large config
-file before getting to the fuzzed data.
-
-In such cases, it's beneficial to initialize the forkserver a bit later, once
-most of the initialization work is already done, but before the binary attempts
-to read the fuzzed input and parse it; in some cases, this can offer a 10x+
-performance gain. You can implement delayed initialization in LLVM mode in a
-fairly simple way.
-
-First, locate a suitable location in the code where the delayed cloning can
-take place. This needs to be done with *extreme* care to avoid breaking the
-binary. In particular, the program will probably malfunction if you select
-a location after:
-
-  - The creation of any vital threads or child processes - since the forkserver
-    can't clone them easily.
-
-  - The initialization of timers via setitimer() or equivalent calls.
-
-  - The creation of temporary files, network sockets, offset-sensitive file
-    descriptors, and similar shared-state resources - but only provided that
-    their state meaningfully influences the behavior of the program later on.
-
-  - Any access to the fuzzed input, including reading the metadata about its
-    size.
-
-With the location selected, add this code in the appropriate spot:
-
-```
-#ifdef __AFL_HAVE_MANUAL_CONTROL
-  __AFL_INIT();
-#endif
-```
-
-You don't need the #ifdef guards, but they will make the program still work as
-usual when compiled with a tool other than afl-gcc-fast/afl-clang-fast.
-
-Finally, recompile the program with afl-gcc-fast (afl-gcc or afl-clang will
-*not* generate a deferred-initialization binary) - and you should be all set!
-
-## 5) Bonus feature #2: persistent mode
-
-Some libraries provide APIs that are stateless, or whose state can be reset in
-between processing different input files. When such a reset is performed, a
-single long-lived process can be reused to try out multiple test cases,
-eliminating the need for repeated fork() calls and the associated OS overhead.
-
-The basic structure of the program that does this would be:
-
-```
-  while (__AFL_LOOP(1000)) {
-
-    /* Read input data. */
-    /* Call library code to be fuzzed. */
-    /* Reset state. */
-
-  }
-
-  /* Exit normally */
-```
-
-The numerical value specified within the loop controls the maximum number
-of iterations before AFL will restart the process from scratch. This minimizes
-the impact of memory leaks and similar glitches; 1000 is a good starting point.
-
-A more detailed template is shown in ../experimental/persistent_demo/.
-Similarly to the previous mode, the feature works only with afl-gcc-fast or
-afl-clang-fast; #ifdef guards can be used to suppress it when using other
-compilers.
-
-Note that as with the previous mode, the feature is easy to misuse; if you
-do not reset the critical state fully, you may end up with false positives or
-waste a whole lot of CPU power doing nothing useful at all. Be particularly
-wary of memory leaks and the state of file descriptors.
-
-When running in this mode, the execution paths will inherently vary a bit
-depending on whether the input loop is being entered for the first time or
-executed again. To avoid spurious warnings, the feature implies
-AFL_NO_VAR_CHECK and hides the "variable path" warnings in the UI.
-
diff --git a/gcc_plugin/README.md b/gcc_plugin/README.md
new file mode 100644
index 00000000..8b944f1a
--- /dev/null
+++ b/gcc_plugin/README.md
@@ -0,0 +1,158 @@
+# GCC-based instrumentation for afl-fuzz
+
+  (See [../README.md](../README.md) for the general instruction manual.)
+  (See [../llvm_mode/README.md](../llvm_mode/README.md) for the LLVM-based instrumentation.)
+
+!!! TODO items are:
+!!!  => inline instrumentation has to work!
+!!!
+
+
+## 1) Introduction
+
+The code in this directory allows you to instrument programs for AFL using
+true compiler-level instrumentation, instead of the more crude
+assembly-level rewriting approach taken by afl-gcc and afl-clang. This has
+several interesting properties:
+
+  - The compiler can make many optimizations that are hard to pull off when
+    manually inserting assembly. As a result, some slow, CPU-bound programs will
+    run noticeably faster.
+
+    The gains are less pronounced for fast binaries, where the speed is limited
+    chiefly by the cost of creating new processes. In such cases, the gain will
+    probably stay within 10%.
+
+  - The instrumentation is CPU-independent. At least in principle, you should
+    be able to rely on it to fuzz programs on non-x86 architectures (after
+    building afl-fuzz with AFL_NOX86=1).
+
+  - Because the feature relies on the internals of GCC, it is gcc-specific
+    and will *not* work with LLVM (see ../llvm_mode for an alternative).
+
+Once this implementation is shown to be sufficiently robust and portable, it
+will probably replace afl-gcc. For now, it can be built separately and
+co-exists with the original code.
+
+The idea and much of the implementation comes from Laszlo Szekeres.
+
+## 2) How to use
+
+In order to leverage this mechanism, you need to have modern enough GCC
+(>= version 4.5.0) and the plugin headers installed on your system. That
+should be all you need. On Debian machines, these headers can be acquired by
+installing the `gcc-<VERSION>-plugin-dev` packages.
+
+To build the instrumentation itself, type 'make'. This will generate binaries
+called afl-gcc-fast and afl-g++-fast in the parent directory.
+If CC/CXX have been overridden, the wrappers will use those compilers
+directly, without the need for AFL_CC/AFL_CXX settings.
+Once this is done, you can instrument third-party code in a way similar to the
+standard operating mode of AFL, e.g.:
+
+  CC=/path/to/afl/afl-gcc-fast ./configure [...options...]
+  make
+
+Be sure to also set CXX to afl-g++-fast for C++ code.
+
+The tool honors roughly the same environment variables as afl-gcc (see
+../docs/env_variables.txt). This includes AFL_INST_RATIO, AFL_USE_ASAN,
+AFL_HARDEN, and AFL_DONT_OPTIMIZE.
+
+Note: if you want the GCC plugin to be installed on your system for all
+users, you need to build it before issuing 'make install' in the parent
+directory.
+
+## 3) Gotchas, feedback, bugs
+
+This is an early-stage mechanism, so field reports are welcome. You can send bug
+reports to <hexcoder-@github.com>.
+
+## 4) Bonus feature #1: deferred initialization
+
+AFL tries to optimize performance by executing the targeted binary just once,
+stopping it just before main(), and then cloning this "master" process to get
+a steady supply of targets to fuzz.
+
+Although this approach eliminates much of the OS-, linker- and libc-level
+costs of executing the program, it does not always help with binaries that
+perform other time-consuming initialization steps - say, parsing a large config
+file before getting to the fuzzed data.
+
+In such cases, it's beneficial to initialize the forkserver a bit later, once
+most of the initialization work is already done, but before the binary attempts
+to read the fuzzed input and parse it; in some cases, this can offer a 10x+
+performance gain. You can implement delayed initialization with the GCC
+plugin in a fairly simple way.
+
+First, locate a suitable location in the code where the delayed cloning can
+take place. This needs to be done with *extreme* care to avoid breaking the
+binary. In particular, the program will probably malfunction if you select
+a location after:
+
+  - The creation of any vital threads or child processes - since the forkserver
+    can't clone them easily.
+
+  - The initialization of timers via setitimer() or equivalent calls.
+
+  - The creation of temporary files, network sockets, offset-sensitive file
+    descriptors, and similar shared-state resources - but only provided that
+    their state meaningfully influences the behavior of the program later on.
+
+  - Any access to the fuzzed input, including reading the metadata about its
+    size.
+
+With the location selected, add this code in the appropriate spot:
+
+```
+#ifdef __AFL_HAVE_MANUAL_CONTROL
+  __AFL_INIT();
+#endif
+```
+
+You don't need the #ifdef guards, but they will make the program still work as
+usual when compiled with a tool other than afl-gcc-fast/afl-clang-fast.
+
+Finally, recompile the program with afl-gcc-fast (afl-gcc or afl-clang will
+*not* generate a deferred-initialization binary) - and you should be all set!
+
+## 5) Bonus feature #2: persistent mode
+
+Some libraries provide APIs that are stateless, or whose state can be reset in
+between processing different input files. When such a reset is performed, a
+single long-lived process can be reused to try out multiple test cases,
+eliminating the need for repeated fork() calls and the associated OS overhead.
+
+The basic structure of the program that does this would be:
+
+```
+  while (__AFL_LOOP(1000)) {
+
+    /* Read input data. */
+    /* Call library code to be fuzzed. */
+    /* Reset state. */
+
+  }
+
+  /* Exit normally */
+```
+
+The numerical value specified within the loop controls the maximum number
+of iterations before AFL will restart the process from scratch. This minimizes
+the impact of memory leaks and similar glitches; 1000 is a good starting point.
+
+A more detailed template is shown in ../experimental/persistent_demo/.
+Similarly to the previous mode, the feature works only with afl-gcc-fast or
+afl-clang-fast; #ifdef guards can be used to suppress it when using other
+compilers.
+
+Note that as with the previous mode, the feature is easy to misuse; if you
+do not reset the critical state fully, you may end up with false positives or
+waste a whole lot of CPU power doing nothing useful at all. Be particularly
+wary of memory leaks and the state of file descriptors.
+
+When running in this mode, the execution paths will inherently vary a bit
+depending on whether the input loop is being entered for the first time or
+executed again. To avoid spurious warnings, the feature implies
+AFL_NO_VAR_CHECK and hides the "variable path" warnings in the UI.
+
diff --git a/libtokencap/README.md b/libtokencap/README.md
new file mode 100644
index 00000000..8aae38bf
--- /dev/null
+++ b/libtokencap/README.md
@@ -0,0 +1,64 @@
+# strcmp() / memcmp() token capture library
+
+  (See ../docs/README for the general instruction manual.)
+
+This companion library allows you to instrument `strcmp()`, `memcmp()`,
+and related functions to automatically extract syntax tokens passed to any of
+these libcalls. The resulting list of tokens may then be given as a starting
+dictionary to afl-fuzz (the -x option) to improve coverage on subsequent
+fuzzing runs.
+
+This may help improve coverage in some targets, and do precisely nothing in
+others. In some cases, it may even make things worse: if libtokencap picks up
+syntax tokens that are not used to process the input data, but that are a part
+of - say - parsing a config file... well, you're going to end up wasting a lot
+of CPU time on trying them out in the input stream. In other words, use this
+feature with care. Manually screening the resulting dictionary is almost
+always a necessity.
+
+As for the actual operation: the library stores tokens, without any deduping,
+by appending them to a file specified via AFL_TOKEN_FILE. If the variable is not
+set, the tool uses stderr (which is probably not what you want).
+
+Similarly to afl-tmin, the library is not "proprietary" and can be used with
+other fuzzers or testing tools without the need for any code tweaks. It does not
+require AFL-instrumented binaries to work.
+
+To use the library, you *need* to make sure that your fuzzing target is compiled
+with -fno-builtin and is linked dynamically. If you wish to automate the first
+part without mucking with CFLAGS in Makefiles, you can set AFL_NO_BUILTIN=1
+when using afl-gcc. This setting specifically adds the following flags:
+
+```
+  -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strcasecmp
+  -fno-builtin-strncasecmp -fno-builtin-memcmp -fno-builtin-strstr
+  -fno-builtin-strcasestr
+```
+
+The next step is simply loading this library via LD_PRELOAD. The optimal usage
+pattern is to allow afl-fuzz to fuzz normally for a while and build up a corpus,
+and then fire off the target binary, with libtokencap.so loaded, on every file
+found by AFL in that earlier run. This demonstrates the basic principle:
+
+```
+  export AFL_TOKEN_FILE=$PWD/temp_output.txt
+
+  for i in <out_dir>/queue/id*; do
+    LD_PRELOAD=/path/to/libtokencap.so \
+      /path/to/target/program [...params, including $i...]
+  done
+
+  sort -u temp_output.txt >afl_dictionary.txt
+```
+
+If you don't get any results, the target library is probably not using strcmp()
+and memcmp() to parse input; or you haven't compiled it with -fno-builtin; or
+the whole thing isn't dynamically linked, and LD_PRELOAD is having no effect.
+
+Portability hints: There is probably no particularly portable and non-invasive
+way to distinguish between read-only and read-write memory mappings.
+The `__tokencap_load_mappings()` function is the only thing that would
+need to be changed for other OSes.
+
+Currently supported OSes are: Linux, Darwin, FreeBSD (thanks to @devnexen)
+
diff --git a/libtokencap/README.tokencap.md b/libtokencap/README.tokencap.md
deleted file mode 100644
index 8aae38bf..00000000
--- a/libtokencap/README.tokencap.md
+++ /dev/null
@@ -1,64 +0,0 @@
-# strcmp() / memcmp() token capture library
-
-  (See ../docs/README for the general instruction manual.)
-
-This companion library allows you to instrument `strcmp()`, `memcmp()`,
-and related functions to automatically extract syntax tokens passed to any of
-these libcalls. The resulting list of tokens may be then given as a starting
-dictionary to afl-fuzz (the -x option) to improve coverage on subsequent
-fuzzing runs.
-
-This may help improving coverage in some targets, and do precisely nothing in
-others. In some cases, it may even make things worse: if libtokencap picks up
-syntax tokens that are not used to process the input data, but that are a part
-of - say - parsing a config file... well, you're going to end up wasting a lot
-of CPU time on trying them out in the input stream. In other words, use this
-feature with care. Manually screening the resulting dictionary is almost
-always a necessity.
-
-As for the actual operation: the library stores tokens, without any deduping,
-by appending them to a file specified via AFL_TOKEN_FILE. If the variable is not
-set, the tool uses stderr (which is probably not what you want).
-
-Similarly to afl-tmin, the library is not "proprietary" and can be used with
-other fuzzers or testing tools without the need for any code tweaks. It does not
-require AFL-instrumented binaries to work.
-
-To use the library, you *need* to make sure that your fuzzing target is compiled
-with -fno-builtin and is linked dynamically. If you wish to automate the first
-part without mucking with CFLAGS in Makefiles, you can set AFL_NO_BUILTIN=1
-when using afl-gcc. This setting specifically adds the following flags:
-
-```
-  -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strcasecmp
-  -fno-builtin-strcasencmp -fno-builtin-memcmp -fno-builtin-strstr
-  -fno-builtin-strcasestr
-```
-
-The next step is simply loading this library via LD_PRELOAD. The optimal usage
-pattern is to allow afl-fuzz to fuzz normally for a while and build up a corpus,
-and then fire off the target binary, with libtokencap.so loaded, on every file
-found by AFL in that earlier run. This demonstrates the basic principle:
-
-```
-  export AFL_TOKEN_FILE=$PWD/temp_output.txt
-
-  for i in <out_dir>/queue/id*; do
-    LD_PRELOAD=/path/to/libtokencap.so \
-      /path/to/target/program [...params, including $i...]
-  done
-
-  sort -u temp_output.txt >afl_dictionary.txt
-```
-
-If you don't get any results, the target library is probably not using strcmp()
-and memcmp() to parse input; or you haven't compiled it with -fno-builtin; or
-the whole thing isn't dynamically linked, and LD_PRELOAD is having no effect.
-
-Portability hints: There is probably no particularly portable and non-invasive
-way to distinguish between read-only and read-write memory mappings.
-The `__tokencap_load_mappings()` function is the only thing that would
-need to be changed for other OSes.
-
-Current supported OSes are: Linux, Darwin, FreeBSD (thanks to @devnexen)
-
diff --git a/testcases/README.md b/testcases/README.md
new file mode 100644
index 00000000..ef38d3c4
--- /dev/null
+++ b/testcases/README.md
@@ -0,0 +1,17 @@
+# AFL starting test cases
+
+  (See [../README.md](../README.md) for the general instruction manual.)
+
+The archives/, images/, multimedia/, and others/ subdirectories contain small,
+standalone files that can be used to seed afl-fuzz when testing parsers for a
+variety of common data formats.
+
+There is probably not much to be said about these files, except that they were
+optimized for size and stripped of any non-essential fluff. Some directories
+contain several examples that exercise various features of the underlying format.
+For example, there is a PNG file with and without a color profile.
+
+Additional test cases are always welcome.
+
+In addition to well-chosen starting files, many fuzzing jobs benefit from a
+small and concise dictionary. See [../dictionaries/README.md](../dictionaries/README.md) for more.
diff --git a/testcases/README.testcases b/testcases/README.testcases
deleted file mode 100644
index 30110ba1..00000000
--- a/testcases/README.testcases
+++ /dev/null
@@ -1,19 +0,0 @@
-=======================
-AFL starting test cases
-=======================
-
-  (See ../docs/README for the general instruction manual.)
-
-The archives/, images/, multimedia/, and others/ subdirectories contain small,
-standalone files that can be used to seed afl-fuzz when testing parsers for a
-variety of common data formats.
-
-There is probably not much to be said about these files, except that they were
-optimized for size and stripped of any non-essential fluff. Some directories
-contain several examples that exercise various features of the underlying format.
-For example, there is a PNG file with and without a color profile.
-
-Additional test cases are always welcome.
-
-In addition to well-chosen starting files, many fuzzing jobs benefit from a
-small and concise dictionary. See ../dictionaries/README.dictionaries for more.
-- 
cgit 1.4.1