-rw-r--r-- docs/env_variables.md 7
-rw-r--r-- docs/fuzzing_expert.md 12
-rw-r--r-- instrumentation/README.ctx.md 38
-rw-r--r-- instrumentation/README.llvm.md 229
-rw-r--r-- instrumentation/README.ngram.md 28
5 files changed, 152 insertions, 162 deletions
diff --git a/docs/env_variables.md b/docs/env_variables.md
index 65cca0dc..4386c5f8 100644
--- a/docs/env_variables.md
+++ b/docs/env_variables.md
@@ -171,7 +171,7 @@ config.h to at least 18 and maybe up to 20 for this as otherwise too many map
 collisions occur.
 
 For more information, see
-[instrumentation/README.ctx.md](../instrumentation/README.ctx.md).
+[instrumentation/README.llvm.md#6) AFL Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage).
 
 #### INSTRUMENT LIST (selectively instrument files and functions)
 
@@ -247,7 +247,7 @@ in config.h to at least 18 and maybe up to 20 for this as otherwise too many map
 collisions occur.
 
 For more information, see
-[instrumentation/README.ngram.md](../instrumentation/README.ngram.md).
+[instrumentation/README.llvm.md#7) AFL N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage).
 
 #### NOT_ZERO
 
@@ -261,9 +261,6 @@ For more information, see
     If the target performs only a few loops, then this will give a small
     performance boost.
 
-For more information, see
-[instrumentation/README.neverzero.md](../instrumentation/README.neverzero.md).
-
 #### Thread safe instrumentation counters (in all modes)
 
 Setting `AFL_LLVM_THREADSAFE_INST` will inject code that implements thread safe
diff --git a/docs/fuzzing_expert.md b/docs/fuzzing_expert.md
index 876c5fbb..5945d114 100644
--- a/docs/fuzzing_expert.md
+++ b/docs/fuzzing_expert.md
@@ -112,12 +112,8 @@ are interested in:
 
 There are many more options and modes available however these are most of the
 time less effective. See:
- * [instrumentation/README.ctx.md](../instrumentation/README.ctx.md)
- * [instrumentation/README.ngram.md](../instrumentation/README.ngram.md)
-
-AFL++ performs "never zero" counting in its bitmap. You can read more about this
-here:
- * [instrumentation/README.neverzero.md](../instrumentation/README.neverzero.md)
+ * [instrumentation/README.llvm.md#6) AFL Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage)
+ * [instrumentation/README.llvm.md#7) AFL N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage)
 
 #### c) Sanitizers
 
@@ -247,7 +243,7 @@ For meson you have to set the AFL++ compiler with the very first command!
 
 Sometimes cmake and configure do not pick up the AFL++ compiler, or the
 ranlib/ar that is needed - because this was just not foreseen by the developer
-of the target. Or they have non-standard options. Figure out if there is a 
+of the target. Or they have non-standard options. Figure out if there is a
 non-standard way to set this, otherwise set up the build normally and edit the
 generated build environment afterwards manually to point it to the right compiler
 (and/or ranlib and ar).
@@ -337,7 +333,7 @@ Note that this step is rather optional though.
 
 #### Done!
 
-The INPUTS_UNIQUE/ directory from step b) - or even better the directory input/ 
+The INPUTS_UNIQUE/ directory from step b) - or even better the directory input/
 if you minimized the corpus in step c) - is the resulting input corpus directory
 to be used in fuzzing! :-)
 
diff --git a/instrumentation/README.ctx.md b/instrumentation/README.ctx.md
deleted file mode 100644
index 335e9921..00000000
--- a/instrumentation/README.ctx.md
+++ /dev/null
@@ -1,38 +0,0 @@
-# AFL Context Sensitive Branch Coverage
-
-## What is this?
-
-This is an LLVM-based implementation of the context sensitive branch coverage.
-
-Basically every function gets its own ID and, every time when an edge is logged,
-all the IDs in the callstack are hashed and combined with the edge transition
-hash to augment the classic edge coverage with the information about the
-calling context.
-
-So if both function A and function B call a function C, the coverage
-collected in C will be different.
-
-In math the coverage is collected as follows:
-`map[current_location_ID ^ previous_location_ID >> 1 ^ hash_callstack_IDs] += 1`
-
-The callstack hash is produced XOR-ing the function IDs to avoid explosion with
-recursive functions.
-
-## Usage
-
-Set the `AFL_LLVM_INSTRUMENT=CTX` or `AFL_LLVM_CTX=1` environment variable.
-
-It is highly recommended to increase the MAP_SIZE_POW2 definition in
-config.h to at least 18 and maybe up to 20 for this as otherwise too
-many map collisions occur.
-
-## Caller Branch Coverage
-
-If the context sensitive coverage introduces too may collisions and becoming
-detrimental, the user can choose to augment edge coverage with just the
-called function ID, instead of the entire callstack hash.
-
-In math the coverage is collected as follows:
-`map[current_location_ID ^ previous_location_ID >> 1 ^ previous_callee_ID] += 1`
-
-Set the `AFL_LLVM_INSTRUMENT=CALLER` or `AFL_LLVM_CALLER=1` environment variable.
diff --git a/instrumentation/README.llvm.md b/instrumentation/README.llvm.md
index dbb604f2..1671f385 100644
--- a/instrumentation/README.llvm.md
+++ b/instrumentation/README.llvm.md
@@ -1,72 +1,79 @@
 # Fast LLVM-based instrumentation for afl-fuzz
 
-  (See [../README.md](../README.md) for the general instruction manual.)
+For the general instruction manual, see [../README.md](../README.md).
 
-  (See [README.gcc_plugin.md](README.gcc_plugin.md) for the GCC-based instrumentation.)
+For the GCC-based instrumentation, see
+[README.gcc_plugin.md](README.gcc_plugin.md).
 
 ## 1) Introduction
 
 ! llvm_mode works with llvm versions 3.8 up to 13 !
 
-The code in this directory allows you to instrument programs for AFL using
-true compiler-level instrumentation, instead of the more crude
-assembly-level rewriting approach taken by afl-gcc and afl-clang. This has
-several interesting properties:
+The code in this directory allows you to instrument programs for AFL using true
+compiler-level instrumentation, instead of the more crude assembly-level
+rewriting approach taken by afl-gcc and afl-clang. This has several interesting
+properties:
 
-  - The compiler can make many optimizations that are hard to pull off when
-    manually inserting assembly. As a result, some slow, CPU-bound programs will
-    run up to around 2x faster.
+- The compiler can make many optimizations that are hard to pull off when
+  manually inserting assembly. As a result, some slow, CPU-bound programs will
+  run up to around 2x faster.
 
-    The gains are less pronounced for fast binaries, where the speed is limited
-    chiefly by the cost of creating new processes. In such cases, the gain will
-    probably stay within 10%.
+  The gains are less pronounced for fast binaries, where the speed is limited
+  chiefly by the cost of creating new processes. In such cases, the gain will
+  probably stay within 10%.
 
-  - The instrumentation is CPU-independent. At least in principle, you should
-    be able to rely on it to fuzz programs on non-x86 architectures (after
-    building afl-fuzz with AFL_NO_X86=1).
+- The instrumentation is CPU-independent. At least in principle, you should be
+  able to rely on it to fuzz programs on non-x86 architectures (after building
+  afl-fuzz with AFL_NO_X86=1).
 
-  - The instrumentation can cope a bit better with multi-threaded targets.
+- The instrumentation can cope a bit better with multi-threaded targets.
 
-  - Because the feature relies on the internals of LLVM, it is clang-specific
-    and will *not* work with GCC (see ../gcc_plugin/ for an alternative once
-    it is available).
+- Because the feature relies on the internals of LLVM, it is clang-specific and
+  will *not* work with GCC (see ../gcc_plugin/ for an alternative once it is
+  available).
 
 Once this implementation is shown to be sufficiently robust and portable, it
 will probably replace afl-clang. For now, it can be built separately and
 co-exists with the original code.
 
-The idea and much of the intial implementation came from Laszlo Szekeres.
+The idea and much of the initial implementation came from Laszlo Szekeres.
 
 ## 2a) How to use this - short
 
 Set the `LLVM_CONFIG` variable to the clang version you want to use, e.g.
+
 ```
 LLVM_CONFIG=llvm-config-9 make
 ```
+
 In case you have your own compiled llvm version specify the full path:
+
 ```
 LLVM_CONFIG=~/llvm-project/build/bin/llvm-config make
 ```
+
 If you try to use a new llvm version on an old Linux this can fail because of
 old c++ libraries. In this case usually switching to gcc/g++ to compile
 llvm_mode will work:
+
 ```
 LLVM_CONFIG=llvm-config-7 REAL_CC=gcc REAL_CXX=g++ make
 ```
-It is highly recommended to use the newest clang version you can put your
-hands on :)
+
+It is highly recommended to use the newest clang version you can put your hands
+on :)
 
 Then look at [README.persistent_mode.md](README.persistent_mode.md).
 
 ## 2b) How to use this - long
 
 In order to leverage this mechanism, you need to have clang installed on your
-system. You should also make sure that the llvm-config tool is in your path
-(or pointed to via LLVM_CONFIG in the environment).
+system. You should also make sure that the llvm-config tool is in your path (or
+pointed to via LLVM_CONFIG in the environment).
 
-Note that if you have several LLVM versions installed, pointing LLVM_CONFIG
-to the version you want to use will switch compiling to this specific
-version - if you installation is set up correctly :-)
+Note that if you have several LLVM versions installed, pointing LLVM_CONFIG to
+the version you want to use will switch compiling to this specific version - if
+your installation is set up correctly :-)
 
 Unfortunately, some systems that do have clang come without llvm-config or the
 LLVM development headers; one example of this is FreeBSD. FreeBSD users will
@@ -75,15 +82,15 @@ load modules (you'll see "Service unavailable" when loading afl-llvm-pass.so).
 
 To solve all your problems, you can grab pre-built binaries for your OS from:
 
-  https://llvm.org/releases/download.html
+[https://llvm.org/releases/download.html](https://llvm.org/releases/download.html)
 
 ...and then put the bin/ directory from the tarball at the beginning of your
 $PATH when compiling the feature and building packages later on. You don't need
 to be root for that.
 
-To build the instrumentation itself, type 'make'. This will generate binaries
-called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this
-is done, you can instrument third-party code in a way similar to the standard
+To build the instrumentation itself, type `make`. This will generate binaries
+called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this is
+done, you can instrument third-party code in a way similar to the standard
 operating mode of AFL, e.g.:
 
 ```
@@ -93,81 +100,137 @@ operating mode of AFL, e.g.:
 
 Be sure to also include CXX set to afl-clang-fast++ for C++ code.
 
-Note that afl-clang-fast/afl-clang-fast++ are just pointers to afl-cc.
-You can also use afl-cc/afl-c++ and instead direct it to use LLVM
-instrumentation by either setting `AFL_CC_COMPILER=LLVM` or pass the parameter
-`--afl-llvm` via CFLAGS/CXXFLAGS/CPPFLAGS.
+Note that afl-clang-fast/afl-clang-fast++ are just pointers to afl-cc. You can
+also use afl-cc/afl-c++ and instead direct it to use LLVM instrumentation by
+either setting `AFL_CC_COMPILER=LLVM` or passing the parameter `--afl-llvm` via
+CFLAGS/CXXFLAGS/CPPFLAGS.
 
 The tool honors roughly the same environmental variables as afl-gcc (see
 [docs/env_variables.md](../docs/env_variables.md)). This includes AFL_USE_ASAN,
-AFL_HARDEN, and AFL_DONT_OPTIMIZE. However AFL_INST_RATIO is not honored
-as it does not serve a good purpose with the more effective PCGUARD analysis.
+AFL_HARDEN, and AFL_DONT_OPTIMIZE. However AFL_INST_RATIO is not honored as it
+does not serve a good purpose with the more effective PCGUARD analysis.
 
 ## 3) Options
 
-Several options are present to make llvm_mode faster or help it rearrange
-the code to make afl-fuzz path discovery easier.
+Several options are present to make llvm_mode faster or help it rearrange the
+code to make afl-fuzz path discovery easier.
 
-If you need just to instrument specific parts of the code, you can the instrument file list
-which C/C++ files to actually instrument. See [README.instrument_list.md](README.instrument_list.md)
+If you need to instrument only specific parts of the code, you can use the
+instrument file list to specify which C/C++ files to actually instrument. See
+[README.instrument_list.md](README.instrument_list.md).
 
-For splitting memcmp, strncmp, etc. please see [README.laf-intel.md](README.laf-intel.md)
+For splitting memcmp, strncmp, etc. please see
+[README.laf-intel.md](README.laf-intel.md)
 
 Then there are different ways of instrumenting the target:
 
-1. An better instrumentation strategy uses LTO and link time
-instrumentation. Note that not all targets can compile in this mode, however
-if it works it is the best option you can use.
-Simply use afl-clang-lto/afl-clang-lto++ to use this option.
-See [README.lto.md](README.lto.md)
+1. A better instrumentation strategy uses LTO and link time instrumentation.
+   Note that not all targets can compile in this mode; however, if it works, it
+   is the best option you can use. Simply use afl-clang-lto/afl-clang-lto++ to
+   use this option. See [README.lto.md](README.lto.md).
 
-2. Alternativly you can choose a completely different coverage method:
+2. Alternatively you can choose a completely different coverage method:
 
-2a. N-GRAM coverage - which combines the previous visited edges with the
-current one. This explodes the map but on the other hand has proven to be
-effective for fuzzing.
-See [README.ngram.md](README.ngram.md)
+2a. N-GRAM coverage - which combines the previous visited edges with the current
+    one. This explodes the map but on the other hand has proven to be effective
+    for fuzzing. See
+    [7) AFL N-Gram Branch Coverage](#7-afl-n-gram-branch-coverage).
 
 2b. Context sensitive coverage - which combines the visited edges with an
-individual caller ID (the function that called the current one)
-[README.ctx.md](README.ctx.md)
+    individual caller ID (the function that called the current one). See
+    [6) AFL Context Sensitive Branch Coverage](#6-afl-context-sensitive-branch-coverage).
 
-Then - additionally to one of the instrumentation options above - there is
-a very effective new instrumentation option called CmpLog as an alternative to
-laf-intel that allow AFL++ to apply mutations similar to Redqueen.
-See [README.cmplog.md](README.cmplog.md)
+Then - additionally to one of the instrumentation options above - there is a
+very effective new instrumentation option called CmpLog as an alternative to
+laf-intel that allows AFL++ to apply mutations similar to Redqueen. See
+[README.cmplog.md](README.cmplog.md).
 
-Finally if your llvm version is 8 or lower, you can activate a mode that
-prevents that a counter overflow result in a 0 value. This is good for
-path discovery, but the llvm implementation for x86 for this functionality
-is not optimal and was only fixed in llvm 9.
-You can set this with AFL_LLVM_NOT_ZERO=1
-See [README.neverzero.md](README.neverzero.md)
+Finally, if your llvm version is 8 or lower, you can activate a mode that
+prevents a counter overflow from resulting in a 0 value. This is good for path
+discovery, but the llvm implementation for x86 for this functionality is not
+optimal and was only fixed in llvm 9. You can set this with AFL_LLVM_NOT_ZERO=1.
 
-Support for thread safe counters has been added for all modes.
-Activate it with `AFL_LLVM_THREADSAFE_INST=1`. The tradeoff is better precision
-in multi threaded apps for a slightly higher instrumentation overhead.
-This also disables the nozero counter default for performance reasons.
+Support for thread safe counters has been added for all modes. Activate it with
+`AFL_LLVM_THREADSAFE_INST=1`. The tradeoff is better precision in multi threaded
+apps for a slightly higher instrumentation overhead. This also disables the
+nozero counter default for performance reasons.
 
-## 4) Snapshot feature
+## 4) Deferred initialization, persistent mode, shared memory fuzzing
 
-To speed up fuzzing you can use a linux loadable kernel module which enables
-a snapshot feature.
-See [README.snapshot.md](README.snapshot.md)
+This is the most powerful and effective fuzzing you can do. Please see
+[README.persistent_mode.md](README.persistent_mode.md) for a full explanation.
 
-## 5) Gotchas, feedback, bugs
+## 5) Bonus feature: 'dict2file' pass
 
-This is an early-stage mechanism, so field reports are welcome. You can send bug
-reports to <afl-users@googlegroups.com>.
+Just specify `AFL_LLVM_DICT2FILE=/absolute/path/file.txt` and during compilation
+all constant string compare parameters will be written to this file to be used
+with afl-fuzz' `-x` option.
 
-## 6) deferred initialization, persistent mode, shared memory fuzzing
+## 6) AFL Context Sensitive Branch Coverage
 
-This is the most powerful and effective fuzzing you can do.
-Please see [README.persistent_mode.md](README.persistent_mode.md) for a
-full explanation.
+### What is this?
 
-## 7) Bonus feature: 'dict2file' pass
+This is an LLVM-based implementation of the context sensitive branch coverage.
 
-Just specify `AFL_LLVM_DICT2FILE=/absolute/path/file.txt` and during compilation
-all constant string compare parameters will be written to this file to be
-used with afl-fuzz' `-x` option.
+Basically every function gets its own ID and, every time when an edge is logged,
+all the IDs in the callstack are hashed and combined with the edge transition
+hash to augment the classic edge coverage with the information about the calling
+context.
+
+So if both function A and function B call a function C, the coverage collected
+in C will be different.
+
+In math the coverage is collected as follows: `map[current_location_ID ^
+previous_location_ID >> 1 ^ hash_callstack_IDs] += 1`
+
+The callstack hash is produced by XOR-ing the function IDs to avoid explosion
+with recursive functions.
+
+### Usage
+
+Set the `AFL_LLVM_INSTRUMENT=CTX` or `AFL_LLVM_CTX=1` environment variable.
+
+It is highly recommended to increase the MAP_SIZE_POW2 definition in config.h to
+at least 18 and maybe up to 20 for this as otherwise too many map collisions
+occur.
+
+### Caller Branch Coverage
+
+If the context sensitive coverage introduces too many collisions and becomes
+detrimental, the user can choose to augment edge coverage with just the called
+function ID, instead of the entire callstack hash.
+
+In math the coverage is collected as follows: `map[current_location_ID ^
+previous_location_ID >> 1 ^ previous_callee_ID] += 1`
+
+Set the `AFL_LLVM_INSTRUMENT=CALLER` or `AFL_LLVM_CALLER=1` environment
+variable.
+
+## 7) AFL N-Gram Branch Coverage
+
+### Source
+
+This is an LLVM-based implementation of the n-gram branch coverage proposed in
+the paper
+["Be Sensitive and Collaborative: Analyzing Impact of Coverage Metrics in Greybox Fuzzing"](https://www.usenix.org/system/files/raid2019-wang-jinghan.pdf)
+by Jinghan Wang, et al.
+
+Note that the original implementation (available
+[here](https://github.com/bitsecurerlab/afl-sensitive)) is built on top of AFL's
+qemu_mode. This is essentially a port that uses LLVM vectorized instructions
+(available from llvm versions 4.0.1 and higher) to achieve the same results when
+compiling source code.
+
+In math the branch coverage is performed as follows: `map[current_location ^
+prev_location[0] >> 1 ^ prev_location[1] >> 1 ^ ... up to n-1] += 1`
+
+### Usage
+
+The size of `n` (i.e., the number of branches to remember) is an option that is
+specified either in the `AFL_LLVM_INSTRUMENT=NGRAM-{value}` or the
+`AFL_LLVM_NGRAM_SIZE` environment variable. Good values are 2, 4, or 8, valid
+are 2-16.
+
+It is highly recommended to increase the MAP_SIZE_POW2 definition in config.h to
+at least 18 and maybe up to 20 for this as otherwise too many map collisions
+occur.
\ No newline at end of file
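The two map-index formulas in the new section 6 can be simulated in a few lines. The sketch below is only an illustration of the arithmetic, not the LLVM pass itself: the function names, the sample IDs, and the masking to the map size are assumptions, and the formula is read with C operator precedence (`>>` binds tighter than `^`).

```python
MAP_SIZE_POW2 = 18            # the README recommends at least 18 for CTX
MAP_SIZE = 1 << MAP_SIZE_POW2


def callstack_hash(function_ids):
    """XOR all function IDs on the call stack, as the README describes."""
    h = 0
    for fid in function_ids:
        h ^= fid
    return h


def ctx_index(cur_loc, prev_loc, function_ids):
    """map[current_location_ID ^ previous_location_ID >> 1 ^ hash_callstack_IDs]"""
    idx = cur_loc ^ (prev_loc >> 1) ^ callstack_hash(function_ids)
    return idx & (MAP_SIZE - 1)   # assumed: index is reduced to the map size


def caller_index(cur_loc, prev_loc, previous_callee_id):
    """map[current_location_ID ^ previous_location_ID >> 1 ^ previous_callee_ID]"""
    return (cur_loc ^ (prev_loc >> 1) ^ previous_callee_id) & (MAP_SIZE - 1)


# Hypothetical IDs: functions A, B, C and one edge inside C.
A, B, C = 0x1111, 0x2222, 0x3333
CUR, PREV = 0x0ABC, 0x0123

via_a = ctx_index(CUR, PREV, [A, C])             # C called from A
via_b = ctx_index(CUR, PREV, [B, C])             # C called from B
recursive = ctx_index(CUR, PREV, [A, C, C, C])   # C recursing twice more
```

Here `via_a` and `via_b` differ, which is the "coverage collected in C will be different" property. `recursive` equals `via_a` because repeated IDs cancel in pairs under XOR, so deeper recursion cannot keep producing new hash values.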
diff --git a/instrumentation/README.ngram.md b/instrumentation/README.ngram.md
deleted file mode 100644
index da61ef32..00000000
--- a/instrumentation/README.ngram.md
+++ /dev/null
@@ -1,28 +0,0 @@
-# AFL N-Gram Branch Coverage
-
-## Source
-
-This is an LLVM-based implementation of the n-gram branch coverage proposed in
-the paper ["Be Sensitive and Collaborative: Analzying Impact of Coverage Metrics
-in Greybox Fuzzing"](https://www.usenix.org/system/files/raid2019-wang-jinghan.pdf),
-by Jinghan Wang, et. al.
-
-Note that the original implementation (available
-[here](https://github.com/bitsecurerlab/afl-sensitive))
-is built on top of AFL's QEMU mode.
-This is essentially a port that uses LLVM vectorized instructions (available from
-llvm versions 4.0.1 and higher) to achieve the same results when compiling source code.
-
-In math the branch coverage is performed as follows:
-`map[current_location ^ prev_location[0] >> 1 ^ prev_location[1] >> 1 ^ ... up to n-1`] += 1`
-
-## Usage
-
-The size of `n` (i.e., the number of branches to remember) is an option
-that is specified either in the `AFL_LLVM_INSTRUMENT=NGRAM-{value}` or the
-`AFL_LLVM_NGRAM_SIZE` environment variable.
-Good values are 2, 4 or 8, valid are 2-16.
-
-It is highly recommended to increase the MAP_SIZE_POW2 definition in
-config.h to at least 18 and maybe up to 20 for this as otherwise too
-many map collisions occur.
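The N-Gram formula (section 7 in the merged README.llvm.md) can be illustrated the same way. This is a minimal sketch under stated assumptions: the helper names, the sample location values, and the masking to the map size are hypothetical, and the real implementation uses LLVM vector instructions rather than a Python list.

```python
def ngram_index(cur_loc, prev_locs, map_size):
    """map[current_location ^ prev_location[0] >> 1 ^ prev_location[1] >> 1 ^ ...]"""
    idx = cur_loc
    for prev in prev_locs:        # holds the n-1 most recent locations
        idx ^= prev >> 1
    return idx & (map_size - 1)   # assumed: index is reduced to the map size


def trace_indices(locations, n, map_size_pow2=18):
    """Return the map index hit by each location in an execution trace."""
    if not 2 <= n <= 16:          # the README lists 2-16 as valid sizes
        raise ValueError("n must be between 2 and 16")
    map_size = 1 << map_size_pow2
    history = [0] * (n - 1)       # most recent location first
    indices = []
    for cur in locations:
        indices.append(ngram_index(cur, history, map_size))
        history = [cur] + history[:-1]
    return indices


# Two hypothetical traces that differ only in their first location.
trace_a = [10, 20, 30, 40]
trace_b = [12, 20, 30, 40]

last_a_n2 = trace_indices(trace_a, 2)[-1]
last_b_n2 = trace_indices(trace_b, 2)[-1]
last_a_n4 = trace_indices(trace_a, 4)[-1]
last_b_n4 = trace_indices(trace_b, 4)[-1]
```

With `n=2` only the immediately preceding location is remembered, so both traces hit the same index for the last location. With `n=4` the differing first location is still in the history, so the indices diverge. That extra sensitivity is also why the map fills faster and a larger MAP_SIZE_POW2 is recommended.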