Edit instrumentation READMEs

author: llzmb <46303940+llzmb@users.noreply.github.com> 2021-11-23 21:03:56 +0100
committer: llzmb <46303940+llzmb@users.noreply.github.com> 2021-11-23 21:03:56 +0100
commit: 6cce577b907eb2ac58b0bc5ddacf373627b3480f
tree: 002ab2f79f37442826ad9d586fca2cda3c4b946f /instrumentation
parent: d9ff3745d01e30f3addbb51e391b8b5d456d07a4
6 files changed, 350 insertions, 321 deletions
diff --git a/instrumentation/README.cmplog.md b/instrumentation/README.cmplog.md
index a796c7a7..146b4620 100644
--- a/instrumentation/README.cmplog.md
+++ b/instrumentation/README.cmplog.md
@@ -1,11 +1,12 @@
 # CmpLog instrumentation
 
-The CmpLog instrumentation enables logging of comparison operands in a
-shared memory.
+The CmpLog instrumentation enables logging of comparison operands in a shared
+memory.
 
-These values can be used by various mutators built on top of it.
-At the moment we support the RedQueen mutator (input-2-state instructions only), 
-for details see [the RedQueen paper](https://www.syssec.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2018/12/17/NDSS19-Redqueen.pdf).
+These values can be used by various mutators built on top of it. At the moment,
+we support the RedQueen mutator (input-2-state instructions only), for details
+see
+[the RedQueen paper](https://www.syssec.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2018/12/17/NDSS19-Redqueen.pdf).
 
 ## Build
 
@@ -14,7 +15,8 @@ program.
 
 The first version is built using the regular AFL++ instrumentation.
 
-The second one, the CmpLog binary, is built with setting AFL_LLVM_CMPLOG during the compilation.
+The second one, the CmpLog binary, is built with setting AFL_LLVM_CMPLOG during
+the compilation.
 
 For example:
 
@@ -32,8 +34,8 @@ unset AFL_LLVM_CMPLOG
 
 ## Use
 
-AFL++ has the new `-c` option that needs to be used to specify the CmpLog binary (the second
-build).
+AFL++ has the new `-c` option that needs to be used to specify the CmpLog binary
+(the second build).
 
 For example:
 
@@ -41,4 +43,4 @@ For example:
 afl-fuzz -i input -o output -c ./program.cmplog -m none -- ./program.afl @@
 ```
 
-Be sure to use `-m none` because CmpLog can map a lot of pages.
+Be sure to use `-m none` because CmpLog can map a lot of pages.
+\ No newline at end of file
diff --git a/instrumentation/README.gcc_plugin.md b/instrumentation/README.gcc_plugin.md
index 230ceb73..33cf1c33 100644
--- a/instrumentation/README.gcc_plugin.md
+++ b/instrumentation/README.gcc_plugin.md
@@ -1,64 +1,68 @@
 # GCC-based instrumentation for afl-fuzz
 
-See [../README.md](../README.md) for the general instruction manual.
-See [README.llvm.md](README.llvm.md) for the LLVM-based instrumentation.
+For the general instruction manual, see [../README.md](../README.md).
+For the LLVM-based instrumentation, see [README.llvm.md](README.llvm.md).
 
 This document describes how to build and use `afl-gcc-fast` and `afl-g++-fast`,
 which instrument the target with the help of gcc plugins.
 
-TLDR:
-  * check the version of your gcc compiler: `gcc --version`
-  * `apt-get install gcc-VERSION-plugin-dev` or similar to install headers for gcc plugins
-  * `gcc` and `g++` must match the gcc-VERSION you installed headers for. You can set `AFL_CC`/`AFL_CXX`
-    to point to these!
-  * `make`
-  * just use `afl-gcc-fast`/`afl-g++-fast` normally like you would do with `afl-clang-fast`
+TL;DR:
+* Check the version of your gcc compiler: `gcc --version`
+* `apt-get install gcc-VERSION-plugin-dev` or similar to install headers for gcc
+  plugins.
+* `gcc` and `g++` must match the gcc-VERSION you installed headers for. You can
+  set `AFL_CC`/`AFL_CXX` to point to these!
+* `make`
+* Just use `afl-gcc-fast`/`afl-g++-fast` normally like you would do with
+  `afl-clang-fast`.
 
 ## 1) Introduction
 
-The code in this directory allows to instrument programs for AFL using
-true compiler-level instrumentation, instead of the more crude
-assembly-level rewriting approach taken by afl-gcc and afl-clang. This has
-several interesting properties:
+The code in this directory allows to instrument programs for AFL++ using true
+compiler-level instrumentation, instead of the more crude assembly-level
+rewriting approach taken by afl-gcc and afl-clang. This has several interesting
+properties:
 
-  - The compiler can make many optimizations that are hard to pull off when
-    manually inserting assembly. As a result, some slow, CPU-bound programs will
-    run up to around faster.
+- The compiler can make many optimizations that are hard to pull off when
+  manually inserting assembly. As a result, some slow, CPU-bound programs will
+  run up to around faster.
 
-    The gains are less pronounced for fast binaries, where the speed is limited
-    chiefly by the cost of creating new processes. In such cases, the gain will
-    probably stay within 10%.
+  The gains are less pronounced for fast binaries, where the speed is limited
+  chiefly by the cost of creating new processes. In such cases, the gain will
+  probably stay within 10%.
 
-  - The instrumentation is CPU-independent. At least in principle, you should
-    be able to rely on it to fuzz programs on non-x86 architectures (after
-    building `afl-fuzz` with `AFL_NOX86=1`).
+- The instrumentation is CPU-independent. At least in principle, you should be
+  able to rely on it to fuzz programs on non-x86 architectures (after building
+  `afl-fuzz` with `AFL_NOX86=1`).
 
-  - Because the feature relies on the internals of GCC, it is gcc-specific
-    and will *not* work with LLVM (see [README.llvm.md](README.llvm.md) for an alternative).
+- Because the feature relies on the internals of GCC, it is gcc-specific and
+  will *not* work with LLVM (see [README.llvm.md](README.llvm.md) for an
+  alternative).
 
 Once this implementation is shown to be sufficiently robust and portable, it
-will probably replace afl-gcc. For now, it can be built separately and
-co-exists with the original code.
+will probably replace afl-gcc. For now, it can be built separately and co-exists
+with the original code.
 
 The idea and much of the implementation comes from Laszlo Szekeres.
 
 ## 2) How to use
 
-In order to leverage this mechanism, you need to have modern enough GCC
-(>= version 4.5.0) and the plugin development headers installed on your system. That
+In order to leverage this mechanism, you need to have modern enough GCC (>=
+version 4.5.0) and the plugin development headers installed on your system. That
 should be all you need. On Debian machines, these headers can be acquired by
 installing the `gcc-VERSION-plugin-dev` packages.
 
 To build the instrumentation itself, type `make`. This will generate binaries
-called `afl-gcc-fast` and `afl-g++-fast` in the parent directory. 
+called `afl-gcc-fast` and `afl-g++-fast` in the parent directory.
 
-The gcc and g++ compiler links have to point to gcc-VERSION - or set these
-by pointing the environment variables `AFL_CC`/`AFL_CXX` to them.
-If the `CC`/`CXX` environment variables have been set, those compilers will be 
-preferred over those from the `AFL_CC`/`AFL_CXX` settings.
+The gcc and g++ compiler links have to point to gcc-VERSION - or set these by
+pointing the environment variables `AFL_CC`/`AFL_CXX` to them. If the `CC`/`CXX`
+environment variables have been set, those compilers will be preferred over
+those from the `AFL_CC`/`AFL_CXX` settings.
 
 Once this is done, you can instrument third-party code in a way similar to the
-standard operating mode of AFL, e.g.:
+standard operating mode of AFL++, e.g.:
+
 ```
   CC=/path/to/afl/afl-gcc-fast
   CXX=/path/to/afl/afl-g++-fast
@@ -66,15 +70,15 @@ standard operating mode of AFL, e.g.:
   ./configure [...options...]
   make
 ```
+
 Note: We also used `CXX` to set the C++ compiler to `afl-g++-fast` for C++ code.
 
 The tool honors roughly the same environmental variables as `afl-gcc` (see
-[env_variables.md](../docs/env_variables.md). This includes `AFL_INST_RATIO`,
-`AFL_USE_ASAN`, `AFL_HARDEN`, and `AFL_DONT_OPTIMIZE`.
+[docs/env_variables.md](../docs/env_variables.md). This includes
+`AFL_INST_RATIO`, `AFL_USE_ASAN`, `AFL_HARDEN`, and `AFL_DONT_OPTIMIZE`.
 
-Note: if you want the GCC plugin to be installed on your system for all
-users, you need to build it before issuing 'make install' in the parent
-directory.
+Note: if you want the GCC plugin to be installed on your system for all users,
+you need to build it before issuing 'make install' in the parent directory.
 
 ## 3) Gotchas, feedback, bugs
 
@@ -83,41 +87,40 @@ reports to afl@aflplus.plus.
 
 ## 4) Bonus feature #1: deferred initialization
 
-AFL tries to optimize performance by executing the targeted binary just once,
-stopping it just before main(), and then cloning this "main" process to get
-a steady supply of targets to fuzz.
+AFL++ tries to optimize performance by executing the targeted binary just once,
+stopping it just before `main()`, and then cloning this "main" process to get a
+steady supply of targets to fuzz.
 
-Although this approach eliminates much of the OS-, linker- and libc-level
-costs of executing the program, it does not always help with binaries that
-perform other time-consuming initialization steps - say, parsing a large config
-file before getting to the fuzzed data.
+Although this approach eliminates much of the OS-, linker- and libc-level costs
+of executing the program, it does not always help with binaries that perform
+other time-consuming initialization steps - say, parsing a large config file
+before getting to the fuzzed data.
 
 In such cases, it's beneficial to initialize the forkserver a bit later, once
 most of the initialization work is already done, but before the binary attempts
 to read the fuzzed input and parse it; in some cases, this can offer a 10x+
 performance gain. You can implement delayed initialization in GCC mode in a
-fairly simple way.
+fairly simple way:
 
-First, locate a suitable location in the code where the delayed cloning can
-take place. This needs to be done with *extreme* care to avoid breaking the
-binary. In particular, the program will probably malfunction if you select
-a location after:
+First, locate a suitable location in the code where the delayed cloning can take
+place. This needs to be done with *extreme* care to avoid breaking the binary.
+In particular, the program will probably malfunction if you select a location
+after:
 
-  - The creation of any vital threads or child processes - since the forkserver
-    can't clone them easily.
+- The creation of any vital threads or child processes - since the forkserver
+  can't clone them easily.
 
-  - The initialization of timers via setitimer() or equivalent calls.
+- The initialization of timers via `setitimer()` or equivalent calls.
 
-  - The creation of temporary files, network sockets, offset-sensitive file
-    descriptors, and similar shared-state resources - but only provided that
-    their state meaningfully influences the behavior of the program later on.
+- The creation of temporary files, network sockets, offset-sensitive file
+  descriptors, and similar shared-state resources - but only provided that their
+  state meaningfully influences the behavior of the program later on.
 
-  - Any access to the fuzzed input, including reading the metadata about its
-    size.
+- Any access to the fuzzed input, including reading the metadata about its size.
 
 With the location selected, add this code in the appropriate spot:
 
-```
+```c
 #ifdef __AFL_HAVE_MANUAL_CONTROL
   __AFL_INIT();
 #endif
@@ -131,14 +134,14 @@ Finally, recompile the program with afl-gcc-fast (afl-gcc or afl-clang will
 
 ## 5) Bonus feature #2: persistent mode
 
-Some libraries provide APIs that are stateless, or whose state can be reset in
+Some libraries provide APIs that are stateless or whose state can be reset in
 between processing different input files. When such a reset is performed, a
 single long-lived process can be reused to try out multiple test cases,
 eliminating the need for repeated `fork()` calls and the associated OS overhead.
 
 The basic structure of the program that does this would be:
 
-```
+```c
   while (__AFL_LOOP(1000)) {
 
     /* Read input data. */
@@ -147,22 +150,21 @@ The basic structure of the program that does this would be:
 
   }
 
-  /* Exit normally */
+  /* Exit normally. */
 ```
 
-The numerical value specified within the loop controls the maximum number
-of iterations before AFL will restart the process from scratch. This minimizes
+The numerical value specified within the loop controls the maximum number of
+iterations before AFL++ will restart the process from scratch. This minimizes
 the impact of memory leaks and similar glitches; 1000 is a good starting point.
 
-A more detailed template is shown in ../utils/persistent_mode/.
-Similarly to the previous mode, the feature works only with afl-gcc-fast or
-afl-clang-fast; #ifdef guards can be used to suppress it when using other
-compilers.
+A more detailed template is shown in ../utils/persistent_mode/. Similarly to the
+previous mode, the feature works only with afl-gcc-fast or afl-clang-fast;
+#ifdef guards can be used to suppress it when using other compilers.
 
-Note that as with the previous mode, the feature is easy to misuse; if you
-do not reset the critical state fully, you may end up with false positives or
-waste a whole lot of CPU power doing nothing useful at all. Be particularly
-wary of memory leaks and the state of file descriptors.
+Note that as with the previous mode, the feature is easy to misuse; if you do
+not reset the critical state fully, you may end up with false positives or waste
+a whole lot of CPU power doing nothing useful at all. Be particularly wary of
+memory leaks and the state of file descriptors.
 
 When running in this mode, the execution paths will inherently vary a bit
 depending on whether the input loop is being entered for the first time or
@@ -171,5 +173,5 @@ executed again. To avoid spurious warnings, the feature implies
 
 ## 6) Bonus feature #3: selective instrumentation
 
-It can be more effective to fuzzing to only instrument parts of the code.
-For details see [README.instrument_list.md](README.instrument_list.md).
+It can be more effective to fuzzing to only instrument parts of the code. For
+details, see [README.instrument_list.md](README.instrument_list.md).
+\ No newline at end of file
diff --git a/instrumentation/README.instrument_list.md b/instrumentation/README.instrument_list.md
index 7db9c055..b412b600 100644
--- a/instrumentation/README.instrument_list.md
+++ b/instrumentation/README.instrument_list.md
@@ -1,80 +1,84 @@
 # Using AFL++ with partial instrumentation
 
-  This file describes two different mechanisms to selectively instrument
-  only specific parts in the target.
+This file describes two different mechanisms to selectively instrument only
+specific parts in the target.
 
-  Both mechanisms work for LLVM and GCC_PLUGIN, but not for afl-clang/afl-gcc.
+Both mechanisms work for LLVM and GCC_PLUGIN, but not for afl-clang/afl-gcc.
 
 ## 1) Description and purpose
 
 When building and testing complex programs where only a part of the program is
-the fuzzing target, it often helps to only instrument the necessary parts of
-the program, leaving the rest uninstrumented. This helps to focus the fuzzer
-on the important parts of the program, avoiding undesired noise and
-disturbance by uninteresting code being exercised.
+the fuzzing target, it often helps to only instrument the necessary parts of the
+program, leaving the rest uninstrumented. This helps to focus the fuzzer on the
+important parts of the program, avoiding undesired noise and disturbance by
+uninteresting code being exercised.
 
 For this purpose, "partial instrumentation" support is provided by AFL++ that
 allows to specify what should be instrumented and what not.
 
-Both mechanisms can be used together.
+Both mechanisms for partial instrumentation can be used together.
 
 ## 2) Selective instrumentation with __AFL_COVERAGE_... directives
 
-In this mechanism the selective instrumentation is done in the source code.
+In this mechanism, the selective instrumentation is done in the source code.
 
-After the includes a special define has to be made, eg.:
+After the includes, a special define has to be made, e.g.:
 
 ```
 #include <stdio.h>
 #include <stdint.h>
 // ...
- 
+
 __AFL_COVERAGE();  // <- required for this feature to work
 ```
 
-If you want to disable the coverage at startup until you specify coverage
-should be started, then add `__AFL_COVERAGE_START_OFF();` at that position.
+If you want to disable the coverage at startup until you specify coverage should
+be started, then add `__AFL_COVERAGE_START_OFF();` at that position.
 
-From here on out you have the following macros available that you can use
-in any function where you want:
+From here on out, you have the following macros available that you can use in
+any function where you want:
 
-  * `__AFL_COVERAGE_ON();` - enable coverage from this point onwards
-  * `__AFL_COVERAGE_OFF();` - disable coverage from this point onwards
-  * `__AFL_COVERAGE_DISCARD();` - reset all coverage gathered until this point
-  * `__AFL_COVERAGE_SKIP();` - mark this test case as unimportant. Whatever happens, afl-fuzz will ignore it.
+* `__AFL_COVERAGE_ON();` - Enable coverage from this point onwards.
+* `__AFL_COVERAGE_OFF();` - Disable coverage from this point onwards.
+* `__AFL_COVERAGE_DISCARD();` - Reset all coverage gathered until this point.
+* `__AFL_COVERAGE_SKIP();` - Mark this test case as unimportant. Whatever
+  happens, afl-fuzz will ignore it.
 
-A special function is `__afl_coverage_interesting`.
-To use this, you must define `void __afl_coverage_interesting(u8 val, u32 id);`.
-Then you can use this function globally, where the `val` parameter can be set
-by you, the `id` parameter is for afl-fuzz and will be overwritten.
-Note that useful parameters for `val` are: 1, 2, 3, 4, 8, 16, 32, 64, 128.
-A value of e.g. 33 will be seen as 32 for coverage purposes.
+A special function is `__afl_coverage_interesting`. To use this, you must define
+`void __afl_coverage_interesting(u8 val, u32 id);`. Then you can use this
+function globally, where the `val` parameter can be set by you, the `id`
+parameter is for afl-fuzz and will be overwritten. Note that useful parameters
+for `val` are: 1, 2, 3, 4, 8, 16, 32, 64, 128. A value of, e.g., 33 will be seen
+as 32 for coverage purposes.
 
 ## 3) Selective instrumentation with AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST
 
-This feature is equivalent to llvm 12 sancov feature and allows to specify
-on a filename and/or function name level to instrument these or skip them.
+This feature is equivalent to llvm 12 sancov feature and allows to specify on a
+filename and/or function name level to instrument these or skip them.
 
 ### 3a) How to use the partial instrumentation mode
 
 In order to build with partial instrumentation, you need to build with
-afl-clang-fast/afl-clang-fast++ or afl-clang-lto/afl-clang-lto++.
-The only required change is that you need to set either the environment variable
-AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST set with a filename.
+afl-clang-fast/afl-clang-fast++ or afl-clang-lto/afl-clang-lto++. The only
+required change is that you need to set either the environment variable
+`AFL_LLVM_ALLOWLIST` or `AFL_LLVM_DENYLIST` set with a filename.
 
 That file should contain the file names or functions that are to be instrumented
-(AFL_LLVM_ALLOWLIST) or are specifically NOT to be instrumented (AFL_LLVM_DENYLIST).
+(`AFL_LLVM_ALLOWLIST`) or are specifically NOT to be instrumented
+(`AFL_LLVM_DENYLIST`).
+
+GCC_PLUGIN: you can use either `AFL_LLVM_ALLOWLIST` or `AFL_GCC_ALLOWLIST` (or
+the same for `_DENYLIST`), both work.
 
-GCC_PLUGIN: you can use either AFL_LLVM_ALLOWLIST or AFL_GCC_ALLOWLIST (or the
-same for _DENYLIST), both work.
+For matching to succeed, the function/file name that is being compiled must end
+in the function/file name entry contained in this instrument file list. That is
+to avoid breaking the match when absolute paths are used during compilation.
 
-For matching to succeed, the function/file name that is being compiled must end in the
-function/file name entry contained in this instrument file list. That is to avoid
-breaking the match when absolute paths are used during compilation.
+**NOTE:** In builds with optimization enabled, functions might be inlined and
+would not match!
 
-**NOTE:** In builds with optimization enabled, functions might be inlined and would not match!
+For example, if your source tree looks like this:
 
-For example if your source tree looks like this:
 ```
 project/
 project/feature_a/a1.cpp
@@ -83,36 +87,45 @@ project/feature_b/b1.cpp
 project/feature_b/b2.cpp
 ```
 
-and you only want to test feature_a, then create an "instrument file list" file containing:
+And you only want to test feature_a, then create an "instrument file list" file
+containing:
+
 ```
 feature_a/a1.cpp
 feature_a/a2.cpp
 ```
 
-However if the "instrument file list" file contains only this, it works as well:
+However, if the "instrument file list" file contains only this, it works as
+well:
+
 ```
 a1.cpp
 a2.cpp
 ```
-but it might lead to files being unwantedly instrumented if the same filename
+
+But it might lead to files being unwantedly instrumented if the same filename
 exists somewhere else in the project directories.
 
-You can also specify function names. Note that for C++ the function names
-must be mangled to match! `nm` can print these names.
+You can also specify function names. Note that for C++ the function names must
+be mangled to match! `nm` can print these names.
+
+AFL++ is able to identify whether an entry is a filename or a function. However,
+if you want to be sure (and compliant to the sancov allow/blocklist format), you
+can specify source file entries like this:
 
-AFL++ is able to identify whether an entry is a filename or a function.
-However if you want to be sure (and compliant to the sancov allow/blocklist
-format), you can specify source file entries like this:
 ```
 src: *malloc.c
 ```
-and function entries like this:
+
+And function entries like this:
+
 ```
 fun: MallocFoo
 ```
+
 Note that whitespace is ignored and comments (`# foo`) are supported.
 
 ### 3b) UNIX-style pattern matching
 
 You can add UNIX-style pattern matching in the "instrument file list" entries.
-See `man fnmatch` for the syntax. We do not set any of the `fnmatch` flags.
+See `man fnmatch` for the syntax. We do not set any of the `fnmatch` flags.
+\ No newline at end of file
diff --git a/instrumentation/README.laf-intel.md b/instrumentation/README.laf-intel.md
index 789055ed..3cde10c3 100644
--- a/instrumentation/README.laf-intel.md
+++ b/instrumentation/README.laf-intel.md
@@ -2,19 +2,17 @@
 
 ## Introduction
 
-This originally is the work of an individual nicknamed laf-intel.
-His blog [Circumventing Fuzzing Roadblocks with Compiler Transformations](https://lafintel.wordpress.com/)
-and gitlab repo [laf-llvm-pass](https://gitlab.com/laf-intel/laf-llvm-pass/)
-describe some code transformations that
-help AFL++ to enter conditional blocks, where conditions consist of
-comparisons of large values.
+This originally is the work of an individual nicknamed laf-intel. His blog
+[Circumventing Fuzzing Roadblocks with Compiler Transformations](https://lafintel.wordpress.com/)
+and GitLab repo [laf-llvm-pass](https://gitlab.com/laf-intel/laf-llvm-pass/)
+describe some code transformations that help AFL++ to enter conditional blocks,
+where conditions consist of comparisons of large values.
 
 ## Usage
 
-By default these passes will not run when you compile programs using 
-afl-clang-fast. Hence, you can use AFL as usual.
-To enable the passes you must set environment variables before you
-compile the target project.
+By default, these passes will not run when you compile programs using
+afl-clang-fast. Hence, you can use AFL++ as usual. To enable the passes, you
+must set environment variables before you compile the target project.
 
 The following options exist:
 
@@ -24,32 +22,30 @@ Enables the split-switches pass.
 
 `export AFL_LLVM_LAF_TRANSFORM_COMPARES=1`
 
-Enables the transform-compares pass (strcmp, memcmp, strncmp,
-strcasecmp, strncasecmp).
+Enables the transform-compares pass (strcmp, memcmp, strncmp, strcasecmp,
+strncasecmp).
 
 `export AFL_LLVM_LAF_SPLIT_COMPARES=1`
 
-Enables the split-compares pass.
-By default it will 
+Enables the split-compares pass. By default, it will
 1. simplify operators >= (and <=) into chains of > (<) and == comparisons
-2. change signed integer comparisons to a chain of sign-only comparison
-and unsigned integer comparisons
-3. split all unsigned integer comparisons with bit widths of
-64, 32 or 16 bits to chains of 8 bits comparisons.
-
-You can change the behaviour of the last step by setting
-`export AFL_LLVM_LAF_SPLIT_COMPARES_BITW=<bit_width>`, where 
-bit_width may be 64, 32 or 16. For example, a bit_width of 16
-would split larger comparisons down to 16 bit comparisons.
-
-A new experimental feature is splitting floating point comparisons into a
-series of sign, exponent and mantissa comparisons followed by splitting each
-of them into 8 bit comparisons when necessary.
-It is activated with the `AFL_LLVM_LAF_SPLIT_FLOATS` setting.
-Please note that full IEEE 754 functionality is not preserved, that is
-values of nan and infinity will probably behave differently.
-
-Note that setting this automatically activates `AFL_LLVM_LAF_SPLIT_COMPARES`
-
-You can also set `AFL_LLVM_LAF_ALL` and have all of the above enabled :-)
-
+2. change signed integer comparisons to a chain of sign-only comparison and
+   unsigned integer comparisons
+3. split all unsigned integer comparisons with bit widths of 64, 32, or 16 bits
+   to chains of 8 bits comparisons.
+
+You can change the behavior of the last step by setting `export
+AFL_LLVM_LAF_SPLIT_COMPARES_BITW=<bit_width>`, where bit_width may be 64, 32, or
+16. For example, a bit_width of 16 would split larger comparisons down to 16 bit
+comparisons.
+
+A new experimental feature is splitting floating point comparisons into a series
+of sign, exponent and mantissa comparisons followed by splitting each of them
+into 8 bit comparisons when necessary. It is activated with the
+`AFL_LLVM_LAF_SPLIT_FLOATS` setting. Please note that full IEEE 754
+functionality is not preserved, that is values of nan and infinity will probably
+behave differently.
+
+Note that setting this automatically activates `AFL_LLVM_LAF_SPLIT_COMPARES`.
+
+You can also set `AFL_LLVM_LAF_ALL` and have all of the above enabled. :-)
+\ No newline at end of file
diff --git a/instrumentation/README.lto.md b/instrumentation/README.lto.md
index 6174cdc0..a74425dc 100644
--- a/instrumentation/README.lto.md
+++ b/instrumentation/README.lto.md
@@ -1,55 +1,56 @@
 # afl-clang-lto - collision free instrumentation at link time
 
-## TLDR;
+## TL;DR:
 
-This version requires a current llvm 11+ compiled from the github master.
+This version requires a current llvm 11+ compiled from the GitHub master.
 
 1. Use afl-clang-lto/afl-clang-lto++ because it is faster and gives better
-   coverage than anything else that is out there in the AFL world
+   coverage than anything else that is out there in the AFL world.
 
-2. You can use it together with llvm_mode: laf-intel and the instrument file listing
-   features and can be combined with cmplog/Redqueen
+2. You can use it together with llvm_mode: laf-intel and the instrument file
+   listing features and can be combined with cmplog/Redqueen.
 
-3. It only works with llvm 11+
+3. It only works with llvm 11+.
 
-4. AUTODICTIONARY feature! see below
+4. AUTODICTIONARY feature (see below)!
 
-5. If any problems arise be sure to set `AR=llvm-ar RANLIB=llvm-ranlib`.
-   Some targets might need `LD=afl-clang-lto` and others `LD=afl-ld-lto`.
+5. If any problems arise, be sure to set `AR=llvm-ar RANLIB=llvm-ranlib`. Some
+   targets might need `LD=afl-clang-lto` and others `LD=afl-ld-lto`.
 
 ## Introduction and problem description
 
-A big issue with how AFL/AFL++ works is that the basic block IDs that are
-set during compilation are random - and hence naturally the larger the number
-of instrumented locations, the higher the number of edge collisions are in the
-map. This can result in not discovering new paths and therefore degrade the
+A big issue with how AFL++ works is that the basic block IDs that are set during
+compilation are random - and hence naturally the larger the number of
+instrumented locations, the higher the number of edge collisions are in the map.
+This can result in not discovering new paths and therefore degrade the
 efficiency of the fuzzing process.
 
-*This issue is underestimated in the fuzzing community!*
-With a 2^16 = 64kb standard map at already 256 instrumented blocks there is
-on average one collision. On average a target has 10.000 to 50.000
-instrumented blocks hence the real collisions are between 750-18.000!
+*This issue is underestimated in the fuzzing community!* With a 2^16 = 64kb
+standard map at already 256 instrumented blocks, there is on average one
+collision. On average, a target has 10.000 to 50.000 instrumented blocks, hence
+the real collisions are between 750-18.000!
 
-To reach a solution that prevents any collisions took several approaches
-and many dead ends until we got to this:
+To reach a solution that prevents any collisions took several approaches and
+many dead ends until we got to this:
 
- * We instrument at link time when we have all files pre-compiled
- * To instrument at link time we compile in LTO (link time optimization) mode
- * Our compiler (afl-clang-lto/afl-clang-lto++) takes care of setting the
-   correct LTO options and runs our own afl-ld linker instead of the system
-   linker
- * The LLVM linker collects all LTO files to link and instruments them so that
-   we have non-colliding edge overage
- * We use a new (for afl) edge coverage - which is the same as in llvm
-   -fsanitize=coverage edge coverage mode :)
+* We instrument at link time when we have all files pre-compiled.
+* To instrument at link time, we compile in LTO (link time optimization) mode.
+* Our compiler (afl-clang-lto/afl-clang-lto++) takes care of setting the correct
+  LTO options and runs our own afl-ld linker instead of the system linker.
+* The LLVM linker collects all LTO files to link and instruments them so that we
+  have non-colliding edge overage.
+* We use a new (for afl) edge coverage - which is the same as in llvm
+  -fsanitize=coverage edge coverage mode. :)
 
 The result:
- * 10-25% speed gain compared to llvm_mode
- * guaranteed non-colliding edge coverage :-)
- * The compile time especially for binaries to an instrumented library can be
-   much longer
+
+* 10-25% speed gain compared to llvm_mode
+* guaranteed non-colliding edge coverage :-)
+* The compile time, especially for binaries to an instrumented library, can be
+  much longer.
 
 Example build output from a libtiff build:
+
 ```
 libtool: link: afl-clang-lto -g -O2 -Wall -W -o thumbnail thumbnail.o  ../libtiff/.libs/libtiff.a ../port/.libs/libport.a -llzma -ljbig -ljpeg -lz -lm
 afl-clang-lto++2.63d by Marc "vanHauser" Heuse <mh@mh-sec.de> in mode LTO
@@ -62,21 +63,24 @@ AUTODICTIONARY: 11 strings found
 
 ### Installing llvm version 11 or 12
 
-llvm 11 or even 12 should be available in all current Linux repositories.
-If you use an outdated Linux distribution read the next section.
+llvm 11 or even 12 should be available in all current Linux repositories. If you
+use an outdated Linux distribution, read the next section.
 
 ### Installing llvm from the llvm repository (version 12+)
 
 Installing the llvm snapshot builds is easy and mostly painless:
 
-In the follow line change `NAME` for your Debian or Ubuntu release name
+In the following line, replace `NAME` with your Debian or Ubuntu release name
 (e.g. buster, focal, eon, etc.):
+
 ```
 echo deb http://apt.llvm.org/NAME/ llvm-toolchain-NAME NAME >> /etc/apt/sources.list
 ```
-then add the pgp key of llvm and install the packages:
+
+Then add the pgp key of llvm and install the packages:
+
 ```
-wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add - 
+wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
 apt-get update && apt-get upgrade -y
 apt-get install -y clang-12 clang-tools-12 libc++1-12 libc++-12-dev \
     libc++abi1-12 libc++abi-12-dev libclang1-12 libclang-12-dev \
@@ -87,7 +91,8 @@ apt-get install -y clang-12 clang-tools-12 libc++1-12 libc++-12-dev \
 
 ### Building llvm yourself (version 12+)
 
-Building llvm from github takes quite some long time and is not painless:
+Building llvm from GitHub takes quite some time and is not painless:
+
 ```sh
 sudo apt install binutils-dev  # this is *essential*!
 git clone --depth=1 https://github.com/llvm/llvm-project
@@ -126,10 +131,12 @@ sudo make install
 
 Just use afl-clang-lto like you did with afl-clang-fast or afl-gcc.
 
-Also the instrument file listing (AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST -> [README.instrument_list.md](README.instrument_list.md)) and
-laf-intel/compcov (AFL_LLVM_LAF_* -> [README.laf-intel.md](README.laf-intel.md)) work.
+Also, the instrument file listing (AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST ->
+[README.instrument_list.md](README.instrument_list.md)) and laf-intel/compcov
+(AFL_LLVM_LAF_* -> [README.laf-intel.md](README.laf-intel.md)) work.
 
 Example:
+
 ```
 CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar ./configure
 make
@@ -143,51 +150,48 @@ NOTE: some targets also need to set the linker, try both `afl-clang-lto` and
 Note: this is highly discouraged! Try to compile to static libraries with
 afl-clang-lto instead of shared libraries!
 
-To make instrumented shared libraries work with afl-clang-lto you have to do
+To make instrumented shared libraries work with afl-clang-lto, you have to do
 quite some extra steps.
 
-Every shared library you want to instrument has to be individually compiled.
-The environment variable `AFL_LLVM_LTO_DONTWRITEID=1` has to be set during
-compilation.
-Additionally the environment variable `AFL_LLVM_LTO_STARTID` has to be set to
-the added edge count values of all previous compiled instrumented shared
-libraries for that target.
-E.g. for the first shared library this would be `AFL_LLVM_LTO_STARTID=0` and
-afl-clang-lto will then report how many edges have been instrumented (let's say
-it reported 1000 instrumented edges).
-The second shared library then has to be set to that value
+Every shared library you want to instrument has to be individually compiled. The
+environment variable `AFL_LLVM_LTO_DONTWRITEID=1` has to be set during
+compilation. Additionally, the environment variable `AFL_LLVM_LTO_STARTID` has
+to be set to the sum of the edge counts of all previously compiled instrumented
+shared libraries for that target. E.g., for the first shared library this would
+be `AFL_LLVM_LTO_STARTID=0` and afl-clang-lto will then report how many edges
+have been instrumented (let's say it reported 1000 instrumented edges). The
+second shared library then has to be set to that value
 (`AFL_LLVM_LTO_STARTID=1000` in our example), for the third to all previous
 counts added, etc.
 
-The final program compilation step then may *not* have `AFL_LLVM_LTO_DONTWRITEID`
-set, and `AFL_LLVM_LTO_STARTID` must be set to all edge counts added of all shared
-libraries it will be linked to.
+The final program compilation step then may *not* have
+`AFL_LLVM_LTO_DONTWRITEID` set, and `AFL_LLVM_LTO_STARTID` must be set to the
+sum of the edge counts of all shared libraries it will be linked to.
 
-This is quite some hands-on work, so better stay away from instrumenting
-shared libraries :-)
+This is quite some hands-on work, so better stay away from instrumenting shared
+libraries. :-)
 
 ## AUTODICTIONARY feature
 
 While compiling, a dictionary based on string comparisons is automatically
-generated and put into the target binary. This dictionary is transfered to afl-fuzz
-on start. This improves coverage statistically by 5-10% :)
+generated and put into the target binary. This dictionary is transferred to
+afl-fuzz on start. This improves coverage statistically by 5-10%. :)
 
-Note that if for any reason you do not want to use the autodictionary feature
+Note that if for any reason you do not want to use the autodictionary feature,
 then just set the environment variable `AFL_NO_AUTODICT` when starting afl-fuzz.
 
 ## Fixed memory map
 
 To speed up fuzzing a little bit more, it is possible to set a fixed shared
-memory map.
-Recommended is the value 0x10000.
+memory map. Recommended is the value 0x10000.
 
-In most cases this will work without any problems. However if a target uses
-early constructors, ifuncs or a deferred forkserver this can crash the target.
+In most cases, this will work without any problems. However, if a target uses
+early constructors, ifuncs, or a deferred forkserver, this can crash the target.
 
-Also on unusual operating systems/processors/kernels or weird libraries the
+Also, on unusual operating systems/processors/kernels or weird libraries the
 recommended 0x10000 address might not work, so then change the fixed address.
 
-To enable this feature set AFL_LLVM_MAP_ADDR with the address.
+To enable this feature, set `AFL_LLVM_MAP_ADDR` to the address.
 
 ## Document edge IDs
 
@@ -206,143 +210,155 @@ these.
 An example of a hard to solve target is ffmpeg. Here is how to successfully
 instrument it:
 
-1. Get and extract the current ffmpeg and change to its directory
+1. Get and extract the current ffmpeg and change to its directory.
 
 2. Running configure with --cc=clang fails and various other items will fail
    when compiling, so we have to trick configure:
 
-```
-./configure --enable-lto --disable-shared --disable-inline-asm
-```
-
-3. Now the configuration is done - and we edit the settings in `./ffbuild/config.mak`
-   (-: the original line, +: what to change it into):
-```
--CC=gcc
-+CC=afl-clang-lto
--CXX=g++
-+CXX=afl-clang-lto++
--AS=gcc
-+AS=llvm-as
--LD=gcc
-+LD=afl-clang-lto++
--DEPCC=gcc
-+DEPCC=afl-clang-lto
--DEPAS=gcc
-+DEPAS=afl-clang-lto++
--AR=ar
-+AR=llvm-ar
--AR_CMD=ar
-+AR_CMD=llvm-ar
--NM_CMD=nm -g
-+NM_CMD=llvm-nm -g
--RANLIB=ranlib -D
-+RANLIB=llvm-ranlib -D
-```
-
-4. Then type make, wait for a long time and you are done :)
+    ```
+    ./configure --enable-lto --disable-shared --disable-inline-asm
+    ```
+
+3. Now the configuration is done - and we edit the settings in
+   `./ffbuild/config.mak` (-: the original line, +: what to change it into):
+
+    ```
+    -CC=gcc
+    +CC=afl-clang-lto
+    -CXX=g++
+    +CXX=afl-clang-lto++
+    -AS=gcc
+    +AS=llvm-as
+    -LD=gcc
+    +LD=afl-clang-lto++
+    -DEPCC=gcc
+    +DEPCC=afl-clang-lto
+    -DEPAS=gcc
+    +DEPAS=afl-clang-lto++
+    -AR=ar
+    +AR=llvm-ar
+    -AR_CMD=ar
+    +AR_CMD=llvm-ar
+    -NM_CMD=nm -g
+    +NM_CMD=llvm-nm -g
+    -RANLIB=ranlib -D
+    +RANLIB=llvm-ranlib -D
+    ```
+
+4. Then type make, wait for a long time, and you are done. :)
 
 ### Example: WebKit jsc
 
 Building jsc is difficult as the build script has bugs.
 
-1. checkout Webkit: 
-```
-svn checkout https://svn.webkit.org/repository/webkit/trunk WebKit
-cd WebKit
-```
+1. Check out WebKit:
+
+    ```
+    svn checkout https://svn.webkit.org/repository/webkit/trunk WebKit
+    cd WebKit
+    ```
 
 2. Fix the build environment:
-```
-mkdir -p WebKitBuild/Release
-cd WebKitBuild/Release
-ln -s ../../../../../usr/bin/llvm-ar-12 llvm-ar-12
-ln -s ../../../../../usr/bin/llvm-ranlib-12 llvm-ranlib-12
-cd ../..
-```
 
-3. Build :)
+    ```
+    mkdir -p WebKitBuild/Release
+    cd WebKitBuild/Release
+    ln -s ../../../../../usr/bin/llvm-ar-12 llvm-ar-12
+    ln -s ../../../../../usr/bin/llvm-ranlib-12 llvm-ranlib-12
+    cd ../..
+    ```
 
-```
-Tools/Scripts/build-jsc --jsc-only --cli --cmakeargs="-DCMAKE_AR='llvm-ar-12' -DCMAKE_RANLIB='llvm-ranlib-12' -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_CC_FLAGS='-O3 -lrt' -DCMAKE_CXX_FLAGS='-O3 -lrt' -DIMPORTED_LOCATION='/lib/x86_64-linux-gnu/' -DCMAKE_CC=afl-clang-lto -DCMAKE_CXX=afl-clang-lto++ -DENABLE_STATIC_JSC=ON"
-```
+3. Build. :)
+
+    ```
+    Tools/Scripts/build-jsc --jsc-only --cli --cmakeargs="-DCMAKE_AR='llvm-ar-12' -DCMAKE_RANLIB='llvm-ranlib-12' -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_CC_FLAGS='-O3 -lrt' -DCMAKE_CXX_FLAGS='-O3 -lrt' -DIMPORTED_LOCATION='/lib/x86_64-linux-gnu/' -DCMAKE_CC=afl-clang-lto -DCMAKE_CXX=afl-clang-lto++ -DENABLE_STATIC_JSC=ON"
+    ```
 
 ## Potential issues
 
-### compiling libraries fails
+### Compiling libraries fails
 
 If you see this message:
+
 ```
 /bin/ld: libfoo.a: error adding symbols: archive has no index; run ranlib to add one
 ```
-This is because usually gnu gcc ranlib is being called which cannot deal with clang LTO files.
-The solution is simple: when you ./configure you also have to set RANLIB=llvm-ranlib and AR=llvm-ar
+
+This usually happens because GNU ranlib is called, which cannot deal with clang
+LTO files. The solution is simple: when you `./configure`, you also have to set
+`RANLIB=llvm-ranlib` and `AR=llvm-ar`.
 
 Solution:
+
 ```
 AR=llvm-ar RANLIB=llvm-ranlib CC=afl-clang-lto CXX=afl-clang-lto++ ./configure --disable-shared
 ```
-and on some targets you have to set AR=/RANLIB= even for make as the configure script does not save it.
-Other targets ignore environment variables and need the parameters set via
-`./configure --cc=... --cxx= --ranlib= ...` etc. (I am looking at you ffmpeg!).
 
+And on some targets you have to set `AR=/RANLIB=` even for `make` as the
+configure script does not save it. Other targets ignore environment variables
+and need the parameters set via `./configure --cc=... --cxx= --ranlib= ...` etc.
+(I am looking at you ffmpeg!)
+
+If you see this message:
 
-If you see this message
 ```
 assembler command failed ...
 ```
-then try setting `llvm-as` for configure:
+
+Then try setting `llvm-as` for configure:
+
 ```
 AS=llvm-as  ...
 ```
 
-### compiling programs still fail
+### Compiling programs still fails
 
 afl-clang-lto is still work in progress.
 
 Known issues:
-  * Anything that llvm 11+ cannot compile, afl-clang-lto cannot compile either - obviously
-  * Anything that does not compile with LTO, afl-clang-lto cannot compile either - obviously
+* Anything that llvm 11+ cannot compile, afl-clang-lto cannot compile either -
+  obviously.
+* Anything that does not compile with LTO, afl-clang-lto cannot compile either -
+  obviously.
 
-Hence if building a target with afl-clang-lto fails try to build it with llvm12
-and LTO enabled (`CC=clang-12` `CXX=clang++-12` `CFLAGS=-flto=full` and
-`CXXFLAGS=-flto=full`).
+Hence, if building a target with afl-clang-lto fails, try to build it with
+llvm12 and LTO enabled (`CC=clang-12`, `CXX=clang++-12`, `CFLAGS=-flto=full`,
+and `CXXFLAGS=-flto=full`).
 
-If this succeeeds then there is an issue with afl-clang-lto. Please report at
-[https://github.com/AFLplusplus/AFLplusplus/issues/226](https://github.com/AFLplusplus/AFLplusplus/issues/226)
+If this succeeds, then there is an issue with afl-clang-lto. Please report at
+[https://github.com/AFLplusplus/AFLplusplus/issues/226](https://github.com/AFLplusplus/AFLplusplus/issues/226).
 
 Even some targets where clang-12 fails can be build if the fail is just in
 `./configure`, see `Solving difficult targets` above.
 
 ## History
 
-This was originally envisioned by hexcoder- in Summer 2019, however we saw no
-way to create a pass that is run at link time - although there is a option
-for this in the PassManager: EP_FullLinkTimeOptimizationLast
-("Fun" info - nobody knows what this is doing. And the developer who
-implemented this didn't respond to emails.)
-
-In December then came the idea to implement this as a pass that is run via
-the llvm "opt" program, which is performed via an own linker that afterwards
-calls the real linker.
-This was first implemented in January and work ... kinda.
-The LTO time instrumentation worked, however "how" the basic blocks were
-instrumented was a problem, as reducing duplicates turned out to be very,
-very difficult with a program that has so many paths and therefore so many
-dependencies. A lot of strategies were implemented - and failed.
-And then sat solvers were tried, but with over 10.000 variables that turned
-out to be a dead-end too.
+This was originally envisioned by hexcoder- in Summer 2019. However, we saw no
+way to create a pass that is run at link time - although there is an option for
+this in the PassManager: EP_FullLinkTimeOptimizationLast. ("Fun" info - nobody
+knows what this is doing. And the developer who implemented this didn't respond
+to emails.)
+
+In December then came the idea to implement this as a pass that is run via the
+llvm "opt" program, which is performed via our own linker that afterwards calls
+the real linker. This was first implemented in January and worked ... kinda. The
+LTO time instrumentation worked, however, "how" the basic blocks were
+instrumented was a problem, as reducing duplicates turned out to be very, very
+difficult with a program that has so many paths and therefore so many
+dependencies. A lot of strategies were implemented - and failed. And then SAT
+solvers were tried, but with over 10.000 variables that turned out to be a
+dead-end too.
 
 The final idea to solve this came from domenukk who proposed to insert a block
-into an edge and then just use incremental counters ... and this worked!
-After some trials and errors to implement this vanhauser-thc found out that
-there is actually an llvm function for this: SplitEdge() :-)
+into an edge and then just use incremental counters ... and this worked! After
+some trial and error implementing this, vanhauser-thc found out that there is
+actually an llvm function for this: SplitEdge() :-)
 
-Still more problems came up though as this only works without bugs from
-llvm 9 onwards, and with high optimization the link optimization ruins
-the instrumented control flow graph.
+Still more problems came up though as this only works without bugs from llvm 9
+onwards, and with high optimization the link optimization ruins the instrumented
+control flow graph.
 
-This is all now fixed with llvm 11+. The llvm's own linker is now able to
-load passes and this bypasses all problems we had.
+This is all now fixed with llvm 11+. The llvm's own linker is now able to load
+passes and this bypasses all problems we had.
 
-Happy end :)
+Happy end :)
+\ No newline at end of file
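Editor's note on the shared-library procedure in the README.lto.md hunk above: the `AFL_LLVM_LTO_STARTID` bookkeeping is a running sum, which the following minimal sketch illustrates. The function name `lto_start_ids` and the edge counts 1000 and 250 are hypothetical and only serve the illustration; in a real build, each count would be read from what afl-clang-lto reports after compiling the previous library. The sketch only prints the environment settings for each step and does not invoke any compiler.

```sh
#!/bin/sh
# Sketch: derive AFL_LLVM_LTO_STARTID for each build step from edge counts.
# lto_start_ids is a hypothetical helper, not part of AFL++.
# Arguments: the edge count reported for each shared library, in build order.

lto_start_ids() {
  start_id=0
  for edge_count in "$@"; do
    # Each shared library starts where all previous libraries ended.
    echo "AFL_LLVM_LTO_DONTWRITEID=1 AFL_LLVM_LTO_STARTID=$start_id"
    start_id=$((start_id + edge_count))
  done
  # The final program build must not set AFL_LLVM_LTO_DONTWRITEID.
  echo "AFL_LLVM_LTO_STARTID=$start_id"
}

# Hypothetical example: two libraries with 1000 and 250 instrumented edges.
lto_start_ids 1000 250
```

Each printed line is the environment for one compilation step: one line per shared library, followed by one line for the final program.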
diff --git a/instrumentation/README.persistent_mode.md b/instrumentation/README.persistent_mode.md
index e9d2a523..d0ccba8c 100644
--- a/instrumentation/README.persistent_mode.md
+++ b/instrumentation/README.persistent_mode.md
@@ -132,7 +132,7 @@ and you should be all set!
 Some libraries provide APIs that are stateless, or whose state can be reset in
 between processing different input files. When such a reset is performed, a
 single long-lived process can be reused to try out multiple test cases,
-eliminating the need for repeated fork() calls and the associated OS overhead.
+eliminating the need for repeated `fork()` calls and the associated OS overhead.
 
 The basic structure of the program that does this would be: