From c31f4646cbd00f591dad3258c08ff8e56aa94420 Mon Sep 17 00:00:00 2001 From: llzmb <46303940+llzmb@users.noreply.github.com> Date: Sun, 21 Nov 2021 21:11:52 +0100 Subject: Clean up docs folder --- docs/env_variables.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) (limited to 'docs/env_variables.md') diff --git a/docs/env_variables.md b/docs/env_variables.md index 65cca0dc..34318cd4 100644 --- a/docs/env_variables.md +++ b/docs/env_variables.md @@ -306,8 +306,9 @@ checks or alter some of the more exotic semantics of the tool: exit soon after the first crash is found. - `AFL_CMPLOG_ONLY_NEW` will only perform the expensive cmplog feature for - newly found testcases and not for testcases that are loaded on startup (`-i - in`). This is an important feature to set when resuming a fuzzing session. + newly found test cases and not for test cases that are loaded on startup + (`-i in`). This is an important feature to set when resuming a fuzzing + session. - Setting `AFL_CRASH_EXITCODE` sets the exit code AFL treats as crash. For example, if `AFL_CRASH_EXITCODE='-1'` is set, each input resulting in a `-1` @@ -447,8 +448,8 @@ checks or alter some of the more exotic semantics of the tool: - If you are using persistent mode (you should, see [instrumentation/README.persistent_mode.md](../instrumentation/README.persistent_mode.md)), - some targets keep inherent state due which a detected crash testcase does - not crash the target again when the testcase is given. To be able to still + some targets keep inherent state due which a detected crash test case does + not crash the target again when the test case is given. To be able to still re-trigger these crashes, you can use the `AFL_PERSISTENT_RECORD` variable with a value of how many previous fuzz cases to keep prio a crash. If set to e.g. 
10, then the 9 previous inputs are written to out/default/crashes as -- cgit 1.4.1 From 10365a22bdd5b87711a859816a8a550a6481b038 Mon Sep 17 00:00:00 2001 From: llzmb <46303940+llzmb@users.noreply.github.com> Date: Mon, 22 Nov 2021 22:08:49 +0100 Subject: Merge ctx and ngram into llvm, fix references --- docs/env_variables.md | 7 +- docs/fuzzing_expert.md | 12 +-- instrumentation/README.ctx.md | 38 ------- instrumentation/README.llvm.md | 229 +++++++++++++++++++++++++--------------- instrumentation/README.ngram.md | 28 ----- 5 files changed, 152 insertions(+), 162 deletions(-) delete mode 100644 instrumentation/README.ctx.md delete mode 100644 instrumentation/README.ngram.md (limited to 'docs/env_variables.md') diff --git a/docs/env_variables.md b/docs/env_variables.md index 65cca0dc..4386c5f8 100644 --- a/docs/env_variables.md +++ b/docs/env_variables.md @@ -171,7 +171,7 @@ config.h to at least 18 and maybe up to 20 for this as otherwise too many map collisions occur. For more information, see -[instrumentation/README.ctx.md](../instrumentation/README.ctx.md). +[instrumentation/README.llvm.md#6) AFL Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage). #### INSTRUMENT LIST (selectively instrument files and functions) @@ -247,7 +247,7 @@ in config.h to at least 18 and maybe up to 20 for this as otherwise too many map collisions occur. For more information, see -[instrumentation/README.ngram.md](../instrumentation/README.ngram.md). +[instrumentation/README.llvm.md#7) AFL N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage). #### NOT_ZERO @@ -261,9 +261,6 @@ For more information, see If the target performs only a few loops, then this will give a small performance boost. -For more information, see -[instrumentation/README.neverzero.md](../instrumentation/README.neverzero.md). 
- #### Thread safe instrumentation counters (in all modes) Setting `AFL_LLVM_THREADSAFE_INST` will inject code that implements thread safe diff --git a/docs/fuzzing_expert.md b/docs/fuzzing_expert.md index 876c5fbb..5945d114 100644 --- a/docs/fuzzing_expert.md +++ b/docs/fuzzing_expert.md @@ -112,12 +112,8 @@ are interested in: There are many more options and modes available however these are most of the time less effective. See: - * [instrumentation/README.ctx.md](../instrumentation/README.ctx.md) - * [instrumentation/README.ngram.md](../instrumentation/README.ngram.md) - -AFL++ performs "never zero" counting in its bitmap. You can read more about this -here: - * [instrumentation/README.neverzero.md](../instrumentation/README.neverzero.md) + * [instrumentation/README.llvm.md#6) AFL Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage). + * [instrumentation/README.llvm.md#7) AFL N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage) #### c) Sanitizers @@ -247,7 +243,7 @@ For meson you have to set the AFL++ compiler with the very first command! Sometimes cmake and configure do not pick up the AFL++ compiler, or the ranlib/ar that is needed - because this was just not foreseen by the developer -of the target. Or they have non-standard options. Figure out if there is a +of the target. Or they have non-standard options. Figure out if there is a non-standard way to set this, otherwise set up the build normally and edit the generated build environment afterwards manually to point it to the right compiler (and/or ranlib and ar). @@ -337,7 +333,7 @@ Note that this step is rather optional though. #### Done! -The INPUTS_UNIQUE/ directory from step b) - or even better the directory input/ +The INPUTS_UNIQUE/ directory from step b) - or even better the directory input/ if you minimized the corpus in step c) - is the resulting input corpus directory to be used in fuzzing! 
:-) diff --git a/instrumentation/README.ctx.md b/instrumentation/README.ctx.md deleted file mode 100644 index 335e9921..00000000 --- a/instrumentation/README.ctx.md +++ /dev/null @@ -1,38 +0,0 @@ -# AFL Context Sensitive Branch Coverage - -## What is this? - -This is an LLVM-based implementation of the context sensitive branch coverage. - -Basically every function gets its own ID and, every time when an edge is logged, -all the IDs in the callstack are hashed and combined with the edge transition -hash to augment the classic edge coverage with the information about the -calling context. - -So if both function A and function B call a function C, the coverage -collected in C will be different. - -In math the coverage is collected as follows: -`map[current_location_ID ^ previous_location_ID >> 1 ^ hash_callstack_IDs] += 1` - -The callstack hash is produced XOR-ing the function IDs to avoid explosion with -recursive functions. - -## Usage - -Set the `AFL_LLVM_INSTRUMENT=CTX` or `AFL_LLVM_CTX=1` environment variable. - -It is highly recommended to increase the MAP_SIZE_POW2 definition in -config.h to at least 18 and maybe up to 20 for this as otherwise too -many map collisions occur. - -## Caller Branch Coverage - -If the context sensitive coverage introduces too may collisions and becoming -detrimental, the user can choose to augment edge coverage with just the -called function ID, instead of the entire callstack hash. - -In math the coverage is collected as follows: -`map[current_location_ID ^ previous_location_ID >> 1 ^ previous_callee_ID] += 1` - -Set the `AFL_LLVM_INSTRUMENT=CALLER` or `AFL_LLVM_CALLER=1` environment variable. diff --git a/instrumentation/README.llvm.md b/instrumentation/README.llvm.md index dbb604f2..1671f385 100644 --- a/instrumentation/README.llvm.md +++ b/instrumentation/README.llvm.md @@ -1,72 +1,79 @@ # Fast LLVM-based instrumentation for afl-fuzz - (See [../README.md](../README.md) for the general instruction manual.) 
+For the general instruction manual, see [../README.md](../README.md). - (See [README.gcc_plugin.md](README.gcc_plugin.md) for the GCC-based instrumentation.) +For the GCC-based instrumentation, see +[README.gcc_plugin.md](README.gcc_plugin.md). ## 1) Introduction ! llvm_mode works with llvm versions 3.8 up to 13 ! -The code in this directory allows you to instrument programs for AFL using -true compiler-level instrumentation, instead of the more crude -assembly-level rewriting approach taken by afl-gcc and afl-clang. This has -several interesting properties: +The code in this directory allows you to instrument programs for AFL using true +compiler-level instrumentation, instead of the more crude assembly-level +rewriting approach taken by afl-gcc and afl-clang. This has several interesting +properties: - - The compiler can make many optimizations that are hard to pull off when - manually inserting assembly. As a result, some slow, CPU-bound programs will - run up to around 2x faster. +- The compiler can make many optimizations that are hard to pull off when + manually inserting assembly. As a result, some slow, CPU-bound programs will + run up to around 2x faster. - The gains are less pronounced for fast binaries, where the speed is limited - chiefly by the cost of creating new processes. In such cases, the gain will - probably stay within 10%. + The gains are less pronounced for fast binaries, where the speed is limited + chiefly by the cost of creating new processes. In such cases, the gain will + probably stay within 10%. - - The instrumentation is CPU-independent. At least in principle, you should - be able to rely on it to fuzz programs on non-x86 architectures (after - building afl-fuzz with AFL_NO_X86=1). +- The instrumentation is CPU-independent. At least in principle, you should be + able to rely on it to fuzz programs on non-x86 architectures (after building + afl-fuzz with AFL_NO_X86=1). 
- - The instrumentation can cope a bit better with multi-threaded targets. +- The instrumentation can cope a bit better with multi-threaded targets. - - Because the feature relies on the internals of LLVM, it is clang-specific - and will *not* work with GCC (see ../gcc_plugin/ for an alternative once - it is available). +- Because the feature relies on the internals of LLVM, it is clang-specific and + will *not* work with GCC (see ../gcc_plugin/ for an alternative once it is + available). Once this implementation is shown to be sufficiently robust and portable, it will probably replace afl-clang. For now, it can be built separately and co-exists with the original code. -The idea and much of the intial implementation came from Laszlo Szekeres. +The idea and much of the initial implementation came from Laszlo Szekeres. ## 2a) How to use this - short Set the `LLVM_CONFIG` variable to the clang version you want to use, e.g. + ``` LLVM_CONFIG=llvm-config-9 make ``` + In case you have your own compiled llvm version specify the full path: + ``` LLVM_CONFIG=~/llvm-project/build/bin/llvm-config make ``` + If you try to use a new llvm version on an old Linux this can fail because of old c++ libraries. In this case usually switching to gcc/g++ to compile llvm_mode will work: + ``` LLVM_CONFIG=llvm-config-7 REAL_CC=gcc REAL_CXX=g++ make ``` -It is highly recommended to use the newest clang version you can put your -hands on :) + +It is highly recommended to use the newest clang version you can put your hands +on :) Then look at [README.persistent_mode.md](README.persistent_mode.md). ## 2b) How to use this - long In order to leverage this mechanism, you need to have clang installed on your -system. You should also make sure that the llvm-config tool is in your path -(or pointed to via LLVM_CONFIG in the environment). +system. You should also make sure that the llvm-config tool is in your path (or +pointed to via LLVM_CONFIG in the environment). 
-Note that if you have several LLVM versions installed, pointing LLVM_CONFIG -to the version you want to use will switch compiling to this specific -version - if you installation is set up correctly :-) +Note that if you have several LLVM versions installed, pointing LLVM_CONFIG to +the version you want to use will switch compiling to this specific version - if +you installation is set up correctly :-) Unfortunately, some systems that do have clang come without llvm-config or the LLVM development headers; one example of this is FreeBSD. FreeBSD users will @@ -75,15 +82,15 @@ load modules (you'll see "Service unavailable" when loading afl-llvm-pass.so). To solve all your problems, you can grab pre-built binaries for your OS from: - https://llvm.org/releases/download.html +[https://llvm.org/releases/download.html](https://llvm.org/releases/download.html) ...and then put the bin/ directory from the tarball at the beginning of your $PATH when compiling the feature and building packages later on. You don't need to be root for that. -To build the instrumentation itself, type 'make'. This will generate binaries -called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this -is done, you can instrument third-party code in a way similar to the standard +To build the instrumentation itself, type `make`. This will generate binaries +called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this is +done, you can instrument third-party code in a way similar to the standard operating mode of AFL, e.g.: ``` @@ -93,81 +100,137 @@ operating mode of AFL, e.g.: Be sure to also include CXX set to afl-clang-fast++ for C++ code. -Note that afl-clang-fast/afl-clang-fast++ are just pointers to afl-cc. -You can also use afl-cc/afl-c++ and instead direct it to use LLVM -instrumentation by either setting `AFL_CC_COMPILER=LLVM` or pass the parameter -`--afl-llvm` via CFLAGS/CXXFLAGS/CPPFLAGS. 
+Note that afl-clang-fast/afl-clang-fast++ are just pointers to afl-cc. You can +also use afl-cc/afl-c++ and instead direct it to use LLVM instrumentation by +either setting `AFL_CC_COMPILER=LLVM` or passing the parameter `--afl-llvm` via +CFLAGS/CXXFLAGS/CPPFLAGS. The tool honors roughly the same environmental variables as afl-gcc (see [docs/env_variables.md](../docs/env_variables.md)). This includes AFL_USE_ASAN, -AFL_HARDEN, and AFL_DONT_OPTIMIZE. However AFL_INST_RATIO is not honored -as it does not serve a good purpose with the more effective PCGUARD analysis. +AFL_HARDEN, and AFL_DONT_OPTIMIZE. However, AFL_INST_RATIO is not honored as it +does not serve a good purpose with the more effective PCGUARD analysis. ## 3) Options -Several options are present to make llvm_mode faster or help it rearrange -the code to make afl-fuzz path discovery easier. +Several options are present to make llvm_mode faster or help it rearrange the +code to make afl-fuzz path discovery easier. -If you need just to instrument specific parts of the code, you can the instrument file list -which C/C++ files to actually instrument. See [README.instrument_list.md](README.instrument_list.md) +If you need to instrument only specific parts of the code, you can use the +instrument file list to select which C/C++ files to actually instrument. See +[README.instrument_list.md](README.instrument_list.md). -For splitting memcmp, strncmp, etc. please see [README.laf-intel.md](README.laf-intel.md) +For splitting memcmp, strncmp, etc. please see +[README.laf-intel.md](README.laf-intel.md) Then there are different ways of instrumenting the target: -1. An better instrumentation strategy uses LTO and link time -instrumentation. Note that not all targets can compile in this mode, however -if it works it is the best option you can use. -Simply use afl-clang-lto/afl-clang-lto++ to use this option. -See [README.lto.md](README.lto.md) +1. A better instrumentation strategy uses LTO and link time instrumentation. 
+ Note that not all targets can compile in this mode, however if it works it is + the best option you can use. Simply use afl-clang-lto/afl-clang-lto++ to use + this option. See [README.lto.md](README.lto.md). -2. Alternativly you can choose a completely different coverage method: +2. Alternatively you can choose a completely different coverage method: -2a. N-GRAM coverage - which combines the previous visited edges with the -current one. This explodes the map but on the other hand has proven to be -effective for fuzzing. -See [README.ngram.md](README.ngram.md) +2a. N-GRAM coverage - which combines the previous visited edges with the current + one. This explodes the map but on the other hand has proven to be effective + for fuzzing. See + [7) AFL N-Gram Branch Coverage](#7-afl-n-gram-branch-coverage). 2b. Context sensitive coverage - which combines the visited edges with an -individual caller ID (the function that called the current one) -[README.ctx.md](README.ctx.md) + individual caller ID (the function that called the current one). See + [6) AFL Context Sensitive Branch Coverage](#6-afl-context-sensitive-branch-coverage). -Then - additionally to one of the instrumentation options above - there is -a very effective new instrumentation option called CmpLog as an alternative to -laf-intel that allow AFL++ to apply mutations similar to Redqueen. -See [README.cmplog.md](README.cmplog.md) +Then - additionally to one of the instrumentation options above - there is a +very effective new instrumentation option called CmpLog as an alternative to +laf-intel that allow AFL++ to apply mutations similar to Redqueen. See +[README.cmplog.md](README.cmplog.md). -Finally if your llvm version is 8 or lower, you can activate a mode that -prevents that a counter overflow result in a 0 value. This is good for -path discovery, but the llvm implementation for x86 for this functionality -is not optimal and was only fixed in llvm 9. 
-You can set this with AFL_LLVM_NOT_ZERO=1 -See [README.neverzero.md](README.neverzero.md) +Finally, if your llvm version is 8 or lower, you can activate a mode that +prevents that a counter overflow result in a 0 value. This is good for path +discovery, but the llvm implementation for x86 for this functionality is not +optimal and was only fixed in llvm 9. You can set this with AFL_LLVM_NOT_ZERO=1. -Support for thread safe counters has been added for all modes. -Activate it with `AFL_LLVM_THREADSAFE_INST=1`. The tradeoff is better precision -in multi threaded apps for a slightly higher instrumentation overhead. -This also disables the nozero counter default for performance reasons. +Support for thread safe counters has been added for all modes. Activate it with +`AFL_LLVM_THREADSAFE_INST=1`. The tradeoff is better precision in multi threaded +apps for a slightly higher instrumentation overhead. This also disables the +nozero counter default for performance reasons. -## 4) Snapshot feature +## 4) deferred initialization, persistent mode, shared memory fuzzing -To speed up fuzzing you can use a linux loadable kernel module which enables -a snapshot feature. -See [README.snapshot.md](README.snapshot.md) +This is the most powerful and effective fuzzing you can do. Please see +[README.persistent_mode.md](README.persistent_mode.md) for a full explanation. -## 5) Gotchas, feedback, bugs +## 5) Bonus feature: 'dict2file' pass -This is an early-stage mechanism, so field reports are welcome. You can send bug -reports to . +Just specify `AFL_LLVM_DICT2FILE=/absolute/path/file.txt` and during compilation +all constant string compare parameters will be written to this file to be used +with afl-fuzz' `-x` option. -## 6) deferred initialization, persistent mode, shared memory fuzzing +## 6) AFL Context Sensitive Branch Coverage -This is the most powerful and effective fuzzing you can do. -Please see [README.persistent_mode.md](README.persistent_mode.md) for a -full explanation. 
+### What is this? -## 7) Bonus feature: 'dict2file' pass +This is an LLVM-based implementation of the context sensitive branch coverage. -Just specify `AFL_LLVM_DICT2FILE=/absolute/path/file.txt` and during compilation -all constant string compare parameters will be written to this file to be -used with afl-fuzz' `-x` option. +Basically every function gets its own ID and, every time an edge is logged, +all the IDs in the callstack are hashed and combined with the edge transition +hash to augment the classic edge coverage with the information about the calling +context. + +So if both function A and function B call a function C, the coverage collected +in C will be different. + +In math the coverage is collected as follows: `map[current_location_ID ^ +previous_location_ID >> 1 ^ hash_callstack_IDs] += 1` + +The callstack hash is produced by XOR-ing the function IDs to avoid explosion with +recursive functions. + +### Usage + +Set the `AFL_LLVM_INSTRUMENT=CTX` or `AFL_LLVM_CTX=1` environment variable. + +It is highly recommended to increase the MAP_SIZE_POW2 definition in config.h to +at least 18 and maybe up to 20 for this as otherwise too many map collisions +occur. + +### Caller Branch Coverage + +If the context sensitive coverage introduces too many collisions and becomes +detrimental, the user can choose to augment edge coverage with just the called +function ID, instead of the entire callstack hash. + +In math the coverage is collected as follows: `map[current_location_ID ^ +previous_location_ID >> 1 ^ previous_callee_ID] += 1` + +Set the `AFL_LLVM_INSTRUMENT=CALLER` or `AFL_LLVM_CALLER=1` environment +variable. + +## 7) AFL N-Gram Branch Coverage + +### Source + +This is an LLVM-based implementation of the n-gram branch coverage proposed in +the paper +["Be Sensitive and Collaborative: Analyzing Impact of Coverage Metrics in Greybox Fuzzing"](https://www.usenix.org/system/files/raid2019-wang-jinghan.pdf) +by Jinghan Wang, et al. 
+ +Note that the original implementation (available +[here](https://github.com/bitsecurerlab/afl-sensitive)) is built on top of AFL's +qemu_mode. This is essentially a port that uses LLVM vectorized instructions +(available from llvm versions 4.0.1 and higher) to achieve the same results when +compiling source code. + +In math the branch coverage is performed as follows: `map[current_location ^ +prev_location[0] >> 1 ^ prev_location[1] >> 1 ^ ... up to n-1] += 1` + +### Usage + +The size of `n` (i.e., the number of branches to remember) is an option that is +specified either in the `AFL_LLVM_INSTRUMENT=NGRAM-{value}` or the +`AFL_LLVM_NGRAM_SIZE` environment variable. Good values are 2, 4, or 8; valid +values are 2-16. + +It is highly recommended to increase the MAP_SIZE_POW2 definition in config.h to +at least 18 and maybe up to 20 for this as otherwise too many map collisions +occur. \ No newline at end of file diff --git a/instrumentation/README.ngram.md b/instrumentation/README.ngram.md deleted file mode 100644 index da61ef32..00000000 --- a/instrumentation/README.ngram.md +++ /dev/null @@ -1,28 +0,0 @@ -# AFL N-Gram Branch Coverage - -## Source - -This is an LLVM-based implementation of the n-gram branch coverage proposed in -the paper ["Be Sensitive and Collaborative: Analzying Impact of Coverage Metrics -in Greybox Fuzzing"](https://www.usenix.org/system/files/raid2019-wang-jinghan.pdf), -by Jinghan Wang, et. al. - -Note that the original implementation (available -[here](https://github.com/bitsecurerlab/afl-sensitive)) -is built on top of AFL's QEMU mode. -This is essentially a port that uses LLVM vectorized instructions (available from -llvm versions 4.0.1 and higher) to achieve the same results when compiling source code. - -In math the branch coverage is performed as follows: -`map[current_location ^ prev_location[0] >> 1 ^ prev_location[1] >> 1 ^ ... 
up to n-1`] += 1` - -## Usage - -The size of `n` (i.e., the number of branches to remember) is an option -that is specified either in the `AFL_LLVM_INSTRUMENT=NGRAM-{value}` or the -`AFL_LLVM_NGRAM_SIZE` environment variable. -Good values are 2, 4 or 8, valid are 2-16. - -It is highly recommended to increase the MAP_SIZE_POW2 definition in -config.h to at least 18 and maybe up to 20 for this as otherwise too -many map collisions occur. -- cgit 1.4.1 From e0c8a5c0c6ae67af3280c0ead8124a2ffe920241 Mon Sep 17 00:00:00 2001 From: llzmb <46303940+llzmb@users.noreply.github.com> Date: Thu, 25 Nov 2021 16:47:49 +0100 Subject: Change "AFL" to "AFL++" in "README.llvm.md", fix references --- docs/env_variables.md | 4 ++-- docs/fuzzing_expert.md | 4 ++-- instrumentation/README.llvm.md | 12 ++++++------ 3 files changed, 10 insertions(+), 10 deletions(-) (limited to 'docs/env_variables.md') diff --git a/docs/env_variables.md b/docs/env_variables.md index 4386c5f8..cbc63032 100644 --- a/docs/env_variables.md +++ b/docs/env_variables.md @@ -171,7 +171,7 @@ config.h to at least 18 and maybe up to 20 for this as otherwise too many map collisions occur. For more information, see -[instrumentation/README.llvm.md#6) AFL Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage). +[instrumentation/README.llvm.md#6) AFL++ Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage). #### INSTRUMENT LIST (selectively instrument files and functions) @@ -247,7 +247,7 @@ in config.h to at least 18 and maybe up to 20 for this as otherwise too many map collisions occur. For more information, see -[instrumentation/README.llvm.md#7) AFL N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage). +[instrumentation/README.llvm.md#7) AFL++ N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage). 
#### NOT_ZERO diff --git a/docs/fuzzing_expert.md b/docs/fuzzing_expert.md index 5945d114..d0d28582 100644 --- a/docs/fuzzing_expert.md +++ b/docs/fuzzing_expert.md @@ -112,8 +112,8 @@ are interested in: There are many more options and modes available however these are most of the time less effective. See: - * [instrumentation/README.llvm.md#6) AFL Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage). - * [instrumentation/README.llvm.md#7) AFL N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage) + * [instrumentation/README.llvm.md#6) AFL++ Context Sensitive Branch Coverage](../instrumentation/README.llvm.md#6-afl-context-sensitive-branch-coverage). + * [instrumentation/README.llvm.md#7) AFL++ N-Gram Branch Coverage](../instrumentation/README.llvm.md#7-afl-n-gram-branch-coverage) #### c) Sanitizers diff --git a/instrumentation/README.llvm.md b/instrumentation/README.llvm.md index 1671f385..88ea0127 100644 --- a/instrumentation/README.llvm.md +++ b/instrumentation/README.llvm.md @@ -9,8 +9,8 @@ For the GCC-based instrumentation, see ! llvm_mode works with llvm versions 3.8 up to 13 ! -The code in this directory allows you to instrument programs for AFL using true -compiler-level instrumentation, instead of the more crude assembly-level +The code in this directory allows you to instrument programs for AFL++ using +true compiler-level instrumentation, instead of the more crude assembly-level rewriting approach taken by afl-gcc and afl-clang. This has several interesting properties: @@ -134,11 +134,11 @@ Then there are different ways of instrumenting the target: 2a. N-GRAM coverage - which combines the previous visited edges with the current one. This explodes the map but on the other hand has proven to be effective for fuzzing. See - [7) AFL N-Gram Branch Coverage](#7-afl-n-gram-branch-coverage). + [7) AFL++ N-Gram Branch Coverage](#7-afl-n-gram-branch-coverage). 2b. 
Context sensitive coverage - which combines the visited edges with an individual caller ID (the function that called the current one). See - [6) AFL Context Sensitive Branch Coverage](#6-afl-context-sensitive-branch-coverage). + [6) AFL++ Context Sensitive Branch Coverage](#6-afl-context-sensitive-branch-coverage). Then - additionally to one of the instrumentation options above - there is a very effective new instrumentation option called CmpLog as an alternative to @@ -166,7 +166,7 @@ Just specify `AFL_LLVM_DICT2FILE=/absolute/path/file.txt` and during compilation all constant string compare parameters will be written to this file to be used with afl-fuzz' `-x` option. -## 6) AFL Context Sensitive Branch Coverage +## 6) AFL++ Context Sensitive Branch Coverage ### What is this? @@ -206,7 +206,7 @@ previous_location_ID >> 1 ^ previous_callee_ID] += 1` Set the `AFL_LLVM_INSTRUMENT=CALLER` or `AFL_LLVM_CALLER=1` environment variable. -## 7) AFL N-Gram Branch Coverage +## 7) AFL++ N-Gram Branch Coverage ### Source -- cgit 1.4.1 From 7604dba6d6ee617d75ad7523ead02b6273233db5 Mon Sep 17 00:00:00 2001 From: llzmb <46303940+llzmb@users.noreply.github.com> Date: Fri, 26 Nov 2021 13:28:04 +0100 Subject: Fix typos --- docs/env_variables.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) (limited to 'docs/env_variables.md') diff --git a/docs/env_variables.md b/docs/env_variables.md index 34318cd4..2a004235 100644 --- a/docs/env_variables.md +++ b/docs/env_variables.md @@ -143,7 +143,7 @@ Available options: - CLANG - outdated clang instrumentation - CLASSIC - classic AFL (map[cur_loc ^ prev_loc >> 1]++) (default) - You can also specify CTX and/or NGRAM, seperate the options with a comma "," + You can also specify CTX and/or NGRAM, separate the options with a comma "," then, e.g.: `AFL_LLVM_INSTRUMENT=CLASSIC,CTX,NGRAM-4` Note: It is actually not a good idea to use both CTX and NGRAM. 
:) @@ -527,23 +527,23 @@ checks or alter some of the more exotic semantics of the tool: The QEMU wrapper used to instrument binary-only code supports several settings: - Setting `AFL_COMPCOV_LEVEL` enables the CompareCoverage tracing of all cmp - and sub in x86 and x86_64 and memory comparions functions (e.g. strcmp, + and sub in x86 and x86_64 and memory comparison functions (e.g., strcmp, memcmp, ...) when libcompcov is preloaded using `AFL_PRELOAD`. More info at [qemu_mode/libcompcov/README.md](../qemu_mode/libcompcov/README.md). There are two levels at the moment, `AFL_COMPCOV_LEVEL=1` that instruments only comparisons with immediate values / read-only memory and - `AFL_COMPCOV_LEVEL=2` that instruments all the comparions. Level 2 is more + `AFL_COMPCOV_LEVEL=2` that instruments all the comparisons. Level 2 is more accurate but may need a larger shared memory. - - `AFL_DEBUG` will print the found entrypoint for the binary to stderr. Use - this if you are unsure if the entrypoint might be wrong - but use it + - `AFL_DEBUG` will print the found entry point for the binary to stderr. Use + this if you are unsure if the entry point might be wrong - but use it directly, e.g. `afl-qemu-trace ./program`. - - `AFL_ENTRYPOINT` allows you to specify a specific entrypoint into the binary - (this can be very good for the performance!). The entrypoint is specified as - hex address, e.g. `0x4004110`. Note that the address must be the address of - a basic block. + - `AFL_ENTRYPOINT` allows you to specify a specific entry point into the + binary (this can be very good for the performance!). The entry point is + specified as hex address, e.g. `0x4004110`. Note that the address must be + the address of a basic block. - Setting `AFL_INST_LIBS` causes the translator to also instrument the code inside any dynamically linked libraries (notably including glibc). 
-- cgit 1.4.1 From a699dc2d2d54d10c729466408925384f2e07819b Mon Sep 17 00:00:00 2001 From: Your Name Date: Mon, 29 Nov 2021 17:38:06 +0000 Subject: Update docs --- docs/env_variables.md | 99 +++++++++++++++++++++++++++++++++++++++++++++++---- frida_mode/README.md | 6 ++++ 2 files changed, 98 insertions(+), 7 deletions(-) (limited to 'docs/env_variables.md') diff --git a/docs/env_variables.md b/docs/env_variables.md index cbc63032..1a330158 100644 --- a/docs/env_variables.md +++ b/docs/env_variables.md @@ -578,7 +578,92 @@ The QEMU wrapper used to instrument binary-only code supports several settings: emulation" variables (e.g., `QEMU_STACK_SIZE`), but there should be no reason to touch them. -## 6) Settings for afl-cmin +## 7) Settings for afl-frida-trace + +The FRIDA wrapper used to instrument binary-only code supports many of the same +options as `afl-qemu-trace`, but also has a number of additional advanced +options. These are listed in brief below (see [here](../frida_mode/README.md) +for more details). These settings are provided for compatibility with QEMU mode; +the preferred way to configure FRIDA mode is through its +[scripting](../frida_mode/Scripting.md) support. + +* `AFL_FRIDA_DEBUG_MAPS` - See `AFL_QEMU_DEBUG_MAPS` +* `AFL_FRIDA_DRIVER_NO_HOOK` - See `AFL_QEMU_DRIVER_NO_HOOK`. When using the +QEMU driver to provide a `main` loop for a user-provided +`LLVMFuzzerTestOneInput`, this option configures the driver to read input from +`stdin` rather than using in-memory test cases. +* `AFL_FRIDA_EXCLUDE_RANGES` - See `AFL_QEMU_EXCLUDE_RANGES` +* `AFL_FRIDA_INST_COVERAGE_FILE` - File to write DynamoRIO format coverage +information (e.g. to be loaded within IDA lighthouse). +* `AFL_FRIDA_INST_DEBUG_FILE` - File to write raw assembly of original blocks +and their instrumented counterparts during block compilation. +* `AFL_FRIDA_INST_JIT` - Enable the instrumentation of Just-In-Time compiled +code. 
Code is considered to be JIT if the executable segment is not backed by a +file. +* `AFL_FRIDA_INST_NO_OPTIMIZE` - Don't use optimized inline assembly coverage +instrumentation (the default where available). Required to use +`AFL_FRIDA_INST_TRACE`. +* `AFL_FRIDA_INST_NO_BACKPATCH` - Disable backpatching. At the end of executing +each block, control will return to FRIDA to identify the next block to execute. +* `AFL_FRIDA_INST_NO_PREFETCH` - Disable prefetching. By default the child will +report instrumented blocks back to the parent so that it can also instrument +them and they be inherited by the next child on fork, implies +`AFL_FRIDA_INST_NO_PREFETCH_BACKPATCH`. +* `AFL_FRIDA_INST_NO_PREFETCH_BACKPATCH` - Disable prefetching of stalker +backpatching information. By default the child will report applied backpatches +to the parent so that they can be applied and then be inherited by the next +child on fork. +* `AFL_FRIDA_INST_RANGES` - See `AFL_QEMU_INST_RANGES` +* `AFL_FRIDA_INST_SEED` - Sets the initial seed for the hash function used to +generate block (and hence edge) IDs. Setting this to a constant value may be +useful for debugging purposes, e.g. investigating unstable edges. +* `AFL_FRIDA_INST_TRACE` - Log to stdout the address of executed blocks, +implies `AFL_FRIDA_INST_NO_OPTIMIZE`. +* `AFL_FRIDA_INST_TRACE_UNIQUE` - As per `AFL_FRIDA_INST_TRACE`, but each edge +is logged only once, requires `AFL_FRIDA_INST_NO_OPTIMIZE`. +* `AFL_FRIDA_INST_UNSTABLE_COVERAGE_FILE` - File to write DynamoRio format +coverage information for unstable edges (e.g. to be loaded within IDA +lighthouse). +* `AFL_FRIDA_JS_SCRIPT` - Set the script to be loaded by the FRIDA scripting +engine. See [here](Scripting.md) for details. 
+* `AFL_FRIDA_OUTPUT_STDOUT` - Redirect the standard output of the target +application to the named file (supersedes the setting of `AFL_DEBUG_CHILD`) +* `AFL_FRIDA_OUTPUT_STDERR` - Redirect the standard error of the target +application to the named file (supersedes the setting of `AFL_DEBUG_CHILD`) +* `AFL_FRIDA_PERSISTENT_ADDR` - See `AFL_QEMU_PERSISTENT_ADDR` +* `AFL_FRIDA_PERSISTENT_CNT` - See `AFL_QEMU_PERSISTENT_CNT` +* `AFL_FRIDA_PERSISTENT_DEBUG` - Insert a Breakpoint into the instrumented code +at `AFL_FRIDA_PERSISTENT_HOOK` and `AFL_FRIDA_PERSISTENT_RET` to allow the user +to detect issues in the persistent loop using a debugger. +* `AFL_FRIDA_PERSISTENT_HOOK` - See `AFL_QEMU_PERSISTENT_HOOK` +* `AFL_FRIDA_PERSISTENT_RET` - See `AFL_QEMU_PERSISTENT_RET` +* `AFL_FRIDA_SECCOMP_FILE` - Write a log of any syscalls made by the target to +the specified file. +* `AFL_FRIDA_STALKER_ADJACENT_BLOCKS` - Configure the number of adjacent blocks + to fetch when generating instrumented code. By fetching blocks in the same + order they appear in the original program, rather than the order of execution + should help reduce locality and adjacency. This includes allowing us to vector + between adjacent blocks using a NOP slide rather than an immediate branch. +* `AFL_FRIDA_STALKER_IC_ENTRIES` - Configure the number of inline cache entries +stored along-side branch instructions which provide a cache to avoid having to +call back into FRIDA to find the next block. Default is 32. +* `AFL_FRIDA_STATS_FILE` - Write statistics information about the code being +instrumented to the given file name. The statistics are written only for the +child process when new block is instrumented (when the +`AFL_FRIDA_STATS_INTERVAL` has expired). Note that simply because a new path is +found does not mean a new block needs to be compiled. It could simply be that +the existing blocks instrumented have been executed in a different order.
+* `AFL_FRIDA_STATS_INTERVAL` - The maximum frequency to output statistics +information. Stats will be written whenever they are updated if the given +interval has elapsed since last time they were written. +* `AFL_FRIDA_TRACEABLE` - Set the child process to be traceable by any process +to aid debugging and overcome the restrictions imposed by YAMA. Supported on +Linux only. Permits a non-root user to use `gcore` or similar to collect a core +dump of the instrumented target. Note that in order to capture the core dump you +must set a sufficient timeout (using `-t`) to avoid `afl-fuzz` killing the +process whilst it is being dumped. + +## 8) Settings for afl-cmin The corpus minimization script offers very little customization: @@ -596,7 +681,7 @@ The corpus minimization script offers very little customization: - `AFL_PRINT_FILENAMES` prints each filename to stdout, as it gets processed. This can help when embedding `afl-cmin` or `afl-showmap` in other scripts. -## 7) Settings for afl-tmin +## 9) Settings for afl-tmin Virtually nothing to play with. Well, in QEMU mode (`-Q`), `AFL_PATH` will be searched for afl-qemu-trace. In addition to this, `TMPDIR` may be used if a @@ -607,12 +692,12 @@ to match when minimizing crashes. This will make minimization less useful, but may prevent the tool from "jumping" from one crashing condition to another in very buggy software. You probably want to combine it with the `-e` flag. -## 8) Settings for afl-analyze +## 10) Settings for afl-analyze You can set `AFL_ANALYZE_HEX` to get file offsets printed as hexadecimal instead of decimal. -## 9) Settings for libdislocator +## 11) Settings for libdislocator The library honors these environment variables: @@ -634,12 +719,12 @@ The library honors these environment variables: - `AFL_LD_VERBOSE` causes the library to output some diagnostic messages that may be useful for pinpointing the cause of any observed issues. 
-## 10) Settings for libtokencap +## 11) Settings for libtokencap This library accepts `AFL_TOKEN_FILE` to indicate the location to which the discovered tokens should be written. -## 11) Third-party variables set by afl-fuzz & other tools +## 12) Third-party variables set by afl-fuzz & other tools Several variables are not directly interpreted by afl-fuzz, but are set to optimal values if not already present in the environment: @@ -684,4 +769,4 @@ optimal values if not already present in the environment: - By default, `LD_BIND_NOW` is set to speed up fuzzing by forcing the linker to do all the work before the fork server kicks in. You can override this by - setting `LD_BIND_LAZY` beforehand, but it is almost certainly pointless. \ No newline at end of file + setting `LD_BIND_LAZY` beforehand, but it is almost certainly pointless. diff --git a/frida_mode/README.md b/frida_mode/README.md index a75324d5..6c46fe08 100644 --- a/frida_mode/README.md +++ b/frida_mode/README.md @@ -145,6 +145,10 @@ instances run CMPLOG mode and instrumentation of the binary is less frequent (only on CMP, SUB and CALL instructions) performance is not quite so critical. ## Advanced configuration options +* `AFL_FRIDA_DRIVER_NO_HOOK` - See `AFL_QEMU_DRIVER_NO_HOOK`. When using the +QEMU driver to provide a `main` loop for a user provided +`LLVMFuzzerTestOneInput`, this option configures the driver to read input from +`stdin` rather than using in-memory test cases. * `AFL_FRIDA_INST_COVERAGE_FILE` - File to write DynamoRio format coverage information (e.g. to be loaded within IDA lighthouse). * `AFL_FRIDA_INST_DEBUG_FILE` - File to write raw assembly of original blocks @@ -194,6 +198,8 @@ is logged only once, requires `AFL_FRIDA_INST_NO_OPTIMIZE`. * `AFL_FRIDA_INST_UNSTABLE_COVERAGE_FILE` - File to write DynamoRio format coverage information for unstable edges (e.g. to be loaded within IDA lighthouse). +* `AFL_FRIDA_JS_SCRIPT` - Set the script to be loaded by the FRIDA scripting +engine. 
See [here](Scripting.md) for details. * `AFL_FRIDA_OUTPUT_STDOUT` - Redirect the standard output of the target application to the named file (supersedes the setting of `AFL_DEBUG_CHILD`) * `AFL_FRIDA_OUTPUT_STDERR` - Redirect the standard error of the target -- cgit 1.4.1
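
The option descriptions added in the patch above state a few dependencies between FRIDA mode variables: `AFL_FRIDA_INST_TRACE` and `AFL_FRIDA_INST_TRACE_UNIQUE` need `AFL_FRIDA_INST_NO_OPTIMIZE`, and `AFL_FRIDA_INST_NO_PREFETCH` implies `AFL_FRIDA_INST_NO_PREFETCH_BACKPATCH`. A minimal Python sketch of a helper that fills in those companion variables while building an environment might look like the following. The helper name `frida_env`, the `COMPANIONS` table, and the value `"1"` used to switch an option on are assumptions made for illustration; none of them are part of AFL++.

```python
# Companion variables taken from the option descriptions in the patch:
# each key is documented as requiring or implying the names in its value.
COMPANIONS = {
    "AFL_FRIDA_INST_TRACE": ("AFL_FRIDA_INST_NO_OPTIMIZE",),
    "AFL_FRIDA_INST_TRACE_UNIQUE": ("AFL_FRIDA_INST_NO_OPTIMIZE",),
    "AFL_FRIDA_INST_NO_PREFETCH": ("AFL_FRIDA_INST_NO_PREFETCH_BACKPATCH",),
}


def frida_env(requested, base=None):
    """Return a new environment dict for a FRIDA mode run.

    `requested` maps variable names to string values. `base` is an optional
    starting environment (for example a copy of os.environ). Companion
    variables are added with the assumed value "1" unless already present.
    """
    env = dict(base or {})
    env.update(requested)
    for name in requested:
        for companion in COMPANIONS.get(name, ()):
            # setdefault keeps any value the caller chose explicitly.
            env.setdefault(companion, "1")
    return env
```

The returned dict could be passed as `env=` to `subprocess.run` when launching the fuzzer, so that a request for block tracing never runs without the unoptimized instrumentation it depends on.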