author van Hauser <vh@thc.org> 2020-08-05 01:13:51 +0200
committer van Hauser <vh@thc.org> 2020-08-05 01:13:51 +0200
commit 6d364dd2cb0ac31797b52e590b57bf9c10cc2302 (patch)
tree c241c6133910fdce295760657a1a2dd6879487d6
parent 8ed6207b5cec628cb51a807a0a585f129de2e041 (diff)
download afl++-6d364dd2cb0ac31797b52e590b57bf9c10cc2302.tar.gz
add sancov-like allow/denylist instrument feature
-rw-r--r--  README.md                                    16
-rw-r--r--  TODO.md                                       1
-rw-r--r--  docs/Changelog.md                             4
-rw-r--r--  docs/FAQ.md                                   2
-rw-r--r--  docs/env_variables.md                        11
-rw-r--r--  docs/perf_tips.md                             2
-rw-r--r--  gcc_plugin/GNUmakefile                        2
-rw-r--r--  gcc_plugin/Makefile                           2
-rw-r--r--  include/envs.h                                3
-rw-r--r--  llvm_mode/README.instrument_file.md          81
-rw-r--r--  llvm_mode/README.instrument_list.md          86
-rw-r--r--  llvm_mode/README.lto.md                       5
-rw-r--r--  llvm_mode/README.md                           2
-rw-r--r--  llvm_mode/afl-clang-fast.c                   32
-rw-r--r--  llvm_mode/afl-llvm-common.cc                485
-rw-r--r--  llvm_mode/afl-llvm-lto-instrumentlist.so.cc 156
16 files changed, 567 insertions, 323 deletions
diff --git a/README.md b/README.md
index bd2784ae..2e24a534 100644
--- a/README.md
+++ b/README.md
@@ -246,7 +246,7 @@ anything below 9 is not recommended.
+--------------------------------+
| if you want to instrument only | -> use afl-gcc-fast and afl-gcc-fast++
| parts of the target | see [gcc_plugin/README.md](gcc_plugin/README.md) and
- +--------------------------------+ [gcc_plugin/README.instrument_file.md](gcc_plugin/README.instrument_file.md)
+ +--------------------------------+ [gcc_plugin/README.instrument_list.md](gcc_plugin/README.instrument_list.md)
|
| if not, or if you do not have a gcc with plugin support
|
@@ -290,12 +290,18 @@ selectively only instrument parts of the target that you are interested in:
create a file with all the filenames of the source code that should be
instrumented.
For afl-clang-lto and afl-gcc-fast - or afl-clang-fast if either the clang
- version is < 7 or the CLASSIC instrumentation is used - just put one
- filename per line, no directory information necessary, and set
- `export AFL_LLVM_INSTRUMENT_FILE=yourfile.txt`
- see [llvm_mode/README.instrument_file.md](llvm_mode/README.instrument_file.md)
+ version is below 7 or the CLASSIC instrumentation is used - just put one
+ filename or function per line (no directory information necessary for
+ filenames), and set either `export AFL_LLVM_ALLOWLIST=allowlist.txt` **or**
+ `export AFL_LLVM_DENYLIST=denylist.txt`, depending on whether you want to
+ instrument by default unless listed (DENYLIST) or to skip instrumentation
+ unless listed (ALLOWLIST).
+ **NOTE:** With optimization enabled, functions might be inlined and then not match!
+ see [llvm_mode/README.instrument_list.md](llvm_mode/README.instrument_list.md)
For afl-clang-fast > 6.0 or if PCGUARD instrumentation is used then use the
llvm sancov allow-list feature: [http://clang.llvm.org/docs/SanitizerCoverage.html](http://clang.llvm.org/docs/SanitizerCoverage.html)
+ The llvm sancov format works with the allowlist/denylist feature of afl++
+ however afl++ is more flexible in the format.
There are many more options and modes available however these are most of the
time less effective. See:
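
The allowlist/denylist semantics described in the README hunk above can be sketched as follows. This is a minimal illustration under stated assumptions, not the afl++ implementation: `ListMode`, `endsWithPattern`, `matchesAny` and `shouldInstrument` are hypothetical names. Only the `"*" + entry` suffix pattern, the length check, and the call to POSIX `fnmatch` with no flags are taken from the `afl-llvm-common.cc` changes in this commit.

```cpp
#include <fnmatch.h>

#include <string>
#include <vector>

// Hypothetical helper: true if `name` ends in `entry`. The entry may contain
// fnmatch(3) wildcards. The length check and the "*" prefix mirror the diff.
static bool endsWithPattern(const std::string &name, const std::string &entry) {
  if (name.length() < entry.length()) return false;
  return fnmatch(("*" + entry).c_str(), name.c_str(), 0) == 0;
}

// True if `name` matches at least one list entry.
static bool matchesAny(const std::string &name,
                       const std::vector<std::string> &entries) {
  for (const auto &entry : entries)
    if (endsWithPattern(name, entry)) return true;
  return false;
}

// Assumed simplification: exactly one list (or none) is active at a time.
enum class ListMode { None, Allow, Deny };

// Allow: instrument only on a match. Deny: instrument unless matched.
// None: no list was given, so everything is instrumented.
static bool shouldInstrument(ListMode mode, const std::string &name,
                             const std::vector<std::string> &entries) {
  switch (mode) {
    case ListMode::Allow: return matchesAny(name, entries);
    case ListMode::Deny: return !matchesAny(name, entries);
    case ListMode::None: return true;
  }
  return true;
}
```

Because no `fnmatch` flags are set, `*` also matches `/`, which is what lets a short entry such as `a1.cpp` match an absolute path seen during compilation.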
diff --git a/TODO.md b/TODO.md
index 999cb9d3..e81b82a3 100644
--- a/TODO.md
+++ b/TODO.md
@@ -2,7 +2,6 @@
## Roadmap 2.67+
- - expand on AFL_LLVM_INSTRUMENT_FILE to also support sancov allowlist format
- AFL_MAP_SIZE for qemu_mode and unicorn_mode
- CPU affinity for many cores? There seems to be an issue > 96 cores
diff --git a/docs/Changelog.md b/docs/Changelog.md
index ae7377f2..f98f8b9b 100644
--- a/docs/Changelog.md
+++ b/docs/Changelog.md
@@ -22,6 +22,10 @@ sending a mail to <afl-users+subscribe@googlegroups.com>.
- fixed a bug in redqueen for strings
- llvm_mode:
- now supports llvm 12!
+ - support for AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST (the previous
+ AFL_LLVM_WHITELIST and AFL_LLVM_INSTRUMENT_FILE are deprecated and
+ are mapped to AFL_LLVM_ALLOWLIST). The format is compatible with llvm
+ sancov, and also supports function matching!
- fixes for laf-intel float splitting (thanks to mark-griffin for
reporting)
- LTO: autodictionary mode is a default
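
The mapping of deprecated variables mentioned in the Changelog entry above can be sketched like this. It is an illustration that follows the lookup order visible in `initInstrumentList()` later in this commit. `ListConfig`, `firstSetEnv` and `resolveListConfig` are hypothetical names, and throwing an exception stands in for the `FATAL()` call used by afl++.

```cpp
#include <cstdlib>
#include <initializer_list>
#include <stdexcept>
#include <string>

// Hypothetical result type: an empty string means "not set".
struct ListConfig {
  std::string allowlist;
  std::string denylist;
};

// Return the value of the first environment variable in `names` that is set.
static const char *firstSetEnv(std::initializer_list<const char *> names) {
  for (const char *name : names)
    if (const char *value = std::getenv(name)) return value;
  return nullptr;
}

// New names win over the deprecated aliases. Setting both an allowlist and
// a denylist is rejected, as in the diff.
static ListConfig resolveListConfig() {
  const char *allow = firstSetEnv({"AFL_LLVM_ALLOWLIST",
                                   "AFL_LLVM_INSTRUMENT_FILE",
                                   "AFL_LLVM_WHITELIST"});
  const char *deny = firstSetEnv({"AFL_LLVM_DENYLIST", "AFL_LLVM_BLOCKLIST"});

  if (allow && deny)
    throw std::runtime_error(
        "You can only specify either AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST "
        "but not both!");

  ListConfig cfg;
  if (allow) cfg.allowlist = allow;
  if (deny) cfg.denylist = deny;
  return cfg;
}
```

With only `AFL_LLVM_WHITELIST` exported, `resolveListConfig().allowlist` holds that value. Once `AFL_LLVM_ALLOWLIST` is also exported, the new name takes precedence.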
diff --git a/docs/FAQ.md b/docs/FAQ.md
index c15cd484..33ce49e6 100644
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -117,7 +117,7 @@ afl-clang-fast PCGUARD and afl-clang-lto LTO instrumentation!
Identify which source code files contain the functions that you need to
remove from instrumentation.
- Simply follow this document on how to do this: [llvm_mode/README.instrument_file.md](llvm_mode/README.instrument_file.md)
+ Simply follow this document on how to do this: [llvm_mode/README.instrument_list.md](llvm_mode/README.instrument_list.md)
If PCGUARD is used, then you need to follow this guide (needs llvm 12+!):
[http://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation](http://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation)
diff --git a/docs/env_variables.md b/docs/env_variables.md
index 811c5658..f0ae0b6c 100644
--- a/docs/env_variables.md
+++ b/docs/env_variables.md
@@ -202,14 +202,15 @@ Then there are a few specific features that are only available in llvm_mode:
See llvm_mode/README.laf-intel.md for more information.
-### INSTRUMENT_FILE
+### INSTRUMENT LIST (selectively instrument files and functions)
This feature allows selectively instrumentation of the source
- - Setting AFL_LLVM_INSTRUMENT_FILE with a filename will only instrument those
- files that match the names listed in this file.
+ - Setting AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST to a file that lists
+ filenames and/or functions will only instrument (or skip) the files and
+ functions that match the names listed in that file.
- See llvm_mode/README.instrument_file.md for more information.
+ See llvm_mode/README.instrument_list.md for more information.
### NOT_ZERO
@@ -241,7 +242,7 @@ Then there are a few specific features that are only available in the gcc_plugin
- Setting AFL_GCC_INSTRUMENT_FILE with a filename will only instrument those
files that match the names listed in this file (one filename per line).
- See gcc_plugin/README.instrument_file.md for more information.
+ See gcc_plugin/README.instrument_list.md for more information.
## 3) Settings for afl-fuzz
diff --git a/docs/perf_tips.md b/docs/perf_tips.md
index 7a690b77..731dc238 100644
--- a/docs/perf_tips.md
+++ b/docs/perf_tips.md
@@ -67,7 +67,7 @@ to get to the important parts in the code.
If you are only interested in specific parts of the code being fuzzed, you can
instrument_files the files that are actually relevant. This improves the speed and
-accuracy of afl. See llvm_mode/README.instrument_file.md
+accuracy of afl. See llvm_mode/README.instrument_list.md
Also use the InsTrim mode on larger binaries, this improves performance and
coverage a lot.
diff --git a/gcc_plugin/GNUmakefile b/gcc_plugin/GNUmakefile
index 4a4f0dcd..f10a6c1d 100644
--- a/gcc_plugin/GNUmakefile
+++ b/gcc_plugin/GNUmakefile
@@ -163,7 +163,7 @@ install: all
install -m 755 ../afl-gcc-fast $${DESTDIR}$(BIN_PATH)
install -m 755 ../afl-gcc-pass.so ../afl-gcc-rt.o $${DESTDIR}$(HELPER_PATH)
install -m 644 -T README.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.md
- install -m 644 -T README.instrument_file.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.instrument_file.md
+ install -m 644 -T README.instrument_list.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.instrument_file.md
clean:
rm -f *.o *.so *~ a.out core core.[1-9][0-9]* test-instr .test-instr0 .test-instr1 .test2
diff --git a/gcc_plugin/Makefile b/gcc_plugin/Makefile
index f720112f..c088b61c 100644
--- a/gcc_plugin/Makefile
+++ b/gcc_plugin/Makefile
@@ -152,7 +152,7 @@ install: all
install -m 755 ../afl-gcc-fast $${DESTDIR}$(BIN_PATH)
install -m 755 ../afl-gcc-pass.so ../afl-gcc-rt.o $${DESTDIR}$(HELPER_PATH)
install -m 644 -T README.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.md
- install -m 644 -T README.instrument_file.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.instrument_file.md
+ install -m 644 -T README.instrument_list.md $${DESTDIR}$(DOC_PATH)/README.gcc_plugin.instrument_file.md
clean:
rm -f *.o *.so *~ a.out core core.[1-9][0-9]* test-instr .test-instr0 .test-instr1 .test2
diff --git a/include/envs.h b/include/envs.h
index 7153ed47..96ae91ba 100644
--- a/include/envs.h
+++ b/include/envs.h
@@ -62,6 +62,9 @@ static char *afl_environment_variables[] = {
"AFL_REAL_LD",
"AFL_LD_PRELOAD",
"AFL_LD_VERBOSE",
+ "AFL_LLVM_ALLOWLIST",
+ "AFL_LLVM_DENYLIST",
+ "AFL_LLVM_BLOCKLIST",
"AFL_LLVM_CMPLOG",
"AFL_LLVM_INSTRIM",
"AFL_LLVM_CTX",
diff --git a/llvm_mode/README.instrument_file.md b/llvm_mode/README.instrument_file.md
deleted file mode 100644
index 46e45ba2..00000000
--- a/llvm_mode/README.instrument_file.md
+++ /dev/null
@@ -1,81 +0,0 @@
-# Using afl++ with partial instrumentation
-
- This file describes how you can selectively instrument only the source files
- that are interesting to you using the LLVM instrumentation provided by
- afl++
-
- Originally developed by Christian Holler (:decoder) <choller@mozilla.com>.
-
-## 1) Description and purpose
-
-When building and testing complex programs where only a part of the program is
-the fuzzing target, it often helps to only instrument the necessary parts of
-the program, leaving the rest uninstrumented. This helps to focus the fuzzer
-on the important parts of the program, avoiding undesired noise and
-disturbance by uninteresting code being exercised.
-
-For this purpose, I have added a "partial instrumentation" support to the LLVM
-mode of AFLFuzz that allows you to specify on a source file level which files
-should be compiled with or without instrumentation.
-
-Note: When using PCGUARD mode - and have llvm 12+ - you can use this instead:
-https://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation
-
-## 2) Building the LLVM module
-
-The new code is part of the existing afl++ LLVM module in the llvm_mode/
-subdirectory. There is nothing specifically to do :)
-
-
-## 3) How to use the partial instrumentation mode
-
-In order to build with partial instrumentation, you need to build with
-afl-clang-fast and afl-clang-fast++ respectively. The only required change is
-that you need to set the environment variable AFL_LLVM_INSTRUMENT_FILE when calling
-the compiler.
-
-The environment variable must point to a file containing all the filenames
-that should be instrumented. For matching, the filename that is being compiled
-must end in the filename entry contained in this the instrument file list (to avoid breaking
-the matching when absolute paths are used during compilation).
-
-For example if your source tree looks like this:
-
-```
-project/
-project/feature_a/a1.cpp
-project/feature_a/a2.cpp
-project/feature_b/b1.cpp
-project/feature_b/b2.cpp
-```
-
-and you only want to test feature_a, then create a the instrument file list file containing:
-
-```
-feature_a/a1.cpp
-feature_a/a2.cpp
-```
-
-However if the instrument file list file contains only this, it works as well:
-
-```
-a1.cpp
-a2.cpp
-```
-
-but it might lead to files being unwantedly instrumented if the same filename
-exists somewhere else in the project directories.
-
-The created the instrument file list file is then set to AFL_LLVM_INSTRUMENT_FILE when you compile
-your program. For each file that didn't match the the instrument file list, the compiler will
-issue a warning at the end stating that no blocks were instrumented. If you
-didn't intend to instrument that file, then you can safely ignore that warning.
-
-For old LLVM versions this feature might require to be compiled with debug
-information (-g), however at least from llvm version 6.0 onwards this is not
-required anymore (and might hurt performance and crash detection, so better not
-use -g).
-
-## 4) UNIX-style filename pattern matching
-You can add UNIX-style pattern matching in the the instrument file list entries. See `man
-fnmatch` for the syntax. We do not set any of the `fnmatch` flags.
diff --git a/llvm_mode/README.instrument_list.md b/llvm_mode/README.instrument_list.md
new file mode 100644
index 00000000..b0e0cc1e
--- /dev/null
+++ b/llvm_mode/README.instrument_list.md
@@ -0,0 +1,86 @@
+# Using afl++ with partial instrumentation
+
+ This file describes how you can selectively instrument only the source files
+ or functions that are interesting to you using the LLVM instrumentation
+ provided by afl++
+
+## 1) Description and purpose
+
+When building and testing complex programs where only a part of the program is
+the fuzzing target, it often helps to only instrument the necessary parts of
+the program, leaving the rest uninstrumented. This helps to focus the fuzzer
+on the important parts of the program, avoiding undesired noise and
+disturbance by uninteresting code being exercised.
+
+For this purpose, afl++ provides "partial instrumentation" support on par with
+llvm sancov that allows you to specify on a source file and function level
+what should be compiled with or without instrumentation.
+
+Note: When using PCGUARD mode - and have llvm 12+ - you can use this instead:
+https://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation
+
+The llvm sancov list format is fully supported by afl++, however afl++ has
+more flexibility.
+
+## 2) Building the LLVM module
+
+The new code is part of the existing afl++ LLVM module in the llvm_mode/
+subdirectory. There is nothing specifically to do :)
+
+## 3) How to use the partial instrumentation mode
+
+In order to build with partial instrumentation, you need to build with
+afl-clang-fast/afl-clang-fast++ or afl-clang-lto/afl-clang-lto++.
+The only required change is that you need to set either the environment variable
+AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST to a filename.
+
+That file then contains the filenames or functions that should be instrumented
+(AFL_LLVM_ALLOWLIST) or should specifically NOT be instrumented (AFL_LLVM_DENYLIST).
+
+For matching, the function/filename that is being compiled must end in the
+function/filename entry contained in the instrument list (to avoid
+breaking the matching when absolute paths are used during compilation).
+
+**NOTE:** With optimization enabled, functions might be inlined and then not match!
+
+For example if your source tree looks like this:
+```
+project/
+project/feature_a/a1.cpp
+project/feature_a/a2.cpp
+project/feature_b/b1.cpp
+project/feature_b/b2.cpp
+```
+
+and you only want to test feature_a, then create an instrument list file containing:
+```
+feature_a/a1.cpp
+feature_a/a2.cpp
+```
+
+However if the instrument list file contains only this, it works as well:
+```
+a1.cpp
+a2.cpp
+```
+but it might lead to files being unwantedly instrumented if the same filename
+exists somewhere else in the project directories.
+
+You can also specify function names. Note that for C++ the function names
+must be mangled to match!
+
+afl++ tries to detect whether an entry is a filename or a function.
+However if you want to be sure (and compliant with the sancov allow/blocklist
+format), you can specify file entries like this:
+```
+src: *malloc.c
+```
+and function entries like this:
+```
+fun: MallocFoo
+```
+Note that whitespace is ignored and comments (`# foo`) are supported.
+
+## 4) UNIX-style pattern matching
+You can add UNIX-style pattern matching in the instrument list entries.
+See `man fnmatch` for the syntax. We do not set any of the `fnmatch` flags.
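
The entry format described in the new README can be sketched as a per-line parser. The rules (strip all whitespace, cut at `#`, accept the `fun:`/`function:`/`src:`/`source:` prefixes, reject any remaining `:`, and treat an unprefixed entry containing `/` or `.` as a file) follow `initInstrumentList()` in this commit. `EntryKind`, `Entry` and `parseListLine` are hypothetical names, and the exception stands in for afl++'s `FATAL()`.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>

enum class EntryKind { Empty, File, Function };

struct Entry {
  EntryKind   kind;
  std::string pattern;
};

// Parse one line of an allowlist/denylist file into a classified entry.
static Entry parseListLine(std::string line) {
  // Whitespace is ignored everywhere, not only at the ends of the line.
  line.erase(std::remove_if(line.begin(), line.end(),
                            [](unsigned char c) { return std::isspace(c) != 0; }),
             line.end());

  // Everything from '#' onwards is a comment.
  const std::size_t hash = line.find('#');
  if (hash != std::string::npos) line.erase(hash);

  // -1: no explicit prefix, 0: function, 1: file
  int  isFile = -1;
  auto stripPrefix = [&line](const std::string &prefix) {
    if (line.compare(0, prefix.size(), prefix) == 0) {
      line.erase(0, prefix.size());
      return true;
    }
    return false;
  };

  if (stripPrefix("fun:") || stripPrefix("function:"))
    isFile = 0;
  else if (stripPrefix("src:") || stripPrefix("source:"))
    isFile = 1;

  if (line.find(':') != std::string::npos)
    throw std::invalid_argument("invalid list line");

  if (line.empty()) return {EntryKind::Empty, ""};

  // Without a prefix, an entry containing '/' or '.' is taken to be a file.
  if (isFile == -1)
    isFile = (line.find('/') != std::string::npos ||
              line.find('.') != std::string::npos) ? 1 : 0;

  return {isFile == 1 ? EntryKind::File : EntryKind::Function, line};
}
```

Returning an explicit `Empty` kind for blank and comment-only lines is a design choice of this sketch: the caller can read the next line unconditionally and simply skip such entries.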
diff --git a/llvm_mode/README.lto.md b/llvm_mode/README.lto.md
index e521ac82..4d643324 100644
--- a/llvm_mode/README.lto.md
+++ b/llvm_mode/README.lto.md
@@ -108,15 +108,12 @@ make install
Just use afl-clang-lto like you did with afl-clang-fast or afl-gcc.
-Also the instrument file listing (AFL_LLVM_INSTRUMENT_FILE -> [README.instrument_file.md](README.instrument_file.md)) and
+Also the instrument file listing (AFL_LLVM_ALLOWLIST/AFL_LLVM_DENYLIST -> [README.instrument_list.md](README.instrument_list.md)) and
laf-intel/compcov (AFL_LLVM_LAF_* -> [README.laf-intel.md](README.laf-intel.md)) work.
-InsTrim (control flow graph instrumentation) is supported and recommended!
- (set `AFL_LLVM_INSTRUMENT=CFG`)
Example:
```
CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar ./configure
-export AFL_LLVM_INSTRUMENT=CFG
make
```
diff --git a/llvm_mode/README.md b/llvm_mode/README.md
index 22088dfd..f23d7150 100644
--- a/llvm_mode/README.md
+++ b/llvm_mode/README.md
@@ -109,7 +109,7 @@ Several options are present to make llvm_mode faster or help it rearrange
the code to make afl-fuzz path discovery easier.
If you need just to instrument specific parts of the code, you can the instrument file list
-which C/C++ files to actually instrument. See [README.instrument_file](README.instrument_file.md)
+which C/C++ files to actually instrument. See [README.instrument_list](README.instrument_list.md)
For splitting memcmp, strncmp, etc. please see [README.laf-intel](README.laf-intel.md)
diff --git a/llvm_mode/afl-clang-fast.c b/llvm_mode/afl-clang-fast.c
index ef99e3f3..f75adf1e 100644
--- a/llvm_mode/afl-clang-fast.c
+++ b/llvm_mode/afl-clang-fast.c
@@ -229,7 +229,8 @@ static void edit_params(u32 argc, char **argv, char **envp) {
if (lto_mode) {
if (getenv("AFL_LLVM_INSTRUMENT_FILE") != NULL ||
- getenv("AFL_LLVM_WHITELIST")) {
+ getenv("AFL_LLVM_WHITELIST") || getenv("AFL_LLVM_ALLOWLIST") ||
+ getenv("AFL_LLVM_DENYLIST") || getenv("AFL_LLVM_BLOCKLIST")) {
cc_params[cc_par_cnt++] = "-Xclang";
cc_params[cc_par_cnt++] = "-load";
@@ -637,9 +638,13 @@ int main(int argc, char **argv, char **envp) {
}
- if ((getenv("AFL_LLVM_INSTRUMENT_FILE") || getenv("AFL_LLVM_WHITELIST")) &&
+ if ((getenv("AFL_LLVM_INSTRUMENT_FILE") != NULL ||
+ getenv("AFL_LLVM_WHITELIST") || getenv("AFL_LLVM_ALLOWLIST") ||
+ getenv("AFL_LLVM_DENYLIST") || getenv("AFL_LLVM_BLOCKLIST")) &&
getenv("AFL_DONT_OPTIMIZE"))
- FATAL("AFL_LLVM_INSTRUMENT_FILE and AFL_DONT_OPTIMIZE cannot be combined");
+ WARNF(
+ "AFL_LLVM_ALLOWLIST/DENYLIST and AFL_DONT_OPTIMIZE cannot be combined "
+ "for file matching, only function matching!");
if (getenv("AFL_LLVM_INSTRIM") || getenv("INSTRIM") ||
getenv("INSTRIM_LIB")) {
@@ -787,15 +792,17 @@ int main(int argc, char **argv, char **envp) {
#if LLVM_VERSION_MAJOR <= 6
instrument_mode = INSTRUMENT_AFL;
#else
- if (getenv("AFL_LLVM_INSTRUMENT_FILE") || getenv("AFL_LLVM_WHITELIST")) {
+ if (getenv("AFL_LLVM_INSTRUMENT_FILE") != NULL ||
+ getenv("AFL_LLVM_WHITELIST") || getenv("AFL_LLVM_ALLOWLIST") ||
+ getenv("AFL_LLVM_DENYLIST") || getenv("AFL_LLVM_BLOCKLIST")) {
instrument_mode = INSTRUMENT_AFL;
WARNF(
"switching to classic instrumentation because "
- "AFL_LLVM_INSTRUMENT_FILE does not work with PCGUARD. Use "
- "-fsanitize-coverage-allowlist=allowlist.txt if you want to use "
- "PCGUARD. Requires llvm 12+. See "
- "https://clang.llvm.org/docs/"
+ "AFL_LLVM_ALLOWLIST/DENYLIST does not work with PCGUARD. Use "
+ "-fsanitize-coverage-allowlist=allowlist.txt or "
+ "-fsanitize-coverage-blocklist=denylist.txt if you want to use "
+ "PCGUARD. Requires llvm 12+. See https://clang.llvm.org/docs/ "
"SanitizerCoverage.html#partially-disabling-instrumentation");
} else
@@ -846,11 +853,14 @@ int main(int argc, char **argv, char **envp) {
"together");
if (instrument_mode == INSTRUMENT_PCGUARD &&
- (getenv("AFL_LLVM_INSTRUMENT_FILE") || getenv("AFL_LLVM_WHITELIST")))
+ (getenv("AFL_LLVM_INSTRUMENT_FILE") != NULL ||
+ getenv("AFL_LLVM_WHITELIST") || getenv("AFL_LLVM_ALLOWLIST") ||
+ getenv("AFL_LLVM_DENYLIST") || getenv("AFL_LLVM_BLOCKLIST")))
FATAL(
"Instrumentation type PCGUARD does not support "
- "AFL_LLVM_INSTRUMENT_FILE! Use "
- "-fsanitize-coverage-allowlist=allowlist.txt instead (requires llvm "
+ "AFL_LLVM_ALLOWLIST/DENYLIST! Use "
+ "-fsanitize-coverage-allowlist=allowlist.txt or "
+ "-fsanitize-coverage-blocklist=denylist.txt instead (requires llvm "
"12+), see "
"https://clang.llvm.org/docs/"
"SanitizerCoverage.html#partially-disabling-instrumentation");
diff --git a/llvm_mode/afl-llvm-common.cc b/llvm_mode/afl-llvm-common.cc
index 9a884ded..0b89c3b4 100644
--- a/llvm_mode/afl-llvm-common.cc
+++ b/llvm_mode/afl-llvm-common.cc
@@ -20,7 +20,10 @@
using namespace llvm;
-static std::list<std::string> myInstrumentList;
+static std::list<std::string> allowListFiles;
+static std::list<std::string> allowListFunctions;
+static std::list<std::string> denyListFiles;
+static std::list<std::string> denyListFunctions;
char *getBBName(const llvm::BasicBlock *BB) {
@@ -87,30 +90,166 @@ bool isIgnoreFunction(const llvm::Function *F) {
void initInstrumentList() {
- char *instrumentListFilename = getenv("AFL_LLVM_INSTRUMENT_FILE");
- if (!instrumentListFilename)
- instrumentListFilename = getenv("AFL_LLVM_WHITELIST");
+ char *allowlist = getenv("AFL_LLVM_ALLOWLIST");
+ if (!allowlist) allowlist = getenv("AFL_LLVM_INSTRUMENT_FILE");
+ if (!allowlist) allowlist = getenv("AFL_LLVM_WHITELIST");
+ char *denylist = getenv("AFL_LLVM_DENYLIST");
+ if (!denylist) denylist = getenv("AFL_LLVM_BLOCKLIST");
- if (instrumentListFilename) {
+ if (allowlist && denylist)
+ FATAL(
+ "You can only specify either AFL_LLVM_ALLOWLIST or AFL_LLVM_DENYLIST "
+ "but not both!");
+
+ if (allowlist) {
std::string line;
std::ifstream fileStream;
- fileStream.open(instrumentListFilename);
- if (!fileStream)
- report_fatal_error("Unable to open AFL_LLVM_INSTRUMENT_FILE");
+ fileStream.open(allowlist);
+ if (!fileStream) report_fatal_error("Unable to open AFL_LLVM_ALLOWLIST");
getline(fileStream, line);
+
while (fileStream) {
- myInstrumentList.push_back(line);
- getline(fileStream, line);
+ int is_file = -1;
+ std::size_t npos;
+ std::string original_line = line;
+
+ line.erase(std::remove_if(line.begin(), line.end(), ::isspace),
+ line.end());
+
+ // remove # and following
+ if ((npos = line.find("#")) != std::string::npos)
+ line = line.substr(0, npos);
+
+ if (line.compare(0, 4, "fun:") == 0) {
+
+ is_file = 0;
+ line = line.substr(4);
+
+ } else if (line.compare(0, 9, "function:") == 0) {
+
+ is_file = 0;
+ line = line.substr(9);
+
+ } else if (line.compare(0, 4, "src:") == 0) {
+
+ is_file = 1;
+ line = line.substr(4);
+
+ } else if (line.compare(0, 7, "source:") == 0) {
+
+ is_file = 1;
+ line = line.substr(7);
+
+ }
+
+ if (line.find(":") != std::string::npos) {
+
+ FATAL("invalid line in AFL_LLVM_ALLOWLIST: %s", original_line.c_str());
+
+ }
+
+ if (line.length() > 0) {
+
+ // if the entry contains / or . it must be a file
+ if (is_file == -1)
+ if (line.find("/") != std::string::npos ||
+ line.find(".") != std::string::npos)
+ is_file = 1;
+ // otherwise it is a function
+
+ if (is_file == 1)
+ allowListFiles.push_back(line);
+ else
+ allowListFunctions.push_back(line);
+
+ }
+ getline(fileStream, line);
}
+ if (debug)
+ SAYF(cMGN "[D] " cRST
+ "loaded allowlist with %zu file and %zu function entries\n",
+ allowListFiles.size(), allowListFunctions.size());
+
}
- if (debug)
- SAYF(cMGN "[D] " cRST "loaded instrument list with %zu entries\n",
- myInstrumentList.size());
+ if (denylist) {
+
+ std::string line;
+ std::ifstream fileStream;
+ fileStream.open(denylist);
+ if (!fileStream) report_fatal_error("Unable to open AFL_LLVM_DENYLIST");
+ getline(fileStream, line);
+
+ while (fileStream) {
+
+ int is_file = -1;
+ std::size_t npos;
+ std::string original_line = line;
+
+ line.erase(std::remove_if(line.begin(), line.end(), ::isspace),
+ line.end());
+
+ // remove # and following
+ if ((npos = line.find("#")) != std::string::npos)
+ line = line.substr(0, npos);
+
+ if (line.compare(0, 4, "fun:") == 0) {
+
+ is_file = 0;
+ line = line.substr(4);
+
+ } else if (line.compare(0, 9, "function:") == 0) {
+
+ is_file = 0;
+ line = line.substr(9);
+
+ } else if (line.compare(0, 4, "src:") == 0) {
+
+ is_file = 1;
+ line = line.substr(4);
+
+ } else if (line.compare(0, 7, "source:") == 0) {
+
+ is_file = 1;
+ line = line.substr(7);
+
+ }
+
+ if (line.find(":") != std::string::npos) {
+
+ FATAL("invalid line in AFL_LLVM_DENYLIST: %s", original_line.c_str());
+
+ }
+
+ if (line.length() > 0) {
+
+ // if the entry contains / or . it must be a file
+ if (is_file == -1)
+ if (line.find("/") != std::string::npos ||
+ line.find(".") != std::string::npos)
+ is_file = 1;
+ // otherwise it is a function
+
+ if (is_file == 1)
+ denyListFiles.push_back(line);
+ else
+ denyListFunctions.push_back(line);
+
+ }
+ getline(fileStream, line);
+
+ }
+
+ if (debug)
+ SAYF(cMGN "[D] " cRST
+ "loaded denylist with %zu file and %zu function entries\n",
+ denyListFiles.size(), denyListFunctions.size());
+
+ }
}
@@ -121,42 +260,173 @@ bool isInInstrumentList(llvm::Function *F) {
if (!F->size() || isIgnoreFunction(F)) return false;
// if we do not have a the instrument file list return true
- if (myInstrumentList.empty()) return true;
+ if (!allowListFiles.empty() || !allowListFunctions.empty()) {
+
+ if (!allowListFunctions.empty()) {
+
+ std::string instFunction = F->getName().str();
+
+ for (std::list<std::string>::iterator it = allowListFunctions.begin();
+ it != allowListFunctions.end(); ++it) {
+
+ /* We don't check for function name equality here.
+ * Instead we check that the actual function name
+ * ends in the function entry specified in the list.
+ * We also allow UNIX-style pattern
+ * matching */
- // let's try to get the filename for the function
- auto bb = &F->getEntryBlock();
- BasicBlock::iterator IP = bb->getFirstInsertionPt();
- IRBuilder<> IRB(&(*IP));
- DebugLoc Loc = IP->getDebugLoc();
+ if (instFunction.length() >= it->length()) {
+
+ if (fnmatch(("*" + *it).c_str(), instFunction.c_str(), 0) == 0) {
+
+ if (debug)
+ SAYF(cMGN "[D] " cRST
+ "Function %s is in the allow function list, "
+ "instrumenting ... \n",
+ instFunction.c_str());
+ return true;
+
+ }
+
+ }
+
+ }
+
+ }
+
+ if (!allowListFiles.empty()) {
+
+ // let's try to get the filename for the function
+ auto bb = &F->getEntryBlock();
+ BasicBlock::iterator IP = bb->getFirstInsertionPt();
+ IRBuilder<> IRB(&(*IP));
+ DebugLoc Loc = IP->getDebugLoc();
#if LLVM_VERSION_MAJOR >= 4 || \
(LLVM_VERSION_MAJOR == 3 && LLVM_VERSION_MINOR >= 7)
- if (Loc) {
+ if (Loc) {
+
+ DILocation *cDILoc = dyn_cast<DILocation>(Loc.getAsMDNode());
+
+ unsigned int instLine = cDILoc->getLine();
+ StringRef instFilename = cDILoc->getFilename();
- DILocation *cDILoc = dyn_cast<DILocation>(Loc.getAsMDNode());
+ if (instFilename.str().empty()) {
- unsigned int instLine = cDILoc->getLine();
- StringRef instFilename = cDILoc->getFilename();
+ /* If the original location is empty, try using the inlined location
+ */
+ DILocation *oDILoc = cDILoc->getInlinedAt();
+ if (oDILoc) {
+
+ instFilename = oDILoc->getFilename();
+ instLine = oDILoc->getLine();
+
+ }
+
+ }
- if (instFilename.str().empty()) {
+ /* Continue only if we know where we actually are */
+ if (!instFilename.str().empty()) {
- /* If the original location is empty, try using the inlined location
- */
- DILocation *oDILoc = cDILoc->getInlinedAt();
- if (oDILoc) {
+ for (std::list<std::string>::iterator it = allowListFiles.begin();
+ it != allowListFiles.end(); ++it) {
- instFilename = oDILoc->getFilename();
- instLine = oDILoc->getLine();
+ /* We don't check for filename equality here because
+ * filenames might actually be full paths. Instead we
+ * check that the actual filename ends in the filename
+ * specified in the list. We also allow UNIX-style pattern
+ * matching */
+
+ if (instFilename.str().length() >= it->length()) {
+
+ if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) ==
+ 0) {
+
+ if (debug)
+ SAYF(cMGN "[D] " cRST
+ "Function %s is in the allowlist (%s), "
+ "instrumenting ... \n",
+ F->getName().str().c_str(), instFilename.str().c_str());
+ return true;
+
+ }
+
+ }
+
+ }
+
+ }
+
+ }
+
+ }
+
+#else
+ if (!Loc.isUnknown()) {
+
+ DILocation cDILoc(Loc.getAsMDNode(F->getContext()));
+
+ unsigned int instLine = cDILoc.getLineNumber();
+ StringRef instFilename = cDILoc.getFilename();
+
+ (void)instLine;
+ /* Continue only if we know where we actually are */
+ if (!instFilename.str().empty()) {
+
+ for (std::list<std::string>::iterator it = allowListFiles.begin();
+ it != allowListFiles.end(); ++it) {
+
+ /* We don't check for filename equality here because
+ * filenames might actually be full paths. Instead we
+ * check that the actual filename ends in the filename
+ * specified in the list. We also allow UNIX-style pattern
+ * matching */
+
+ if (instFilename.str().length() >= it->length()) {
+
+ if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) ==
+ 0) {
+
+ return true;
+
+ }
+
+ }
+
+ }
+
+ }
}
}
- /* Continue only if we know where we actually are */
- if (!instFilename.str().empty()) {
+#endif
+ else {
+
+ // we could not find out the location. in this case we say it is not
+ // in the the instrument file list
+ if (!be_quiet)
+ WARNF(
+ "No debug information found for function %s, will not be "
+ "instrumented (recompile with -g -O[1-3]).",
+ F->getName().str().c_str());
+ return false;
+
+ }
+
+ return false;
+
+ }
+
+ if (!denyListFiles.empty() || !denyListFunctions.empty()) {
+
+ if (!denyListFunctions.empty()) {
- for (std::list<std::string>::iterator it = myInstrumentList.begin();
- it != myInstrumentList.end(); ++it) {
+ std::string instFunction = F->getName().str();
+
+ for (std::list<std::string>::iterator it = denyListFunctions.begin();
+ it != denyListFunctions.end(); ++it) {
/* We don't check for filename equality here because
* filenames might actually be full paths. Instead we
@@ -164,16 +434,16 @@ bool isInInstrumentList(llvm::Function *F) {
* specified in the list. We also allow UNIX-style pattern
* matching */
- if (instFilename.str().length() >= it->length()) {
+ if (instFunction.length() >= it->length()) {
- if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) ==
- 0) {
+ if (fnmatch(("*" + *it).c_str(), instFunction.c_str(), 0) == 0) {
if (debug)
SAYF(cMGN "[D] " cRST
- "Function %s is in the list (%s), instrumenting ... \n",
- F->getName().str().c_str(), instFilename.str().c_str());
- return true;
+ "Function %s is in the deny function list, "
+ "not instrumenting ... \n",
+ instFunction.c_str());
+ return false;
}
@@ -183,35 +453,64 @@ bool isInInstrumentList(llvm::Function *F) {
}
- }
+ if (!denyListFiles.empty()) {
-#else
- if (!Loc.isUnknown()) {
+ // let's try to get the filename for the function
+ auto bb = &F->getEntryBlock();
+ BasicBlock::iterator IP = bb->getFirstInsertionPt();
+ IRBuilder<> IRB(&(*IP));
+ DebugLoc Loc = IP->getDebugLoc();
- DILocation cDILoc(Loc.getAsMDNode(F->getContext()));
+#if LLVM_VERSION_MAJOR >= 4 || \
+ (LLVM_VERSION_MAJOR == 3 && LLVM_VERSION_MINOR >= 7)
+ if (Loc) {
- unsigned int instLine = cDILoc.getLineNumber();
- StringRef instFilename = cDILoc.getFilename();
+ DILocation *cDILoc = dyn_cast<DILocation>(Loc.getAsMDNode());
- (void)instLine;
- /* Continue only if we know where we actually are */
- if (!instFilename.str().empty()) {
+ unsigned int instLine = cDILoc->getLine();
+ StringRef instFilename = cDILoc->getFilename();
- for (std::list<std::string>::iterator it = myInstrumentList.begin();
- it != myInstrumentList.end(); ++it) {
+ if (instFilename.str().empty()) {
- /* We don't check for filename equality here because
- * filenames might actually be full paths. Instead we
- * check that the actual filename ends in the filename
- * specified in the list. We also allow UNIX-style pattern
- * matching */
+ /* If the original location is empty, try using the inlined location
+ */
+ DILocation *oDILoc = cDILoc->getInlinedAt();
+ if (oDILoc) {
- if (instFilename.str().length() >= it->length()) {
+ instFilename = oDILoc->getFilename();
+ instLine = oDILoc->getLine();
- if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) ==
- 0) {
+ }
- return true;
+ }
+
+ /* Continue only if we know where we actually are */
+ if (!instFilename.str().empty()) {
+
+ for (std::list<std::string>::iterator it = denyListFiles.begin();
+ it != denyListFiles.end(); ++it) {
+
+ /* We don't check for filename equality here because
+ * filenames might actually be full paths. Instead we
+ * check that the actual filename ends in the filename
+ * specified in the list. We also allow UNIX-style pattern
+ * matching */
+
+ if (instFilename.str().length() >= it->length()) {
+
+ if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) ==
+ 0) {
+
+ if (debug)
+ SAYF(cMGN "[D] " cRST
+ "Function %s is in the denylist (%s), not "
+ "instrumenting ... \n",
+ F->getName().str().c_str(), instFilename.str().c_str());
+ return false;
+
+ }
+
+ }
}
@@ -221,23 +520,65 @@ bool isInInstrumentList(llvm::Function *F) {
}
- }
+#else
+ if (!Loc.isUnknown()) {
+
+ DILocation cDILoc(Loc.getAsMDNode(F->getContext()));
+
+ unsigned int instLine = cDILoc.getLineNumber();
+ StringRef instFilename = cDILoc.getFilename();
+
+ (void)instLine;
+ /* Continue only if we know where we actually are */
+ if (!instFilename.str().empty()) {
+
+ for (std::list<std::string>::iterator it = denyListFiles.begin();
+ it != denyListFiles.end(); ++it) {
+
+ /* We don't check for filename equality here because
+ * filenames might actually be full paths. Instead we
+ * check that the actual filename ends in the filename
+ * specified in the list. We also allow UNIX-style pattern
+ * matching */
+
+ if (instFilename.str().length() >= it->length()) {
+
+ if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) ==
+ 0) {
+
+ return false;
+
+ }
+
+ }
+
+ }
+
+ }
+
+ }
+
+ }
#endif
- else {
-
- // we could not find out the location. in this case we say it is not
- // in the the instrument file list
- if (!be_quiet)
- WARNF(
- "No debug information found for function %s, will not be "
- "instrumented (recompile with -g -O[1-3]).",
- F->getName().str().c_str());
- return false;
+ else {
+
+ // we could not find out the location. in this case we say it is not
+ // in the deny list, so the function will be instrumented
+ if (!be_quiet)
+ WARNF(
+ "No debug information found for function %s, will be "
+ "instrumented (recompile with -g -O[1-3]).",
+ F->getName().str().c_str());
+ return true;
+
+ }
+
+ return true;
}
- return false;
+ return true; // not reached
}
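Review note: the deny-list hunks above repeat the same suffix-style match in both LLVM version branches. The following is a minimal standalone sketch of that logic, not code from the patch. The helper names `matchesListEntry` and `shouldInstrument` are hypothetical, and treating an empty filename as "instrument" is an assumption based on the `return true` fall-through shown in the hunk.

```cpp
#include <cassert>
#include <fnmatch.h>
#include <list>
#include <string>

// Hypothetical helper (not in the patch): true if `filename` ends in `entry`.
// Prepending "*" turns the entry into a suffix pattern, so a list entry such
// as "bar.c" also matches a full path such as "/src/foo/bar.c". The entry
// itself may contain fnmatch()-style wildcards.
static bool matchesListEntry(const std::string &filename,
                             const std::string &entry) {

  // Same length guard as in the patch: a filename shorter than the entry is
  // never considered a match.
  if (filename.length() < entry.length()) return false;
  return fnmatch(("*" + entry).c_str(), filename.c_str(), 0) == 0;

}

// Hypothetical helper (not in the patch): deny-list decision for one file.
// Returns false (do not instrument) on the first matching entry.
static bool shouldInstrument(const std::string &           filename,
                             const std::list<std::string> &denyListFiles) {

  // Assumption: an unknown location is not on the deny list.
  if (filename.empty()) return true;

  for (const auto &entry : denyListFiles)
    if (matchesListEntry(filename, entry)) return false;

  return true;

}

// Small usage example exercising both helpers.
static void checkExamples() {

  const std::list<std::string> deny = {"bar.c", "lib/*.c"};
  assert(!shouldInstrument("/src/foo/bar.c", deny));
  assert(!shouldInstrument("src/lib/x.c", deny));
  assert(shouldInstrument("/src/foo/main.c", deny));

}
```

Because `fnmatch()` is called with flags `0`, the leading `*` also matches `/`, which is what lets a bare filename match a full path. One consequence of the length guard is that a wildcard entry longer than the filename it could otherwise match is skipped.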
diff --git a/llvm_mode/afl-llvm-lto-instrumentlist.so.cc b/llvm_mode/afl-llvm-lto-instrumentlist.so.cc
index ab7c0c58..a7331444 100644
--- a/llvm_mode/afl-llvm-lto-instrumentlist.so.cc
+++ b/llvm_mode/afl-llvm-lto-instrumentlist.so.cc
@@ -59,39 +59,9 @@ class AFLcheckIfInstrument : public ModulePass {
static char ID;
AFLcheckIfInstrument() : ModulePass(ID) {
- int entries = 0;
-
if (getenv("AFL_DEBUG")) debug = 1;
- char *instrumentListFilename = getenv("AFL_LLVM_INSTRUMENT_FILE");
- if (!instrumentListFilename)
- instrumentListFilename = getenv("AFL_LLVM_WHITELIST");
- if (instrumentListFilename) {
-
- std::string line;
- std::ifstream fileStream;
- fileStream.open(instrumentListFilename);
- if (!fileStream)
- report_fatal_error("Unable to open AFL_LLVM_INSTRUMENT_FILE");
- getline(fileStream, line);
- while (fileStream) {
-
- myInstrumentList.push_back(line);
- getline(fileStream, line);
- entries++;
-
- }
-
- } else
-
- PFATAL(
- "afl-llvm-lto-instrumentlist.so loaded without "
- "AFL_LLVM_INSTRUMENT_FILE?!");
-
- if (debug)
- SAYF(cMGN "[D] " cRST
- "loaded the instrument file list %s with %d entries\n",
- instrumentListFilename, entries);
+ initInstrumentList();
}
@@ -129,120 +99,28 @@ bool AFLcheckIfInstrument::runOnModule(Module &M) {
for (auto &F : M) {
if (F.size() < 1) continue;
- // fprintf(stderr, "F:%s\n", F.getName().str().c_str());
- if (isIgnoreFunction(&F)) continue;
-
- BasicBlock::iterator IP = F.getEntryBlock().getFirstInsertionPt();
- IRBuilder<> IRB(&(*IP));
-
- if (!myInstrumentList.empty()) {
-
- bool instrumentFunction = false;
-
- /* Get the current location using debug information.
- * For now, just instrument the block if we are not able
- * to determine our location. */
- DebugLoc Loc = IP->getDebugLoc();
- if (Loc) {
-
- DILocation *cDILoc = dyn_cast<DILocation>(Loc.getAsMDNode());
-
- unsigned int instLine = cDILoc->getLine();
- StringRef instFilename = cDILoc->getFilename();
-
- if (instFilename.str().empty()) {
-
- /* If the original location is empty, try using the inlined location
- */
- DILocation *oDILoc = cDILoc->getInlinedAt();
- if (oDILoc) {
-
- instFilename = oDILoc->getFilename();
- instLine = oDILoc->getLine();
-
- }
-
- if (instFilename.str().empty()) {
- if (!be_quiet)
- WARNF(
- "Function %s has no source file name information and will "
- "not be instrumented.",
- F.getName().str().c_str());
- continue;
-
- }
-
- }
-
- //(void)instLine;
-
- fprintf(stderr, "xxx %s %s\n", F.getName().str().c_str(),
- instFilename.str().c_str());
- if (debug)
- SAYF(cMGN "[D] " cRST "function %s is in file %s\n",
- F.getName().str().c_str(), instFilename.str().c_str());
-
- for (std::list<std::string>::iterator it = myInstrumentList.begin();
- it != myInstrumentList.end(); ++it) {
-
- /* We don't check for filename equality here because
- * filenames might actually be full paths. Instead we
- * check that the actual filename ends in the filename
- * specified in the list. */
- if (instFilename.str().length() >= it->length()) {
-
- if (fnmatch(("*" + *it).c_str(), instFilename.str().c_str(), 0) ==
- 0) {
-
- instrumentFunction = true;
- break;
-
- }
-
- }
-
- }
-
- } else {
-
- if (!be_quiet)
- WARNF(
- "No debug information found for function %s, recompile with -g "
- "-O[1-3]",
- F.getName().str().c_str());
- continue;
-
- }
-
- /* Either we couldn't figure out our location or the location is
- * not the instrument file listed, so we skip instrumentation.
- * We do this by renaming the function. */
- if (instrumentFunction == true) {
-
- if (debug)
- SAYF(cMGN "[D] " cRST "function %s is in the instrument file list\n",
- F.getName().str().c_str());
-
- } else {
-
- if (debug)
- SAYF(cMGN "[D] " cRST
- "function %s is NOT in the instrument file list\n",
- F.getName().str().c_str());
+ // fprintf(stderr, "F:%s\n", F.getName().str().c_str());
- auto & Ctx = F.getContext();
- AttributeList Attrs = F.getAttributes();
- AttrBuilder NewAttrs;
- NewAttrs.addAttribute("skipinstrument");
- F.setAttributes(
- Attrs.addAttributes(Ctx, AttributeList::FunctionIndex, NewAttrs));
+ if (isInInstrumentList(&F)) {
- }
+ if (debug)
+ SAYF(cMGN "[D] " cRST "function %s is in the instrument file list\n",
+ F.getName().str().c_str());
} else {
- PFATAL("InstrumentList is empty");
+ if (debug)
+ SAYF(cMGN "[D] " cRST
+ "function %s is NOT in the instrument file list\n",
+ F.getName().str().c_str());
+
+ auto & Ctx = F.getContext();
+ AttributeList Attrs = F.getAttributes();
+ AttrBuilder NewAttrs;
+ NewAttrs.addAttribute("skipinstrument");
+ F.setAttributes(
+ Attrs.addAttributes(Ctx, AttributeList::FunctionIndex, NewAttrs));
}