-rw-r--r--  docs/Changelog.md                              2
-rw-r--r--  docs/FAQ.md                                   12
-rw-r--r--  docs/env_variables.md                         35
-rw-r--r--  include/envs.h                                 1
-rw-r--r--  llvm_mode/README.lto.md                        6
-rw-r--r--  llvm_mode/afl-clang-fast.c                     2
-rw-r--r--  llvm_mode/afl-llvm-lto-instrumentation.so.cc  18
7 files changed, 55 insertions, 21 deletions
diff --git a/docs/Changelog.md b/docs/Changelog.md
index dcaf64a7..14d00a43 100644
--- a/docs/Changelog.md
+++ b/docs/Changelog.md
@@ -26,6 +26,8 @@ sending a mail to <afl-users+subscribe@googlegroups.com>.
      - LTO: autodictionary mode is a default
      - LTO: instrim instrumentation disabled, only classic support used
             as it is always better
+     - LTO: env var AFL_LLVM_DOCUMENT_IDS=file will document which edge ID
+            was given to which function during compilation
      - setting AFL_LLVM_LAF_SPLIT_FLOATS now activates
        AFL_LLVM_LAF_SPLIT_COMPARES
   - added honggfuzz mangle as a custom mutator in custom_mutators/honggfuzz
diff --git a/docs/FAQ.md b/docs/FAQ.md
index e09385a8..b09a16ae 100644
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -95,12 +95,13 @@ afl-clang-fast PCGUARD and afl-clang-lto LTO instrumentation!
 
   2. Second step: Find the responsible function.
 
-     a) For LTO instrumented binaries just disassemble or decompile the target
-        and look which edge is writing to that edge ID. Ghidra is a good tool
-        for this: [https://ghidra-sre.org/](https://ghidra-sre.org/)
+     a) For LTO instrumented binaries this can be documented during compile
+        time, just set `export AFL_LLVM_DOCUMENT_IDS=/path/to/afile`.
+        This file will have one assigned edge ID and the corresponding function
+        per line.
 
-     b) For PCGUARD instrumented binaries it is more difficult. Here you can
-        either modify the __sanitizer_cov_trace_pc_guard function in
+     b) For PCGUARD instrumented binaries it is much more difficult. Here you
+        can either modify the __sanitizer_cov_trace_pc_guard function in
         llvm_mode/afl-llvm-rt.o.c to write a backtrace to a file if the ID in
         __afl_area_ptr[*guard] is one of the unstable edge IDs. Then recompile
         and reinstall llvm_mode and rebuild your target. Run the recompiled
@@ -121,4 +122,3 @@ afl-clang-fast PCGUARD and afl-clang-lto LTO instrumentation!
   4. Fourth step: recompile the target
 
      Recompile, fuzz it, be happy :)
-
diff --git a/docs/env_variables.md b/docs/env_variables.md
index 87344331..4c0d2db7 100644
--- a/docs/env_variables.md
+++ b/docs/env_variables.md
@@ -121,18 +121,16 @@ Then there are a few specific features that are only available in llvm_mode:
     built if LLVM 11 or newer is used.
 
    - AFL_LLVM_INSTRUMENT=CFG will use Control Flow Graph instrumentation.
-     (recommended)
-
-   - AFL_LLVM_LTO_AUTODICTIONARY will generate a dictionary in the target
-     binary based on string compare and memory compare functions.
-     afl-fuzz will automatically get these transmitted when starting to
-     fuzz.
+     (not recommended!)
 
     None of the following options are necessary to be used and are rather for
     manual use (which only ever the author of this LTO implementation will use).
     These are used if several seperated instrumentation are performed which
     are then later combined.
 
+   - AFL_LLVM_DOCUMENT_IDS=file will document to a file which edge ID was given
+     to which function. This helps to identify functions with variable bytes
+     or which functions were touched by an input.
    - AFL_LLVM_MAP_ADDR sets the fixed map address to a different address than
      the default 0x10000. A value of 0 or empty sets the map address to be
      dynamic (the original afl way, which is slower)
@@ -254,15 +252,6 @@ checks or alter some of the more exotic semantics of the tool:
     useful if you can't change the defaults (e.g., no root access to the
     system) and are OK with some performance loss.
 
-  - Setting AFL_NO_FORKSRV disables the forkserver optimization, reverting to
-    fork + execve() call for every tested input. This is useful mostly when
-    working with unruly libraries that create threads or do other crazy
-    things when initializing (before the instrumentation has a chance to run).
-
-    Note that this setting inhibits some of the user-friendly diagnostics
-    normally done when starting up the forkserver and causes a pretty
-    significant performance drop.
-
   - AFL_EXIT_WHEN_DONE causes afl-fuzz to terminate when all existing paths
     have been fuzzed and there were no new finds for a while. This would be
     normally indicated by the cycle counter in the UI turning green. May be
@@ -338,6 +327,13 @@ checks or alter some of the more exotic semantics of the tool:
 
   - In QEMU mode (-Q), AFL_PATH will be searched for afl-qemu-trace.
 
+  - Setting AFL_CYCLE_SCHEDULES will switch to a different schedule everytime
+    a cycle is finished.
+
+  - Setting AFL_EXPAND_HAVOC_NOW will start in the extended havoc mode that
+    includes costly mutations. afl-fuzz automatically enables this mode when
+    deemed useful otherwise.
+
   - Setting AFL_PRELOAD causes AFL to set LD_PRELOAD for the target binary
     without disrupting the afl-fuzz process itself. This is useful, among other
     things, for bootstrapping libdislocator.so.
@@ -365,6 +361,15 @@ checks or alter some of the more exotic semantics of the tool:
     for an existing out folder, even if a different `-i` was provided.
     Without this setting, afl-fuzz will refuse execution for a long-fuzzed out dir.
 
+  - Setting AFL_NO_FORKSRV disables the forkserver optimization, reverting to
+    fork + execve() call for every tested input. This is useful mostly when
+    working with unruly libraries that create threads or do other crazy
+    things when initializing (before the instrumentation has a chance to run).
+
+    Note that this setting inhibits some of the user-friendly diagnostics
+    normally done when starting up the forkserver and causes a pretty
+    significant performance drop.
+
   - Outdated environment variables that are that not supported anymore:
     AFL_DEFER_FORKSRV
     AFL_PERSISTENT
diff --git a/include/envs.h b/include/envs.h
index c1c7d387..7153ed47 100644
--- a/include/envs.h
+++ b/include/envs.h
@@ -65,6 +65,7 @@ static char *afl_environment_variables[] = {
     "AFL_LLVM_CMPLOG",
     "AFL_LLVM_INSTRIM",
     "AFL_LLVM_CTX",
+    "AFL_LLVM_DOCUMENT_IDS",
     "AFL_LLVM_INSTRUMENT",
     "AFL_LLVM_INSTRIM_LOOPHEAD",
     "AFL_LLVM_LTO_AUTODICTIONARY",
diff --git a/llvm_mode/README.lto.md b/llvm_mode/README.lto.md
index a4c969b9..e521ac82 100644
--- a/llvm_mode/README.lto.md
+++ b/llvm_mode/README.lto.md
@@ -140,6 +140,12 @@ to be dynamic - the original afl way, which is slower).
 AFL_LLVM_MAP_DYNAMIC can be set so the shared memory address is dynamic (which
 is safer but also slower).
 
+## Document edge IDs
+
+Setting `export AFL_LLVM_DOCUMENT_IDS=file` will document to a file which edge
+ID was given to which function. This helps to identify functions with variable
+bytes or which functions were touched by an input.
+
 ## Solving difficult targets
 
 Some targets are difficult because the configure script does unusual stuff that
diff --git a/llvm_mode/afl-clang-fast.c b/llvm_mode/afl-clang-fast.c
index dca11bf3..a2550d2c 100644
--- a/llvm_mode/afl-clang-fast.c
+++ b/llvm_mode/afl-clang-fast.c
@@ -890,6 +890,8 @@ int main(int argc, char **argv, char **envp) {
         "AFL_NO_BUILTIN: compile for use with libtokencap.so\n"
         "AFL_PATH: path to instrumenting pass and runtime "
         "(afl-llvm-rt.*o)\n"
+        "AFL_LLVM_DOCUMENT_IDS: document edge IDs given to which function (LTO "
+        "only)\n"
         "AFL_QUIET: suppress verbose output\n"
         "AFL_USE_ASAN: activate address sanitizer\n"
         "AFL_USE_CFISAN: activate control flow sanitizer\n"
diff --git a/llvm_mode/afl-llvm-lto-instrumentation.so.cc b/llvm_mode/afl-llvm-lto-instrumentation.so.cc
index 3c1d3565..46a97e54 100644
--- a/llvm_mode/afl-llvm-lto-instrumentation.so.cc
+++ b/llvm_mode/afl-llvm-lto-instrumentation.so.cc
@@ -103,6 +103,7 @@ bool AFLLTOPass::runOnModule(Module &M) {
   std::vector<CallInst *>          calls;
   DenseMap<Value *, std::string *> valueMap;
   char *                           ptr;
+  FILE *                           documentFile = NULL;
 
   IntegerType *Int8Ty = IntegerType::getInt8Ty(C);
   IntegerType *Int32Ty = IntegerType::getInt32Ty(C);
@@ -120,6 +121,13 @@ bool AFLLTOPass::runOnModule(Module &M) {
 
     be_quiet = 1;
 
+  if ((ptr = getenv("AFL_LLVM_DOCUMENT_IDS")) != NULL) {
+
+    if ((documentFile = fopen(ptr, "a")) == NULL)
+      WARNF("Cannot access document file %s", ptr);
+
+  }
+
   if (getenv("AFL_LLVM_MAP_DYNAMIC")) map_addr = 0;
 
   if (getenv("AFL_LLVM_INSTRIM_SKIPSINGLEBLOCK") ||
@@ -579,6 +587,14 @@ bool AFLLTOPass::runOnModule(Module &M) {
 
           }
 
+          if (documentFile) {
+
+            fprintf(documentFile, "%s %u\n",
+                    origBB->getParent()->getName().str().c_str(),
+                    afl_global_id);
+
+          }
+
           BasicBlock::iterator IP = newBB->getFirstInsertionPt();
           IRBuilder<>          IRB(&(*IP));
 
@@ -632,6 +648,8 @@ bool AFLLTOPass::runOnModule(Module &M) {
 
     }
 
+    if (documentFile) fclose(documentFile);
+
   }
 
   // save highest location ID to global variable
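
Note on using the generated file: the patch above writes one `"%s %u\n"` record per instrumented edge, that is, the function name first and the edge ID second. The following is a minimal sketch, not part of the patch, of how such a file could be searched for an unstable edge ID. The helper name `find_function_for_edge` and the buffer sizes are assumptions made for illustration, and the sketch assumes function names contain no whitespace and that lines fit in the read buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of AFL++: scan a file written via
 * AFL_LLVM_DOCUMENT_IDS for a given edge ID.
 * Assumes each line is "<function name> <edge id>", matching the
 * fprintf(documentFile, "%s %u\n", ...) call in the patch.
 * Returns 1 and copies the name into `out` if the ID is found,
 * 0 if it is not found, and -1 if the file cannot be opened. */
int find_function_for_edge(const char *path, unsigned int edge_id,
                           char *out, size_t out_len) {

  char         line[1024];
  char         name[512];
  unsigned int id;
  int          found = 0;
  FILE *       f;

  assert(out != NULL && out_len > 0);

  if ((f = fopen(path, "r")) == NULL) return -1;

  while (fgets(line, sizeof(line), f) != NULL) {

    /* skip lines that do not match the expected two-field layout */
    if (sscanf(line, "%511s %u", name, &id) != 2) continue;

    if (id == edge_id) {

      /* The pass opens the file in append mode, so entries from an
       * earlier build may precede the current ones; keep the last match. */
      strncpy(out, name, out_len - 1);
      out[out_len - 1] = '\0';
      found = 1;

    }

  }

  fclose(f);
  return found;

}
```

A call such as `find_function_for_edge("/path/to/afile", id, buf, sizeof(buf))` would then fill `buf` with the function that owns edge `id`. Because the pass opens the file with mode `"a"`, deleting the file before a rebuild is one way to avoid stale records from previous compilations.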