7 files changed, 288 insertions, 15 deletions
diff --git a/llvm_mode/README.ctx.md b/llvm_mode/README.ctx.md
new file mode 100644
index 00000000..14255313
--- /dev/null
+++ b/llvm_mode/README.ctx.md
@@ -0,0 +1,22 @@
+# AFL Context Sensitive Branch Coverage
+
+## What is this?
+
+This is an LLVM-based implementation of context sensitive branch coverage.
+
+Basically every function gets its own ID and that ID is combined with the
+edges of the called functions.
+
+So if both function A and function B call a function C, the coverage
+collected in C will be different.
+
+In math the coverage is collected as follows:
+`map[current_location_ID ^ (previous_location_ID >> 1) ^ previous_caller_ID] += 1`
+
+## Usage
+
+Set the `AFL_LLVM_INSTRUMENT=CTX` or `AFL_LLVM_CTX=1` environment variable.
+
+It is highly recommended to increase the MAP_SIZE_POW2 definition in
+config.h to at least 18 and maybe up to 20 for this as otherwise too
+many map collisions occur.
diff --git a/llvm_mode/README.md b/llvm_mode/README.md
index e6c47c9c..607350fb 100644
--- a/llvm_mode/README.md
+++ b/llvm_mode/README.md
@@ -92,13 +92,33 @@ which C/C++ files to actually instrument. See [README.whitelist](README.whitelis
 
 For splitting memcmp, strncmp, etc. please see [README.laf-intel](README.laf-intel.md)
 
-Then there is an optimized instrumentation strategy that uses CFGs and
-markers to just instrument what is needed. This increases speed by 20-25%
-however has a lower path discovery.
-If you want to use this, set AFL_LLVM_INSTRIM=1
+Then there are different ways of instrumenting the target:
+
+1. There is an optimized instrumentation strategy that uses CFGs and
+markers to just instrument what is needed. This increases speed by 10-15%
+without any disadvantages.
+If you want to use this, set AFL_LLVM_INSTRUMENT=CFG or AFL_LLVM_INSTRIM=1.
 See [README.instrim](README.instrim.md)
 
-A new instrumentation called CmpLog is also available as an alternative to
+2. An even better instrumentation strategy uses LTO and link time
+instrumentation. Note that not all targets can compile in this mode; however,
+if it works, it is the best option you can use.
+Simply use afl-clang-lto/afl-clang-lto++ to use this option.
+See [README.lto](README.lto.md)
+
+3. Alternatively you can choose a completely different coverage method:
+
+3a. N-GRAM coverage - which combines the previously visited edges with the
+current one. This explodes the map but on the other hand has proven to be
+effective for fuzzing.
+See [README.ngram](README.ngram.md)
+
+3b. Context sensitive coverage - which combines the visited edges with an
+individual caller ID (the function that called the current one).
+See [README.ctx](README.ctx.md)
+
+Then - in addition to one of the instrumentation options above - there is
+a very effective new instrumentation option called CmpLog as an alternative to
 laf-intel that allow AFL++ to apply mutations similar to Redqueen.
 See [README.cmplog](README.cmplog.md)
 
@@ -109,12 +129,18 @@ is not optimal and was only fixed in llvm 9.
 You can set this with AFL_LLVM_NOT_ZERO=1
 See [README.neverzero](README.neverzero.md)
 
-## 4) Gotchas, feedback, bugs
+## 4) Snapshot feature
+
+To speed up fuzzing you can use a Linux loadable kernel module which enables
+a snapshot feature.
+See [README.snapshot](README.snapshot.md)
+
+## 5) Gotchas, feedback, bugs
 
 This is an early-stage mechanism, so field reports are welcome. You can send bug
 reports to <afl-users@googlegroups.com>.
 
-## 5) Bonus feature #1: deferred initialization
+## 6) Bonus feature #1: deferred initialization
 
 AFL tries to optimize performance by executing the targeted binary just once,
 stopping it just before main(), and then cloning this "master" process to get
@@ -162,7 +188,7 @@ will keep working normally when compiled with a tool other than afl-clang-fast.
 Finally, recompile the program with afl-clang-fast (afl-gcc or afl-clang will
 *not* generate a deferred-initialization binary) - and you should be all set!
 
-## 6) Bonus feature #2: persistent mode
+## 7) Bonus feature #2: persistent mode
 
 Some libraries provide APIs that are stateless, or whose state can be reset in
 between processing different input files. When such a reset is performed, a
diff --git a/llvm_mode/README.ngram.md b/llvm_mode/README.ngram.md
index 3540ada0..de3ba432 100644
--- a/llvm_mode/README.ngram.md
+++ b/llvm_mode/README.ngram.md
@@ -13,9 +13,16 @@ is built on top of AFL's QEMU mode.
 This is essentially a port that uses LLVM vectorized instructions to achieve
 the same results when compiling source code.
 
+In math the branch coverage is performed as follows:
+`map[current_location ^ prev_location[0] >> 1 ^ prev_location[1] >> 1 ^ ... up to n-1] += 1`
+
 ## Usage
 
 The size of `n` (i.e., the number of branches to remember) is an option
 that is specified either in the `AFL_LLVM_INSTRUMENT=NGRAM-{value}` or the
 `AFL_LLVM_NGRAM_SIZE` environment variable.
 Good values are 2, 4 or 8, valid are 2-16.
+
+It is highly recommended to increase the MAP_SIZE_POW2 definition in
+config.h to at least 18 and maybe up to 20 for this as otherwise too
+many map collisions occur.
diff --git a/llvm_mode/README.snapshot.md b/llvm_mode/README.snapshot.md
new file mode 100644
index 00000000..6bf76b3d
--- /dev/null
+++ b/llvm_mode/README.snapshot.md
@@ -0,0 +1,12 @@
+# AFL++ snapshot feature
+
+Snapshot is a mechanism that takes a snapshot of a process and then restores
+its state, which is faster than forking it again.
+
+All targets compiled with llvm_mode are automatically enabled for the
+snapshot feature.
+
+To use the snapshot feature for fuzzing, compile and load this kernel
+module: [https://github.com/AFLplusplus/AFL-Snapshot-LKM](https://github.com/AFLplusplus/AFL-Snapshot-LKM)
+
+Note that it has little value for persistent (__AFL_LOOP) fuzzing.
diff --git a/llvm_mode/afl-clang-fast.c b/llvm_mode/afl-clang-fast.c
index 0e388cf4..657d1a84 100644
--- a/llvm_mode/afl-clang-fast.c
+++ b/llvm_mode/afl-clang-fast.c
@@ -544,9 +544,12 @@ int main(int argc, char **argv, char **envp) {
       instrument_mode = INSTRUMENT_PCGUARD;
     else if (strncasecmp(ptr, "lto", strlen("lto")) == 0)
       instrument_mode = INSTRUMENT_LTO;
-    else if (strncasecmp(ptr, "ctx", strlen("ctx")) == 0)
+    else if (strncasecmp(ptr, "ctx", strlen("ctx")) == 0) {
+
       instrument_mode = INSTRUMENT_CTX;
-    else if (strncasecmp(ptr, "ngram", strlen("ngram")) == 0) {
+      setenv("AFL_LLVM_CTX", "1", 1);
+
+    } else if (strncasecmp(ptr, "ngram", strlen("ngram")) == 0) {
 
       ptr += strlen("ngram");
       while (*ptr && (*ptr < '0' || *ptr > '9'))
diff --git a/llvm_mode/afl-llvm-pass.so.cc b/llvm_mode/afl-llvm-pass.so.cc
index 56f9ffe2..058ab71f 100644
--- a/llvm_mode/afl-llvm-pass.so.cc
+++ b/llvm_mode/afl-llvm-pass.so.cc
@@ -124,6 +124,8 @@ class AFLCoverage : public ModulePass {
  protected:
   std::list<std::string> myWhitelist;
   uint32_t               ngram_size = 0;
+  uint32_t               debug = 0;
+  char *                 ctx_str = NULL;
 
 };
 
@@ -179,6 +181,8 @@ bool AFLCoverage::runOnModule(Module &M) {
 
   char be_quiet = 0;
 
+  if (getenv("AFL_DEBUG")) debug = 1;
+
   if ((isatty(2) && !getenv("AFL_QUIET")) || getenv("AFL_DEBUG") != NULL) {
 
     SAYF(cCYA "afl-llvm-pass" VERSION cRST
@@ -209,6 +213,7 @@ bool AFLCoverage::runOnModule(Module &M) {
 
   char *ngram_size_str = getenv("AFL_LLVM_NGRAM_SIZE");
   if (!ngram_size_str) ngram_size_str = getenv("AFL_NGRAM_SIZE");
+  ctx_str = getenv("AFL_LLVM_CTX");
 
 #ifdef AFL_HAVE_VECTOR_INTRINSICS
   /* Decide previous location vector size (must be a power of two) */
@@ -228,9 +233,8 @@ bool AFLCoverage::runOnModule(Module &M) {
   else
 #else
   if (ngram_size_str)
-    FATAL(
-        "Sorry, n-gram branch coverage is not supported with llvm version %s!",
-        LLVM_VERSION_STRING);
+    FATAL("Sorry, NGRAM branch coverage is not supported with llvm version %s!",
+          LLVM_VERSION_STRING);
 #endif
     PrevLocSize = 1;
 
@@ -239,6 +243,9 @@ bool AFLCoverage::runOnModule(Module &M) {
   if (ngram_size) PrevLocTy = VectorType::get(IntLocTy, PrevLocVecSize);
 #endif
 
+  if (ctx_str && ngram_size_str)
+    FATAL("you must decide between NGRAM and CTX instrumentation");
+
   /* Get globals for the SHM region and the previous location. Note that
      __afl_prev_loc is thread-local. */
 
@@ -246,6 +253,17 @@ bool AFLCoverage::runOnModule(Module &M) {
       new GlobalVariable(M, PointerType::get(Int8Ty, 0), false,
                          GlobalValue::ExternalLinkage, 0, "__afl_area_ptr");
   GlobalVariable *AFLPrevLoc;
+  GlobalVariable *AFLContext;
+
+  if (ctx_str)
+#ifdef __ANDROID__
+    AFLContext = new GlobalVariable(
+        M, Int32Ty, false, GlobalValue::ExternalLinkage, 0, "__afl_prev_ctx");
+#else
+    AFLContext = new GlobalVariable(
+        M, Int32Ty, false, GlobalValue::ExternalLinkage, 0, "__afl_prev_ctx", 0,
+        GlobalVariable::GeneralDynamicTLSModel, 0, false);
+#endif
 
 #ifdef AFL_HAVE_VECTOR_INTRINSICS
   if (ngram_size)
@@ -291,14 +309,70 @@ bool AFLCoverage::runOnModule(Module &M) {
   ConstantInt *Zero = ConstantInt::get(Int8Ty, 0);
   ConstantInt *One = ConstantInt::get(Int8Ty, 1);
 
+  LoadInst *PrevCtx;  // CTX sensitive coverage
+
   /* Instrument all the things! */
 
   int inst_blocks = 0;
 
   for (auto &F : M) {
 
+    if (debug)
+      fprintf(stderr, "FUNCTION: %s (%zu)\n", F.getName().str().c_str(),
+              F.size());
+
     if (isBlacklisted(&F)) continue;
 
+    // AllocaInst *CallingContext = nullptr;
+
+    if (ctx_str && F.size() > 1) {  // Context sensitive coverage
+      // load the context ID of the previous function and write it to a local
+      // variable on the stack
+      auto                 bb = &F.getEntryBlock();
+      BasicBlock::iterator IP = bb->getFirstInsertionPt();
+      IRBuilder<>          IRB(&(*IP));
+      PrevCtx = IRB.CreateLoad(AFLContext);
+      PrevCtx->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
+
+      // does the function have calls? and is any of the calls larger than one
+      // basic block?
+      int has_calls = 0;
+      for (auto &BB : F) {
+
+        if (has_calls) break;
+        for (auto &IN : BB) {
+
+          CallInst *callInst = nullptr;
+          if ((callInst = dyn_cast<CallInst>(&IN))) {
+
+            Function *Callee = callInst->getCalledFunction();
+            if (!Callee || Callee->size() < 2)
+              continue;
+            else {
+
+              has_calls = 1;
+              break;
+
+            }
+
+          }
+
+        }
+
+      }
+
+      // if yes we store a context ID for this function in the global var
+      if (has_calls) {
+
+        ConstantInt *NewCtx = ConstantInt::get(Int32Ty, AFL_R(MAP_SIZE));
+        StoreInst *  StoreCtx = IRB.CreateStore(NewCtx, AFLContext);
+        StoreCtx->setMetadata(M.getMDKindID("nosanitize"),
+                              MDNode::get(C, None));
+
+      }
+
+    }
+
     for (auto &BB : F) {
 
       BasicBlock::iterator IP = BB.getFirstInsertionPt();
@@ -484,6 +558,9 @@ bool AFLCoverage::runOnModule(Module &M) {
         PrevLocTrans = IRB.CreateXorReduce(PrevLoc);
       else
 #endif
+          if (ctx_str)
+        PrevLocTrans = IRB.CreateZExt(IRB.CreateXor(PrevLoc, PrevCtx), Int32Ty);
+      else
         PrevLocTrans = IRB.CreateZExt(PrevLoc, IRB.getInt32Ty());
 
       /* Load SHM pointer */
@@ -605,6 +682,22 @@ bool AFLCoverage::runOnModule(Module &M) {
 
       }
 
+      // in CTX mode we have to restore the original context for the caller -
+      // she might be calling other functions which need the correct CTX
+      if (ctx_str) {
+
+        Instruction *Inst = BB.getTerminator();
+        if (isa<ReturnInst>(Inst) || isa<ResumeInst>(Inst)) {
+
+          IRBuilder<> Post_IRB(Inst);
+          StoreInst * RestoreCtx = Post_IRB.CreateStore(PrevCtx, AFLContext);
+          RestoreCtx->setMetadata(M.getMDKindID("nosanitize"),
+                                  MDNode::get(C, None));
+
+        }
+
+      }
+
       inst_blocks++;
 
     }
diff --git a/llvm_mode/afl-llvm-rt.o.c b/llvm_mode/afl-llvm-rt.o.c
index ade9eeef..aac7d061 100644
--- a/llvm_mode/afl-llvm-rt.o.c
+++ b/llvm_mode/afl-llvm-rt.o.c
@@ -42,6 +42,10 @@
 #include <sys/wait.h>
 #include <sys/types.h>
 
+#ifdef __linux__
+#include "snapshot-inl.h"
+#endif
+
 /* This is a somewhat ugly hack for the experimental 'trace-pc-guard' mode.
    Basically, we need to make sure that the forkserver is initialized after
    the LLVM-generated runtime initialization pass, not before. */
@@ -65,13 +69,16 @@ u8 *__afl_area_ptr = __afl_area_initial;
 #ifdef __ANDROID__
 PREV_LOC_T __afl_prev_loc[NGRAM_SIZE_MAX];
 u32        __afl_final_loc;
+u32        __afl_prev_ctx;
+u32        __afl_cmp_counter;
 #else
 __thread PREV_LOC_T __afl_prev_loc[NGRAM_SIZE_MAX];
 __thread u32        __afl_final_loc;
+__thread u32        __afl_prev_ctx;
+__thread u32        __afl_cmp_counter;
 #endif
 
-struct cmp_map *__afl_cmp_map;
-__thread u32    __afl_cmp_counter;
+struct cmp_map *__afl_cmp_map;
 
 /* Running in persistent mode? */
 
@@ -177,10 +184,113 @@ static void __afl_map_shm(void) {
 
 }
 
+#ifdef __linux__
+static void __afl_start_snapshots(void) {
+
+  static u8 tmp[4];
+  s32       child_pid;
+
+  u8 child_stopped = 0;
+
+  void (*old_sigchld_handler)(int) = 0;  // = signal(SIGCHLD, SIG_DFL);
+
+  /* Phone home and tell the parent that we're OK. If parent isn't there,
+     assume we're not running in forkserver mode and just execute program. */
+
+  if (write(FORKSRV_FD + 1, tmp, 4) != 4) return;
+
+  while (1) {
+
+    u32 was_killed;
+    int status;
+
+    /* Wait for parent by reading from the pipe. Abort if read fails. */
+
+    if (read(FORKSRV_FD, &was_killed, 4) != 4) _exit(1);
+
+    /* If we stopped the child in persistent mode, but there was a race
+       condition and afl-fuzz already issued SIGKILL, write off the old
+       process. */
+
+    if (child_stopped && was_killed) {
+
+      child_stopped = 0;
+      if (waitpid(child_pid, &status, 0) < 0) _exit(1);
+
+    }
+
+    if (!child_stopped) {
+
+      /* Once woken up, create a clone of our process. */
+
+      child_pid = fork();
+      if (child_pid < 0) _exit(1);
+
+      /* In child process: close fds, resume execution. */
+
+      if (!child_pid) {
+
+        signal(SIGCHLD, old_sigchld_handler);
+
+        close(FORKSRV_FD);
+        close(FORKSRV_FD + 1);
+
+        if (!afl_snapshot_do()) { raise(SIGSTOP); }
+
+        __afl_area_ptr[0] = 1;
+        memset(__afl_prev_loc, 0, NGRAM_SIZE_MAX * sizeof(PREV_LOC_T));
+
+        return;
+
+      }
+
+    } else {
+
+      /* Special handling for persistent mode: if the child is alive but
+         currently stopped, simply restart it with SIGCONT. */
+
+      kill(child_pid, SIGCONT);
+      child_stopped = 0;
+
+    }
+
+    /* In parent process: write PID to pipe, then wait for child. */
+
+    if (write(FORKSRV_FD + 1, &child_pid, 4) != 4) _exit(1);
+
+    if (waitpid(child_pid, &status, WUNTRACED) < 0) _exit(1);
+
+    /* In persistent mode, the child stops itself with SIGSTOP to indicate
+       a successful run. In this case, we want to wake it up without forking
+       again. */
+
+    if (WIFSTOPPED(status)) child_stopped = 1;
+
+    /* Relay wait status to pipe, then loop back. */
+
+    if (write(FORKSRV_FD + 1, &status, 4) != 4) _exit(1);
+
+  }
+
+}
+
+#endif
+
 /* Fork server logic. */
 
 static void __afl_start_forkserver(void) {
 
+#ifdef __linux__
+  if (!is_persistent && !__afl_cmp_map && !getenv("AFL_NO_SNAPSHOT") &&
+      afl_snapshot_init() >= 0) {
+
+    __afl_start_snapshots();
+    return;
+
+  }
+
+#endif
+
   static u8 tmp[4];
   s32       child_pid;