ctx done

author: van Hauser <vh@thc.org> 2020-04-08 03:54:49 +0200
committer: van Hauser <vh@thc.org> 2020-04-09 10:23:37 +0200
commit: 314debb799f5e288c64c5e7938bc09e650420ae9 (patch)
tree: 35ff0b58bad4073574630576b77f7fd5580d3689
parent: 24ad714d0def017ee389f623f4aa48d5c7905a03 (diff)
download: afl++-314debb799f5e288c64c5e7938bc09e650420ae9.tar.gz
9 files changed, 162 insertions, 14 deletions
diff --git a/docs/Changelog.md b/docs/Changelog.md
index 31a9b69a..7af8a62e 100644
--- a/docs/Changelog.md
+++ b/docs/Changelog.md
@@ -43,7 +43,9 @@ sending a mail to <afl-users+subscribe@googlegroups.com>.
     note that this mode is amazing, but quite some targets won't compile
   - Added llvm_mode NGRAM prev_loc coverage by Adrean Herrera
     (https://github.com/adrianherrera/afl-ngram-pass/), activate by setting
-    AFL_LLVM_NGRAM_SIZE
+    AFL_LLVM_INSTRUMENT=NGRAM-<value> or AFL_LLVM_NGRAM_SIZE=<value>
+  - Added llvm_mode context sensitive branch coverage, activated by setting
+    AFL_LLVM_INSTRUMENT=CTX or AFL_LLVM_CTX=1
   - llvm_mode InsTrim mode:
     - removed workaround for bug where paths were not instrumented and
       imported fix by author
diff --git a/docs/env_variables.md b/docs/env_variables.md
index 10a17a99..802e7bd0 100644
--- a/docs/env_variables.md
+++ b/docs/env_variables.md
@@ -146,6 +146,20 @@ Then there are a few specific features that are only available in llvm_mode:
     - Setting AFL_LLVM_NGRAM_SIZE or AFL_LLVM_INSTRUMENT=NGRAM-{value}
       activates ngram prev_loc coverage, good values are 2, 4 or 8
       (any value between 2 and 16 is valid).
+      It is highly recommended to increase the MAP_SIZE_POW2 definition in
+      config.h to at least 18 and maybe up to 20 for this as otherwise too
+      many map collisions occur.
+
+    See llvm_mode/README.ctx.md
+
+### CTX
+
+    - Setting AFL_LLVM_CTX or AFL_LLVM_INSTRUMENT=CTX
+      activates context sensitive branch coverage - meaning that each edge
+      is additionally combined with its caller.
+      It is highly recommended to increase the MAP_SIZE_POW2 definition in
+      config.h to at least 18 and maybe up to 20 for this as otherwise too
+      many map collisions occur.
 
     See llvm_mode/README.ngram.md
 
diff --git a/llvm_mode/README.ctx.md b/llvm_mode/README.ctx.md
new file mode 100644
index 00000000..14255313
--- /dev/null
+++ b/llvm_mode/README.ctx.md
@@ -0,0 +1,22 @@
+# AFL Context Sensitive Branch Coverage
+
+## What is this?
+
+This is an LLVM-based implementation of the context sensitive branch coverage.
+
+Basically every function gets it's own ID and that ID is combined with the
+edges of the called functions.
+
+So if both function A and function B call a function C, the coverage
+collected in C will be different.
+
+In math the coverage is collected as follows:
+`map[current_location_ID ^ previous_location_ID >> 1 ^ previous_callee_ID] += 1`
+
+## Usage
+
+Set the `AFL_LLVM_INSTRUMENT=CTX` or `AFL_LLVM_CTX=1` environment variable.
+
+It is highly recommended to increase the MAP_SIZE_POW2 definition in
+config.h to at least 18 and maybe up to 20 for this as otherwise too
+many map collisions occur.
diff --git a/llvm_mode/README.md b/llvm_mode/README.md
index e6c47c9c..805bb659 100644
--- a/llvm_mode/README.md
+++ b/llvm_mode/README.md
@@ -92,13 +92,33 @@ which C/C++ files to actually instrument. See [README.whitelist](README.whitelis
 
 For splitting memcmp, strncmp, etc. please see [README.laf-intel](README.laf-intel.md)
 
-Then there is an optimized instrumentation strategy that uses CFGs and
-markers to just instrument what is needed. This increases speed by 20-25%
-however has a lower path discovery.
-If you want to use this, set AFL_LLVM_INSTRIM=1
+Then there are different ways of instrumenting the target:
+
+1. There is an optimized instrumentation strategy that uses CFGs and
+markers to just instrument what is needed. This increases speed by 10-15%
+without any disadvantages
+If you want to use this, set AFL_LLVM_INSTRUMENT=CFG or AFL_LLVM_INSTRIM=1
 See [README.instrim](README.instrim.md)
 
-A new instrumentation called CmpLog is also available as an alternative to
+2. An even better instrumentation strategy uses LTO and link time
+instrumentation. Note that not all targets can compile in this mode, however
+if it works it is the best option you can use.
+Simply use afl-clang-lto/afl-clang-lto++ to use this option.
+See [README.lto](README.lto.md)
+
+3. Alternativly you can choose a completely different coverage method:
+
+3a. N-GRAM coverage - which combines the previous visited edges with the
+current one. This explodes the map but on the other hand has proven to be
+effective for fuzzing.
+See [README.ngram](README.ngram.md)
+
+3b. Context sensitive coverage - which combines the visited edges with an
+individual caller ID (the function that called the current one)
+[README.ctx](README.ctx.md)
+
+Then - additionally to one of the instrumentation options above - there is
+a very effective new instrumentation option called CmpLog as an alternative to
 laf-intel that allow AFL++ to apply mutations similar to Redqueen.
 See [README.cmplog](README.cmplog.md)
 
diff --git a/llvm_mode/README.ngram.md b/llvm_mode/README.ngram.md
index 3540ada0..de3ba432 100644
--- a/llvm_mode/README.ngram.md
+++ b/llvm_mode/README.ngram.md
@@ -13,9 +13,16 @@ is built on top of AFL's QEMU mode.
 This is essentially a port that uses LLVM vectorized instructions to achieve
 the same results when compiling source code.
 
+In math the branch coverage is performed as follows:
+`map[current_location ^ prev_location[0] >> 1 ^ prev_location[1] >> 1 ^ ... up to n-1`] += 1`
+
 ## Usage
 
 The size of `n` (i.e., the number of branches to remember) is an option
 that is specified either in the `AFL_LLVM_INSTRUMENT=NGRAM-{value}` or the
 `AFL_LLVM_NGRAM_SIZE` environment variable.
 Good values are 2, 4 or 8, valid are 2-16.
+
+It is highly recommended to increase the MAP_SIZE_POW2 definition in
+config.h to at least 18 and maybe up to 20 for this as otherwise too
+many map collisions occur.
diff --git a/llvm_mode/afl-clang-fast.c b/llvm_mode/afl-clang-fast.c
index 0e388cf4..657d1a84 100644
--- a/llvm_mode/afl-clang-fast.c
+++ b/llvm_mode/afl-clang-fast.c
@@ -544,9 +544,12 @@ int main(int argc, char **argv, char **envp) {
       instrument_mode = INSTRUMENT_PCGUARD;
     else if (strncasecmp(ptr, "lto", strlen("lto")) == 0)
       instrument_mode = INSTRUMENT_LTO;
-    else if (strncasecmp(ptr, "ctx", strlen("ctx")) == 0)
+    else if (strncasecmp(ptr, "ctx", strlen("ctx")) == 0) {
+
       instrument_mode = INSTRUMENT_CTX;
-    else if (strncasecmp(ptr, "ngram", strlen("ngram")) == 0) {
+      setenv("AFL_LLVM_CTX", "1", 1);
+
+    } else if (strncasecmp(ptr, "ngram", strlen("ngram")) == 0) {
 
       ptr += strlen("ngram");
       while (*ptr && (*ptr < '0' || *ptr > '9'))
diff --git a/llvm_mode/afl-llvm-pass.so.cc b/llvm_mode/afl-llvm-pass.so.cc
index 56f9ffe2..31d00fec 100644
--- a/llvm_mode/afl-llvm-pass.so.cc
+++ b/llvm_mode/afl-llvm-pass.so.cc
@@ -124,6 +124,8 @@ class AFLCoverage : public ModulePass {
  protected:
   std::list<std::string> myWhitelist;
   uint32_t               ngram_size = 0;
+  uint32_t               debug = 0;
+  char *                 ctx_str = NULL;
 
 };
 
@@ -179,6 +181,8 @@ bool AFLCoverage::runOnModule(Module &M) {
 
   char be_quiet = 0;
 
+  if (getenv("AFL_DEBUG")) debug = 1;
+
   if ((isatty(2) && !getenv("AFL_QUIET")) || getenv("AFL_DEBUG") != NULL) {
 
     SAYF(cCYA "afl-llvm-pass" VERSION cRST
@@ -209,6 +213,7 @@ bool AFLCoverage::runOnModule(Module &M) {
 
   char *ngram_size_str = getenv("AFL_LLVM_NGRAM_SIZE");
   if (!ngram_size_str) ngram_size_str = getenv("AFL_NGRAM_SIZE");
+  ctx_str = getenv("AFL_LLVM_CTX");
 
 #ifdef AFL_HAVE_VECTOR_INTRINSICS
   /* Decide previous location vector size (must be a power of two) */
@@ -228,9 +233,8 @@ bool AFLCoverage::runOnModule(Module &M) {
   else
 #else
   if (ngram_size_str)
-    FATAL(
-        "Sorry, n-gram branch coverage is not supported with llvm version %s!",
-        LLVM_VERSION_STRING);
+    FATAL("Sorry, NGRAM branch coverage is not supported with llvm version %s!",
+          LLVM_VERSION_STRING);
 #endif
     PrevLocSize = 1;
 
@@ -239,6 +243,9 @@ bool AFLCoverage::runOnModule(Module &M) {
   if (ngram_size) PrevLocTy = VectorType::get(IntLocTy, PrevLocVecSize);
 #endif
 
+  if (ctx_str && ngram_size_str)
+    FATAL("you must decide between NGRAM and CTX instrumentation");
+
   /* Get globals for the SHM region and the previous location. Note that
      __afl_prev_loc is thread-local. */
 
@@ -246,6 +253,17 @@ bool AFLCoverage::runOnModule(Module &M) {
       new GlobalVariable(M, PointerType::get(Int8Ty, 0), false,
                          GlobalValue::ExternalLinkage, 0, "__afl_area_ptr");
   GlobalVariable *AFLPrevLoc;
+  GlobalVariable *AFLContext;
+
+  if (ctx_str)
+#ifdef __ANDROID__
+    AFLContext = new GlobalVariable(
+        M, Int32Ty, false, GlobalValue::ExternalLinkage, 0, "__afl_prev_ctx");
+#else
+    AFLContext = new GlobalVariable(
+        M, Int32Ty, false, GlobalValue::ExternalLinkage, 0, "__afl_prev_ctx", 0,
+        GlobalVariable::GeneralDynamicTLSModel, 0, false);
+#endif
 
 #ifdef AFL_HAVE_VECTOR_INTRINSICS
   if (ngram_size)
@@ -291,14 +309,70 @@ bool AFLCoverage::runOnModule(Module &M) {
   ConstantInt *Zero = ConstantInt::get(Int8Ty, 0);
   ConstantInt *One = ConstantInt::get(Int8Ty, 1);
 
+  LoadInst *PrevCtx;  // CTX sensitive coverage
+
   /* Instrument all the things! */
 
   int inst_blocks = 0;
 
   for (auto &F : M) {
 
+    if (debug)
+      fprintf(stderr, "FUNCTION: %s (%zu)\n", F.getName().str().c_str(),
+              F.size());
+
     if (isBlacklisted(&F)) continue;
 
+    // AllocaInst *CallingContext = nullptr;
+
+    if (ctx_str && F.size() > 1) {  // Context sensitive coverage
+      // load the context ID of the previous function and write to to a local
+      // variable on the stack
+      auto                 bb = &F.getEntryBlock();
+      BasicBlock::iterator IP = bb->getFirstInsertionPt();
+      IRBuilder<>          IRB(&(*IP));
+      PrevCtx = IRB.CreateLoad(AFLContext);
+      PrevCtx->setMetadata(M.getMDKindID("nosanitize"), MDNode::get(C, None));
+
+      // does the function have calls? and is any of the calls larger than one
+      // basic block?
+      int has_calls = 0;
+      for (auto &BB : F) {
+
+        if (has_calls) break;
+        for (auto &IN : BB) {
+
+          CallInst *callInst = nullptr;
+          if ((callInst = dyn_cast<CallInst>(&IN))) {
+
+            Function *Callee = callInst->getCalledFunction();
+            if (!Callee || Callee->size() < 2)
+              continue;
+            else {
+
+              has_calls = 1;
+              break;
+
+            }
+
+          }
+
+        }
+
+      }
+
+      // if yes we store a context ID for this function in the global var
+      if (has_calls) {
+
+        ConstantInt *NewCtx = ConstantInt::get(Int32Ty, AFL_R(MAP_SIZE));
+        StoreInst *  StoreCtx = IRB.CreateStore(NewCtx, AFLContext);
+        StoreCtx->setMetadata(M.getMDKindID("nosanitize"),
+                              MDNode::get(C, None));
+
+      }
+
+    }
+
     for (auto &BB : F) {
 
       BasicBlock::iterator IP = BB.getFirstInsertionPt();
@@ -484,6 +558,9 @@ bool AFLCoverage::runOnModule(Module &M) {
         PrevLocTrans = IRB.CreateXorReduce(PrevLoc);
       else
 #endif
+          if (ctx_str)
+        PrevLocTrans = IRB.CreateZExt(IRB.CreateXor(PrevLoc, PrevCtx), Int32Ty);
+      else
         PrevLocTrans = IRB.CreateZExt(PrevLoc, IRB.getInt32Ty());
 
       /* Load SHM pointer */
diff --git a/llvm_mode/afl-llvm-rt.o.c b/llvm_mode/afl-llvm-rt.o.c
index 6b812423..e2fd5190 100644
--- a/llvm_mode/afl-llvm-rt.o.c
+++ b/llvm_mode/afl-llvm-rt.o.c
@@ -69,13 +69,16 @@ u8 *__afl_area_ptr = __afl_area_initial;
 #ifdef __ANDROID__
 PREV_LOC_T __afl_prev_loc[NGRAM_SIZE_MAX];
 u32        __afl_final_loc;
+u32        __afl_prev_ctx;
+u32        __afl_cmp_counter
 #else
 __thread PREV_LOC_T __afl_prev_loc[NGRAM_SIZE_MAX];
 __thread u32        __afl_final_loc;
+__thread u32        __afl_prev_ctx;
+__thread u32        __afl_cmp_counter;
 #endif
 
-struct cmp_map *__afl_cmp_map;
-__thread u32    __afl_cmp_counter;
+    struct cmp_map *__afl_cmp_map;
 
 /* Running in persistent mode? */
 
diff --git a/src/afl-common.c b/src/afl-common.c
index 7e8bd930..73b3fa8a 100644
--- a/src/afl-common.c
+++ b/src/afl-common.c
@@ -57,7 +57,7 @@ char *afl_environment_variables[] = {
     "AFL_INST_LIBS", "AFL_INST_RATIO", "AFL_KEEP_TRACES", "AFL_KEEP_ASSEMBLY",
     "AFL_LD_HARD_FAIL", "AFL_LD_LIMIT_MB", "AFL_LD_NO_CALLOC_OVER",
     "AFL_LD_PRELOAD", "AFL_LD_VERBOSE", "AFL_LLVM_CMPLOG", "AFL_LLVM_INSTRIM",
-    "AFL_LLVM_INSTRUMENT", "AFL_LLVM_INSTRIM_LOOPHEAD",
+    "AFL_LLVM_CTX", "AFL_LLVM_INSTRUMENT", "AFL_LLVM_INSTRIM_LOOPHEAD",
     "AFL_LLVM_INSTRIM_SKIPSINGLEBLOCK", "AFL_LLVM_LAF_SPLIT_COMPARES",
     "AFL_LLVM_LAF_SPLIT_COMPARES_BITW", "AFL_LLVM_LAF_SPLIT_FLOATS",
     "AFL_LLVM_LAF_SPLIT_SWITCHES", "AFL_LLVM_LAF_TRANSFORM_COMPARES",
author	van Hauser <vh@thc.org>	2020-04-08 03:54:49 +0200
committer	van Hauser <vh@thc.org>	2020-04-09 10:23:37 +0200
commit	314debb799f5e288c64c5e7938bc09e650420ae9 (patch)
tree	35ff0b58bad4073574630576b77f7fd5580d3689
parent	24ad714d0def017ee389f623f4aa48d5c7905a03 (diff)
download	afl++-314debb799f5e288c64c5e7938bc09e650420ae9.tar.gz