about summary refs log tree commit diff
path: root/unicorn_mode
diff options
context:
space:
mode:
Diffstat (limited to 'unicorn_mode')
-rw-r--r--unicorn_mode/README.md55
-rw-r--r--unicorn_mode/UNICORNAFL_VERSION2
-rwxr-xr-xunicorn_mode/build_unicorn_support.sh29
-rw-r--r--unicorn_mode/helper_scripts/unicorn_dumper_gdb.py158
-rw-r--r--unicorn_mode/helper_scripts/unicorn_dumper_ida.py205
-rw-r--r--unicorn_mode/helper_scripts/unicorn_dumper_lldb.py241
-rw-r--r--unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py143
-rw-r--r--unicorn_mode/helper_scripts/unicorn_loader.py271
-rw-r--r--unicorn_mode/samples/c/COMPILE.md2
-rw-r--r--unicorn_mode/samples/compcov_x64/compcov_test_harness.py76
-rw-r--r--unicorn_mode/samples/persistent/simple_target_noncrashing.c16
-rw-r--r--unicorn_mode/samples/simple/simple_test_harness.py81
-rw-r--r--unicorn_mode/samples/simple/simple_test_harness_alt.py100
-rw-r--r--unicorn_mode/samples/speedtest/.gitignore6
-rw-r--r--unicorn_mode/samples/speedtest/Makefile17
-rw-r--r--unicorn_mode/samples/speedtest/README.md65
-rw-r--r--unicorn_mode/samples/speedtest/c/Makefile54
-rw-r--r--unicorn_mode/samples/speedtest/c/harness.c390
-rwxr-xr-xunicorn_mode/samples/speedtest/get_offsets.py77
-rw-r--r--unicorn_mode/samples/speedtest/python/Makefile8
-rw-r--r--unicorn_mode/samples/speedtest/python/harness.py277
-rw-r--r--unicorn_mode/samples/speedtest/rust/.gitignore2
-rw-r--r--unicorn_mode/samples/speedtest/rust/Cargo.toml15
-rw-r--r--unicorn_mode/samples/speedtest/rust/Makefile17
-rw-r--r--unicorn_mode/samples/speedtest/rust/src/main.rs232
-rw-r--r--unicorn_mode/samples/speedtest/sample_inputs/a1
-rw-r--r--unicorn_mode/samples/speedtest/target.c77
m---------unicorn_mode/unicornafl0
-rwxr-xr-xunicorn_mode/update_uc_ref.sh2
29 files changed, 2180 insertions, 439 deletions
diff --git a/unicorn_mode/README.md b/unicorn_mode/README.md
index f6bd4d12..b3df44fa 100644
--- a/unicorn_mode/README.md
+++ b/unicorn_mode/README.md
@@ -8,19 +8,19 @@ The CompareCoverage and NeverZero counters features are by Andrea Fioraldi <andr
 
 ## 1) Introduction
 
-The code in ./unicorn_mode allows you to build a standalone feature that
-leverages the Unicorn Engine and allows callers to obtain instrumentation 
+The code in ./unicorn_mode allows you to build the [Unicorn Engine](https://github.com/unicorn-engine/unicorn) with afl support.
+This means, you can run anything that can be emulated in unicorn and obtain instrumentation
 output for black-box, closed-source binary code snippets. This mechanism 
 can be then used by afl-fuzz to stress-test targets that couldn't be built 
-with afl-gcc or used in QEMU mode, or with other extensions such as 
-TriforceAFL.
+with afl-cc or used in QEMU mode.
 
 There is a significant performance penalty compared to native AFL,
 but at least we're able to use AFL++ on these binaries, right?
 
 ## 2) How to use
 
-Requirements: you need an installed python environment.
+First, you will need a working harness for your target in unicorn, using Python, C, or Rust.
+For some pointers for more advanced emulation, take a look at [BaseSAFE](https://github.com/fgsect/BaseSAFE) and [Qiling](https://github.com/qilingframework/qiling).
 
 ### Building AFL++'s Unicorn Mode
 
@@ -34,23 +34,23 @@ cd unicorn_mode
 ```
 
 NOTE: This script checks out a Unicorn Engine fork as submodule that has been tested 
-and is stable-ish, based on the unicorn engine master. 
+and is stable-ish, based on the unicorn engine `next` branch. 
 
 Building Unicorn will take a little bit (~5-10 minutes). Once it completes 
 it automatically compiles a sample application and verifies that it works.
 
 ### Fuzzing with Unicorn Mode
 
-To really use unicorn-mode effectively you need to prepare the following:
+To use unicorn-mode effectively you need to prepare the following:
 
 	* Relevant binary code to be fuzzed
 	* Knowledge of the memory map and good starting state
 	* Folder containing sample inputs to start fuzzing with
 		+ Same ideas as any other AFL inputs
-		+ Quality/speed of results will depend greatly on quality of starting 
+		+ Quality/speed of results will depend greatly on the quality of starting 
 		  samples
 		+ See AFL's guidance on how to create a sample corpus
-	* Unicornafl-based test harness which:
+	* Unicornafl-based test harness in Rust, C, or Python, which:
 		+ Adds memory map regions
 		+ Loads binary code into memory		
 		+ Calls uc.afl_fuzz() / uc.afl_start_forkserver
@@ -59,13 +59,13 @@ To really use unicorn-mode effectively you need to prepare the following:
 			  the test harness
 			+ Presumably the data to be fuzzed is at a fixed buffer address
 			+ If input constraints (size, invalid bytes, etc.) are known they 
-			  should be checked after the file is loaded. If a constraint 
-			  fails, just exit the test harness. AFL will treat the input as 
+			  should be checked in the place_input handler. If a constraint 
+			  fails, just return false from the handler. AFL will treat the input as 
 			  'uninteresting' and move on.
 		+ Sets up registers and memory state for beginning of test
-		+ Emulates the interested code from beginning to end
+		+ Emulates the interesting code from beginning to end
 		+ If a crash is detected, the test harness must 'crash' by 
-		  throwing a signal (SIGSEGV, SIGKILL, SIGABORT, etc.)
+		  throwing a signal (SIGSEGV, SIGKILL, SIGABRT, etc.), or indicate a crash in the crash validation callback.
 
 Once you have all those things ready to go you just need to run afl-fuzz in
 'unicorn-mode' by passing in the '-U' flag:
@@ -79,11 +79,12 @@ AFL's main documentation for more info about how to use afl-fuzz effectively.
 
 For a much clearer vision of what all of this looks like, please refer to the
 sample provided in the 'unicorn_mode/samples' directory. There is also a blog
-post that goes over the basics at:
+post that uses slightly older concepts, but describes the general ideas, at:
 
 [https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf](https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf)
 
-The 'helper_scripts' directory also contains several helper scripts that allow you 
+
+The ['helper_scripts'](./helper_scripts) directory also contains several helper scripts that allow you 
 to dump context from a running process, load it, and hook heap allocations. For details
 on how to use this check out the follow-up blog post to the one linked above.
 
@@ -92,10 +93,10 @@ An example use of AFL-Unicorn mode is discussed in the paper Unicorefuzz:
 
 ## 3) Options
 
-As for the QEMU-based instrumentation, the afl-unicorn twist of afl++
-comes with a sub-instruction based instrumentation similar in purpose to laf-intel.
+As for the QEMU-based instrumentation, unicornafl comes with a sub-instruction based instrumentation similar in purpose to laf-intel.
 
 The options that enable Unicorn CompareCoverage are the same used for QEMU.
+This will split up each multi-byte compare to give feedback for each correct byte.
 AFL_COMPCOV_LEVEL=1 is to instrument comparisons with only immediate values.
 
 AFL_COMPCOV_LEVEL=2 instruments all comparison instructions.
@@ -119,6 +120,20 @@ unicornafl.monkeypatch()
 
 This will replace all unicorn imports with unicornafl inputs.
 
-Refer to the [samples/arm_example/arm_tester.c](samples/arm_example/arm_tester.c) for an example
-of how to do this properly! If you don't get this right, AFL will not 
-load any mutated inputs and your fuzzing will be useless!
+5) Examples
+
+Apart from reading the documentation in `afl.c` and the python bindings of unicornafl, the best documentation is the [samples/](./samples).
+The following examples exist at the time of writing:
+
+- c: A simple example of how to use the c bindings
+- compcov_x64: A python example that uses compcov to traverse hard-to-reach blocks
+- persistent: A c example using persistent mode for maximum speed, and resetting the target state between each iteration
+- simple: A simple python example
+- speedtest/c: The c harness for an example target, used to compare c, python, and rust bindings and fix speed issues
+- speedtest/python: Fuzzing the same target in python
+- speedtest/rust: Fuzzing the same target using a rust harness
+
+Usually, the place to look at is the `harness` in each folder. The source code in each harness is pretty well documented.
+Most harnesses also have the `afl-fuzz` commandline, or even offer a `make fuzz` Makefile target.
+Targets in these folders, if x86, can usually be made using `make target` in each folder or get shipped pre-built (plus their source).
+Especially take a look at the [speedtest documentation](./samples/speedtest/README.md) to see how the languages compare.
\ No newline at end of file
diff --git a/unicorn_mode/UNICORNAFL_VERSION b/unicorn_mode/UNICORNAFL_VERSION
index 02736b77..d9ae5590 100644
--- a/unicorn_mode/UNICORNAFL_VERSION
+++ b/unicorn_mode/UNICORNAFL_VERSION
@@ -1 +1 @@
-c6d66471
+fb2fc9f2
diff --git a/unicorn_mode/build_unicorn_support.sh b/unicorn_mode/build_unicorn_support.sh
index 841728d7..6c376f8d 100755
--- a/unicorn_mode/build_unicorn_support.sh
+++ b/unicorn_mode/build_unicorn_support.sh
@@ -44,7 +44,7 @@ echo "[*] Performing basic sanity checks..."
 
 PLT=`uname -s`
 
-if [ ! "$PLT" = "Linux" ] && [ ! "$PLT" = "Darwin" ] && [ ! "$PLT" = "FreeBSD" ] && [ ! "$PLT" = "NetBSD" ] && [ ! "$PLT" = "OpenBSD" ]; then
+if [ ! "$PLT" = "Linux" ] && [ ! "$PLT" = "Darwin" ] && [ ! "$PLT" = "FreeBSD" ] && [ ! "$PLT" = "NetBSD" ] && [ ! "$PLT" = "OpenBSD" ] && [ ! "$PLT" = "DragonFly" ]; then
 
   echo "[-] Error: Unicorn instrumentation is unsupported on $PLT."
   exit 1
@@ -70,6 +70,11 @@ MAKECMD=make
 TARCMD=tar
 
 if [ "$PLT" = "Linux" ]; then
+  MUSL=`ldd --version 2>&1 | head -n 1 | cut -f 1 -d " "`
+  if [ "musl" = $MUSL ]; then
+  	echo "[-] Error: Unicorn instrumentation is unsupported with the musl's libc."
+  	exit 1
+  fi
   CORES=`nproc`
 fi
 
@@ -84,6 +89,12 @@ if [ "$PLT" = "FreeBSD" ]; then
   TARCMD=gtar
 fi
 
+if [ "$PLT" = "DragonFly" ]; then
+  MAKECMD=gmake
+  CORES=`sysctl -n hw.ncpu`
+  TARCMD=tar
+fi
+
 if [ "$PLT" = "NetBSD" ] || [ "$PLT" = "OpenBSD" ]; then
   MAKECMD=gmake
   CORES=`sysctl -n hw.ncpu`
@@ -106,19 +117,19 @@ done
 
 # some python version should be available now
 PYTHONS="`command -v python3` `command -v python` `command -v python2`"
-EASY_INSTALL_FOUND=0
+SETUPTOOLS_FOUND=0
 for PYTHON in $PYTHONS ; do
 
   if $PYTHON -c "import setuptools" ; then
 
-    EASY_INSTALL_FOUND=1
+    SETUPTOOLS_FOUND=1
     PYTHONBIN=$PYTHON
     break
 
   fi
 
 done
-if [ "0" = $EASY_INSTALL_FOUND ]; then
+if [ "0" = $SETUPTOOLS_FOUND ]; then
 
   echo "[-] Error: Python setup-tools not found. Run 'sudo apt-get install python-setuptools', or install python3-setuptools, or run '$PYTHONBIN -m ensurepip', or create a virtualenv, or ..."
   PREREQ_NOTFOUND=1
@@ -136,6 +147,8 @@ if [ "$PREREQ_NOTFOUND" = "1" ]; then
   exit 1
 fi
 
+unset CFLAGS
+
 echo "[+] All checks passed!"
 
 echo "[*] Making sure unicornafl is checked out"
@@ -144,7 +157,8 @@ git status 1>/dev/null 2>/dev/null
 if [ $? -eq 0 ]; then
   echo "[*] initializing unicornafl submodule"
   git submodule init || exit 1
-  git submodule update 2>/dev/null # ignore errors
+  git submodule update ./unicornafl 2>/dev/null # ignore errors
+  git submodule sync ./unicornafl 2>/dev/null # ignore errors
 else
   echo "[*] cloning unicornafl"
   test -d unicornafl || {
@@ -165,8 +179,9 @@ echo "[*] Checking out $UNICORNAFL_VERSION"
 sh -c 'git stash && git stash drop' 1>/dev/null 2>/dev/null
 git checkout "$UNICORNAFL_VERSION" || exit 1
 
-echo "[*] making sure config.h matches"
-cp "../../config.h" "." || exit 1
+echo "[*] making sure afl++ header files match"
+cp "../../include/config.h" "." || exit 1
+cp "../../include/types.h" "." || exit 1
 
 echo "[*] Configuring Unicorn build..."
 
diff --git a/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py b/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py
index 22b9fd47..1ac4c9f3 100644
--- a/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py
+++ b/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py
@@ -1,13 +1,13 @@
 """
     unicorn_dumper_gdb.py
-    
+
     When run with GDB sitting at a debug breakpoint, this
     dumps the current state (registers/memory/etc) of
-    the process to a directory consisting of an index 
-    file with register and segment information and 
+    the process to a directory consisting of an index
+    file with register and segment information and
     sub-files containing all actual process memory.
-    
-    The output of this script is expected to be used 
+
+    The output of this script is expected to be used
     to initialize context for Unicorn emulation.
 
     -----------
@@ -44,30 +44,32 @@ MAX_SEG_SIZE = 128 * 1024 * 1024
 # Name of the index file
 INDEX_FILE_NAME = "_index.json"
 
-#----------------------
-#---- Helper Functions
+
+# ----------------------
+# ---- Helper Functions
+
 
 def map_arch():
-    arch = get_arch() # from GEF
-    if 'x86_64' in arch or 'x86-64' in arch:
+    arch = get_arch()  # from GEF
+    if "x86_64" in arch or "x86-64" in arch:
         return "x64"
-    elif 'x86' in arch or 'i386' in arch:
+    elif "x86" in arch or "i386" in arch:
         return "x86"
-    elif 'aarch64' in arch or 'arm64' in arch:
+    elif "aarch64" in arch or "arm64" in arch:
         return "arm64le"
-    elif 'aarch64_be' in arch:
+    elif "aarch64_be" in arch:
         return "arm64be"
-    elif 'armeb' in arch:
+    elif "armeb" in arch:
         # check for THUMB mode
-        cpsr = get_register('cpsr')
-        if (cpsr & (1 << 5)):
+        cpsr = get_register("$cpsr")
+        if cpsr & (1 << 5):
             return "armbethumb"
         else:
             return "armbe"
-    elif 'arm' in arch:
+    elif "arm" in arch:
         # check for THUMB mode
-        cpsr = get_register('cpsr')
-        if (cpsr & (1 << 5)):
+        cpsr = get_register("$cpsr")
+        if cpsr & (1 << 5):
             return "armlethumb"
         else:
             return "armle"
@@ -75,8 +77,9 @@ def map_arch():
         return ""
 
 
-#-----------------------
-#---- Dumping functions
+# -----------------------
+# ---- Dumping functions
+
 
 def dump_arch_info():
     arch_info = {}
@@ -88,19 +91,15 @@ def dump_regs():
     reg_state = {}
     for reg in current_arch.all_registers:
         reg_val = get_register(reg)
-        # current dumper script looks for register values to be hex strings
-#         reg_str = "0x{:08x}".format(reg_val)
-#         if "64" in get_arch():
-#             reg_str = "0x{:016x}".format(reg_val)
-#         reg_state[reg.strip().strip('$')] = reg_str
-        reg_state[reg.strip().strip('$')] = reg_val
+        reg_state[reg.strip().strip("$")] = reg_val
+
     return reg_state
 
 
 def dump_process_memory(output_dir):
     # Segment information dictionary
     final_segment_list = []
-    
+
     # GEF:
     vmmap = get_process_maps()
     if not vmmap:
@@ -110,45 +109,91 @@ def dump_process_memory(output_dir):
     for entry in vmmap:
         if entry.page_start == entry.page_end:
             continue
-        
-        seg_info = {'start': entry.page_start, 'end': entry.page_end, 'name': entry.path, 'permissions': {
-            "r": entry.is_readable() > 0,
-            "w": entry.is_writable() > 0,
-            "x": entry.is_executable() > 0
-        }, 'content_file': ''}
+
+        seg_info = {
+            "start": entry.page_start,
+            "end": entry.page_end,
+            "name": entry.path,
+            "permissions": {
+                "r": entry.is_readable() > 0,
+                "w": entry.is_writable() > 0,
+                "x": entry.is_executable() > 0,
+            },
+            "content_file": "",
+        }
 
         # "(deleted)" may or may not be valid, but don't push it.
-        if entry.is_readable() and not '(deleted)' in entry.path:
+        if entry.is_readable() and not "(deleted)" in entry.path:
             try:
                 # Compress and dump the content to a file
                 seg_content = read_memory(entry.page_start, entry.size)
-                if(seg_content == None):
-                    print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(entry.page_start, entry.path))
+                if seg_content == None:
+                    print(
+                        "Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(
+                            entry.page_start, entry.path
+                        )
+                    )
                 else:
-                    print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(entry.page_start, len(seg_content), entry.path, repr(seg_info['permissions'])))
+                    print(
+                        "Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(
+                            entry.page_start,
+                            len(seg_content),
+                            entry.path,
+                            repr(seg_info["permissions"]),
+                        )
+                    )
                     compressed_seg_content = zlib.compress(seg_content)
                     md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin"
                     seg_info["content_file"] = md5_sum
-                    
+
                     # Write the compressed contents to disk
-                    out_file = open(os.path.join(output_dir, md5_sum), 'wb')
+                    out_file = open(os.path.join(output_dir, md5_sum), "wb")
                     out_file.write(compressed_seg_content)
                     out_file.close()
 
             except:
-                print("Exception reading segment ({}): {}".format(entry.path, sys.exc_info()[0]))
+                print(
+                    "Exception reading segment ({}): {}".format(
+                        entry.path, sys.exc_info()[0]
+                    )
+                )
         else:
-            print("Skipping segment {0}@0x{1:016x}".format(entry.path, entry.page_start))
+            print(
+                "Skipping segment {0}@0x{1:016x}".format(entry.path, entry.page_start)
+            )
 
         # Add the segment to the list
         final_segment_list.append(seg_info)
 
-            
     return final_segment_list
 
-#----------
-#---- Main    
-    
+
+# ---------------------------------------------
+# ---- ARM Extention (dump floating point regs)
+
+
+def dump_float(rge=32):
+    reg_convert = ""
+    if (
+        map_arch() == "armbe"
+        or map_arch() == "armle"
+        or map_arch() == "armbethumb"
+        or map_arch() == "armbethumb"
+    ):
+        reg_state = {}
+        for reg_num in range(32):
+            value = gdb.selected_frame().read_register("d" + str(reg_num))
+            reg_state["d" + str(reg_num)] = int(str(value["u64"]), 16)
+        value = gdb.selected_frame().read_register("fpscr")
+        reg_state["fpscr"] = int(str(value), 16)
+
+        return reg_state
+
+
+# ----------
+# ---- Main
+
+
 def main():
     print("----- Unicorn Context Dumper -----")
     print("You must be actively debugging before running this!")
@@ -159,32 +204,35 @@ def main():
         print("!!! GEF not running in GDB.  Please run gef.py by executing:")
         print('\tpython execfile ("<path_to_gef>/gef.py")')
         return
-    
+
     try:
-    
+
         # Create the output directory
-        timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S')
+        timestamp = datetime.datetime.fromtimestamp(time.time()).strftime(
+            "%Y%m%d_%H%M%S"
+        )
         output_path = "UnicornContext_" + timestamp
         if not os.path.exists(output_path):
             os.makedirs(output_path)
         print("Process context will be output to {}".format(output_path))
-            
+
         # Get the context
         context = {
             "arch": dump_arch_info(),
-            "regs": dump_regs(), 
+            "regs": dump_regs(),
+            "regs_extended": dump_float(),
             "segments": dump_process_memory(output_path),
         }
 
         # Write the index file
-        index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w')
+        index_file = open(os.path.join(output_path, INDEX_FILE_NAME), "w")
         index_file.write(json.dumps(context, indent=4))
-        index_file.close()    
+        index_file.close()
         print("Done.")
-        
+
     except Exception as e:
         print("!!! ERROR:\n\t{}".format(repr(e)))
-        
+
+
 if __name__ == "__main__":
     main()
-    
diff --git a/unicorn_mode/helper_scripts/unicorn_dumper_ida.py b/unicorn_mode/helper_scripts/unicorn_dumper_ida.py
index 6cf9f30f..fa29fb90 100644
--- a/unicorn_mode/helper_scripts/unicorn_dumper_ida.py
+++ b/unicorn_mode/helper_scripts/unicorn_dumper_ida.py
@@ -31,8 +31,9 @@ MAX_SEG_SIZE = 128 * 1024 * 1024
 # Name of the index file
 INDEX_FILE_NAME = "_index.json"
 
-#----------------------
-#---- Helper Functions
+# ----------------------
+# ---- Helper Functions
+
 
 def get_arch():
     if ph.id == PLFM_386 and ph.flag & PR_USE64:
@@ -52,6 +53,7 @@ def get_arch():
     else:
         return ""
 
+
 def get_register_list(arch):
     if arch == "arm64le" or arch == "arm64be":
         arch = "arm64"
@@ -59,84 +61,174 @@ def get_register_list(arch):
         arch = "arm"
 
     registers = {
-        "x64" : [
-            "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
-            "r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15",
-            "rip", "rsp", "efl",
-            "cs", "ds", "es", "fs", "gs", "ss",
+        "x64": [
+            "rax",
+            "rbx",
+            "rcx",
+            "rdx",
+            "rsi",
+            "rdi",
+            "rbp",
+            "rsp",
+            "r8",
+            "r9",
+            "r10",
+            "r11",
+            "r12",
+            "r13",
+            "r14",
+            "r15",
+            "rip",
+            "rsp",
+            "efl",
+            "cs",
+            "ds",
+            "es",
+            "fs",
+            "gs",
+            "ss",
+        ],
+        "x86": [
+            "eax",
+            "ebx",
+            "ecx",
+            "edx",
+            "esi",
+            "edi",
+            "ebp",
+            "esp",
+            "eip",
+            "esp",
+            "efl",
+            "cs",
+            "ds",
+            "es",
+            "fs",
+            "gs",
+            "ss",
         ],
-        "x86" : [
-            "eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp",
-            "eip", "esp", "efl", 
-            "cs", "ds", "es", "fs", "gs", "ss",
-        ],        
-        "arm" : [
-            "R0", "R1", "R2",  "R3",  "R4",  "R5", "R6", "R7",  
-            "R8", "R9", "R10", "R11", "R12", "PC", "SP", "LR",  
+        "arm": [
+            "R0",
+            "R1",
+            "R2",
+            "R3",
+            "R4",
+            "R5",
+            "R6",
+            "R7",
+            "R8",
+            "R9",
+            "R10",
+            "R11",
+            "R12",
+            "PC",
+            "SP",
+            "LR",
             "PSR",
         ],
-        "arm64" : [
-            "X0", "X1", "X2", "X3", "X4", "X5", "X6", "X7",  
-            "X8", "X9", "X10", "X11", "X12", "X13", "X14", 
-            "X15", "X16", "X17", "X18", "X19", "X20", "X21", 
-            "X22", "X23", "X24", "X25", "X26", "X27", "X28", 
-            "PC", "SP", "FP", "LR", "CPSR"
+        "arm64": [
+            "X0",
+            "X1",
+            "X2",
+            "X3",
+            "X4",
+            "X5",
+            "X6",
+            "X7",
+            "X8",
+            "X9",
+            "X10",
+            "X11",
+            "X12",
+            "X13",
+            "X14",
+            "X15",
+            "X16",
+            "X17",
+            "X18",
+            "X19",
+            "X20",
+            "X21",
+            "X22",
+            "X23",
+            "X24",
+            "X25",
+            "X26",
+            "X27",
+            "X28",
+            "PC",
+            "SP",
+            "FP",
+            "LR",
+            "CPSR"
             #    "NZCV",
-        ]
+        ],
     }
-    return registers[arch]  
+    return registers[arch]
+
+
+# -----------------------
+# ---- Dumping functions
 
-#-----------------------
-#---- Dumping functions
 
 def dump_arch_info():
     arch_info = {}
     arch_info["arch"] = get_arch()
     return arch_info
 
+
 def dump_regs():
     reg_state = {}
     for reg in get_register_list(get_arch()):
         reg_state[reg] = GetRegValue(reg)
     return reg_state
 
+
 def dump_process_memory(output_dir):
     # Segment information dictionary
     segment_list = []
-    
+
     # Loop over the segments, fill in the info dictionary
     for seg_ea in Segments():
         seg_start = SegStart(seg_ea)
         seg_end = SegEnd(seg_ea)
         seg_size = seg_end - seg_start
-		
+
         seg_info = {}
-        seg_info["name"]  = SegName(seg_ea)
+        seg_info["name"] = SegName(seg_ea)
         seg_info["start"] = seg_start
-        seg_info["end"]   = seg_end
-        
+        seg_info["end"] = seg_end
+
         perms = getseg(seg_ea).perm
         seg_info["permissions"] = {
-            "r": False if (perms & SEGPERM_READ)  == 0 else True,
+            "r": False if (perms & SEGPERM_READ) == 0 else True,
             "w": False if (perms & SEGPERM_WRITE) == 0 else True,
-            "x": False if (perms & SEGPERM_EXEC)  == 0 else True,
+            "x": False if (perms & SEGPERM_EXEC) == 0 else True,
         }
 
         if (perms & SEGPERM_READ) and seg_size <= MAX_SEG_SIZE and isLoaded(seg_start):
             try:
                 # Compress and dump the content to a file
                 seg_content = get_many_bytes(seg_start, seg_end - seg_start)
-                if(seg_content == None):
-                    print("Segment empty: {0}@0x{1:016x} (size:UNKNOWN)".format(SegName(seg_ea), seg_ea))
+                if seg_content == None:
+                    print(
+                        "Segment empty: {0}@0x{1:016x} (size:UNKNOWN)".format(
+                            SegName(seg_ea), seg_ea
+                        )
+                    )
                     seg_info["content_file"] = ""
                 else:
-                    print("Dumping segment {0}@0x{1:016x} (size:{2})".format(SegName(seg_ea), seg_ea, len(seg_content)))
+                    print(
+                        "Dumping segment {0}@0x{1:016x} (size:{2})".format(
+                            SegName(seg_ea), seg_ea, len(seg_content)
+                        )
+                    )
                     compressed_seg_content = zlib.compress(seg_content)
                     md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin"
                     seg_info["content_file"] = md5_sum
-                    
+
                     # Write the compressed contents to disk
-                    out_file = open(os.path.join(output_dir, md5_sum), 'wb')
+                    out_file = open(os.path.join(output_dir, md5_sum), "wb")
                     out_file.write(compressed_seg_content)
                     out_file.close()
             except:
@@ -145,12 +237,13 @@ def dump_process_memory(output_dir):
         else:
             print("Skipping segment {0}@0x{1:016x}".format(SegName(seg_ea), seg_ea))
             seg_info["content_file"] = ""
-            
+
         # Add the segment to the list
-        segment_list.append(seg_info)     
-   
+        segment_list.append(seg_info)
+
     return segment_list
 
+
 """
     TODO: FINISH IMPORT DUMPING
 def import_callback(ea, name, ord):
@@ -169,41 +262,47 @@ def dump_imports():
     
     return import_dict
 """
- 
-#----------
-#---- Main    
-    
+
+# ----------
+# ---- Main
+
+
 def main():
 
     try:
         print("----- Unicorn Context Dumper -----")
         print("You must be actively debugging before running this!")
-        print("If it fails, double check that you are actively debugging before running.")
+        print(
+            "If it fails, double check that you are actively debugging before running."
+        )
 
         # Create the output directory
-        timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S')
+        timestamp = datetime.datetime.fromtimestamp(time.time()).strftime(
+            "%Y%m%d_%H%M%S"
+        )
         output_path = os.path.dirname(os.path.abspath(GetIdbPath()))
         output_path = os.path.join(output_path, "UnicornContext_" + timestamp)
         if not os.path.exists(output_path):
             os.makedirs(output_path)
         print("Process context will be output to {}".format(output_path))
-            
+
         # Get the context
         context = {
             "arch": dump_arch_info(),
-            "regs": dump_regs(), 
+            "regs": dump_regs(),
             "segments": dump_process_memory(output_path),
-            #"imports": dump_imports(),
+            # "imports": dump_imports(),
         }
 
         # Write the index file
-        index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w')
+        index_file = open(os.path.join(output_path, INDEX_FILE_NAME), "w")
         index_file.write(json.dumps(context, indent=4))
-        index_file.close()    
+        index_file.close()
         print("Done.")
-        
+
     except Exception, e:
         print("!!! ERROR:\n\t{}".format(str(e)))
-        
+
+
 if __name__ == "__main__":
     main()
diff --git a/unicorn_mode/helper_scripts/unicorn_dumper_lldb.py b/unicorn_mode/helper_scripts/unicorn_dumper_lldb.py
index 3c019d77..179d062a 100644
--- a/unicorn_mode/helper_scripts/unicorn_dumper_lldb.py
+++ b/unicorn_mode/helper_scripts/unicorn_dumper_lldb.py
@@ -50,10 +50,11 @@ UNICORN_PAGE_SIZE = 0x1000
 
 # Alignment functions to align all memory segments to Unicorn page boundaries (4KB pages only)
 ALIGN_PAGE_DOWN = lambda x: x & ~(UNICORN_PAGE_SIZE - 1)
-ALIGN_PAGE_UP   = lambda x: (x + UNICORN_PAGE_SIZE - 1) & ~(UNICORN_PAGE_SIZE-1)
+ALIGN_PAGE_UP = lambda x: (x + UNICORN_PAGE_SIZE - 1) & ~(UNICORN_PAGE_SIZE - 1)
+
+# ----------------------
+# ---- Helper Functions
 
-#----------------------
-#---- Helper Functions
 
 def overlap_alignments(segments, memory):
     final_list = []
@@ -61,33 +62,40 @@ def overlap_alignments(segments, memory):
     curr_end_addr = 0
     curr_node = None
     current_segment = None
-    sorted_segments = sorted(segments, key=lambda k: (k['start'], k['end']))
+    sorted_segments = sorted(segments, key=lambda k: (k["start"], k["end"]))
     if curr_seg_idx < len(sorted_segments):
         current_segment = sorted_segments[curr_seg_idx]
-    for mem in sorted(memory, key=lambda k: (k['start'], -k['end'])):
+    for mem in sorted(memory, key=lambda k: (k["start"], -k["end"])):
         if curr_node is None:
-            if current_segment is not None and current_segment['start'] == mem['start']:
+            if current_segment is not None and current_segment["start"] == mem["start"]:
                 curr_node = deepcopy(current_segment)
-                curr_node['permissions'] = mem['permissions']
+                curr_node["permissions"] = mem["permissions"]
             else:
                 curr_node = deepcopy(mem)
 
-            curr_end_addr = curr_node['end']
-
-        while curr_end_addr <= mem['end']:
-            if curr_node['end'] == mem['end']:
-                if current_segment is not None and current_segment['start'] > curr_node['start'] and current_segment['start'] < curr_node['end']:
-                    curr_node['end'] = current_segment['start']
-                    if(curr_node['end'] > curr_node['start']):
+            curr_end_addr = curr_node["end"]
+
+        while curr_end_addr <= mem["end"]:
+            if curr_node["end"] == mem["end"]:
+                if (
+                    current_segment is not None
+                    and current_segment["start"] > curr_node["start"]
+                    and current_segment["start"] < curr_node["end"]
+                ):
+                    curr_node["end"] = current_segment["start"]
+                    if curr_node["end"] > curr_node["start"]:
                         final_list.append(curr_node)
                     curr_node = deepcopy(current_segment)
-                    curr_node['permissions'] = mem['permissions']
-                    curr_end_addr = curr_node['end']
+                    curr_node["permissions"] = mem["permissions"]
+                    curr_end_addr = curr_node["end"]
                 else:
-                    if(curr_node['end'] > curr_node['start']):
+                    if curr_node["end"] > curr_node["start"]:
                         final_list.append(curr_node)
                     # if curr_node is a segment
-                    if current_segment is not None and current_segment['end'] == mem['end']:
+                    if (
+                        current_segment is not None
+                        and current_segment["end"] == mem["end"]
+                    ):
                         curr_seg_idx += 1
                         if curr_seg_idx < len(sorted_segments):
                             current_segment = sorted_segments[curr_seg_idx]
@@ -98,50 +106,56 @@ def overlap_alignments(segments, memory):
                     break
             # could only be a segment
             else:
-                if curr_node['end'] < mem['end']:
+                if curr_node["end"] < mem["end"]:
                     # check for remaining segments and valid segments
-                    if(curr_node['end'] > curr_node['start']):
+                    if curr_node["end"] > curr_node["start"]:
                         final_list.append(curr_node)
-          
+
                     curr_seg_idx += 1
                     if curr_seg_idx < len(sorted_segments):
                         current_segment = sorted_segments[curr_seg_idx]
                     else:
                         current_segment = None
-                        
-                    if current_segment is not None and current_segment['start'] <= curr_end_addr and current_segment['start'] < mem['end']:
+
+                    if (
+                        current_segment is not None
+                        and current_segment["start"] <= curr_end_addr
+                        and current_segment["start"] < mem["end"]
+                    ):
                         curr_node = deepcopy(current_segment)
-                        curr_node['permissions'] = mem['permissions']
+                        curr_node["permissions"] = mem["permissions"]
                     else:
                         # no more segments
                         curr_node = deepcopy(mem)
-                        
-                    curr_node['start'] = curr_end_addr
-                    curr_end_addr = curr_node['end']
 
-    return final_list    
+                    curr_node["start"] = curr_end_addr
+                    curr_end_addr = curr_node["end"]
+
+    return final_list
+
 
 # https://github.com/llvm-mirror/llvm/blob/master/include/llvm/ADT/Triple.h
 def get_arch():
-    arch, arch_vendor, arch_os = lldb.target.GetTriple().split('-')
-    if arch == 'x86_64':
+    arch, arch_vendor, arch_os = lldb.target.GetTriple().split("-")
+    if arch == "x86_64":
         return "x64"
-    elif arch == 'x86' or arch == 'i386':
+    elif arch == "x86" or arch == "i386":
         return "x86"
-    elif arch == 'aarch64' or arch == 'arm64':
+    elif arch == "aarch64" or arch == "arm64":
         return "arm64le"
-    elif arch == 'aarch64_be':
+    elif arch == "aarch64_be":
         return "arm64be"
-    elif arch == 'armeb':
+    elif arch == "armeb":
         return "armbe"
-    elif arch == 'arm':
+    elif arch == "arm":
         return "armle"
     else:
         return ""
 
 
-#-----------------------
-#---- Dumping functions
+# -----------------------
+# ---- Dumping functions
+
 
 def dump_arch_info():
     arch_info = {}
@@ -152,56 +166,64 @@ def dump_arch_info():
 def dump_regs():
     reg_state = {}
     for reg_list in lldb.frame.GetRegisters():
-        if 'general purpose registers' in reg_list.GetName().lower():
+        if "general purpose registers" in reg_list.GetName().lower():
             for reg in reg_list:
                 reg_state[reg.GetName()] = int(reg.GetValue(), 16)
     return reg_state
 
+
 def get_section_info(sec):
-    name = sec.name if sec.name is not None else ''
+    name = sec.name if sec.name is not None else ""
     if sec.GetParent().name is not None:
-        name = sec.GetParent().name + '.' + sec.name
+        name = sec.GetParent().name + "." + sec.name
 
     module_name = sec.addr.module.file.GetFilename()
-    module_name = module_name if module_name is not None else ''
-    long_name = module_name + '.' + name
-    
+    module_name = module_name if module_name is not None else ""
+    long_name = module_name + "." + name
+
     return sec.addr.load_addr, (sec.addr.load_addr + sec.size), sec.size, long_name
- 
+
 
 def dump_process_memory(output_dir):
     # Segment information dictionary
     raw_segment_list = []
     raw_memory_list = []
-    
+
     # 1st pass:
     # Loop over the segments, fill in the segment info dictionary
     for module in lldb.target.module_iter():
         for seg_ea in module.section_iter():
-            seg_info = {'module': module.file.GetFilename() }
-            seg_info['start'], seg_info['end'], seg_size, seg_info['name'] = get_section_info(seg_ea)
+            seg_info = {"module": module.file.GetFilename()}
+            (
+                seg_info["start"],
+                seg_info["end"],
+                seg_size,
+                seg_info["name"],
+            ) = get_section_info(seg_ea)
             # TODO: Ugly hack for -1 LONG address on 32-bit
-            if seg_info['start'] >= sys.maxint or seg_size <= 0:
-                print "Throwing away page: {}".format(seg_info['name'])     
+            if seg_info["start"] >= sys.maxint or seg_size <= 0:
+                print "Throwing away page: {}".format(seg_info["name"])
                 continue
 
             # Page-align segment
-            seg_info['start'] = ALIGN_PAGE_DOWN(seg_info['start'])
-            seg_info['end'] = ALIGN_PAGE_UP(seg_info['end'])
-            print("Appending: {}".format(seg_info['name']))
+            seg_info["start"] = ALIGN_PAGE_DOWN(seg_info["start"])
+            seg_info["end"] = ALIGN_PAGE_UP(seg_info["end"])
+            print ("Appending: {}".format(seg_info["name"]))
             raw_segment_list.append(seg_info)
 
     # Add the stack memory region (just hardcode 0x1000 around the current SP)
     sp = lldb.frame.GetSP()
     start_sp = ALIGN_PAGE_DOWN(sp)
-    raw_segment_list.append({'start': start_sp, 'end': start_sp + 0x1000, 'name': 'STACK'})
+    raw_segment_list.append(
+        {"start": start_sp, "end": start_sp + 0x1000, "name": "STACK"}
+    )
 
     # Write the original memory to file for debugging
-    index_file = open(os.path.join(output_dir, DEBUG_MEM_FILE_NAME), 'w')
+    index_file = open(os.path.join(output_dir, DEBUG_MEM_FILE_NAME), "w")
     index_file.write(json.dumps(raw_segment_list, indent=4))
-    index_file.close()    
+    index_file.close()
 
-    # Loop over raw memory regions 
+    # Loop over raw memory regions
     mem_info = lldb.SBMemoryRegionInfo()
     start_addr = -1
     next_region_addr = 0
@@ -218,15 +240,20 @@ def dump_process_memory(output_dir):
         end_addr = mem_info.GetRegionEnd()
 
         # Unknown region name
-        region_name = 'UNKNOWN'
+        region_name = "UNKNOWN"
 
         # Ignore regions that aren't even mapped
         if mem_info.IsMapped() and mem_info.IsReadable():
-            mem_info_obj = {'start': start_addr, 'end': end_addr, 'name': region_name, 'permissions': {
-                "r": mem_info.IsReadable(),
-                "w": mem_info.IsWritable(),
-                "x": mem_info.IsExecutable()
-            }}
+            mem_info_obj = {
+                "start": start_addr,
+                "end": end_addr,
+                "name": region_name,
+                "permissions": {
+                    "r": mem_info.IsReadable(),
+                    "w": mem_info.IsWritable(),
+                    "x": mem_info.IsExecutable(),
+                },
+            }
 
             raw_memory_list.append(mem_info_obj)
 
@@ -234,65 +261,89 @@ def dump_process_memory(output_dir):
 
     for seg_info in final_segment_list:
         try:
-            seg_info['content_file'] = ''
-            start_addr = seg_info['start']
-            end_addr = seg_info['end']
-            region_name = seg_info['name']
+            seg_info["content_file"] = ""
+            start_addr = seg_info["start"]
+            end_addr = seg_info["end"]
+            region_name = seg_info["name"]
             # Compress and dump the content to a file
             err = lldb.SBError()
-            seg_content = lldb.process.ReadMemory(start_addr, end_addr - start_addr, err)
-            if(seg_content == None):
-                print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(start_addr, region_name))
-                seg_info['content_file'] = ''
+            seg_content = lldb.process.ReadMemory(
+                start_addr, end_addr - start_addr, err
+            )
+            if seg_content == None:
+                print (
+                    "Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(
+                        start_addr, region_name
+                    )
+                )
+                seg_info["content_file"] = ""
             else:
-                print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(start_addr, len(seg_content), region_name, repr(seg_info['permissions'])))
+                print (
+                    "Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(
+                        start_addr,
+                        len(seg_content),
+                        region_name,
+                        repr(seg_info["permissions"]),
+                    )
+                )
                 compressed_seg_content = zlib.compress(seg_content)
                 md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin"
-                seg_info['content_file'] = md5_sum
-                
+                seg_info["content_file"] = md5_sum
+
                 # Write the compressed contents to disk
-                out_file = open(os.path.join(output_dir, md5_sum), 'wb')
+                out_file = open(os.path.join(output_dir, md5_sum), "wb")
                 out_file.write(compressed_seg_content)
                 out_file.close()
-    
+
         except:
-            print("Exception reading segment ({}): {}".format(region_name, sys.exc_info()[0]))
-            
+            print (
+                "Exception reading segment ({}): {}".format(
+                    region_name, sys.exc_info()[0]
+                )
+            )
+
     return final_segment_list
 
-#----------
-#---- Main    
-    
+
+# ----------
+# ---- Main
+
+
 def main():
 
     try:
-        print("----- Unicorn Context Dumper -----")
-        print("You must be actively debugging before running this!")
-        print("If it fails, double check that you are actively debugging before running.")
-        
+        print ("----- Unicorn Context Dumper -----")
+        print ("You must be actively debugging before running this!")
+        print (
+            "If it fails, double check that you are actively debugging before running."
+        )
+
         # Create the output directory
-        timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S')
+        timestamp = datetime.datetime.fromtimestamp(time.time()).strftime(
+            "%Y%m%d_%H%M%S"
+        )
         output_path = "UnicornContext_" + timestamp
         if not os.path.exists(output_path):
             os.makedirs(output_path)
-        print("Process context will be output to {}".format(output_path))
-            
+        print ("Process context will be output to {}".format(output_path))
+
         # Get the context
         context = {
             "arch": dump_arch_info(),
-            "regs": dump_regs(), 
+            "regs": dump_regs(),
             "segments": dump_process_memory(output_path),
         }
-    
+
         # Write the index file
-        index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w')
+        index_file = open(os.path.join(output_path, INDEX_FILE_NAME), "w")
         index_file.write(json.dumps(context, indent=4))
-        index_file.close()    
-        print("Done.")
-        
+        index_file.close()
+        print ("Done.")
+
     except Exception, e:
-        print("!!! ERROR:\n\t{}".format(repr(e)))
-        
+        print ("!!! ERROR:\n\t{}".format(repr(e)))
+
+
 if __name__ == "__main__":
     main()
 elif lldb.debugger:
diff --git a/unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py b/unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py
index dc56b2aa..eccbc8bf 100644
--- a/unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py
+++ b/unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py
@@ -59,45 +59,47 @@ MAX_SEG_SIZE = 128 * 1024 * 1024
 # Name of the index file
 INDEX_FILE_NAME = "_index.json"
 
-#----------------------
-#---- Helper Functions
+# ----------------------
+# ---- Helper Functions
+
 
 def map_arch():
-    arch = pwndbg.arch.current # from PWNDBG
-    if 'x86_64' in arch or 'x86-64' in arch:
+    arch = pwndbg.arch.current  # from PWNDBG
+    if "x86_64" in arch or "x86-64" in arch:
         return "x64"
-    elif 'x86' in arch or 'i386' in arch:
+    elif "x86" in arch or "i386" in arch:
         return "x86"
-    elif 'aarch64' in arch or 'arm64' in arch:
+    elif "aarch64" in arch or "arm64" in arch:
         return "arm64le"
-    elif 'aarch64_be' in arch:
+    elif "aarch64_be" in arch:
         return "arm64be"
-    elif 'arm' in arch:
-        cpsr = pwndbg.regs['cpsr']
-        # check endianess 
-        if pwndbg.arch.endian == 'big':
+    elif "arm" in arch:
+        cpsr = pwndbg.regs["cpsr"]
+        # check endianess
+        if pwndbg.arch.endian == "big":
             # check for THUMB mode
-            if (cpsr & (1 << 5)):
+            if cpsr & (1 << 5):
                 return "armbethumb"
             else:
                 return "armbe"
         else:
             # check for THUMB mode
-            if (cpsr & (1 << 5)):
+            if cpsr & (1 << 5):
                 return "armlethumb"
             else:
                 return "armle"
-    elif 'mips' in arch:
-        if pwndbg.arch.endian == 'little':
-            return 'mipsel'
+    elif "mips" in arch:
+        if pwndbg.arch.endian == "little":
+            return "mipsel"
         else:
-            return 'mips'
+            return "mips"
     else:
         return ""
 
 
-#-----------------------
-#---- Dumping functions
+# -----------------------
+# ---- Dumping functions
+
 
 def dump_arch_info():
     arch_info = {}
@@ -110,26 +112,26 @@ def dump_regs():
     for reg in pwndbg.regs.all:
         reg_val = pwndbg.regs[reg]
         # current dumper script looks for register values to be hex strings
-#         reg_str = "0x{:08x}".format(reg_val)
-#         if "64" in get_arch():
-#             reg_str = "0x{:016x}".format(reg_val)
-#         reg_state[reg.strip().strip('$')] = reg_str
-        reg_state[reg.strip().strip('$')] = reg_val
+        #         reg_str = "0x{:08x}".format(reg_val)
+        #         if "64" in get_arch():
+        #             reg_str = "0x{:016x}".format(reg_val)
+        #         reg_state[reg.strip().strip('$')] = reg_str
+        reg_state[reg.strip().strip("$")] = reg_val
     return reg_state
 
 
 def dump_process_memory(output_dir):
     # Segment information dictionary
     final_segment_list = []
-    
+
     # PWNDBG:
     vmmap = pwndbg.vmmap.get()
-    
+
     # Pointer to end of last dumped memory segment
-    segment_last_addr = 0x0;
+    segment_last_addr = 0x0
 
     start = None
-    end   = None
+    end = None
 
     if not vmmap:
         print("No address mapping information found")
@@ -141,86 +143,107 @@ def dump_process_memory(output_dir):
             continue
 
         start = entry.start
-        end   = entry.end
+        end = entry.end
 
-        if (segment_last_addr > entry.start): # indicates overlap
-            if (segment_last_addr > entry.end): # indicates complete overlap, so we skip the segment entirely
+        if segment_last_addr > entry.start:  # indicates overlap
+            if (
+                segment_last_addr > entry.end
+            ):  # indicates complete overlap, so we skip the segment entirely
                 continue
-            else:            
+            else:
                 start = segment_last_addr
-            
-        
-        seg_info = {'start': start, 'end': end, 'name': entry.objfile, 'permissions': {
-            "r": entry.read,
-            "w": entry.write,
-            "x": entry.execute
-        }, 'content_file': ''}
+
+        seg_info = {
+            "start": start,
+            "end": end,
+            "name": entry.objfile,
+            "permissions": {"r": entry.read, "w": entry.write, "x": entry.execute},
+            "content_file": "",
+        }
 
         # "(deleted)" may or may not be valid, but don't push it.
-        if entry.read and not '(deleted)' in entry.objfile:
+        if entry.read and not "(deleted)" in entry.objfile:
             try:
                 # Compress and dump the content to a file
                 seg_content = pwndbg.memory.read(start, end - start)
-                if(seg_content == None):
-                    print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(entry.start, entry.objfile))
+                if seg_content == None:
+                    print(
+                        "Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(
+                            entry.start, entry.objfile
+                        )
+                    )
                 else:
-                    print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(entry.start, len(seg_content), entry.objfile, repr(seg_info['permissions'])))
+                    print(
+                        "Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(
+                            entry.start,
+                            len(seg_content),
+                            entry.objfile,
+                            repr(seg_info["permissions"]),
+                        )
+                    )
                     compressed_seg_content = zlib.compress(str(seg_content))
                     md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin"
                     seg_info["content_file"] = md5_sum
-                    
+
                     # Write the compressed contents to disk
-                    out_file = open(os.path.join(output_dir, md5_sum), 'wb')
+                    out_file = open(os.path.join(output_dir, md5_sum), "wb")
                     out_file.write(compressed_seg_content)
                     out_file.close()
 
             except Exception as e:
                 traceback.print_exc()
-                print("Exception reading segment ({}): {}".format(entry.objfile, sys.exc_info()[0]))
+                print(
+                    "Exception reading segment ({}): {}".format(
+                        entry.objfile, sys.exc_info()[0]
+                    )
+                )
         else:
             print("Skipping segment {0}@0x{1:016x}".format(entry.objfile, entry.start))
-        
+
         segment_last_addr = end
 
         # Add the segment to the list
         final_segment_list.append(seg_info)
 
-            
     return final_segment_list
 
-#----------
-#---- Main    
-    
+
+# ----------
+# ---- Main
+
+
 def main():
     print("----- Unicorn Context Dumper -----")
     print("You must be actively debugging before running this!")
     print("If it fails, double check that you are actively debugging before running.")
-    
+
     try:
 
         # Create the output directory
-        timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S')
+        timestamp = datetime.datetime.fromtimestamp(time.time()).strftime(
+            "%Y%m%d_%H%M%S"
+        )
         output_path = "UnicornContext_" + timestamp
         if not os.path.exists(output_path):
             os.makedirs(output_path)
         print("Process context will be output to {}".format(output_path))
-            
+
         # Get the context
         context = {
             "arch": dump_arch_info(),
-            "regs": dump_regs(), 
+            "regs": dump_regs(),
             "segments": dump_process_memory(output_path),
         }
 
         # Write the index file
-        index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w')
+        index_file = open(os.path.join(output_path, INDEX_FILE_NAME), "w")
         index_file.write(json.dumps(context, indent=4))
-        index_file.close()    
+        index_file.close()
         print("Done.")
-        
+
     except Exception as e:
         print("!!! ERROR:\n\t{}".format(repr(e)))
-        
+
+
 if __name__ == "__main__" and pwndbg_loaded:
     main()
-    
diff --git a/unicorn_mode/helper_scripts/unicorn_loader.py b/unicorn_mode/helper_scripts/unicorn_loader.py
index adf21b64..1914a83d 100644
--- a/unicorn_mode/helper_scripts/unicorn_loader.py
+++ b/unicorn_mode/helper_scripts/unicorn_loader.py
@@ -1,8 +1,8 @@
 """
     unicorn_loader.py
-    
-    Loads a process context dumped created using a 
-    Unicorn Context Dumper script into a Unicorn Engine 
+
+    Loads a process context dump created using a
+    Unicorn Context Dumper script into a Unicorn Engine
     instance. Once this is performed emulation can be
     started.
 """
@@ -26,6 +26,13 @@ from unicorn.arm64_const import *
 from unicorn.x86_const import *
 from unicorn.mips_const import *
 
+# If Capstone libraries are availible (only check once)
+try:
+    from capstone import *
+    CAPSTONE_EXISTS = 1
+except:
+    CAPSTONE_EXISTS = 0
+
 # Name of the index file
 INDEX_FILE_NAME = "_index.json"
 
@@ -86,7 +93,7 @@ class UnicornSimpleHeap(object):
         total_chunk_size = UNICORN_PAGE_SIZE + ALIGN_PAGE_UP(size) + UNICORN_PAGE_SIZE
         # Gross but efficient way to find space for the chunk:
         chunk = None
-        for addr in xrange(self.HEAP_MIN_ADDR, self.HEAP_MAX_ADDR, UNICORN_PAGE_SIZE):
+        for addr in range(self.HEAP_MIN_ADDR, self.HEAP_MAX_ADDR, UNICORN_PAGE_SIZE):
             try:
                 self._uc.mem_map(addr, total_chunk_size, UC_PROT_READ | UC_PROT_WRITE)
                 chunk = self.HeapChunk(addr, total_chunk_size, size)
@@ -97,7 +104,7 @@ class UnicornSimpleHeap(object):
                 continue
         # Something went very wrong
         if chunk == None:
-            return 0    
+            return 0
         self._chunks.append(chunk)
         return chunk.data_addr
 
@@ -112,8 +119,8 @@ class UnicornSimpleHeap(object):
         old_chunk = None
         for chunk in self._chunks:
             if chunk.data_addr == ptr:
-                old_chunk = chunk 
-        new_chunk_addr = self.malloc(new_size) 
+                old_chunk = chunk
+        new_chunk_addr = self.malloc(new_size)
         if old_chunk != None:
             self._uc.mem_write(new_chunk_addr, str(self._uc.mem_read(old_chunk.data_addr, old_chunk.data_size)))
             self.free(old_chunk.data_addr)
@@ -184,39 +191,27 @@ class AflUnicornEngine(Uc):
         # Load the registers
         regs = context['regs']
         reg_map = self.__get_register_map(self._arch_str)
-        for register, value in regs.iteritems():
-            if debug_print:
-                print("Reg {0} = {1}".format(register, value))
-            if not reg_map.has_key(register.lower()):
-                if debug_print:
-                    print("Skipping Reg: {}".format(register))
-            else:
-                reg_write_retry = True
-                try:
-                    self.reg_write(reg_map[register.lower()], value)
-                    reg_write_retry = False
-                except Exception as e:
-                    if debug_print:
-                        print("ERROR writing register: {}, value: {} -- {}".format(register, value, repr(e)))
+        self.__load_registers(regs, reg_map, debug_print)
+        # If we have extra FLOATING POINT regs, load them in!
+        if 'regs_extended' in context:
+            if context['regs_extended']:
+                regs_extended = context['regs_extended']
+                reg_map = self.__get_registers_extended(self._arch_str)
+                self.__load_registers(regs_extended, reg_map, debug_print)
+
+        # For ARM, sometimes the stack pointer is erased ??? (I think I fixed this (issue with ordering of dumper.py, I'll keep the write anyways)
+        if self.__get_arch_and_mode(self.get_arch_str())[0] == UC_ARCH_ARM:
+            self.reg_write(UC_ARM_REG_SP, regs['sp'])
 
-                if reg_write_retry:
-                    if debug_print:
-                        print("Trying to parse value ({}) as hex string".format(value))
-                    try:
-                        self.reg_write(reg_map[register.lower()], int(value, 16))
-                    except Exception as e:
-                        if debug_print:
-                            print("ERROR writing hex string register: {}, value: {} -- {}".format(register, value, repr(e)))
-                        
         # Setup the memory map and load memory content
         self.__map_segments(context['segments'], context_directory, debug_print)
-        
+
         if enable_trace:
             self.hook_add(UC_HOOK_BLOCK, self.__trace_block)
             self.hook_add(UC_HOOK_CODE, self.__trace_instruction)
             self.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, self.__trace_mem_access)
             self.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, self.__trace_mem_invalid_access)
-            
+
         if debug_print:
             print("Done loading context.")
 
@@ -228,7 +223,7 @@ class AflUnicornEngine(Uc):
 
     def get_arch_str(self):
         return self._arch_str
-                    
+
     def force_crash(self, uc_error):
         """ This function should be called to indicate to AFL that a crash occurred during emulation.
             You can pass the exception received from Uc.emu_start
@@ -253,21 +248,76 @@ class AflUnicornEngine(Uc):
         for reg in sorted(self.__get_register_map(self._arch_str).items(), key=lambda reg: reg[0]):
             print(">>> {0:>4}: 0x{1:016x}".format(reg[0], self.reg_read(reg[1])))
 
+    def dump_regs_extended(self):
+        """ Dumps the contents of all the registers to STDOUT """
+        try:
+            for reg in sorted(self.__get_registers_extended(self._arch_str).items(), key=lambda reg: reg[0]):
+                print(">>> {0:>4}: 0x{1:016x}".format(reg[0], self.reg_read(reg[1])))
+        except Exception as e:
+            print("ERROR: Are extended registers loaded?")
+
     # TODO: Make this dynamically get the stack pointer register and pointer width for the current architecture
     """
     def dump_stack(self, window=10):
+        arch = self.get_arch()
+        mode = self.get_mode()
+        # Get stack pointers and bit sizes for given architecture
+        if arch == UC_ARCH_X86 and mode == UC_MODE_64:
+            stack_ptr_addr = self.reg_read(UC_X86_REG_RSP)
+            bit_size = 8
+        elif arch == UC_ARCH_X86 and mode == UC_MODE_32:
+            stack_ptr_addr = self.reg_read(UC_X86_REG_ESP)
+            bit_size = 4
+        elif arch == UC_ARCH_ARM64:
+            stack_ptr_addr = self.reg_read(UC_ARM64_REG_SP)
+            bit_size = 8
+        elif arch == UC_ARCH_ARM:
+            stack_ptr_addr = self.reg_read(UC_ARM_REG_SP)
+            bit_size = 4
+        elif arch == UC_ARCH_ARM and mode == UC_MODE_THUMB:
+            stack_ptr_addr = self.reg_read(UC_ARM_REG_SP)
+            bit_size = 4
+        elif arch == UC_ARCH_MIPS:
+            stack_ptr_addr = self.reg_read(UC_MIPS_REG_SP)
+            bit_size = 4
+        print("")
         print(">>> Stack:")
         stack_ptr_addr = self.reg_read(UC_X86_REG_RSP)
         for i in xrange(-window, window + 1):
             addr = stack_ptr_addr + (i*8)
             print("{0}0x{1:016x}: 0x{2:016x}".format( \
-                'SP->' if i == 0 else '    ', addr, \
+               'SP->' if i == 0 else '    ', addr, \
                 struct.unpack('<Q', self.mem_read(addr, 8))[0]))
     """
 
     #-----------------------------
     #---- Loader Helper Functions
 
+    def __load_registers(self, regs, reg_map, debug_print):
+        for register, value in regs.items():
+            if debug_print:
+                print("Reg {0} = {1}".format(register, value))
+            if register.lower() not in reg_map:
+                if debug_print:
+                    print("Skipping Reg: {}".format(register))
+            else:
+                reg_write_retry = True
+                try:
+                    self.reg_write(reg_map[register.lower()], value)
+                    reg_write_retry = False
+                except Exception as e:
+                    if debug_print:
+                        print("ERROR writing register: {}, value: {} -- {}".format(register, value, repr(e)))
+
+                if reg_write_retry:
+                    if debug_print:
+                        print("Trying to parse value ({}) as hex string".format(value))
+                    try:
+                        self.reg_write(reg_map[register.lower()], int(value, 16))
+                    except Exception as e:
+                        if debug_print:
+                            print("ERROR writing hex string register: {}, value: {} -- {}".format(register, value, repr(e)))
+
     def __map_segment(self, name, address, size, perms, debug_print=False):
         # - size is unsigned and must be != 0
         # - starting address must be aligned to 4KB
@@ -289,7 +339,7 @@ class AflUnicornEngine(Uc):
 
     def __map_segments(self, segment_list, context_directory, debug_print=False):
         for segment in segment_list:
-            
+
             # Get the segment information from the index
             name = segment['name']
             seg_start = segment['start']
@@ -297,7 +347,7 @@ class AflUnicornEngine(Uc):
             perms = \
                 (UC_PROT_READ  if segment['permissions']['r'] == True else 0) | \
                 (UC_PROT_WRITE if segment['permissions']['w'] == True else 0) | \
-                (UC_PROT_EXEC  if segment['permissions']['x'] == True else 0)        
+                (UC_PROT_EXEC  if segment['permissions']['x'] == True else 0)
 
             if debug_print:
                 print("Handling segment {}".format(name))
@@ -349,12 +399,12 @@ class AflUnicornEngine(Uc):
                 content_file = open(content_file_path, 'rb')
                 compressed_content = content_file.read()
                 content_file.close()
-                self.mem_write(seg_start, zlib.decompress(compressed_content)) 
+                self.mem_write(seg_start, zlib.decompress(compressed_content))
 
             else:
                 if debug_print:
                     print("No content found for segment {0} @ {1:016x}".format(name, seg_start))
-                self.mem_write(seg_start, '\x00' * (seg_end - seg_start))
+                self.mem_write(seg_start, b'\x00' * (seg_end - seg_start))
 
     def __get_arch_and_mode(self, arch_str):
         arch_map = {
@@ -398,7 +448,6 @@ class AflUnicornEngine(Uc):
                 "r14":    UC_X86_REG_R14,
                 "r15":    UC_X86_REG_R15,
                 "rip":    UC_X86_REG_RIP,
-                "rsp":    UC_X86_REG_RSP,
                 "efl":    UC_X86_REG_EFLAGS,
                 "cs":     UC_X86_REG_CS,
                 "ds":     UC_X86_REG_DS,
@@ -415,13 +464,12 @@ class AflUnicornEngine(Uc):
                 "esi":    UC_X86_REG_ESI,
                 "edi":    UC_X86_REG_EDI,
                 "ebp":    UC_X86_REG_EBP,
-                "esp":    UC_X86_REG_ESP,
                 "eip":    UC_X86_REG_EIP,
                 "esp":    UC_X86_REG_ESP,
-                "efl":    UC_X86_REG_EFLAGS,        
+                "efl":    UC_X86_REG_EFLAGS,
                 # Segment registers removed...
                 # They caused segfaults (from unicorn?) when they were here
-            },        
+            },
             "arm" : {
                 "r0":     UC_ARM_REG_R0,
                 "r1":     UC_ARM_REG_R1,
@@ -476,7 +524,7 @@ class AflUnicornEngine(Uc):
                 "fp":     UC_ARM64_REG_FP,
                 "lr":     UC_ARM64_REG_LR,
                 "nzcv":   UC_ARM64_REG_NZCV,
-                "cpsr": UC_ARM_REG_CPSR, 
+                "cpsr": UC_ARM_REG_CPSR,
             },
             "mips" : {
                 "0" :     UC_MIPS_REG_ZERO,
@@ -499,13 +547,13 @@ class AflUnicornEngine(Uc):
                 "t9":     UC_MIPS_REG_T9,
                 "s0":     UC_MIPS_REG_S0,
                 "s1":     UC_MIPS_REG_S1,
-                "s2":     UC_MIPS_REG_S2,    
+                "s2":     UC_MIPS_REG_S2,
                 "s3":     UC_MIPS_REG_S3,
                 "s4":     UC_MIPS_REG_S4,
                 "s5":     UC_MIPS_REG_S5,
-                "s6":     UC_MIPS_REG_S6,              
+                "s6":     UC_MIPS_REG_S6,
                 "s7":     UC_MIPS_REG_S7,
-                "s8":     UC_MIPS_REG_S8,  
+                "s8":     UC_MIPS_REG_S8,
                 "k0":     UC_MIPS_REG_K0,
                 "k1":     UC_MIPS_REG_K1,
                 "gp":     UC_MIPS_REG_GP,
@@ -517,44 +565,127 @@ class AflUnicornEngine(Uc):
                 "lo":     UC_MIPS_REG_LO
             }
         }
-        return registers[arch]   
+        return registers[arch]
 
+    def __get_registers_extended(self, arch):
+        # Similar to __get_register_map, but for ARM floating point registers
+        if arch == "arm64le" or arch == "arm64be":
+            arch = "arm64"
+        elif arch == "armle" or arch == "armbe" or "thumb" in arch:
+            arch = "arm"
+        elif arch == "mipsel":
+            arch = "mips"
+
+        registers = {
+        "arm": {
+            "d0": UC_ARM_REG_D0,
+            "d1": UC_ARM_REG_D1,
+            "d2": UC_ARM_REG_D2,
+            "d3": UC_ARM_REG_D3,
+            "d4": UC_ARM_REG_D4,
+            "d5": UC_ARM_REG_D5,
+            "d6": UC_ARM_REG_D6,
+            "d7": UC_ARM_REG_D7,
+            "d8": UC_ARM_REG_D8,
+            "d9": UC_ARM_REG_D9,
+            "d10": UC_ARM_REG_D10,
+            "d11": UC_ARM_REG_D11,
+            "d12": UC_ARM_REG_D12,
+            "d13": UC_ARM_REG_D13,
+            "d14": UC_ARM_REG_D14,
+            "d15": UC_ARM_REG_D15,
+            "d16": UC_ARM_REG_D16,
+            "d17": UC_ARM_REG_D17,
+            "d18": UC_ARM_REG_D18,
+            "d19": UC_ARM_REG_D19,
+            "d20": UC_ARM_REG_D20,
+            "d21": UC_ARM_REG_D21,
+            "d22": UC_ARM_REG_D22,
+            "d23": UC_ARM_REG_D23,
+            "d24": UC_ARM_REG_D24,
+            "d25": UC_ARM_REG_D25,
+            "d26": UC_ARM_REG_D26,
+            "d27": UC_ARM_REG_D27,
+            "d28": UC_ARM_REG_D28,
+            "d29": UC_ARM_REG_D29,
+            "d30": UC_ARM_REG_D30,
+            "d31": UC_ARM_REG_D31,
+            "fpscr": UC_ARM_REG_FPSCR
+            }
+        }
+
+        return registers[arch];
     #---------------------------
-    # Callbacks for tracing 
+    # Callbacks for tracing
 
-    # TODO: Make integer-printing fixed widths dependent on bitness of architecture 
-    #       (i.e. only show 4 bytes for 32-bit, 8 bytes for 64-bit)
 
-    # TODO: Figure out how best to determine the capstone mode and architecture here
-    """
-    try:
-        # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
-        from capstone import *
-        cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN)
-        def __trace_instruction(self, uc, address, size, user_data):
-            mem = uc.mem_read(address, size)
-            for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
-                print("    Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
-    except ImportError:
-        def __trace_instruction(self, uc, address, size, user_data):
-            print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))    
-    """
+    # TODO: Extra mode for Capstone (i.e. Cs(cs_arch, cs_mode + cs_extra) not implemented
+
 
     def __trace_instruction(self, uc, address, size, user_data):
-        print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))  
-        
+        if CAPSTONE_EXISTS == 1:
+            # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
+            arch = self.get_arch()
+            mode = self.get_mode()
+            bit_size = self.bit_size_arch()
+            # Map current arch to capstone labeling
+            if arch == UC_ARCH_X86 and mode == UC_MODE_64:
+                cs_arch = CS_ARCH_X86
+                cs_mode = CS_MODE_64
+            elif arch == UC_ARCH_X86 and mode == UC_MODE_32:
+                cs_arch = CS_ARCH_X86
+                cs_mode = CS_MODE_32
+            elif arch == UC_ARCH_ARM64:
+                cs_arch = CS_ARCH_ARM64
+                cs_mode = CS_MODE_ARM
+            elif arch == UC_ARCH_ARM and mode == UC_MODE_THUMB:
+                cs_arch = CS_ARCH_ARM
+                cs_mode = CS_MODE_THUMB
+            elif arch == UC_ARCH_ARM:
+                cs_arch = CS_ARCH_ARM
+                cs_mode = CS_MODE_ARM
+            elif arch == UC_ARCH_MIPS:
+                cs_arch = CS_ARCH_MIPS
+                cs_mode = CS_MODE_MIPS32  # No other MIPS supported in program
+
+            cs = Cs(cs_arch, cs_mode)
+            mem = uc.mem_read(address, size)
+            if bit_size == 4:
+                for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
+                    print("    Instr: {:#08x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
+            else:
+                for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
+                    print("    Instr: {:#16x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
+        else:
+            print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
+
     def __trace_block(self, uc, address, size, user_data):
         print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
-      
+
     def __trace_mem_access(self, uc, access, address, size, value, user_data):
         if access == UC_MEM_WRITE:
             print("        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
         else:
-            print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))    
+            print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))
 
     def __trace_mem_invalid_access(self, uc, access, address, size, value, user_data):
         if access == UC_MEM_WRITE_UNMAPPED:
             print("        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
         else:
-            print("        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size))   
-
+            print("        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size))
+
+    def bit_size_arch(self):
+        arch = self.get_arch()
+        mode = self.get_mode()
+        # Get bit sizes for given architecture
+        if arch == UC_ARCH_X86 and mode == UC_MODE_64:
+            bit_size = 8
+        elif arch == UC_ARCH_X86 and mode == UC_MODE_32:
+            bit_size = 4
+        elif arch == UC_ARCH_ARM64:
+            bit_size = 8
+        elif arch == UC_ARCH_ARM:
+            bit_size = 4
+        elif arch == UC_ARCH_MIPS:
+            bit_size = 4
+        return bit_size
diff --git a/unicorn_mode/samples/c/COMPILE.md b/unicorn_mode/samples/c/COMPILE.md
index 7857e5bf..7da140f7 100644
--- a/unicorn_mode/samples/c/COMPILE.md
+++ b/unicorn_mode/samples/c/COMPILE.md
@@ -17,6 +17,6 @@ You shouldn't need to compile simple_target.c since a X86_64 binary version is
 pre-built and shipped in this sample folder. This file documents how the binary
 was built in case you want to rebuild it or recompile it for any reason.
 
-The pre-built binary (simple_target_x86_64.bin) was built using -g -O0 in gcc.
+The pre-built binary (persistent_target_x86_64) was built using -g -O0 in gcc.
 
 We then load the binary and execute the main function directly.
diff --git a/unicorn_mode/samples/compcov_x64/compcov_test_harness.py b/unicorn_mode/samples/compcov_x64/compcov_test_harness.py
index b9ebb61d..f0749d1b 100644
--- a/unicorn_mode/samples/compcov_x64/compcov_test_harness.py
+++ b/unicorn_mode/samples/compcov_x64/compcov_test_harness.py
@@ -22,48 +22,81 @@ from unicornafl import *
 from unicornafl.x86_const import *
 
 # Path to the file containing the binary to emulate
-BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'compcov_target.bin')
+BINARY_FILE = os.path.join(
+    os.path.dirname(os.path.abspath(__file__)), "compcov_target.bin"
+)
 
 # Memory map for the code to be tested
-CODE_ADDRESS  = 0x00100000  # Arbitrary address where code to test will be loaded
+CODE_ADDRESS = 0x00100000  # Arbitrary address where code to test will be loaded
 CODE_SIZE_MAX = 0x00010000  # Max size for the code (64kb)
 STACK_ADDRESS = 0x00200000  # Address of the stack (arbitrarily chosen)
-STACK_SIZE	  = 0x00010000  # Size of the stack (arbitrarily chosen)
-DATA_ADDRESS  = 0x00300000  # Address where mutated data will be placed
+STACK_SIZE = 0x00010000  # Size of the stack (arbitrarily chosen)
+DATA_ADDRESS = 0x00300000  # Address where mutated data will be placed
 DATA_SIZE_MAX = 0x00010000  # Maximum allowable size of mutated data
 
 try:
     # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
     from capstone import *
+
     cs = Cs(CS_ARCH_X86, CS_MODE_64)
+
     def unicorn_debug_instruction(uc, address, size, user_data):
         mem = uc.mem_read(address, size)
-        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
+        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(
+            bytes(mem), size
+        ):
             print("    Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
+
+
 except ImportError:
+
     def unicorn_debug_instruction(uc, address, size, user_data):
         print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
 
+
 def unicorn_debug_block(uc, address, size, user_data):
     print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
 
+
 def unicorn_debug_mem_access(uc, access, address, size, value, user_data):
     if access == UC_MEM_WRITE:
-        print("        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
+        print(
+            "        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
     else:
         print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))
 
+
 def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data):
     if access == UC_MEM_WRITE_UNMAPPED:
-        print("        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
+        print(
+            "        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
     else:
-        print("        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size))
+        print(
+            "        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)
+        )
+
 
 def main():
 
     parser = argparse.ArgumentParser(description="Test harness for compcov_target.bin")
-    parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input to load")
-    parser.add_argument('-t', '--trace', default=False, action="store_true", help="Enables debug tracing")
+    parser.add_argument(
+        "input_file",
+        type=str,
+        help="Path to the file containing the mutated input to load",
+    )
+    parser.add_argument(
+        "-t",
+        "--trace",
+        default=False,
+        action="store_true",
+        help="Enables debug tracing",
+    )
     args = parser.parse_args()
 
     # Instantiate a MIPS32 big endian Unicorn Engine instance
@@ -73,13 +106,16 @@ def main():
         uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block)
         uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction)
         uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access)
-        uc.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, unicorn_debug_mem_invalid_access)
+        uc.hook_add(
+            UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID,
+            unicorn_debug_mem_invalid_access,
+        )
 
-    #---------------------------------------------------
+    # ---------------------------------------------------
     # Load the binary to emulate and map it into memory
 
     print("Loading data input from {}".format(args.input_file))
-    binary_file = open(BINARY_FILE, 'rb')
+    binary_file = open(BINARY_FILE, "rb")
     binary_code = binary_file.read()
     binary_file.close()
 
@@ -93,11 +129,11 @@ def main():
     uc.mem_write(CODE_ADDRESS, binary_code)
 
     # Set the program counter to the start of the code
-    start_address = CODE_ADDRESS          # Address of entry point of main()
-    end_address   = CODE_ADDRESS + 0x55   # Address of last instruction in main()
+    start_address = CODE_ADDRESS  # Address of entry point of main()
+    end_address = CODE_ADDRESS + 0x55  # Address of last instruction in main()
     uc.reg_write(UC_X86_REG_RIP, start_address)
 
-    #-----------------
+    # -----------------
     # Setup the stack
 
     uc.mem_map(STACK_ADDRESS, STACK_SIZE)
@@ -106,8 +142,7 @@ def main():
     # Mapping a location to write our buffer to
     uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX)
 
-
-    #-----------------------------------------------
+    # -----------------------------------------------
     # Load the mutated input and map it into memory
 
     def place_input_callback(uc, input, _, data):
@@ -121,7 +156,7 @@ def main():
         # Write the mutated command into the data buffer
         uc.mem_write(DATA_ADDRESS, input)
 
-    #------------------------------------------------------------
+    # ------------------------------------------------------------
     # Emulate the code, allowing it to process the mutated input
 
     print("Starting the AFL fuzz")
@@ -129,8 +164,9 @@ def main():
         input_file=args.input_file,
         place_input_callback=place_input_callback,
         exits=[end_address],
-        persistent_iters=1
+        persistent_iters=1,
     )
 
+
 if __name__ == "__main__":
     main()
diff --git a/unicorn_mode/samples/persistent/simple_target_noncrashing.c b/unicorn_mode/samples/persistent/simple_target_noncrashing.c
index 00764473..9257643b 100644
--- a/unicorn_mode/samples/persistent/simple_target_noncrashing.c
+++ b/unicorn_mode/samples/persistent/simple_target_noncrashing.c
@@ -10,7 +10,7 @@
  * Written by Nathan Voss <njvoss99@gmail.com>
  * Adapted by Lukas Seidel <seidel.1@campus.tu-berlin.de>
  */
-
+#include <string.h>
 
 int main(int argc, char** argv) {
   if(argc < 2){
@@ -19,15 +19,19 @@ int main(int argc, char** argv) {
 
   char *data_buf = argv[1];
 
-  if len(data_buf < 20) {
-  if (data_buf[20] != 0) {
+  if (strlen(data_buf) >= 21 && data_buf[20] != 0) {
     printf("Not crashing");
-  } else if (data_buf[0] > 0x10 && data_buf[0] < 0x20 && data_buf[1] > data_buf[2]) {
+  } else if (strlen(data_buf) > 1
+             && data_buf[0] > 0x10 && data_buf[0] < 0x20 && data_buf[1] > data_buf[2]) {
     printf("Also not crashing with databuf[0] == %c", data_buf[0])
-  } else if (data_buf[9] == 0x00 && data_buf[10] != 0x00 && data_buf[11] == 0x00) {
+  }
+#if 0
+  // not possible with argv (zero terminated strings) (hexcoder-)
+  // do not try to access data_buf[10] and beyond
+  else if (data_buf[9] == 0x00 && data_buf[10] != 0x00 && data_buf[11] == 0x00) {
     // Cause a crash if data[10] is not zero, but [9] and [11] are zero
     unsigned char invalid_read = *(unsigned char *) 0x00000000;
   }
-
+#endif
   return 0;
 }
diff --git a/unicorn_mode/samples/simple/simple_test_harness.py b/unicorn_mode/samples/simple/simple_test_harness.py
index f4002ca8..cd04ad3a 100644
--- a/unicorn_mode/samples/simple/simple_test_harness.py
+++ b/unicorn_mode/samples/simple/simple_test_harness.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
 """ 
    Simple test harness for AFL's Unicorn Mode.
 
@@ -22,48 +22,81 @@ from unicornafl import *
 from unicornafl.mips_const import *
 
 # Path to the file containing the binary to emulate
-BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'simple_target.bin')
+BINARY_FILE = os.path.join(
+    os.path.dirname(os.path.abspath(__file__)), "simple_target.bin"
+)
 
 # Memory map for the code to be tested
-CODE_ADDRESS  = 0x00100000  # Arbitrary address where code to test will be loaded
+CODE_ADDRESS = 0x00100000  # Arbitrary address where code to test will be loaded
 CODE_SIZE_MAX = 0x00010000  # Max size for the code (64kb)
 STACK_ADDRESS = 0x00200000  # Address of the stack (arbitrarily chosen)
-STACK_SIZE	  = 0x00010000  # Size of the stack (arbitrarily chosen)
-DATA_ADDRESS  = 0x00300000  # Address where mutated data will be placed
+STACK_SIZE = 0x00010000  # Size of the stack (arbitrarily chosen)
+DATA_ADDRESS = 0x00300000  # Address where mutated data will be placed
 DATA_SIZE_MAX = 0x00010000  # Maximum allowable size of mutated data
 
 try:
     # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
     from capstone import *
+
     cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN)
+
     def unicorn_debug_instruction(uc, address, size, user_data):
         mem = uc.mem_read(address, size)
-        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
+        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(
+            bytes(mem), size
+        ):
             print("    Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
+
+
 except ImportError:
+
     def unicorn_debug_instruction(uc, address, size, user_data):
-        print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))    
+        print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
+
 
 def unicorn_debug_block(uc, address, size, user_data):
     print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
-    
+
+
 def unicorn_debug_mem_access(uc, access, address, size, value, user_data):
     if access == UC_MEM_WRITE:
-        print("        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
+        print(
+            "        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
     else:
-        print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))    
+        print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))
+
 
 def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data):
     if access == UC_MEM_WRITE_UNMAPPED:
-        print("        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
+        print(
+            "        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
     else:
-        print("        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size))   
+        print(
+            "        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)
+        )
+
 
 def main():
 
     parser = argparse.ArgumentParser(description="Test harness for simple_target.bin")
-    parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input to load")
-    parser.add_argument('-t', '--trace', default=False, action="store_true", help="Enables debug tracing")
+    parser.add_argument(
+        "input_file",
+        type=str,
+        help="Path to the file containing the mutated input to load",
+    )
+    parser.add_argument(
+        "-t",
+        "--trace",
+        default=False,
+        action="store_true",
+        help="Enables debug tracing",
+    )
     args = parser.parse_args()
 
     # Instantiate a MIPS32 big endian Unicorn Engine instance
@@ -73,13 +106,16 @@ def main():
         uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block)
         uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction)
         uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access)
-        uc.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, unicorn_debug_mem_invalid_access)
+        uc.hook_add(
+            UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID,
+            unicorn_debug_mem_invalid_access,
+        )
 
-    #---------------------------------------------------
+    # ---------------------------------------------------
     # Load the binary to emulate and map it into memory
 
     print("Loading data input from {}".format(args.input_file))
-    binary_file = open(BINARY_FILE, 'rb')
+    binary_file = open(BINARY_FILE, "rb")
     binary_code = binary_file.read()
     binary_file.close()
 
@@ -93,11 +129,11 @@ def main():
     uc.mem_write(CODE_ADDRESS, binary_code)
 
     # Set the program counter to the start of the code
-    start_address = CODE_ADDRESS          # Address of entry point of main()
-    end_address   = CODE_ADDRESS + 0xf4   # Address of last instruction in main()
+    start_address = CODE_ADDRESS  # Address of entry point of main()
+    end_address = CODE_ADDRESS + 0xF4  # Address of last instruction in main()
     uc.reg_write(UC_MIPS_REG_PC, start_address)
 
-    #-----------------
+    # -----------------
     # Setup the stack
 
     uc.mem_map(STACK_ADDRESS, STACK_SIZE)
@@ -106,14 +142,14 @@ def main():
     # reserve some space for data
     uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX)
 
-    #-----------------------------------------------------
+    # -----------------------------------------------------
     # Set up a callback to place input data (do little work here, it's called for every single iteration)
     # We did not pass in any data and don't use persistent mode, so we can ignore these params.
     # Be sure to check out the docstrings for the uc.afl_* functions.
     def place_input_callback(uc, input, persistent_round, data):
         # Apply constraints to the mutated input
         if len(input) > DATA_SIZE_MAX:
-            #print("Test input is too long (> {} bytes)")
+            # print("Test input is too long (> {} bytes)")
             return False
 
         # Write the mutated command into the data buffer
@@ -122,5 +158,6 @@ def main():
     # Start the fuzzer.
     uc.afl_fuzz(args.input_file, place_input_callback, [end_address])
 
+
 if __name__ == "__main__":
     main()
diff --git a/unicorn_mode/samples/simple/simple_test_harness_alt.py b/unicorn_mode/samples/simple/simple_test_harness_alt.py
index 9c3dbc93..3249b13d 100644
--- a/unicorn_mode/samples/simple/simple_test_harness_alt.py
+++ b/unicorn_mode/samples/simple/simple_test_harness_alt.py
@@ -25,50 +25,79 @@ from unicornafl import *
 from unicornafl.mips_const import *
 
 # Path to the file containing the binary to emulate
-BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'simple_target.bin')
+BINARY_FILE = os.path.join(
+    os.path.dirname(os.path.abspath(__file__)), "simple_target.bin"
+)
 
 # Memory map for the code to be tested
-CODE_ADDRESS  = 0x00100000  # Arbitrary address where code to test will be loaded
+CODE_ADDRESS = 0x00100000  # Arbitrary address where code to test will be loaded
 CODE_SIZE_MAX = 0x00010000  # Max size for the code (64kb)
 STACK_ADDRESS = 0x00200000  # Address of the stack (arbitrarily chosen)
-STACK_SIZE	  = 0x00010000  # Size of the stack (arbitrarily chosen)
-DATA_ADDRESS  = 0x00300000  # Address where mutated data will be placed
+STACK_SIZE = 0x00010000  # Size of the stack (arbitrarily chosen)
+DATA_ADDRESS = 0x00300000  # Address where mutated data will be placed
 DATA_SIZE_MAX = 0x00010000  # Maximum allowable size of mutated data
 
 try:
     # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
     from capstone import *
+
     cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN)
+
     def unicorn_debug_instruction(uc, address, size, user_data):
         mem = uc.mem_read(address, size)
-        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
+        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(
+            bytes(mem), size
+        ):
             print("    Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
+
+
 except ImportError:
+
     def unicorn_debug_instruction(uc, address, size, user_data):
-        print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))    
+        print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
+
 
 def unicorn_debug_block(uc, address, size, user_data):
     print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
-    
+
+
 def unicorn_debug_mem_access(uc, access, address, size, value, user_data):
     if access == UC_MEM_WRITE:
-        print("        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
+        print(
+            "        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
     else:
-        print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))    
+        print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))
+
 
 def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data):
     if access == UC_MEM_WRITE_UNMAPPED:
-        print("        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
+        print(
+            "        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
     else:
-        print("        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size))   
+        print(
+            "        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)
+        )
+
 
 def force_crash(uc_error):
     # This function should be called to indicate to AFL that a crash occurred during emulation.
     # Pass in the exception received from Uc.emu_start()
     mem_errors = [
-        UC_ERR_READ_UNMAPPED, UC_ERR_READ_PROT, UC_ERR_READ_UNALIGNED,
-        UC_ERR_WRITE_UNMAPPED, UC_ERR_WRITE_PROT, UC_ERR_WRITE_UNALIGNED,
-        UC_ERR_FETCH_UNMAPPED, UC_ERR_FETCH_PROT, UC_ERR_FETCH_UNALIGNED,
+        UC_ERR_READ_UNMAPPED,
+        UC_ERR_READ_PROT,
+        UC_ERR_READ_UNALIGNED,
+        UC_ERR_WRITE_UNMAPPED,
+        UC_ERR_WRITE_PROT,
+        UC_ERR_WRITE_UNALIGNED,
+        UC_ERR_FETCH_UNMAPPED,
+        UC_ERR_FETCH_PROT,
+        UC_ERR_FETCH_UNALIGNED,
     ]
     if uc_error.errno in mem_errors:
         # Memory error - throw SIGSEGV
@@ -80,11 +109,22 @@ def force_crash(uc_error):
         # Not sure what happened - throw SIGABRT
         os.kill(os.getpid(), signal.SIGABRT)
 
+
 def main():
 
     parser = argparse.ArgumentParser(description="Test harness for simple_target.bin")
-    parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input to load")
-    parser.add_argument('-d', '--debug', default=False, action="store_true", help="Enables debug tracing")
+    parser.add_argument(
+        "input_file",
+        type=str,
+        help="Path to the file containing the mutated input to load",
+    )
+    parser.add_argument(
+        "-d",
+        "--debug",
+        default=False,
+        action="store_true",
+        help="Enables debug tracing",
+    )
     args = parser.parse_args()
 
     # Instantiate a MIPS32 big endian Unicorn Engine instance
@@ -94,13 +134,16 @@ def main():
         uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block)
         uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction)
         uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access)
-        uc.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, unicorn_debug_mem_invalid_access)
+        uc.hook_add(
+            UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID,
+            unicorn_debug_mem_invalid_access,
+        )
 
-    #---------------------------------------------------
+    # ---------------------------------------------------
     # Load the binary to emulate and map it into memory
 
     print("Loading data input from {}".format(args.input_file))
-    binary_file = open(BINARY_FILE, 'rb')
+    binary_file = open(BINARY_FILE, "rb")
     binary_code = binary_file.read()
     binary_file.close()
 
@@ -114,11 +157,11 @@ def main():
     uc.mem_write(CODE_ADDRESS, binary_code)
 
     # Set the program counter to the start of the code
-    start_address = CODE_ADDRESS          # Address of entry point of main()
-    end_address   = CODE_ADDRESS + 0xf4   # Address of last instruction in main()
+    start_address = CODE_ADDRESS  # Address of entry point of main()
+    end_address = CODE_ADDRESS + 0xF4  # Address of last instruction in main()
     uc.reg_write(UC_MIPS_REG_PC, start_address)
 
-    #-----------------
+    # -----------------
     # Setup the stack
 
     uc.mem_map(STACK_ADDRESS, STACK_SIZE)
@@ -127,10 +170,10 @@ def main():
     # reserve some space for data
     uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX)
 
-    #-----------------------------------------------------
+    # -----------------------------------------------------
     #   Kick off AFL's fork server
-    #   THIS MUST BE DONE BEFORE LOADING USER DATA! 
-    #   If this isn't done every single run, the AFL fork server 
+    #   THIS MUST BE DONE BEFORE LOADING USER DATA!
+    #   If this isn't done every single run, the AFL fork server
     #   will not be started appropriately and you'll get erratic results!
 
     print("Starting the AFL forkserver")
@@ -142,12 +185,12 @@ def main():
     else:
         out = lambda x, y: print(x.format(y))
 
-    #-----------------------------------------------
+    # -----------------------------------------------
     # Load the mutated input and map it into memory
 
     # Load the mutated input from disk
     out("Loading data input from {}", args.input_file)
-    input_file = open(args.input_file, 'rb')
+    input_file = open(args.input_file, "rb")
     input = input_file.read()
     input_file.close()
 
@@ -159,7 +202,7 @@ def main():
     # Write the mutated command into the data buffer
     uc.mem_write(DATA_ADDRESS, input)
 
-    #------------------------------------------------------------
+    # ------------------------------------------------------------
     # Emulate the code, allowing it to process the mutated input
 
     out("Executing until a crash or execution reaches 0x{0:016x}", end_address)
@@ -175,5 +218,6 @@ def main():
     # UC_AFL_RET_FINISHED = 3
     out("Done. AFL Mode is {}", afl_mode)
 
+
 if __name__ == "__main__":
     main()
diff --git a/unicorn_mode/samples/speedtest/.gitignore b/unicorn_mode/samples/speedtest/.gitignore
new file mode 100644
index 00000000..78310c60
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/.gitignore
@@ -0,0 +1,6 @@
+output
+harness
+harness-debug
+target
+target.o
+target.offsets.*
diff --git a/unicorn_mode/samples/speedtest/Makefile b/unicorn_mode/samples/speedtest/Makefile
new file mode 100644
index 00000000..23f5cb07
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/Makefile
@@ -0,0 +1,17 @@
+CFLAGS += -Wall -Werror -Wextra -Wpedantic -Og -g -fPIE
+
+.PHONY: all clean
+
+all: target target.offsets.main
+
+clean:
+	rm -rf *.o target target.offsets.*
+
+target.o: target.c
+	${CC} ${CFLAGS} -c target.c -o $@
+
+target: target.o
+	${CC} ${CFLAGS} target.o -o $@
+
+target.offsets.main: target
+	./get_offsets.py
\ No newline at end of file
diff --git a/unicorn_mode/samples/speedtest/README.md b/unicorn_mode/samples/speedtest/README.md
new file mode 100644
index 00000000..3c1184a2
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/README.md
@@ -0,0 +1,65 @@
+# Speedtest
+
+This is a simple sample harness for a non-crashing file,
+to show the raw speed of C, Rust, and Python harnesses.
+
+## Compiling...
+
+Make sure you built unicornafl first (`../../build_unicorn_support.sh`).
+Then, follow these individual steps:
+
+### Rust
+
+```bash
+cd rust
+cargo build --release
+../../../afl-fuzz -i ../sample_inputs -o out -- ./target/release/harness @@
+```
+
+### C
+
+```bash
+cd c
+make
+../../../afl-fuzz -i ../sample_inputs -o out -- ./harness @@
+```
+
+### python
+
+```bash
+cd python
+../../../afl-fuzz -i ../sample_inputs -o out -U -- python3 ./harness.py @@
+```
+
+## Results
+
+TODO: add results here.
+
+
+## Compiling speedtest_target.c
+
+You shouldn't need to compile simple_target.c since a X86_64 binary version is
+pre-built and shipped in this sample folder. This file documents how the binary
+was built in case you want to rebuild it or recompile it for any reason.
+
+The pre-built binary (simple_target_x86_64.bin) was built using -g -O0 in gcc.
+
+We then load the binary and execute the main function directly.
+
+## Addresses for the harness:
+To find the address (in hex) of main, run:
+```bash
+objdump -M intel -D target | grep '<main>:' | cut -d" " -f1
+```
+To find all call sites to magicfn, run:
+```bash
+objdump -M intel -D target | grep '<magicfn>$' | cut -d":" -f1
+```
+For malloc callsites:
+```bash
+objdump -M intel -D target | grep '<malloc@plt>$' | cut -d":" -f1
+```
+And free callsites:
+```bash
+objdump -M intel -D target | grep '<free@plt>$' | cut -d":" -f1
+```
diff --git a/unicorn_mode/samples/speedtest/c/Makefile b/unicorn_mode/samples/speedtest/c/Makefile
new file mode 100644
index 00000000..ce784d4f
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/c/Makefile
@@ -0,0 +1,54 @@
+# UnicornAFL Usage
+# Original Unicorn Example Makefile by Nguyen Anh Quynh <aquynh@gmail.com>, 2015
+# Adapted for AFL++ by domenukk <domenukk@gmail.com>, 2020
+.POSIX:
+UNAME_S =$(shell uname -s)# GNU make
+UNAME_S:sh=uname -s       # BSD make
+_UNIQ=_QINU_
+
+LIBDIR = ../../../unicornafl
+BIN_EXT =
+AR_EXT = a
+
+# Verbose output?
+V ?= 0
+
+CFLAGS += -Wall -Werror -Wextra -Wno-unused-parameter -I../../../unicornafl/include
+
+LDFLAGS += -L$(LIBDIR) -lpthread -lm
+
+_LRT = $(_UNIQ)$(UNAME_S:Linux=)
+__LRT = $(_LRT:$(_UNIQ)=-lrt)
+LRT = $(__LRT:$(_UNIQ)=)
+
+LDFLAGS += $(LRT)
+
+_CC = $(_UNIQ)$(CROSS)
+__CC = $(_CC:$(_UNIQ)=$(CC))
+MYCC = $(__CC:$(_UNIQ)$(CROSS)=$(CROSS)gcc)
+
+.PHONY: all clean
+
+all: fuzz
+
+clean:
+	rm -rf *.o harness harness-debug
+
+harness.o: harness.c ../../../unicornafl/include/unicorn/*.h
+	${MYCC} ${CFLAGS} -O3 -c harness.c -o $@
+
+harness-debug.o: harness.c ../../../unicornafl/include/unicorn/*.h
+	${MYCC} ${CFLAGS} -fsanitize=address -g -Og -c harness.c -o $@
+
+harness: harness.o
+	${MYCC} -L${LIBDIR} harness.o ../../../unicornafl/libunicornafl.a $(LDFLAGS) -o $@
+
+harness-debug: harness-debug.o
+	${MYCC} -fsanitize=address -g -Og -L${LIBDIR} harness-debug.o ../../../unicornafl/libunicornafl.a $(LDFLAGS) -o harness-debug
+
+../target:
+	$(MAKE) -C ..
+
+fuzz: ../target harness
+	rm -rf ./output
+	SKIP_BINCHECK=1 ../../../../afl-fuzz -s 1 -i ../sample_inputs -o ./output -- ./harness @@
diff --git a/unicorn_mode/samples/speedtest/c/harness.c b/unicorn_mode/samples/speedtest/c/harness.c
new file mode 100644
index 00000000..e8de3d80
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/c/harness.c
@@ -0,0 +1,390 @@
+/*
+   Simple test harness for AFL++'s unicornafl c mode.
+
+   This loads the simple_target_x86_64 binary into
+   Unicorn's memory map for emulation, places the specified input into
+   argv[1], sets up argv, and argc and executes 'main()'.
+   If run inside AFL, afl_fuzz automatically does the "right thing"
+
+   Run under AFL as follows:
+
+   $ cd <afl_path>/unicorn_mode/samples/simple/
+   $ make
+   $ ../../../afl-fuzz -m none -i sample_inputs -o out -- ./harness @@
+*/
+
+// This is not your everyday Unicorn.
+#define UNICORN_AFL
+
+#include <string.h>
+#include <inttypes.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <unistd.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/mman.h>
+
+#include <unicorn/unicorn.h>
+
+// Path to the file containing the binary to emulate
+#define BINARY_FILE ("../target")
+
+// Memory map for the code to be tested
+// Arbitrary address where code to test will be loaded
+static const int64_t BASE_ADDRESS = 0x0;
+// Max size for the code (64kb)
+static const int64_t CODE_SIZE_MAX = 0x00010000;
+// Location where the input will be placed (make sure the emulated program knows this somehow, too ;) )
+static const int64_t INPUT_ADDRESS = 0x00100000;
+// Maximum size for our input
+static const int64_t INPUT_MAX = 0x00100000;
+// Where our pseudo-heap is at
+static const int64_t HEAP_ADDRESS = 0x00200000;
+// Maximum allowable size for the heap
+static const int64_t HEAP_SIZE_MAX = 0x000F0000;
+// Address of the stack (Some random address again)
+static const int64_t STACK_ADDRESS = 0x00400000;
+// Size of the stack (arbitrarily chosen, just make it big enough)
+static const int64_t STACK_SIZE = 0x000F0000;
+
+// Alignment for unicorn mappings (seems to be needed)
+static const int64_t ALIGNMENT = 0x1000;
+
+static void hook_block(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) {
+    printf(">>> Tracing basic block at 0x%"PRIx64 ", block size = 0x%x\n", address, size);
+}
+
+static void hook_code(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) {
+    printf(">>> Tracing instruction at 0x%"PRIx64 ", instruction size = 0x%x\n", address, size);
+}
+
+/* Unicorn page needs to be 0x1000 aligned, apparently */
+static uint64_t pad(uint64_t size) {
+    if (size % ALIGNMENT == 0) { return size; }
+    return ((size / ALIGNMENT) + 1) * ALIGNMENT;
+} 
+
+/* returns the filesize in bytes, -1 or error. */
+static off_t afl_mmap_file(char *filename, char **buf_ptr) {
+
+    off_t ret = -1;
+
+    int fd = open(filename, O_RDONLY);
+
+    struct stat st = {0};
+    if (fstat(fd, &st)) goto exit;
+
+    off_t in_len = st.st_size;
+    if (in_len == -1) {
+        /* This can only ever happen on 32 bit if the file is exactly 4gb. */
+        fprintf(stderr, "Filesize of %s too large\n", filename);
+        goto exit;
+    }
+
+    *buf_ptr = mmap(0, in_len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
+
+    if (*buf_ptr != MAP_FAILED) ret = in_len;
+
+exit:
+    close(fd);
+    return ret;
+
+}
+
+/* Place the input at the right spot inside unicorn.
+   This code path is *HOT*, do as little work as possible! */
+static bool place_input_callback(
+    uc_engine *uc, 
+    char *input, 
+    size_t input_len, 
+    uint32_t persistent_round, 
+    void *data
+){
+    // printf("Placing input with len %ld to %x\n", input_len, DATA_ADDRESS);
+    if (input_len >= INPUT_MAX) {
+        // Test input too short or too long, ignore this testcase
+        return false;
+    }
+
+    // We need a valid c string, make sure it never goes out of bounds.
+    input[input_len-1] = '\0';
+
+    // Write the testcase to unicorn.
+    uc_mem_write(uc, INPUT_ADDRESS, input, input_len);
+
+    return true;
+}
+
+// exit in case the unicorn-internal mmap fails.
+static void mem_map_checked(uc_engine *uc, uint64_t addr, size_t size, uint32_t mode) {
+    size = pad(size);
+    //printf("SIZE %llx, align: %llx\n", size, ALIGNMENT);
+    uc_err err = uc_mem_map(uc, addr, size, mode);
+    if (err != UC_ERR_OK) {
+        printf("Error mapping %ld bytes at 0x%lx: %s (mode: %d)\n", size, addr, uc_strerror(err), mode);
+        exit(1);
+    }
+}
+
+// allocates an array, reads all addrs to the given array ptr, returns a size
+ssize_t read_all_addrs(char *path, uint64_t *addrs, size_t max_count) {
+
+    FILE *f = fopen(path, "r"); 
+    if (!f) {
+        perror("fopen");
+        fprintf(stderr, "Could not read %s, make sure you ran ./get_offsets.py\n", path);
+        exit(-1);
+    }
+    for (size_t i = 0; i < max_count; i++) {
+        bool end = false;
+        if(fscanf(f, "%lx", &addrs[i]) == EOF) {
+            end = true;
+            i--;
+        } else if (fgetc(f) == EOF) {
+            end = true;
+        }
+        if (end) {
+            printf("Set %ld addrs for %s\n", i + 1, path);
+            fclose(f);
+            return i + 1;
+        }
+    }
+    return max_count;
+}
+
+// Read all addresses from the given file, and set a hook for them.
+void set_all_hooks(uc_engine *uc, char *hook_file, void *hook_fn) {
+
+    FILE *f = fopen(hook_file, "r");
+    if (!f) {
+        fprintf(stderr, "Could not read %s, make sure you ran ./get_offsets.py\n", hook_file);
+        exit(-1);
+    }
+    uint64_t hook_addr;
+    for (int hook_count = 0; 1; hook_count++) {
+        if(fscanf(f, "%lx", &hook_addr) == EOF) {
+            printf("Set %d hooks for %s\n", hook_count, hook_file);
+            fclose(f);
+            return;
+        }
+        printf("got new hook addr %lx (count: %d) ohbytw: sizeof %lx\n", hook_addr, hook_count, sizeof(uc_hook));
+        hook_addr += BASE_ADDRESS;
+        // We'll leak these hooks like a good citizen.
+        uc_hook *hook = calloc(1, sizeof(uc_hook));
+        if (!hook) {
+            perror("calloc");
+            exit(-1);
+        }
+        uc_hook_add(uc, hook, UC_HOOK_CODE, hook_fn, NULL, hook_addr, hook_addr);
+        // guzzle up newline
+        if (fgetc(f) == EOF) {
+            printf("Set %d hooks for %s\n", hook_count, hook_file);
+            fclose(f);
+            return;
+        }
+    }
+
+}
+
+// This is a fancy print function that we're just going to skip for fuzzing.
+static void hook_magicfn(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) {
+    address += size;
+    uc_reg_write(uc, UC_X86_REG_RIP, &address);
+} 
+
+static bool already_allocated = false;
+
+// We use a very simple malloc/free stub here, that only works for exactly one allocation at a time.
+static void hook_malloc(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) {
+    if (already_allocated) {
+        printf("Double malloc, not supported right now!\n");
+        abort();
+    }
+    // read the first param.
+    uint64_t malloc_size;
+    uc_reg_read(uc, UC_X86_REG_RDI, &malloc_size);
+    if (malloc_size > HEAP_SIZE_MAX) {
+        printf("Tried to allocate %ld bytes, but we only support up to %ld\n", malloc_size, HEAP_SIZE_MAX);
+        abort();
+    }
+    uc_reg_write(uc, UC_X86_REG_RAX, &HEAP_ADDRESS);
+    address += size;
+    uc_reg_write(uc, UC_X86_REG_RIP, &address);
+    already_allocated = true;
+}
+
+// No real free, just set the "used"-flag to false.
+static void hook_free(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) {
+    if (!already_allocated) {
+        printf("Double free detected. Real bug?\n");
+        abort();
+    }
+    // read the first param.
+    uint64_t free_ptr;
+    uc_reg_read(uc, UC_X86_REG_RDI, &free_ptr);
+    if (free_ptr != HEAP_ADDRESS) {
+        printf("Tried to free wrong mem region: 0x%lx at code loc 0x%lx\n", free_ptr, address);
+        abort();
+    }
+    address +=  size;
+    uc_reg_write(uc, UC_X86_REG_RIP, &address);
+    already_allocated = false;
+}
+
+int main(int argc, char **argv, char **envp) {
+    if (argc == 1) {
+        printf("Test harness to measure speed against Rust and python. Usage: harness [-t] <inputfile>\n");
+        exit(1);
+    }
+    bool tracing = false;
+    char *filename = argv[1];
+    if (argc > 2 && !strcmp(argv[1], "-t")) {
+        tracing = true;
+        filename = argv[2];
+    }
+
+    uc_engine *uc;
+    uc_err err;
+    uc_hook hooks[2];
+    char *file_contents;
+
+    // Initialize emulator in X86_64 mode
+    err = uc_open(UC_ARCH_X86, UC_MODE_64, &uc);
+    if (err) {
+        printf("Failed on uc_open() with error returned: %u (%s)\n",
+                err, uc_strerror(err));
+        return -1;
+    }
+
+    // If we want tracing output, set the callbacks here
+    if (tracing) {
+        // tracing all basic blocks with customized callback
+        uc_hook_add(uc, &hooks[0], UC_HOOK_BLOCK, hook_block, NULL, 1, 0);
+        uc_hook_add(uc, &hooks[1], UC_HOOK_CODE, hook_code, NULL, 1, 0);
+    }
+
+    printf("The input testcase is set to %s\n", filename);
+
+
+    printf("Loading target from %s\n", BINARY_FILE);
+    off_t len = afl_mmap_file(BINARY_FILE, &file_contents);
+    printf("Binary file size: %lx\n", len);
+    if (len < 0) {
+        perror("Could not read binary to emulate");
+        return -2;
+    }
+    if (len == 0) {
+        fprintf(stderr, "File at '%s' is empty\n", BINARY_FILE);
+        return -3;
+    }
+    if (len > CODE_SIZE_MAX) {
+        fprintf(stderr, "Binary too large, increase CODE_SIZE_MAX\n");
+        return -4;
+    }
+
+    // Map memory.
+    mem_map_checked(uc, BASE_ADDRESS, len, UC_PROT_ALL);
+    fflush(stdout);
+
+    // write machine code to be emulated to memory
+    if (uc_mem_write(uc, BASE_ADDRESS, file_contents, len) != UC_ERR_OK) {
+        puts("Error writing to CODE");
+        exit(-1);
+    }
+
+    // Release copied contents
+    munmap(file_contents, len);
+
+    // Set the program counter to the start of the code
+    FILE *f = fopen("../target.offsets.main", "r");
+    if (!f) {
+        perror("fopen");
+        puts("Could not read offset to main function, make sure you ran ./get_offsets.py");
+        exit(-1);
+    }
+    uint64_t start_address;
+    if(fscanf(f, "%lx", &start_address) == EOF) {
+        puts("Start address not found in target.offsets.main");
+        exit(-1);
+    }
+    fclose(f);
+    start_address += BASE_ADDRESS;
+    printf("Execution will start at 0x%lx", start_address);
+    // Set the program counter to the start of the code
+    uc_reg_write(uc, UC_X86_REG_RIP, &start_address); // address of entry point of main()
+
+    // Setup the Stack
+    mem_map_checked(uc, STACK_ADDRESS, STACK_SIZE, UC_PROT_READ | UC_PROT_WRITE);
+    // Setup the stack pointer, but allocate two pointers for the pointers to input
+    uint64_t val = STACK_ADDRESS + STACK_SIZE - 16;
+    //printf("Stack at %lu\n", stack_val);
+    uc_reg_write(uc, UC_X86_REG_RSP, &val);
+
+    // reserve some space for our input data
+    mem_map_checked(uc, INPUT_ADDRESS, INPUT_MAX, UC_PROT_READ);
+
+    // argc = 2
+    val = 2;
+    uc_reg_write(uc, UC_X86_REG_RDI, &val);
+    //RSI points to our little 2 QWORD space at the beginning of the stack...
+    val = STACK_ADDRESS + STACK_SIZE - 16;
+    uc_reg_write(uc, UC_X86_REG_RSI, &val);
+
+    //... which points to the Input. Write the ptr to mem in little endian.
+    uint32_t addr_little = STACK_ADDRESS;
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+    // The chances you are on a big_endian system aren't too high, but still...
+    __builtin_bswap32(addr_little);
+#endif
+
+    uc_mem_write(uc, STACK_ADDRESS + STACK_SIZE - 16, (char *)&addr_little, 4);
+
+    set_all_hooks(uc, "../target.offsets.malloc", hook_malloc);
+    set_all_hooks(uc, "../target.offsets.magicfn", hook_magicfn);
+    set_all_hooks(uc, "../target.offsets.free", hook_free);
+
+    int exit_count_max = 100;
+    // we don't need more exits for now.
+    uint64_t exits[exit_count_max];
+
+    ssize_t exit_count = read_all_addrs("../target.offsets.main_ends", exits, exit_count_max);
+    if (exit_count < 1) {
+        printf("Could not find exits! aborting.\n");
+        abort();
+    }
+
+    printf("Starting to fuzz. Running from addr %ld to one of these %ld exits:\n", start_address, exit_count);
+    for (ssize_t i = 0; i < exit_count; i++) {
+        printf("    exit %ld: %ld\n", i, exits[i]);
+    }
+
+    fflush(stdout);
+
+    // let's gooo
+    uc_afl_ret afl_ret = uc_afl_fuzz(
+        uc, // The unicorn instance we prepared
+        filename, // Filename of the input to process. In AFL this is usually the '@@' placeholder, outside it's any input file.
+        place_input_callback, // Callback that places the input (automatically loaded from the file at filename) in the unicorninstance
+        exits, // Where to exit (this is an array)
+        exit_count,  // Count of end addresses
+        NULL, // Optional calback to run after each exec
+        false, // true, if the optional callback should be run also for non-crashes
+        1000, // For persistent mode: How many rounds to run
+        NULL // additional data pointer
+    );
+    switch(afl_ret) {
+        case UC_AFL_RET_ERROR:
+            printf("Error starting to fuzz");
+            return -3;
+            break;
+        case UC_AFL_RET_NO_AFL:
+            printf("No AFL attached - We are done with a single run.");
+            break;
+        default:
+            break;
+    } 
+    return 0;
+}
diff --git a/unicorn_mode/samples/speedtest/get_offsets.py b/unicorn_mode/samples/speedtest/get_offsets.py
new file mode 100755
index 00000000..c9dc76df
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/get_offsets.py
@@ -0,0 +1,77 @@
+#!/usr/bin/env python3
+
+"""This simple script uses objdump to parse important addresses from the target"""
+import shlex
+import subprocess
+
+objdump_output = subprocess.check_output(
+    shlex.split("objdump -M intel -D target")
+).decode()
+main_loc = None
+main_ends = []
+main_ended = False
+magicfn_calls = []
+malloc_calls = []
+free_calls = []
+strlen_calls = []
+
+
+def line2addr(line):
+    return "0x" + line.split(":", 1)[0].strip()
+
+
+last_line = None
+for line in objdump_output.split("\n"):
+    line = line.strip()
+
+    def read_addr_if_endswith(findme, list_to):
+        """
+        Look, for example, for the addr like:
+        12a9:       e8 f2 fd ff ff          call   10a0 <free@plt>
+        """
+        if line.endswith(findme):
+            list_to.append(line2addr(line))
+
+    if main_loc is not None and main_ended is False:
+        # We want to know where main ends. An empty line in objdump.
+        if len(line) == 0:
+            main_ends.append(line2addr(last_line))
+            main_ended = True
+        elif "ret" in line:
+            main_ends.append(line2addr(line))
+
+    if "<main>:" in line:
+        if main_loc is not None:
+            raise Exception("Found multiple main functions, odd target!")
+        # main_loc is the label, so it's parsed differently (i.e. `0000000000001220 <main>:`)
+        main_loc = "0x" + line.strip().split(" ", 1)[0].strip()
+    else:
+        [
+            read_addr_if_endswith(*x)
+            for x in [
+                ("<free@plt>", free_calls),
+                ("<malloc@plt>", malloc_calls),
+                ("<strlen@plt>", strlen_calls),
+                ("<magicfn>", magicfn_calls),
+            ]
+        ]
+
+    last_line = line
+
+if main_loc is None:
+    raise (
+        "Could not find main in ./target! Make sure objdump is installed and the target is compiled."
+    )
+
+with open("target.offsets.main", "w") as f:
+    f.write(main_loc)
+with open("target.offsets.main_ends", "w") as f:
+    f.write("\n".join(main_ends))
+with open("target.offsets.magicfn", "w") as f:
+    f.write("\n".join(magicfn_calls))
+with open("target.offsets.malloc", "w") as f:
+    f.write("\n".join(malloc_calls))
+with open("target.offsets.free", "w") as f:
+    f.write("\n".join(free_calls))
+with open("target.offsets.strlen", "w") as f:
+    f.write("\n".join(strlen_calls))
diff --git a/unicorn_mode/samples/speedtest/python/Makefile b/unicorn_mode/samples/speedtest/python/Makefile
new file mode 100644
index 00000000..4282c6cb
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/python/Makefile
@@ -0,0 +1,8 @@
+all: fuzz
+
+../target:
+	$(MAKE) -C ..
+
+fuzz: ../target
+	rm -rf ./output
+	../../../../afl-fuzz -s 1 -U -i ../sample_inputs -o ./output -- python3 harness.py @@
diff --git a/unicorn_mode/samples/speedtest/python/harness.py b/unicorn_mode/samples/speedtest/python/harness.py
new file mode 100644
index 00000000..801ef4d1
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/python/harness.py
@@ -0,0 +1,277 @@
+#!/usr/bin/env python3
+""" 
+    Simple test harness for AFL's Unicorn Mode.
+
+    This loads the speedtest target binary (precompiled X64 code) into
+    Unicorn's memory map for emulation, places the specified input into
+    Argv, and executes main.
+    There should not be any crashes - it's a speedtest against Rust and c.
+
+    Before running this harness, call make in the parent folder.
+
+    Run under AFL as follows:
+
+    $ cd <afl_path>/unicorn_mode/samples/speedtest/python
+    $ ../../../../afl-fuzz -U -i ../sample_inputs -o ./output -- python3 harness.py @@
+"""
+
+import argparse
+import os
+import struct
+
+from unicornafl import *
+from unicornafl.unicorn_const import UC_ARCH_X86, UC_HOOK_CODE, UC_MODE_64
+from unicornafl.x86_const import (
+    UC_X86_REG_RAX,
+    UC_X86_REG_RDI,
+    UC_X86_REG_RIP,
+    UC_X86_REG_RSI,
+    UC_X86_REG_RSP,
+)
+
+# Memory map for the code to be tested
+BASE_ADDRESS = 0x0  # Arbitrary address where the (PIE) target binary will be loaded to
+CODE_SIZE_MAX = 0x00010000  # Max size for the code (64kb)
+INPUT_ADDRESS = 0x00100000  # where we put our stuff
+INPUT_MAX = 0x00100000  # max size for our input
+HEAP_ADDRESS = 0x00200000  # Heap addr
+HEAP_SIZE_MAX = 0x000F0000  # Maximum allowable size for the heap
+STACK_ADDRESS = 0x00400000  # Address of the stack (arbitrarily chosen)
+STACK_SIZE = 0x000F0000  # Size of the stack (arbitrarily chosen)
+
+target_path = os.path.abspath(
+    os.path.join(os.path.dirname(os.path.abspath(__file__)), "..")
+)
+target_bin = os.path.join(target_path, "target")
+
+
+def get_offsets_for(name):
+    full_path = os.path.join(target_path, f"target.offsets.{name}")
+    with open(full_path) as f:
+        return [int(x, 16) + BASE_ADDRESS for x in f.readlines()]
+
+
+# Read all offsets from our objdump file
+main_offset = get_offsets_for("main")[0]
+main_ends = get_offsets_for("main_ends")
+malloc_callsites = get_offsets_for("malloc")
+free_callsites = get_offsets_for("free")
+magicfn_callsites = get_offsets_for("magicfn")
+# Joke's on me: strlen got inlined by my compiler
+strlen_callsites = get_offsets_for("strlen")
+
+try:
+    # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
+    from capstone import *
+
+    cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN)
+
+    def unicorn_debug_instruction(uc, address, size, user_data):
+        mem = uc.mem_read(address, size)
+        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(
+            bytes(mem), size
+        ):
+            print("    Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
+
+
+except ImportError:
+
+    def unicorn_debug_instruction(uc, address, size, user_data):
+        print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
+
+
+def unicorn_debug_block(uc, address, size, user_data):
+    print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
+
+
+def unicorn_debug_mem_access(uc, access, address, size, value, user_data):
+    if access == UC_MEM_WRITE:
+        print(
+            "        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
+    else:
+        print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))
+
+
+def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data):
+    if access == UC_MEM_WRITE_UNMAPPED:
+        print(
+            "        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
+    else:
+        print(
+            "        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)
+        )
+
+
+already_allocated = False
+
+
+def malloc_hook(uc, address, size, user_data):
+    """
+    We use a very simple malloc/free stub here, that only works for exactly one allocation at a time.
+    """
+    global already_allocated
+    if already_allocated:
+        print("Double malloc, not supported right now!")
+        os.abort()
+    # read the first param
+    malloc_size = uc.reg_read(UC_X86_REG_RDI)
+    if malloc_size > HEAP_SIZE_MAX:
+        print(
+            f"Tried to allocate {malloc_size} bytes, ain't nobody got space for that! (We may only allocate up to {HEAP_SIZE_MAX})"
+        )
+        os.abort()
+    uc.reg_write(UC_X86_REG_RAX, HEAP_ADDRESS)
+    uc.reg_write(UC_X86_REG_RIP, address + size)
+    already_allocated = True
+
+
+def free_hook(uc, address, size, user_data):
+    """
+    No real free, just set the "used"-flag to false.
+    """
+    global already_allocated
+    if not already_allocated:
+        print("Double free detected. Real bug?")
+        os.abort()
+    # read the first param
+    free_ptr = uc.reg_read(UC_X86_REG_RDI)
+    if free_ptr != HEAP_ADDRESS:
+        print(
+            f"Tried to free wrong mem region: {hex(free_ptr)} at code loc {hex(address)}"
+        )
+        os.abort()
+    uc.reg_write(UC_X86_REG_RIP, address + size)
+    already_allocated = False
+
+
+# def strlen_hook(uc, address, size, user_data):
+#     """
+#     No real strlen, we know the len is == our input.
+#     This completely ignores '\0', but for this target, do we really care?
+#     """
+#     global input_len
+#     print(f"Returning len {input_len}")
+#     uc.reg_write(UC_X86_REG_RAX, input_len)
+#     uc.reg_write(UC_X86_REG_RIP, address + size)
+
+
+def magicfn_hook(uc, address, size, user_data):
+    """
+    This is a fancy print function that we're just going to skip for fuzzing.
+    """
+    uc.reg_write(UC_X86_REG_RIP, address + size)
+
+
+def main():
+
+    parser = argparse.ArgumentParser(description="Test harness for simple_target.bin")
+    parser.add_argument(
+        "input_file",
+        type=str,
+        help="Path to the file containing the mutated input to load",
+    )
+    parser.add_argument(
+        "-t",
+        "--trace",
+        default=False,
+        action="store_true",
+        help="Enables debug tracing",
+    )
+    args = parser.parse_args()
+
+    # Instantiate a MIPS32 big endian Unicorn Engine instance
+    uc = Uc(UC_ARCH_X86, UC_MODE_64)
+
+    if args.trace:
+        uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block)
+        uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction)
+        uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access)
+        uc.hook_add(
+            UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID,
+            unicorn_debug_mem_invalid_access,
+        )
+
+    print("The input testcase is set to {}".format(args.input_file))
+
+    # ---------------------------------------------------
+    # Load the binary to emulate and map it into memory
+    with open(target_bin, "rb") as f:
+        binary_code = f.read()
+
+    # Apply constraints to the mutated input
+    if len(binary_code) > CODE_SIZE_MAX:
+        print("Binary code is too large (> {} bytes)".format(CODE_SIZE_MAX))
+        return
+
+    # Write the binary to its place in mem
+    uc.mem_map(BASE_ADDRESS, CODE_SIZE_MAX)
+    uc.mem_write(BASE_ADDRESS, binary_code)
+
+    # Set the program counter to the start of the code
+    uc.reg_write(UC_X86_REG_RIP, main_offset)
+
+    # Setup the stack.
+    uc.mem_map(STACK_ADDRESS, STACK_SIZE)
+    # Setup the stack pointer, but allocate two pointers for the pointers to input.
+    uc.reg_write(UC_X86_REG_RSP, STACK_ADDRESS + STACK_SIZE - 16)
+
+    # Setup our input space, and push the pointer to it in the function params
+    uc.mem_map(INPUT_ADDRESS, INPUT_MAX)
+    # We have argc = 2
+    uc.reg_write(UC_X86_REG_RDI, 2)
+    # RSI points to our little 2 QWORD space at the beginning of the stack...
+    uc.reg_write(UC_X86_REG_RSI, STACK_ADDRESS + STACK_SIZE - 16)
+    # ... which points to the Input. Write the ptr to mem in little endian.
+    uc.mem_write(STACK_ADDRESS + STACK_SIZE - 16, struct.pack("<Q", INPUT_ADDRESS))
+
+    for addr in malloc_callsites:
+        uc.hook_add(UC_HOOK_CODE, malloc_hook, begin=addr, end=addr)
+
+    for addr in free_callsites:
+        uc.hook_add(UC_HOOK_CODE, free_hook, begin=addr, end=addr)
+
+    if len(strlen_callsites):
+        # strlen got inlined for my compiler.
+        print(
+            "Oops, your compiler emitted strlen as function. You may have to change the harness."
+        )
+    # for addr in strlen_callsites:
+    #     uc.hook_add(UC_HOOK_CODE, strlen_hook, begin=addr, end=addr)
+
+    for addr in magicfn_callsites:
+        uc.hook_add(UC_HOOK_CODE, magicfn_hook, begin=addr, end=addr + 1)
+
+    # -----------------------------------------------------
+    # Set up a callback to place input data (do little work here, it's called for every single iteration! This code is *HOT*)
+    # We did not pass in any data and don't use persistent mode, so we can ignore these params.
+    # Be sure to check out the docstrings for the uc.afl_* functions.
+    def place_input_callback(uc, input, persistent_round, data):
+        # Apply constraints to the mutated input
+        input_len = len(input)
+        # global input_len
+        if input_len > INPUT_MAX:
+            # print("Test input is too long (> {} bytes)")
+            return False
+
+        # print(f"Placing input: {input} in round {persistent_round}")
+
+        # Make sure the string is always 0-terminated (as it would be "in the wild")
+        input[-1] = b"\0"
+
+        # Write the mutated command into the data buffer
+        uc.mem_write(INPUT_ADDRESS, input)
+        # uc.reg_write(UC_X86_REG_RIP, main_offset)
+
+    print(f"Starting to fuzz. Running from addr {main_offset} to one of {main_ends}")
+    # Start the fuzzer.
+    uc.afl_fuzz(args.input_file, place_input_callback, main_ends, persistent_iters=1000)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/unicorn_mode/samples/speedtest/rust/.gitignore b/unicorn_mode/samples/speedtest/rust/.gitignore
new file mode 100644
index 00000000..a9d37c56
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/rust/.gitignore
@@ -0,0 +1,2 @@
+target
+Cargo.lock
diff --git a/unicorn_mode/samples/speedtest/rust/Cargo.toml b/unicorn_mode/samples/speedtest/rust/Cargo.toml
new file mode 100644
index 00000000..c19ee0a1
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/rust/Cargo.toml
@@ -0,0 +1,15 @@
+[package]
+name = "unicornafl_harness"
+version = "0.1.0"
+authors = ["Dominik Maier <domenukk@gmail.com>"]
+edition = "2018"
+
+[profile.release]
+lto = true
+opt-level = 3
+panic = "abort"
+
+[dependencies]
+unicornafl = { path = "../../../unicornafl/bindings/rust/", version="1.0.0" }
+capstone="0.6.0"
+libc="0.2.66"
\ No newline at end of file
diff --git a/unicorn_mode/samples/speedtest/rust/Makefile b/unicorn_mode/samples/speedtest/rust/Makefile
new file mode 100644
index 00000000..fe18d6ee
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/rust/Makefile
@@ -0,0 +1,17 @@
all: fuzz

clean:
	cargo clean

./target/release/unicornafl_harness: ./src/main.rs
	cargo build --release

./target/debug/unicornafl_harness: ./src/main.rs
	cargo build

# Build the shared speedtest target one directory up.
# Note: make's change-directory flag is a capital -C; lowercase -c is not a
# valid option and made this rule fail.
../target:
	$(MAKE) -C ..

fuzz: ../target ./target/release/unicornafl_harness
	rm -rf ./output
	SKIP_BINCHECK=1 ../../../../afl-fuzz -s 1 -i ../sample_inputs -o ./output -- ./target/release/unicornafl_harness @@
diff --git a/unicorn_mode/samples/speedtest/rust/src/main.rs b/unicorn_mode/samples/speedtest/rust/src/main.rs
new file mode 100644
index 00000000..1e35ff0b
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/rust/src/main.rs
@@ -0,0 +1,232 @@
+extern crate capstone;
+extern crate libc;
+
+use core::cell::Cell;
+use std::{
+    env,
+    fs::File,
+    io::{self, Read},
+    process::abort,
+    str,
+};
+
+use unicornafl::{
+    unicorn_const::{uc_error, Arch, Mode, Permission},
+    RegisterX86::{self, *},
+    Unicorn, UnicornHandle,
+};
+
+const BINARY: &str = &"../target";
+
+// Memory map for the code to be tested
+// Arbitrary address where code to test will be loaded
+const BASE_ADDRESS: u64 = 0x0;
+// Max size for the code (64kb)
+const CODE_SIZE_MAX: u64 = 0x00010000;
+// Location where the input will be placed (make sure the emulated program knows this somehow, too ;) )
+const INPUT_ADDRESS: u64 = 0x00100000;
+// Maximum size for our input
+const INPUT_MAX: u64 = 0x00100000;
+// Where our pseudo-heap is at
+const HEAP_ADDRESS: u64 = 0x00200000;
+// Maximum allowable size for the heap
+const HEAP_SIZE_MAX: u64 = 0x000F0000;
+// Address of the stack (Some random address again)
+const STACK_ADDRESS: u64 = 0x00400000;
+// Size of the stack (arbitrarily chosen, just make it big enough)
+const STACK_SIZE: u64 = 0x000F0000;
+
/// Slurp the entire file at `filename` into a byte vector.
fn read_file(filename: &str) -> Result<Vec<u8>, io::Error> {
    let mut contents = Vec::new();
    File::open(filename)?.read_to_end(&mut contents)?;
    Ok(contents)
}
+
+/// Our location parser
+fn parse_locs(loc_name: &str) -> Result<Vec<u64>, io::Error> {
+    let contents = &read_file(&format!("../target.offsets.{}", loc_name))?;
+    //println!("Read: {:?}", contents);
+    Ok(str_from_u8_unchecked(&contents)
+        .split("\n")
+        .map(|x| {
+            //println!("Trying to convert {}", &x[2..]);
+            let result = u64::from_str_radix(&x[2..], 16);
+            result.unwrap()
+        })
+        .collect())
+}
+
+// find null terminated string in vec
/// Interpret `utf8_src` as a NUL-terminated string: return the slice up to
/// (excluding) the first `\0` byte, or the whole input when no NUL exists.
/// No UTF-8 validation is performed (hence "unchecked").
pub fn str_from_u8_unchecked(utf8_src: &[u8]) -> &str {
    let end = match utf8_src.iter().position(|&b| b == b'\0') {
        Some(pos) => pos,
        None => utf8_src.len(),
    };
    unsafe { str::from_utf8_unchecked(&utf8_src[..end]) }
}
+
/// Round `size` up to the next multiple of the 0x1000 page size.
/// Already-aligned values (including 0) are returned unchanged.
fn align(size: u64) -> u64 {
    const ALIGNMENT: u64 = 0x1000;
    let remainder = size % ALIGNMENT;
    if remainder == 0 {
        return size;
    }
    size - remainder + ALIGNMENT
}
+
+fn main() {
+    let args: Vec<String> = env::args().collect();
+    if args.len() == 1 {
+        println!("Missing parameter <uclation_input> (@@ for AFL)");
+        return;
+    }
+    let input_file = &args[1];
+    println!("The input testcase is set to {}", input_file);
+    fuzz(input_file).unwrap();
+}
+
/// Set up the emulator for the speedtest target, hook malloc/free/magicfn,
/// and enter unicornafl's fuzz loop.
///
/// * `input_file` - path to the current testcase (afl-fuzz's `@@`).
///
/// Returns `uc_error` if any unicorn setup step fails; a non-OK result from
/// the fuzz loop itself panics instead.
fn fuzz(input_file: &str) -> Result<(), uc_error> {
    let mut unicorn = Unicorn::new(Arch::X86, Mode::MODE_64, 0)?;
    let mut uc: UnicornHandle<'_, _> = unicorn.borrow();

    let binary = read_file(BINARY).expect(&format!("Could not read modem image: {}", BINARY));
    let _aligned_binary_size = align(binary.len() as u64);
    // Apply constraints to the mutated input
    if binary.len() as u64 > CODE_SIZE_MAX {
        println!("Binary code is too large (> {} bytes)", CODE_SIZE_MAX);
    }

    // Write the binary to its place in mem
    uc.mem_map(BASE_ADDRESS, CODE_SIZE_MAX as usize, Permission::ALL)?;
    uc.mem_write(BASE_ADDRESS, &binary)?;

    // Set the program counter to the start of the code
    // (offsets come from ../target.offsets.* files written by get_offsets.py)
    let main_locs = parse_locs("main").unwrap();
    //println!("Entry Point: {:x}", main_locs[0]);
    uc.reg_write(RegisterX86::RIP as i32, main_locs[0])?;

    // Setup the stack.
    uc.mem_map(
        STACK_ADDRESS,
        STACK_SIZE as usize,
        Permission::READ | Permission::WRITE,
    )?;
    // Setup the stack pointer, but allocate two pointers for the pointers to input.
    uc.reg_write(RSP as i32, STACK_ADDRESS + STACK_SIZE - 16)?;

    // Setup our input space, and push the pointer to it in the function params
    uc.mem_map(INPUT_ADDRESS, INPUT_MAX as usize, Permission::READ)?;
    // We have argc = 2
    uc.reg_write(RDI as i32, 2)?;
    // RSI points to our little 2 QWORD space at the beginning of the stack...
    uc.reg_write(RSI as i32, STACK_ADDRESS + STACK_SIZE - 16)?;
    // ... which points to the Input. Write the ptr to mem in little endian.
    // NOTE(review): only the low 4 bytes of the 8-byte argv[1] slot are
    // written; the freshly mapped stack reads as zero, so the full QWORD
    // equals INPUT_ADDRESS while it fits in 32 bits - confirm before ever
    // moving INPUT_ADDRESS above 4 GiB.
    uc.mem_write(
        STACK_ADDRESS + STACK_SIZE - 16,
        &(INPUT_ADDRESS as u32).to_le_bytes(),
    )?;

    // Shared "heap in use" flag; each hook closure gets its own clone of the
    // Cell (the clones are independent copies, not shared state - the
    // malloc/free pairing still works because unicorn runs single-threaded
    // and each closure only toggles its own copy consistently per run).
    let already_allocated = Cell::new(false);

    let already_allocated_malloc = already_allocated.clone();
    // We use a very simple malloc/free stub here,
    // that only works for exactly one allocation at a time.
    let hook_malloc = move |mut uc: UnicornHandle<'_, _>, addr: u64, size: u32| {
        if already_allocated_malloc.get() {
            println!("Double malloc, not supported right now!");
            abort();
        }
        // read the first param
        let malloc_size = uc.reg_read(RDI as i32).unwrap();
        if malloc_size > HEAP_SIZE_MAX {
            println!(
                "Tried to allocate {} bytes, but we may only allocate up to {}",
                malloc_size, HEAP_SIZE_MAX
            );
            abort();
        }
        // "Allocate" by returning the fixed heap address and skipping the
        // real call: RIP jumps past the hooked call site.
        uc.reg_write(RAX as i32, HEAP_ADDRESS).unwrap();
        uc.reg_write(RIP as i32, addr + size as u64).unwrap();
        already_allocated_malloc.set(true);
    };

    let already_allocated_free = already_allocated.clone();
    // No real free, just set the "used"-flag to false.
    let hook_free = move |mut uc: UnicornHandle<'_, _>, addr, size| {
        if already_allocated_free.get() {
            println!("Double free detected. Real bug?");
            abort();
        }
        // read the first param
        let free_ptr = uc.reg_read(RDI as i32).unwrap();
        if free_ptr != HEAP_ADDRESS {
            println!(
                "Tried to free wrong mem region {:x} at code loc {:x}",
                free_ptr, addr
            );
            abort();
        }
        // Skip the real free call, then mark the heap slot available again.
        uc.reg_write(RIP as i32, addr + size as u64).unwrap();
        already_allocated_free.set(false);
    };

    /*
        BEGIN FUNCTION HOOKS
    */

    // This is a fancy print function that we're just going to skip for fuzzing.
    let hook_magicfn = move |mut uc: UnicornHandle<'_, _>, addr, size| {
        uc.reg_write(RIP as i32, addr + size as u64).unwrap();
    };

    // Install each hook on every call site recorded in the offsets files.
    for addr in parse_locs("malloc").unwrap() {
        //hook!(addr, hook_malloc, "malloc");
        uc.add_code_hook(addr, addr, Box::new(hook_malloc.clone()))?;
    }

    for addr in parse_locs("free").unwrap() {
        uc.add_code_hook(addr, addr, Box::new(hook_free.clone()))?;
    }

    for addr in parse_locs("magicfn").unwrap() {
        uc.add_code_hook(addr, addr, Box::new(hook_magicfn.clone()))?;
    }

    // Copies the mutated input into emulator memory before each run.
    // Hot path: runs once per fuzz iteration, so keep the work minimal.
    let place_input_callback =
        |mut uc: UnicornHandle<'_, _>, afl_input: &mut [u8], _persistent_round| {
            // apply constraints to the mutated input
            if afl_input.len() > INPUT_MAX as usize {
                //println!("Skipping testcase with leng {}", afl_input.len());
                return false;
            }

            // Force NUL termination, as the target treats input as a C string.
            afl_input[afl_input.len() - 1] = b'\0';
            uc.mem_write(INPUT_ADDRESS, afl_input).unwrap();
            true
        };

    // return true if the last run should be counted as crash
    let crash_validation_callback =
        |_uc: UnicornHandle<'_, _>, result, _input: &[u8], _persistent_round| {
            result != uc_error::OK
        };

    let end_addrs = parse_locs("main_ends").unwrap();

    // Hand control to AFL: persistent mode with 1000 iterations per child,
    // stopping each run at any of main's return sites.
    let ret = uc.afl_fuzz(
        input_file,
        Box::new(place_input_callback),
        &end_addrs,
        Box::new(crash_validation_callback),
        false,
        1000,
    );

    match ret {
        Ok(_) => {}
        Err(e) => panic!(format!("found non-ok unicorn exit: {:?}", e)),
    }

    Ok(())
}
diff --git a/unicorn_mode/samples/speedtest/sample_inputs/a b/unicorn_mode/samples/speedtest/sample_inputs/a
new file mode 100644
index 00000000..78981922
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/sample_inputs/a
@@ -0,0 +1 @@
+a
diff --git a/unicorn_mode/samples/speedtest/target.c b/unicorn_mode/samples/speedtest/target.c
new file mode 100644
index 00000000..8359a110
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/target.c
@@ -0,0 +1,77 @@
+/*
+ * Sample target file to test afl-unicorn fuzzing capabilities.
+ * This is a very trivial example that will, however, never crash.
+ * Crashing would change the execution speed.
+ *
+ */
+#include <stdint.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+// Random print function we can hook in our harness to test hook speeds.
/* Random print function we can hook in our harness to test hook speeds.
 * Echoes `to_print` (with some chatter) on stdout and returns it unchanged. */
char magicfn(char to_print) {
  printf("Printing a char, just minding my own business: \n");
  printf("%c\n", to_print);
  return to_print;
}
+
/* Exercises the harness hooks (malloc/free/magicfn) in a tight loop over
 * prefixes of argv[1]. Designed to run under the unicorn harness, where
 * malloc/free/magicfn are replaced by hooks; on a real host some branches
 * below perform out-of-bounds reads (see notes). */
int main(int argc, char** argv) {
  if (argc < 2) {
    printf("Gimme input pl0x!\n");
    return -1;
  }
 
  // Make sure the hooks work...
  char *test = malloc(1024);
  if (!test) {
    printf("Uh-Oh, malloc doesn't work!");
    abort();
  }
  free(test);

  char *data_buf = argv[1];
  // We can start the unicorn hooking here.
  uint64_t data_len = strlen(data_buf);
  if (data_len < 20) return -2;

  // Walk data_len from strlen-1 down to 0, re-allocating a shrinking copy
  // of the input each round to stress the malloc/free hooks.
  for (; data_len --> 0 ;) {
    char *buf_cpy = NULL;
    if (data_len) {
      buf_cpy = malloc(data_len);
      if (!buf_cpy) {
        puts("Oof, malloc failed! :/");
        abort();
      }
      memcpy(buf_cpy, data_buf, data_len);
    }
    if (data_len >= 18) {
      free(buf_cpy);
      continue;
    }
    if (data_len > 2 && data_len < 18) {
      // Scribble into the copy so the allocation isn't optimized away.
      buf_cpy[data_len - 1] = (char) 0x90;
    } else if (data_buf[9] == (char) 0x90 && data_buf[10] != 0x00 && buf_cpy[11] == (char) 0x90) {
        // Cause a crash if data[10] is not zero, but [9] and [11] are zero
        // NOTE(review): this branch runs with data_len <= 2, so buf_cpy is
        // at most 2 bytes (or NULL when data_len == 0); buf_cpy[11] is an
        // out-of-bounds read that the emulator's flat memory map absorbs -
        // confirm this is intentional for the speedtest sample.
        unsigned char valid_read = buf_cpy[10];
        if (magicfn(valid_read) != valid_read) {
          puts("Oof, the hook for data_buf[10] is broken?");
          abort();
        }
    }
    free(buf_cpy);
  }
  if (data_buf[0] > 0x10 && data_buf[0] < 0x20 && data_buf[1] > data_buf[2]) {
    // Cause an 'invalid read' crash if (0x10 < data[0] < 0x20) and data[1] > data[2]
    unsigned char valid_read = data_buf[0];
    if (magicfn(valid_read) != valid_read) {
      puts("Oof, the hook for data_buf[0] is broken?");
      abort();
    }
  } 

  // Final hookable call; the harness skips magicfn entirely during fuzzing.
  magicfn('q');

  return 0;
}
diff --git a/unicorn_mode/unicornafl b/unicorn_mode/unicornafl
-Subproject c6d6647161a32bae88785a618fcd828d1711d9e
+Subproject fb2fc9f25df32f17f6b6b859e4dbd70f9a857e0
diff --git a/unicorn_mode/update_uc_ref.sh b/unicorn_mode/update_uc_ref.sh
index a2613942..7c1c7778 100755
--- a/unicorn_mode/update_uc_ref.sh
+++ b/unicorn_mode/update_uc_ref.sh
@@ -19,7 +19,7 @@ if [ "$NEW_VERSION" = "-h" ]; then
   exit 1
 fi
 
-git submodule init && git submodule update || exit 1
+git submodule init && git submodule update unicornafl || exit 1
 cd ./unicornafl || exit 1
 git fetch origin dev 1>/dev/null || exit 1
 git stash 1>/dev/null 2>/dev/null