Diffstat (limited to 'unicorn_mode')
29 files changed, 2180 insertions, 439 deletions
diff --git a/unicorn_mode/README.md b/unicorn_mode/README.md index f6bd4d12..b3df44fa 100644 --- a/unicorn_mode/README.md +++ b/unicorn_mode/README.md @@ -8,19 +8,19 @@ The CompareCoverage and NeverZero counters features are by Andrea Fioraldi <andr ## 1) Introduction -The code in ./unicorn_mode allows you to build a standalone feature that -leverages the Unicorn Engine and allows callers to obtain instrumentation +The code in ./unicorn_mode allows you to build the (Unicorn Engine)[https://github.com/unicorn-engine/unicorn] with afl support. +This means, you can run anything that can be emulated in unicorn and obtain instrumentation output for black-box, closed-source binary code snippets. This mechanism can be then used by afl-fuzz to stress-test targets that couldn't be built -with afl-gcc or used in QEMU mode, or with other extensions such as -TriforceAFL. +with afl-cc or used in QEMU mode. There is a significant performance penalty compared to native AFL, but at least we're able to use AFL++ on these binaries, right? ## 2) How to use -Requirements: you need an installed python environment. +First, you will need a working harness for your target in unicorn, using Python, C, or Rust. +For some pointers for more advanced emulation, take a look at [BaseSAFE](https://github.com/fgsect/BaseSAFE) and [Qiling](https://github.com/qilingframework/qiling). ### Building AFL++'s Unicorn Mode @@ -34,23 +34,23 @@ cd unicorn_mode ``` NOTE: This script checks out a Unicorn Engine fork as submodule that has been tested -and is stable-ish, based on the unicorn engine master. +and is stable-ish, based on the unicorn engine `next` branch. Building Unicorn will take a little bit (~5-10 minutes). Once it completes it automatically compiles a sample application and verifies that it works. 
### Fuzzing with Unicorn Mode -To really use unicorn-mode effectively you need to prepare the following: +To use unicorn-mode effectively you need to prepare the following: * Relevant binary code to be fuzzed * Knowledge of the memory map and good starting state * Folder containing sample inputs to start fuzzing with + Same ideas as any other AFL inputs - + Quality/speed of results will depend greatly on quality of starting + + Quality/speed of results will depend greatly on the quality of starting samples + See AFL's guidance on how to create a sample corpus - * Unicornafl-based test harness which: + * Unicornafl-based test harness in Rust, C, or Python, which: + Adds memory map regions + Loads binary code into memory + Calls uc.afl_fuzz() / uc.afl_start_forkserver @@ -59,13 +59,13 @@ To really use unicorn-mode effectively you need to prepare the following: the test harness + Presumably the data to be fuzzed is at a fixed buffer address + If input constraints (size, invalid bytes, etc.) are known they - should be checked after the file is loaded. If a constraint - fails, just exit the test harness. AFL will treat the input as + should be checked in the place_input handler. If a constraint + fails, just return false from the handler. AFL will treat the input as 'uninteresting' and move on. + Sets up registers and memory state for beginning of test - + Emulates the interested code from beginning to end + + Emulates the interesting code from beginning to end + If a crash is detected, the test harness must 'crash' by - throwing a signal (SIGSEGV, SIGKILL, SIGABORT, etc.) + throwing a signal (SIGSEGV, SIGKILL, SIGABORT, etc.), or indicate a crash in the crash validation callback. Once you have all those things ready to go you just need to run afl-fuzz in 'unicorn-mode' by passing in the '-U' flag: @@ -79,11 +79,12 @@ AFL's main documentation for more info about how to use afl-fuzz effectively. 
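The constraint-check behavior described in the harness checklist above — validate the mutated input in the place_input handler and return false on violation — can be sketched in plain Python. This is a mock, not the real unicornafl callback: the 4-argument shape mirrors the Python bindings, but `uc` is a stand-in here, and `INPUT_ADDR`/`MAX_LEN` are hypothetical values a real harness would pick for its target.

```python
# Sketch of a place_input-style handler: validate the mutated input before
# it would be written into emulator memory. Returning False signals AFL++
# that the input is 'uninteresting' so it moves on to the next one.

INPUT_ADDR = 0x10000  # hypothetical fixed buffer address in the target
MAX_LEN = 0x100       # hypothetical size constraint known for the target

def place_input(uc, input_bytes, persistent_round, data):
    # Known constraints (size, invalid bytes, ...) are checked here,
    # not after loading a file from disk.
    if len(input_bytes) == 0 or len(input_bytes) > MAX_LEN:
        return False
    # A real harness would now do: uc.mem_write(INPUT_ADDR, input_bytes)
    return True
```

The key design point is that a failed constraint does not crash or exit the harness; it simply rejects the input, which keeps the fork server (or persistent loop) alive and fast.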
For a much clearer vision of what all of this looks like, please refer to the sample provided in the 'unicorn_mode/samples' directory. There is also a blog -post that goes over the basics at: +post that uses slightly older concepts, but describes the general ideas, at: [https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf](https://medium.com/@njvoss299/afl-unicorn-fuzzing-arbitrary-binary-code-563ca28936bf) -The 'helper_scripts' directory also contains several helper scripts that allow you + +The ['helper_scripts'](./helper_scripts) directory also contains several helper scripts that allow you to dump context from a running process, load it, and hook heap allocations. For details on how to use this check out the follow-up blog post to the one linked above. @@ -92,10 +93,10 @@ A example use of AFL-Unicorn mode is discussed in the paper Unicorefuzz: ## 3) Options -As for the QEMU-based instrumentation, the afl-unicorn twist of afl++ -comes with a sub-instruction based instrumentation similar in purpose to laf-intel. +As for the QEMU-based instrumentation, unicornafl comes with a sub-instruction based instrumentation similar in purpose to laf-intel. The options that enable Unicorn CompareCoverage are the same used for QEMU. +This will split up each multi-byte compare to give feedback for each correct byte. AFL_COMPCOV_LEVEL=1 is to instrument comparisons with only immediate values. AFL_COMPCOV_LEVEL=2 instruments all comparison instructions. @@ -119,6 +120,20 @@ unicornafl.monkeypatch() This will replace all unicorn imports with unicornafl inputs. -Refer to the [samples/arm_example/arm_tester.c](samples/arm_example/arm_tester.c) for an example -of how to do this properly! If you don't get this right, AFL will not -load any mutated inputs and your fuzzing will be useless! +5) Examples + +Apart from reading the documentation in `afl.c` and the python bindings of unicornafl, the best documentation are the [samples/](./samples). 
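The CompareCoverage idea mentioned above — splitting each multi-byte compare so that every correct byte produces feedback — can be illustrated with a small self-contained sketch. This is only the intuition, not the actual sub-instruction instrumentation; the function name is made up for illustration.

```python
def compcov_feedback(a: bytes, b: bytes) -> int:
    """Count leading bytes that already match.

    With a plain multi-byte compare the fuzzer only learns equal/not-equal
    (one bit of signal). Byte-split comparison, as laf-intel-style CompCov
    does, turns each newly matched byte into new coverage, guiding mutation
    toward the full magic value one byte at a time.
    """
    matched = 0
    for x, y in zip(a, b):
        if x != y:
            break
        matched += 1
    return matched
```

So an input matching 3 of 4 magic bytes scores differently from one matching none, which is exactly the gradient a coverage-guided fuzzer needs to traverse hard-to-reach blocks.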
+The following examples exist at the time of writing: + +- c: A simple example how to use the c bindings +- compcov_x64: A python example that uses compcov to traverse hard-to-reach blocks +- persistent: A c example using persistent mode for maximum speed, and resetting the target state between each iteration +- simple: A simple python example +- speedtest/c: The c harness for an example target, used to compare c, python, and rust bindings and fix speed issues +- speedtest/python: Fuzzing the same target in python +- speedtest/rust: Fuzzing the same target using a rust harness + +Usually, the place to look at is the `harness` in each folder. The source code in each harness is pretty well documented. +Most harnesses also have the `afl-fuzz` commandline, or even offer a `make fuzz` Makefile target. +Targets in these folders, if x86, can usually be made using `make target` in each folder or get shipped pre-built (plus their source). +Especially take a look at the [speedtest documentation](./samples/speedtest/README.md) to see how the languages compare. \ No newline at end of file diff --git a/unicorn_mode/UNICORNAFL_VERSION b/unicorn_mode/UNICORNAFL_VERSION index 02736b77..d9ae5590 100644 --- a/unicorn_mode/UNICORNAFL_VERSION +++ b/unicorn_mode/UNICORNAFL_VERSION @@ -1 +1 @@ -c6d66471 +fb2fc9f2 diff --git a/unicorn_mode/build_unicorn_support.sh b/unicorn_mode/build_unicorn_support.sh index 841728d7..6c376f8d 100755 --- a/unicorn_mode/build_unicorn_support.sh +++ b/unicorn_mode/build_unicorn_support.sh @@ -44,7 +44,7 @@ echo "[*] Performing basic sanity checks..." PLT=`uname -s` -if [ ! "$PLT" = "Linux" ] && [ ! "$PLT" = "Darwin" ] && [ ! "$PLT" = "FreeBSD" ] && [ ! "$PLT" = "NetBSD" ] && [ ! "$PLT" = "OpenBSD" ]; then +if [ ! "$PLT" = "Linux" ] && [ ! "$PLT" = "Darwin" ] && [ ! "$PLT" = "FreeBSD" ] && [ ! "$PLT" = "NetBSD" ] && [ ! "$PLT" = "OpenBSD" ] && [ ! "$PLT" = "DragonFly" ]; then echo "[-] Error: Unicorn instrumentation is unsupported on $PLT." 
exit 1 @@ -70,6 +70,11 @@ MAKECMD=make TARCMD=tar if [ "$PLT" = "Linux" ]; then + MUSL=`ldd --version 2>&1 | head -n 1 | cut -f 1 -d " "` + if [ "musl" = $MUSL ]; then + echo "[-] Error: Unicorn instrumentation is unsupported with the musl's libc." + exit 1 + fi CORES=`nproc` fi @@ -84,6 +89,12 @@ if [ "$PLT" = "FreeBSD" ]; then TARCMD=gtar fi +if [ "$PLT" = "DragonFly" ]; then + MAKECMD=gmake + CORES=`sysctl -n hw.ncpu` + TARCMD=tar +fi + if [ "$PLT" = "NetBSD" ] || [ "$PLT" = "OpenBSD" ]; then MAKECMD=gmake CORES=`sysctl -n hw.ncpu` @@ -106,19 +117,19 @@ done # some python version should be available now PYTHONS="`command -v python3` `command -v python` `command -v python2`" -EASY_INSTALL_FOUND=0 +SETUPTOOLS_FOUND=0 for PYTHON in $PYTHONS ; do if $PYTHON -c "import setuptools" ; then - EASY_INSTALL_FOUND=1 + SETUPTOOLS_FOUND=1 PYTHONBIN=$PYTHON break fi done -if [ "0" = $EASY_INSTALL_FOUND ]; then +if [ "0" = $SETUPTOOLS_FOUND ]; then echo "[-] Error: Python setup-tools not found. Run 'sudo apt-get install python-setuptools', or install python3-setuptools, or run '$PYTHONBIN -m ensurepip', or create a virtualenv, or ..." PREREQ_NOTFOUND=1 @@ -136,6 +147,8 @@ if [ "$PREREQ_NOTFOUND" = "1" ]; then exit 1 fi +unset CFLAGS + echo "[+] All checks passed!" echo "[*] Making sure unicornafl is checked out" @@ -144,7 +157,8 @@ git status 1>/dev/null 2>/dev/null if [ $? -eq 0 ]; then echo "[*] initializing unicornafl submodule" git submodule init || exit 1 - git submodule update 2>/dev/null # ignore errors + git submodule update ./unicornafl 2>/dev/null # ignore errors + git submodule sync ./unicornafl 2>/dev/null # ignore errors else echo "[*] cloning unicornafl" test -d unicornafl || { @@ -165,8 +179,9 @@ echo "[*] Checking out $UNICORNAFL_VERSION" sh -c 'git stash && git stash drop' 1>/dev/null 2>/dev/null git checkout "$UNICORNAFL_VERSION" || exit 1 -echo "[*] making sure config.h matches" -cp "../../config.h" "." 
|| exit 1 +echo "[*] making sure afl++ header files match" +cp "../../include/config.h" "." || exit 1 +cp "../../include/types.h" "." || exit 1 echo "[*] Configuring Unicorn build..." diff --git a/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py b/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py index 22b9fd47..1ac4c9f3 100644 --- a/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py +++ b/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py @@ -1,13 +1,13 @@ """ unicorn_dumper_gdb.py - + When run with GDB sitting at a debug breakpoint, this dumps the current state (registers/memory/etc) of - the process to a directory consisting of an index - file with register and segment information and + the process to a directory consisting of an index + file with register and segment information and sub-files containing all actual process memory. - - The output of this script is expected to be used + + The output of this script is expected to be used to initialize context for Unicorn emulation. ----------- @@ -44,30 +44,32 @@ MAX_SEG_SIZE = 128 * 1024 * 1024 # Name of the index file INDEX_FILE_NAME = "_index.json" -#---------------------- -#---- Helper Functions + +# ---------------------- +# ---- Helper Functions + def map_arch(): - arch = get_arch() # from GEF - if 'x86_64' in arch or 'x86-64' in arch: + arch = get_arch() # from GEF + if "x86_64" in arch or "x86-64" in arch: return "x64" - elif 'x86' in arch or 'i386' in arch: + elif "x86" in arch or "i386" in arch: return "x86" - elif 'aarch64' in arch or 'arm64' in arch: + elif "aarch64" in arch or "arm64" in arch: return "arm64le" - elif 'aarch64_be' in arch: + elif "aarch64_be" in arch: return "arm64be" - elif 'armeb' in arch: + elif "armeb" in arch: # check for THUMB mode - cpsr = get_register('cpsr') - if (cpsr & (1 << 5)): + cpsr = get_register("$cpsr") + if cpsr & (1 << 5): return "armbethumb" else: return "armbe" - elif 'arm' in arch: + elif "arm" in arch: # check for THUMB mode - cpsr = get_register('cpsr') - if 
(cpsr & (1 << 5)): + cpsr = get_register("$cpsr") + if cpsr & (1 << 5): return "armlethumb" else: return "armle" @@ -75,8 +77,9 @@ def map_arch(): return "" -#----------------------- -#---- Dumping functions +# ----------------------- +# ---- Dumping functions + def dump_arch_info(): arch_info = {} @@ -88,19 +91,15 @@ def dump_regs(): reg_state = {} for reg in current_arch.all_registers: reg_val = get_register(reg) - # current dumper script looks for register values to be hex strings -# reg_str = "0x{:08x}".format(reg_val) -# if "64" in get_arch(): -# reg_str = "0x{:016x}".format(reg_val) -# reg_state[reg.strip().strip('$')] = reg_str - reg_state[reg.strip().strip('$')] = reg_val + reg_state[reg.strip().strip("$")] = reg_val + return reg_state def dump_process_memory(output_dir): # Segment information dictionary final_segment_list = [] - + # GEF: vmmap = get_process_maps() if not vmmap: @@ -110,45 +109,91 @@ def dump_process_memory(output_dir): for entry in vmmap: if entry.page_start == entry.page_end: continue - - seg_info = {'start': entry.page_start, 'end': entry.page_end, 'name': entry.path, 'permissions': { - "r": entry.is_readable() > 0, - "w": entry.is_writable() > 0, - "x": entry.is_executable() > 0 - }, 'content_file': ''} + + seg_info = { + "start": entry.page_start, + "end": entry.page_end, + "name": entry.path, + "permissions": { + "r": entry.is_readable() > 0, + "w": entry.is_writable() > 0, + "x": entry.is_executable() > 0, + }, + "content_file": "", + } # "(deleted)" may or may not be valid, but don't push it. 
- if entry.is_readable() and not '(deleted)' in entry.path: + if entry.is_readable() and not "(deleted)" in entry.path: try: # Compress and dump the content to a file seg_content = read_memory(entry.page_start, entry.size) - if(seg_content == None): - print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(entry.page_start, entry.path)) + if seg_content == None: + print( + "Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format( + entry.page_start, entry.path + ) + ) else: - print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(entry.page_start, len(seg_content), entry.path, repr(seg_info['permissions']))) + print( + "Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format( + entry.page_start, + len(seg_content), + entry.path, + repr(seg_info["permissions"]), + ) + ) compressed_seg_content = zlib.compress(seg_content) md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin" seg_info["content_file"] = md5_sum - + # Write the compressed contents to disk - out_file = open(os.path.join(output_dir, md5_sum), 'wb') + out_file = open(os.path.join(output_dir, md5_sum), "wb") out_file.write(compressed_seg_content) out_file.close() except: - print("Exception reading segment ({}): {}".format(entry.path, sys.exc_info()[0])) + print( + "Exception reading segment ({}): {}".format( + entry.path, sys.exc_info()[0] + ) + ) else: - print("Skipping segment {0}@0x{1:016x}".format(entry.path, entry.page_start)) + print( + "Skipping segment {0}@0x{1:016x}".format(entry.path, entry.page_start) + ) # Add the segment to the list final_segment_list.append(seg_info) - return final_segment_list -#---------- -#---- Main - + +# --------------------------------------------- +# ---- ARM Extention (dump floating point regs) + + +def dump_float(rge=32): + reg_convert = "" + if ( + map_arch() == "armbe" + or map_arch() == "armle" + or map_arch() == "armbethumb" + or map_arch() == "armbethumb" + ): + reg_state = {} + for reg_num in range(32): + value = 
gdb.selected_frame().read_register("d" + str(reg_num)) + reg_state["d" + str(reg_num)] = int(str(value["u64"]), 16) + value = gdb.selected_frame().read_register("fpscr") + reg_state["fpscr"] = int(str(value), 16) + + return reg_state + + +# ---------- +# ---- Main + + def main(): print("----- Unicorn Context Dumper -----") print("You must be actively debugging before running this!") @@ -159,32 +204,35 @@ def main(): print("!!! GEF not running in GDB. Please run gef.py by executing:") print('\tpython execfile ("<path_to_gef>/gef.py")') return - + try: - + # Create the output directory - timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S') + timestamp = datetime.datetime.fromtimestamp(time.time()).strftime( + "%Y%m%d_%H%M%S" + ) output_path = "UnicornContext_" + timestamp if not os.path.exists(output_path): os.makedirs(output_path) print("Process context will be output to {}".format(output_path)) - + # Get the context context = { "arch": dump_arch_info(), - "regs": dump_regs(), + "regs": dump_regs(), + "regs_extended": dump_float(), "segments": dump_process_memory(output_path), } # Write the index file - index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w') + index_file = open(os.path.join(output_path, INDEX_FILE_NAME), "w") index_file.write(json.dumps(context, indent=4)) - index_file.close() + index_file.close() print("Done.") - + except Exception as e: print("!!! 
ERROR:\n\t{}".format(repr(e))) - + + if __name__ == "__main__": main() - diff --git a/unicorn_mode/helper_scripts/unicorn_dumper_ida.py b/unicorn_mode/helper_scripts/unicorn_dumper_ida.py index 6cf9f30f..fa29fb90 100644 --- a/unicorn_mode/helper_scripts/unicorn_dumper_ida.py +++ b/unicorn_mode/helper_scripts/unicorn_dumper_ida.py @@ -31,8 +31,9 @@ MAX_SEG_SIZE = 128 * 1024 * 1024 # Name of the index file INDEX_FILE_NAME = "_index.json" -#---------------------- -#---- Helper Functions +# ---------------------- +# ---- Helper Functions + def get_arch(): if ph.id == PLFM_386 and ph.flag & PR_USE64: @@ -52,6 +53,7 @@ def get_arch(): else: return "" + def get_register_list(arch): if arch == "arm64le" or arch == "arm64be": arch = "arm64" @@ -59,84 +61,174 @@ def get_register_list(arch): arch = "arm" registers = { - "x64" : [ - "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp", - "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", - "rip", "rsp", "efl", - "cs", "ds", "es", "fs", "gs", "ss", + "x64": [ + "rax", + "rbx", + "rcx", + "rdx", + "rsi", + "rdi", + "rbp", + "rsp", + "r8", + "r9", + "r10", + "r11", + "r12", + "r13", + "r14", + "r15", + "rip", + "rsp", + "efl", + "cs", + "ds", + "es", + "fs", + "gs", + "ss", + ], + "x86": [ + "eax", + "ebx", + "ecx", + "edx", + "esi", + "edi", + "ebp", + "esp", + "eip", + "esp", + "efl", + "cs", + "ds", + "es", + "fs", + "gs", + "ss", ], - "x86" : [ - "eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp", - "eip", "esp", "efl", - "cs", "ds", "es", "fs", "gs", "ss", - ], - "arm" : [ - "R0", "R1", "R2", "R3", "R4", "R5", "R6", "R7", - "R8", "R9", "R10", "R11", "R12", "PC", "SP", "LR", + "arm": [ + "R0", + "R1", + "R2", + "R3", + "R4", + "R5", + "R6", + "R7", + "R8", + "R9", + "R10", + "R11", + "R12", + "PC", + "SP", + "LR", "PSR", ], - "arm64" : [ - "X0", "X1", "X2", "X3", "X4", "X5", "X6", "X7", - "X8", "X9", "X10", "X11", "X12", "X13", "X14", - "X15", "X16", "X17", "X18", "X19", "X20", "X21", - "X22", "X23", "X24", "X25", 
"X26", "X27", "X28", - "PC", "SP", "FP", "LR", "CPSR" + "arm64": [ + "X0", + "X1", + "X2", + "X3", + "X4", + "X5", + "X6", + "X7", + "X8", + "X9", + "X10", + "X11", + "X12", + "X13", + "X14", + "X15", + "X16", + "X17", + "X18", + "X19", + "X20", + "X21", + "X22", + "X23", + "X24", + "X25", + "X26", + "X27", + "X28", + "PC", + "SP", + "FP", + "LR", + "CPSR" # "NZCV", - ] + ], } - return registers[arch] + return registers[arch] + + +# ----------------------- +# ---- Dumping functions -#----------------------- -#---- Dumping functions def dump_arch_info(): arch_info = {} arch_info["arch"] = get_arch() return arch_info + def dump_regs(): reg_state = {} for reg in get_register_list(get_arch()): reg_state[reg] = GetRegValue(reg) return reg_state + def dump_process_memory(output_dir): # Segment information dictionary segment_list = [] - + # Loop over the segments, fill in the info dictionary for seg_ea in Segments(): seg_start = SegStart(seg_ea) seg_end = SegEnd(seg_ea) seg_size = seg_end - seg_start - + seg_info = {} - seg_info["name"] = SegName(seg_ea) + seg_info["name"] = SegName(seg_ea) seg_info["start"] = seg_start - seg_info["end"] = seg_end - + seg_info["end"] = seg_end + perms = getseg(seg_ea).perm seg_info["permissions"] = { - "r": False if (perms & SEGPERM_READ) == 0 else True, + "r": False if (perms & SEGPERM_READ) == 0 else True, "w": False if (perms & SEGPERM_WRITE) == 0 else True, - "x": False if (perms & SEGPERM_EXEC) == 0 else True, + "x": False if (perms & SEGPERM_EXEC) == 0 else True, } if (perms & SEGPERM_READ) and seg_size <= MAX_SEG_SIZE and isLoaded(seg_start): try: # Compress and dump the content to a file seg_content = get_many_bytes(seg_start, seg_end - seg_start) - if(seg_content == None): - print("Segment empty: {0}@0x{1:016x} (size:UNKNOWN)".format(SegName(seg_ea), seg_ea)) + if seg_content == None: + print( + "Segment empty: {0}@0x{1:016x} (size:UNKNOWN)".format( + SegName(seg_ea), seg_ea + ) + ) seg_info["content_file"] = "" else: - 
print("Dumping segment {0}@0x{1:016x} (size:{2})".format(SegName(seg_ea), seg_ea, len(seg_content))) + print( + "Dumping segment {0}@0x{1:016x} (size:{2})".format( + SegName(seg_ea), seg_ea, len(seg_content) + ) + ) compressed_seg_content = zlib.compress(seg_content) md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin" seg_info["content_file"] = md5_sum - + # Write the compressed contents to disk - out_file = open(os.path.join(output_dir, md5_sum), 'wb') + out_file = open(os.path.join(output_dir, md5_sum), "wb") out_file.write(compressed_seg_content) out_file.close() except: @@ -145,12 +237,13 @@ def dump_process_memory(output_dir): else: print("Skipping segment {0}@0x{1:016x}".format(SegName(seg_ea), seg_ea)) seg_info["content_file"] = "" - + # Add the segment to the list - segment_list.append(seg_info) - + segment_list.append(seg_info) + return segment_list + """ TODO: FINISH IMPORT DUMPING def import_callback(ea, name, ord): @@ -169,41 +262,47 @@ def dump_imports(): return import_dict """ - -#---------- -#---- Main - + +# ---------- +# ---- Main + + def main(): try: print("----- Unicorn Context Dumper -----") print("You must be actively debugging before running this!") - print("If it fails, double check that you are actively debugging before running.") + print( + "If it fails, double check that you are actively debugging before running." 
+ ) # Create the output directory - timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S') + timestamp = datetime.datetime.fromtimestamp(time.time()).strftime( + "%Y%m%d_%H%M%S" + ) output_path = os.path.dirname(os.path.abspath(GetIdbPath())) output_path = os.path.join(output_path, "UnicornContext_" + timestamp) if not os.path.exists(output_path): os.makedirs(output_path) print("Process context will be output to {}".format(output_path)) - + # Get the context context = { "arch": dump_arch_info(), - "regs": dump_regs(), + "regs": dump_regs(), "segments": dump_process_memory(output_path), - #"imports": dump_imports(), + # "imports": dump_imports(), } # Write the index file - index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w') + index_file = open(os.path.join(output_path, INDEX_FILE_NAME), "w") index_file.write(json.dumps(context, indent=4)) - index_file.close() + index_file.close() print("Done.") - + except Exception, e: print("!!! ERROR:\n\t{}".format(str(e))) - + + if __name__ == "__main__": main() diff --git a/unicorn_mode/helper_scripts/unicorn_dumper_lldb.py b/unicorn_mode/helper_scripts/unicorn_dumper_lldb.py index 3c019d77..179d062a 100644 --- a/unicorn_mode/helper_scripts/unicorn_dumper_lldb.py +++ b/unicorn_mode/helper_scripts/unicorn_dumper_lldb.py @@ -50,10 +50,11 @@ UNICORN_PAGE_SIZE = 0x1000 # Alignment functions to align all memory segments to Unicorn page boundaries (4KB pages only) ALIGN_PAGE_DOWN = lambda x: x & ~(UNICORN_PAGE_SIZE - 1) -ALIGN_PAGE_UP = lambda x: (x + UNICORN_PAGE_SIZE - 1) & ~(UNICORN_PAGE_SIZE-1) +ALIGN_PAGE_UP = lambda x: (x + UNICORN_PAGE_SIZE - 1) & ~(UNICORN_PAGE_SIZE - 1) + +# ---------------------- +# ---- Helper Functions -#---------------------- -#---- Helper Functions def overlap_alignments(segments, memory): final_list = [] @@ -61,33 +62,40 @@ def overlap_alignments(segments, memory): curr_end_addr = 0 curr_node = None current_segment = None - sorted_segments = sorted(segments, 
key=lambda k: (k['start'], k['end'])) + sorted_segments = sorted(segments, key=lambda k: (k["start"], k["end"])) if curr_seg_idx < len(sorted_segments): current_segment = sorted_segments[curr_seg_idx] - for mem in sorted(memory, key=lambda k: (k['start'], -k['end'])): + for mem in sorted(memory, key=lambda k: (k["start"], -k["end"])): if curr_node is None: - if current_segment is not None and current_segment['start'] == mem['start']: + if current_segment is not None and current_segment["start"] == mem["start"]: curr_node = deepcopy(current_segment) - curr_node['permissions'] = mem['permissions'] + curr_node["permissions"] = mem["permissions"] else: curr_node = deepcopy(mem) - curr_end_addr = curr_node['end'] - - while curr_end_addr <= mem['end']: - if curr_node['end'] == mem['end']: - if current_segment is not None and current_segment['start'] > curr_node['start'] and current_segment['start'] < curr_node['end']: - curr_node['end'] = current_segment['start'] - if(curr_node['end'] > curr_node['start']): + curr_end_addr = curr_node["end"] + + while curr_end_addr <= mem["end"]: + if curr_node["end"] == mem["end"]: + if ( + current_segment is not None + and current_segment["start"] > curr_node["start"] + and current_segment["start"] < curr_node["end"] + ): + curr_node["end"] = current_segment["start"] + if curr_node["end"] > curr_node["start"]: final_list.append(curr_node) curr_node = deepcopy(current_segment) - curr_node['permissions'] = mem['permissions'] - curr_end_addr = curr_node['end'] + curr_node["permissions"] = mem["permissions"] + curr_end_addr = curr_node["end"] else: - if(curr_node['end'] > curr_node['start']): + if curr_node["end"] > curr_node["start"]: final_list.append(curr_node) # if curr_node is a segment - if current_segment is not None and current_segment['end'] == mem['end']: + if ( + current_segment is not None + and current_segment["end"] == mem["end"] + ): curr_seg_idx += 1 if curr_seg_idx < len(sorted_segments): current_segment = 
sorted_segments[curr_seg_idx] @@ -98,50 +106,56 @@ def overlap_alignments(segments, memory): break # could only be a segment else: - if curr_node['end'] < mem['end']: + if curr_node["end"] < mem["end"]: # check for remaining segments and valid segments - if(curr_node['end'] > curr_node['start']): + if curr_node["end"] > curr_node["start"]: final_list.append(curr_node) - + curr_seg_idx += 1 if curr_seg_idx < len(sorted_segments): current_segment = sorted_segments[curr_seg_idx] else: current_segment = None - - if current_segment is not None and current_segment['start'] <= curr_end_addr and current_segment['start'] < mem['end']: + + if ( + current_segment is not None + and current_segment["start"] <= curr_end_addr + and current_segment["start"] < mem["end"] + ): curr_node = deepcopy(current_segment) - curr_node['permissions'] = mem['permissions'] + curr_node["permissions"] = mem["permissions"] else: # no more segments curr_node = deepcopy(mem) - - curr_node['start'] = curr_end_addr - curr_end_addr = curr_node['end'] - return final_list + curr_node["start"] = curr_end_addr + curr_end_addr = curr_node["end"] + + return final_list + # https://github.com/llvm-mirror/llvm/blob/master/include/llvm/ADT/Triple.h def get_arch(): - arch, arch_vendor, arch_os = lldb.target.GetTriple().split('-') - if arch == 'x86_64': + arch, arch_vendor, arch_os = lldb.target.GetTriple().split("-") + if arch == "x86_64": return "x64" - elif arch == 'x86' or arch == 'i386': + elif arch == "x86" or arch == "i386": return "x86" - elif arch == 'aarch64' or arch == 'arm64': + elif arch == "aarch64" or arch == "arm64": return "arm64le" - elif arch == 'aarch64_be': + elif arch == "aarch64_be": return "arm64be" - elif arch == 'armeb': + elif arch == "armeb": return "armbe" - elif arch == 'arm': + elif arch == "arm": return "armle" else: return "" -#----------------------- -#---- Dumping functions +# ----------------------- +# ---- Dumping functions + def dump_arch_info(): arch_info = {} @@ -152,56 
+166,64 @@ def dump_arch_info(): def dump_regs(): reg_state = {} for reg_list in lldb.frame.GetRegisters(): - if 'general purpose registers' in reg_list.GetName().lower(): + if "general purpose registers" in reg_list.GetName().lower(): for reg in reg_list: reg_state[reg.GetName()] = int(reg.GetValue(), 16) return reg_state + def get_section_info(sec): - name = sec.name if sec.name is not None else '' + name = sec.name if sec.name is not None else "" if sec.GetParent().name is not None: - name = sec.GetParent().name + '.' + sec.name + name = sec.GetParent().name + "." + sec.name module_name = sec.addr.module.file.GetFilename() - module_name = module_name if module_name is not None else '' - long_name = module_name + '.' + name - + module_name = module_name if module_name is not None else "" + long_name = module_name + "." + name + return sec.addr.load_addr, (sec.addr.load_addr + sec.size), sec.size, long_name - + def dump_process_memory(output_dir): # Segment information dictionary raw_segment_list = [] raw_memory_list = [] - + # 1st pass: # Loop over the segments, fill in the segment info dictionary for module in lldb.target.module_iter(): for seg_ea in module.section_iter(): - seg_info = {'module': module.file.GetFilename() } - seg_info['start'], seg_info['end'], seg_size, seg_info['name'] = get_section_info(seg_ea) + seg_info = {"module": module.file.GetFilename()} + ( + seg_info["start"], + seg_info["end"], + seg_size, + seg_info["name"], + ) = get_section_info(seg_ea) # TODO: Ugly hack for -1 LONG address on 32-bit - if seg_info['start'] >= sys.maxint or seg_size <= 0: - print "Throwing away page: {}".format(seg_info['name']) + if seg_info["start"] >= sys.maxint or seg_size <= 0: + print "Throwing away page: {}".format(seg_info["name"]) continue # Page-align segment - seg_info['start'] = ALIGN_PAGE_DOWN(seg_info['start']) - seg_info['end'] = ALIGN_PAGE_UP(seg_info['end']) - print("Appending: {}".format(seg_info['name'])) + seg_info["start"] = 
ALIGN_PAGE_DOWN(seg_info["start"]) + seg_info["end"] = ALIGN_PAGE_UP(seg_info["end"]) + print ("Appending: {}".format(seg_info["name"])) raw_segment_list.append(seg_info) # Add the stack memory region (just hardcode 0x1000 around the current SP) sp = lldb.frame.GetSP() start_sp = ALIGN_PAGE_DOWN(sp) - raw_segment_list.append({'start': start_sp, 'end': start_sp + 0x1000, 'name': 'STACK'}) + raw_segment_list.append( + {"start": start_sp, "end": start_sp + 0x1000, "name": "STACK"} + ) # Write the original memory to file for debugging - index_file = open(os.path.join(output_dir, DEBUG_MEM_FILE_NAME), 'w') + index_file = open(os.path.join(output_dir, DEBUG_MEM_FILE_NAME), "w") index_file.write(json.dumps(raw_segment_list, indent=4)) - index_file.close() + index_file.close() - # Loop over raw memory regions + # Loop over raw memory regions mem_info = lldb.SBMemoryRegionInfo() start_addr = -1 next_region_addr = 0 @@ -218,15 +240,20 @@ def dump_process_memory(output_dir): end_addr = mem_info.GetRegionEnd() # Unknown region name - region_name = 'UNKNOWN' + region_name = "UNKNOWN" # Ignore regions that aren't even mapped if mem_info.IsMapped() and mem_info.IsReadable(): - mem_info_obj = {'start': start_addr, 'end': end_addr, 'name': region_name, 'permissions': { - "r": mem_info.IsReadable(), - "w": mem_info.IsWritable(), - "x": mem_info.IsExecutable() - }} + mem_info_obj = { + "start": start_addr, + "end": end_addr, + "name": region_name, + "permissions": { + "r": mem_info.IsReadable(), + "w": mem_info.IsWritable(), + "x": mem_info.IsExecutable(), + }, + } raw_memory_list.append(mem_info_obj) @@ -234,65 +261,89 @@ def dump_process_memory(output_dir): for seg_info in final_segment_list: try: - seg_info['content_file'] = '' - start_addr = seg_info['start'] - end_addr = seg_info['end'] - region_name = seg_info['name'] + seg_info["content_file"] = "" + start_addr = seg_info["start"] + end_addr = seg_info["end"] + region_name = seg_info["name"] # Compress and dump the content to 
a file err = lldb.SBError() - seg_content = lldb.process.ReadMemory(start_addr, end_addr - start_addr, err) - if(seg_content == None): - print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(start_addr, region_name)) - seg_info['content_file'] = '' + seg_content = lldb.process.ReadMemory( + start_addr, end_addr - start_addr, err + ) + if seg_content == None: + print ( + "Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format( + start_addr, region_name + ) + ) + seg_info["content_file"] = "" else: - print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(start_addr, len(seg_content), region_name, repr(seg_info['permissions']))) + print ( + "Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format( + start_addr, + len(seg_content), + region_name, + repr(seg_info["permissions"]), + ) + ) compressed_seg_content = zlib.compress(seg_content) md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin" - seg_info['content_file'] = md5_sum - + seg_info["content_file"] = md5_sum + # Write the compressed contents to disk - out_file = open(os.path.join(output_dir, md5_sum), 'wb') + out_file = open(os.path.join(output_dir, md5_sum), "wb") out_file.write(compressed_seg_content) out_file.close() - + except: - print("Exception reading segment ({}): {}".format(region_name, sys.exc_info()[0])) - + print ( + "Exception reading segment ({}): {}".format( + region_name, sys.exc_info()[0] + ) + ) + return final_segment_list -#---------- -#---- Main - + +# ---------- +# ---- Main + + def main(): try: - print("----- Unicorn Context Dumper -----") - print("You must be actively debugging before running this!") - print("If it fails, double check that you are actively debugging before running.") - + print ("----- Unicorn Context Dumper -----") + print ("You must be actively debugging before running this!") + print ( + "If it fails, double check that you are actively debugging before running." 
+ ) + # Create the output directory - timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S') + timestamp = datetime.datetime.fromtimestamp(time.time()).strftime( + "%Y%m%d_%H%M%S" + ) output_path = "UnicornContext_" + timestamp if not os.path.exists(output_path): os.makedirs(output_path) - print("Process context will be output to {}".format(output_path)) - + print ("Process context will be output to {}".format(output_path)) + # Get the context context = { "arch": dump_arch_info(), - "regs": dump_regs(), + "regs": dump_regs(), "segments": dump_process_memory(output_path), } - + # Write the index file - index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w') + index_file = open(os.path.join(output_path, INDEX_FILE_NAME), "w") index_file.write(json.dumps(context, indent=4)) - index_file.close() - print("Done.") - + index_file.close() + print ("Done.") + except Exception, e: - print("!!! ERROR:\n\t{}".format(repr(e))) - + print ("!!! ERROR:\n\t{}".format(repr(e))) + + if __name__ == "__main__": main() elif lldb.debugger: diff --git a/unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py b/unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py index dc56b2aa..eccbc8bf 100644 --- a/unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py +++ b/unicorn_mode/helper_scripts/unicorn_dumper_pwndbg.py @@ -59,45 +59,47 @@ MAX_SEG_SIZE = 128 * 1024 * 1024 # Name of the index file INDEX_FILE_NAME = "_index.json" -#---------------------- -#---- Helper Functions +# ---------------------- +# ---- Helper Functions + def map_arch(): - arch = pwndbg.arch.current # from PWNDBG - if 'x86_64' in arch or 'x86-64' in arch: + arch = pwndbg.arch.current # from PWNDBG + if "x86_64" in arch or "x86-64" in arch: return "x64" - elif 'x86' in arch or 'i386' in arch: + elif "x86" in arch or "i386" in arch: return "x86" - elif 'aarch64' in arch or 'arm64' in arch: + elif "aarch64" in arch or "arm64" in arch: return "arm64le" - elif 'aarch64_be' in arch: + elif 
"aarch64_be" in arch: return "arm64be" - elif 'arm' in arch: - cpsr = pwndbg.regs['cpsr'] - # check endianess - if pwndbg.arch.endian == 'big': + elif "arm" in arch: + cpsr = pwndbg.regs["cpsr"] + # check endianess + if pwndbg.arch.endian == "big": # check for THUMB mode - if (cpsr & (1 << 5)): + if cpsr & (1 << 5): return "armbethumb" else: return "armbe" else: # check for THUMB mode - if (cpsr & (1 << 5)): + if cpsr & (1 << 5): return "armlethumb" else: return "armle" - elif 'mips' in arch: - if pwndbg.arch.endian == 'little': - return 'mipsel' + elif "mips" in arch: + if pwndbg.arch.endian == "little": + return "mipsel" else: - return 'mips' + return "mips" else: return "" -#----------------------- -#---- Dumping functions +# ----------------------- +# ---- Dumping functions + def dump_arch_info(): arch_info = {} @@ -110,26 +112,26 @@ def dump_regs(): for reg in pwndbg.regs.all: reg_val = pwndbg.regs[reg] # current dumper script looks for register values to be hex strings -# reg_str = "0x{:08x}".format(reg_val) -# if "64" in get_arch(): -# reg_str = "0x{:016x}".format(reg_val) -# reg_state[reg.strip().strip('$')] = reg_str - reg_state[reg.strip().strip('$')] = reg_val + # reg_str = "0x{:08x}".format(reg_val) + # if "64" in get_arch(): + # reg_str = "0x{:016x}".format(reg_val) + # reg_state[reg.strip().strip('$')] = reg_str + reg_state[reg.strip().strip("$")] = reg_val return reg_state def dump_process_memory(output_dir): # Segment information dictionary final_segment_list = [] - + # PWNDBG: vmmap = pwndbg.vmmap.get() - + # Pointer to end of last dumped memory segment - segment_last_addr = 0x0; + segment_last_addr = 0x0 start = None - end = None + end = None if not vmmap: print("No address mapping information found") @@ -141,86 +143,107 @@ def dump_process_memory(output_dir): continue start = entry.start - end = entry.end + end = entry.end - if (segment_last_addr > entry.start): # indicates overlap - if (segment_last_addr > entry.end): # indicates complete 
overlap, so we skip the segment entirely + if segment_last_addr > entry.start: # indicates overlap + if ( + segment_last_addr > entry.end + ): # indicates complete overlap, so we skip the segment entirely continue - else: + else: start = segment_last_addr - - - seg_info = {'start': start, 'end': end, 'name': entry.objfile, 'permissions': { - "r": entry.read, - "w": entry.write, - "x": entry.execute - }, 'content_file': ''} + + seg_info = { + "start": start, + "end": end, + "name": entry.objfile, + "permissions": {"r": entry.read, "w": entry.write, "x": entry.execute}, + "content_file": "", + } # "(deleted)" may or may not be valid, but don't push it. - if entry.read and not '(deleted)' in entry.objfile: + if entry.read and not "(deleted)" in entry.objfile: try: # Compress and dump the content to a file seg_content = pwndbg.memory.read(start, end - start) - if(seg_content == None): - print("Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format(entry.start, entry.objfile)) + if seg_content == None: + print( + "Segment empty: @0x{0:016x} (size:UNKNOWN) {1}".format( + entry.start, entry.objfile + ) + ) else: - print("Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format(entry.start, len(seg_content), entry.objfile, repr(seg_info['permissions']))) + print( + "Dumping segment @0x{0:016x} (size:0x{1:x}): {2} [{3}]".format( + entry.start, + len(seg_content), + entry.objfile, + repr(seg_info["permissions"]), + ) + ) compressed_seg_content = zlib.compress(str(seg_content)) md5_sum = hashlib.md5(compressed_seg_content).hexdigest() + ".bin" seg_info["content_file"] = md5_sum - + # Write the compressed contents to disk - out_file = open(os.path.join(output_dir, md5_sum), 'wb') + out_file = open(os.path.join(output_dir, md5_sum), "wb") out_file.write(compressed_seg_content) out_file.close() except Exception as e: traceback.print_exc() - print("Exception reading segment ({}): {}".format(entry.objfile, sys.exc_info()[0])) + print( + "Exception reading segment ({}): 
{}".format( + entry.objfile, sys.exc_info()[0] + ) + ) else: print("Skipping segment {0}@0x{1:016x}".format(entry.objfile, entry.start)) - + segment_last_addr = end # Add the segment to the list final_segment_list.append(seg_info) - return final_segment_list -#---------- -#---- Main - + +# ---------- +# ---- Main + + def main(): print("----- Unicorn Context Dumper -----") print("You must be actively debugging before running this!") print("If it fails, double check that you are actively debugging before running.") - + try: # Create the output directory - timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%Y%m%d_%H%M%S') + timestamp = datetime.datetime.fromtimestamp(time.time()).strftime( + "%Y%m%d_%H%M%S" + ) output_path = "UnicornContext_" + timestamp if not os.path.exists(output_path): os.makedirs(output_path) print("Process context will be output to {}".format(output_path)) - + # Get the context context = { "arch": dump_arch_info(), - "regs": dump_regs(), + "regs": dump_regs(), "segments": dump_process_memory(output_path), } # Write the index file - index_file = open(os.path.join(output_path, INDEX_FILE_NAME), 'w') + index_file = open(os.path.join(output_path, INDEX_FILE_NAME), "w") index_file.write(json.dumps(context, indent=4)) - index_file.close() + index_file.close() print("Done.") - + except Exception as e: print("!!! ERROR:\n\t{}".format(repr(e))) - + + if __name__ == "__main__" and pwndbg_loaded: main() - diff --git a/unicorn_mode/helper_scripts/unicorn_loader.py b/unicorn_mode/helper_scripts/unicorn_loader.py index adf21b64..1914a83d 100644 --- a/unicorn_mode/helper_scripts/unicorn_loader.py +++ b/unicorn_mode/helper_scripts/unicorn_loader.py @@ -1,8 +1,8 @@ """ unicorn_loader.py - - Loads a process context dumped created using a - Unicorn Context Dumper script into a Unicorn Engine + + Loads a process context dumped created using a + Unicorn Context Dumper script into a Unicorn Engine instance. 
Once this is performed emulation can be started. """ @@ -26,6 +26,13 @@ from unicorn.arm64_const import * from unicorn.x86_const import * from unicorn.mips_const import * +# If Capstone libraries are available (only check once) +try: + from capstone import * + CAPSTONE_EXISTS = 1 +except: + CAPSTONE_EXISTS = 0 + # Name of the index file INDEX_FILE_NAME = "_index.json" @@ -86,7 +93,7 @@ class UnicornSimpleHeap(object): total_chunk_size = UNICORN_PAGE_SIZE + ALIGN_PAGE_UP(size) + UNICORN_PAGE_SIZE # Gross but efficient way to find space for the chunk: chunk = None - for addr in xrange(self.HEAP_MIN_ADDR, self.HEAP_MAX_ADDR, UNICORN_PAGE_SIZE): + for addr in range(self.HEAP_MIN_ADDR, self.HEAP_MAX_ADDR, UNICORN_PAGE_SIZE): try: self._uc.mem_map(addr, total_chunk_size, UC_PROT_READ | UC_PROT_WRITE) chunk = self.HeapChunk(addr, total_chunk_size, size) @@ -97,7 +104,7 @@ class UnicornSimpleHeap(object): continue # Something went very wrong if chunk == None: - return 0 + return 0 self._chunks.append(chunk) return chunk.data_addr @@ -112,8 +119,8 @@ class UnicornSimpleHeap(object): old_chunk = None for chunk in self._chunks: if chunk.data_addr == ptr: - old_chunk = chunk - new_chunk_addr = self.malloc(new_size) + old_chunk = chunk + new_chunk_addr = self.malloc(new_size) if old_chunk != None: self._uc.mem_write(new_chunk_addr, str(self._uc.mem_read(old_chunk.data_addr, old_chunk.data_size))) self.free(old_chunk.data_addr) @@ -184,39 +191,27 @@ class AflUnicornEngine(Uc): # Load the registers regs = context['regs'] reg_map = self.__get_register_map(self._arch_str) - for register, value in regs.iteritems(): - if debug_print: - print("Reg {0} = {1}".format(register, value)) - if not reg_map.has_key(register.lower()): - if debug_print: - print("Skipping Reg: {}".format(register)) - else: - reg_write_retry = True - try: - self.reg_write(reg_map[register.lower()], value) - reg_write_retry = False - except Exception as e: - if debug_print: - print("ERROR writing register: {},
value: {} -- {}".format(register, value, repr(e))) + self.__load_registers(regs, reg_map, debug_print) + # If we have extra FLOATING POINT regs, load them in! + if 'regs_extended' in context: + if context['regs_extended']: + regs_extended = context['regs_extended'] + reg_map = self.__get_registers_extended(self._arch_str) + self.__load_registers(regs_extended, reg_map, debug_print) + + # For ARM, sometimes the stack pointer is erased ??? (I think I fixed this (issue with ordering of dumper.py, I'll keep the write anyways) + if self.__get_arch_and_mode(self.get_arch_str())[0] == UC_ARCH_ARM: + self.reg_write(UC_ARM_REG_SP, regs['sp']) - if reg_write_retry: - if debug_print: - print("Trying to parse value ({}) as hex string".format(value)) - try: - self.reg_write(reg_map[register.lower()], int(value, 16)) - except Exception as e: - if debug_print: - print("ERROR writing hex string register: {}, value: {} -- {}".format(register, value, repr(e))) - # Setup the memory map and load memory content self.__map_segments(context['segments'], context_directory, debug_print) - + if enable_trace: self.hook_add(UC_HOOK_BLOCK, self.__trace_block) self.hook_add(UC_HOOK_CODE, self.__trace_instruction) self.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, self.__trace_mem_access) self.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, self.__trace_mem_invalid_access) - + if debug_print: print("Done loading context.") @@ -228,7 +223,7 @@ class AflUnicornEngine(Uc): def get_arch_str(self): return self._arch_str - + def force_crash(self, uc_error): """ This function should be called to indicate to AFL that a crash occurred during emulation. 
You can pass the exception received from Uc.emu_start @@ -253,21 +248,76 @@ class AflUnicornEngine(Uc): for reg in sorted(self.__get_register_map(self._arch_str).items(), key=lambda reg: reg[0]): print(">>> {0:>4}: 0x{1:016x}".format(reg[0], self.reg_read(reg[1]))) + def dump_regs_extended(self): + """ Dumps the contents of all the registers to STDOUT """ + try: + for reg in sorted(self.__get_registers_extended(self._arch_str).items(), key=lambda reg: reg[0]): + print(">>> {0:>4}: 0x{1:016x}".format(reg[0], self.reg_read(reg[1]))) + except Exception as e: + print("ERROR: Are extended registers loaded?") + # TODO: Make this dynamically get the stack pointer register and pointer width for the current architecture """ def dump_stack(self, window=10): + arch = self.get_arch() + mode = self.get_mode() + # Get stack pointers and bit sizes for given architecture + if arch == UC_ARCH_X86 and mode == UC_MODE_64: + stack_ptr_addr = self.reg_read(UC_X86_REG_RSP) + bit_size = 8 + elif arch == UC_ARCH_X86 and mode == UC_MODE_32: + stack_ptr_addr = self.reg_read(UC_X86_REG_ESP) + bit_size = 4 + elif arch == UC_ARCH_ARM64: + stack_ptr_addr = self.reg_read(UC_ARM64_REG_SP) + bit_size = 8 + elif arch == UC_ARCH_ARM: + stack_ptr_addr = self.reg_read(UC_ARM_REG_SP) + bit_size = 4 + elif arch == UC_ARCH_ARM and mode == UC_MODE_THUMB: + stack_ptr_addr = self.reg_read(UC_ARM_REG_SP) + bit_size = 4 + elif arch == UC_ARCH_MIPS: + stack_ptr_addr = self.reg_read(UC_MIPS_REG_SP) + bit_size = 4 + print("") print(">>> Stack:") stack_ptr_addr = self.reg_read(UC_X86_REG_RSP) for i in xrange(-window, window + 1): addr = stack_ptr_addr + (i*8) print("{0}0x{1:016x}: 0x{2:016x}".format( \ - 'SP->' if i == 0 else ' ', addr, \ + 'SP->' if i == 0 else ' ', addr, \ struct.unpack('<Q', self.mem_read(addr, 8))[0])) """ #----------------------------- #---- Loader Helper Functions + def __load_registers(self, regs, reg_map, debug_print): + for register, value in regs.items(): + if debug_print: + print("Reg 
{0} = {1}".format(register, value)) + if register.lower() not in reg_map: + if debug_print: + print("Skipping Reg: {}".format(register)) + else: + reg_write_retry = True + try: + self.reg_write(reg_map[register.lower()], value) + reg_write_retry = False + except Exception as e: + if debug_print: + print("ERROR writing register: {}, value: {} -- {}".format(register, value, repr(e))) + + if reg_write_retry: + if debug_print: + print("Trying to parse value ({}) as hex string".format(value)) + try: + self.reg_write(reg_map[register.lower()], int(value, 16)) + except Exception as e: + if debug_print: + print("ERROR writing hex string register: {}, value: {} -- {}".format(register, value, repr(e))) + def __map_segment(self, name, address, size, perms, debug_print=False): # - size is unsigned and must be != 0 # - starting address must be aligned to 4KB @@ -289,7 +339,7 @@ class AflUnicornEngine(Uc): def __map_segments(self, segment_list, context_directory, debug_print=False): for segment in segment_list: - + # Get the segment information from the index name = segment['name'] seg_start = segment['start'] @@ -297,7 +347,7 @@ class AflUnicornEngine(Uc): perms = \ (UC_PROT_READ if segment['permissions']['r'] == True else 0) | \ (UC_PROT_WRITE if segment['permissions']['w'] == True else 0) | \ - (UC_PROT_EXEC if segment['permissions']['x'] == True else 0) + (UC_PROT_EXEC if segment['permissions']['x'] == True else 0) if debug_print: print("Handling segment {}".format(name)) @@ -349,12 +399,12 @@ class AflUnicornEngine(Uc): content_file = open(content_file_path, 'rb') compressed_content = content_file.read() content_file.close() - self.mem_write(seg_start, zlib.decompress(compressed_content)) + self.mem_write(seg_start, zlib.decompress(compressed_content)) else: if debug_print: print("No content found for segment {0} @ {1:016x}".format(name, seg_start)) - self.mem_write(seg_start, '\x00' * (seg_end - seg_start)) + self.mem_write(seg_start, b'\x00' * (seg_end - seg_start)) def 
__get_arch_and_mode(self, arch_str): arch_map = { @@ -398,7 +448,6 @@ class AflUnicornEngine(Uc): "r14": UC_X86_REG_R14, "r15": UC_X86_REG_R15, "rip": UC_X86_REG_RIP, - "rsp": UC_X86_REG_RSP, "efl": UC_X86_REG_EFLAGS, "cs": UC_X86_REG_CS, "ds": UC_X86_REG_DS, @@ -415,13 +464,12 @@ class AflUnicornEngine(Uc): "esi": UC_X86_REG_ESI, "edi": UC_X86_REG_EDI, "ebp": UC_X86_REG_EBP, - "esp": UC_X86_REG_ESP, "eip": UC_X86_REG_EIP, "esp": UC_X86_REG_ESP, - "efl": UC_X86_REG_EFLAGS, + "efl": UC_X86_REG_EFLAGS, # Segment registers removed... # They caused segfaults (from unicorn?) when they were here - }, + }, "arm" : { "r0": UC_ARM_REG_R0, "r1": UC_ARM_REG_R1, @@ -476,7 +524,7 @@ class AflUnicornEngine(Uc): "fp": UC_ARM64_REG_FP, "lr": UC_ARM64_REG_LR, "nzcv": UC_ARM64_REG_NZCV, - "cpsr": UC_ARM_REG_CPSR, + "cpsr": UC_ARM_REG_CPSR, }, "mips" : { "0" : UC_MIPS_REG_ZERO, @@ -499,13 +547,13 @@ class AflUnicornEngine(Uc): "t9": UC_MIPS_REG_T9, "s0": UC_MIPS_REG_S0, "s1": UC_MIPS_REG_S1, - "s2": UC_MIPS_REG_S2, + "s2": UC_MIPS_REG_S2, "s3": UC_MIPS_REG_S3, "s4": UC_MIPS_REG_S4, "s5": UC_MIPS_REG_S5, - "s6": UC_MIPS_REG_S6, + "s6": UC_MIPS_REG_S6, "s7": UC_MIPS_REG_S7, - "s8": UC_MIPS_REG_S8, + "s8": UC_MIPS_REG_S8, "k0": UC_MIPS_REG_K0, "k1": UC_MIPS_REG_K1, "gp": UC_MIPS_REG_GP, @@ -517,44 +565,127 @@ class AflUnicornEngine(Uc): "lo": UC_MIPS_REG_LO } } - return registers[arch] + return registers[arch] + def __get_registers_extended(self, arch): + # Similar to __get_register_map, but for ARM floating point registers + if arch == "arm64le" or arch == "arm64be": + arch = "arm64" + elif arch == "armle" or arch == "armbe" or "thumb" in arch: + arch = "arm" + elif arch == "mipsel": + arch = "mips" + + registers = { + "arm": { + "d0": UC_ARM_REG_D0, + "d1": UC_ARM_REG_D1, + "d2": UC_ARM_REG_D2, + "d3": UC_ARM_REG_D3, + "d4": UC_ARM_REG_D4, + "d5": UC_ARM_REG_D5, + "d6": UC_ARM_REG_D6, + "d7": UC_ARM_REG_D7, + "d8": UC_ARM_REG_D8, + "d9": UC_ARM_REG_D9, + "d10": UC_ARM_REG_D10, + 
"d11": UC_ARM_REG_D11, + "d12": UC_ARM_REG_D12, + "d13": UC_ARM_REG_D13, + "d14": UC_ARM_REG_D14, + "d15": UC_ARM_REG_D15, + "d16": UC_ARM_REG_D16, + "d17": UC_ARM_REG_D17, + "d18": UC_ARM_REG_D18, + "d19": UC_ARM_REG_D19, + "d20": UC_ARM_REG_D20, + "d21": UC_ARM_REG_D21, + "d22": UC_ARM_REG_D22, + "d23": UC_ARM_REG_D23, + "d24": UC_ARM_REG_D24, + "d25": UC_ARM_REG_D25, + "d26": UC_ARM_REG_D26, + "d27": UC_ARM_REG_D27, + "d28": UC_ARM_REG_D28, + "d29": UC_ARM_REG_D29, + "d30": UC_ARM_REG_D30, + "d31": UC_ARM_REG_D31, + "fpscr": UC_ARM_REG_FPSCR + } + } + + return registers[arch]; #--------------------------- - # Callbacks for tracing + # Callbacks for tracing - # TODO: Make integer-printing fixed widths dependent on bitness of architecture - # (i.e. only show 4 bytes for 32-bit, 8 bytes for 64-bit) - # TODO: Figure out how best to determine the capstone mode and architecture here - """ - try: - # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary. - from capstone import * - cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN) - def __trace_instruction(self, uc, address, size, user_data): - mem = uc.mem_read(address, size) - for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size): - print(" Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr)) - except ImportError: - def __trace_instruction(self, uc, address, size, user_data): - print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) - """ + # TODO: Extra mode for Capstone (i.e. Cs(cs_arch, cs_mode + cs_extra) not implemented + def __trace_instruction(self, uc, address, size, user_data): - print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) - + if CAPSTONE_EXISTS == 1: + # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary. 
+ arch = self.get_arch() + mode = self.get_mode() + bit_size = self.bit_size_arch() + # Map current arch to capstone labeling + if arch == UC_ARCH_X86 and mode == UC_MODE_64: + cs_arch = CS_ARCH_X86 + cs_mode = CS_MODE_64 + elif arch == UC_ARCH_X86 and mode == UC_MODE_32: + cs_arch = CS_ARCH_X86 + cs_mode = CS_MODE_32 + elif arch == UC_ARCH_ARM64: + cs_arch = CS_ARCH_ARM64 + cs_mode = CS_MODE_ARM + elif arch == UC_ARCH_ARM and mode == UC_MODE_THUMB: + cs_arch = CS_ARCH_ARM + cs_mode = CS_MODE_THUMB + elif arch == UC_ARCH_ARM: + cs_arch = CS_ARCH_ARM + cs_mode = CS_MODE_ARM + elif arch == UC_ARCH_MIPS: + cs_arch = CS_ARCH_MIPS + cs_mode = CS_MODE_MIPS32 # No other MIPS supported in program + + cs = Cs(cs_arch, cs_mode) + mem = uc.mem_read(address, size) + if bit_size == 4: + for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size): + print(" Instr: {:#08x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr)) + else: + for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size): + print(" Instr: {:#16x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr)) + else: + print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) + def __trace_block(self, uc, address, size, user_data): print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) - + def __trace_mem_access(self, uc, access, address, size, value, user_data): if access == UC_MEM_WRITE: print(" >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value)) else: - print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size)) + print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size)) def __trace_mem_invalid_access(self, uc, access, address, size, value, user_data): if access == UC_MEM_WRITE_UNMAPPED: print(" >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value)) else: - print(" >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)) - + print(" >>> INVALID 
Read: addr=0x{0:016x} size={1}".format(address, size)) + + def bit_size_arch(self): + arch = self.get_arch() + mode = self.get_mode() + # Get bit sizes for given architecture + if arch == UC_ARCH_X86 and mode == UC_MODE_64: + bit_size = 8 + elif arch == UC_ARCH_X86 and mode == UC_MODE_32: + bit_size = 4 + elif arch == UC_ARCH_ARM64: + bit_size = 8 + elif arch == UC_ARCH_ARM: + bit_size = 4 + elif arch == UC_ARCH_MIPS: + bit_size = 4 + return bit_size diff --git a/unicorn_mode/samples/c/COMPILE.md b/unicorn_mode/samples/c/COMPILE.md index 7857e5bf..7da140f7 100644 --- a/unicorn_mode/samples/c/COMPILE.md +++ b/unicorn_mode/samples/c/COMPILE.md @@ -17,6 +17,6 @@ You shouldn't need to compile simple_target.c since a X86_64 binary version is pre-built and shipped in this sample folder. This file documents how the binary was built in case you want to rebuild it or recompile it for any reason. -The pre-built binary (simple_target_x86_64.bin) was built using -g -O0 in gcc. +The pre-built binary (persistent_target_x86_64) was built using -g -O0 in gcc. We then load the binary and execute the main function directly. 
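The COMPILE.md note above ("we then load the binary and execute the main function directly") comes down to the address arithmetic the sample harnesses use: map the raw code blob at a fixed base, then emulate from main's entry offset to its last instruction. A minimal sketch of that computation; `CODE_ADDRESS` and the `0x55` end offset mirror the sample harnesses in this tree, while treating main's entry offset as 0 is an assumption (in practice both offsets come from disassembling the rebuilt binary, e.g. with `objdump -d`):

```python
# Sketch: address math behind "execute the main function directly".
# CODE_ADDRESS and the 0x55 end offset mirror the sample harnesses;
# main() sitting at offset 0 of the raw blob is an assumption.
CODE_ADDRESS = 0x00100000        # base where the raw code blob is mapped

main_entry_offset = 0x0          # assumed: blob starts at main()
main_last_insn_offset = 0x55     # offset of main()'s last instruction

# These two values are what gets passed to uc.reg_write(<pc reg>, ...)
# and to afl_fuzz(..., exits=[end_address]) in the harnesses below.
start_address = CODE_ADDRESS + main_entry_offset
end_address = CODE_ADDRESS + main_last_insn_offset

print(hex(start_address), hex(end_address))  # 0x100000 0x100055
```

With real targets the same pattern holds; only the offsets change, and they must be re-derived whenever the binary is rebuilt.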
diff --git a/unicorn_mode/samples/compcov_x64/compcov_test_harness.py b/unicorn_mode/samples/compcov_x64/compcov_test_harness.py index b9ebb61d..f0749d1b 100644 --- a/unicorn_mode/samples/compcov_x64/compcov_test_harness.py +++ b/unicorn_mode/samples/compcov_x64/compcov_test_harness.py @@ -22,48 +22,81 @@ from unicornafl import * from unicornafl.x86_const import * # Path to the file containing the binary to emulate -BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'compcov_target.bin') +BINARY_FILE = os.path.join( + os.path.dirname(os.path.abspath(__file__)), "compcov_target.bin" +) # Memory map for the code to be tested -CODE_ADDRESS = 0x00100000 # Arbitrary address where code to test will be loaded +CODE_ADDRESS = 0x00100000 # Arbitrary address where code to test will be loaded CODE_SIZE_MAX = 0x00010000 # Max size for the code (64kb) STACK_ADDRESS = 0x00200000 # Address of the stack (arbitrarily chosen) -STACK_SIZE = 0x00010000 # Size of the stack (arbitrarily chosen) -DATA_ADDRESS = 0x00300000 # Address where mutated data will be placed +STACK_SIZE = 0x00010000 # Size of the stack (arbitrarily chosen) +DATA_ADDRESS = 0x00300000 # Address where mutated data will be placed DATA_SIZE_MAX = 0x00010000 # Maximum allowable size of mutated data try: # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary. 
from capstone import * + cs = Cs(CS_ARCH_X86, CS_MODE_64) + def unicorn_debug_instruction(uc, address, size, user_data): mem = uc.mem_read(address, size) - for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size): + for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite( + bytes(mem), size + ): print(" Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr)) + + except ImportError: + def unicorn_debug_instruction(uc, address, size, user_data): print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) + def unicorn_debug_block(uc, address, size, user_data): print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) + def unicorn_debug_mem_access(uc, access, address, size, value, user_data): if access == UC_MEM_WRITE: - print(" >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value)) + print( + " >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format( + address, size, value + ) + ) else: print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size)) + def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data): if access == UC_MEM_WRITE_UNMAPPED: - print(" >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value)) + print( + " >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format( + address, size, value + ) + ) else: - print(" >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)) + print( + " >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size) + ) + def main(): parser = argparse.ArgumentParser(description="Test harness for compcov_target.bin") - parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input to load") - parser.add_argument('-t', '--trace', default=False, action="store_true", help="Enables debug tracing") + parser.add_argument( + "input_file", + type=str, + help="Path to the file containing the mutated input to 
load", + ) + parser.add_argument( + "-t", + "--trace", + default=False, + action="store_true", + help="Enables debug tracing", + ) args = parser.parse_args() # Instantiate a MIPS32 big endian Unicorn Engine instance @@ -73,13 +106,16 @@ def main(): uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block) uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction) uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access) - uc.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, unicorn_debug_mem_invalid_access) + uc.hook_add( + UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, + unicorn_debug_mem_invalid_access, + ) - #--------------------------------------------------- + # --------------------------------------------------- # Load the binary to emulate and map it into memory print("Loading data input from {}".format(args.input_file)) - binary_file = open(BINARY_FILE, 'rb') + binary_file = open(BINARY_FILE, "rb") binary_code = binary_file.read() binary_file.close() @@ -93,11 +129,11 @@ def main(): uc.mem_write(CODE_ADDRESS, binary_code) # Set the program counter to the start of the code - start_address = CODE_ADDRESS # Address of entry point of main() - end_address = CODE_ADDRESS + 0x55 # Address of last instruction in main() + start_address = CODE_ADDRESS # Address of entry point of main() + end_address = CODE_ADDRESS + 0x55 # Address of last instruction in main() uc.reg_write(UC_X86_REG_RIP, start_address) - #----------------- + # ----------------- # Setup the stack uc.mem_map(STACK_ADDRESS, STACK_SIZE) @@ -106,8 +142,7 @@ def main(): # Mapping a location to write our buffer to uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX) - - #----------------------------------------------- + # ----------------------------------------------- # Load the mutated input and map it into memory def place_input_callback(uc, input, _, data): @@ -121,7 +156,7 @@ def main(): # Write the mutated command into the data buffer uc.mem_write(DATA_ADDRESS, input) - 
#------------------------------------------------------------ + # ------------------------------------------------------------ # Emulate the code, allowing it to process the mutated input print("Starting the AFL fuzz") @@ -129,8 +164,9 @@ def main(): input_file=args.input_file, place_input_callback=place_input_callback, exits=[end_address], - persistent_iters=1 + persistent_iters=1, ) + if __name__ == "__main__": main() diff --git a/unicorn_mode/samples/persistent/simple_target_noncrashing.c b/unicorn_mode/samples/persistent/simple_target_noncrashing.c index 00764473..9257643b 100644 --- a/unicorn_mode/samples/persistent/simple_target_noncrashing.c +++ b/unicorn_mode/samples/persistent/simple_target_noncrashing.c @@ -10,7 +10,7 @@ * Written by Nathan Voss <njvoss99@gmail.com> * Adapted by Lukas Seidel <seidel.1@campus.tu-berlin.de> */ - +#include <string.h> int main(int argc, char** argv) { if(argc < 2){ @@ -19,15 +19,19 @@ int main(int argc, char** argv) { char *data_buf = argv[1]; - if len(data_buf < 20) { - if (data_buf[20] != 0) { + if (strlen(data_buf) >= 21 && data_buf[20] != 0) { printf("Not crashing"); - } else if (data_buf[0] > 0x10 && data_buf[0] < 0x20 && data_buf[1] > data_buf[2]) { + } else if (strlen(data_buf) > 1 + && data_buf[0] > 0x10 && data_buf[0] < 0x20 && data_buf[1] > data_buf[2]) { printf("Also not crashing with databuf[0] == %c", data_buf[0]); - } else if (data_buf[9] == 0x00 && data_buf[10] != 0x00 && data_buf[11] == 0x00) { + } +#if 0 + // not possible with argv (zero terminated strings) (hexcoder-) + // do not try to access data_buf[10] and beyond + else if (data_buf[9] == 0x00 && data_buf[10] != 0x00 && data_buf[11] == 0x00) { // Cause a crash if data[10] is not zero, but [9] and [11] are zero unsigned char invalid_read = *(unsigned char *) 0x00000000; } - +#endif return 0; } diff --git a/unicorn_mode/samples/simple/simple_test_harness.py b/unicorn_mode/samples/simple/simple_test_harness.py index f4002ca8..cd04ad3a 100644 ---
a/unicorn_mode/samples/simple/simple_test_harness.py +++ b/unicorn_mode/samples/simple/simple_test_harness.py @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/env python3 """ Simple test harness for AFL's Unicorn Mode. @@ -22,48 +22,81 @@ from unicornafl import * from unicornafl.mips_const import * # Path to the file containing the binary to emulate -BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'simple_target.bin') +BINARY_FILE = os.path.join( + os.path.dirname(os.path.abspath(__file__)), "simple_target.bin" +) # Memory map for the code to be tested -CODE_ADDRESS = 0x00100000 # Arbitrary address where code to test will be loaded +CODE_ADDRESS = 0x00100000 # Arbitrary address where code to test will be loaded CODE_SIZE_MAX = 0x00010000 # Max size for the code (64kb) STACK_ADDRESS = 0x00200000 # Address of the stack (arbitrarily chosen) -STACK_SIZE = 0x00010000 # Size of the stack (arbitrarily chosen) -DATA_ADDRESS = 0x00300000 # Address where mutated data will be placed +STACK_SIZE = 0x00010000 # Size of the stack (arbitrarily chosen) +DATA_ADDRESS = 0x00300000 # Address where mutated data will be placed DATA_SIZE_MAX = 0x00010000 # Maximum allowable size of mutated data try: # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary. 
from capstone import * + cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN) + def unicorn_debug_instruction(uc, address, size, user_data): mem = uc.mem_read(address, size) - for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size): + for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite( + bytes(mem), size + ): print(" Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr)) + + except ImportError: + def unicorn_debug_instruction(uc, address, size, user_data): - print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) + print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) + def unicorn_debug_block(uc, address, size, user_data): print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) - + + def unicorn_debug_mem_access(uc, access, address, size, value, user_data): if access == UC_MEM_WRITE: - print(" >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value)) + print( + " >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format( + address, size, value + ) + ) else: - print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size)) + print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size)) + def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data): if access == UC_MEM_WRITE_UNMAPPED: - print(" >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value)) + print( + " >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format( + address, size, value + ) + ) else: - print(" >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)) + print( + " >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size) + ) + def main(): parser = argparse.ArgumentParser(description="Test harness for simple_target.bin") - parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input to load") - parser.add_argument('-t', '--trace', 
default=False, action="store_true", help="Enables debug tracing") + parser.add_argument( + "input_file", + type=str, + help="Path to the file containing the mutated input to load", + ) + parser.add_argument( + "-t", + "--trace", + default=False, + action="store_true", + help="Enables debug tracing", + ) args = parser.parse_args() # Instantiate a MIPS32 big endian Unicorn Engine instance @@ -73,13 +106,16 @@ def main(): uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block) uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction) uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access) - uc.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, unicorn_debug_mem_invalid_access) + uc.hook_add( + UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, + unicorn_debug_mem_invalid_access, + ) - #--------------------------------------------------- + # --------------------------------------------------- # Load the binary to emulate and map it into memory print("Loading data input from {}".format(args.input_file)) - binary_file = open(BINARY_FILE, 'rb') + binary_file = open(BINARY_FILE, "rb") binary_code = binary_file.read() binary_file.close() @@ -93,11 +129,11 @@ def main(): uc.mem_write(CODE_ADDRESS, binary_code) # Set the program counter to the start of the code - start_address = CODE_ADDRESS # Address of entry point of main() - end_address = CODE_ADDRESS + 0xf4 # Address of last instruction in main() + start_address = CODE_ADDRESS # Address of entry point of main() + end_address = CODE_ADDRESS + 0xF4 # Address of last instruction in main() uc.reg_write(UC_MIPS_REG_PC, start_address) - #----------------- + # ----------------- # Setup the stack uc.mem_map(STACK_ADDRESS, STACK_SIZE) @@ -106,14 +142,14 @@ def main(): # reserve some space for data uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX) - #----------------------------------------------------- + # ----------------------------------------------------- # Set up a callback to place input data (do little work 
here, it's called for every single iteration) # We did not pass in any data and don't use persistent mode, so we can ignore these params. # Be sure to check out the docstrings for the uc.afl_* functions. def place_input_callback(uc, input, persistent_round, data): # Apply constraints to the mutated input if len(input) > DATA_SIZE_MAX: - #print("Test input is too long (> {} bytes)") + # print("Test input is too long (> {} bytes)") return False # Write the mutated command into the data buffer @@ -122,5 +158,6 @@ def main(): # Start the fuzzer. uc.afl_fuzz(args.input_file, place_input_callback, [end_address]) + if __name__ == "__main__": main() diff --git a/unicorn_mode/samples/simple/simple_test_harness_alt.py b/unicorn_mode/samples/simple/simple_test_harness_alt.py index 9c3dbc93..3249b13d 100644 --- a/unicorn_mode/samples/simple/simple_test_harness_alt.py +++ b/unicorn_mode/samples/simple/simple_test_harness_alt.py @@ -25,50 +25,79 @@ from unicornafl import * from unicornafl.mips_const import * # Path to the file containing the binary to emulate -BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'simple_target.bin') +BINARY_FILE = os.path.join( + os.path.dirname(os.path.abspath(__file__)), "simple_target.bin" +) # Memory map for the code to be tested -CODE_ADDRESS = 0x00100000 # Arbitrary address where code to test will be loaded +CODE_ADDRESS = 0x00100000 # Arbitrary address where code to test will be loaded CODE_SIZE_MAX = 0x00010000 # Max size for the code (64kb) STACK_ADDRESS = 0x00200000 # Address of the stack (arbitrarily chosen) -STACK_SIZE = 0x00010000 # Size of the stack (arbitrarily chosen) -DATA_ADDRESS = 0x00300000 # Address where mutated data will be placed +STACK_SIZE = 0x00010000 # Size of the stack (arbitrarily chosen) +DATA_ADDRESS = 0x00300000 # Address where mutated data will be placed DATA_SIZE_MAX = 0x00010000 # Maximum allowable size of mutated data try: # If Capstone is installed then we'll dump disassembly, otherwise 
just dump the binary. from capstone import * + cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN) + def unicorn_debug_instruction(uc, address, size, user_data): mem = uc.mem_read(address, size) - for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size): + for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite( + bytes(mem), size + ): print(" Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr)) + + except ImportError: + def unicorn_debug_instruction(uc, address, size, user_data): - print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) + print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) + def unicorn_debug_block(uc, address, size, user_data): print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size)) - + + def unicorn_debug_mem_access(uc, access, address, size, value, user_data): if access == UC_MEM_WRITE: - print(" >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value)) + print( + " >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format( + address, size, value + ) + ) else: - print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size)) + print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size)) + def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data): if access == UC_MEM_WRITE_UNMAPPED: - print(" >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value)) + print( + " >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format( + address, size, value + ) + ) else: - print(" >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)) + print( + " >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size) + ) + def force_crash(uc_error): # This function should be called to indicate to AFL that a crash occurred during emulation. 
# Pass in the exception received from Uc.emu_start() mem_errors = [ - UC_ERR_READ_UNMAPPED, UC_ERR_READ_PROT, UC_ERR_READ_UNALIGNED, - UC_ERR_WRITE_UNMAPPED, UC_ERR_WRITE_PROT, UC_ERR_WRITE_UNALIGNED, - UC_ERR_FETCH_UNMAPPED, UC_ERR_FETCH_PROT, UC_ERR_FETCH_UNALIGNED, + UC_ERR_READ_UNMAPPED, + UC_ERR_READ_PROT, + UC_ERR_READ_UNALIGNED, + UC_ERR_WRITE_UNMAPPED, + UC_ERR_WRITE_PROT, + UC_ERR_WRITE_UNALIGNED, + UC_ERR_FETCH_UNMAPPED, + UC_ERR_FETCH_PROT, + UC_ERR_FETCH_UNALIGNED, ] if uc_error.errno in mem_errors: # Memory error - throw SIGSEGV @@ -80,11 +109,22 @@ def force_crash(uc_error): # Not sure what happened - throw SIGABRT os.kill(os.getpid(), signal.SIGABRT) + def main(): parser = argparse.ArgumentParser(description="Test harness for simple_target.bin") - parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input to load") - parser.add_argument('-d', '--debug', default=False, action="store_true", help="Enables debug tracing") + parser.add_argument( + "input_file", + type=str, + help="Path to the file containing the mutated input to load", + ) + parser.add_argument( + "-d", + "--debug", + default=False, + action="store_true", + help="Enables debug tracing", + ) args = parser.parse_args() # Instantiate a MIPS32 big endian Unicorn Engine instance @@ -94,13 +134,16 @@ def main(): uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block) uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction) uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access) - uc.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, unicorn_debug_mem_invalid_access) + uc.hook_add( + UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, + unicorn_debug_mem_invalid_access, + ) - #--------------------------------------------------- + # --------------------------------------------------- # Load the binary to emulate and map it into memory print("Loading data input from {}".format(args.input_file)) - binary_file = open(BINARY_FILE, 'rb') + 
binary_file = open(BINARY_FILE, "rb") binary_code = binary_file.read() binary_file.close() @@ -114,11 +157,11 @@ def main(): uc.mem_write(CODE_ADDRESS, binary_code) # Set the program counter to the start of the code - start_address = CODE_ADDRESS # Address of entry point of main() - end_address = CODE_ADDRESS + 0xf4 # Address of last instruction in main() + start_address = CODE_ADDRESS # Address of entry point of main() + end_address = CODE_ADDRESS + 0xF4 # Address of last instruction in main() uc.reg_write(UC_MIPS_REG_PC, start_address) - #----------------- + # ----------------- # Setup the stack uc.mem_map(STACK_ADDRESS, STACK_SIZE) @@ -127,10 +170,10 @@ def main(): # reserve some space for data uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX) - #----------------------------------------------------- + # ----------------------------------------------------- # Kick off AFL's fork server - # THIS MUST BE DONE BEFORE LOADING USER DATA! - # If this isn't done every single run, the AFL fork server + # THIS MUST BE DONE BEFORE LOADING USER DATA! + # If this isn't done every single run, the AFL fork server # will not be started appropriately and you'll get erratic results! 
print("Starting the AFL forkserver") @@ -142,12 +185,12 @@ def main(): else: out = lambda x, y: print(x.format(y)) - #----------------------------------------------- + # ----------------------------------------------- # Load the mutated input and map it into memory # Load the mutated input from disk out("Loading data input from {}", args.input_file) - input_file = open(args.input_file, 'rb') + input_file = open(args.input_file, "rb") input = input_file.read() input_file.close() @@ -159,7 +202,7 @@ def main(): # Write the mutated command into the data buffer uc.mem_write(DATA_ADDRESS, input) - #------------------------------------------------------------ + # ------------------------------------------------------------ # Emulate the code, allowing it to process the mutated input out("Executing until a crash or execution reaches 0x{0:016x}", end_address) @@ -175,5 +218,6 @@ def main(): # UC_AFL_RET_FINISHED = 3 out("Done. AFL Mode is {}", afl_mode) + if __name__ == "__main__": main() diff --git a/unicorn_mode/samples/speedtest/.gitignore b/unicorn_mode/samples/speedtest/.gitignore new file mode 100644 index 00000000..78310c60 --- /dev/null +++ b/unicorn_mode/samples/speedtest/.gitignore @@ -0,0 +1,6 @@ +output +harness +harness-debug +target +target.o +target.offsets.* diff --git a/unicorn_mode/samples/speedtest/Makefile b/unicorn_mode/samples/speedtest/Makefile new file mode 100644 index 00000000..23f5cb07 --- /dev/null +++ b/unicorn_mode/samples/speedtest/Makefile @@ -0,0 +1,17 @@ +CFLAGS += -Wall -Werror -Wextra -Wpedantic -Og -g -fPIE + +.PHONY: all clean + +all: target target.offsets.main + +clean: + rm -rf *.o target target.offsets.* + +target.o: target.c + ${CC} ${CFLAGS} -c target.c -o $@ + +target: target.o + ${CC} ${CFLAGS} target.o -o $@ + +target.offsets.main: target + ./get_offsets.py \ No newline at end of file diff --git a/unicorn_mode/samples/speedtest/README.md b/unicorn_mode/samples/speedtest/README.md new file mode 100644 index 00000000..3c1184a2 
--- /dev/null +++ b/unicorn_mode/samples/speedtest/README.md @@ -0,0 +1,65 @@ +# Speedtest + +This is a simple sample harness for a non-crashing file, +to show the raw speed of C, Rust, and Python harnesses. + +## Compiling... + +Make sure you have built unicornafl first (`../../build_unicorn_support.sh`). +Then follow these individual steps: + +### Rust + +```bash +cd rust +cargo build --release +../../../afl-fuzz -i ../sample_inputs -o out -- ./target/release/harness @@ +``` + +### C + +```bash +cd c +make +../../../afl-fuzz -i ../sample_inputs -o out -- ./harness @@ +``` + +### Python + +```bash +cd python +../../../afl-fuzz -i ../sample_inputs -o out -U -- python3 ./harness.py @@ +``` + +## Results + +TODO: add results here. + + +## Compiling target.c + +You usually don't need to compile target.c yourself, since running `make` in this folder builds it for you. This file documents how the binary is built in case you want to rebuild or recompile it for any reason. + +The target binary (`target`) is built with `-Og -g -fPIE` (see the Makefile). + +We then load the binary and execute the main function directly. 
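The offset files described below (`target.offsets.*`) are plain text with one hex address per line, as written by `get_offsets.py`. A harness can rebase them onto its load address roughly like this (a minimal sketch; `read_offsets` is an illustrative helper, not part of the shipped scripts):

```python
# Sketch: parse a target.offsets.* file (one hex address per line)
# and rebase each address to the harness's chosen load address.

BASE_ADDRESS = 0x0  # where the harness maps the (PIE) binary

def read_offsets(path, base=BASE_ADDRESS):
    """Return every non-empty line as an int, rebased to `base`."""
    with open(path) as f:
        return [int(line, 16) + base for line in f if line.strip()]

# Typical usage in a harness:
# main_addr = read_offsets("target.offsets.main")[0]
# exits = read_offsets("target.offsets.main_ends")
```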
+ +## Addresses for the harness: +To find the address (in hex) of main, run: +```bash +objdump -M intel -D target | grep '<main>:' | cut -d" " -f1 +``` +To find all call sites to magicfn, run: +```bash +objdump -M intel -D target | grep '<magicfn>$' | cut -d":" -f1 +``` +For malloc callsites: +```bash +objdump -M intel -D target | grep '<malloc@plt>$' | cut -d":" -f1 +``` +And free callsites: +```bash +objdump -M intel -D target | grep '<free@plt>$' | cut -d":" -f1 +``` diff --git a/unicorn_mode/samples/speedtest/c/Makefile b/unicorn_mode/samples/speedtest/c/Makefile new file mode 100644 index 00000000..ce784d4f --- /dev/null +++ b/unicorn_mode/samples/speedtest/c/Makefile @@ -0,0 +1,54 @@ +# UnicornAFL Usage +# Original Unicorn Example Makefile by Nguyen Anh Quynh <aquynh@gmail.com>, 2015 +# Adapted for AFL++ by domenukk <domenukk@gmail.com>, 2020 +.POSIX: +UNAME_S =$(shell uname -s)# GNU make +UNAME_S:sh=uname -s # BSD make +_UNIQ=_QINU_ + +LIBDIR = ../../../unicornafl +BIN_EXT = +AR_EXT = a + +# Verbose output? +V ?= 0 + +CFLAGS += -Wall -Werror -Wextra -Wno-unused-parameter -I../../../unicornafl/include + +LDFLAGS += -L$(LIBDIR) -lpthread -lm + +_LRT = $(_UNIQ)$(UNAME_S:Linux=) +__LRT = $(_LRT:$(_UNIQ)=-lrt) +LRT = $(__LRT:$(_UNIQ)=) + +LDFLAGS += $(LRT) + +_CC = $(_UNIQ)$(CROSS) +__CC = $(_CC:$(_UNIQ)=$(CC)) +MYCC = $(__CC:$(_UNIQ)$(CROSS)=$(CROSS)gcc) + +.PHONY: all clean + +all: fuzz + +clean: + rm -rf *.o harness harness-debug + +harness.o: harness.c ../../../unicornafl/include/unicorn/*.h + ${MYCC} ${CFLAGS} -O3 -c harness.c -o $@ + +harness-debug.o: harness.c ../../../unicornafl/include/unicorn/*.h + ${MYCC} ${CFLAGS} -fsanitize=address -g -Og -c harness.c -o $@ + +harness: harness.o + ${MYCC} -L${LIBDIR} harness.o ../../../unicornafl/libunicornafl.a $(LDFLAGS) -o $@ + +harness-debug: harness-debug.o + ${MYCC} -fsanitize=address -g -Og -L${LIBDIR} harness-debug.o ../../../unicornafl/libunicornafl.a $(LDFLAGS) -o harness-debug + +../target: + $(MAKE) -C .. 
+ +fuzz: ../target harness + rm -rf ./output + SKIP_BINCHECK=1 ../../../../afl-fuzz -s 1 -i ../sample_inputs -o ./output -- ./harness @@ diff --git a/unicorn_mode/samples/speedtest/c/harness.c b/unicorn_mode/samples/speedtest/c/harness.c new file mode 100644 index 00000000..e8de3d80 --- /dev/null +++ b/unicorn_mode/samples/speedtest/c/harness.c @@ -0,0 +1,390 @@ +/* + Simple test harness for AFL++'s unicornafl C mode. + + This loads the target binary (../target) into + Unicorn's memory map for emulation, places the specified input into + argv[1], sets up argv and argc, and executes 'main()'. + If run inside AFL, afl_fuzz automatically does the "right thing" + + Run under AFL as follows: + + $ cd <afl_path>/unicorn_mode/samples/speedtest/c/ + $ make + $ ../../../../afl-fuzz -m none -i ../sample_inputs -o out -- ./harness @@ +*/ + +// This is not your everyday Unicorn. +#define UNICORN_AFL + +#include <string.h> +#include <inttypes.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> +#include <unistd.h> +#include <sys/stat.h> +#include <fcntl.h> +#include <sys/mman.h> + +#include <unicorn/unicorn.h> + +// Path to the file containing the binary to emulate +#define BINARY_FILE ("../target") + +// Memory map for the code to be tested +// Arbitrary address where code to test will be loaded +static const int64_t BASE_ADDRESS = 0x0; +// Max size for the code (64kb) +static const int64_t CODE_SIZE_MAX = 0x00010000; +// Location where the input will be placed (make sure the emulated program knows this somehow, too ;) ) +static const int64_t INPUT_ADDRESS = 0x00100000; +// Maximum size for our input +static const int64_t INPUT_MAX = 0x00100000; +// Where our pseudo-heap is at +static const int64_t HEAP_ADDRESS = 0x00200000; +// Maximum allowable size for the heap +static const int64_t HEAP_SIZE_MAX = 0x000F0000; +// Address of the stack (Some random address again) +static const int64_t STACK_ADDRESS = 0x00400000; +// Size of the stack (arbitrarily 
chosen, just make it big enough) +static const int64_t STACK_SIZE = 0x000F0000; + +// Alignment for unicorn mappings (seems to be needed) +static const int64_t ALIGNMENT = 0x1000; + +static void hook_block(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) { + printf(">>> Tracing basic block at 0x%"PRIx64 ", block size = 0x%x\n", address, size); +} + +static void hook_code(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) { + printf(">>> Tracing instruction at 0x%"PRIx64 ", instruction size = 0x%x\n", address, size); +} + +/* Unicorn page needs to be 0x1000 aligned, apparently */ +static uint64_t pad(uint64_t size) { + if (size % ALIGNMENT == 0) { return size; } + return ((size / ALIGNMENT) + 1) * ALIGNMENT; +} + +/* Returns the filesize in bytes, -1 on error. */ +static off_t afl_mmap_file(char *filename, char **buf_ptr) { + + off_t ret = -1; + + int fd = open(filename, O_RDONLY); + + struct stat st = {0}; + if (fstat(fd, &st)) goto exit; + + off_t in_len = st.st_size; + if (in_len == -1) { + /* This can only ever happen on 32 bit if the file is exactly 4gb. */ + fprintf(stderr, "Filesize of %s too large\n", filename); + goto exit; + } + + *buf_ptr = mmap(0, in_len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0); + + if (*buf_ptr != MAP_FAILED) ret = in_len; + +exit: + close(fd); + return ret; + +} + +/* Place the input at the right spot inside unicorn. + This code path is *HOT*, do as little work as possible! */ +static bool place_input_callback( + uc_engine *uc, + char *input, + size_t input_len, + uint32_t persistent_round, + void *data +){ + // printf("Placing input with len %ld to %x\n", input_len, DATA_ADDRESS); + if (input_len >= INPUT_MAX) { + // Test input too short or too long, ignore this testcase + return false; + } + + // We need a valid c string, make sure it never goes out of bounds. + input[input_len-1] = '\0'; + + // Write the testcase to unicorn. 
+ uc_mem_write(uc, INPUT_ADDRESS, input, input_len); + + return true; +} + +// exit in case the unicorn-internal mmap fails. +static void mem_map_checked(uc_engine *uc, uint64_t addr, size_t size, uint32_t mode) { + size = pad(size); + //printf("SIZE %llx, align: %llx\n", size, ALIGNMENT); + uc_err err = uc_mem_map(uc, addr, size, mode); + if (err != UC_ERR_OK) { + printf("Error mapping %ld bytes at 0x%lx: %s (mode: %d)\n", size, addr, uc_strerror(err), mode); + exit(1); + } +} + +// allocates an array, reads all addrs to the given array ptr, returns a size +ssize_t read_all_addrs(char *path, uint64_t *addrs, size_t max_count) { + + FILE *f = fopen(path, "r"); + if (!f) { + perror("fopen"); + fprintf(stderr, "Could not read %s, make sure you ran ./get_offsets.py\n", path); + exit(-1); + } + for (size_t i = 0; i < max_count; i++) { + bool end = false; + if(fscanf(f, "%lx", &addrs[i]) == EOF) { + end = true; + i--; + } else if (fgetc(f) == EOF) { + end = true; + } + if (end) { + printf("Set %ld addrs for %s\n", i + 1, path); + fclose(f); + return i + 1; + } + } + return max_count; +} + +// Read all addresses from the given file, and set a hook for them. +void set_all_hooks(uc_engine *uc, char *hook_file, void *hook_fn) { + + FILE *f = fopen(hook_file, "r"); + if (!f) { + fprintf(stderr, "Could not read %s, make sure you ran ./get_offsets.py\n", hook_file); + exit(-1); + } + uint64_t hook_addr; + for (int hook_count = 0; 1; hook_count++) { + if(fscanf(f, "%lx", &hook_addr) == EOF) { + printf("Set %d hooks for %s\n", hook_count, hook_file); + fclose(f); + return; + } + printf("got new hook addr %lx (count: %d) ohbytw: sizeof %lx\n", hook_addr, hook_count, sizeof(uc_hook)); + hook_addr += BASE_ADDRESS; + // We'll leak these hooks like a good citizen. 
+ uc_hook *hook = calloc(1, sizeof(uc_hook)); + if (!hook) { + perror("calloc"); + exit(-1); + } + uc_hook_add(uc, hook, UC_HOOK_CODE, hook_fn, NULL, hook_addr, hook_addr); + // guzzle up newline + if (fgetc(f) == EOF) { + printf("Set %d hooks for %s\n", hook_count, hook_file); + fclose(f); + return; + } + } + +} + +// This is a fancy print function that we're just going to skip for fuzzing. +static void hook_magicfn(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) { + address += size; + uc_reg_write(uc, UC_X86_REG_RIP, &address); +} + +static bool already_allocated = false; + +// We use a very simple malloc/free stub here, that only works for exactly one allocation at a time. +static void hook_malloc(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) { + if (already_allocated) { + printf("Double malloc, not supported right now!\n"); + abort(); + } + // read the first param. + uint64_t malloc_size; + uc_reg_read(uc, UC_X86_REG_RDI, &malloc_size); + if (malloc_size > HEAP_SIZE_MAX) { + printf("Tried to allocate %ld bytes, but we only support up to %ld\n", malloc_size, HEAP_SIZE_MAX); + abort(); + } + uc_reg_write(uc, UC_X86_REG_RAX, &HEAP_ADDRESS); + address += size; + uc_reg_write(uc, UC_X86_REG_RIP, &address); + already_allocated = true; +} + +// No real free, just set the "used"-flag to false. +static void hook_free(uc_engine *uc, uint64_t address, uint32_t size, void *user_data) { + if (!already_allocated) { + printf("Double free detected. Real bug?\n"); + abort(); + } + // read the first param. + uint64_t free_ptr; + uc_reg_read(uc, UC_X86_REG_RDI, &free_ptr); + if (free_ptr != HEAP_ADDRESS) { + printf("Tried to free wrong mem region: 0x%lx at code loc 0x%lx\n", free_ptr, address); + abort(); + } + address += size; + uc_reg_write(uc, UC_X86_REG_RIP, &address); + already_allocated = false; +} + +int main(int argc, char **argv, char **envp) { + if (argc == 1) { + printf("Test harness to measure speed against Rust and Python. 
Usage: harness [-t] <inputfile>\n"); + exit(1); + } + bool tracing = false; + char *filename = argv[1]; + if (argc > 2 && !strcmp(argv[1], "-t")) { + tracing = true; + filename = argv[2]; + } + + uc_engine *uc; + uc_err err; + uc_hook hooks[2]; + char *file_contents; + + // Initialize emulator in X86_64 mode + err = uc_open(UC_ARCH_X86, UC_MODE_64, &uc); + if (err) { + printf("Failed on uc_open() with error returned: %u (%s)\n", + err, uc_strerror(err)); + return -1; + } + + // If we want tracing output, set the callbacks here + if (tracing) { + // tracing all basic blocks with customized callback + uc_hook_add(uc, &hooks[0], UC_HOOK_BLOCK, hook_block, NULL, 1, 0); + uc_hook_add(uc, &hooks[1], UC_HOOK_CODE, hook_code, NULL, 1, 0); + } + + printf("The input testcase is set to %s\n", filename); + + + printf("Loading target from %s\n", BINARY_FILE); + off_t len = afl_mmap_file(BINARY_FILE, &file_contents); + printf("Binary file size: %lx\n", len); + if (len < 0) { + perror("Could not read binary to emulate"); + return -2; + } + if (len == 0) { + fprintf(stderr, "File at '%s' is empty\n", BINARY_FILE); + return -3; + } + if (len > CODE_SIZE_MAX) { + fprintf(stderr, "Binary too large, increase CODE_SIZE_MAX\n"); + return -4; + } + + // Map memory. 
+ mem_map_checked(uc, BASE_ADDRESS, len, UC_PROT_ALL); + fflush(stdout); + + // write machine code to be emulated to memory + if (uc_mem_write(uc, BASE_ADDRESS, file_contents, len) != UC_ERR_OK) { + puts("Error writing to CODE"); + exit(-1); + } + + // Release copied contents + munmap(file_contents, len); + + // Set the program counter to the start of the code + FILE *f = fopen("../target.offsets.main", "r"); + if (!f) { + perror("fopen"); + puts("Could not read offset to main function, make sure you ran ./get_offsets.py"); + exit(-1); + } + uint64_t start_address; + if(fscanf(f, "%lx", &start_address) == EOF) { + puts("Start address not found in target.offsets.main"); + exit(-1); + } + fclose(f); + start_address += BASE_ADDRESS; + printf("Execution will start at 0x%lx\n", start_address); + // Set the program counter to the start of the code + uc_reg_write(uc, UC_X86_REG_RIP, &start_address); // address of entry point of main() + + // Setup the Stack + mem_map_checked(uc, STACK_ADDRESS, STACK_SIZE, UC_PROT_READ | UC_PROT_WRITE); + // Setup the stack pointer, but allocate two pointers for the pointers to input + uint64_t val = STACK_ADDRESS + STACK_SIZE - 16; + //printf("Stack at %lu\n", stack_val); + uc_reg_write(uc, UC_X86_REG_RSP, &val); + + // reserve some space for our input data + mem_map_checked(uc, INPUT_ADDRESS, INPUT_MAX, UC_PROT_READ); + + // argc = 2 + val = 2; + uc_reg_write(uc, UC_X86_REG_RDI, &val); + //RSI points to our little 2 QWORD space at the beginning of the stack... + val = STACK_ADDRESS + STACK_SIZE - 16; + uc_reg_write(uc, UC_X86_REG_RSI, &val); + + //... which points to the Input. Write the ptr to mem in little endian. + uint32_t addr_little = STACK_ADDRESS; +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ + // The chances you are on a big_endian system aren't too high, but still... 
+ addr_little = __builtin_bswap32(addr_little); +#endif + + uc_mem_write(uc, STACK_ADDRESS + STACK_SIZE - 16, (char *)&addr_little, 4); + + set_all_hooks(uc, "../target.offsets.malloc", hook_malloc); + set_all_hooks(uc, "../target.offsets.magicfn", hook_magicfn); + set_all_hooks(uc, "../target.offsets.free", hook_free); + + int exit_count_max = 100; + // we don't need more exits for now. + uint64_t exits[exit_count_max]; + + ssize_t exit_count = read_all_addrs("../target.offsets.main_ends", exits, exit_count_max); + if (exit_count < 1) { + printf("Could not find exits! aborting.\n"); + abort(); + } + + printf("Starting to fuzz. Running from addr %ld to one of these %ld exits:\n", start_address, exit_count); + for (ssize_t i = 0; i < exit_count; i++) { + printf(" exit %ld: %ld\n", i, exits[i]); + } + + fflush(stdout); + + // let's gooo + uc_afl_ret afl_ret = uc_afl_fuzz( + uc, // The unicorn instance we prepared + filename, // Filename of the input to process. In AFL this is usually the '@@' placeholder, outside it's any input file. 
+ place_input_callback, // Callback that places the input (automatically loaded from the file at filename) in the unicorn instance + exits, // Where to exit (this is an array) + exit_count, // Count of end addresses + NULL, // Optional callback to run after each exec + false, // true, if the optional callback should be run also for non-crashes + 1000, // For persistent mode: How many rounds to run + NULL // additional data pointer + ); + switch(afl_ret) { + case UC_AFL_RET_ERROR: + printf("Error starting to fuzz\n"); + return -3; + break; + case UC_AFL_RET_NO_AFL: + printf("No AFL attached - we are done with a single run.\n"); + break; + default: + break; + } + return 0; +} diff --git a/unicorn_mode/samples/speedtest/get_offsets.py b/unicorn_mode/samples/speedtest/get_offsets.py new file mode 100755 index 00000000..c9dc76df --- /dev/null +++ b/unicorn_mode/samples/speedtest/get_offsets.py @@ -0,0 +1,77 @@ +#!/usr/bin/env python3 + +"""This simple script uses objdump to parse important addresses from the target""" +import shlex +import subprocess + +objdump_output = subprocess.check_output( + shlex.split("objdump -M intel -D target") +).decode() +main_loc = None +main_ends = [] +main_ended = False +magicfn_calls = [] +malloc_calls = [] +free_calls = [] +strlen_calls = [] + + +def line2addr(line): + return "0x" + line.split(":", 1)[0].strip() + + +last_line = None +for line in objdump_output.split("\n"): + line = line.strip() + + def read_addr_if_endswith(findme, list_to): + """ + Look, for example, for the addr like: + 12a9: e8 f2 fd ff ff call 10a0 <free@plt> + """ + if line.endswith(findme): + list_to.append(line2addr(line)) + + if main_loc is not None and main_ended is False: + # We want to know where main ends. An empty line in objdump. 
+ if len(line) == 0: + main_ends.append(line2addr(last_line)) + main_ended = True + elif "ret" in line: + main_ends.append(line2addr(line)) + + if "<main>:" in line: + if main_loc is not None: + raise Exception("Found multiple main functions, odd target!") + # main_loc is the label, so it's parsed differently (i.e. `0000000000001220 <main>:`) + main_loc = "0x" + line.strip().split(" ", 1)[0].strip() + else: + [ + read_addr_if_endswith(*x) + for x in [ + ("<free@plt>", free_calls), + ("<malloc@plt>", malloc_calls), + ("<strlen@plt>", strlen_calls), + ("<magicfn>", magicfn_calls), + ] + ] + + last_line = line + +if main_loc is None: + raise Exception( + "Could not find main in ./target! Make sure objdump is installed and the target is compiled." + ) + +with open("target.offsets.main", "w") as f: + f.write(main_loc) +with open("target.offsets.main_ends", "w") as f: + f.write("\n".join(main_ends)) +with open("target.offsets.magicfn", "w") as f: + f.write("\n".join(magicfn_calls)) +with open("target.offsets.malloc", "w") as f: + f.write("\n".join(malloc_calls)) +with open("target.offsets.free", "w") as f: + f.write("\n".join(free_calls)) +with open("target.offsets.strlen", "w") as f: + f.write("\n".join(strlen_calls)) diff --git a/unicorn_mode/samples/speedtest/python/Makefile b/unicorn_mode/samples/speedtest/python/Makefile new file mode 100644 index 00000000..4282c6cb --- /dev/null +++ b/unicorn_mode/samples/speedtest/python/Makefile @@ -0,0 +1,8 @@ +all: fuzz + +../target: + $(MAKE) -C .. + +fuzz: ../target + rm -rf ./output + ../../../../afl-fuzz -s 1 -U -i ../sample_inputs -o ./output -- python3 harness.py @@ diff --git a/unicorn_mode/samples/speedtest/python/harness.py b/unicorn_mode/samples/speedtest/python/harness.py new file mode 100644 index 00000000..801ef4d1 --- /dev/null +++ b/unicorn_mode/samples/speedtest/python/harness.py @@ -0,0 +1,277 @@ +#!/usr/bin/env python3 +""" + Simple test harness for AFL's Unicorn Mode. 
+ + This loads the speedtest target binary (precompiled X64 code) into + Unicorn's memory map for emulation, places the specified input into + argv, and executes main. + There should not be any crashes - it's a speedtest against Rust and C. + + Before running this harness, call make in the parent folder. + + Run under AFL as follows: + + $ cd <afl_path>/unicorn_mode/samples/speedtest/python + $ ../../../../afl-fuzz -U -i ../sample_inputs -o ./output -- python3 harness.py @@ +""" + +import argparse +import os +import struct + +from unicornafl import * +from unicornafl.unicorn_const import UC_ARCH_X86, UC_HOOK_CODE, UC_MODE_64 +from unicornafl.x86_const import ( + UC_X86_REG_RAX, + UC_X86_REG_RDI, + UC_X86_REG_RIP, + UC_X86_REG_RSI, + UC_X86_REG_RSP, +) + +# Memory map for the code to be tested +BASE_ADDRESS = 0x0 # Arbitrary address where the (PIE) target binary will be loaded to +CODE_SIZE_MAX = 0x00010000 # Max size for the code (64kb) +INPUT_ADDRESS = 0x00100000 # where we put our stuff +INPUT_MAX = 0x00100000 # max size for our input +HEAP_ADDRESS = 0x00200000 # Heap addr +HEAP_SIZE_MAX = 0x000F0000 # Maximum allowable size for the heap +STACK_ADDRESS = 0x00400000 # Address of the stack (arbitrarily chosen) +STACK_SIZE = 0x000F0000 # Size of the stack (arbitrarily chosen) + +target_path = os.path.abspath( + os.path.join(os.path.dirname(os.path.abspath(__file__)), "..") +) +target_bin = os.path.join(target_path, "target") + + +def get_offsets_for(name): + full_path = os.path.join(target_path, f"target.offsets.{name}") + with open(full_path) as f: + return [int(x, 16) + BASE_ADDRESS for x in f.readlines()] + + +# Read all offsets from our objdump file +main_offset = get_offsets_for("main")[0] +main_ends = get_offsets_for("main_ends") +malloc_callsites = get_offsets_for("malloc") +free_callsites = get_offsets_for("free") +magicfn_callsites = get_offsets_for("magicfn") +# Joke's on me: strlen got inlined by my compiler +strlen_callsites = get_offsets_for("strlen") + 
+try:
+    # If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
+    from capstone import *
+
+    # The target is x86_64 code, so disassemble accordingly.
+    cs = Cs(CS_ARCH_X86, CS_MODE_64)
+
+    def unicorn_debug_instruction(uc, address, size, user_data):
+        mem = uc.mem_read(address, size)
+        for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(
+            bytes(mem), size
+        ):
+            print("    Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
+
+
+except ImportError:
+
+    def unicorn_debug_instruction(uc, address, size, user_data):
+        print("    Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
+
+
+def unicorn_debug_block(uc, address, size, user_data):
+    print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
+
+
+def unicorn_debug_mem_access(uc, access, address, size, value, user_data):
+    if access == UC_MEM_WRITE:
+        print(
+            "        >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
+    else:
+        print("        >>> Read: addr=0x{0:016x} size={1}".format(address, size))
+
+
+def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data):
+    if access == UC_MEM_WRITE_UNMAPPED:
+        print(
+            "        >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(
+                address, size, value
+            )
+        )
+    else:
+        print(
+            "        >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size)
+        )
+
+
+already_allocated = False
+
+
+def malloc_hook(uc, address, size, user_data):
+    """
+    We use a very simple malloc/free stub here that only works for exactly one allocation at a time.
+    """
+    global already_allocated
+    if already_allocated:
+        print("Double malloc, not supported right now!")
+        os.abort()
+    # read the first param
+    malloc_size = uc.reg_read(UC_X86_REG_RDI)
+    if malloc_size > HEAP_SIZE_MAX:
+        print(
+            f"Tried to allocate {malloc_size} bytes, ain't nobody got space for that! (We may only allocate up to {HEAP_SIZE_MAX})"
+        )
+        os.abort()
+    uc.reg_write(UC_X86_REG_RAX, HEAP_ADDRESS)
+    uc.reg_write(UC_X86_REG_RIP, address + size)
+    already_allocated = True
+
+
+def free_hook(uc, address, size, user_data):
+    """
+    No real free, just set the "used"-flag to false.
+    """
+    global already_allocated
+    if not already_allocated:
+        print("Double free detected. Real bug?")
+        os.abort()
+    # read the first param
+    free_ptr = uc.reg_read(UC_X86_REG_RDI)
+    if free_ptr != HEAP_ADDRESS:
+        print(
+            f"Tried to free wrong mem region: {hex(free_ptr)} at code loc {hex(address)}"
+        )
+        os.abort()
+    uc.reg_write(UC_X86_REG_RIP, address + size)
+    already_allocated = False
+
+
+# def strlen_hook(uc, address, size, user_data):
+#     """
+#     No real strlen, we know the len is == our input.
+#     This completely ignores '\0', but for this target, do we really care?
+#     """
+#     global input_len
+#     print(f"Returning len {input_len}")
+#     uc.reg_write(UC_X86_REG_RAX, input_len)
+#     uc.reg_write(UC_X86_REG_RIP, address + size)
+
+
+def magicfn_hook(uc, address, size, user_data):
+    """
+    This is a fancy print function that we're just going to skip for fuzzing.
+    """
+    uc.reg_write(UC_X86_REG_RIP, address + size)
+
+
+def main():
+
+    parser = argparse.ArgumentParser(description="Test harness for the speedtest target")
+    parser.add_argument(
+        "input_file",
+        type=str,
+        help="Path to the file containing the mutated input to load",
+    )
+    parser.add_argument(
+        "-t",
+        "--trace",
+        default=False,
+        action="store_true",
+        help="Enables debug tracing",
+    )
+    args = parser.parse_args()
+
+    # Instantiate an x86_64 Unicorn Engine instance
+    uc = Uc(UC_ARCH_X86, UC_MODE_64)
+
+    if args.trace:
+        uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block)
+        uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction)
+        uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access)
+        uc.hook_add(
+            UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID,
+            unicorn_debug_mem_invalid_access,
+        )
+
+    print("The input testcase is set to {}".format(args.input_file))
+
+    # ---------------------------------------------------
+    # Load the binary to emulate and map it into memory
+    with open(target_bin, "rb") as f:
+        binary_code = f.read()
+
+    # Make sure the target binary fits into its code mapping
+    if len(binary_code) > CODE_SIZE_MAX:
+        print("Binary code is too large (> {} bytes)".format(CODE_SIZE_MAX))
+        return
+
+    # Write the binary to its place in mem
+    uc.mem_map(BASE_ADDRESS, CODE_SIZE_MAX)
+    uc.mem_write(BASE_ADDRESS, binary_code)
+
+    # Set the program counter to the start of the code
+    uc.reg_write(UC_X86_REG_RIP, main_offset)
+
+    # Setup the stack.
+    uc.mem_map(STACK_ADDRESS, STACK_SIZE)
+    # Setup the stack pointer, but allocate two pointers for the pointers to input.
+    uc.reg_write(UC_X86_REG_RSP, STACK_ADDRESS + STACK_SIZE - 16)
+
+    # Setup our input space, and push the pointer to it in the function params
+    uc.mem_map(INPUT_ADDRESS, INPUT_MAX)
+    # We have argc = 2
+    uc.reg_write(UC_X86_REG_RDI, 2)
+    # RSI points to our little 2 QWORD space at the beginning of the stack...
+    uc.reg_write(UC_X86_REG_RSI, STACK_ADDRESS + STACK_SIZE - 16)
+    # ... which points to the Input. Write the ptr to mem in little endian.
+    uc.mem_write(STACK_ADDRESS + STACK_SIZE - 16, struct.pack("<Q", INPUT_ADDRESS))
+
+    for addr in malloc_callsites:
+        uc.hook_add(UC_HOOK_CODE, malloc_hook, begin=addr, end=addr)
+
+    for addr in free_callsites:
+        uc.hook_add(UC_HOOK_CODE, free_hook, begin=addr, end=addr)
+
+    if strlen_callsites:
+        # strlen got inlined for my compiler, so normally there is nothing to hook.
+        print(
+            "Oops, your compiler emitted strlen as a function. You may have to change the harness."
+        )
+        # for addr in strlen_callsites:
+        #     uc.hook_add(UC_HOOK_CODE, strlen_hook, begin=addr, end=addr)
+
+    for addr in magicfn_callsites:
+        uc.hook_add(UC_HOOK_CODE, magicfn_hook, begin=addr, end=addr + 1)
+
+    # -----------------------------------------------------
+    # Set up a callback to place input data (do little work here, it's called for every single iteration! This code is *HOT*)
+    # We did not pass in any extra data, so we can ignore the data param.
+    # Be sure to check out the docstrings for the uc.afl_* functions.
+    def place_input_callback(uc, input, persistent_round, data):
+        # Apply constraints to the mutated input
+        input_len = len(input)
+        if input_len > INPUT_MAX:
+            # Too long: return False so AFL treats the input as uninteresting.
+            return False
+
+        # Make sure the string is always 0-terminated (as it would be "in the wild").
+        # Note: input is a bytearray, so the element must be an int, not bytes.
+        input[-1] = 0
+
+        # Write the mutated command into the data buffer
+        uc.mem_write(INPUT_ADDRESS, input)
+
+    print(f"Starting to fuzz. Running from addr {main_offset} to one of {main_ends}")
+    # Start the fuzzer.
+    uc.afl_fuzz(args.input_file, place_input_callback, main_ends, persistent_iters=1000)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/unicorn_mode/samples/speedtest/rust/.gitignore b/unicorn_mode/samples/speedtest/rust/.gitignore
new file mode 100644
index 00000000..a9d37c56
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/rust/.gitignore
@@ -0,0 +1,2 @@
+target
+Cargo.lock
diff --git a/unicorn_mode/samples/speedtest/rust/Cargo.toml b/unicorn_mode/samples/speedtest/rust/Cargo.toml
new file mode 100644
index 00000000..c19ee0a1
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/rust/Cargo.toml
@@ -0,0 +1,15 @@
+[package]
+name = "unicornafl_harness"
+version = "0.1.0"
+authors = ["Dominik Maier <domenukk@gmail.com>"]
+edition = "2018"
+
+[profile.release]
+lto = true
+opt-level = 3
+panic = "abort"
+
+[dependencies]
+unicornafl = { path = "../../../unicornafl/bindings/rust/", version="1.0.0" }
+capstone="0.6.0"
+libc="0.2.66"
\ No newline at end of file
diff --git a/unicorn_mode/samples/speedtest/rust/Makefile b/unicorn_mode/samples/speedtest/rust/Makefile
new file mode 100644
index 00000000..fe18d6ee
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/rust/Makefile
@@ -0,0 +1,17 @@
+all: fuzz
+
+clean:
+	cargo clean
+
+./target/release/unicornafl_harness: ./src/main.rs
+	cargo build --release
+
+./target/debug/unicornafl_harness: ./src/main.rs
+	cargo build
+
+../target:
+	$(MAKE) -C ..
+
+fuzz: ../target ./target/release/unicornafl_harness
+	rm -rf ./output
+	SKIP_BINCHECK=1 ../../../../afl-fuzz -s 1 -i ../sample_inputs -o ./output -- ./target/release/unicornafl_harness
@@
diff --git a/unicorn_mode/samples/speedtest/rust/src/main.rs b/unicorn_mode/samples/speedtest/rust/src/main.rs
new file mode 100644
index 00000000..1e35ff0b
--- /dev/null
+++ b/unicorn_mode/samples/speedtest/rust/src/main.rs
@@ -0,0 +1,232 @@
+extern crate capstone;
+extern crate libc;
+
+use core::cell::Cell;
+use std::{
+    env,
+    fs::File,
+    io::{self, Read},
+    process::abort,
+    str,
+};
+
+use unicornafl::{
+    unicorn_const::{uc_error, Arch, Mode, Permission},
+    RegisterX86::{self, *},
+    Unicorn, UnicornHandle,
+};
+
+const BINARY: &str = "../target";
+
+// Memory map for the code to be tested
+// Arbitrary address where code to test will be loaded
+const BASE_ADDRESS: u64 = 0x0;
+// Max size for the code (64kb)
+const CODE_SIZE_MAX: u64 = 0x00010000;
+// Location where the input will be placed (make sure the uclated program knows this somehow, too ;) )
+const INPUT_ADDRESS: u64 = 0x00100000;
+// Maximum size for our input
+const INPUT_MAX: u64 = 0x00100000;
+// Where our pseudo-heap is at
+const HEAP_ADDRESS: u64 = 0x00200000;
+// Maximum allowable size for the heap
+const HEAP_SIZE_MAX: u64 = 0x000F0000;
+// Address of the stack (some random address again)
+const STACK_ADDRESS: u64 = 0x00400000;
+// Size of the stack (arbitrarily chosen, just make it big enough)
+const STACK_SIZE: u64 = 0x000F0000;
+
+fn read_file(filename: &str) -> Result<Vec<u8>, io::Error> {
+    let mut f = File::open(filename)?;
+    let mut buffer = Vec::new();
+    f.read_to_end(&mut buffer)?;
+    Ok(buffer)
+}
+
+/// Our location parser
+fn parse_locs(loc_name: &str) -> Result<Vec<u64>, io::Error> {
+    let contents = &read_file(&format!("../target.offsets.{}", loc_name))?;
+    Ok(str_from_u8_unchecked(&contents)
+        .split('\n')
+        // Skip empty lines (e.g. a trailing newline), which would panic below.
+        .filter(|x| !x.is_empty())
+        .map(|x| {
+            // Each line looks like "0x...": strip the prefix and parse as hex.
+            u64::from_str_radix(&x[2..], 16).unwrap()
+        })
+        .collect())
+}
+
+// find null terminated string in vec
+pub fn str_from_u8_unchecked(utf8_src: &[u8]) -> &str {
+    let nul_range_end = utf8_src
+        .iter()
+        .position(|&c| c == b'\0')
+        .unwrap_or(utf8_src.len());
+    unsafe { str::from_utf8_unchecked(&utf8_src[0..nul_range_end]) }
+}
+
+fn align(size: u64) -> u64 {
+    const ALIGNMENT: u64 = 0x1000;
+    if size % ALIGNMENT == 0 {
+        size
+    } else {
+        ((size / ALIGNMENT) + 1) * ALIGNMENT
+    }
+}
+
+fn main() {
+    let args: Vec<String> = env::args().collect();
+    if args.len() == 1 {
+        println!("Missing parameter <uclation_input> (@@ for AFL)");
+        return;
+    }
+    let input_file = &args[1];
+    println!("The input testcase is set to {}", input_file);
+    fuzz(input_file).unwrap();
+}
+
+fn fuzz(input_file: &str) -> Result<(), uc_error> {
+    let mut unicorn = Unicorn::new(Arch::X86, Mode::MODE_64, 0)?;
+    let mut uc: UnicornHandle<'_, _> = unicorn.borrow();
+
+    let binary =
+        read_file(BINARY).expect(&format!("Could not read target binary: {}", BINARY));
+    let _aligned_binary_size = align(binary.len() as u64);
+    // Make sure the target binary fits into its code mapping
+    if binary.len() as u64 > CODE_SIZE_MAX {
+        println!("Binary code is too large (> {} bytes)", CODE_SIZE_MAX);
+    }
+
+    // Write the binary to its place in mem
+    uc.mem_map(BASE_ADDRESS, CODE_SIZE_MAX as usize, Permission::ALL)?;
+    uc.mem_write(BASE_ADDRESS, &binary)?;
+
+    // Set the program counter to the start of the code
+    let main_locs = parse_locs("main").unwrap();
+    uc.reg_write(RegisterX86::RIP as i32, main_locs[0])?;
+
+    // Setup the stack.
+    uc.mem_map(
+        STACK_ADDRESS,
+        STACK_SIZE as usize,
+        Permission::READ | Permission::WRITE,
+    )?;
+    // Setup the stack pointer, but allocate two pointers for the pointers to input.
+    uc.reg_write(RSP as i32, STACK_ADDRESS + STACK_SIZE - 16)?;
+
+    // Setup our input space, and push the pointer to it in the function params
+    uc.mem_map(INPUT_ADDRESS, INPUT_MAX as usize, Permission::READ)?;
+    // We have argc = 2
+    uc.reg_write(RDI as i32, 2)?;
+    // RSI points to our little 2 QWORD space at the beginning of the stack...
+    uc.reg_write(RSI as i32, STACK_ADDRESS + STACK_SIZE - 16)?;
+    // ... which points to the Input. Write the full 64-bit ptr to mem in little endian.
+    uc.mem_write(
+        STACK_ADDRESS + STACK_SIZE - 16,
+        &(INPUT_ADDRESS as u64).to_le_bytes(),
+    )?;
+
+    // Shared allocation flag; Rc, so both hooks see the same state.
+    let already_allocated = std::rc::Rc::new(Cell::new(false));
+
+    let already_allocated_malloc = already_allocated.clone();
+    // We use a very simple malloc/free stub here,
+    // that only works for exactly one allocation at a time.
+    let hook_malloc = move |mut uc: UnicornHandle<'_, _>, addr: u64, size: u32| {
+        if already_allocated_malloc.get() {
+            println!("Double malloc, not supported right now!");
+            abort();
+        }
+        // read the first param
+        let malloc_size = uc.reg_read(RDI as i32).unwrap();
+        if malloc_size > HEAP_SIZE_MAX {
+            println!(
+                "Tried to allocate {} bytes, but we may only allocate up to {}",
+                malloc_size, HEAP_SIZE_MAX
+            );
+            abort();
+        }
+        uc.reg_write(RAX as i32, HEAP_ADDRESS).unwrap();
+        uc.reg_write(RIP as i32, addr + size as u64).unwrap();
+        already_allocated_malloc.set(true);
+    };
+
+    let already_allocated_free = already_allocated.clone();
+    // No real free, just set the "used"-flag to false.
+    let hook_free = move |mut uc: UnicornHandle<'_, _>, addr, size| {
+        // Freeing while nothing is allocated means a double free.
+        if !already_allocated_free.get() {
+            println!("Double free detected. Real bug?");
+            abort();
+        }
+        // read the first param
+        let free_ptr = uc.reg_read(RDI as i32).unwrap();
+        if free_ptr != HEAP_ADDRESS {
+            println!(
+                "Tried to free wrong mem region {:x} at code loc {:x}",
+                free_ptr, addr
+            );
+            abort();
+        }
+        uc.reg_write(RIP as i32, addr + size as u64).unwrap();
+        already_allocated_free.set(false);
+    };
+
+    /*
+      BEGIN FUNCTION HOOKS
+    */
+
+    // This is a fancy print function that we're just going to skip for fuzzing.
+    let hook_magicfn = move |mut uc: UnicornHandle<'_, _>, addr, size| {
+        uc.reg_write(RIP as i32, addr + size as u64).unwrap();
+    };
+
+    for addr in parse_locs("malloc").unwrap() {
+        uc.add_code_hook(addr, addr, Box::new(hook_malloc.clone()))?;
+    }
+
+    for addr in parse_locs("free").unwrap() {
+        uc.add_code_hook(addr, addr, Box::new(hook_free.clone()))?;
+    }
+
+    for addr in parse_locs("magicfn").unwrap() {
+        uc.add_code_hook(addr, addr, Box::new(hook_magicfn.clone()))?;
+    }
+
+    let place_input_callback =
+        |mut uc: UnicornHandle<'_, _>, afl_input: &mut [u8], _persistent_round| {
+            // apply constraints to the mutated input
+            if afl_input.len() > INPUT_MAX as usize {
+                //println!("Skipping testcase with len {}", afl_input.len());
+                return false;
+            }
+
+            afl_input[afl_input.len() - 1] = b'\0';
+            uc.mem_write(INPUT_ADDRESS, afl_input).unwrap();
+            true
+        };
+
+    // return true if the last run should be counted as a crash
+    let crash_validation_callback =
+        |_uc: UnicornHandle<'_, _>, result, _input: &[u8], _persistent_round| {
+            result != uc_error::OK
+        };
+
+    let end_addrs = parse_locs("main_ends").unwrap();
+
+    let ret = uc.afl_fuzz(
+        input_file,
+        Box::new(place_input_callback),
+        &end_addrs,
+        Box::new(crash_validation_callback),
+        false,
+        1000,
+    );
+
+    match ret {
+        Ok(_) => {}
+        Err(e) => panic!("found non-ok unicorn exit: {:?}", e),
+    }
+
+    Ok(())
+}
diff --git a/unicorn_mode/samples/speedtest/sample_inputs/a b/unicorn_mode/samples/speedtest/sample_inputs/a
new
file mode 100644 index 00000000..78981922 --- /dev/null +++ b/unicorn_mode/samples/speedtest/sample_inputs/a @@ -0,0 +1 @@ +a diff --git a/unicorn_mode/samples/speedtest/target.c b/unicorn_mode/samples/speedtest/target.c new file mode 100644 index 00000000..8359a110 --- /dev/null +++ b/unicorn_mode/samples/speedtest/target.c @@ -0,0 +1,77 @@ +/* + * Sample target file to test afl-unicorn fuzzing capabilities. + * This is a very trivial example that will, however, never crash. + * Crashing would change the execution speed. + * + */ +#include <stdint.h> +#include <string.h> +#include <stdio.h> +#include <stdlib.h> + +// Random print function we can hook in our harness to test hook speeds. +char magicfn(char to_print) { + puts("Printing a char, just minding my own business: "); + putchar(to_print); + putchar('\n'); + return to_print; +} + +int main(int argc, char** argv) { + if (argc < 2) { + printf("Gimme input pl0x!\n"); + return -1; + } + + // Make sure the hooks work... + char *test = malloc(1024); + if (!test) { + printf("Uh-Oh, malloc doesn't work!"); + abort(); + } + free(test); + + char *data_buf = argv[1]; + // We can start the unicorn hooking here. + uint64_t data_len = strlen(data_buf); + if (data_len < 20) return -2; + + for (; data_len --> 0 ;) { + char *buf_cpy = NULL; + if (data_len) { + buf_cpy = malloc(data_len); + if (!buf_cpy) { + puts("Oof, malloc failed! 
:/");
+        abort();
+      }
+      memcpy(buf_cpy, data_buf, data_len);
+    }
+    if (data_len >= 18) {
+      free(buf_cpy);
+      continue;
+    }
+    if (data_len > 2 && data_len < 18) {
+      buf_cpy[data_len - 1] = (char) 0x90;
+    } else if (data_buf[9] == (char) 0x90 && data_buf[10] != 0x00 && buf_cpy[11] == (char) 0x90) {
+      // Read buf_cpy[10] if data[10] is not zero, while data[9] and buf_cpy[11] are 0x90
+      unsigned char valid_read = buf_cpy[10];
+      if (magicfn(valid_read) != valid_read) {
+        puts("Oof, the hook for data_buf[10] is broken?");
+        abort();
+      }
+    }
+    free(buf_cpy);
+  }
+  if (data_buf[0] > 0x10 && data_buf[0] < 0x20 && data_buf[1] > data_buf[2]) {
+    // Read data[0] if (0x10 < data[0] < 0x20) and data[1] > data[2]
+    unsigned char valid_read = data_buf[0];
+    if (magicfn(valid_read) != valid_read) {
+      puts("Oof, the hook for data_buf[0] is broken?");
+      abort();
+    }
+  }
+
+  magicfn('q');
+
+  return 0;
+}
diff --git a/unicorn_mode/unicornafl b/unicorn_mode/unicornafl
-Subproject c6d6647161a32bae88785a618fcd828d1711d9e
+Subproject fb2fc9f25df32f17f6b6b859e4dbd70f9a857e0
diff --git a/unicorn_mode/update_uc_ref.sh b/unicorn_mode/update_uc_ref.sh
index a2613942..7c1c7778 100755
--- a/unicorn_mode/update_uc_ref.sh
+++ b/unicorn_mode/update_uc_ref.sh
@@ -19,7 +19,7 @@ if [ "$NEW_VERSION" = "-h" ]; then
   exit 1
 fi
-git submodule init && git submodule update || exit 1
+git submodule init && git submodule update unicornafl || exit 1
 cd ./unicornafl || exit 1
 git fetch origin dev 1>/dev/null || exit 1
 git stash 1>/dev/null 2>/dev/null
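Both harnesses in this diff read the `target.offsets.*` files (one `0x...` hex address per line, as emitted by the objdump/grep step in the parent Makefile) and turn each line into an absolute address. A minimal standalone sketch of that parsing step — the function name `parse_offsets` is mine, the real harnesses inline this logic in `get_offsets_for()` / `parse_locs()`:

```python
def parse_offsets(text, base_address=0):
    """Parse hex offsets (one "0x..." line each) into absolute addresses.

    Skipping blank lines sidesteps the trailing-newline pitfall that the
    Rust parse_locs() has to filter out before slicing off the "0x" prefix.
    """
    return [
        int(line, 16) + base_address  # int(..., 16) accepts the "0x" prefix
        for line in text.splitlines()
        if line.strip()
    ]


# Offsets as the objdump pipeline would emit them, with a trailing newline:
print(parse_offsets("0x11a9\n0x11f0\n"))  # → [4521, 4592]
```

Relocating for a non-zero load address is just a matter of passing `base_address=BASE_ADDRESS`, mirroring what `get_offsets_for()` does in the Python harness.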