diff options
| author | van Hauser <vh@thc.org> | 2024-02-01 15:13:07 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2024-02-01 14:13:07 +0000 |
| commit | eda770fd32b804e3ebd6a43738c0002f6118a463 (patch) | |
| tree | ef1db3e42bd23e9627ad695f9a65f1e7c5b951b0 /utils | |
| parent | 0c054f520eda67b7bb15f95ca58c028e9b68131f (diff) | |
| download | afl++-eda770fd32b804e3ebd6a43738c0002f6118a463.tar.gz | |
push to stable (#1967)
* Output afl-clang-fast stuffs only if necessary (#1912)
* afl-cc header
* afl-cc common declarations
- Add afl-cc-state.c
- Strip includes, find_object, debug/be_quiet/have_*/callname setting from afl-cc.c
- Use debugf_args in main
- Modify execvp stuffs to fit new aflcc struct
* afl-cc show usage
* afl-cc mode selecting
1. compiler_mode by callname in argv[0]
2. compiler_mode by env "AFL_CC_COMPILER"
3. compiler_mode/instrument_mode by command line options "--afl-..."
4. instrument_mode/compiler_mode by various env vars including "AFL_LLVM_INSTRUMENT"
5. final checking steps
6. print "... - mode: %s-%s\n"
7. determine real argv[0] according to compiler_mode
* afl-cc macro defs
* afl-cc linking behaviors
* afl-cc fsanitize behaviors
* afl-cc misc
* afl-cc body update
* afl-cc all-in-one
formated with custom-format.py
* nits
---------
Co-authored-by: vanhauser-thc <vh@thc.org>
* changelog
* update grammar mutator
* lto llvm 12+
* docs(custom_mutators): fix missing ':' (#1953)
* Fix broken LTO mode and response file support (#1948)
* Strip `-Wl,-no-undefined` during compilation (#1952)
Make the compiler wrapper stripping `-Wl,-no-undefined` in addition to `-Wl,--no-undefined`.
Both versions of the flag are accepted by clang and, therefore, used by building systems in the wild (e.g., samba will not build without this fix).
* Remove dead code in write_to_testcase (#1955)
The custom_mutators_count check in if case is duplicate with if condition.
The else case is custom_mutators_count == 0, neither custom_mutator_list iteration nor sent check needed.
Signed-off-by: Xeonacid <h.dwwwwww@gmail.com>
* update qemuafl
* WIP: Add ability to generate drcov trace using QEMU backend (#1956)
* Document new drcov QEMU plugin
* Add link to lightkeeper for QEMU drcov file loading
---------
Co-authored-by: Jean-Romain Garnier <jean-romain.garnier@airbus.com>
* code format
* changelog
* sleep on uid != 0 afl-system-config
* fix segv about skip_next, warn on unsupported cases of linking options (#1958)
* todos
* ensure afl-cc only allows available compiler modes
* update grammar mutator
* disable aslr on apple
* fix for arm64
* help selective instrumentation
* typos
* macos
* add compiler test script
* apple fixes
* bump nyx submodules (#1963)
* fix docs
* update changelog
* update grammar mutator
* improve compiler test script
* gcc asan workaround (#1966)
* fix github merge fuckup
* fix
* Fix afl-cc (#1968)
- Check if too many cmdline params here, each time before insert a new param.
- Check if it is "-fsanitize=..." before we do sth.
- Remove improper param_st transfer.
* Avoid adding llvmnative instrumentation when linking rust sanitizer runtime (#1969)
* Dynamic instrumentation filtering for LLVM native (#1971)
* Add two dynamic instrumentation filter methods to runtime
* Always use pc-table with native pcguard
* Add make_symbol_list.py and README
* changelog
* todos
* new forkserver check
* fix
* nyx test for CI
* improve nyx docs
* Fixes to afl-cc and documentation (#1974)
* Always compile with -ldl when building for CODE_COVERAGE
When building with CODE_COVERAGE, the afl runtime contains code that
calls `dladdr` which requires -ldl. Under most circumstances, clang
already adds this (e.g. when building with pc-table), but there are some
circumstances where it isn't added automatically.
* Add visibility declaration to __afl_connected
When building with hidden visibility, the use of __AFL_LOOP inside such
code can cause linker errors due to __afl_connected being declared
"hidden".
* Update docs to clarify that CODE_COVERAGE=1 is required for dynamic_covfilter
* nits
* nyx build script updates
* test error output
* debug ci
* debug ci
* Improve afl-cc (#1975)
* update response file support
- full support of rsp file
- fix some segv issues
* Improve afl-cc
- remove dead code about allow/denylist options of sancov
- missing `if (!aflcc->have_msan)`
- add docs for each function
- typo
* enable nyx
* debug ci
* debug ci
* debug ci
* debug ci
* debug ci
* debug ci
* debug ci
* debug ci
* fix ci
* clean test script
* NO_NYX
* NO_NYX
* fix ci
* debug ci
* fix ci
* finalize ci fix
---------
Signed-off-by: Xeonacid <h.dwwwwww@gmail.com>
Co-authored-by: Sonic <50692172+SonicStark@users.noreply.github.com>
Co-authored-by: Xeonacid <h.dwwwwww@gmail.com>
Co-authored-by: Nils Bars <nils.bars@rub.de>
Co-authored-by: Jean-Romain Garnier <7504819+JRomainG@users.noreply.github.com>
Co-authored-by: Jean-Romain Garnier <jean-romain.garnier@airbus.com>
Co-authored-by: Sergej Schumilo <sergej@schumilo.de>
Co-authored-by: Christian Holler (:decoder) <choller@mozilla.com>
Diffstat (limited to 'utils')
| -rw-r--r-- | utils/dynamic_covfilter/README.md | 60 | ||||
| -rw-r--r-- | utils/dynamic_covfilter/make_symbol_list.py | 73 |
2 files changed, 133 insertions, 0 deletions
diff --git a/utils/dynamic_covfilter/README.md b/utils/dynamic_covfilter/README.md new file mode 100644 index 00000000..381e0855 --- /dev/null +++ b/utils/dynamic_covfilter/README.md @@ -0,0 +1,60 @@ +# Dynamic Instrumentation Filter + +Sometimes it can be beneficial to limit the instrumentation feedback to +specific code locations. It is possible to do so at compile-time by simply +not instrumenting any undesired locations. However, there are situations +where doing this dynamically without requiring a new build can be beneficial. +Especially when dealing with larger builds, it is much more convenient to +select the target code locations at runtime instead of doing so at build time. + +There are two ways of doing this in AFL++. Both approaches require a build of +AFL++ with `CODE_COVERAGE=1`, so make sure to build AFL++ first by invoking + +`CODE_COVERAGE=1 make` + +Once you have built AFL++, you can choose out of two approaches: + +## Simple Selection with `AFL_PC_FILTER` + +This approach requires a build with `AFL_INSTRUMENTATION=llvmnative` or +`llvmcodecov` as well as an AddressSanitizer build with debug information. + +By setting the environment variable `AFL_PC_FILTER` to a string, the runtime +symbolizer is enabled in the AFL++ runtime. At startup, the runtime will call +the `__sanitizer_symbolize_pc` API to resolve every PC in every loaded module. +The runtime then matches the result using `strstr` and disables the PC guard +if the symbolized PC does not contain the specified string. + +This approach has the benefit of being very easy to use. The downside is that +it causes significant startup delays with large binaries and that it requires +an AddressSanitizer build. + +This method has no additional runtime overhead after startup. + +## Selection using pre-symbolized data file with `AFL_PC_FILTER_FILE` + +To avoid large startup time delays, a specific module can be pre-symbolized +using the `make_symbol_list.py` script. This script outputs a sorted list of +functions with their respective relative offsets and lengths in the target +binary: + +`python3 make_symbol_list.py libxul.so > libxul.symbols.txt` + +The resulting list can be filtered, e.g. using grep: + +`grep -i "webgl" libxul.symbols.txt > libxul.webgl.symbols.txt` + +Finally, you can run with `AFL_PC_FILTER_FILE=libxul.webgl.symbols.txt` to +restrict instrumentation feedback to the given locations. This approach only +has a minimal startup time delay due to the implementation only using binary +search on the given file per PC rather than reading debug information for every +PC. It also works well with Nyx, where symbolizing is usually disabled for the +target process to avoid delays with frequent crashes. + +Similar to the previous method, This approach requires a build with +`AFL_INSTRUMENTATION=llvmnative` or `llvmcodecov` as well debug information. +However, it does not require the ASan runtime as it doesn't do the symbolizing +in process. Due to the way it maps PCs to symbols, it is less accurate when it +comes to includes and inlines (it assumes all PCs within a function belong to +that function and originate from the same file). For most purposes, this should +be a reasonable simplification to quickly process even the largest binaries. diff --git a/utils/dynamic_covfilter/make_symbol_list.py b/utils/dynamic_covfilter/make_symbol_list.py new file mode 100644 index 00000000..d1dd6ab3 --- /dev/null +++ b/utils/dynamic_covfilter/make_symbol_list.py @@ -0,0 +1,73 @@ +# This Source Code Form is subject to the terms of the Mozilla Public +# License, v. 2.0. If a copy of the MPL was not distributed with this +# file, You can obtain one at http://mozilla.org/MPL/2.0/. +# +# Written by Christian Holler <decoder at mozilla dot com> + +import json +import os +import sys +import subprocess + +if len(sys.argv) != 2: + print("Usage: %s binfile" % os.path.basename(sys.argv[0])) + sys.exit(1) + +binfile = sys.argv[1] + +addr2len = {} +addrs = [] + +output = subprocess.check_output(["objdump", "-t", binfile]).decode("utf-8") +for line in output.splitlines(): + line = line.replace("\t", " ") + components = [x for x in line.split(" ") if x] + if not components: + continue + try: + start_addr = int(components[0], 16) + except ValueError: + continue + + # Length has variable position in objdump output + length = None + for comp in components[1:]: + if len(comp) == 16: + try: + length = int(comp, 16) + break + except: + continue + + if length is None: + print("ERROR: Couldn't determine function section length: %s" % line) + + func = components[-1] + + addrs.append(start_addr) + addr2len[str(hex(start_addr))] = str(length) + +# The search implementation in the AFL runtime expects everything sorted. +addrs.sort() +addrs = [str(hex(addr)) for addr in addrs] + +# We symbolize in one go to speed things up with large binaries. +output = subprocess.check_output([ + "llvm-addr2line", + "--output-style=JSON", + "-f", "-C", "-a", "-e", + binfile], + input="\n".join(addrs).encode("utf-8")).decode("utf-8") + +output = output.strip().splitlines() +for line in output: + output = json.loads(line) + if "Symbol" in output and output["Address"] in addr2len: + final_output = [ + output["Address"], + addr2len[output["Address"]], + os.path.basename(output["ModuleName"]), + output["Symbol"][0]["FileName"], + output["Symbol"][0]["FunctionName"] + ] + print("\t".join(final_output)) |
