From b33bb0943ac4957eaf7b16ef694a4e4b4a538212 Mon Sep 17 00:00:00 2001 From: David Carlier Date: Thu, 31 Oct 2019 15:50:58 +0000 Subject: libtokencap/libdislocator README rename proposals and fixing the install tasks in the process. --- README.md | 2 +- docs/notes_for_asan.txt | 2 +- libdislocator/Makefile | 2 +- libdislocator/README.dislocator.md | 60 +++++++++++++++++++++++++++++++++++ libdislocator/README.md | 60 ----------------------------------- libdislocator/libdislocator.so.c | 2 +- libtokencap/Makefile | 2 +- libtokencap/README.md | 64 -------------------------------------- libtokencap/README.tokencap.md | 64 ++++++++++++++++++++++++++++++++++++++ libtokencap/libtokencap.so.c | 2 +- 10 files changed, 130 insertions(+), 130 deletions(-) create mode 100644 libdislocator/README.dislocator.md delete mode 100644 libdislocator/README.md delete mode 100644 libtokencap/README.md create mode 100644 libtokencap/README.tokencap.md diff --git a/README.md b/README.md index 88a86aaa..e8d4e6a8 100644 --- a/README.md +++ b/README.md @@ -457,7 +457,7 @@ parsers and grammars, but isn't nearly as good as the -x mode. If a dictionary is really hard to come by, another option is to let AFL run for a while, and then use the token capture library that comes as a companion -utility with AFL. For that, see [libtokencap/README.md](libtokencap/README.md). +utility with AFL. For that, see [libtokencap/README.md](libtokencap/README.tokencap.md). ## 11) Crash triage diff --git a/docs/notes_for_asan.txt b/docs/notes_for_asan.txt index 972ca909..09ca172e 100644 --- a/docs/notes_for_asan.txt +++ b/docs/notes_for_asan.txt @@ -34,7 +34,7 @@ Note that ASAN is incompatible with -static, so be mindful of that. There is also the option of generating a corpus using a non-ASAN binary, and then feeding it to an ASAN-instrumented one to check for bugs. This is faster, and can give you somewhat comparable results. You can also try using -libdislocator (see libdislocator/README.dislocator in the parent directory) as a +libdislocator (see libdislocator/README.dislocator.md in the parent directory) as a lightweight and hassle-free (but less thorough) alternative. 2) Long version diff --git a/libdislocator/Makefile b/libdislocator/Makefile index 91efba07..05ba26b3 100644 --- a/libdislocator/Makefile +++ b/libdislocator/Makefile @@ -34,5 +34,5 @@ clean: install: all install -m 755 ../libdislocator.so $${DESTDIR}$(HELPER_PATH) - install -m 644 README.dislocator $${DESTDIR}$(HELPER_PATH) + install -m 644 README.dislocator.md $${DESTDIR}$(HELPER_PATH) diff --git a/libdislocator/README.dislocator.md b/libdislocator/README.dislocator.md new file mode 100644 index 00000000..5d5a1464 --- /dev/null +++ b/libdislocator/README.dislocator.md @@ -0,0 +1,60 @@ +# libdislocator, an abusive allocator + + (See ../docs/README for the general instruction manual.) + +This is a companion library that can be used as a drop-in replacement for the +libc allocator in the fuzzed binaries. It improves the odds of bumping into +heap-related security bugs in several ways: + + - It allocates all buffers so that they are immediately adjacent to a + subsequent PROT_NONE page, causing most off-by-one reads and writes to + immediately segfault, + + - It adds a canary immediately below the allocated buffer, to catch writes + to negative offsets (won't catch reads, though), + + - It sets the memory returned by malloc() to garbage values, improving the + odds of crashing when the target accesses uninitialized data, + + - It sets freed memory to PROT_NONE and does not actually reuse it, causing + most use-after-free bugs to segfault right away, + + - It forces all realloc() calls to return a new address - and sets + PROT_NONE on the original block. This catches use-after-realloc bugs, + + - It checks for calloc() overflows and can cause soft or hard failures + of alloc requests past a configurable memory limit (AFL_LD_LIMIT_MB, + AFL_LD_HARD_FAIL). + +Basically, it is inspired by some of the non-default options available for the +OpenBSD allocator - see malloc.conf(5) on that platform for reference. It is +also somewhat similar to several other debugging libraries, such as gmalloc +and DUMA - but is simple, plug-and-play, and designed specifically for fuzzing +jobs. + +Note that it does nothing for stack-based memory handling errors. The +-fstack-protector-all setting for GCC / clang, enabled when using AFL_HARDEN, +can catch some subset of that. + +The allocator is slow and memory-intensive (even the tiniest allocation uses up +4 kB of physical memory and 8 kB of virtual mem), making it completely unsuitable +for "production" uses; but it can be faster and more hassle-free than ASAN / MSAN +when fuzzing small, self-contained binaries. + +To use this library, run AFL like so: + +``` +AFL_PRELOAD=/path/to/libdislocator.so ./afl-fuzz [...other params...] +``` + +You *have* to specify path, even if it's just ./libdislocator.so or +$PWD/libdislocator.so. + +Similarly to afl-tmin, the library is not "proprietary" and can be used with +other fuzzers or testing tools without the need for any code tweaks. It does not +require AFL-instrumented binaries to work. + +Note that the AFL_PRELOAD approach (which AFL internally maps to LD_PRELOAD or +DYLD_INSERT_LIBRARIES, depending on the OS) works only if the target binary is +dynamically linked. Otherwise, attempting to use the library will have no +effect. diff --git a/libdislocator/README.md b/libdislocator/README.md deleted file mode 100644 index 5d5a1464..00000000 --- a/libdislocator/README.md +++ /dev/null @@ -1,60 +0,0 @@ -# libdislocator, an abusive allocator - - (See ../docs/README for the general instruction manual.) - -This is a companion library that can be used as a drop-in replacement for the -libc allocator in the fuzzed binaries. It improves the odds of bumping into -heap-related security bugs in several ways: - - - It allocates all buffers so that they are immediately adjacent to a - subsequent PROT_NONE page, causing most off-by-one reads and writes to - immediately segfault, - - - It adds a canary immediately below the allocated buffer, to catch writes - to negative offsets (won't catch reads, though), - - - It sets the memory returned by malloc() to garbage values, improving the - odds of crashing when the target accesses uninitialized data, - - - It sets freed memory to PROT_NONE and does not actually reuse it, causing - most use-after-free bugs to segfault right away, - - - It forces all realloc() calls to return a new address - and sets - PROT_NONE on the original block. This catches use-after-realloc bugs, - - - It checks for calloc() overflows and can cause soft or hard failures - of alloc requests past a configurable memory limit (AFL_LD_LIMIT_MB, - AFL_LD_HARD_FAIL). - -Basically, it is inspired by some of the non-default options available for the -OpenBSD allocator - see malloc.conf(5) on that platform for reference. It is -also somewhat similar to several other debugging libraries, such as gmalloc -and DUMA - but is simple, plug-and-play, and designed specifically for fuzzing -jobs. - -Note that it does nothing for stack-based memory handling errors. The --fstack-protector-all setting for GCC / clang, enabled when using AFL_HARDEN, -can catch some subset of that. - -The allocator is slow and memory-intensive (even the tiniest allocation uses up -4 kB of physical memory and 8 kB of virtual mem), making it completely unsuitable -for "production" uses; but it can be faster and more hassle-free than ASAN / MSAN -when fuzzing small, self-contained binaries. - -To use this library, run AFL like so: - -``` -AFL_PRELOAD=/path/to/libdislocator.so ./afl-fuzz [...other params...] -``` - -You *have* to specify path, even if it's just ./libdislocator.so or -$PWD/libdislocator.so. - -Similarly to afl-tmin, the library is not "proprietary" and can be used with -other fuzzers or testing tools without the need for any code tweaks. It does not -require AFL-instrumented binaries to work. - -Note that the AFL_PRELOAD approach (which AFL internally maps to LD_PRELOAD or -DYLD_INSERT_LIBRARIES, depending on the OS) works only if the target binary is -dynamically linked. Otherwise, attempting to use the library will have no -effect. diff --git a/libdislocator/libdislocator.so.c b/libdislocator/libdislocator.so.c index 7fe40afa..106b44f4 100644 --- a/libdislocator/libdislocator.so.c +++ b/libdislocator/libdislocator.so.c @@ -14,7 +14,7 @@ http://www.apache.org/licenses/LICENSE-2.0 This is a companion library that can be used as a drop-in replacement - for the libc allocator in the fuzzed binaries. See README.dislocator for + for the libc allocator in the fuzzed binaries. See README.dislocator.md for more info. */ diff --git a/libtokencap/Makefile b/libtokencap/Makefile index df2426ed..6e1319d8 100644 --- a/libtokencap/Makefile +++ b/libtokencap/Makefile @@ -49,5 +49,5 @@ clean: install: all install -m 755 ../libtokencap.so $${DESTDIR}$(HELPER_PATH) - install -m 644 README.tokencap $${DESTDIR}$(HELPER_PATH) + install -m 644 README.tokencap.md $${DESTDIR}$(HELPER_PATH) diff --git a/libtokencap/README.md b/libtokencap/README.md deleted file mode 100644 index 8aae38bf..00000000 --- a/libtokencap/README.md +++ /dev/null @@ -1,64 +0,0 @@ -# strcmp() / memcmp() token capture library - - (See ../docs/README for the general instruction manual.) - -This companion library allows you to instrument `strcmp()`, `memcmp()`, -and related functions to automatically extract syntax tokens passed to any of -these libcalls. The resulting list of tokens may be then given as a starting -dictionary to afl-fuzz (the -x option) to improve coverage on subsequent -fuzzing runs. - -This may help improving coverage in some targets, and do precisely nothing in -others. In some cases, it may even make things worse: if libtokencap picks up -syntax tokens that are not used to process the input data, but that are a part -of - say - parsing a config file... well, you're going to end up wasting a lot -of CPU time on trying them out in the input stream. In other words, use this -feature with care. Manually screening the resulting dictionary is almost -always a necessity. - -As for the actual operation: the library stores tokens, without any deduping, -by appending them to a file specified via AFL_TOKEN_FILE. If the variable is not -set, the tool uses stderr (which is probably not what you want). - -Similarly to afl-tmin, the library is not "proprietary" and can be used with -other fuzzers or testing tools without the need for any code tweaks. It does not -require AFL-instrumented binaries to work. - -To use the library, you *need* to make sure that your fuzzing target is compiled -with -fno-builtin and is linked dynamically. If you wish to automate the first -part without mucking with CFLAGS in Makefiles, you can set AFL_NO_BUILTIN=1 -when using afl-gcc. This setting specifically adds the following flags: - -``` - -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strcasecmp - -fno-builtin-strcasencmp -fno-builtin-memcmp -fno-builtin-strstr - -fno-builtin-strcasestr -``` - -The next step is simply loading this library via LD_PRELOAD. The optimal usage -pattern is to allow afl-fuzz to fuzz normally for a while and build up a corpus, -and then fire off the target binary, with libtokencap.so loaded, on every file -found by AFL in that earlier run. This demonstrates the basic principle: - -``` - export AFL_TOKEN_FILE=$PWD/temp_output.txt - - for i in /queue/id*; do - LD_PRELOAD=/path/to/libtokencap.so \ - /path/to/target/program [...params, including $i...] - done - - sort -u temp_output.txt >afl_dictionary.txt -``` - -If you don't get any results, the target library is probably not using strcmp() -and memcmp() to parse input; or you haven't compiled it with -fno-builtin; or -the whole thing isn't dynamically linked, and LD_PRELOAD is having no effect. - -Portability hints: There is probably no particularly portable and non-invasive -way to distinguish between read-only and read-write memory mappings. -The `__tokencap_load_mappings()` function is the only thing that would -need to be changed for other OSes. - -Current supported OSes are: Linux, Darwin, FreeBSD (thanks to @devnexen) - diff --git a/libtokencap/README.tokencap.md b/libtokencap/README.tokencap.md new file mode 100644 index 00000000..8aae38bf --- /dev/null +++ b/libtokencap/README.tokencap.md @@ -0,0 +1,64 @@ +# strcmp() / memcmp() token capture library + + (See ../docs/README for the general instruction manual.) + +This companion library allows you to instrument `strcmp()`, `memcmp()`, +and related functions to automatically extract syntax tokens passed to any of +these libcalls. The resulting list of tokens may be then given as a starting +dictionary to afl-fuzz (the -x option) to improve coverage on subsequent +fuzzing runs. + +This may help improving coverage in some targets, and do precisely nothing in +others. In some cases, it may even make things worse: if libtokencap picks up +syntax tokens that are not used to process the input data, but that are a part +of - say - parsing a config file... well, you're going to end up wasting a lot +of CPU time on trying them out in the input stream. In other words, use this +feature with care. Manually screening the resulting dictionary is almost +always a necessity. + +As for the actual operation: the library stores tokens, without any deduping, +by appending them to a file specified via AFL_TOKEN_FILE. If the variable is not +set, the tool uses stderr (which is probably not what you want). + +Similarly to afl-tmin, the library is not "proprietary" and can be used with +other fuzzers or testing tools without the need for any code tweaks. It does not +require AFL-instrumented binaries to work. + +To use the library, you *need* to make sure that your fuzzing target is compiled +with -fno-builtin and is linked dynamically. If you wish to automate the first +part without mucking with CFLAGS in Makefiles, you can set AFL_NO_BUILTIN=1 +when using afl-gcc. This setting specifically adds the following flags: + +``` + -fno-builtin-strcmp -fno-builtin-strncmp -fno-builtin-strcasecmp + -fno-builtin-strcasencmp -fno-builtin-memcmp -fno-builtin-strstr + -fno-builtin-strcasestr +``` + +The next step is simply loading this library via LD_PRELOAD. The optimal usage +pattern is to allow afl-fuzz to fuzz normally for a while and build up a corpus, +and then fire off the target binary, with libtokencap.so loaded, on every file +found by AFL in that earlier run. This demonstrates the basic principle: + +``` + export AFL_TOKEN_FILE=$PWD/temp_output.txt + + for i in /queue/id*; do + LD_PRELOAD=/path/to/libtokencap.so \ + /path/to/target/program [...params, including $i...] + done + + sort -u temp_output.txt >afl_dictionary.txt +``` + +If you don't get any results, the target library is probably not using strcmp() +and memcmp() to parse input; or you haven't compiled it with -fno-builtin; or +the whole thing isn't dynamically linked, and LD_PRELOAD is having no effect. + +Portability hints: There is probably no particularly portable and non-invasive +way to distinguish between read-only and read-write memory mappings. +The `__tokencap_load_mappings()` function is the only thing that would +need to be changed for other OSes. + +Current supported OSes are: Linux, Darwin, FreeBSD (thanks to @devnexen) + diff --git a/libtokencap/libtokencap.so.c b/libtokencap/libtokencap.so.c index 2fe9ae63..7495180d 100644 --- a/libtokencap/libtokencap.so.c +++ b/libtokencap/libtokencap.so.c @@ -15,7 +15,7 @@ This Linux-only companion library allows you to instrument strcmp(), memcmp(), and related functions to automatically extract tokens. - See README.tokencap for more info. + See README.tokencap.md for more info. */ -- cgit 1.4.1