-rw-r--r-- | README.md                     | 18
-rw-r--r-- | docs/Changelog.md             | 7
-rw-r--r-- | docs/FAQ.md                   | 124
-rw-r--r-- | docs/binaryonly_fuzzing.md    | 19
-rw-r--r-- | include/afl-fuzz.h            | 13
-rw-r--r-- | include/config.h              | 23
-rw-r--r-- | include/envs.h                | 2
-rw-r--r-- | llvm_mode/README.laf-intel.md | 4
-rw-r--r-- | llvm_mode/README.lto.md       | 2
-rw-r--r-- | llvm_mode/afl-clang-fast.c    | 3
-rw-r--r-- | src/afl-fuzz-one.c            | 250
-rw-r--r-- | src/afl-fuzz-queue.c          | 112
-rw-r--r-- | src/afl-fuzz-redqueen.c       | 102
-rw-r--r-- | src/afl-fuzz-state.c          | 15
-rw-r--r-- | src/afl-fuzz-stats.c          | 3
-rw-r--r-- | src/afl-fuzz.c                | 81
16 files changed, 700 insertions, 78 deletions
diff --git a/README.md b/README.md
index 7268f5d1..c6893fa0 100644
--- a/README.md
+++ b/README.md
@@ -193,6 +193,7 @@ Here are some good writeups to show how to effectively use AFL++:
   * [https://aflplus.plus/docs/tutorials/libxml2_tutorial/](https://aflplus.plus/docs/tutorials/libxml2_tutorial/)
   * [https://bananamafia.dev/post/gb-fuzz/](https://bananamafia.dev/post/gb-fuzz/)
   * [https://securitylab.github.com/research/fuzzing-challenges-solutions-1](https://securitylab.github.com/research/fuzzing-challenges-solutions-1)
+  * [https://securitylab.github.com/research/fuzzing-software-2](https://securitylab.github.com/research/fuzzing-software-2)
   * [https://securitylab.github.com/research/fuzzing-sockets-FTP](https://securitylab.github.com/research/fuzzing-sockets-FTP)
 
 If you are interested in fuzzing structured data (where you define what the
@@ -232,7 +233,7 @@ anything below 9 is not recommended.
  | clang/clang++ 11+ is available | --> use afl-clang-lto and afl-clang-lto++
  +--------------------------------+     see [llvm/README.lto.md](llvm/README.lto.md)
      |
-     | if not, or if the target fails with with afl-clang-lto/++
+     | if not, or if the target fails with afl-clang-lto/++
      |
      v
  +---------------------------------+
@@ -435,6 +436,9 @@ more useful.
 If you just use one CPU for fuzzing, then you are fuzzing just for fun and not
 seriously :-)
 
+Pro tip: load the [afl++ snapshot module](https://github.com/AFLplusplus/AFL-Snapshot-LKM) before starting afl-fuzz,
+as this improves performance with a roughly 2x speed increase!
+
 #### a) running afl-fuzz
 
 Before to do even a test run of afl-fuzz execute `sudo afl-system-config` (on
@@ -561,8 +565,20 @@ then you can expect that your fuzzing won't be fruitful anymore.
 However often this just means that you should switch out secondaries for
 others, e.g. custom mutator modules, sync to very different fuzzers, etc.
 
+#### f) improve the speed!
+
+ * Use [persistent mode](llvm_mode/README.persistent_mode.md) (x2-x20 speed increase)
+ * Use the [afl++ snapshot module](https://github.com/AFLplusplus/AFL-Snapshot-LKM) (x2 speed increase)
+ * If you do not use shmem persistent mode, use `AFL_TMPDIR` to point the input file to a tmpfs location, see [docs/env_variables.md](docs/env_variables.md)
+ * Improve kernel performance: modify `/etc/default/grub`, set `GRUB_CMDLINE_LINUX_DEFAULT="ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off spectre_v2=off stf_barrier=off"`; then `update-grub` and `reboot` (warning: makes the system more insecure)
+ * Running on an `ext2` filesystem with the `noatime` mount option will be a bit faster than on any journaling filesystem
+ * Use your cores! [3.b) Using multiple cores/threads](#b-using-multiple-coresthreads)
+
 ### The End
 
+Check out the [docs/FAQ](docs/FAQ.md) - it may answer a question that you
+might not even have known you had ;-).
+
 This is basically all you need to know to professionally run fuzzing campaigns.
 If you want to know more, the rest of this README and the tons of texts in
 [docs/](docs/) will have you covered.
diff --git a/docs/Changelog.md b/docs/Changelog.md
index 38787def..7efab1e6 100644
--- a/docs/Changelog.md
+++ b/docs/Changelog.md
@@ -14,6 +14,8 @@ sending a mail to <afl-users+subscribe@googlegroups.com>.
   - added -F option to allow -M main fuzzers to sync to foreign fuzzers, e.g.
    honggfuzz or libfuzzer
   - eliminated CPU affinity race condition for -S/-M runs
+  - expanded havoc mode added, on no cycle finds add extra splicing and
+    MOpt into the mix
   - llvm_mode:
     - now supports llvm 12!
     - fixes for laf-intel float splitting (thanks to mark-griffin for
@@ -21,10 +23,13 @@ sending a mail to <afl-users+subscribe@googlegroups.com>.
     - LTO: autodictionary mode is a default
     - LTO: instrim instrumentation disabled, only classic support used
       as it is always better
-    - added honggfuzz mangle as a custom mutator in custom_mutators/honggfuzz :)
+    - setting AFL_LLVM_LAF_SPLIT_FLOATS now activates
+      AFL_LLVM_LAF_SPLIT_COMPARES
+    - added honggfuzz mangle as a custom mutator in custom_mutators/honggfuzz
   - added afl-frida gum solution to examples/afl_frida (mostly imported
     from https://github.com/meme/hotwax/)
   - small fixes to afl-plot, afl-whatsup and man page creation
+  - new README, added FAQ
 
 ### Version ++2.66c (release)
diff --git a/docs/FAQ.md b/docs/FAQ.md
new file mode 100644
index 00000000..e09385a8
--- /dev/null
+++ b/docs/FAQ.md
@@ -0,0 +1,124 @@
+# Frequently asked questions about afl++
+
+## Contents
+
+ 1. [How to improve the fuzzing speed?](#how-to-improve-the-fuzzing-speed)
+ 2. [What is an edge?](#what-is-an-edge)
+ 3. [Why is my stability below 100%?](#why-is-my-stability-below-100)
+ 4. [How can I improve the stability value?](#how-can-i-improve-the-stability-value)
+
+If you find an interesting or important question missing, submit it via
+[https://github.com/AFLplusplus/AFLplusplus/issues](https://github.com/AFLplusplus/AFLplusplus/issues)
+
+## How to improve the fuzzing speed
+
+ 1. Use [llvm_mode](../llvm_mode/README.md): afl-clang-lto (llvm >= 11) or afl-clang-fast (llvm >= 9 recommended)
+ 2. Use [persistent mode](../llvm_mode/README.persistent_mode.md) (x2-x20 speed increase)
+ 3. Use the [afl++ snapshot module](https://github.com/AFLplusplus/AFL-Snapshot-LKM) (x2 speed increase)
+ 4. If you do not use shmem persistent mode, use `AFL_TMPDIR` to point the input file to a tmpfs location, see [docs/env_variables.md](docs/env_variables.md)
+ 5. Improve kernel performance: modify `/etc/default/grub`, set `GRUB_CMDLINE_LINUX_DEFAULT="ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off spectre_v2=off stf_barrier=off"`; then `update-grub` and `reboot` (warning: makes the system more insecure)
+ 6. Running on an `ext2` filesystem with the `noatime` mount option will be a bit faster than on any journaling filesystem
+ 7. Use your cores! [README.md:3.b) Using multiple cores/threads](../README.md#b-using-multiple-coresthreads)
+
+## What is an "edge"
+
+A program contains `functions`, and `functions` contain compiled machine code.
+The compiled machine code in a `function` can consist of one or many `basic blocks`.
+A `basic block` is the largest possible sequence of consecutive machine code
+instructions that executes as a unit, meaning control flow neither branches
+out of its middle nor jumps into it from a different location:
+```
+function() {
+  A:
+    some
+    code
+  B:
+    if (x) goto C; else goto D;
+  C:
+    some code
+    goto D
+  D:
+    some code
+    goto B
+  E:
+    return
+}
+```
+Every code block between two jump locations is a `basic block`.
+
+An `edge` is then the unique relationship between two `basic blocks` (from the
+code example above):
+```
+              Block A
+                |
+                v
+              Block B  <------+
+             /        \       |
+            v          v      |
+        Block C      Block D --+
+             \
+              v
+            Block E
+```
+Every line between two blocks is an `edge`.
+
+## Why is my stability below 100
+
+Stability is measured as the percentage of edges in the target that are
+"stable". Sending the same input again and again should take the exact same
+path through the target every time. If that is the case, the stability is 100%.
+
+If however there is randomness, e.g. a thread reading from shared memory,
+reaction to timing, etc., then some re-executions with the same data will
+result in different edge information across runs.
+Those edges that change are then flagged "unstable".
+
+The more "unstable" edges there are, the more difficult it is for afl++ to
+identify valid new paths.
+
+A value above 90% is usually fine, a value above 80% is still ok, and even
+a value above 20% can still result in successful bug finds.
+However, once the stability drops below 90% (and certainly below 80%) it is
+recommended to take measures to improve it.
+
+## How can I improve the stability value
+
+Improving stability takes four steps, requires quite some knowledge of coding
+and/or disassembly, and is only effectively possible with afl-clang-fast
+PCGUARD and afl-clang-lto LTO instrumentation!
+
+ 1. First step: Identify which edge ID numbers are unstable
+
+    Run the target with `export AFL_DEBUG=1` for a few minutes, then
+    terminate it. The out/fuzzer_stats file will then show the edge IDs
+    that were identified as unstable.
+
+ 2. Second step: Find the responsible function.
+
+    a) For LTO instrumented binaries just disassemble or decompile the target
+       and look at which function is writing to that edge ID. Ghidra is a good
+       tool for this: [https://ghidra-sre.org/](https://ghidra-sre.org/)
+
+    b) For PCGUARD instrumented binaries it is more difficult. Here you can
+       either modify the __sanitizer_cov_trace_pc_guard function in
+       llvm_mode/afl-llvm-rt.o.c to write a backtrace to a file if the ID in
+       __afl_area_ptr[*guard] is one of the unstable edge IDs. Then recompile
+       and reinstall llvm_mode and rebuild your target. Run the recompiled
+       target with afl-fuzz for a while and then check the file that you
+       wrote with the backtrace information.
+       Alternatively you can use `gdb` to hook __sanitizer_cov_trace_pc_guard_init
+       on start, check to which memory address the edge ID value is written
+       and set a write breakpoint to that address (`watch 0x.....`).
+
+ 3. Third step: create a text file with the filenames
+
+    Identify which source code files contain the functions that you need to
+    remove from instrumentation.
+
+    Simply follow this document on how to do this: [llvm_mode/README.instrument_file.md](llvm_mode/README.instrument_file.md)
+    If PCGUARD is used, then you need to follow this guide: [http://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation](http://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation)
+
+ 4. Fourth step: recompile the target
+
+    Recompile, fuzz it, be happy :)
+
diff --git a/docs/binaryonly_fuzzing.md b/docs/binaryonly_fuzzing.md
index 7c9be418..111147e2 100644
--- a/docs/binaryonly_fuzzing.md
+++ b/docs/binaryonly_fuzzing.md
@@ -8,12 +8,17 @@
 The following is a description of how these binaries can be fuzzed with afl++
+
 ## TL;DR:
 qemu_mode in persistent mode is the fastest - if the stability is high enough.
Otherwise try retrowrite, afl-dyninst and if these fail too then standard qemu_mode with AFL_ENTRYPOINT to where you need it. + If your a target is library use examples/afl_frida/. + + If your target is non-linux then use unicorn_mode/ + ## QEMU @@ -57,6 +62,20 @@ As it is included in afl++ this needs no URL. +## AFL FRIDA + + If you want to fuzz a binary-only shared library then you can fuzz it with + frida-gum via examples/afl_frida/, you will have to write a harness to + call the target function in the library, use afl-frida.c as a template. + + +## AFL UNTRACER + + If you want to fuzz a binary-only shared library then you can fuzz it with + examples/afl_untracer/, use afl-untracer.c as a template. + It is slower than AFL FRIDA (see above). + + ## DYNINST Dyninst is a binary instrumentation framework similar to Pintool and diff --git a/include/afl-fuzz.h b/include/afl-fuzz.h index c0c4cfd5..1c1be711 100644 --- a/include/afl-fuzz.h +++ b/include/afl-fuzz.h @@ -139,7 +139,8 @@ struct queue_entry { var_behavior, /* Variable behavior? */ favored, /* Currently favored? */ fs_redundant, /* Marked as redundant in the fs? */ - fully_colorized; /* Do not run redqueen stage again */ + fully_colorized, /* Do not run redqueen stage again */ + is_ascii; /* Is the input just ascii text? */ u32 bitmap_size, /* Number of bits set in bitmap */ fuzz_level; /* Number of fuzzing iterations */ @@ -333,7 +334,7 @@ typedef struct afl_env_vars { afl_dumb_forksrv, afl_import_first, afl_custom_mutator_only, afl_no_ui, afl_force_ui, afl_i_dont_care_about_missing_crashes, afl_bench_just_one, afl_bench_until_crash, afl_debug_child_output, afl_autoresume, - afl_cal_fast; + afl_cal_fast, afl_cycle_schedules, afl_expand_havoc; u8 *afl_tmpdir, *afl_custom_mutator_library, *afl_python_module, *afl_path, *afl_hang_tmout, *afl_skip_crashes, *afl_preload; @@ -461,7 +462,9 @@ typedef struct afl_state { fixed_seed, /* do not reseed */ fast_cal, /* Try to calibrate faster? */ disable_trim, /* Never trim in fuzz_one */ - shmem_testcase_mode; /* If sharedmem testcases are used */ + shmem_testcase_mode, /* If sharedmem testcases are used */ + expand_havoc, /* perform expensive havoc after no find */ + cycle_schedules; /* cycle power schedules ? */ u8 *virgin_bits, /* Regions yet untouched by fuzzing */ *virgin_tmout, /* Bits we haven't seen in tmouts */ @@ -553,6 +556,10 @@ typedef struct afl_state { *queue_top, /* Top of the list */ *q_prev100; /* Previous 100 marker */ + // growing buf + struct queue_entry **queue_buf; + size_t queue_size; + struct queue_entry **top_rated; /* Top entries for bitmap bytes */ struct extra_data *extras; /* Extra tokens to fuzz with */ diff --git a/include/config.h b/include/config.h index 4503c3e9..344a368f 100644 --- a/include/config.h +++ b/include/config.h @@ -401,5 +401,28 @@ // #define IGNORE_FINDS +/* Text mutations */ + +/* Minimum length of a queue input to be evaluated for "is_ascii"? */ + +#define AFL_TXT_MIN_LEN 12 + +/* What is the minimum percentage of ascii characters present to be classifed + as "is_ascii"? */ + +#define AFL_TXT_MIN_PERCENT 94 + +/* How often to perform ASCII mutations 0 = disable, 1-8 are good values */ + +#define AFL_TXT_BIAS 6 + +/* Maximum length of a string to tamper with */ + +#define AFL_TXT_STRING_MAX_LEN 1024 + +/* Maximum mutations on a string */ + +#define AFL_TXT_STRING_MAX_MUTATIONS 6 + #endif /* ! 
_HAVE_CONFIG_H */ diff --git a/include/envs.h b/include/envs.h index 86222418..c1c7d387 100644 --- a/include/envs.h +++ b/include/envs.h @@ -34,6 +34,7 @@ static char *afl_environment_variables[] = { "AFL_CUSTOM_MUTATOR_LIBRARY", "AFL_CUSTOM_MUTATOR_ONLY", "AFL_CXX", + "AFL_CYCLE_SCHEDULES", "AFL_DEBUG", "AFL_DEBUG_CHILD_OUTPUT", "AFL_DEBUG_GDB", @@ -129,6 +130,7 @@ static char *afl_environment_variables[] = { "AFL_USE_CFISAN", "AFL_WINE_PATH", "AFL_NO_SNAPSHOT", + "AFL_EXPAND_HAVOC_NOW", NULL }; diff --git a/llvm_mode/README.laf-intel.md b/llvm_mode/README.laf-intel.md index 2fa4bc26..f63ab2bb 100644 --- a/llvm_mode/README.laf-intel.md +++ b/llvm_mode/README.laf-intel.md @@ -35,8 +35,8 @@ bit_width may be 64, 32 or 16. A new experimental feature is splitting floating point comparisons into a series of sign, exponent and mantissa comparisons followed by splitting each of them into 8 bit comparisons when necessary. -It is activated with the `AFL_LLVM_LAF_SPLIT_FLOATS` setting, available only -when `AFL_LLVM_LAF_SPLIT_COMPARES` is set. +It is activated with the `AFL_LLVM_LAF_SPLIT_FLOATS` setting. +Note that setting this automatically activates `AFL_LLVM_LAF_SPLIT_COMPARES` You can also set `AFL_LLVM_LAF_ALL` and have all of the above enabled :-) diff --git a/llvm_mode/README.lto.md b/llvm_mode/README.lto.md index d54d4ee0..a4c969b9 100644 --- a/llvm_mode/README.lto.md +++ b/llvm_mode/README.lto.md @@ -157,7 +157,7 @@ instrument it: when compiling, so we have to trick configure: ``` -./configure --enable-lto --disable-shared +./configure --enable-lto --disable-shared --disable-inline-asm ``` 3. Now the configuration is done - and we edit the settings in `./ffbuild/config.mak` diff --git a/llvm_mode/afl-clang-fast.c b/llvm_mode/afl-clang-fast.c index 4d01e740..dca11bf3 100644 --- a/llvm_mode/afl-clang-fast.c +++ b/llvm_mode/afl-clang-fast.c @@ -268,7 +268,8 @@ static void edit_params(u32 argc, char **argv, char **envp) { } - if (getenv("LAF_SPLIT_COMPARES") || getenv("AFL_LLVM_LAF_SPLIT_COMPARES")) { + if (getenv("LAF_SPLIT_COMPARES") || getenv("AFL_LLVM_LAF_SPLIT_COMPARES") || + getenv("AFL_LLVM_LAF_SPLIT_FLOATS")) { cc_params[cc_par_cnt++] = "-Xclang"; cc_params[cc_par_cnt++] = "-load"; diff --git a/src/afl-fuzz-one.c b/src/afl-fuzz-one.c index 72383727..1f0bf30e 100644 --- a/src/afl-fuzz-one.c +++ b/src/afl-fuzz-one.c @@ -24,6 +24,8 @@ */ #include "afl-fuzz.h" +#include <string.h> +#include <limits.h> /* MOpt */ @@ -362,6 +364,8 @@ static void locate_diffs(u8 *ptr1, u8 *ptr2, u32 len, s32 *first, s32 *last) { #endif /* !IGNORE_FINDS */ +#define BUF_PARAMS(name) (void **)&afl->name##_buf, &afl->name##_size + /* Take the current entry from the queue, fuzz it for a while. This function is a tad too long... returns 0 if fuzzed successfully, 1 if skipped or bailed out. */ @@ -1854,6 +1858,21 @@ havoc_stage: /* We essentially just do several thousand runs (depending on perf_score) where we take the input file and make random stacked tweaks. */ + u32 r_max, r; + + if (unlikely(afl->expand_havoc)) { + + /* add expensive havoc cases here, they are activated after a full + cycle without finds happened */ + + r_max = 16 + ((afl->extras_cnt + afl->a_extras_cnt) ? 2 : 0); + + } else { + + r_max = 15 + ((afl->extras_cnt + afl->a_extras_cnt) ? 
2 : 0); + + } + for (afl->stage_cur = 0; afl->stage_cur < afl->stage_max; ++afl->stage_cur) { u32 use_stacking = 1 << (1 + rand_below(afl, HAVOC_STACK_POW2)); @@ -1896,8 +1915,7 @@ havoc_stage: } - switch (rand_below( - afl, 15 + ((afl->extras_cnt + afl->a_extras_cnt) ? 2 : 0))) { + switch ((r = rand_below(afl, r_max))) { case 0: @@ -2192,85 +2210,198 @@ havoc_stage: } - /* Values 15 and 16 can be selected only if there are any extras - present in the dictionaries. */ + default: - case 15: { + if (likely(r <= 16 && (afl->extras_cnt || afl->a_extras_cnt))) { - /* Overwrite bytes with an extra. */ + /* Values 15 and 16 can be selected only if there are any extras + present in the dictionaries. */ - if (!afl->extras_cnt || (afl->a_extras_cnt && rand_below(afl, 2))) { + if (r == 15) { - /* No user-specified extras or odds in our favor. Let's use an - auto-detected one. */ + /* Overwrite bytes with an extra. */ - u32 use_extra = rand_below(afl, afl->a_extras_cnt); - u32 extra_len = afl->a_extras[use_extra].len; - u32 insert_at; + if (!afl->extras_cnt || + (afl->a_extras_cnt && rand_below(afl, 2))) { - if (extra_len > temp_len) { break; } + /* No user-specified extras or odds in our favor. Let's use an + auto-detected one. */ - insert_at = rand_below(afl, temp_len - extra_len + 1); - memcpy(out_buf + insert_at, afl->a_extras[use_extra].data, - extra_len); + u32 use_extra = rand_below(afl, afl->a_extras_cnt); + u32 extra_len = afl->a_extras[use_extra].len; + u32 insert_at; - } else { + if (extra_len > temp_len) { break; } - /* No auto extras or odds in our favor. Use the dictionary. */ + insert_at = rand_below(afl, temp_len - extra_len + 1); + memcpy(out_buf + insert_at, afl->a_extras[use_extra].data, + extra_len); - u32 use_extra = rand_below(afl, afl->extras_cnt); - u32 extra_len = afl->extras[use_extra].len; - u32 insert_at; + } else { - if (extra_len > temp_len) { break; } + /* No auto extras or odds in our favor. Use the dictionary. */ - insert_at = rand_below(afl, temp_len - extra_len + 1); - memcpy(out_buf + insert_at, afl->extras[use_extra].data, extra_len); + u32 use_extra = rand_below(afl, afl->extras_cnt); + u32 extra_len = afl->extras[use_extra].len; + u32 insert_at; - } + if (extra_len > temp_len) { break; } - break; + insert_at = rand_below(afl, temp_len - extra_len + 1); + memcpy(out_buf + insert_at, afl->extras[use_extra].data, + extra_len); - } + } - case 16: { + break; - u32 use_extra, extra_len, insert_at = rand_below(afl, temp_len + 1); - u8 *ptr; + } else { // case 16 + + u32 use_extra, extra_len, + insert_at = rand_below(afl, temp_len + 1); + u8 *ptr; + + /* Insert an extra. Do the same dice-rolling stuff as for the + previous case. */ + + if (!afl->extras_cnt || + (afl->a_extras_cnt && rand_below(afl, 2))) { + + use_extra = rand_below(afl, afl->a_extras_cnt); + extra_len = afl->a_extras[use_extra].len; + ptr = afl->a_extras[use_extra].data; + + } else { + + use_extra = rand_below(afl, afl->extras_cnt); + extra_len = afl->extras[use_extra].len; + ptr = afl->extras[use_extra].data; + + } + + if (temp_len + extra_len >= MAX_FILE) { break; } + + out_buf = ck_maybe_grow(BUF_PARAMS(out), temp_len + extra_len); + + /* Tail */ + memmove(out_buf + insert_at + extra_len, out_buf + insert_at, + temp_len - insert_at); + + /* Inserted part */ + memcpy(out_buf + insert_at, ptr, extra_len); - /* Insert an extra. Do the same dice-rolling stuff as for the - previous case. 
*/ + temp_len += extra_len; - if (!afl->extras_cnt || (afl->a_extras_cnt && rand_below(afl, 2))) { + break; - use_extra = rand_below(afl, afl->a_extras_cnt); - extra_len = afl->a_extras[use_extra].len; - ptr = afl->a_extras[use_extra].data; + } } else { - use_extra = rand_below(afl, afl->extras_cnt); - extra_len = afl->extras[use_extra].len; - ptr = afl->extras[use_extra].data; + /* + switch (r) { - } + case 15: // fall through + case 16: + case 17: {*/ - if (temp_len + extra_len >= MAX_FILE) { break; } + /* Overwrite bytes with a randomly selected chunk from another + testcase or insert that chunk. */ - out_buf = ck_maybe_grow(BUF_PARAMS(out), temp_len + extra_len); + if (afl->queued_paths < 4) break; - /* Tail */ - memmove(out_buf + insert_at + extra_len, out_buf + insert_at, - temp_len - insert_at); + /* Pick a random queue entry and seek to it. */ - /* Inserted part */ - memcpy(out_buf + insert_at, ptr, extra_len); + u32 tid; + do + tid = rand_below(afl, afl->queued_paths); + while (tid == afl->current_entry); - temp_len += extra_len; + struct queue_entry *target = afl->queue_buf[tid]; - break; + /* Make sure that the target has a reasonable length. */ - } + while (target && (target->len < 2 || target == afl->queue_cur)) + target = target->next; + + if (!target) break; + + /* Read the testcase into a new buffer. */ + + fd = open(target->fname, O_RDONLY); + + if (unlikely(fd < 0)) { + + PFATAL("Unable to open '%s'", target->fname); + + } + + u32 new_len = target->len; + u8 *new_buf = ck_maybe_grow(BUF_PARAMS(in_scratch), new_len); + + ck_read(fd, new_buf, new_len, target->fname); + + close(fd); + + u8 overwrite = 0; + if (temp_len >= 2 && rand_below(afl, 2)) + overwrite = 1; + else if (temp_len + HAVOC_BLK_XL >= MAX_FILE) { + + if (temp_len >= 2) + overwrite = 1; + else + break; + + } + + if (overwrite) { + + u32 copy_from, copy_to, copy_len; + + copy_len = choose_block_len(afl, new_len - 1); + if (copy_len > temp_len) copy_len = temp_len; + + copy_from = rand_below(afl, new_len - copy_len + 1); + copy_to = rand_below(afl, temp_len - copy_len + 1); + + memmove(out_buf + copy_to, new_buf + copy_from, copy_len); + + } else { + + u32 clone_from, clone_to, clone_len; + + clone_len = choose_block_len(afl, new_len); + clone_from = rand_below(afl, new_len - clone_len + 1); + + clone_to = rand_below(afl, temp_len); + + u8 *temp_buf = + ck_maybe_grow(BUF_PARAMS(out_scratch), temp_len + clone_len); + + /* Head */ + + memcpy(temp_buf, out_buf, clone_to); + + /* Inserted part */ + + memcpy(temp_buf + clone_to, new_buf + clone_from, clone_len); + + /* Tail */ + memcpy(temp_buf + clone_to + clone_len, out_buf + clone_to, + temp_len - clone_to); + + swap_bufs(BUF_PARAMS(out), BUF_PARAMS(out_scratch)); + out_buf = temp_buf; + temp_len += clone_len; + + } + + break; + + } + + // end of default: } @@ -2357,20 +2488,7 @@ retry_splicing: } while (tid == afl->current_entry); afl->splicing_with = tid; - target = afl->queue; - - while (tid >= 100) { - - target = target->next_100; - tid -= 100; - - } - - while (tid--) { - - target = target->next; - - } + target = afl->queue_buf[tid]; /* Make sure that the target has a reasonable length. 
*/ @@ -4750,7 +4868,7 @@ u8 fuzz_one(afl_state_t *afl) { return (key_val_lv_1 | key_val_lv_2); -#undef BUF_PARAMS - } +#undef BUF_PARAMS + diff --git a/src/afl-fuzz-queue.c b/src/afl-fuzz-queue.c index 7afdd9f1..38e95ac8 100644 --- a/src/afl-fuzz-queue.c +++ b/src/afl-fuzz-queue.c @@ -24,6 +24,9 @@ #include "afl-fuzz.h" #include <limits.h> +#include <ctype.h> + +#define BUF_PARAMS(name) (void **)&afl->name##_buf, &afl->name##_size /* Mark deterministic checks as done for a particular queue entry. We use the .state file to avoid repeating deterministic fuzzing when resuming aborted @@ -100,6 +103,108 @@ void mark_as_redundant(afl_state_t *afl, struct queue_entry *q, u8 state) { } +/* check if ascii or UTF-8 */ + +static u8 check_if_text(struct queue_entry *q) { + + if (q->len < AFL_TXT_MIN_LEN) return 0; + + u8 buf[MAX_FILE]; + s32 fd, len = q->len, offset = 0, ascii = 0, utf8 = 0, comp; + + if ((fd = open(q->fname, O_RDONLY)) < 0) return 0; + if ((comp = read(fd, buf, len)) != len) return 0; + close(fd); + + while (offset < len) { + + // ASCII: <= 0x7F to allow ASCII control characters + if ((buf[offset + 0] == 0x09 || buf[offset + 0] == 0x0A || + buf[offset + 0] == 0x0D || + (0x20 <= buf[offset + 0] && buf[offset + 0] <= 0x7E))) { + + offset++; + utf8++; + ascii++; + continue; + + } + + if (isascii((int)buf[offset]) || isprint((int)buf[offset])) { + + ascii++; + // we continue though as it can also be a valid utf8 + + } + + // non-overlong 2-byte + if (((0xC2 <= buf[offset + 0] && buf[offset + 0] <= 0xDF) && + (0x80 <= buf[offset + 1] && buf[offset + 1] <= 0xBF))) { + + offset += 2; + utf8++; + comp--; + continue; + + } + + // excluding overlongs + if ((buf[offset + 0] == 0xE0 && + (0xA0 <= buf[offset + 1] && buf[offset + 1] <= 0xBF) && + (0x80 <= buf[offset + 2] && + buf[offset + 2] <= 0xBF)) || // straight 3-byte + (((0xE1 <= buf[offset + 0] && buf[offset + 0] <= 0xEC) || + buf[offset + 0] == 0xEE || buf[offset + 0] == 0xEF) && + (0x80 <= buf[offset + 1] && buf[offset + 1] <= 0xBF) && + (0x80 <= buf[offset + 2] && + buf[offset + 2] <= 0xBF)) || // excluding surrogates + (buf[offset + 0] == 0xED && + (0x80 <= buf[offset + 1] && buf[offset + 1] <= 0x9F) && + (0x80 <= buf[offset + 2] && buf[offset + 2] <= 0xBF))) { + + offset += 3; + utf8++; + comp -= 2; + continue; + + } + + // planes 1-3 + if ((buf[offset + 0] == 0xF0 && + (0x90 <= buf[offset + 1] && buf[offset + 1] <= 0xBF) && + (0x80 <= buf[offset + 2] && buf[offset + 2] <= 0xBF) && + (0x80 <= buf[offset + 3] && + buf[offset + 3] <= 0xBF)) || // planes 4-15 + ((0xF1 <= buf[offset + 0] && buf[offset + 0] <= 0xF3) && + (0x80 <= buf[offset + 1] && buf[offset + 1] <= 0xBF) && + (0x80 <= buf[offset + 2] && buf[offset + 2] <= 0xBF) && + (0x80 <= buf[offset + 3] && buf[offset + 3] <= 0xBF)) || // plane 16 + (buf[offset + 0] == 0xF4 && + (0x80 <= buf[offset + 1] && buf[offset + 1] <= 0x8F) && + (0x80 <= buf[offset + 2] && buf[offset + 2] <= 0xBF) && + (0x80 <= buf[offset + 3] && buf[offset + 3] <= 0xBF))) { + + offset += 4; + utf8++; + comp -= 3; + continue; + + } + + offset++; + + } + + u32 percent_utf8 = (utf8 * 100) / comp; + u32 percent_ascii = (ascii * 100) / len; + + if (percent_utf8 >= percent_ascii && percent_utf8 >= AFL_TXT_MIN_PERCENT) + return 2; + if (percent_ascii >= AFL_TXT_MIN_PERCENT) return 1; + return 0; + +} + /* Append new test case to the queue. 
*/ void add_to_queue(afl_state_t *afl, u8 *fname, u32 len, u8 passed_det) { @@ -138,6 +243,10 @@ void add_to_queue(afl_state_t *afl, u8 *fname, u32 len, u8 passed_det) { } + struct queue_entry **queue_buf = ck_maybe_grow( + BUF_PARAMS(queue), afl->queued_paths * sizeof(struct queue_entry *)); + queue_buf[afl->queued_paths - 1] = q; + afl->last_path_time = get_cur_time(); if (afl->custom_mutators_count) { @@ -159,6 +268,9 @@ void add_to_queue(afl_state_t *afl, u8 *fname, u32 len, u8 passed_det) { } + /* only redqueen currently uses is_ascii */ + if (afl->shm.cmplog_mode) q->is_ascii = check_if_text(q); + } /* Destroy the entire queue. */ diff --git a/src/afl-fuzz-redqueen.c b/src/afl-fuzz-redqueen.c index c53e0e06..57e60c3d 100644 --- a/src/afl-fuzz-redqueen.c +++ b/src/afl-fuzz-redqueen.c @@ -24,6 +24,7 @@ */ +#include <limits.h> #include "afl-fuzz.h" #include "cmplog.h" @@ -262,6 +263,58 @@ static u8 its_fuzz(afl_state_t *afl, u8 *buf, u32 len, u8 *status) { } +static long long strntoll(const char *str, size_t sz, char **end, int base) { + + char buf[64]; + long long ret; + const char *beg = str; + + for (; beg && sz && *beg == ' '; beg++, sz--) + ; + + if (!sz || sz >= sizeof(buf)) { + + if (end) *end = (char *)str; + return 0; + + } + + memcpy(buf, beg, sz); + buf[sz] = '\0'; + ret = strtoll(buf, end, base); + if (ret == LLONG_MIN || ret == LLONG_MAX) return ret; + if (end) *end = (char *)beg + (*end - buf); + return ret; + +} + +static unsigned long long strntoull(const char *str, size_t sz, char **end, + int base) { + + char buf[64]; + unsigned long long ret; + const char * beg = str; + + for (; beg && sz && *beg == ' '; beg++, sz--) + ; + + if (!sz || sz >= sizeof(buf)) { + + if (end) *end = (char *)str; + return 0; + + } + + memcpy(buf, beg, sz); + buf[sz] = '\0'; + ret = strtoull(buf, end, base); + if (end) *end = (char *)beg + (*end - buf); + return ret; + +} + +#define BUF_PARAMS(name) (void **)&afl->name##_buf, &afl->name##_size + static u8 cmp_extend_encoding(afl_state_t *afl, struct cmp_header *h, u64 pattern, u64 repl, u64 o_pattern, u32 idx, u8 *orig_buf, u8 *buf, u32 len, u8 do_reverse, @@ -279,7 +332,54 @@ static u8 cmp_extend_encoding(afl_state_t *afl, struct cmp_header *h, u32 its_len = len - idx; // *status = 0; - if (SHAPE_BYTES(h->shape) >= 8) { + u8 * endptr; + u8 use_num = 0, use_unum = 0; + unsigned long long unum; + long long num; + if (afl->queue_cur->is_ascii) { + + endptr = buf_8; + num = strntoll(buf_8, len - idx, (char **)&endptr, 0); + if (endptr == buf_8) { + + unum = strntoull(buf_8, len - idx, (char **)&endptr, 0); + if (endptr == buf_8) use_unum = 1; + + } else + + use_num = 1; + + } + + if (use_num && num == pattern) { + + size_t old_len = endptr - buf_8; + size_t num_len = snprintf(NULL, 0, "%lld", num); + + u8 *new_buf = ck_maybe_grow(BUF_PARAMS(out_scratch), len + num_len); + memcpy(new_buf, buf, idx); + + snprintf(new_buf + idx, num_len, "%lld", num); + memcpy(new_buf + idx + num_len, buf_8 + old_len, len - idx - old_len); + + if (unlikely(its_fuzz(afl, new_buf, len, status))) { return 1; } + + } else if (use_unum && unum == pattern) { + + size_t old_len = endptr - buf_8; + size_t num_len = snprintf(NULL, 0, "%llu", unum); + + u8 *new_buf = ck_maybe_grow(BUF_PARAMS(out_scratch), len + num_len); + memcpy(new_buf, buf, idx); + + snprintf(new_buf + idx, num_len, "%llu", unum); + memcpy(new_buf + idx + num_len, buf_8 + old_len, len - idx - old_len); + + if (unlikely(its_fuzz(afl, new_buf, len, status))) { return 1; } + + } + + if (SHAPE_BYTES(h->shape) 
>= 8 && *status != 1) { if (its_len >= 8 && *buf_64 == pattern && *o_buf_64 == o_pattern) { diff --git a/src/afl-fuzz-state.c b/src/afl-fuzz-state.c index e0e43f54..66280ed1 100644 --- a/src/afl-fuzz-state.c +++ b/src/afl-fuzz-state.c @@ -293,6 +293,20 @@ void read_afl_environment(afl_state_t *afl, char **envp) { afl->afl_env.afl_autoresume = get_afl_env(afl_environment_variables[i]) ? 1 : 0; + } else if (!strncmp(env, "AFL_CYCLE_SCHEDULES", + + afl_environment_variable_len)) { + + afl->cycle_schedules = afl->afl_env.afl_cycle_schedules = + get_afl_env(afl_environment_variables[i]) ? 1 : 0; + + } else if (!strncmp(env, "AFL_EXPAND_HAVOC_NOW", + + afl_environment_variable_len)) { + + afl->expand_havoc = afl->afl_env.afl_expand_havoc = + get_afl_env(afl_environment_variables[i]) ? 1 : 0; + } else if (!strncmp(env, "AFL_CAL_FAST", afl_environment_variable_len)) { @@ -405,6 +419,7 @@ void afl_state_deinit(afl_state_t *afl) { if (afl->pass_stats) { ck_free(afl->pass_stats); } if (afl->orig_cmp_map) { ck_free(afl->orig_cmp_map); } + if (afl->queue_buf) { free(afl->queue_buf); } if (afl->out_buf) { free(afl->out_buf); } if (afl->out_scratch_buf) { free(afl->out_scratch_buf); } if (afl->eff_buf) { free(afl->eff_buf); } diff --git a/src/afl-fuzz-stats.c b/src/afl-fuzz-stats.c index 2546a57a..7b30b5ea 100644 --- a/src/afl-fuzz-stats.c +++ b/src/afl-fuzz-stats.c @@ -115,6 +115,7 @@ void write_stats_file(afl_state_t *afl, double bitmap_cvg, double stability, "cpu_affinity : %d\n" "edges_found : %u\n" "var_byte_count : %u\n" + "havoc_expansion : %u\n" "afl_banner : %s\n" "afl_version : " VERSION "\n" @@ -148,7 +149,7 @@ void write_stats_file(afl_state_t *afl, double bitmap_cvg, double stability, #else -1, #endif - t_bytes, afl->var_byte_count, afl->use_banner, + t_bytes, afl->var_byte_count, afl->expand_havoc, afl->use_banner, afl->unicorn_mode ? "unicorn" : "", afl->fsrv.qemu_mode ? "qemu " : "", afl->non_instrumented_mode ? " non_instrumented " : "", diff --git a/src/afl-fuzz.c b/src/afl-fuzz.c index eb4b6a87..5bedf6e1 100644 --- a/src/afl-fuzz.c +++ b/src/afl-fuzz.c @@ -916,6 +916,7 @@ int main(int argc, char **argv_orig, char **envp) { if (get_afl_env("AFL_NO_ARITH")) { afl->no_arith = 1; } if (get_afl_env("AFL_SHUFFLE_QUEUE")) { afl->shuffle_queue = 1; } if (get_afl_env("AFL_FAST_CAL")) { afl->fast_cal = 1; } + if (get_afl_env("AFL_EXPAND_HAVOC_NOW")) { afl->expand_havoc = 1; } if (afl->afl_env.afl_autoresume) { @@ -1271,11 +1272,42 @@ int main(int argc, char **argv_orig, char **envp) { /* If we had a full queue cycle with no new finds, try recombination strategies next. 
*/ - if (afl->queued_paths == prev_queued) { + if (afl->queued_paths == prev_queued && + (get_cur_time() - afl->start_time) >= 3600) { if (afl->use_splicing) { ++afl->cycles_wo_finds; + switch (afl->expand_havoc) { + + case 0: + afl->expand_havoc = 1; + break; + case 1: + if (afl->limit_time_sig == 0) { + + afl->limit_time_sig = -1; + afl->limit_time_puppet = 0; + + } + + afl->expand_havoc = 2; + break; + case 2: + // afl->cycle_schedules = 1; + afl->expand_havoc = 3; + break; + case 3: + // nothing else currently + break; + + } + + if (afl->expand_havoc) { + + } else + + afl->expand_havoc = 1; } else { @@ -1289,6 +1321,53 @@ int main(int argc, char **argv_orig, char **envp) { } + if (afl->cycle_schedules) { + + /* we cannot mix non-AFLfast schedules with others */ + + switch (afl->schedule) { + + case EXPLORE: + afl->schedule = EXPLOIT; + break; + case EXPLOIT: + afl->schedule = MMOPT; + break; + case MMOPT: + afl->schedule = SEEK; + break; + case SEEK: + afl->schedule = EXPLORE; + break; + case FAST: + afl->schedule = COE; + break; + case COE: + afl->schedule = LIN; + break; + case LIN: + afl->schedule = QUAD; + break; + case QUAD: + afl->schedule = RARE; + break; + case RARE: + afl->schedule = FAST; + break; + + } + + struct queue_entry *q = afl->queue; + // we must recalculate the scores of all queue entries + while (q) { + + update_bitmap_score(afl, q); + q = q->next; + + } + + } + prev_queued = afl->queued_paths; if (afl->sync_id && afl->queue_cycle == 1 && |