From 5ec859cece70ab1b5cd9e0356c4cc3e260d2cbe0 Mon Sep 17 00:00:00 2001 From: llzmb <46303940+llzmb@users.noreply.github.com> Date: Sat, 20 Nov 2021 15:48:49 +0100 Subject: Clean up docs folder --- docs/best_practices.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'docs/best_practices.md') diff --git a/docs/best_practices.md b/docs/best_practices.md index 5d07dd14..7016f08d 100644 --- a/docs/best_practices.md +++ b/docs/best_practices.md @@ -48,7 +48,7 @@ to emulate the network. This is also much faster than the real network would be. See [utils/socket_fuzzing/](../utils/socket_fuzzing/). There is an outdated AFL++ branch that implements networking if you are -desperate though: [https://github.com/AFLplusplus/AFLplusplus/tree/networking](https://github.com/AFLplusplus/AFLplusplus/tree/networking) - +desperate though: [https://github.com/AFLplusplus/AFLplusplus/tree/networking](https://github.com/AFLplusplus/AFLplusplus/tree/networking) - however a better option is AFLnet ([https://github.com/aflnet/aflnet](https://github.com/aflnet/aflnet)) which allows you to define network state with different type of data packets. @@ -62,7 +62,7 @@ which allows you to define network state with different type of data packets. 4. If you do not use shmem persistent mode, use `AFL_TMPDIR` to put the input file directory on a tempfs location, see [env_variables.md](env_variables.md). 5. Improve Linux kernel performance: modify `/etc/default/grub`, set `GRUB_CMDLINE_LINUX_DEFAULT="ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off spectre_v2=off stf_barrier=off"`; then `update-grub` and `reboot` (warning: makes the system less secure). 6. Running on an `ext2` filesystem with `noatime` mount option will be a bit faster than on any other journaling filesystem. -7. Use your cores! 
[fuzzing_expert.md:b) Using multiple cores](fuzzing_expert.md#b-using-multiple-cores). +7. Use your cores ([fuzzing_in_depth.md:b) Using multiple cores](fuzzing_in_depth.md#b-using-multiple-cores))! ### Improving stability -- cgit 1.4.1 From 492dbe9fb294dec27e5c2bc7297b36526bb8e61f Mon Sep 17 00:00:00 2001 From: llzmb <46303940+llzmb@users.noreply.github.com> Date: Sun, 21 Nov 2021 18:00:01 +0100 Subject: Clean up docs folder --- README.md | 5 +- docs/FAQ.md | 5 +- docs/best_practices.md | 18 +++-- docs/parallel_fuzzing.md | 182 +++++++++++++++++++++++------------------------ qemu_mode/README.md | 5 +- 5 files changed, 109 insertions(+), 106 deletions(-) (limited to 'docs/best_practices.md') diff --git a/README.md b/README.md index b2714787..fcb6b3c9 100644 --- a/README.md +++ b/README.md @@ -132,9 +132,6 @@ The following branches exist: * [dev](https://github.com/AFLplusplus/AFLplusplus/tree/dev): development state of AFL++ - bleeding edge and you might catch a checkout which does not compile or has a bug. *We only accept PRs in dev!!* * (any other): experimental branches to work on specific features or testing new functionality or changes. -For releases, please see the [Releases tab](https://github.com/AFLplusplus/AFLplusplus/releases). -Also take a look at the list of [important changes in AFL++](docs/important_changes.md). - ## Help wanted We have several [ideas](docs/ideas.md) we would like to see in AFL++ to make it @@ -233,4 +230,4 @@ presented at WOOT'20: } ``` - + \ No newline at end of file diff --git a/docs/FAQ.md b/docs/FAQ.md index 68ca3bad..34ed4cf5 100644 --- a/docs/FAQ.md +++ b/docs/FAQ.md @@ -83,7 +83,8 @@ If you find an interesting or important question missing, submit it via However, if there is only the binary program and no source code available, then the standard non-instrumented mode is not effective. - To learn how these binaries can be fuzzed, read [binaryonly_fuzzing.md](binaryonly_fuzzing.md). 
+ To learn how these binaries can be fuzzed, read
+ [fuzzing_binary-only_targets.md](fuzzing_binary-only_targets.md).

@@ -143,7 +144,7 @@ If you find an interesting or important question missing, submit it via Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /prg/tmp/llvm-project/build/bin - clang-13: note: diagnostic msg: + clang-13: note: diagnostic msg: ******************** ``` diff --git a/docs/best_practices.md b/docs/best_practices.md index 7016f08d..5f2d45ed 100644 --- a/docs/best_practices.md +++ b/docs/best_practices.md @@ -4,20 +4,26 @@ ### Targets - * [Fuzzing a binary-only target](#fuzzing-a-binary-only-target) - * [Fuzzing a GUI program](#fuzzing-a-gui-program) - * [Fuzzing a network service](#fuzzing-a-network-service) +* [Fuzzing a target with source code available](#fuzzing-a-target-with-source-code-available) +* [Fuzzing a binary-only target](#fuzzing-a-binary-only-target) +* [Fuzzing a GUI program](#fuzzing-a-gui-program) +* [Fuzzing a network service](#fuzzing-a-network-service) ### Improvements - * [Improving speed](#improving-speed) - * [Improving stability](#improving-stability) +* [Improving speed](#improving-speed) +* [Improving stability](#improving-stability) ## Targets +### Fuzzing a target with source code available + +To learn how to fuzz a target if source code is available, see [fuzzing_in_depth.md](fuzzing_in_depth.md). + ### Fuzzing a binary-only target -For a comprehensive guide, see [binaryonly_fuzzing.md](binaryonly_fuzzing.md). +For a comprehensive guide, see +[fuzzing_binary-only_targets.md](fuzzing_binary-only_targets.md). ### Fuzzing a GUI program diff --git a/docs/parallel_fuzzing.md b/docs/parallel_fuzzing.md index d24f2837..130cb3ce 100644 --- a/docs/parallel_fuzzing.md +++ b/docs/parallel_fuzzing.md @@ -1,28 +1,28 @@ # Tips for parallel fuzzing -This document talks about synchronizing afl-fuzz jobs on a single machine -or across a fleet of systems. See README.md for the general instruction manual. +This document talks about synchronizing afl-fuzz jobs on a single machine or +across a fleet of systems. 
See README.md for the general instruction manual. Note that this document is rather outdated. please refer to the main document -section on multiple core usage [fuzzing_expert.md#Using multiple cores](fuzzing_expert.md#b-using-multiple-cores) +section on multiple core usage +[fuzzing_in_depth.md:b) Using multiple cores](fuzzing_in_depth.md#b-using-multiple-cores) for up to date strategies! ## 1) Introduction -Every copy of afl-fuzz will take up one CPU core. This means that on an -n-core system, you can almost always run around n concurrent fuzzing jobs with +Every copy of afl-fuzz will take up one CPU core. This means that on an n-core +system, you can almost always run around n concurrent fuzzing jobs with virtually no performance hit (you can use the afl-gotcpu tool to make sure). -In fact, if you rely on just a single job on a multi-core system, you will -be underutilizing the hardware. So, parallelization is always the right way to -go. +In fact, if you rely on just a single job on a multi-core system, you will be +underutilizing the hardware. So, parallelization is always the right way to go. When targeting multiple unrelated binaries or using the tool in "non-instrumented" (-n) mode, it is perfectly fine to just start up several -fully separate instances of afl-fuzz. The picture gets more complicated when -you want to have multiple fuzzers hammering a common target: if a hard-to-hit -but interesting test case is synthesized by one fuzzer, the remaining instances -will not be able to use that input to guide their work. +fully separate instances of afl-fuzz. The picture gets more complicated when you +want to have multiple fuzzers hammering a common target: if a hard-to-hit but +interesting test case is synthesized by one fuzzer, the remaining instances will +not be able to use that input to guide their work. To help with this problem, afl-fuzz offers a simple way to synchronize test cases on the fly. @@ -30,15 +30,15 @@ cases on the fly. 
It is a good idea to use different power schedules if you run several instances in parallel (`-p` option). -Alternatively running other AFL spinoffs in parallel can be of value, -e.g. Angora (https://github.com/AngoraFuzzer/Angora/) +Alternatively running other AFL spinoffs in parallel can be of value, e.g. +Angora (https://github.com/AngoraFuzzer/Angora/) ## 2) Single-system parallelization -If you wish to parallelize a single job across multiple cores on a local -system, simply create a new, empty output directory ("sync dir") that will be -shared by all the instances of afl-fuzz; and then come up with a naming scheme -for every instance - say, "fuzzer01", "fuzzer02", etc. +If you wish to parallelize a single job across multiple cores on a local system, +simply create a new, empty output directory ("sync dir") that will be shared by +all the instances of afl-fuzz; and then come up with a naming scheme for every +instance - say, "fuzzer01", "fuzzer02", etc. Run the first one ("main node", -M) like this: @@ -57,18 +57,18 @@ Each fuzzer will keep its state in a separate subdirectory, like so: /path/to/sync_dir/fuzzer01/ -Each instance will also periodically rescan the top-level sync directory -for any test cases found by other fuzzers - and will incorporate them into -its own fuzzing when they are deemed interesting enough. -For performance reasons only -M main node syncs the queue with everyone, the --S secondary nodes will only sync from the main node. +Each instance will also periodically rescan the top-level sync directory for any +test cases found by other fuzzers - and will incorporate them into its own +fuzzing when they are deemed interesting enough. For performance reasons only -M +main node syncs the queue with everyone, the -S secondary nodes will only sync +from the main node. 
-The difference between the -M and -S modes is that the main instance will -still perform deterministic checks; while the secondary instances will -proceed straight to random tweaks. +The difference between the -M and -S modes is that the main instance will still +perform deterministic checks; while the secondary instances will proceed +straight to random tweaks. -Note that you must always have one -M main instance! -Running multiple -M instances is wasteful! +Note that you must always have one -M main instance! Running multiple -M +instances is wasteful! You can also monitor the progress of your jobs from the command line with the provided afl-whatsup tool. When the instances are no longer finding new paths, @@ -90,18 +90,18 @@ file name. ## 3) Multiple -M mains -There is support for parallelizing the deterministic checks. -This is only needed where +There is support for parallelizing the deterministic checks. This is only needed +where 1. many new paths are found fast over a long time and it looks unlikely that main node will ever catch up, and 2. deterministic fuzzing is actively helping path discovery (you can see this in the main node for the first for lines in the "fuzzing strategy yields" - section. If the ration `found/attemps` is high, then it is effective. It + section. If the ration `found/attempts` is high, then it is effective. It most commonly isn't.) -Only if both are true it is beneficial to have more than one main. -You can leverage this by creating -M instances like so: +Only if both are true it is beneficial to have more than one main. You can +leverage this by creating -M instances like so: ``` ./afl-fuzz -i testcase_dir -o sync_dir -M mainA:1/3 [...] @@ -115,27 +115,26 @@ distribute the deterministic fuzzing across. Note that if you boot up fewer fuzzers than indicated by the second number passed to -M, you may end up with poor coverage. 
-## 4) Syncing with non-AFL fuzzers or independant instances +## 4) Syncing with non-AFL fuzzers or independent instances -A -M main node can be told with the `-F other_fuzzer_queue_directory` option -to sync results from other fuzzers, e.g. libfuzzer or honggfuzz. +A -M main node can be told with the `-F other_fuzzer_queue_directory` option to +sync results from other fuzzers, e.g. libfuzzer or honggfuzz. -Only the specified directory will by synced into afl, not subdirectories. -The specified directory does not need to exist yet at the start of afl. +Only the specified directory will by synced into afl, not subdirectories. The +specified directory does not need to exist yet at the start of afl. The `-F` option can be passed to the main node several times. ## 5) Multi-system parallelization -The basic operating principle for multi-system parallelization is similar to -the mechanism explained in section 2. The key difference is that you need to -write a simple script that performs two actions: +The basic operating principle for multi-system parallelization is similar to the +mechanism explained in section 2. The key difference is that you need to write a +simple script that performs two actions: - - Uses SSH with authorized_keys to connect to every machine and retrieve - a tar archive of the /path/to/sync_dir/ directory local to - the machine. - It is best to use a naming scheme that includes host name and it's being - a main node (e.g. main1, main2) in the fuzzer ID, so that you can do + - Uses SSH with authorized_keys to connect to every machine and retrieve a tar + archive of the /path/to/sync_dir/ directory local to the + machine. It is best to use a naming scheme that includes host name and it's + being a main node (e.g. main1, main2) in the fuzzer ID, so that you can do something like: ```sh @@ -163,70 +162,70 @@ There are other (older) more featured, experimental tools: However these do not support syncing just main nodes (yet). 
-When developing custom test case sync code, there are several optimizations -to keep in mind: +When developing custom test case sync code, there are several optimizations to +keep in mind: - - The synchronization does not have to happen very often; running the - task every 60 minutes or even less often at later fuzzing stages is - fine + - The synchronization does not have to happen very often; running the task + every 60 minutes or even less often at later fuzzing stages is fine - - There is no need to synchronize crashes/ or hangs/; you only need to - copy over queue/* (and ideally, also fuzzer_stats). + - There is no need to synchronize crashes/ or hangs/; you only need to copy + over queue/* (and ideally, also fuzzer_stats). - - It is not necessary (and not advisable!) to overwrite existing files; - the -k option in tar is a good way to avoid that. + - It is not necessary (and not advisable!) to overwrite existing files; the -k + option in tar is a good way to avoid that. - There is no need to fetch directories for fuzzers that are not running locally on a particular machine, and were simply copied over onto that system during earlier runs. - - For large fleets, you will want to consolidate tarballs for each host, - as this will let you use n SSH connections for sync, rather than n*(n-1). + - For large fleets, you will want to consolidate tarballs for each host, as + this will let you use n SSH connections for sync, rather than n*(n-1). You may also want to implement staged synchronization. For example, you - could have 10 groups of systems, with group 1 pushing test cases only - to group 2; group 2 pushing them only to group 3; and so on, with group + could have 10 groups of systems, with group 1 pushing test cases only to + group 2; group 2 pushing them only to group 3; and so on, with group eventually 10 feeding back to group 1. 
- This arrangement would allow test interesting cases to propagate across - the fleet without having to copy every fuzzer queue to every single host. + This arrangement would allow test interesting cases to propagate across the + fleet without having to copy every fuzzer queue to every single host. - You do not want a "main" instance of afl-fuzz on every system; you should run them all with -S, and just designate a single process somewhere within the fleet to run with -M. - - Syncing is only necessary for the main nodes on a system. It is possible - to run main-less with only secondaries. However then you need to find out - which secondary took over the temporary role to be the main node. Look for - the `is_main_node` file in the fuzzer directories, eg. `sync-dir/hostname-*/is_main_node` + - Syncing is only necessary for the main nodes on a system. It is possible to + run main-less with only secondaries. However then you need to find out which + secondary took over the temporary role to be the main node. Look for the + `is_main_node` file in the fuzzer directories, eg. + `sync-dir/hostname-*/is_main_node` It is *not* advisable to skip the synchronization script and run the fuzzers -directly on a network filesystem; unexpected latency and unkillable processes -in I/O wait state can mess things up. +directly on a network filesystem; unexpected latency and unkillable processes in +I/O wait state can mess things up. ## 6) Remote monitoring and data collection -You can use screen, nohup, tmux, or something equivalent to run remote -instances of afl-fuzz. If you redirect the program's output to a file, it will +You can use screen, nohup, tmux, or something equivalent to run remote instances +of afl-fuzz. If you redirect the program's output to a file, it will automatically switch from a fancy UI to more limited status reports. There is also basic machine-readable information which is always written to the fuzzer_stats file in the output directory. 
Locally, that information can be interpreted with afl-whatsup. -In principle, you can use the status screen of the main (-M) instance to -monitor the overall fuzzing progress and decide when to stop. In this -mode, the most important signal is just that no new paths are being found -for a longer while. If you do not have a main instance, just pick any -single secondary instance to watch and go by that. +In principle, you can use the status screen of the main (-M) instance to monitor +the overall fuzzing progress and decide when to stop. In this mode, the most +important signal is just that no new paths are being found for a longer while. +If you do not have a main instance, just pick any single secondary instance to +watch and go by that. -You can also rely on that instance's output directory to collect the -synthesized corpus that covers all the noteworthy paths discovered anywhere -within the fleet. Secondary (-S) instances do not require any special -monitoring, other than just making sure that they are up. +You can also rely on that instance's output directory to collect the synthesized +corpus that covers all the noteworthy paths discovered anywhere within the +fleet. Secondary (-S) instances do not require any special monitoring, other +than just making sure that they are up. -Keep in mind that crashing inputs are *not* automatically propagated to the -main instance, so you may still want to monitor for crashes fleet-wide -from within your synchronization or health checking scripts (see afl-whatsup). +Keep in mind that crashing inputs are *not* automatically propagated to the main +instance, so you may still want to monitor for crashes fleet-wide from within +your synchronization or health checking scripts (see afl-whatsup). ## 7) Asymmetric setups @@ -238,21 +237,20 @@ It is perhaps worth noting that all of the following is permitted: out_dir//queue/* and writing their own finds to sequentially numbered id:nnnnnn files in out_dir//queue/*. 
- - Running some of the synchronized fuzzers with different (but related) - target binaries. For example, simultaneously stress-testing several - different JPEG parsers (say, IJG jpeg and libjpeg-turbo) while sharing - the discovered test cases can have synergistic effects and improve the - overall coverage. + - Running some of the synchronized fuzzers with different (but related) target + binaries. For example, simultaneously stress-testing several different JPEG + parsers (say, IJG jpeg and libjpeg-turbo) while sharing the discovered test + cases can have synergistic effects and improve the overall coverage. (In this case, running one -M instance per target is necessary.) - - Having some of the fuzzers invoke the binary in different ways. - For example, 'djpeg' supports several DCT modes, configurable with - a command-line flag, while 'dwebp' supports incremental and one-shot - decoding. In some scenarios, going after multiple distinct modes and then - pooling test cases will improve coverage. + - Having some of the fuzzers invoke the binary in different ways. For example, + 'djpeg' supports several DCT modes, configurable with a command-line flag, + while 'dwebp' supports incremental and one-shot decoding. In some scenarios, + going after multiple distinct modes and then pooling test cases will improve + coverage. - Much less convincingly, running the synchronized fuzzers with different starting test cases (e.g., progressive and standard JPEG) or dictionaries. The synchronization mechanism ensures that the test sets will get fairly - homogeneous over time, but it introduces some initial variability. + homogeneous over time, but it introduces some initial variability. \ No newline at end of file diff --git a/qemu_mode/README.md b/qemu_mode/README.md index d28479d9..c62309a2 100644 --- a/qemu_mode/README.md +++ b/qemu_mode/README.md @@ -217,5 +217,6 @@ them at run time, can be a faster alternative. 
That said, static rewriting is fraught with peril, because it depends on being able to properly and fully model program control flow without actually executing each and every code path. -Checkout the "Fuzzing binary-only targets" section in our main README.md and -the docs/binaryonly_fuzzing.md document for more information and hints. +Check out +[docs/fuzzing_binary-only_targets.md](../docs/fuzzing_binary-only_targets.md) +for more information and hints. -- cgit 1.4.1 From fce93647cc788683be3d8cca79c4689de4b71c3f Mon Sep 17 00:00:00 2001 From: llzmb <46303940+llzmb@users.noreply.github.com> Date: Wed, 24 Nov 2021 13:24:12 +0100 Subject: Merge "perf_tips.md" into "best_practices.md" and "fuzzing_in_depth.md" --- docs/best_practices.md | 4 +- docs/fuzzing_in_depth.md | 46 +++++++---- docs/perf_tips.md | 209 ----------------------------------------------- 3 files changed, 32 insertions(+), 227 deletions(-) delete mode 100644 docs/perf_tips.md (limited to 'docs/best_practices.md') diff --git a/docs/best_practices.md b/docs/best_practices.md index 5f2d45ed..979849f4 100644 --- a/docs/best_practices.md +++ b/docs/best_practices.md @@ -64,11 +64,11 @@ which allows you to define network state with different type of data packets. 1. Use [llvm_mode](../instrumentation/README.llvm.md): afl-clang-lto (llvm >= 11) or afl-clang-fast (llvm >= 9 recommended). 2. Use [persistent mode](../instrumentation/README.persistent_mode.md) (x2-x20 speed increase). -3. Use the [AFL++ snapshot module](https://github.com/AFLplusplus/AFL-Snapshot-LKM) (x2 speed increase). +3. Instrument just what you are interested in, see [instrumentation/README.instrument_list.md](../instrumentation/README.instrument_list.md). 4. If you do not use shmem persistent mode, use `AFL_TMPDIR` to put the input file directory on a tempfs location, see [env_variables.md](env_variables.md). 5. 
Improve Linux kernel performance: modify `/etc/default/grub`, set `GRUB_CMDLINE_LINUX_DEFAULT="ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off spectre_v2=off stf_barrier=off"`; then `update-grub` and `reboot` (warning: makes the system less secure). 6. Running on an `ext2` filesystem with `noatime` mount option will be a bit faster than on any other journaling filesystem. -7. Use your cores ([fuzzing_in_depth.md:b) Using multiple cores](fuzzing_in_depth.md#b-using-multiple-cores))! +7. Use your cores ([fuzzing_in_depth.md:3c) Using multiple cores](fuzzing_in_depth.md#c-using-multiple-cores))! ### Improving stability diff --git a/docs/fuzzing_in_depth.md b/docs/fuzzing_in_depth.md index 2365c6fd..869ed212 100644 --- a/docs/fuzzing_in_depth.md +++ b/docs/fuzzing_in_depth.md @@ -419,7 +419,7 @@ as test data in there. If you do not want anything special, the defaults are already usually best, hence all you need is to specify the seed input directory with the result of -step [2a. Collect inputs](#a-collect-inputs): +step [2a) Collect inputs](#a-collect-inputs): `afl-fuzz -i input -o output -- bin/target -d @@` Note that the directory specified with -o will be created if it does not exist. @@ -438,11 +438,6 @@ If you need to stop and re-start the fuzzing, use the same command line options mode!) and switch the input directory with a dash (`-`): `afl-fuzz -i - -o output -- bin/target -d @@` -Memory limits are not enforced by afl-fuzz by default and the system may run out -of memory. You can decrease the memory with the `-m` option, the value is in MB. -If this is too small for the target, you can usually see this by afl-fuzz -bailing with the message that it could not connect to the forkserver. - Adding a dictionary is helpful. 
See the directory [dictionaries/](../dictionaries/) if something is already included for your data format, and tell afl-fuzz to load that dictionary by adding `-x @@ -472,7 +467,26 @@ is: All labels are explained in [status_screen.md](status_screen.md). -#### b) Using multiple cores +#### b) Keeping memory use and timeouts in check + +Memory limits are not enforced by afl-fuzz by default and the system may run out +of memory. You can decrease the memory with the `-m` option, the value is in MB. +If this is too small for the target, you can usually see this by afl-fuzz +bailing with the message that it could not connect to the forkserver. + +Consider setting low values for `-m` and `-t`. + +For programs that are nominally very fast, but get sluggish for some inputs, you +can also try setting `-t` values that are more punishing than what `afl-fuzz` +dares to use on its own. On fast and idle machines, going down to `-t 5` may be +a viable plan. + +The `-m` parameter is worth looking at, too. Some programs can end up spending a +fair amount of time allocating and initializing megabytes of memory when +presented with pathological inputs. Low `-m` values can make them give up sooner +and not waste CPU time. + +#### c) Using multiple cores If you want to seriously fuzz then use as many cores/threads as possible to fuzz your target. @@ -537,7 +551,7 @@ directory of a different fuzzer is, e.g. `-F /src/target/honggfuzz`. Using honggfuzz (with `-n 1` or `-n 2`) and libfuzzer in parallel is highly recommended! -#### c) Using multiple machines for fuzzing +#### d) Using multiple machines for fuzzing Maybe you have more than one machine you want to fuzz the same target on. Simply start the `afl-fuzz` (and perhaps libfuzzer, honggfuzz, ...) @@ -575,7 +589,7 @@ done You can run this manually, per cron job - as you need it. There is a more complex and configurable script in `utils/distributed_fuzzing`. 
-#### d) The status of the fuzz campaign +#### e) The status of the fuzz campaign AFL++ comes with the `afl-whatsup` script to show the status of the fuzzing campaign. @@ -593,7 +607,7 @@ afl-plot, which generates an index.html file and a graphs that show how the fuzzing instance is performing. The syntax is `afl-plot instance_dir web_dir`, e.g., `afl-plot out/default /srv/www/htdocs/plot`. -#### e) Stopping fuzzing, restarting fuzzing, adding new seeds +#### f) Stopping fuzzing, restarting fuzzing, adding new seeds To stop an afl-fuzz run, simply press Control-C. @@ -608,7 +622,7 @@ are in `newseeds/` directory: AFL_BENCH_JUST_ONE=1 AFL_FAST_CAL=1 afl-fuzz -i newseeds -o out -S newseeds -- ./target ``` -#### f) Checking the coverage of the fuzzing +#### g) Checking the coverage of the fuzzing The `paths found` value is a bad indicator for checking how good the coverage is. @@ -648,7 +662,7 @@ individual fuzzing campaigns each with one of these options set. E.g., if you fuzz a library to convert image formats and your target is the png to tiff API then you will not touch any of the other library APIs and features. -#### g) How long to fuzz a target? +#### h) How long to fuzz a target? This is a difficult question. Basically if no new path is found for a long time (e.g. for a day or a week) then you can expect that your fuzzing won't be @@ -660,7 +674,7 @@ Keep the queue/ directory (for future fuzzings of the same or similar targets) and use them to seed other good fuzzers like libfuzzer with the -entropic switch or honggfuzz. -#### h) Improve the speed! +#### i) Improve the speed! * Use [persistent mode](../instrumentation/README.persistent_mode.md) (x2-x20 speed increase) @@ -675,11 +689,11 @@ or honggfuzz. also just run `sudo afl-persistent-config` * Linux: Running on an `ext2` filesystem with `noatime` mount option will be a bit faster than on any other journaling filesystem -* Use your cores! [b) Using multiple cores](#b-using-multiple-cores) +* Use your cores! 
[3c) Using multiple cores](#c-using-multiple-cores) * Run `sudo afl-system-config` before starting the first afl-fuzz instance after a reboot -#### i) Going beyond crashes +#### j) Going beyond crashes Fuzzing is a wonderful and underutilized technique for discovering non-crashing design and implementation errors, too. Quite a few interesting bugs have been @@ -703,7 +717,7 @@ conditional with `#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` (a flag also shared with libfuzzer and honggfuzz) or `#ifdef __AFL_COMPILER` (this one is just for AFL++). -#### j) Known limitations & areas for improvement +#### k) Known limitations & areas for improvement Here are some of the most important caveats for AFL++: diff --git a/docs/perf_tips.md b/docs/perf_tips.md deleted file mode 100644 index 1e8fd4d0..00000000 --- a/docs/perf_tips.md +++ /dev/null @@ -1,209 +0,0 @@ -## Tips for performance optimization - - This file provides tips for troubleshooting slow or wasteful fuzzing jobs. - See README.md for the general instruction manual. - -## 1. Keep your test cases small - -This is probably the single most important step to take! Large test cases do -not merely take more time and memory to be parsed by the tested binary, but -also make the fuzzing process dramatically less efficient in several other -ways. - -To illustrate, let's say that you're randomly flipping bits in a file, one bit -at a time. Let's assume that if you flip bit #47, you will hit a security bug; -flipping any other bit just results in an invalid document. - -Now, if your starting test case is 100 bytes long, you will have a 71% chance of -triggering the bug within the first 1,000 execs - not bad! But if the test case -is 1 kB long, the probability that we will randomly hit the right pattern in -the same timeframe goes down to 11%. And if it has 10 kB of non-essential -cruft, the odds plunge to 1%. 
- -On top of that, with larger inputs, the binary may be now running 5-10x times -slower than before - so the overall drop in fuzzing efficiency may be easily -as high as 500x or so. - -In practice, this means that you shouldn't fuzz image parsers with your -vacation photos. Generate a tiny 16x16 picture instead, and run it through -`jpegtran` or `pngcrunch` for good measure. The same goes for most other types -of documents. - -There's plenty of small starting test cases in ../testcases/ - try them out -or submit new ones! - -If you want to start with a larger, third-party corpus, run `afl-cmin` with an -aggressive timeout on that data set first. - -## 2. Use a simpler target - -Consider using a simpler target binary in your fuzzing work. For example, for -image formats, bundled utilities such as `djpeg`, `readpng`, or `gifhisto` are -considerably (10-20x) faster than the convert tool from ImageMagick - all while exercising roughly the same library-level image parsing code. - -Even if you don't have a lightweight harness for a particular target, remember -that you can always use another, related library to generate a corpus that will -be then manually fed to a more resource-hungry program later on. - -Also note that reading the fuzzing input via stdin is faster than reading from -a file. - -## 3. Use LLVM persistent instrumentation - -The LLVM mode offers a "persistent", in-process fuzzing mode that can -work well for certain types of self-contained libraries, and for fast targets, -can offer performance gains up to 5-10x; and a "deferred fork server" mode -that can offer huge benefits for programs with high startup overhead. Both -modes require you to edit the source code of the fuzzed program, but the -changes often amount to just strategically placing a single line or two. - -If there are important data comparisons performed (e.g. 
`strcmp(ptr, MAGIC_HDR)`)
-then using laf-intel (see instrumentation/README.laf-intel.md) will help `afl-fuzz` a lot
-to get to the important parts in the code.
-
-If you are only interested in specific parts of the code being fuzzed, you can
-instrument only the files that are actually relevant. This improves the speed and
-accuracy of afl. See instrumentation/README.instrument_list.md.
-
-## 4. Profile and optimize the binary
-
-Check for any parameters or settings that obviously improve performance. For
-example, the djpeg utility that comes with IJG jpeg and libjpeg-turbo can be
-called with:
-
-```bash
-  -dct fast -nosmooth -onepass -dither none -scale 1/4
-```
-
-...and that will speed things up. There is a corresponding drop in the quality
-of decoded images, but it's probably not something you care about.
-
-In some programs, it is possible to disable output altogether, or at least use
-an output format that is computationally inexpensive. For example, with image
-transcoding tools, converting to a BMP file will be a lot faster than to PNG.
-
-With some laid-back parsers, enabling "strict" mode (i.e., bailing out after
-the first error) may result in smaller files and improved run time without
-sacrificing coverage; for example, for sqlite, you may want to specify -bail.
-
-If the program is still too slow, you can use `strace -tt` or an equivalent
-profiling tool to see if the targeted binary is doing anything silly.
-Sometimes, you can speed things up simply by specifying `/dev/null` as the
-config file, or disabling some compile-time features that aren't really needed
-for the job (try `./configure --help`). One of the notoriously resource-consuming
-things would be calling other utilities via `exec*()`, `popen()`, `system()`, or
-equivalent calls; for example, tar can invoke external decompression tools
-when it decides that the input file is a compressed archive.
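
Before settling on an option change such as the djpeg flags in this section, it helps to measure whether it really pays off. The loop below is a rough sketch rather than an AFL++ tool: the function name `avg_us` is invented for this example, it relies on GNU `date` for nanosecond timestamps (`%N`), and it measures wall-clock time including process startup, so it is only meaningful on an otherwise idle machine.

```bash
#!/usr/bin/env bash
# avg_us N CMD [ARGS...] (hypothetical helper): run CMD N times and print
# the mean wall-clock time per run in microseconds.
# Assumptions: bash, and GNU date (the %N format is not portable).
avg_us() {
  local n=$1; shift
  local start end i
  start=$(date +%s%N)                  # nanoseconds since the epoch
  for ((i = 0; i < n; i++)); do
    "$@" </dev/null >/dev/null 2>&1    # discard the output of the target
  done
  end=$(date +%s%N)
  echo $(( (end - start) / n / 1000 )) # ns per run, converted to us
}

# Harmless demo; replace `true` with the real target command.
avg_us 20 true
```

For instance, comparing `avg_us 200 djpeg seed.jpg` against `avg_us 200 djpeg -dct fast -nosmooth -onepass -dither none -scale 1/4 seed.jpg` (with `seed.jpg` standing in for one of your own test cases) shows how much the faster decoding options actually buy you.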
-
-Some programs may also intentionally call `sleep()`, `usleep()`, or `nanosleep()`;
-vim is a good example of that. Other programs may attempt `fsync()` and so on.
-There are third-party libraries that make it easy to get rid of such code,
-e.g.:
-
-    https://launchpad.net/libeatmydata
-
-In programs that are slow due to unavoidable initialization overhead, you may
-want to try the LLVM deferred forkserver mode (see README.llvm.md),
-which can give you speed gains up to 10x, as mentioned above.
-
-Last but not least, if you are using ASAN and the performance is unacceptable,
-consider turning it off for now, and manually examining the generated corpus
-with an ASAN-enabled binary later on.
-
-## 5. Instrument just what you need
-
-Instrument just the libraries you actually want to stress-test right now, one
-at a time. Let the program use system-wide, non-instrumented libraries for
-any functionality you don't actually want to fuzz. For example, in most
-cases, it doesn't make sense to instrument `libgmp` just because you're testing a
-crypto app that relies on it for bignum math.
-
-Beware of programs that come with oddball third-party libraries bundled with
-their source code (SpiderMonkey is a good example of this). Check `./configure`
-options to use non-instrumented system-wide copies instead.
-
-## 6. Parallelize your fuzzers
-
-The fuzzer is designed to need ~1 core per job. This means that on a, say,
-4-core system, you can easily run four parallel fuzzing jobs with relatively
-little performance hit. For tips on how to do that, see parallel_fuzzing.md.
-
-The `afl-gotcpu` utility can help you understand if you still have idle CPU
-capacity on your system. (It won't tell you about memory bandwidth, cache
-misses, or similar factors, but they are less likely to be a concern.)
-
-## 7. Keep memory use and timeouts in check
-
-Consider setting low values for `-m` and `-t`.
-
-For programs that are nominally very fast, but get sluggish for some inputs,
-you can also try setting `-t` values that are more punishing than what `afl-fuzz`
-dares to use on its own. On fast and idle machines, going down to `-t 5` may be
-a viable plan.
-
-The `-m` parameter is worth looking at, too. Some programs can end up spending
-a fair amount of time allocating and initializing megabytes of memory when
-presented with pathological inputs. Low `-m` values can make them give up sooner
-and not waste CPU time.
-
-## 8. Check OS configuration
-
-There are several OS-level factors that may affect fuzzing speed:
-
-  - If you have no risk of power loss then run your fuzzing on a tmpfs
-    partition. This increases the performance noticeably.
-    Alternatively you can use `AFL_TMPDIR` to point to a tmpfs location to
-    just write the input file to a tmpfs.
-
-  - High system load. Use idle machines where possible. Kill any non-essential
-    CPU hogs (idle browser windows, media players, complex screensavers, etc.).
-  - Network filesystems, either used for fuzzer input / output, or accessed by
-    the fuzzed binary to read configuration files (pay special attention to the
-    home directory - many programs search it for dot-files).
-  - Disable all the Spectre, Meltdown, etc. security countermeasures in the
-    kernel if your machine is properly separated:
-
-```
-ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off
-no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable
-nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off
-spectre_v2=off stf_barrier=off
-```
-    In most Linux distributions you can put this into the
-    `GRUB_CMDLINE_LINUX_DEFAULT` variable in `/etc/default/grub`.
-    You can use `sudo afl-persistent-config` to set these options for you.
-
-The following changes are made when executing `afl-system-config`:
-
-  - On-demand CPU scaling.
The Linux `ondemand` governor performs its analysis
-    on a particular schedule and is known to underestimate the needs of
-    short-lived processes spawned by `afl-fuzz` (or any other fuzzer). On Linux,
-    this can be fixed with:
-
-``` bash
-    cd /sys/devices/system/cpu
-    echo performance | tee cpu*/cpufreq/scaling_governor
-```
-
-    On other systems, the impact of CPU scaling will be different; when fuzzing,
-    use OS-specific tools to find out if all cores are running at full speed.
-  - Transparent huge pages. Some allocators, such as `jemalloc`, can incur a
-    heavy fuzzing penalty when transparent huge pages (THP) are enabled in the
-    kernel. You can disable this via:
-
-```bash
-    echo never > /sys/kernel/mm/transparent_hugepage/enabled
-```
-
-  - Suboptimal scheduling strategies. The significance of this will vary from
-    one target to another, but on Linux, you may want to make sure that the
-    following options are set:
-
-```bash
-    echo 1 >/proc/sys/kernel/sched_child_runs_first
-    echo 1 >/proc/sys/kernel/sched_autogroup_enabled
-```
-
-    Setting a different scheduling policy for the fuzzer process - say
-    `SCHED_RR` - can usually speed things up, too, but needs to be done with
-    care.
-
-- 
cgit 1.4.1