Merge pull request #1191 from llzmb/docs_quality_assurance

Docs content - quality assurance
author: van Hauser <vh@thc.org> 2021-12-07 15:18:32 +0100
committer: GitHub <noreply@github.com> 2021-12-07 15:18:32 +0100
commit: 1f6c72ea1baea69b2dc5b3a68bfacbc00652bc66 (patch)
tree: a5a7ed81710c1dec50f0aa661b53c0cd884a4da2 /docs/best_practices.md
parent: 5469112db90741cb06c0979313938d83e63f793f (diff)
parent: bb506de0b809f97a4221ee1b6e040dcb5f9ca56a (diff)
download: afl++-1f6c72ea1baea69b2dc5b3a68bfacbc00652bc66.tar.gz
1 files changed, 96 insertions, 51 deletions
diff --git a/docs/best_practices.md b/docs/best_practices.md
index 18096851..96c6e3c2 100644
--- a/docs/best_practices.md
+++ b/docs/best_practices.md
@@ -19,7 +19,8 @@
 
 ### Fuzzing a target with source code available
 
-To learn how to fuzz a target if source code is available, see [fuzzing_in_depth.md](fuzzing_in_depth.md).
+To learn how to fuzz a target if source code is available, see
+[fuzzing_in_depth.md](fuzzing_in_depth.md).
 
 ### Fuzzing a target with dlopen instrumented libraries
 
@@ -48,11 +49,16 @@ For a comprehensive guide, see
 
 ### Fuzzing a GUI program
 
-If the GUI program can read the fuzz data from a file (via the command line, a fixed location or via an environment variable) without needing any user interaction, then it would be suitable for fuzzing.
+If the GUI program can read the fuzz data from a file (via the command line, a
+fixed location or via an environment variable) without needing any user
+interaction, then it would be suitable for fuzzing.
 
-Otherwise, it is not possible without modifying the source code - which is a very good idea anyway as the GUI functionality is a huge CPU/time overhead for the fuzzing.
+Otherwise, it is not possible without modifying the source code - which is a
+very good idea anyway as the GUI functionality is a huge CPU/time overhead for
+the fuzzing.
 
-So create a new `main()` that just reads the test case and calls the functionality for processing the input that the GUI program is using.
+So create a new `main()` that just reads the test case and calls the
+functionality for processing the input that the GUI program is using.
 
 ### Fuzzing a network service
 
@@ -61,87 +67,126 @@ Fuzzing a network service does not work "out of the box".
 Using a network channel is inadequate for several reasons:
 - it has a slow-down of x10-20 on the fuzzing speed
 - it does not scale to fuzzing multiple instances easily,
-- instead of one initial data packet often a back-and-forth interplay of packets is needed for stateful protocols (which is totally unsupported by most coverage aware fuzzers).
-
-The established method to fuzz network services is to modify the source code
-to read from a file or stdin (fd 0) (or even faster via shared memory, combine
-this with persistent mode [instrumentation/README.persistent_mode.md](../instrumentation/README.persistent_mode.md)
-and you have a performance gain of x10 instead of a performance loss of over
-x10 - that is a x100 difference!).
-
-If modifying the source is not an option (e.g. because you only have a binary
+- instead of one initial data packet often a back-and-forth interplay of packets
+  is needed for stateful protocols (which is totally unsupported by most
+  coverage aware fuzzers).
+
+The established method to fuzz network services is to modify the source code to
+read from a file or stdin (fd 0), or even faster, via shared memory. Combine this
+with persistent mode
+[instrumentation/README.persistent_mode.md](../instrumentation/README.persistent_mode.md)
+and you have a performance gain of x10 instead of a performance loss of over
+x10, which is a x100 difference!
+
+If modifying the source is not an option (e.g., because you only have a binary
 and perform binary fuzzing) you can also use a shared library with AFL_PRELOAD
 to emulate the network. This is also much faster than the real network would be.
 See [utils/socket_fuzzing/](../utils/socket_fuzzing/).
 
 There is an outdated AFL++ branch that implements networking if you are
-desperate though: [https://github.com/AFLplusplus/AFLplusplus/tree/networking](https://github.com/AFLplusplus/AFLplusplus/tree/networking) -
-however a better option is AFLnet ([https://github.com/aflnet/aflnet](https://github.com/aflnet/aflnet))
-which allows you to define network state with different type of data packets.
+desperate though:
+[https://github.com/AFLplusplus/AFLplusplus/tree/networking](https://github.com/AFLplusplus/AFLplusplus/tree/networking).
+However, a better option is AFLnet
+([https://github.com/aflnet/aflnet](https://github.com/aflnet/aflnet)), which
+allows you to define network state with different types of data packets.
 
 ## Improvements
 
 ### Improving speed
 
-1. Use [llvm_mode](../instrumentation/README.llvm.md): afl-clang-lto (llvm >= 11) or afl-clang-fast (llvm >= 9 recommended).
-2. Use [persistent mode](../instrumentation/README.persistent_mode.md) (x2-x20 speed increase).
-3. Instrument just what you are interested in, see [instrumentation/README.instrument_list.md](../instrumentation/README.instrument_list.md).
-4. If you do not use shmem persistent mode, use `AFL_TMPDIR` to put the input file directory on a tempfs location, see [env_variables.md](env_variables.md).
-5. Improve Linux kernel performance: modify `/etc/default/grub`, set `GRUB_CMDLINE_LINUX_DEFAULT="ibpb=off ibrs=off kpti=off l1tf=off mds=off mitigations=off no_stf_barrier noibpb noibrs nopcid nopti nospec_store_bypass_disable nospectre_v1 nospectre_v2 pcid=off pti=off spec_store_bypass_disable=off spectre_v2=off stf_barrier=off"`; then `update-grub` and `reboot` (warning: makes the system less secure).
-6. Running on an `ext2` filesystem with `noatime` mount option will be a bit faster than on any other journaling filesystem.
-7. Use your cores ([fuzzing_in_depth.md:3c) Using multiple cores](fuzzing_in_depth.md#c-using-multiple-cores))!
+1. Use [llvm_mode](../instrumentation/README.llvm.md): afl-clang-lto (llvm >=
+   11) or afl-clang-fast (llvm >= 9 recommended).
+2. Use [persistent mode](../instrumentation/README.persistent_mode.md) (x2-x20
+   speed increase).
+3. Instrument just what you are interested in, see
+   [instrumentation/README.instrument_list.md](../instrumentation/README.instrument_list.md).
+4. If you do not use shmem persistent mode, use `AFL_TMPDIR` to put the input
+   file directory on a tempfs location, see
+   [env_variables.md](env_variables.md).
+5. Improve Linux kernel performance: modify `/etc/default/grub`, set
+   `GRUB_CMDLINE_LINUX_DEFAULT="ibpb=off ibrs=off kpti=off l1tf=off mds=off
+   mitigations=off no_stf_barrier noibpb noibrs nopcid nopti
+   nospec_store_bypass_disable nospectre_v1 nospectre_v2 pcid=off pti=off
+   spec_store_bypass_disable=off spectre_v2=off stf_barrier=off"`; then
+   `update-grub` and `reboot` (warning: makes the system less secure).
+6. Running on an `ext2` filesystem with `noatime` mount option will be a bit
+   faster than on any other journaling filesystem.
+7. Use your cores
+   ([fuzzing_in_depth.md:3c) Using multiple cores](fuzzing_in_depth.md#c-using-multiple-cores))!
 
 ### Improving stability
 
-For fuzzing a 100% stable target that covers all edges is the best case.
-A 90% stable target that covers all edges is however better than a 100% stable target that ignores 10% of the edges.
+For fuzzing, a 100% stable target that covers all edges is the best case. A 90%
+stable target that covers all edges is, however, better than a 100% stable
+target that ignores 10% of the edges.
 
-With instability, you basically have a partial coverage loss on an edge, with ignored functions you have a full loss on that edges.
+With instability, you basically have a partial coverage loss on an edge; with
+ignored functions, you have a full loss on those edges.
 
-There are functions that are unstable, but also provide value to coverage, e.g., init functions that use fuzz data as input.
-If however a function that has nothing to do with the input data is the source of instability, e.g., checking jitter, or is a hash map function etc., then it should not be instrumented.
+There are functions that are unstable, but also provide value to coverage, e.g.,
+init functions that use fuzz data as input. If, however, a function that has
+nothing to do with the input data is the source of instability, e.g., checking
+jitter, or is a hash map function etc., then it should not be instrumented.
 
-To be able to exclude these functions (based on AFL++'s measured stability), the following process will allow to identify functions with variable edges.
+To be able to exclude these functions (based on AFL++'s measured stability), the
+following process lets you identify functions with variable edges.
 
-Four steps are required to do this and it also requires quite some knowledge of coding and/or disassembly and is effectively possible only with `afl-clang-fast` `PCGUARD` and `afl-clang-lto` `LTO` instrumentation.
+Four steps are required to do this and it also requires quite some knowledge of
+coding and/or disassembly and is effectively possible only with `afl-clang-fast`
+`PCGUARD` and `afl-clang-lto` `LTO` instrumentation.
 
   1. Instrument to be able to find the responsible function(s):
 
-     a) For LTO instrumented binaries, this can be documented during compile time, just set `export AFL_LLVM_DOCUMENT_IDS=/path/to/a/file`.
-        This file will have one assigned edge ID and the corresponding function per line.
-
-     b) For PCGUARD instrumented binaries, it is much more difficult. Here you can either modify the `__sanitizer_cov_trace_pc_guard` function in `instrumentation/afl-llvm-rt.o.c` to write a backtrace to a file if the ID in `__afl_area_ptr[*guard]` is one of the unstable edge IDs.
-        (Example code is already there).
-        Then recompile and reinstall `llvm_mode` and rebuild your target.
-        Run the recompiled target with `afl-fuzz` for a while and then check the file that you wrote with the backtrace information.
-        Alternatively, you can use `gdb` to hook `__sanitizer_cov_trace_pc_guard_init` on start, check to which memory address the edge ID value is written, and set a write breakpoint to that address (`watch 0x.....`).
-
-     c) In other instrumentation types, this is not possible.
-        So just recompile with the two mentioned above.
-        This is just for identifying the functions that have unstable edges.
+     a) For LTO instrumented binaries, this can be documented during compile
+        time, just set `export AFL_LLVM_DOCUMENT_IDS=/path/to/a/file`. This file
+        will have one assigned edge ID and the corresponding function per line.
+
+     b) For PCGUARD instrumented binaries, it is much more difficult. Here you
+        can either modify the `__sanitizer_cov_trace_pc_guard` function in
+        `instrumentation/afl-llvm-rt.o.c` to write a backtrace to a file if the
+        ID in `__afl_area_ptr[*guard]` is one of the unstable edge IDs. (Example
+        code is already there). Then recompile and reinstall `llvm_mode` and
+        rebuild your target. Run the recompiled target with `afl-fuzz` for a
+        while and then check the file that you wrote with the backtrace
+        information. Alternatively, you can use `gdb` to hook
+        `__sanitizer_cov_trace_pc_guard_init` on start, check to which memory
+        address the edge ID value is written, and set a write breakpoint to that
+        address (`watch 0x.....`).
+
+     c) In other instrumentation types, this is not possible. So just recompile
+        with the two mentioned above. This is just for identifying the functions
+        that have unstable edges.
 
   2. Identify which edge ID numbers are unstable.
 
      Run the target with `export AFL_DEBUG=1` for a few minutes then terminate.
      The out/fuzzer_stats file will then show the edge IDs that were identified
-     as unstable in the `var_bytes` entry. You can match these numbers
-     directly to the data you created in the first step.
-     Now you know which functions are responsible for the instability
+     as unstable in the `var_bytes` entry. You can match these numbers directly
+     to the data you created in the first step. Now you know which functions are
+     responsible for the instability.
 
   3. Create a text file with the filenames/functions
 
-     Identify which source code files contain the functions that you need to remove from instrumentation, or just specify the functions you want to skip for instrumentation.
-     Note that optimization might inline functions!
+     Identify which source code files contain the functions that you need to
+     remove from instrumentation, or just specify the functions you want to skip
+     for instrumentation. Note that optimization might inline functions!
+
+     Follow this document on how to do this:
+     [instrumentation/README.instrument_list.md](../instrumentation/README.instrument_list.md).
 
-     Follow this document on how to do this: [instrumentation/README.instrument_list.md](../instrumentation/README.instrument_list.md).
      If `PCGUARD` is used, then you need to follow this guide (needs llvm 12+!):
      [https://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation](https://clang.llvm.org/docs/SanitizerCoverage.html#partially-disabling-instrumentation)
 
-     Only exclude those functions from instrumentation that provide no value for coverage - that is if it does not process any fuzz data directly or indirectly (e.g. hash maps, thread management etc.).
-     If however a function directly or indirectly handles fuzz data, then you should not put the function in a deny instrumentation list and rather live with the instability it comes with.
+     Only exclude those functions from instrumentation that provide no value for
+     coverage - that is if it does not process any fuzz data directly or
+     indirectly (e.g., hash maps, thread management etc.). If, however, a
+     function directly or indirectly handles fuzz data, then you should not put
+     the function in a deny instrumentation list and rather live with the
+     instability it comes with.
 
   4. Recompile the target
 
      Recompile, fuzz it, be happy :)
 
-     This link explains this process for [Fuzzbench](https://github.com/google/fuzzbench/issues/677).
+     This link explains this process for
+     [Fuzzbench](https://github.com/google/fuzzbench/issues/677).
\ No newline at end of file