From e7db4d4fe0c334404c531821ae52a5f20f9185a1 Mon Sep 17 00:00:00 2001
From: van Hauser <vh@thc.org>
Date: Mon, 31 Aug 2020 12:36:30 +0200
Subject: fix sync script, update remote sync documentation

---
 docs/parallel_fuzzing.md | 105 +++++++++++++++++++++++++++--------------------
 1 file changed, 61 insertions(+), 44 deletions(-)
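
Note on the "Multiple -M mains" section added below: a minimal sketch of generating the `-M name:ID/total` command lines for a chosen number of mains. The function name `emit_main_commands` and the instance names `main1`..`mainN` are hypothetical (the documentation uses mainA, mainB, mainC), and `[...]` stands for the target command, as in the documentation.

```sh
# Print one afl-fuzz command line per main instance, splitting the
# deterministic stage across "$1" mains.
# Assumption: instance names main1..mainN are illustrative only.
emit_main_commands() {
  total=$1
  i=1
  while [ "$i" -le "$total" ]; do
    # ID runs from 1 to total, matching the "ID/total" form after the colon.
    printf './afl-fuzz -i testcase_dir -o sync_dir -M main%s:%s/%s [...]\n' \
      "$i" "$i" "$total"
    i=$((i + 1))
  done
}
```

Generating the lines from a single count keeps the second number consistent across all mains, which matters because starting fewer mains than that number may lead to poor coverage.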

(limited to 'docs/parallel_fuzzing.md')
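
Note on the multi-system sync example in this patch: a hedged sketch of the two steps (fetch the main-node directories, then distribute each archive to every other host). The function name `emit_sync_commands` is hypothetical; the `user` account, the host list file and the `sync/<host>_main*` layout are taken from the example. The sketch only prints the commands so that they can be reviewed first.

```sh
# Print the sync commands for the hosts listed in file "$1", one per line.
# Nothing is executed here; pipe the output to `sh` after reviewing it.
emit_sync_commands() {
  hostlist=$1

  # Step 1: fetch each host's main-node directories into <host>.tgz.
  # The glob sits inside double quotes so the remote shell expands it.
  while read -r host; do
    [ -n "$host" ] || continue
    printf 'ssh -n user@%s "tar -czf - sync/%s_main*/" > %s.tgz\n' \
      "$host" "$host" "$host"
  done < "$hostlist"

  # Step 2: unpack every source archive on every other (destination) host.
  # tar -k keeps files that already exist on the destination.
  while read -r src; do
    [ -n "$src" ] || continue
    while read -r dst; do
      [ -n "$dst" ] || continue
      [ "$src" = "$dst" ] && continue
      printf "ssh user@%s 'tar -kxzf -' < %s.tgz\n" "$dst" "$src"
    done < "$hostlist"
  done < "$hostlist"
}
```

A possible use is `emit_sync_commands HOSTLIST | sh`. The `-n` on the fetch step keeps ssh from reading the piped script from stdin.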

diff --git a/docs/parallel_fuzzing.md b/docs/parallel_fuzzing.md
index 2ab1466c..14c237c1 100644
--- a/docs/parallel_fuzzing.md
+++ b/docs/parallel_fuzzing.md
@@ -10,8 +10,8 @@ n-core system, you can almost always run around n concurrent fuzzing jobs with
 virtually no performance hit (you can use the afl-gotcpu tool to make sure).
 
 In fact, if you rely on just a single job on a multi-core system, you will
-be underutilizing the hardware. So, parallelization is usually the right
-way to go.
+be underutilizing the hardware. So, parallelization is always the right way to
+go.
 
 When targeting multiple unrelated binaries or using the tool in
 "non-instrumented" (-n) mode, it is perfectly fine to just start up several
@@ -65,22 +65,7 @@ still perform deterministic checks; while the secondary instances will
 proceed straight to random tweaks.
 
 Note that you must always have one -M main instance!
-
-Note that running multiple -M instances is wasteful, although there is an
-experimental support for parallelizing the deterministic checks. To leverage
-that, you need to create -M instances like so:
-
-```
-./afl-fuzz -i testcase_dir -o sync_dir -M mainA:1/3 [...]
-./afl-fuzz -i testcase_dir -o sync_dir -M mainB:2/3 [...]
-./afl-fuzz -i testcase_dir -o sync_dir -M mainC:3/3 [...]
-```
-
-...where the first value after ':' is the sequential ID of a particular main
-instance (starting at 1), and the second value is the total number of fuzzers to
-distribute the deterministic fuzzing across. Note that if you boot up fewer
-fuzzers than indicated by the second number passed to -M, you may end up with
-poor coverage.
+Running multiple -M instances is wasteful!
 
 You can also monitor the progress of your jobs from the command line with the
 provided afl-whatsup tool. When the instances are no longer finding new paths,
@@ -99,61 +84,88 @@ example may be:
 This is not a concern if you use @@ without -f and let afl-fuzz come up with the
 file name.
 
-## 3) Syncing with non-afl fuzzers or independant instances
+## 3) Multiple -M mains
+
+
+There is support for parallelizing the deterministic checks.
+This is only needed where
+
+ 1. many new paths are found fast over a long time and it looks unlikely that
+    the main node will ever catch up, and
+ 2. deterministic fuzzing is actively helping path discovery (you can see this
+    in the main node for the first four lines in the "fuzzing strategy yields"
+    section. If the ratio `found/attempts` is high, then it is effective. It
+    most commonly isn't.)
+
+Only if both are true is it beneficial to have more than one main.
+You can leverage this by creating -M instances like so:
+
+```
+./afl-fuzz -i testcase_dir -o sync_dir -M mainA:1/3 [...]
+./afl-fuzz -i testcase_dir -o sync_dir -M mainB:2/3 [...]
+./afl-fuzz -i testcase_dir -o sync_dir -M mainC:3/3 [...]
+```
+
+... where the first value after ':' is the sequential ID of a particular main
+instance (starting at 1), and the second value is the total number of fuzzers to
+distribute the deterministic fuzzing across. Note that if you boot up fewer
+fuzzers than indicated by the second number passed to -M, you may end up with
+poor coverage.
+
+## 4) Syncing with non-afl fuzzers or independent instances
 
 A -M main node can be told with the `-F other_fuzzer_queue_directory` option
 to sync results from other fuzzers, e.g. libfuzzer or honggfuzz.
 
 Only the specified directory will by synced into afl, not subdirectories.
-The specified directories do not need to exist yet at the start of afl.
+The specified directory does not need to exist yet at the start of afl.
 
-## 4) Multi-system parallelization
+The `-F` option can be passed to the main node several times.
+
+## 5) Multi-system parallelization
 
 The basic operating principle for multi-system parallelization is similar to
 the mechanism explained in section 2. The key difference is that you need to
 write a simple script that performs two actions:
 
   - Uses SSH with authorized_keys to connect to every machine and retrieve
-    a tar archive of the /path/to/sync_dir/<fuzzer_id>/queue/ directories for
-    every <fuzzer_id> local to the machine. It's best to use a naming scheme
-    that includes host name in the fuzzer ID, so that you can do something
-    like:
+    a tar archive of the /path/to/sync_dir/<main_node(s)> directory local to
+    the machine.
+    It is best to use a naming scheme that includes the host name and the
+    main-node role (e.g. main1, main2) in the fuzzer ID, so that you can do
+    something like:
 
     ```sh
-    for s in {1..10}; do
-      ssh user@host${s} "tar -czf - sync/host${s}_fuzzid*/[qf]*" >host${s}.tgz
+    for host in `cat HOSTLIST`; do
+      ssh user@$host "tar -czf - sync/${host}_main*/" > $host.tgz
     done
     ```
 
   - Distributes and unpacks these files on all the remaining machines, e.g.:
 
     ```sh
-    for s in {1..10}; do
-      for d in {1..10}; do
+    for srchost in `cat HOSTLIST`; do
+      for dsthost in `cat HOSTLIST`; do
         test "$s" = "$d" && continue
-        ssh user@host${d} 'tar -kxzf -' <host${s}.tgz
+        ssh user@$srchost 'tar -kxzf -' < $dsthost.tgz
       done
     done
     ```
 
-There is an example of such a script in examples/distributed_fuzzing/;
-you can also find a more featured, experimental tool developed by
-Martijn Bogaard at:
-
-  https://github.com/MartijnB/disfuzz-afl
-
-Another client-server implementation from Richo Healey is:
+There is an example of such a script in examples/distributed_fuzzing/.
 
-  https://github.com/richo/roving
+There are other (older) more featured, experimental tools:
+  * https://github.com/richo/roving
+  * https://github.com/MartijnB/disfuzz-afl
 
-Note that these third-party tools are unsafe to run on systems exposed to the
-Internet or to untrusted users.
+However, these do not support syncing just main nodes (yet).
 
 When developing custom test case sync code, there are several optimizations
 to keep in mind:
 
   - The synchronization does not have to happen very often; running the
-    task every 30 minutes or so may be perfectly fine.
+    task every 60 minutes or even less often at later fuzzing stages is
+    fine.
 
   - There is no need to synchronize crashes/ or hangs/; you only need to
     copy over queue/* (and ideally, also fuzzer_stats).
@@ -179,12 +191,17 @@ to keep in mind:
   - You do not want a "main" instance of afl-fuzz on every system; you should
     run them all with -S, and just designate a single process somewhere within
     the fleet to run with -M.
+
+  - Syncing is only necessary for the main nodes on a system. It is possible
+    to run main-less with only secondaries. However then you need to find out
+    which secondary took over the temporary role to be the main node. Look for
+    the `is_main` file in the fuzzer directories, eg. `sync-dir/hostname-*/is_main`
 
 It is *not* advisable to skip the synchronization script and run the fuzzers
 directly on a network filesystem; unexpected latency and unkillable processes
 in I/O wait state can mess things up.
 
-## 5) Remote monitoring and data collection
+## 6) Remote monitoring and data collection
 
 You can use screen, nohup, tmux, or something equivalent to run remote
 instances of afl-fuzz. If you redirect the program's output to a file, it will
@@ -208,7 +225,7 @@ Keep in mind that crashing inputs are *not* automatically propagated to the
 main instance, so you may still want to monitor for crashes fleet-wide
 from within your synchronization or health checking scripts (see afl-whatsup).
 
-## 6) Asymmetric setups
+## 7) Asymmetric setups
 
 It is perhaps worth noting that all of the following is permitted:
 
@@ -224,7 +241,7 @@ It is perhaps worth noting that all of the following is permitted:
     the discovered test cases can have synergistic effects and improve the
     overall coverage.
 
-    (In this case, running one -M instance per each binary is a good plan.)
+    (In this case, running one -M instance per target is necessary.)
 
   - Having some of the fuzzers invoke the binary in different ways.
     For example, 'djpeg' supports several DCT modes, configurable with
-- 
cgit 1.4.1


From 7fb72f10387979ac5e46fdfb8901e928901a94e7 Mon Sep 17 00:00:00 2001
From: hexcoder- <heiko@hexco.de>
Date: Mon, 31 Aug 2020 14:47:22 +0200
Subject: typos

---
 docs/parallel_fuzzing.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/parallel_fuzzing.md b/docs/parallel_fuzzing.md
index 14c237c1..0c4cd237 100644
--- a/docs/parallel_fuzzing.md
+++ b/docs/parallel_fuzzing.md
@@ -206,9 +206,9 @@ in I/O wait state can mess things up.
 You can use screen, nohup, tmux, or something equivalent to run remote
 instances of afl-fuzz. If you redirect the program's output to a file, it will
 automatically switch from a fancy UI to more limited status reports. There is
-also basic machine-readable information always written to the fuzzer_stats file
-in the output directory. Locally, that information can be interpreted with
-afl-whatsup.
+also basic machine-readable information which is always written to the
+fuzzer_stats file in the output directory. Locally, that information can be
+interpreted with afl-whatsup.
 
 In principle, you can use the status screen of the main (-M) instance to
 monitor the overall fuzzing progress and decide when to stop. In this
-- 
cgit 1.4.1


From 58cf030546b1fb2dbe9d5325c4e69c0611c4c35b Mon Sep 17 00:00:00 2001
From: van Hauser <vh@thc.org>
Date: Mon, 31 Aug 2020 16:34:57 +0200
Subject: fix for MacOS sudo

---
 GNUmakefile              | 2 +-
 docs/parallel_fuzzing.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
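
Note on the `is_main_node` hint in this patch: a small sketch for finding which fuzzer directory currently holds the main role in a main-less setup. The function name `find_main_nodes` is hypothetical, and it scans every subdirectory of the sync directory rather than only `hostname-*`.

```sh
# Print the fuzzer directories under "$1" that contain an is_main_node file.
find_main_nodes() {
  sync_dir=$1
  for marker in "$sync_dir"/*/is_main_node; do
    # An unmatched glob stays literal, so make sure the file really exists.
    [ -e "$marker" ] || continue
    dirname "$marker"
  done
}
```

The directories it prints are the ones a sync script would need to transfer when no instance was started with -M.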

diff --git a/GNUmakefile b/GNUmakefile
index c0614d4d..fb60f301 100644
--- a/GNUmakefile
+++ b/GNUmakefile
@@ -492,7 +492,7 @@ ifndef AFL_NO_X86
 
 test_build: afl-gcc afl-as afl-showmap
 	@echo "[*] Testing the CC wrapper and instrumentation output..."
-	@unset AFL_USE_ASAN AFL_USE_MSAN AFL_CC; AFL_DEBUG=1 AFL_INST_RATIO=100 AFL_PATH=. ./$(TEST_CC) $(CFLAGS) test-instr.c -o test-instr $(LDFLAGS) 2>&1 | grep 'afl-as' >/dev/null || (echo "Oops, afl-as did not get called from "$(TEST_CC)". This is normally achieved by "$(CC)" honoring the -B option."; exit 1 )
+	@unset AFL_USE_ASAN AFL_USE_MSAN AFL_CC; AFL_DEBUG=1 AFL_INST_RATIO=100 AFL_AS_FORCE_INSTRUMENT=1 AFL_PATH=. ./$(TEST_CC) $(CFLAGS) test-instr.c -o test-instr $(LDFLAGS) 2>&1 | grep 'afl-as' >/dev/null || (echo "Oops, afl-as did not get called from "$(TEST_CC)". This is normally achieved by "$(CC)" honoring the -B option."; exit 1 )
 	ASAN_OPTIONS=detect_leaks=0 ./afl-showmap -m none -q -o .test-instr0 ./test-instr < /dev/null
 	echo 1 | ASAN_OPTIONS=detect_leaks=0 ./afl-showmap -m none -q -o .test-instr1 ./test-instr
 	@rm -f test-instr
diff --git a/docs/parallel_fuzzing.md b/docs/parallel_fuzzing.md
index 0c4cd237..12895ac3 100644
--- a/docs/parallel_fuzzing.md
+++ b/docs/parallel_fuzzing.md
@@ -195,7 +195,7 @@ to keep in mind:
   - Syncing is only necessary for the main nodes on a system. It is possible
     to run main-less with only secondaries. However then you need to find out
     which secondary took over the temporary role to be the main node. Look for
-    the `is_main` file in the fuzzer directories, eg. `sync-dir/hostname-*/is_main`
+    the `is_main_node` file in the fuzzer directories, e.g. `sync-dir/hostname-*/is_main_node`.
 
 It is *not* advisable to skip the synchronization script and run the fuzzers
 directly on a network filesystem; unexpected latency and unkillable processes
-- 
cgit 1.4.1


From 338638b124f46ac9fda25efc0060910a781d199c Mon Sep 17 00:00:00 2001
From: ploppelop <70337824+ploppelop@users.noreply.github.com>
Date: Mon, 31 Aug 2020 18:34:27 +0200
Subject: Update parallel_fuzzing.md

fix multisystem example
---
 docs/parallel_fuzzing.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/parallel_fuzzing.md b/docs/parallel_fuzzing.md
index 12895ac3..bf57ace8 100644
--- a/docs/parallel_fuzzing.md
+++ b/docs/parallel_fuzzing.md
@@ -146,7 +146,7 @@ write a simple script that performs two actions:
     ```sh
     for srchost in `cat HOSTLIST`; do
       for dsthost in `cat HOSTLIST`; do
-        test "$s" = "$d" && continue
+        test "$srchost" = "$dsthost" && continue
         ssh user@$srchost 'tar -kxzf -' < $dsthost.tgz
       done
     done
-- 
cgit 1.4.1