# Map Density

# How Coverage Works
The coverage in AFL++ works by assigning each basic block of code a unique ID
and, during execution, assigning each transition between blocks (e.g. by a call
or jump) an edge ID derived from the source and destination block IDs.

For each individual execution of the target, a one-dimensional byte array
indexed by the edge ID is used to count how many times each edge is traversed.

A one-dimensional cumulative byte array is also constructed, where each byte
again represents an individual edge ID, but this time the value of the byte
represents a range of how many times that edge has been traversed:

```1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+```

The theory is that a new path isn't particularly interesting if an edge has been
traversed `23` instead of `24` times, for example, but it is interesting if an
edge has been traversed for the very first time, or if the number of times it
was traversed falls within a different bucket.

After each run, the count of times each edge was hit is compared to the values
in the cumulative map, and if it is different, then the input is kept as a new
seed and the cumulative map is updated (a sketch of this bookkeeping is given at
the end of the Block & Edge Numbering section below).

This mechanism is described in greater detail in the seminal
[paper](https://lcamtuf.coredump.cx/afl/technical_details.txt) on AFL by
[lcamtuf](https://github.com/lcamtuf).

# Collisions
In black-box fuzzing, we must assume that control may flow from any block to any
other block, since we don't know any better. Thus, for a target with `n` basic
blocks of code, there are `n * n` potential edges. As we can see, even with a
small number of blocks, a very large map will be required so that we have space
to fit all of the edges. Even if our target only had `1024` blocks, this would
require a map containing `1048576` entries (or 1MB in size).

Whilst this may not seem like a lot of memory, it causes problems for two
reasons. Firstly, the processing step after each execution must now process much
more data, and secondly, a map this size is unlikely to fit within the L2 cache
of the processor. Since this is a very hot code path, we are likely to pay a
very heavy performance cost.

Therefore, we must accept that not all edges can have a unique ID and that there
will therefore be collisions. This means that if the fuzzer finds a new path by
uncovering an edge which was not previously found, but that edge shares its ID
with another edge, then the discovery may go completely unnoticed. This is
obviously undesirable, but equally, if our map is too large, then we will not be
able to process as many potential inputs in the same time and hence will not
uncover edges for that reason. Thus, a careful trade-off of map size must be
made.

# Block & Edge Numbering
Since the original AFL, blocks and edges have always been numbered in the same
way, as we can see from the following C snippet from the whitepaper:

```c
  cur_location = (block_address >> 4) ^ (block_address << 8); /* block ID   */
  shared_mem[cur_location ^ prev_location]++;                 /* count edge */
  prev_location = cur_location >> 1;                          /* see below  */
```

Each block ID is generated by performing a shift and XOR on its address. The
edge ID is then calculated as `E = B ^ (B' >> 1)`, where `B` is the current
block's ID and `B'` is the previous one's. In fact, the edge ID is also masked
to ensure it is less than the size of the map being used.
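To make this concrete, the following is a minimal, self-contained sketch
combining the numbering snippet above with the bucket comparison described
under How Coverage Works. It is illustrative only: the names (`on_block`,
`has_new_coverage`), the map size, and the exact bookkeeping are assumptions,
and the real AFL/AFL++ code differs in detail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE (1 << 16) /* a power of two, so masking keeps IDs in range */

static uint8_t  shared_mem[MAP_SIZE]; /* per-run raw hit counts           */
static uint8_t  cumulative[MAP_SIZE]; /* bucket bits seen across all runs */
static uint32_t prev_location;

/* Hot path: called on every block transition. */
static void on_block(uintptr_t block_address) {
  uint32_t cur_location =
      (uint32_t)((block_address >> 4) ^ (block_address << 8)) & (MAP_SIZE - 1);
  shared_mem[cur_location ^ prev_location]++;
  prev_location = cur_location >> 1;
}

/* Collapse a raw hit count onto one bit per bucket:
   1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+. */
static uint8_t bucket(uint8_t count) {
  if (count == 0)   return 0;
  if (count == 1)   return 1;
  if (count == 2)   return 2;
  if (count == 3)   return 4;
  if (count <= 7)   return 8;
  if (count <= 15)  return 16;
  if (count <= 31)  return 32;
  if (count <= 127) return 64;
  return 128;
}

/* Post-run: keep the input as a new seed if any edge landed in a bucket
   that the cumulative map has not recorded before. */
static int has_new_coverage(void) {
  int interesting = 0;
  for (size_t i = 0; i < MAP_SIZE; i++) {
    uint8_t seen = bucket(shared_mem[i]);
    if (seen & ~cumulative[i]) {
      cumulative[i] |= seen;
      interesting = 1;
    }
  }
  memset(shared_mem, 0, sizeof(shared_mem)); /* reset counts for next run */
  prev_location = 0;                         /* classic initialisation    */
  return interesting;
}
```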
We can make two observations about this numbering scheme.

## Block IDs
Firstly, the block ID doesn't have very good entropy. Whilst each block has a
unique address, those addresses aren't necessarily very evenly distributed.

We start with a large address and need to discard a large number of its bits to
generate a block ID which is within range. But how do we choose the bits of the
address which are unique to each block versus those which are the same for every
block? The high bits of the address may simply be all `0`s or all `1`s to make
the address canonical; the middle portion of the address may be the same for all
blocks (since if they are all within the same binary, then they will all be
adjacent in memory); and on some systems even the low bits may have poor
entropy, as some architectures use fixed-length aligned instructions. Then we
need to consider that a portion of each binary may contain the `.data` or
`.bss` sections and so may not contain any blocks of code at all.

## Edge IDs
Secondly, we can observe that when we generate an edge ID from the source and
destination block IDs, we perform a right shift on the source block ID. Whilst
there are good reasons, as set out in the whitepaper, why such a transform is
applied, in so doing we dispose of one bit of precious entropy in our source
block ID.

Altogether, this means that some edge IDs may be more popular than others, so
some portions of the map may be very densely populated with large numbers of
edges, whilst others may be very sparsely populated, or not populated at all.

# Improvements
One of the main reasons why this algorithm was selected is performance. All of
the operations are very quick to perform, and given that we may be carrying this
out for every block of code we execute, performance is critical.

However, the design of the binary instrumentation modes of AFL++ has moved on.
Both QEMU and FRIDA modes use a two-stage process when executing a target
application. Each block is first compiled or instrumented, and then it is
executed; the compiled blocks can be re-used each time the target executes them.

Since a block's ID is based on its address, and this is known at compile time,
we only need to generate this ID once per block, so ID generation no longer
needs to be as performant. We can therefore use a hash algorithm to generate
this ID and thereby ensure that the block IDs are more evenly distributed.

Edge IDs, however, can only be determined at run-time, since we don't know which
blocks a given input will traverse until we run it. But given that our block IDs
are now evenly distributed, generating an evenly distributed edge ID becomes
simple. Here, the only change we make is to use a rotate operation rather than a
shift operation, so that we don't lose a bit of entropy from the source ID.

So our new algorithm becomes:

```c
  cur_location = hash(block_address);         /* precomputed per block        */
  shared_mem[cur_location ^ prev_location]++; /* count edge                   */
  prev_location = rotate(cur_location, 1);    /* rotate: no entropy discarded */
```

Lastly, whereas in the original design `cur_location` was always set to `0` at
the beginning of a run, we instead set it to `hash(0)`.
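As a concrete illustration of this scheme, consider the following sketch. The
choice of `hash32` (the 32-bit finaliser from MurmurHash3) and the `rotate1`
helper are assumptions made for the example, not necessarily the functions that
AFL++'s FRIDA mode actually uses.

```c
#include <stdint.h>

#define MAP_SIZE (1 << 16)

/* A 32-bit mixing hash (MurmurHash3's finaliser), used here purely for
   illustration; it spreads block addresses evenly across the ID space. */
static uint32_t hash32(uint32_t x) {
  x ^= x >> 16;
  x *= 0x85ebca6b;
  x ^= x >> 13;
  x *= 0xc2b2ae35;
  x ^= x >> 16;
  return x;
}

/* Rotate left by one bit so that, unlike a right shift, no bit of
   entropy is discarded. */
static uint32_t rotate1(uint32_t x) { return (x << 1) | (x >> 31); }

static uint8_t  shared_mem[MAP_SIZE];
static uint32_t prev_location; /* set from hash32(0) at the start of a run */

/* Instrumentation time: computed once per block, when it is compiled. */
static uint32_t block_id(uintptr_t block_address) {
  return hash32((uint32_t)block_address);
}

/* Run time: executed on every block transition. */
static void on_edge(uint32_t cur_location) {
  shared_mem[(cur_location ^ prev_location) & (MAP_SIZE - 1)]++;
  prev_location = rotate1(cur_location);
}
```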
# Parallel Fuzzing
Another sub-optimal aspect of the original design is that, no matter how many
instances of the fuzzer you ran in parallel, each instance numbered each block
(and so each edge) with the same ID. Each instance would therefore find the same
subset of edges colliding with each other, and in the event of a collision, all
instances would hit the same roadblock.

However, if we instead use a different seed for our hashing function in each
instance, then each instance will assign each block a different ID and hence
each edge will be given a different edge ID. This means that whilst one instance
of the fuzzer may find that a given pair of edges collide, it is very unlikely
that another instance will find that the same pair also collide.

Due to the collaborative nature of parallel fuzzing, this means that whilst one
instance may struggle to find a particular new path because the new edge
collides, another instance will likely not encounter the same collision and will
thus be able to differentiate this new path and share it with the other
instances.

If only a single new edge is found, and the new path is shared with an instance
for which that edge collides, that instance may disregard it as irrelevant. In
practice, however, the discovery of a single new edge likely leads to several
more edges beneath it also being found, and therefore the likelihood of all of
these being collisions is very slim.
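A seeded variant of the illustrative block hash from the previous section shows
the idea; again, this is a hypothetical sketch rather than AFL++'s actual
implementation.

```c
#include <stdint.h>

#define MAP_SIZE (1 << 16)

/* Each fuzzer instance mixes its own seed into the block address, so block
   IDs - and hence edge IDs - differ from instance to instance. */
static uint32_t seeded_block_id(uintptr_t block_address, uint32_t seed) {
  uint32_t x = (uint32_t)block_address ^ seed;
  x ^= x >> 16;
  x *= 0x85ebca6b;
  x ^= x >> 13;
  x *= 0xc2b2ae35;
  x ^= x >> 16;
  return x & (MAP_SIZE - 1);
}
```

Two blocks whose IDs happen to collide under one instance's seed will almost
certainly receive distinct IDs under another instance's seed, so the instances'
blind spots no longer overlap.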