# Map Density

# How Coverage Works
The coverage in AFL++ works by assigning each basic block of code a unique ID
and, during execution, assigning each transition between blocks (e.g. by a call
or jump) an edge ID derived from the source and destination block IDs.

For each individual execution of the target, a one-dimensional byte array
indexed by the edge ID is used to count how many times each edge is traversed.

A one-dimensional cumulative byte array is also constructed, where each byte
again represents an individual edge ID, but this time the value of the byte
represents a range of how many times that edge has been traversed:

```1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+```

The theory is that a new path isn't particularly interesting if an edge has been
traversed `23` instead of `24` times, for example, but it is interesting if an
edge has been traversed for the very first time, or if the number of times it
was traversed falls within a different bucket.

After each run, the count of times each edge was hit is compared to the values
in the cumulative map, and if it is different, then the input is kept as a new
seed and the cumulative map is updated (a sketch of this bookkeeping is given at
the end of the Block & Edge Numbering section below).

This mechanism is described in greater detail in the seminal
[paper](https://lcamtuf.coredump.cx/afl/technical_details.txt) on AFL by
[lcamtuf](https://github.com/lcamtuf).

# Collisions
In black-box fuzzing, we must assume that control may flow from any block to any
other block, since we don't know any better. Thus, for a target with `n` basic
blocks of code, there are `n * n` potential edges. As we can see, even with a
small number of blocks, a very large map will be required so that we have space
to fit all of the edges. Even if our target only had `1024` blocks, this would
require a map containing `1048576` entries (or 1MB in size).

Whilst this may not seem like a lot of memory, it causes problems for two
reasons. Firstly, the processing step after each execution must now process much
more data, and secondly, a map this size is unlikely to fit within the L2 cache
of the processor. Since this is a very hot code path, we are likely to pay a
very heavy performance cost.

Therefore, we must accept that not all edges can have a unique ID and that there
will therefore be collisions. This means that if the fuzzer finds a new path by
uncovering an edge which was not previously found, but that edge shares its ID
with another edge, then the discovery may go completely unnoticed. This is
obviously undesirable, but equally, if our map is too large, then we will not be
able to process as many potential inputs in the same time and hence will not
uncover edges for that reason. Thus, a careful trade-off of map size must be
made.

# Block & Edge Numbering
Since the original AFL, blocks and edges have always been numbered in the same
way, as we can see from the following C snippet from the whitepaper:

```c
  cur_location = (block_address >> 4) ^ (block_address << 8); /* block ID   */
  shared_mem[cur_location ^ prev_location]++;                 /* count edge */
  prev_location = cur_location >> 1;                          /* see below  */
```

Each block ID is generated by performing a shift and XOR on its address. The
edge ID is then calculated as `E = B ^ (B' >> 1)`, where `B` is the current
block's ID and `B'` is the previous one's. In fact, the edge ID is also masked
to ensure it is less than the size of the map being used.
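To make this concrete, the following is a minimal, self-contained sketch
combining the numbering snippet above with the bucket comparison described
under How Coverage Works. It is illustrative only: the names (`on_block`,
`has_new_coverage`), the map size, and the exact bookkeeping are assumptions,
and the real AFL/AFL++ code differs in detail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE (1 << 16) /* a power of two, so masking keeps IDs in range */

static uint8_t  shared_mem[MAP_SIZE]; /* per-run raw hit counts           */
static uint8_t  cumulative[MAP_SIZE]; /* bucket bits seen across all runs */
static uint32_t prev_location;

/* Hot path: called on every block transition. */
static void on_block(uintptr_t block_address) {
  uint32_t cur_location =
      (uint32_t)((block_address >> 4) ^ (block_address << 8)) & (MAP_SIZE - 1);
  shared_mem[cur_location ^ prev_location]++;
  prev_location = cur_location >> 1;
}

/* Collapse a raw hit count onto one bit per bucket:
   1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+. */
static uint8_t bucket(uint8_t count) {
  if (count == 0)   return 0;
  if (count == 1)   return 1;
  if (count == 2)   return 2;
  if (count == 3)   return 4;
  if (count <= 7)   return 8;
  if (count <= 15)  return 16;
  if (count <= 31)  return 32;
  if (count <= 127) return 64;
  return 128;
}

/* Post-run: keep the input as a new seed if any edge landed in a bucket
   that the cumulative map has not recorded before. */
static int has_new_coverage(void) {
  int interesting = 0;
  for (size_t i = 0; i < MAP_SIZE; i++) {
    uint8_t seen = bucket(shared_mem[i]);
    if (seen & ~cumulative[i]) {
      cumulative[i] |= seen;
      interesting = 1;
    }
  }
  memset(shared_mem, 0, sizeof(shared_mem)); /* reset counts for next run */
  prev_location = 0;                         /* classic initialisation    */
  return interesting;
}
```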
We can make two observations about this numbering scheme.

## Block IDs
Firstly, the block ID doesn't have very good entropy. Whilst each block has a
unique address, those addresses aren't necessarily very evenly distributed.

We start with a large address and need to discard a large number of its bits to
generate a block ID which is within range. But how do we choose the bits of the
address which are unique to each block versus those which are the same for every
block? The high bits of the address may simply be all `0`s or all `1`s to make
the address canonical; the middle portion of the address may be the same for all
blocks (since if they are all within the same binary, then they will all be
adjacent in memory); and on some systems even the low bits may have poor
entropy, as some architectures use fixed-length aligned instructions. Then we
need to consider that a portion of each binary may contain the `.data` or
`.bss` sections and so may not contain any blocks of code at all.

## Edge IDs
Secondly, we can observe that when we generate an edge ID from the source and
destination block IDs, we perform a right shift on the source block ID. Whilst
there are good reasons, as set out in the whitepaper, why such a transform is
applied, in so doing we dispose of one bit of precious entropy in our source
block ID.

Altogether, this means that some edge IDs may be more popular than others, so
some portions of the map may be very densely populated with large numbers of
edges, whilst others may be very sparsely populated, or not populated at all.

# Improvements
One of the main reasons why this algorithm was selected is performance. All of
the operations are very quick to perform, and given that we may be carrying this
out for every block of code we execute, performance is critical.

However, the design of the binary instrumentation modes of AFL++ has moved on.
Both QEMU and FRIDA modes use a two-stage process when executing a target
application. Each block is first compiled or instrumented, and then it is
executed; the compiled blocks can be re-used each time the target executes them.

Since a block's ID is based on its address, and this is known at compile time,
we only need to generate this ID once per block, so ID generation no longer
needs to be as performant. We can therefore use a hash algorithm to generate
this ID and thereby ensure that the block IDs are more evenly distributed.

Edge IDs, however, can only be determined at run-time, since we don't know which
blocks a given input will traverse until we run it. But given that our block IDs
are now evenly distributed, generating an evenly distributed edge ID becomes
simple. Here, the only change we make is to use a rotate operation rather than a
shift operation, so that we don't lose a bit of entropy from the source ID.

So our new algorithm becomes:

```c
  cur_location = hash(block_address);         /* precomputed per block        */
  shared_mem[cur_location ^ prev_location]++; /* count edge                   */
  prev_location = rotate(cur_location, 1);    /* rotate: no entropy discarded */
```

Lastly, whereas in the original design `cur_location` was always set to `0` at
the beginning of a run, we instead set it to `hash(0)`.
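As a concrete illustration of this scheme, consider the following sketch. The
choice of `hash32` (the 32-bit finaliser from MurmurHash3) and the `rotate1`
helper are assumptions made for the example, not necessarily the functions that
AFL++'s FRIDA mode actually uses.

```c
#include <stdint.h>

#define MAP_SIZE (1 << 16)

/* A 32-bit mixing hash (MurmurHash3's finaliser), used here purely for
   illustration; it spreads block addresses evenly across the ID space. */
static uint32_t hash32(uint32_t x) {
  x ^= x >> 16;
  x *= 0x85ebca6b;
  x ^= x >> 13;
  x *= 0xc2b2ae35;
  x ^= x >> 16;
  return x;
}

/* Rotate left by one bit so that, unlike a right shift, no bit of
   entropy is discarded. */
static uint32_t rotate1(uint32_t x) { return (x << 1) | (x >> 31); }

static uint8_t  shared_mem[MAP_SIZE];
static uint32_t prev_location; /* set from hash32(0) at the start of a run */

/* Instrumentation time: computed once per block, when it is compiled. */
static uint32_t block_id(uintptr_t block_address) {
  return hash32((uint32_t)block_address);
}

/* Run time: executed on every block transition. */
static void on_edge(uint32_t cur_location) {
  shared_mem[(cur_location ^ prev_location) & (MAP_SIZE - 1)]++;
  prev_location = rotate1(cur_location);
}
```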
# Parallel Fuzzing
Another sub-optimal aspect of the original design is that, no matter how many
instances of the fuzzer you ran in parallel, each instance numbered each block
(and so each edge) with the same ID. Each instance would therefore find the same
subset of edges colliding with each other, and in the event of a collision, all
instances would hit the same roadblock.

However, if we instead use a different seed for our hashing function in each
instance, then each instance will assign each block a different ID and hence
each edge will be given a different edge ID. This means that whilst one instance
of the fuzzer may find that a given pair of edges collide, it is very unlikely
that another instance will find that the same pair also collide.

Due to the collaborative nature of parallel fuzzing, this means that whilst one
instance may struggle to find a particular new path because the new edge
collides, another instance will likely not encounter the same collision and will
thus be able to differentiate this new path and share it with the other
instances.

If only a single new edge is found, and the new path is shared with an instance
for which that edge collides, that instance may disregard it as irrelevant. In
practice, however, the discovery of a single new edge likely leads to several
more edges beneath it also being found, and therefore the likelihood of all of
these being collisions is very slim.
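A seeded variant of the illustrative block hash from the previous section shows
the idea; again, this is a hypothetical sketch rather than AFL++'s actual
implementation.

```c
#include <stdint.h>

#define MAP_SIZE (1 << 16)

/* Each fuzzer instance mixes its own seed into the block address, so block
   IDs - and hence edge IDs - differ from instance to instance. */
static uint32_t seeded_block_id(uintptr_t block_address, uint32_t seed) {
  uint32_t x = (uint32_t)block_address ^ seed;
  x ^= x >> 16;
  x *= 0x85ebca6b;
  x ^= x >> 13;
  x *= 0xc2b2ae35;
  x ^= x >> 16;
  return x & (MAP_SIZE - 1);
}
```

Two blocks whose IDs happen to collide under one instance's seed will almost
certainly receive distinct IDs under another instance's seed, so the instances'
blind spots no longer overlap.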