
## Roadmap 2.61

Makefile:
 - -march=native -Ofast -flto=full

afl-fuzz:
 - sync_fuzzers(): only masters sync from all, slaves only sync from master
   (@andrea: be careful, often people run all slaves)
 - ascii_only mode
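
The proposed sync policy can be sketched as a small selection function. This is only an illustration, not the actual `sync_fuzzers()` code: the function name `sync_sources`, the `instances` mapping, and the fallback for the all-slaves case mentioned in the note above are assumptions.

```python
def sync_sources(me, instances):
    """Return the instance names that `me` should sync from.

    `instances` maps instance name -> True for a master, False for a slave
    (a hypothetical representation, not afl-fuzz's own data structure).
    """
    others = sorted(name for name in instances if name != me)
    masters = [name for name in others if instances[name]]
    # Masters sync from everyone. A slave also syncs from everyone when no
    # master exists, so that an all-slaves setup keeps exchanging test cases
    # (assumed fallback, the roadmap only warns about this case).
    if instances[me] or not masters:
        return others
    # Otherwise a slave only syncs from the master(s).
    return masters


mixed = {"m0": True, "s1": False, "s2": False}
all_slaves = {"s1": False, "s2": False, "s3": False}
print(sync_sources("m0", mixed))
print(sync_sources("s1", mixed))
print(sync_sources("s1", all_slaves))
```

Without some fallback of this kind, a setup with only slaves would stop syncing entirely.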

gcc_plugin:
 - laf-intel
 - better instrumentation

qemu_mode:
 - update to 4.x (probably this will be skipped :( )
 - instrim for QEMU mode via static analysis (with r2pipe? or angr?)
   Idea: The static analyzer outputs a map in which each edge that must be
   skipped is marked with 1. QEMU loads it at startup in the parent process.
 - rename qemu specific envs to AFL_QEMU (AFL_ENTRYPOINT, AFL_CODE_START/END, AFL_COMPCOV_LEVEL?)
 - add AFL_QEMU_EXITPOINT (maybe multiple?)
 - add/implement AFL_QEMU_INST_LIBLIST and AFL_QEMU_NOINST_PROGRAM
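
A minimal sketch of the skip-map idea for instrim in QEMU mode, written in Python for brevity. The one-byte-per-edge layout, the 2^16 map size, and all function names are assumptions made for illustration; the real implementation would live in the QEMU patches.

```python
import io

MAP_SIZE = 1 << 16  # assumed map size, one byte per edge


def build_skip_map(skipped_edges, map_size=MAP_SIZE):
    """Static-analysis side: mark every edge that must be skipped with 1."""
    skip_map = bytearray(map_size)
    for edge in skipped_edges:
        skip_map[edge % map_size] = 1
    return bytes(skip_map)


def load_skip_map(stream, map_size=MAP_SIZE):
    """Parent-process side: read the map once at startup and check its size."""
    data = stream.read()
    if len(data) != map_size:
        raise ValueError(f"expected {map_size} bytes, got {len(data)}")
    return data


def should_instrument(skip_map, edge):
    """An edge is instrumented unless its slot is marked with 1."""
    return skip_map[edge % len(skip_map)] != 1


# Usage: an in-memory stream stands in for the file written by the analyzer.
skip_map = load_skip_map(io.BytesIO(build_skip_map({3, 42})))
print(should_instrument(skip_map, 42), should_instrument(skip_map, 7))
```

Loading the map in the parent process means forked children inherit it without re-reading the file.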

custom_mutators:
 - rip what Superion is doing into custom mutators for js, php, etc.
 - unify the python and custom mutator APIs


## The far away future:

Problem: Average targets (tiff, jpeg, unrar) go through 1500 edges.
         At afl's default map that means ~16 collisions and ~3 wrappings.
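
The collision estimate can be checked with a short calculation. The sketch below assumes edge IDs are spread uniformly at random over the map, which is an idealization, and uses the standard birthday-problem approximations; the function names are made up for this example. For 1500 edges in a 2^16 map it yields a value close to the ~16 collisions quoted above, and the 50% thresholds it prints can be compared against the map-size table under Solution #1.

```python
import math


def expected_collisions(edges, map_size):
    """Expected number of edges that land on a slot already used by another edge."""
    occupied = map_size * (1.0 - (1.0 - 1.0 / map_size) ** edges)
    return edges - occupied


def edges_for_collision(map_size, probability=0.5):
    """Approximate edge count at which at least one collision has the given probability."""
    return math.sqrt(2.0 * map_size * math.log(1.0 / (1.0 - probability)))


for exponent in range(16, 25):
    size = 1 << exponent
    print(f"2^{exponent}: ~{edges_for_collision(size):.0f} edges until a collision "
          f"is likely, {expected_collisions(1500, size):.1f} expected collisions "
          f"at 1500 edges")
```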

 - Solution #1: increase map size.
    Each +1 to the map size exponent (a doubling of the map) decreases
    fuzzing speed by ~10% and halves the collisions.
    The birthday paradox predicts collisions at this # of edges:

    | mapsize | edges |
    | :-----: | :---: |
    | 2^16    | 302   |
    | 2^17    | 427   |
    | 2^18    | 603   |
    | 2^19    | 853   |
    | 2^20    | 1207  |
    | 2^21    | 1706  |
    | 2^22    | 2412  |
    | 2^23    | 3411  |
    | 2^24    | 4823  |

    Increasing the map is an easy solution but also not a good one.

 - Solution #2: use dynamic map size and collision-free basic block IDs
    This only works in llvm_mode with llvm >= 9, though.
    A potentially good future solution. Heiko/hexcoder is following this up.

 - Solution #3: write instruction pointers to a big shared map
    A 512 KB/1 MB shared map into which the instrumented code writes the
    instruction pointer. The map must be big enough, but its size could be
    controlled from the command line.
    
    Good: complete coverage information, nothing is lost. The choice of
          analysis impacts speed, but this can be decided by user options.

    Neutral: a little bit slower, but no loss of coverage.

    Bad: completely changes how afl uses the map and the scheduling.

    Overall another very good solution. Marc Heuse/vanHauser is following this up.
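
To make Solution #3 more concrete, here is a rough Python model of a shared map that stores instruction pointers instead of hit counts. Everything in it is an assumption for illustration: the class name `IpTraceMap`, the append-only layout with an entry counter in the first slot, the 64-bit entries, and the behaviour when the buffer is full. The real instrumentation would write to shared memory from the target process.

```python
import struct


class IpTraceMap:
    """Hypothetical append-only trace buffer; slot 0 holds the entry count."""

    ENTRY = struct.Struct("<Q")  # one 64-bit little-endian value per slot

    def __init__(self, size_bytes=512 * 1024):
        self.buf = bytearray(size_bytes)
        self.capacity = size_bytes // self.ENTRY.size - 1

    def count(self):
        return self.ENTRY.unpack_from(self.buf, 0)[0]

    def reset(self):
        """Fuzzer side: clear the counter before each execution."""
        self.ENTRY.pack_into(self.buf, 0, 0)

    def record(self, ip):
        """Instrumentation side: append one instruction pointer."""
        n = self.count()
        if n >= self.capacity:
            return False  # map too small for this run, entry is dropped
        self.ENTRY.pack_into(self.buf, (n + 1) * self.ENTRY.size, ip)
        self.ENTRY.pack_into(self.buf, 0, n + 1)
        return True

    def trace(self):
        """Fuzzer side: read back the recorded instruction pointers in order."""
        size = self.ENTRY.size
        return [self.ENTRY.unpack_from(self.buf, (i + 1) * size)[0]
                for i in range(self.count())]


def edges(trace):
    """One possible analysis: the set of (previous, current) pairs."""
    return set(zip(trace, trace[1:]))


trace_map = IpTraceMap(size_bytes=64)  # tiny map, room for 7 entries
for ip in (0x1000, 0x1010, 0x1000, 0x1010):
    trace_map.record(ip)
print(sorted(edges(trace_map.trace())))
```

Because the raw trace is kept, the fuzzer can choose the analysis afterwards (edges, blocks, longer paths), which matches the "decided by user options" point above. The cost is that a run longer than the map loses its tail, hence the need for a configurable size.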