about summary refs log tree commit diff homepage
path: root/docs/overview
blob: 666a23d9426f02e3a034876999e55e735a74839e (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
/// @page overview High level overview of KLEE.
/// This document contains a high level overview of the inner workings of KLEE.
///
/// KLEE implements symbolic execution by interpreting LLVM bitcode. Symbolic
/// memory is defined by inserting special calls to KLEE (namely
/// klee_make_symbolic)
/// During execution, KLEE tracks all uses of symbolic memory. Constraints
/// on symbolic memory usage are collected. Memory
/// that is defined using previously declared symbolic memory become
/// symbolic as well.
/// Whenever a branch refering to symbolic memory is encountered, KLEE forks
/// the entire states and explores each side of the branch for which a possible
/// solution to the symbolic constraints can be found.
/// KLEE makes queries to STP to solve symbolic constraints.
///
/// The rest of this document describes some of the important components of KLEE
/// 
/// @section executor Interpreter
/// klee::Interpreter is the main abstract class defining the interface of the
/// bitcode interpreter. klee::Executor is the main concrete instance of this
/// class.
/// Application states (i.e. memory, registers and PC) are stored in instances of
/// class klee::ExecutionState. There is one such instance for each path beeing
/// executed (except when some states are merged together).
/// On a branch, if condition is symbolic, klee::Executor::fork returns a
/// klee::ExecutionState::StatePair which is a pair of ExecutionState to be
/// executed.
/// 
/// @section memory Memory model
/// MemoryObject's represent allocation sites in the program (calls to malloc, stack
/// objects, global variables)
/// and, at least conceptually, can be thought of as the unique name for the object
/// allocated at that site.
/// ObjectState's are used to store the actual contents of a MemoryObject in a
/// particular ExecutionState (but
/// can be shared). I need better names for these two things. 
/// 
/// Each ExecutionState stores a mapping of MemoryObjects -> ObjectState using the
/// AddressSpace data
/// structure (implemented as an immutable tree so that copying is cheap and the
/// shared structure is exploited).
/// Each AddressSpace may "own" some subset of the ObjectStates in the mapping. When
/// an AddressSpace
/// is duplicated it loses ownership of the ObjectState in the map. Any subsequent
/// write to an ObjectState will
/// create a copy of the object (AddressSpace::getWriteable). This is the COW
/// mechanism (which gets used
/// for all objects, not just globals).
/// 
/// From the point of view of the state and this mapping there is no distinction
/// between stack, heap, and global
/// objects. The only special handling for stack objects is that the MemoryObject is
/// marked as isLocal and the
/// MemoryObject is stored in the StackFrame alloca list. When the StackFrame is
/// popped these objects are
/// then unbound so that the state can no longer access the memory directly
/// (references to the memory object
/// may still remain in ReadExprs, but conceptually the actual memory is no longer
/// addressable).
/// 
/// It is also important that the AddressSpace mapping is ordered. We use this when
/// we need to resolve a symbolic
/// address to an ObjectState by first getting a particular value for the symbolic
/// address, and using that value to start
/// looking for objects that the pointer can fall within.
/// Difference betweens MemoryObjects and ObjectStates ?
///
/// @section expression Expressions
/// The various Expr classes mostly model the llvm instruction set. ref<Expr> is
/// used to maintain the reference count
/// but also embeds any constant expressions. In fact in the current code base
/// ConstantExprs should almost never be
/// created. Most of the Expr's are straightforward. Some of the most important ones
/// are Concat?Expr, which join
/// some number of bytes into a larger type, ExtractExpr which extracts smaller
/// types from larger ones, and ReadExpr
/// which is a symbolic array access.
///
/// The way memory is implemented all accesses are broken down into byte level
/// operations. This means that the
/// memory system (by which I mean the ObjectState data structure) tends to use a
/// lot of ExtractExpr and Concat?Expr,
/// so it is very important that these expressions fold their operands when
/// possible.
///
/// The ReadExpr is probably the most important one. Conceptually it is simply an
/// index and a list of (index, value)
/// updates (writes). The ReadExpr evaluates to all the values for which the two
/// indices can be equal. The ObjectState
/// structure uses a cache for concrete writes and for symbolic writes at concrete
/// indices, but for writes at symbolic
/// indices it must construct a list of such updates. These are stored in the
/// UpdateList and UpdateNode structures
/// which are again immutable data structures so that copy is cheap and the sharing
/// is exploited.
/// 
/// @section searcher Searcher
/// Base classe: klee::Searcher. The Executor uses a Searcher to select the next
/// state (i.e. program instance following a single path) for which an
/// instruction
/// will be executed. There are multiple implementations of Searcher in klee,
/// implementing different search policies. klee::RandomSearcher selects the next state randomly.
/// klee::DFSSearcher uses a depth first approach. klee::MergingSearcher tries
/// to merge states ?