/// @page overview High level overview of KLEE. /// This document contains a high level overview of the inner workings of KLEE. /// /// KLEE implements symbolic execution by interpreting LLVM bitcode. Symbolic /// memory is defined by inserting special calls to KLEE (namely /// klee_make_symbolic) /// During execution, KLEE tracks all uses of symbolic memory. Constraints /// on symbolic memory usage are collected. Memory /// that is defined using previously declared symbolic memory become /// symbolic as well. /// Whenever a branch refering to symbolic memory is encountered, KLEE forks /// the entire states and explores each side of the branch for which a possible /// solution to the symbolic constraints can be found. /// KLEE makes queries to STP to solve symbolic constraints. /// /// The rest of this document describes some of the important components of KLEE /// /// @section executor Interpreter /// klee::Interpreter is the main abstract class defining the interface of the /// bitcode interpreter. klee::Executor is the main concrete instance of this /// class. /// Application states (i.e. memory, registers and PC) are stored in instances of /// class klee::ExecutionState. There is one such instance for each path beeing /// executed (except when some states are merged together). /// On a branch, if condition is symbolic, klee::Executor::fork returns a /// klee::ExecutionState::StatePair which is a pair of ExecutionState to be /// executed. /// /// @section memory Memory model /// MemoryObject's represent allocation sites in the program (calls to malloc, stack /// objects, global variables) /// and, at least conceptually, can be thought of as the unique name for the object /// allocated at that site. /// ObjectState's are used to store the actual contents of a MemoryObject in a /// particular ExecutionState (but /// can be shared). I need better names for these two things. /// /// Each ExecutionState stores a mapping of MemoryObjects -> ObjectState using the /// AddressSpace data /// structure (implemented as an immutable tree so that copying is cheap and the /// shared structure is exploited). /// Each AddressSpace may "own" some subset of the ObjectStates in the mapping. When /// an AddressSpace /// is duplicated it loses ownership of the ObjectState in the map. Any subsequent /// write to an ObjectState will /// create a copy of the object (AddressSpace::getWriteable). This is the COW /// mechanism (which gets used /// for all objects, not just globals). /// /// From the point of view of the state and this mapping there is no distinction /// between stack, heap, and global /// objects. The only special handling for stack objects is that the MemoryObject is /// marked as isLocal and the /// MemoryObject is stored in the StackFrame alloca list. When the StackFrame is /// popped these objects are /// then unbound so that the state can no longer access the memory directly /// (references to the memory object /// may still remain in ReadExprs, but conceptually the actual memory is no longer /// addressable). /// /// It is also important that the AddressSpace mapping is ordered. We use this when /// we need to resolve a symbolic /// address to an ObjectState by first getting a particular value for the symbolic /// address, and using that value to start /// looking for objects that the pointer can fall within. /// Difference betweens MemoryObjects and ObjectStates ? /// /// @section expression Expressions /// The various Expr classes mostly model the llvm instruction set. ref is /// used to maintain the reference count /// but also embeds any constant expressions. In fact in the current code base /// ConstantExprs should almost never be /// created. Most of the Expr's are straightforward. Some of the most important ones /// are Concat?Expr, which join /// some number of bytes into a larger type, ExtractExpr which extracts smaller /// types from larger ones, and ReadExpr /// which is a symbolic array access. /// /// The way memory is implemented all accesses are broken down into byte level /// operations. This means that the /// memory system (by which I mean the ObjectState data structure) tends to use a /// lot of ExtractExpr and Concat?Expr, /// so it is very important that these expressions fold their operands when /// possible. /// /// The ReadExpr is probably the most important one. Conceptually it is simply an /// index and a list of (index, value) /// updates (writes). The ReadExpr evaluates to all the values for which the two /// indices can be equal. The ObjectState /// structure uses a cache for concrete writes and for symbolic writes at concrete /// indices, but for writes at symbolic /// indices it must construct a list of such updates. These are stored in the /// UpdateList and UpdateNode structures /// which are again immutable data structures so that copy is cheap and the sharing /// is exploited. /// /// @section searcher Searcher /// Base classe: klee::Searcher. The Executor uses a Searcher to select the next /// state (i.e. program instance following a single path) for which an /// instruction /// will be executed. There are multiple implementations of Searcher in klee, /// implementing different search policies. klee::RandomSearcher selects the next state randomly. /// klee::DFSSearcher uses a depth first approach. klee::MergingSearcher tries /// to merge states ?