1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
|
\section{Background}
As mentioned earlier, our technique is based on the symbolic execution,
an engine named \klee{} in particular. Their general principles
and some specifics are laid out here to make it less ambiguous to describe
our approach later on.
\subsection{Symbolic execution}
At a high level, symbolic execution is an interpreter loop.
Every iteration, it selects a symbolic process,
also known as a \emph{\gls{state}}, in whose context an instruction
is then executed~\cite{klee}. As shown in algorithm~\ref{symb-exec},
each path with possible \gls{sat} of the given program is explore.
The number of paths grows at an exponential rate, so the interpreter loop
also breaks at the exhaustion of the time budget~\cite{klee},
which is omitted from the algorithm for brevity.
\begin{algorithm}
\caption{Symbolic execution}\label{symb-exec}
\begin{algorithmic}[1]
\Require Program ($P$)
\Ensure Set of test cases ($T$)
\State $T \gets \varnothing$
\State $S \gets \Call{InitialState}{P}$ \Comment{set of states}
\While{$S \neq \varnothing$}
\State $s \gets \Call{Select}{S}$
\State $S \gets S \setminus \{s\}$
\State $(i, e, \varPhi) \gets s$
\Comment{instruction, environment, and path constraints}
\State $(i', e') \gets \Call{Execute}{i, e}$
\If{$\Call{IsHalt}{i}$}
\State $t \gets \Call{GenerateTest}{\varPhi}$
\If{$t \ne \bot$}
$T \gets T \cup \{t\}$
\EndIf
\ElsIf{$\Call{IsBranch}{i}$}
\State $(\varphi, i_c', i_a') \gets i'$
\Comment{condition, consequent, and alternative}
\If{$\Call{sat}{\varPhi \land \varphi}$}
\State $s_c' \gets (i_c', e', \varPhi \land \varphi)$
\State $S \gets S \cup \{s_c'\}$
\EndIf
\If{$\Call{sat}{\varPhi \land \lnot\varphi}$}
\State $s_a' \gets i_a', e', \varPhi \land \lnot\varphi$
\State $S \gets S \cup \{s_a'\}$
\EndIf
\Else
\State $s' \gets (i', e', \varPhi)$
\State $S \gets S \cup \{s'\}$
\EndIf
\EndWhile
\end{algorithmic}
\end{algorithm}
\pagebreak
A \gls{sat} problem involving background theories like arithmetic
in computer programs is known as a~\gls{smt} problem, a generalization
of the Boolean \gls{sat} problem~\cite{smt}, which is already NP-complete.
Under the exponential time hypothesis, the cost of solving
an arbitrary \gls{sat} problem grows exponentially with the number
of variables~\cite{k-sat}.
\subsection{Usage of \klee{}}\label{klee-use}
The symbolic execution engine \klee{} analyses the LLVM \gls{ir} of a program,
also known as its bitcode~\cite{bc}, starting from values marked as symbolic.
Marking can be communicated either through command-line options
for files (including standard streams) and command-line arguments
or via in-source annotations~\cite{klee-docs}.
As shown in algorithm~\ref{symb-exec}, the path constraints is only expanded
at branching instructions, so only values affecting the control flow
are tracked, since \klee{} targets the exploration of execution paths.
In~order to track other values of interest, i.e.~non-inputs, a condition
involving such value must be artificially injected. One common idiom for doing
this condition injection~\cite{angelix,cpr,trident} is shown
in figure~\ref{klee-output}. There, a~temporary symbolic avatar \texttt{var}
is created for the sole purpose of being fixed to the value \texttt{val}
in the path constraints. From here, this mechanism shall be referred to
in abstract descriptions as \textsc{SymbolicOutput(value, context)}.
\begin{figure}[h]\centering\hfill
\begin{subfigure}{0.6\textwidth}
\begin{verbatim}
#define KLEE_OUTPUT_GEN(T) \
T __klee_output_##T(T val, ...) { T var; \
klee_make_symbolic(&var, sizeof(T), ...); \
klee_assume(var == val); return val; }
KLEE_OUTPUT_GEN(int) ...
#define KLEE_OUTPUT(T, val, ...) \
__klee_output_##T((val), ...)
\end{verbatim}
\subcaption{}\label{klee-output-def}
\end{subfigure}\hfill
\begin{subfigure}{0.3\textwidth}
\begin{alltt}
int foo;
...
\textcolor{purple}{-bar(foo);}
\textcolor{teal}{+bar(KLEE_OUTPUT(foo));}\end{alltt}
\subcaption{}\label{klee-output-use}
\end{subfigure}\hfill
\captionsetup{subrefformat=parens}
\caption{Condition injection \subref{klee-output-def} implementation in C
and \subref{klee-output-use} example annotation.}\label{klee-output}
\end{figure}
|