diff options
Diffstat (limited to 'discuss.tex')
-rw-r--r-- | discuss.tex | 48 |
1 files changed, 47 insertions, 1 deletions
diff --git a/discuss.tex b/discuss.tex index 682bd5e..12af349 100644 --- a/discuss.tex +++ b/discuss.tex @@ -1,3 +1,49 @@ \section{Discussion} +The experiment on \textsc{IntroClass} reveals some limitations +of \psychic{}. First, the tool has trouble scaling with growing path spaces. +While this is inherent to the path explosion problem, not all paths +need to be explored. Therefore, more aggressive path pruning~\cite{homi,hydiff} +and scheduling heuristics~\cite{directed,rsc,paradyse} could prove +to be beneficial. The use of \textsc{DiffTest} on any (undifferentiated) +state pair in algorithm~\ref{diffgen} further hurts performance. + +Next, while in theory, \klee{} can execute any LLVM bitcode, +the programming interface sets the limit on logical interpretation. +More specifically, to leave rooms for optimization, C and POSIX standards +leave a lot of behaviors unspecified. For example, function \texttt{atoi(3)}, +which converts string to integer, may return any number upon +a parsing error~\cite{atoi}, lumping this implementation-defined number +with error cases. + +Some others like \texttt{scanf(3)} and \texttt{printf(3)} +are prohibitively expensive due to heavy pattern matching, and since +they operate on input/output, their calls are likely dominators +or post-dominators in the control-flow graph, requiring manual interventions. +Failing to replace these calls could either significantly degrade efficiency +or force the theorem prover to give up on \gls{smt} solving. In case +of \textsc{IntroClass}, ignoring \texttt{printf(3)} calls disabled +the use of output files capturing. + +On more positive notes, capturing local function return values +and mutable arguments has proved its usability. Together +with the use of eager concrete execution to skip symbolic test generation, +promising preliminaries have been obtained. + \subsection{Threats to Validity} -\subsection{Related Work} +The small scope of experiments on toy programs challenges +the practical generalizibility of the tool. Since the \textsc{Prophet}-based +\gls{apr} tool MSV mines for tokens within the codebase~\cite{prophet}, +patches are not diverse in semantics and the overall result +is not statistically strong. + +\subsection{Related Works} +There exists multiple differential test generators for a pair +of program revisions, such as \textsc{Nezha} which uses black- or gray-box +fuzzing~\cite{nezha}, \textsc{Shadow} based on symbolic execution~\cite{shadow}, +and \textsc{HyDiff} combining both~\cite{hydiff}. Some recent work targets +the opposite problem of detecting semantic code clone~\cite{semclone}. + +For interpreting multiple program variants symbolically at the same time, +alternative to \psychic{} keeping independent states, other approaches +may choose shadow execution~\cite{shadow}, or to join~\cite{join-split} +or merge~\cite{merge} states at higher-order branches. |