about summary refs log tree commit diff homepage
diff options
context:
space:
mode:
authorDaniel Dunbar <daniel@zuster.org>2009-05-27 16:03:03 +0000
committerDaniel Dunbar <daniel@zuster.org>2009-05-27 16:03:03 +0000
commit0192b63f304c75f225d7ad13c1ee348597195c77 (patch)
tree31669ba1b8b46e9d5cffb904778e191faca9de9a
parent77bcf7c643e72891d9ec468a41d1ab9c89b86dfe (diff)
downloadklee-0192b63f304c75f225d7ad13c1ee348597195c77.tar.gz
Move the regular expression example to "Tutorial Two".
git-svn-id: https://llvm.org/svn/llvm-project/klee/trunk@72472 91177308-0d34-0410-b5e6-96231b3b80d8
-rw-r--r--www/Examples.html134
-rw-r--r--www/Tutorial-2.html287
-rw-r--r--www/menu.html.incl4
-rw-r--r--www/tutorial-1.html4
-rw-r--r--www/tutorials.html8
5 files changed, 297 insertions, 140 deletions
diff --git a/www/Examples.html b/www/Examples.html
deleted file mode 100644
index 25ee6fb0..00000000
--- a/www/Examples.html
+++ /dev/null
@@ -1,134 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
-          "http://www.w3.org/TR/html4/strict.dtd">
-<html>
-<head>
-  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
-  <title>KLEE - Examples</title>
-  <link type="text/css" rel="stylesheet" href="menu.css" />
-  <link type="text/css" rel="stylesheet" href="content.css" />
-</head>
-<body>
-
-<!--#include virtual="menu.html.incl"-->
-
-<div id="content">
-
-<h1>KLEE Examples</h1>
-
-<p>Here is a basic example of using KLEE to test a simple regular expression matching function. You can find the basic example in the source tree under <tt>examples/regexp</tt>.</p>
-
-<h2>Example: <tt>Regexp.c</tt></h2>
-
-<p><tt>Regexp.c</tt> contains a simple regular expression matching function, and
-the bare bones testing harness (in <tt>main</tt>) need to explore this code with
-klee. You can see a version of the source
-code <a href="resources/Regexp.c.html">here</a>.</p>
-
-<p>This example will show to build and run the example using KLEE, as well as
-how to interpret the output, and some additional KLEE features that can be used
-when writing a test driver by hand.</p>
-
-<p>We'll start by showing how to build and run the example, and then explain how
-the test harness works in more detail.</p>
-
-<h3>Step 1: Building the example</h3>
-
-<p>The first step is to compile the source code using a compiler which can
-generate object files in LLVM bitcode format. Here we use <tt>llvm-gcc</tt>,
-but <a href="http://clang.llvm.org">Clang</a> works just as well!</p>
-
-<p>From within the <tt>examples/regexp</tt> directory:
-<div class="instr">
-  <b>$ llvm-gcc -I ../../include -emit-llvm -c -g Regexp.c</b>
-</div>
-which should create a <tt>Regexp.o</tt> file in LLVM bitcode
-format. The <tt>-I</tt> argument is used so that the compiler can
-find <a href="http://t1.minormatter.com/~ddunbar/klee-doxygen/klee_8h-source.html">"klee/klee.h"</a>,which
-contains definitions for the intrinsic functions used to interact with the KLEE
-virtual machine. <tt>-c</tt> is used because we only want to compile the code to
-an object file (not a native executable), and finally <tt>-g</tt> causes
-additional debug information to be stored in the object file, which KLEE will
-use to determine source line number information.</p>
-
-<p>If you have the LLVM tools installed in your path, you can verify that this step
-worked by running <tt>llvm-nm</tt> on the generated file:</p>
-<div class="instr">
-<pre>
-  <b>$ llvm-nm Regexp.o</b>
-         t matchstar
-         t matchhere
-         T match
-         T main
-         U klee_make_symbolic_name
-         d LC
-         d LC1
-</pre>
-</div>
-
-<p>Normally before running this program we would need to link it to create a
-native executable. However, KLEE runs directly on LLVM bitcode files -- since
-this program only has a single file there is no need to link. For "real"
-programs with multiple inputs,
-the <a href="http://llvm.org/cmds/llvm-link.html"><tt>llvm-link</tt></a>
-and <a href="http://llvm.org/cmds/llvm-ld.html"><tt>llvm-ld</tt></a> tools can
-be used in place of the regular link step to merge multiple LLVM bitcode files
-into a single module which can be executed by KLEE.</p>
-
-<h3>Step 2: Executing the code with KLEE</h3>
-
-<!-- FIXME: Make only-output-states-covering-new default -->
-<p>The next step is to execute the code with KLEE:</p>
-<div class="instr">
-<pre>
-<b>$ klee --only-output-states-covering-new Regexp.o</b>
-KLEE: output directory = "klee-out-1"
-KLEE: ERROR: .../klee/examples/regexp/Regexp.c:23: memory error: out of bound pointer
-KLEE: NOTE: now ignoring this error at this location
-KLEE: ERROR: .../klee/examples/regexp/Regexp.c:25: memory error: out of bound pointer
-KLEE: NOTE: now ignoring this error at this location
-KLEE: done: total instructions = 6334861
-KLEE: done: completed paths = 7692
-KLEE: done: generated tests = 22
-</pre>
-</div>
-
-<p>On startup, KLEE prints the directory used to store output (in this
-case <tt>klee-out-1</tt>). By default klee will use the first
-free <tt>klee-out-<em>N</em></tt> directory and also create a <tt>klee-last</tt>
-symlink which will point to the most recent created directory. You can specify a
-directory to use for outputs using the <tt>-output-dir=<em>path</em></tt>
-command line argument.</p>
-
-<p>While KLEE is running, it will print status messages for "important" events,
-for example when it finds an error in the program. In this case, KLEE detected
-to invalid memory accesses on lines 23 and 25 of our test program. We'll look
-more at this in a moment.</p>
-
-<p>Finally, when KLEE finishes execution it prints out a few statistics about
-the run. Here we see that KLEE executed a total of ~6 million instructions,
-explored 7,692 paths, and generated 22 test cases.</p>
-
-<p>Note that many realistic programs have an infinite (or extremely large)
-number of paths through them, and it is common that KLEE will not terminate. By
-default KLEE will run until the user presses Control-C (i.e. <tt>klee</tt> gets
-a SIGINT), but there are additional options to limit KLEE's runtime and memory
-usage:<p>
-<ul>
-  <li><tt>-max-time=<em>seconds</em></tt>: Halt execution after the given number
-  of seconds.</li>
-  <li><tt>-max-forks=<em>N</em></tt>: Stop forking after <em>N</em> symbolic
-  branches, and run the remaining paths to termination.</li>
-  <li><tt>-max-memory=<em>N</em></tt>: Try to limit memory consumption
-  to <em>N</em> megabytes.</li>
-</ul>
-</p>
-
-<h3>Step ?: Changing the test harness</h3>
-
-<p>FIXME: Step through & explain the test harness, and how we can modify it.</p>
-
-<p>FIXME: Write: show the important klee.h functions.</p>
-
-</div>
-</body>
-</html>
diff --git a/www/Tutorial-2.html b/www/Tutorial-2.html
new file mode 100644
index 00000000..47187a45
--- /dev/null
+++ b/www/Tutorial-2.html
@@ -0,0 +1,287 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
+          "http://www.w3.org/TR/html4/strict.dtd">
+<html>
+<head>
+  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+  <title>KLEE - Tutorial Two</title>
+  <link type="text/css" rel="stylesheet" href="menu.css" />
+  <link type="text/css" rel="stylesheet" href="content.css" />
+</head>
+<body>
+
+<!--#include virtual="menu.html.incl"-->
+
+<div id="content">
+
+<h1>Tutorial Two: Testing a Simple Regular Expression Library</h1>
+
+<p>This is an example of using KLEE to test a simple regular expression matching
+function. You can find the basic example in the source tree
+under <tt>examples/regexp</tt>.</p>
+
+<p><tt>Regexp.c</tt> contains a simple regular expression matching function, and
+the bare bones testing harness (in <tt>main</tt>) needed to explore this code
+with klee. You can see a version of the source
+code <a href="resources/Regexp.c.html">here</a>.</p>
+
+<p>This example will show to build and run the example using KLEE, as well as
+how to interpret the output, and some additional KLEE features that can be used
+when writing a test driver by hand.</p>
+
+<p>We'll start by showing how to build and run the example, and then explain how
+the test harness works in more detail.</p>
+
+<h2>Building the example</h2>
+
+<p>The first step is to compile the source code using a compiler which can
+generate object files in LLVM bitcode format. Here we use <tt>llvm-gcc</tt>,
+but <a href="http://clang.llvm.org">Clang</a> works just as well!</p>
+
+<p>From within the <tt>examples/regexp</tt> directory:
+<div class="instr">
+  <b>$ llvm-gcc -I ../../include -emit-llvm -c -g Regexp.c</b>
+</div>
+which should create a <tt>Regexp.o</tt> file in LLVM bitcode
+format. The <tt>-I</tt> argument is used so that the compiler can
+find <a href="http://t1.minormatter.com/~ddunbar/klee-doxygen/klee_8h-source.html">"klee/klee.h"</a>,
+which contains definitions for the intrinsic functions used to interact with the
+KLEE virtual machine. <tt>-c</tt> is used because we only want to compile the
+code to an object file (not a native executable), and finally <tt>-g</tt> causes
+additional debug information to be stored in the object file, which KLEE will
+use to determine source line number information.</p>
+
+<p>If you have the LLVM tools installed in your path, you can verify that this step
+worked by running <tt>llvm-nm</tt> on the generated file:</p>
+<div class="instr">
+<pre>
+  <b>$ llvm-nm Regexp.o</b>
+         t matchstar
+         t matchhere
+         T match
+         T main
+         U klee_make_symbolic_name
+         d LC
+         d LC1
+</pre>
+</div>
+
+<p>Normally before running this program we would need to link it to create a
+native executable. However, KLEE runs directly on LLVM bitcode files -- since
+this program only has a single file there is no need to link. For "real"
+programs with multiple inputs,
+the <a href="http://llvm.org/cmds/llvm-link.html"><tt>llvm-link</tt></a>
+and <a href="http://llvm.org/cmds/llvm-ld.html"><tt>llvm-ld</tt></a> tools can
+be used in place of the regular link step to merge multiple LLVM bitcode files
+into a single module which can be executed by KLEE.</p>
+
+<h2>Executing the code with KLEE</h2>
+
+<!-- FIXME: Make only-output-states-covering-new default -->
+<p>The next step is to execute the code with KLEE:</p>
+<div class="instr">
+<pre>
+<b>$ klee --only-output-states-covering-new Regexp.o</b>
+KLEE: output directory = "klee-out-1"
+KLEE: ERROR: .../klee/examples/regexp/Regexp.c:23: memory error: out of bound pointer
+KLEE: NOTE: now ignoring this error at this location
+KLEE: ERROR: .../klee/examples/regexp/Regexp.c:25: memory error: out of bound pointer
+KLEE: NOTE: now ignoring this error at this location
+KLEE: done: total instructions = 6334861
+KLEE: done: completed paths = 7692
+KLEE: done: generated tests = 22
+</pre>
+</div>
+
+<p>On startup, KLEE prints the directory used to store output (in this
+case <tt>klee-out-1</tt>). By default klee will use the first
+free <tt>klee-out-<em>N</em></tt> directory and also create a <tt>klee-last</tt>
+symlink which will point to the most recent created directory. You can specify a
+directory to use for outputs using the <tt>-output-dir=<em>path</em></tt>
+command line argument.</p>
+
+<p>While KLEE is running, it will print status messages for "important" events,
+for example when it finds an error in the program. In this case, KLEE detected
+to invalid memory accesses on lines 23 and 25 of our test program. We'll look
+more at this in a moment.</p>
+
+<p>Finally, when KLEE finishes execution it prints out a few statistics about
+the run. Here we see that KLEE executed a total of ~6 million instructions,
+explored 7,692 paths, and generated 22 test cases.</p>
+
+<p>Note that many realistic programs have an infinite (or extremely large)
+number of paths through them, and it is common that KLEE will not terminate. By
+default KLEE will run until the user presses Control-C (i.e. <tt>klee</tt> gets
+a SIGINT), but there are additional options to limit KLEE's runtime and memory
+usage:<p>
+<ul>
+  <li><tt>-max-time=<em>seconds</em></tt>: Halt execution after the given number
+  of seconds.</li>
+  <li><tt>-max-forks=<em>N</em></tt>: Stop forking after <em>N</em> symbolic
+  branches, and run the remaining paths to termination.</li>
+  <li><tt>-max-memory=<em>N</em></tt>: Try to limit memory consumption
+  to <em>N</em> megabytes.</li>
+</ul>
+</p>
+
+<h2>KLEE error reports</h2>
+
+<p>When KLEE detects an error in the program being executed it will generate a
+test case which exhibits the error, and write some additional information about
+the error into a file <tt>test<i>N</i>.<i>TYPE</i>.err</tt>, where <i>N</i> is
+the test case number, and <i>TYPE</i> identifies the kind of error. Some
+types of errors KLEE detects include:</p>
+
+<ul>
+  <li><b>ptr</b>: Stores or loads of invalid memory locations.</li>
+
+  <li><b>free</b>: Double or invalid <tt>free()</tt>.</li>
+
+  <li><b>abort</b>: The program called <tt>abort()</tt>.</li>
+
+  <li><b>assert</b>: An assertion failed.</li>
+
+  <li><b>div</b>: A division or modulus by zero was detected.</li>
+
+  <li><b>user</b>: There is a problem with the input (invalid <tt>klee</tt>
+  intrinsic calls) or the way KLEE is being used.</li>
+
+  <li><b>exec</b>: There was a problem which prevented KLEE from executing the
+  program; for example an unknown instruction, a call to an invalid function
+  pointer, or inline assembly.</li>
+
+  <li><b>model</b>: KLEE was unable to keep full precision and is only exploring
+  parts of the program state. For example, symbolic sizes to <tt>malloc</tt> are
+  not currently supported, in such cases KLEE will concretize the argument.</li>
+</ul>
+
+<p>KLEE will print a message to the console when it detects an error, in the
+test run above we can see that KLEE detected two memory errors. For all program
+errors, KLEE will write a simple backtrace into the <tt>.err</tt> file. This is
+what one of the errors above looks like:</p>
+<div class="instr">
+<pre>
+Error: memory error: out of bound pointer
+File: .../klee/examples/regexp/Regexp.c
+Line: 23
+Stack: 
+	#0 00000146 in matchhere (re=14816471, text=14815301) at .../klee/examples/regexp/Regexp.c:23
+	#1 00000074 in matchstar (c, re=14816471, text=14815301) at .../klee/examples/regexp/Regexp.c:16
+	#2 00000172 in matchhere (re=14816469, text=14815301) at .../klee/examples/regexp/Regexp.c:26
+	#3 00000074 in matchstar (c, re=14816469, text=14815301) at .../klee/examples/regexp/Regexp.c:16
+	#4 00000172 in matchhere (re=14816467, text=14815301) at .../klee/examples/regexp/Regexp.c:26
+	#5 00000074 in matchstar (c, re=14816467, text=14815301) at .../klee/examples/regexp/Regexp.c:16
+	#6 00000172 in matchhere (re=14816465, text=14815301) at .../klee/examples/regexp/Regexp.c:26
+	#7 00000231 in matchhere (re=14816464, text=14815300) at .../klee/examples/regexp/Regexp.c:30
+	#8 00000280 in match (re=14816464, text=14815296) at .../klee/examples/regexp/Regexp.c:38
+	#9 00000327 in main () at .../klee/examples/regexp/Regexp.c:59
+Info: 
+	address: 14816471
+	next: object at 14816624 of size 4
+	prev: object at 14816464 of size 7
+</pre>
+</div>
+
+<p>Each line of the backtrace lists the frame number, the instruction line (this
+is the line number in the <tt>assembly.ll</tt> file found along with the run
+output), the function and arguments (including values for the concrete
+parameters), and the source information.</p>
+
+<p>Particular error reports may also include additional information. For memory
+errors, KLEE will show the invalid address, and what objects are on the heap
+both before and after that address. In this case, we can see that the address
+happens to be exactly one byte past the end of the previous object.</p>
+
+<h2>Changing the test harness</h2>
+
+<p>The reason KLEE is finding memory errors in this program isn't because the
+regular expression functions have a bug, rather it indicates a problem in our
+test driver. The problem is that we are making the input regular expression
+buffer completely symbolic, but the <tt>match</tt> function expects it to be a
+null terminated string. Let's look at two ways we can fix this.</p>
+
+<p>The simplest way to fix this problem is to store <tt>'\0'</tt> at the end of
+the buffer, after making it symbolic. This makes our driver look like this:</p>
+<div class="instr">
+<pre>
+int main() {
+  // The input regular expression.
+  char re[SIZE];
+  
+  // Make the input symbolic. 
+  klee_make_symbolic(re, sizeof re, "re");
+  re[SIZE - 1] = '\0';
+
+  // Try to match against a constant string "hello".
+  match(re, "hello");
+
+  return 0;
+}
+</pre>
+</div>
+<p>Making a buffer symbolic just initializes the contents to refer to symbolic
+variables, we are still free to modify the memory as we wish. If you recompile
+and run <tt>klee</tt> on this test program, the memory errors should now be
+gone.</p>
+
+<p>Another way to accomplish the same effect is to use the <tt>klee_assume</tt>
+intrinsic function. <tt>klee_assume</tt> takes a single argument (an unsigned
+integer) which generally should some kind of conditional expression, and
+"assumes" that expression to be true on the current path (if that can never
+happen, i.e. the expression is provably false, KLEE will report an error).</pp>
+
+<p>We can use <tt>klee_assume</tt> to cause KLEE to only explore states where
+the string is null terminated by writing the driver like this:</p>
+<div class="instr">
+<pre>
+int main() {
+  // The input regular expression.
+  char re[SIZE];
+  
+  // Make the input symbolic. 
+  klee_make_symbolic(re, sizeof re, "re");
+  klee_assume(re[SIZE - 1] == '\0');
+
+  // Try to match against a constant string "hello".
+  match(re, "hello");
+
+  return 0;
+}
+</pre>
+</div>
+<p>In this particular example, both solutions work fine, but in
+general <tt>klee_assume</tt> is more flexible:</p>
+<ul>
+  <li>By explicitly declaring the constraint, this will force test cases to have
+  the <tt>'\0'</tt> in them. In the first example where we write the terminating
+  null explicitly, it doesn't matter what the last byte of the symbolic input is
+  and KLEE is free to generate any value. In some cases where you want to
+  inspect the test cases by hand, it is more convenient for the test case to
+  show all the values that matter.</li>
+
+  <li><tt>klee_assume</tt> can be used to encode more complicated
+  constraints. For example, we could use <tt>klee_assume(re[0] != '^')</tt> to
+  cause KLEE to only explore states where the first byte is
+  not <tt>'^'</tt>.</li>
+</ul>
+
+<p><b>NOTE</b>: One important caveat when using <tt>klee_assume</tt> with
+multiple conditions; remember that boolean conditionals like '&&' and '||' may
+be compiled into code which branches before computing the result of the
+expression. In such situations KLEE will branch the process *before* it reaches
+the call to <tt>klee_assume</tt>, which may result in exploring unnecessary
+additional states. For this reason it is good to use as simple expressions as
+possible to <tt>klee_assume</tt> (for example splitting a single call into
+multiple ones), and to use the '&' and '|' operators instead of the
+short-circuiting ones.</p>
+
+<!--
+<h2>Visualizing what KLEE explored<h2>
+
+<p>For large or long running programs, it is frequently useful to know exactly
+what code KLEE explored, where it spent its time, where branches were taken,
+and so on.</p>
+-->
+
+</div>
+</body>
+</html>
diff --git a/www/menu.html.incl b/www/menu.html.incl
index 58245138..1c15ab44 100644
--- a/www/menu.html.incl
+++ b/www/menu.html.incl
@@ -6,15 +6,15 @@
   <div class="submenu">
     <label>KLEE Info</label>
     <a href="index.html">About</a>
-    <a href="GetStarted.html">Get Started</a>
+    <a href="GetStarted.html">Getting Started</a>
     <a href="GetInvolved.html">Get Involved</a>
     <a href="tutorials.html">Tutorials</a>
-    <a href="Examples.html">Examples</a>
   </div>
 
   <div class="submenu">
     <label>Quick Links</label>
     <a href="http://keeda.stanford.edu/mailman/listinfo/klee-dev">klee-dev</a>
+    <a href="http://llvm.org/bugs/">Bug Reports</a>
     <a href="http://llvm.org/svn/llvm-project/klee/trunk/">Browse SVN</a>
     <a href="http://llvm.org/viewvc/llvm-project/klee/trunk/">Browse ViewVC</a>
     <a href="http://t1.minormatter.com/~ddunbar/klee-doxygen/index.html">doxygen</a>
diff --git a/www/tutorial-1.html b/www/tutorial-1.html
index 9f458237..3f9bfc4e 100644
--- a/www/tutorial-1.html
+++ b/www/tutorial-1.html
@@ -4,7 +4,7 @@
 <html>
 <head>
   <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
-  <title>The KLEE Symbolic Virtual Machine</title>
+  <title>KLEE - Tutorial One</title>
   <link type="text/css" rel="stylesheet" href="menu.css">
   <link type="text/css" rel="stylesheet" href="content.css">
 </head>
@@ -116,7 +116,7 @@
 
   <pre class="output">
   $ ktest-tool klee-last/test000001.ktest 
-  bout file  : 'klee-last/test000001.ktest'
+  ktest file : 'klee-last/test000001.ktest'
   args       : ['demo.o']
   num objects: 1
   object    0: name: 'input'
diff --git a/www/tutorials.html b/www/tutorials.html
index d618a91e..795c629e 100644
--- a/www/tutorials.html
+++ b/www/tutorials.html
@@ -4,7 +4,7 @@
 <html>
 <head>
   <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
-  <title>The KLEE Symbolic Virtual Machine</title>
+  <title>KLEE - Tutorials</title>
   <link type="text/css" rel="stylesheet" href="menu.css">
   <link type="text/css" rel="stylesheet" href="content.css">
 </head>
@@ -16,7 +16,11 @@
   <!--*********************************************************************-->
 
   <ol>
-  <li><a href="tutorial-1.html">Tutorial One</a>:  Using KLEE on a toy example</li>
+    <li><a href="tutorial-1.html">Tutorial One</a>: Using KLEE on a toy
+      example.</li>
+
+    <li><a href="Tutorial-2.html">Tutorial Two</a>: Testing a simple regular
+      expression library.</li>
   </ol>