diff options
author | Daniel Dunbar <daniel@zuster.org> | 2009-05-27 16:03:03 +0000 |
---|---|---|
committer | Daniel Dunbar <daniel@zuster.org> | 2009-05-27 16:03:03 +0000 |
commit | 0192b63f304c75f225d7ad13c1ee348597195c77 (patch) | |
tree | 31669ba1b8b46e9d5cffb904778e191faca9de9a | |
parent | 77bcf7c643e72891d9ec468a41d1ab9c89b86dfe (diff) | |
download | klee-0192b63f304c75f225d7ad13c1ee348597195c77.tar.gz |
Move the regular expression example to "Tutorial Two".
git-svn-id: https://llvm.org/svn/llvm-project/klee/trunk@72472 91177308-0d34-0410-b5e6-96231b3b80d8
-rw-r--r-- | www/Examples.html | 134 | ||||
-rw-r--r-- | www/Tutorial-2.html | 287 | ||||
-rw-r--r-- | www/menu.html.incl | 4 | ||||
-rw-r--r-- | www/tutorial-1.html | 4 | ||||
-rw-r--r-- | www/tutorials.html | 8 |
5 files changed, 297 insertions, 140 deletions
diff --git a/www/Examples.html b/www/Examples.html deleted file mode 100644 index 25ee6fb0..00000000 --- a/www/Examples.html +++ /dev/null @@ -1,134 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<html> -<head> - <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /> - <title>KLEE - Examples</title> - <link type="text/css" rel="stylesheet" href="menu.css" /> - <link type="text/css" rel="stylesheet" href="content.css" /> -</head> -<body> - -<!--#include virtual="menu.html.incl"--> - -<div id="content"> - -<h1>KLEE Examples</h1> - -<p>Here is a basic example of using KLEE to test a simple regular expression matching function. You can find the basic example in the source tree under <tt>examples/regexp</tt>.</p> - -<h2>Example: <tt>Regexp.c</tt></h2> - -<p><tt>Regexp.c</tt> contains a simple regular expression matching function, and -the bare bones testing harness (in <tt>main</tt>) need to explore this code with -klee. You can see a version of the source -code <a href="resources/Regexp.c.html">here</a>.</p> - -<p>This example will show to build and run the example using KLEE, as well as -how to interpret the output, and some additional KLEE features that can be used -when writing a test driver by hand.</p> - -<p>We'll start by showing how to build and run the example, and then explain how -the test harness works in more detail.</p> - -<h3>Step 1: Building the example</h3> - -<p>The first step is to compile the source code using a compiler which can -generate object files in LLVM bitcode format. Here we use <tt>llvm-gcc</tt>, -but <a href="http://clang.llvm.org">Clang</a> works just as well!</p> - -<p>From within the <tt>examples/regexp</tt> directory: -<div class="instr"> - <b>$ llvm-gcc -I ../../include -emit-llvm -c -g Regexp.c</b> -</div> -which should create a <tt>Regexp.o</tt> file in LLVM bitcode -format. The <tt>-I</tt> argument is used so that the compiler can -find <a href="http://t1.minormatter.com/~ddunbar/klee-doxygen/klee_8h-source.html">"klee/klee.h"</a>,which -contains definitions for the intrinsic functions used to interact with the KLEE -virtual machine. <tt>-c</tt> is used because we only want to compile the code to -an object file (not a native executable), and finally <tt>-g</tt> causes -additional debug information to be stored in the object file, which KLEE will -use to determine source line number information.</p> - -<p>If you have the LLVM tools installed in your path, you can verify that this step -worked by running <tt>llvm-nm</tt> on the generated file:</p> -<div class="instr"> -<pre> - <b>$ llvm-nm Regexp.o</b> - t matchstar - t matchhere - T match - T main - U klee_make_symbolic_name - d LC - d LC1 -</pre> -</div> - -<p>Normally before running this program we would need to link it to create a -native executable. However, KLEE runs directly on LLVM bitcode files -- since -this program only has a single file there is no need to link. For "real" -programs with multiple inputs, -the <a href="http://llvm.org/cmds/llvm-link.html"><tt>llvm-link</tt></a> -and <a href="http://llvm.org/cmds/llvm-ld.html"><tt>llvm-ld</tt></a> tools can -be used in place of the regular link step to merge multiple LLVM bitcode files -into a single module which can be executed by KLEE.</p> - -<h3>Step 2: Executing the code with KLEE</h3> - -<!-- FIXME: Make only-output-states-covering-new default --> -<p>The next step is to execute the code with KLEE:</p> -<div class="instr"> -<pre> -<b>$ klee --only-output-states-covering-new Regexp.o</b> -KLEE: output directory = "klee-out-1" -KLEE: ERROR: .../klee/examples/regexp/Regexp.c:23: memory error: out of bound pointer -KLEE: NOTE: now ignoring this error at this location -KLEE: ERROR: .../klee/examples/regexp/Regexp.c:25: memory error: out of bound pointer -KLEE: NOTE: now ignoring this error at this location -KLEE: done: total instructions = 6334861 -KLEE: done: completed paths = 7692 -KLEE: done: generated tests = 22 -</pre> -</div> - -<p>On startup, KLEE prints the directory used to store output (in this -case <tt>klee-out-1</tt>). By default klee will use the first -free <tt>klee-out-<em>N</em></tt> directory and also create a <tt>klee-last</tt> -symlink which will point to the most recent created directory. You can specify a -directory to use for outputs using the <tt>-output-dir=<em>path</em></tt> -command line argument.</p> - -<p>While KLEE is running, it will print status messages for "important" events, -for example when it finds an error in the program. In this case, KLEE detected -to invalid memory accesses on lines 23 and 25 of our test program. We'll look -more at this in a moment.</p> - -<p>Finally, when KLEE finishes execution it prints out a few statistics about -the run. Here we see that KLEE executed a total of ~6 million instructions, -explored 7,692 paths, and generated 22 test cases.</p> - -<p>Note that many realistic programs have an infinite (or extremely large) -number of paths through them, and it is common that KLEE will not terminate. By -default KLEE will run until the user presses Control-C (i.e. <tt>klee</tt> gets -a SIGINT), but there are additional options to limit KLEE's runtime and memory -usage:<p> -<ul> - <li><tt>-max-time=<em>seconds</em></tt>: Halt execution after the given number - of seconds.</li> - <li><tt>-max-forks=<em>N</em></tt>: Stop forking after <em>N</em> symbolic - branches, and run the remaining paths to termination.</li> - <li><tt>-max-memory=<em>N</em></tt>: Try to limit memory consumption - to <em>N</em> megabytes.</li> -</ul> -</p> - -<h3>Step ?: Changing the test harness</h3> - -<p>FIXME: Step through & explain the test harness, and how we can modify it.</p> - -<p>FIXME: Write: show the important klee.h functions.</p> - -</div> -</body> -</html> diff --git a/www/Tutorial-2.html b/www/Tutorial-2.html new file mode 100644 index 00000000..47187a45 --- /dev/null +++ b/www/Tutorial-2.html @@ -0,0 +1,287 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" + "http://www.w3.org/TR/html4/strict.dtd"> +<html> +<head> + <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /> + <title>KLEE - Tutorial Two</title> + <link type="text/css" rel="stylesheet" href="menu.css" /> + <link type="text/css" rel="stylesheet" href="content.css" /> +</head> +<body> + +<!--#include virtual="menu.html.incl"--> + +<div id="content"> + +<h1>Tutorial Two: Testing a Simple Regular Expression Library</h1> + +<p>This is an example of using KLEE to test a simple regular expression matching +function. You can find the basic example in the source tree +under <tt>examples/regexp</tt>.</p> + +<p><tt>Regexp.c</tt> contains a simple regular expression matching function, and +the bare bones testing harness (in <tt>main</tt>) needed to explore this code +with klee. You can see a version of the source +code <a href="resources/Regexp.c.html">here</a>.</p> + +<p>This example will show to build and run the example using KLEE, as well as +how to interpret the output, and some additional KLEE features that can be used +when writing a test driver by hand.</p> + +<p>We'll start by showing how to build and run the example, and then explain how +the test harness works in more detail.</p> + +<h2>Building the example</h2> + +<p>The first step is to compile the source code using a compiler which can +generate object files in LLVM bitcode format. Here we use <tt>llvm-gcc</tt>, +but <a href="http://clang.llvm.org">Clang</a> works just as well!</p> + +<p>From within the <tt>examples/regexp</tt> directory: +<div class="instr"> + <b>$ llvm-gcc -I ../../include -emit-llvm -c -g Regexp.c</b> +</div> +which should create a <tt>Regexp.o</tt> file in LLVM bitcode +format. The <tt>-I</tt> argument is used so that the compiler can +find <a href="http://t1.minormatter.com/~ddunbar/klee-doxygen/klee_8h-source.html">"klee/klee.h"</a>, +which contains definitions for the intrinsic functions used to interact with the +KLEE virtual machine. <tt>-c</tt> is used because we only want to compile the +code to an object file (not a native executable), and finally <tt>-g</tt> causes +additional debug information to be stored in the object file, which KLEE will +use to determine source line number information.</p> + +<p>If you have the LLVM tools installed in your path, you can verify that this step +worked by running <tt>llvm-nm</tt> on the generated file:</p> +<div class="instr"> +<pre> + <b>$ llvm-nm Regexp.o</b> + t matchstar + t matchhere + T match + T main + U klee_make_symbolic_name + d LC + d LC1 +</pre> +</div> + +<p>Normally before running this program we would need to link it to create a +native executable. However, KLEE runs directly on LLVM bitcode files -- since +this program only has a single file there is no need to link. For "real" +programs with multiple inputs, +the <a href="http://llvm.org/cmds/llvm-link.html"><tt>llvm-link</tt></a> +and <a href="http://llvm.org/cmds/llvm-ld.html"><tt>llvm-ld</tt></a> tools can +be used in place of the regular link step to merge multiple LLVM bitcode files +into a single module which can be executed by KLEE.</p> + +<h2>Executing the code with KLEE</h2> + +<!-- FIXME: Make only-output-states-covering-new default --> +<p>The next step is to execute the code with KLEE:</p> +<div class="instr"> +<pre> +<b>$ klee --only-output-states-covering-new Regexp.o</b> +KLEE: output directory = "klee-out-1" +KLEE: ERROR: .../klee/examples/regexp/Regexp.c:23: memory error: out of bound pointer +KLEE: NOTE: now ignoring this error at this location +KLEE: ERROR: .../klee/examples/regexp/Regexp.c:25: memory error: out of bound pointer +KLEE: NOTE: now ignoring this error at this location +KLEE: done: total instructions = 6334861 +KLEE: done: completed paths = 7692 +KLEE: done: generated tests = 22 +</pre> +</div> + +<p>On startup, KLEE prints the directory used to store output (in this +case <tt>klee-out-1</tt>). By default klee will use the first +free <tt>klee-out-<em>N</em></tt> directory and also create a <tt>klee-last</tt> +symlink which will point to the most recent created directory. You can specify a +directory to use for outputs using the <tt>-output-dir=<em>path</em></tt> +command line argument.</p> + +<p>While KLEE is running, it will print status messages for "important" events, +for example when it finds an error in the program. In this case, KLEE detected +to invalid memory accesses on lines 23 and 25 of our test program. We'll look +more at this in a moment.</p> + +<p>Finally, when KLEE finishes execution it prints out a few statistics about +the run. Here we see that KLEE executed a total of ~6 million instructions, +explored 7,692 paths, and generated 22 test cases.</p> + +<p>Note that many realistic programs have an infinite (or extremely large) +number of paths through them, and it is common that KLEE will not terminate. By +default KLEE will run until the user presses Control-C (i.e. <tt>klee</tt> gets +a SIGINT), but there are additional options to limit KLEE's runtime and memory +usage:<p> +<ul> + <li><tt>-max-time=<em>seconds</em></tt>: Halt execution after the given number + of seconds.</li> + <li><tt>-max-forks=<em>N</em></tt>: Stop forking after <em>N</em> symbolic + branches, and run the remaining paths to termination.</li> + <li><tt>-max-memory=<em>N</em></tt>: Try to limit memory consumption + to <em>N</em> megabytes.</li> +</ul> +</p> + +<h2>KLEE error reports</h2> + +<p>When KLEE detects an error in the program being executed it will generate a +test case which exhibits the error, and write some additional information about +the error into a file <tt>test<i>N</i>.<i>TYPE</i>.err</tt>, where <i>N</i> is +the test case number, and <i>TYPE</i> identifies the kind of error. Some +types of errors KLEE detects include:</p> + +<ul> + <li><b>ptr</b>: Stores or loads of invalid memory locations.</li> + + <li><b>free</b>: Double or invalid <tt>free()</tt>.</li> + + <li><b>abort</b>: The program called <tt>abort()</tt>.</li> + + <li><b>assert</b>: An assertion failed.</li> + + <li><b>div</b>: A division or modulus by zero was detected.</li> + + <li><b>user</b>: There is a problem with the input (invalid <tt>klee</tt> + intrinsic calls) or the way KLEE is being used.</li> + + <li><b>exec</b>: There was a problem which prevented KLEE from executing the + program; for example an unknown instruction, a call to an invalid function + pointer, or inline assembly.</li> + + <li><b>model</b>: KLEE was unable to keep full precision and is only exploring + parts of the program state. For example, symbolic sizes to <tt>malloc</tt> are + not currently supported, in such cases KLEE will concretize the argument.</li> +</ul> + +<p>KLEE will print a message to the console when it detects an error, in the +test run above we can see that KLEE detected two memory errors. For all program +errors, KLEE will write a simple backtrace into the <tt>.err</tt> file. This is +what one of the errors above looks like:</p> +<div class="instr"> +<pre> +Error: memory error: out of bound pointer +File: .../klee/examples/regexp/Regexp.c +Line: 23 +Stack: + #0 00000146 in matchhere (re=14816471, text=14815301) at .../klee/examples/regexp/Regexp.c:23 + #1 00000074 in matchstar (c, re=14816471, text=14815301) at .../klee/examples/regexp/Regexp.c:16 + #2 00000172 in matchhere (re=14816469, text=14815301) at .../klee/examples/regexp/Regexp.c:26 + #3 00000074 in matchstar (c, re=14816469, text=14815301) at .../klee/examples/regexp/Regexp.c:16 + #4 00000172 in matchhere (re=14816467, text=14815301) at .../klee/examples/regexp/Regexp.c:26 + #5 00000074 in matchstar (c, re=14816467, text=14815301) at .../klee/examples/regexp/Regexp.c:16 + #6 00000172 in matchhere (re=14816465, text=14815301) at .../klee/examples/regexp/Regexp.c:26 + #7 00000231 in matchhere (re=14816464, text=14815300) at .../klee/examples/regexp/Regexp.c:30 + #8 00000280 in match (re=14816464, text=14815296) at .../klee/examples/regexp/Regexp.c:38 + #9 00000327 in main () at .../klee/examples/regexp/Regexp.c:59 +Info: + address: 14816471 + next: object at 14816624 of size 4 + prev: object at 14816464 of size 7 +</pre> +</div> + +<p>Each line of the backtrace lists the frame number, the instruction line (this +is the line number in the <tt>assembly.ll</tt> file found along with the run +output), the function and arguments (including values for the concrete +parameters), and the source information.</p> + +<p>Particular error reports may also include additional information. For memory +errors, KLEE will show the invalid address, and what objects are on the heap +both before and after that address. In this case, we can see that the address +happens to be exactly one byte past the end of the previous object.</p> + +<h2>Changing the test harness</h2> + +<p>The reason KLEE is finding memory errors in this program isn't because the +regular expression functions have a bug, rather it indicates a problem in our +test driver. The problem is that we are making the input regular expression +buffer completely symbolic, but the <tt>match</tt> function expects it to be a +null terminated string. Let's look at two ways we can fix this.</p> + +<p>The simplest way to fix this problem is to store <tt>'\0'</tt> at the end of +the buffer, after making it symbolic. This makes our driver look like this:</p> +<div class="instr"> +<pre> +int main() { + // The input regular expression. + char re[SIZE]; + + // Make the input symbolic. + klee_make_symbolic(re, sizeof re, "re"); + re[SIZE - 1] = '\0'; + + // Try to match against a constant string "hello". + match(re, "hello"); + + return 0; +} +</pre> +</div> +<p>Making a buffer symbolic just initializes the contents to refer to symbolic +variables, we are still free to modify the memory as we wish. If you recompile +and run <tt>klee</tt> on this test program, the memory errors should now be +gone.</p> + +<p>Another way to accomplish the same effect is to use the <tt>klee_assume</tt> +intrinsic function. <tt>klee_assume</tt> takes a single argument (an unsigned +integer) which generally should some kind of conditional expression, and +"assumes" that expression to be true on the current path (if that can never +happen, i.e. the expression is provably false, KLEE will report an error).</pp> + +<p>We can use <tt>klee_assume</tt> to cause KLEE to only explore states where +the string is null terminated by writing the driver like this:</p> +<div class="instr"> +<pre> +int main() { + // The input regular expression. + char re[SIZE]; + + // Make the input symbolic. + klee_make_symbolic(re, sizeof re, "re"); + klee_assume(re[SIZE - 1] == '\0'); + + // Try to match against a constant string "hello". + match(re, "hello"); + + return 0; +} +</pre> +</div> +<p>In this particular example, both solutions work fine, but in +general <tt>klee_assume</tt> is more flexible:</p> +<ul> + <li>By explicitly declaring the constraint, this will force test cases to have + the <tt>'\0'</tt> in them. In the first example where we write the terminating + null explicitly, it doesn't matter what the last byte of the symbolic input is + and KLEE is free to generate any value. In some cases where you want to + inspect the test cases by hand, it is more convenient for the test case to + show all the values that matter.</li> + + <li><tt>klee_assume</tt> can be used to encode more complicated + constraints. For example, we could use <tt>klee_assume(re[0] != '^')</tt> to + cause KLEE to only explore states where the first byte is + not <tt>'^'</tt>.</li> +</ul> + +<p><b>NOTE</b>: One important caveat when using <tt>klee_assume</tt> with +multiple conditions; remember that boolean conditionals like '&&' and '||' may +be compiled into code which branches before computing the result of the +expression. In such situations KLEE will branch the process *before* it reaches +the call to <tt>klee_assume</tt>, which may result in exploring unnecessary +additional states. For this reason it is good to use as simple expressions as +possible to <tt>klee_assume</tt> (for example splitting a single call into +multiple ones), and to use the '&' and '|' operators instead of the +short-circuiting ones.</p> + +<!-- +<h2>Visualizing what KLEE explored<h2> + +<p>For large or long running programs, it is frequently useful to know exactly +what code KLEE explored, where it spent its time, where branches were taken, +and so on.</p> +--> + +</div> +</body> +</html> diff --git a/www/menu.html.incl b/www/menu.html.incl index 58245138..1c15ab44 100644 --- a/www/menu.html.incl +++ b/www/menu.html.incl @@ -6,15 +6,15 @@ <div class="submenu"> <label>KLEE Info</label> <a href="index.html">About</a> - <a href="GetStarted.html">Get Started</a> + <a href="GetStarted.html">Getting Started</a> <a href="GetInvolved.html">Get Involved</a> <a href="tutorials.html">Tutorials</a> - <a href="Examples.html">Examples</a> </div> <div class="submenu"> <label>Quick Links</label> <a href="http://keeda.stanford.edu/mailman/listinfo/klee-dev">klee-dev</a> + <a href="http://llvm.org/bugs/">Bug Reports</a> <a href="http://llvm.org/svn/llvm-project/klee/trunk/">Browse SVN</a> <a href="http://llvm.org/viewvc/llvm-project/klee/trunk/">Browse ViewVC</a> <a href="http://t1.minormatter.com/~ddunbar/klee-doxygen/index.html">doxygen</a> diff --git a/www/tutorial-1.html b/www/tutorial-1.html index 9f458237..3f9bfc4e 100644 --- a/www/tutorial-1.html +++ b/www/tutorial-1.html @@ -4,7 +4,7 @@ <html> <head> <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> - <title>The KLEE Symbolic Virtual Machine</title> + <title>KLEE - Tutorial One</title> <link type="text/css" rel="stylesheet" href="menu.css"> <link type="text/css" rel="stylesheet" href="content.css"> </head> @@ -116,7 +116,7 @@ <pre class="output"> $ ktest-tool klee-last/test000001.ktest - bout file : 'klee-last/test000001.ktest' + ktest file : 'klee-last/test000001.ktest' args : ['demo.o'] num objects: 1 object 0: name: 'input' diff --git a/www/tutorials.html b/www/tutorials.html index d618a91e..795c629e 100644 --- a/www/tutorials.html +++ b/www/tutorials.html @@ -4,7 +4,7 @@ <html> <head> <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> - <title>The KLEE Symbolic Virtual Machine</title> + <title>KLEE - Tutorials</title> <link type="text/css" rel="stylesheet" href="menu.css"> <link type="text/css" rel="stylesheet" href="content.css"> </head> @@ -16,7 +16,11 @@ <!--*********************************************************************--> <ol> - <li><a href="tutorial-1.html">Tutorial One</a>: Using KLEE on a toy example</li> + <li><a href="tutorial-1.html">Tutorial One</a>: Using KLEE on a toy + example.</li> + + <li><a href="Tutorial-2.html">Tutorial Two</a>: Testing a simple regular + expression library.</li> </ol> |