From 6ae711b1d900bffbca407fe97d5e5ce97745dff1 Mon Sep 17 00:00:00 2001 From: Dominic Chen Date: Thu, 25 Jul 2013 10:45:32 +0100 Subject: move website to separate repo --- www/Tutorial-2.html | 287 ---------------------------------------------------- 1 file changed, 287 deletions(-) delete mode 100644 www/Tutorial-2.html (limited to 'www/Tutorial-2.html') diff --git a/www/Tutorial-2.html b/www/Tutorial-2.html deleted file mode 100644 index 47187a45..00000000 --- a/www/Tutorial-2.html +++ /dev/null @@ -1,287 +0,0 @@ - - - - - KLEE - Tutorial Two - - - - - - - -
- -

Tutorial Two: Testing a Simple Regular Expression Library

- -

This is an example of using KLEE to test a simple regular expression matching -function. You can find the basic example in the source tree -under examples/regexp.

- -

Regexp.c contains a simple regular expression matching function, and -the bare bones testing harness (in main) needed to explore this code -with klee. You can see a version of the source -code here.

- -

This example will show to build and run the example using KLEE, as well as -how to interpret the output, and some additional KLEE features that can be used -when writing a test driver by hand.

- -

We'll start by showing how to build and run the example, and then explain how -the test harness works in more detail.

- -

Building the example

- -

The first step is to compile the source code using a compiler which can -generate object files in LLVM bitcode format. Here we use llvm-gcc, -but Clang works just as well!

- -

From within the examples/regexp directory: -

- $ llvm-gcc -I ../../include -emit-llvm -c -g Regexp.c -
-which should create a Regexp.o file in LLVM bitcode -format. The -I argument is used so that the compiler can -find "klee/klee.h", -which contains definitions for the intrinsic functions used to interact with the -KLEE virtual machine. -c is used because we only want to compile the -code to an object file (not a native executable), and finally -g causes -additional debug information to be stored in the object file, which KLEE will -use to determine source line number information.

- -

If you have the LLVM tools installed in your path, you can verify that this step -worked by running llvm-nm on the generated file:

-
-
-  $ llvm-nm Regexp.o
-         t matchstar
-         t matchhere
-         T match
-         T main
-         U klee_make_symbolic_name
-         d LC
-         d LC1
-
-
- -

Normally before running this program we would need to link it to create a -native executable. However, KLEE runs directly on LLVM bitcode files -- since -this program only has a single file there is no need to link. For "real" -programs with multiple inputs, -the llvm-link -and llvm-ld tools can -be used in place of the regular link step to merge multiple LLVM bitcode files -into a single module which can be executed by KLEE.

- -

Executing the code with KLEE

- - -

The next step is to execute the code with KLEE:

-
-
-$ klee --only-output-states-covering-new Regexp.o
-KLEE: output directory = "klee-out-1"
-KLEE: ERROR: .../klee/examples/regexp/Regexp.c:23: memory error: out of bound pointer
-KLEE: NOTE: now ignoring this error at this location
-KLEE: ERROR: .../klee/examples/regexp/Regexp.c:25: memory error: out of bound pointer
-KLEE: NOTE: now ignoring this error at this location
-KLEE: done: total instructions = 6334861
-KLEE: done: completed paths = 7692
-KLEE: done: generated tests = 22
-
-
- -

On startup, KLEE prints the directory used to store output (in this -case klee-out-1). By default klee will use the first -free klee-out-N directory and also create a klee-last -symlink which will point to the most recent created directory. You can specify a -directory to use for outputs using the -output-dir=path -command line argument.

- -

While KLEE is running, it will print status messages for "important" events, -for example when it finds an error in the program. In this case, KLEE detected -to invalid memory accesses on lines 23 and 25 of our test program. We'll look -more at this in a moment.

- -

Finally, when KLEE finishes execution it prints out a few statistics about -the run. Here we see that KLEE executed a total of ~6 million instructions, -explored 7,692 paths, and generated 22 test cases.

- -

Note that many realistic programs have an infinite (or extremely large) -number of paths through them, and it is common that KLEE will not terminate. By -default KLEE will run until the user presses Control-C (i.e. klee gets -a SIGINT), but there are additional options to limit KLEE's runtime and memory -usage:

-

-

- -

KLEE error reports

- -

When KLEE detects an error in the program being executed it will generate a -test case which exhibits the error, and write some additional information about -the error into a file testN.TYPE.err, where N is -the test case number, and TYPE identifies the kind of error. Some -types of errors KLEE detects include:

- - - -

KLEE will print a message to the console when it detects an error, in the -test run above we can see that KLEE detected two memory errors. For all program -errors, KLEE will write a simple backtrace into the .err file. This is -what one of the errors above looks like:

-
-
-Error: memory error: out of bound pointer
-File: .../klee/examples/regexp/Regexp.c
-Line: 23
-Stack: 
-	#0 00000146 in matchhere (re=14816471, text=14815301) at .../klee/examples/regexp/Regexp.c:23
-	#1 00000074 in matchstar (c, re=14816471, text=14815301) at .../klee/examples/regexp/Regexp.c:16
-	#2 00000172 in matchhere (re=14816469, text=14815301) at .../klee/examples/regexp/Regexp.c:26
-	#3 00000074 in matchstar (c, re=14816469, text=14815301) at .../klee/examples/regexp/Regexp.c:16
-	#4 00000172 in matchhere (re=14816467, text=14815301) at .../klee/examples/regexp/Regexp.c:26
-	#5 00000074 in matchstar (c, re=14816467, text=14815301) at .../klee/examples/regexp/Regexp.c:16
-	#6 00000172 in matchhere (re=14816465, text=14815301) at .../klee/examples/regexp/Regexp.c:26
-	#7 00000231 in matchhere (re=14816464, text=14815300) at .../klee/examples/regexp/Regexp.c:30
-	#8 00000280 in match (re=14816464, text=14815296) at .../klee/examples/regexp/Regexp.c:38
-	#9 00000327 in main () at .../klee/examples/regexp/Regexp.c:59
-Info: 
-	address: 14816471
-	next: object at 14816624 of size 4
-	prev: object at 14816464 of size 7
-
-
- -

Each line of the backtrace lists the frame number, the instruction line (this -is the line number in the assembly.ll file found along with the run -output), the function and arguments (including values for the concrete -parameters), and the source information.

- -

Particular error reports may also include additional information. For memory -errors, KLEE will show the invalid address, and what objects are on the heap -both before and after that address. In this case, we can see that the address -happens to be exactly one byte past the end of the previous object.

- -

Changing the test harness

- -

The reason KLEE is finding memory errors in this program isn't because the -regular expression functions have a bug, rather it indicates a problem in our -test driver. The problem is that we are making the input regular expression -buffer completely symbolic, but the match function expects it to be a -null terminated string. Let's look at two ways we can fix this.

- -

The simplest way to fix this problem is to store '\0' at the end of -the buffer, after making it symbolic. This makes our driver look like this:

-
-
-int main() {
-  // The input regular expression.
-  char re[SIZE];
-  
-  // Make the input symbolic. 
-  klee_make_symbolic(re, sizeof re, "re");
-  re[SIZE - 1] = '\0';
-
-  // Try to match against a constant string "hello".
-  match(re, "hello");
-
-  return 0;
-}
-
-
-

Making a buffer symbolic just initializes the contents to refer to symbolic -variables, we are still free to modify the memory as we wish. If you recompile -and run klee on this test program, the memory errors should now be -gone.

- -

Another way to accomplish the same effect is to use the klee_assume -intrinsic function. klee_assume takes a single argument (an unsigned -integer) which generally should some kind of conditional expression, and -"assumes" that expression to be true on the current path (if that can never -happen, i.e. the expression is provably false, KLEE will report an error). - -

We can use klee_assume to cause KLEE to only explore states where -the string is null terminated by writing the driver like this:

-
-
-int main() {
-  // The input regular expression.
-  char re[SIZE];
-  
-  // Make the input symbolic. 
-  klee_make_symbolic(re, sizeof re, "re");
-  klee_assume(re[SIZE - 1] == '\0');
-
-  // Try to match against a constant string "hello".
-  match(re, "hello");
-
-  return 0;
-}
-
-
-

In this particular example, both solutions work fine, but in -general klee_assume is more flexible:

- - -

NOTE: One important caveat when using klee_assume with -multiple conditions; remember that boolean conditionals like '&&' and '||' may -be compiled into code which branches before computing the result of the -expression. In such situations KLEE will branch the process *before* it reaches -the call to klee_assume, which may result in exploring unnecessary -additional states. For this reason it is good to use as simple expressions as -possible to klee_assume (for example splitting a single call into -multiple ones), and to use the '&' and '|' operators instead of the -short-circuiting ones.

- - - -
- - -- cgit 1.4.1