-rw-r--r-- doc/il.txt 109
1 file changed, 54 insertions, 55 deletions
diff --git a/doc/il.txt b/doc/il.txt
index 7ebaf64..d816732 100644
--- a/doc/il.txt
+++ b/doc/il.txt
@@ -46,18 +46,18 @@ to focus on language design issues.
 ~ Input Files
 ~~~~~~~~~~~~~
 
-The intermediate language is provided to QBE as text files.
-Usually, one file is generated per each compilation unit of
+The intermediate language is provided to QBE as text.
+Usually, one file is generated per each compilation unit from
 the frontend input language.  An IL file is a sequence of
 <@ Definitions > for data, functions, and types.  Once
 processed by QBE, the resulting file can be assembled and
 linked using a standard toolchain (e.g., GNU binutils).
 
-Here is a complete "Hello World" IL file, it defines a
+Here is a complete "Hello World" IL file which defines a
 function that prints to the screen.  Since the string is
 not a first class object (only the pointer is) it is
 defined outside the function's body.  Comments start with
-a # character and run until the end of the line.
+a # character and finish with the end of the line.
 
     # Define the string constant.
     data $str = { b "hello world", b 0 }
@@ -70,7 +70,7 @@ a # character and run until the end of the line.
     }
 
 If you have read the LLVM language reference, you might
-recognize the above example.  In comparison, QBE makes a
+recognize the example above.  In comparison, QBE makes a
 much lighter use of types and the syntax is terser.
 
 ~ BNF Notation
@@ -86,7 +86,7 @@ are listed below.
   * `( ... ),` designates a comma-separated list of the
     enclosed syntax;
   * `...*` and `...+` are used for arbitrary and
-    at-least-once repetition.
+    at-least-once repetition respectively.
 
 ~ Sigils
 ~~~~~~~~
@@ -94,14 +94,14 @@ are listed below.
 The intermediate language makes heavy use of sigils, all
 user-defined names are prefixed with a sigil.  This is
 to avoid keyword conflicts, and also to quickly spot the
-scope and kind of an identifier.
+scope and nature of identifiers.
 
  * `:` is for user-defined <@ Aggregate Types>
  * `$` is for globals (represented by a pointer)
  * `%` is for function-scope temporaries
  * `@` is for block labels
 
-In BNF syntax, we use `?IDENT` to designate an identifier
+In this BNF syntax, we use `?IDENT` to designate an identifier
 starting with the sigil `?`.
 
 - 2. Types
@@ -114,7 +114,7 @@ starting with the sigil `?`.
     BASETY := 'w' | 'l' | 's' | 'd'  # Base types
     EXTTY  := BASETY    | 'b' | 'h'  # Extended types
 
-The IL makes very minimal use of types.  By design, the types
+The IL makes minimal use of types.  By design, the types
 used are restricted to what is necessary for unambiguous
 compilation to machine code and C interfacing.  Unlike LLVM,
 QBE is not using types as a means to safety; they are only
@@ -140,16 +140,16 @@ section.
 ~ Subtyping
 ~~~~~~~~~~~
 
-The IL has a minimal subtyping feature for integer types.
+The IL has a minimal subtyping feature, for integer types only.
 Any value of type `l` can be used in a `w` context.  In that
 case, only the 32 least significant bits of the word value
 are used.
 
-Make note that it is the inverse of the usual subtyping on
+Make note that it is the opposite of the usual subtyping on
 integers (in C, we can safely use an `int` where a `long`
 is expected).  A long value cannot be used in word context.
 The rationale is that a word can be signed or unsigned, so
-extending it to a long can be done in two ways, either
+extending it to a long could be done in two ways, either
 by zero-extension, or by sign-extension.
 
 - 3. Constants
@@ -184,9 +184,9 @@ operand of the subtraction is a word (32-bit) context.
 
 Because specifying floating-point constants by their bits
 makes the code less readable, syntactic sugar is provided
-to express them.  Standard scientific notation is used with
-a prefix of `s_` for single and `d_` for double-precision
-numbers.  Once again, the following example defines twice
+to express them.  Standard scientific notation is prefixed
+with `s_` and `d_` for single and double precision numbers
+respectively. Once again, the following example defines twice
 the same double-precision constant.
 
     %x =d add d_0, d_-1
@@ -200,7 +200,7 @@ constants by the linker.
 ----------------
 
 Definitions are the essential components of an IL file.
-They can define three types of objects: Aggregate types,
+They can define three types of objects: aggregate types,
 data, and functions.  Aggregate types are never exported
 and do not compile to any code.  Data and function
 definitions have file scope and are mutually recursive
@@ -221,14 +221,14 @@ using the `export` keyword.
         'type' :IDENT '=' 'align' NUMBER '{' NUMBER '}'
 
 Aggregate type definitions start with the `type` keyword.
-They have file scope, but types must be defined before their
-first use.  The inner structure of a type is expressed by a
+They have file scope, but types must be defined before being
+referenced.  The inner structure of a type is expressed by a
 comma-separated list of <@ Simple Types> enclosed in curly
 braces.
 
     type :fourfloats = { s, s, d, d }
 
-For ease of generation, a trailing comma is tolerated by
+For ease of IL generation, a trailing comma is tolerated by
 the parser.  In case many items of the same type are
 sequenced (like in a C array), the shorter array syntax
 can be used.
@@ -243,7 +243,7 @@ explicitly specified by the programmer.
 
 Opaque types are used when the inner structure of an
 aggregate cannot be specified; the alignment for opaque
-types is mandatory.  They are defined by simply enclosing
+types is mandatory.  They are defined simply by enclosing
 their size between curly braces.
 
     type :opaque = align 16 { 32 }
@@ -264,7 +264,7 @@ their size between curly braces.
       |  '"' ... '"'         # String
       |  CONST               # Constant
 
-Data definitions define objects that will be emitted in the
+Data definitions express objects that will be emitted in the
 compiled file.  They can be local to the file or exported
 with global visibility to the whole program.
 
@@ -282,11 +282,11 @@ initialize multiple fields of the same size.
 The members of a struct will be packed.  This means that
 padding has to be emitted by the frontend when necessary.
 Alignment of the whole data objects can be manually specified,
-and when no alignment is provided, the maximum alignment of
+and when no alignment is provided, the maximum alignment from
 the platform is used.
 
 When the `z` letter is used the number following indicates
-the size of the field, the contents of the field are zero
+the size of the field; the contents of the field are zero
 initialized.  It can be used to add padding between fields
 or zero-initialize big arrays.
 
@@ -325,19 +325,18 @@ Here are various examples of data definitions.
 Function definitions contain the actual code to emit in
 the compiled file.  They define a global symbol that
 contains a pointer to the function code.  This pointer
-can be used in call instructions or stored in memory.
+can be used in `call` instructions or stored in memory.
 
 The type given right before the function name is the
 return type of the function.  All return values of this
-function must have the return type.  If the return
+function must have this return type.  If the return
 type is missing, the function cannot return any value.
 
 The parameter list is a comma separated list of
 temporary names prefixed by types.  The types are used
 to correctly implement C compatibility.  When an argument
-has an aggregate type, is is set on entry of the
-function to a pointer to the aggregate passed by the
-caller.  In the example below, we have to use a load
+has an aggregate type, a pointer to the aggregate is passed
+by the caller.  In the example below, we have to use a load
 instruction to get the value of the first (and only)
 member of the struct.
 
@@ -350,7 +349,7 @@ member of the struct.
     }
 
 If the parameter list ends with `...`, the function is
-a variadic function: It can accept a variable number of
+a variadic function: it can accept a variable number of
 arguments.  To access the extra arguments provided by
 the caller, use the `vastart` and `vaarg` instructions
 described in the <@ Variadic > section.
@@ -375,10 +374,10 @@ very good compatibility with C.  The <@ Call > section
 explains how to pass an environment parameter.
 
 Since global symbols are defined mutually recursive,
-there is no need for function declarations: A function
+there is no need for function declarations: a function
 can be referenced before its definition.
 Similarly, functions from other modules can be used
-without previous declarations.  All the type information
+without previous declaration.  All the type information
 is provided in the call instructions.
 
 The syntax and semantics for the body of functions
@@ -389,8 +388,8 @@ are described in the <@ Control > section.
 
 The IL represents programs as textual transcriptions of
 control flow graphs.  The control flow is serialized as
-a sequence of blocks of straight-line code and connected
-using jump instructions.
+a sequence of blocks of straight-line code which are
+connected using jump instructions.
 
 ~ Blocks
 ~~~~~~~~
@@ -406,12 +405,12 @@ All blocks have a name that is specified by a label at
 their beginning.  Then follows a sequence of instructions
 that have "fall-through" flow.  Finally one jump terminates
 the block.  The jump can either transfer control to another
-block of the same function or return, they are described
+block of the same function or return; they are described
 further below.
 
 The first block in a function must not be the target of
-any jump in the program.  If this need is encountered,
-the frontend can always insert an empty prelude block
+any jump in the program.  If this is really needed,
+the frontend could insert an empty prelude block
 at the beginning of the function.
 
 When one block jumps to the next block in the IL file,
@@ -453,7 +452,7 @@ the following list.
 
     When its word argument is non-zero, it jumps to its
     first label argument; otherwise it jumps to the other
-    label.  The argument must be of word type, because of
+    label.  The argument must be of word type; because of
     subtyping a long argument can be passed, but only its
     least significant 32 bits will be compared to 0.
 
@@ -461,7 +460,7 @@ the following list.
 
     Terminates the execution of the current function,
     optionally returning a value to the caller.  The value
-    returned must have the type given in the function
+    returned must be of the type given in the function
     prototype.  If the function prototype does not specify
     a return type, no return value can be used.
 
@@ -498,12 +497,12 @@ This is made explicit by the instruction suffix.
 The types of instructions are described below using a short
 type string.  A type string specifies all the valid return
 types an instruction can have, its arity, and the type of
-its arguments in function of its return type.
+its arguments depending on its return type.
 
 Type strings begin with acceptable return types, then
 follows, in parentheses, the possible types for the arguments.
-If the n-th return type of the type string is used for an
-instruction, the arguments must use the n-th type listed for
+If the N-th return type of the type string is used for an
+instruction, the arguments must use the N-th type listed for
 them in the type string.  When an instruction does not have a
 return type, the type string only contains the types of the
 arguments.
@@ -513,7 +512,7 @@ The following abbreviations are used.
   * `T` stands for `wlsd`
   * `I` stands for `wl`
   * `F` stands for `sd`
-  * `m` stands for the type of pointers on the target, on
+  * `m` stands for the type of pointers on the target; on
     64-bit architectures it is the same as `l`
 
 For example, consider the type string `wl(F)`, it mentions
@@ -540,7 +539,7 @@ towards zero.
 The signed and unsigned remainder operations are available
 as `rem` and `urem`.  The sign of the remainder is the same
 as the one of the dividend.  Its magnitude is smaller than
-the divisor's.  These two instructions and `udiv` are only
+the divisor one.  These two instructions and `udiv` are only
 available with integer arguments and result.
 
 Bitwise OR, AND, and XOR operations are available for both
@@ -548,8 +547,8 @@ integer types.  Logical operations of typical programming
 languages can be implemented using <@ Comparisons > and
 <@ Jumps >.
 
-Shift instructions `sar`, `shr`, and `shl` shift right or
-left their first operand by the amount in the second
+Shift instructions `sar`, `shr`, and `shl`, shift right or
+left their first operand by the amount from the second
 operand.  The shifting amount is taken modulo the size of
 the result type.  Shifting right can either preserve the
 sign of the value (using `sar`), or fill the newly freed
@@ -591,8 +590,8 @@ towards zero.
       * `loadsb`, `loadub` -- `I(mm)`
 
     For types smaller than long, two variants of the load
-    instruction is available: one will sign extend the value
-    loaded, while the other will zero extend it.  Remark that
+    instruction are available: one will sign extend the loaded
+    value, while the other will zero extend it.  Note that
     all loads smaller than long can load to either a long or
     a word.
 
@@ -635,9 +634,9 @@ instructions.  Pointers are stored in long temporaries.
 ~~~~~~~~~~~~~
 
 Comparison instructions return an integer value (either a word
-or a long), and compare values of arbitrary types.  The value
-returned is 1 if the two operands satisfy the comparison
-relation, and 0 otherwise.  The names of comparisons respect
+or a long), and compare values of arbitrary types.  The returned
+value is 1 if the two operands satisfy the comparison
+relation, or 0 otherwise.  The names of comparisons respect
 a standard naming scheme in three parts.
 
  1. All comparisons start with the letter `c`.
@@ -676,7 +675,7 @@ a standard naming scheme in three parts.
 
 For example, `cod` (`I(dd,dd)`) compares two double-precision
 floating point numbers and returns 1 if the two floating points
-are not NaNs, and 0 otherwise.  The `csltw` (`I(ww,ww)`)
+are not NaNs, or 0 otherwise.  The `csltw` (`I(ww,ww)`)
 instruction compares two words representing signed numbers and
 returns 1 when the first argument is smaller than the second one.
 
@@ -727,7 +726,7 @@ instruction to lower the precision of an integer temporary.
 ~~~~~~~~~~~~~~~
 
 The `cast` and `copy` instructions return the bits of their
-argument verbatim.  A `cast` will however change an integer
+argument verbatim.  However a `cast` will change an integer
 into a floating point of the same width and vice versa.
 
   * `cast` -- `wlsd(sdwl)`
@@ -755,7 +754,7 @@ single-precision floating point number `%f` into `%rs`.
 
     ABITY := BASETY | :IDENT
 
-The call instruction is special in many ways.  It is not
+The call instruction is special in several ways.  It is not
 a three-address instruction and requires the type of all
 its arguments to be given.  Also, the return type can be
 either a base type or an aggregate type.  These specifics
@@ -801,7 +800,7 @@ is essentially effectful: calling it twice in a row will
 return two consecutive arguments from the argument list.
 
 Both instructions take a pointer to a variable argument
-list as only argument.  The size and alignment of variable
+list as sole argument.  The size and alignment of variable
 argument lists depend on the target used.  However, it
 is possible to conservatively use the maximum size and
 alignment required by all the targets.
@@ -890,7 +889,7 @@ translate it in SSA form is to insert a phi instruction.
 
 Phi instructions return one of their arguments depending
 on where the control came from.  In the example, `%y` is
-set to 1 if the `@ift` branch is taken, and it is set to
+set to 1 if the `@ift` branch is taken, or it is set to
 2 otherwise.
 
 An important remark about phi instructions is that QBE