Age | Commit message (Collapse) | Author |
|
This is consistent with newtmp()
and newcon().
|
|
|
|
|
|
Crashing loads of uninitialized memory
proved to be a problem when implementing
unions using qbe. This patch introduces
a new UNDEF Ref to represent data that is
known to be uninitialized. Optimization
passes can make use of it to eliminate
some code. In the last compilation stages,
UNDEF is treated as the constant 0xdeaddead.
|
|
|
|
The .val field is signed in RSlot.
Add a new dedicated function to
fetch it as a signed int.
|
|
It is handy to express when
the end of a block cannot be
reached. If a hlt terminator
is executed, it traps the
program.
We don't go the llvm way and
specify execution semantics as
undefined behavior.
|
|
Symbols are a useful abstraction
that occurs in both Con and Alias.
In this patch they get their own
struct. This new struct packages
a symbol name and a type; the type
tells us where the symbol name
must be interpreted (currently, in
gobal memory or in thread-local
storage).
The refactor fixed a bug in
addcon(), proving the value of
packaging symbol names with their
type.
|
|
|
|
This pass limits stack usage when
many small aggregates are allocated
on the stack. A fast liveness
analysis figures out which slots
interfere and the pass then fuses
slots that do not interfere. The
pass also kills stack slots that
are only ever assigned.
On the hare stdlib test suite, this
fusion pass managed to reduce the
total eligible slot bytes count
by 84%.
The slots considered for fusion
must not escape and not exceed
64 bytes in size.
|
|
We will be using it in the new
coalesce() pass.
|
|
Stack slots may have padding
bytes, and if we want to have
precise liveness information
it's important that we are able
to tell them apart.
This patch extends fillalias()
to remember for every slot
what bytes were ever assigned.
In case the slot address does
not escape we know that only
these bytes matter.
To save space, we only store
this information if the slot
size is less than or equal to
NBit.
The Alias struct was reworked
a bit to save some space. I am
still not very satisfied with
its layout though.
|
|
We had the invariant that it'd
always be a temporary.
|
|
|
|
It is more natural to branch on a
flag than have different function
pointers for high-level passes.
|
|
|
|
|
|
The apple targets are not done yet.
|
|
|
|
The general idea is to give abis a
chance to talk before we've done all
the optimizations. Currently, all
targets eliminate {par,arg,ret}{sb,ub,...}
during this pass. The forthcoming
arm64_apple will, however, insert
proper extensions during abi0.
Moving forward abis can, for example,
lower small-aggregates passing there
so that memory optimizations can
interact better with function calls.
|
|
|
|
apple support is more than assembly syntax
in case of arm64 machines, and apple syntax
is currently useless in all cases but amd64;
rather than having a -G option that only
makes sense with amd64, we add a new target
amd64_apple
|
|
|
|
|
|
I also moved some isel logic
that would have been repeated
a third time in util.c.
|
|
That is not available on osx
so I tweaked the gas.c api
a little to conditionally
output the two directives.
|
|
The risc-v abi needs to know if a
type is defined as a union or not.
We cannot use nunion to obtain this
information because the risc-v abi
made the unfortunate decision of
treating
union { int i; }
differently from
int i;
So, instead, I introduce a single
bit flag 'isunion'.
|
|
|
|
It is mostly complete, but still has a few ABI bugs when passing
floats in structs, or when structs are passed partly in register,
and partly on stack.
|
|
This allows frontends to use BSS generically, without knowledge of
platform-dependent details.
|
|
|
|
parseref() has code to reuse address constants, but this is not
done in other passes such as fold or isel. Introduce a new function
newcon() which takes a Con and returns a Ref for that constant, and
use this whenever creating address constants.
This is necessary to fix folding of address constants when one
operand is already folded. For example, in
%a =l add $x, 1
%b =l add %a, 2
%c =w loadw %b
%a and %b were folded to $x+1 and $x+3 respectively, but then the
second add is visited again since it uses %a. This gets folded to
$x+3 as well, but as a new distinct constant. This results in %b
getting labeled as bottom instead of either constant, disabling the
replacement of %b by a constant in subsequent instructions (such
as the loadw).
|
|
|
|
Some abis, like the riscv one, treat
arguments differently depending on
whether they are variadic or not.
To prepare for the upcomming riscv
target, we change the variadic call
syntax and give meaning to the
location of the '...' marker.
# new syntax
%ret =w call $f(w %regular, ..., w %variadic)
By nature of their abis, the change
is backwards compatible for existing
targets.
|
|
Reported by Alessandro Mantovani.
These addresses are likely bogus, but
they triggered an unwarranted assertion
failure. We now raise a civilized error.
|
|
This now only limits the number of arguments when parsing the input SSA,
which is usually a small fixed size (depending on the frontend).
|
|
|
|
|
|
Slots are stored as `int` in Fn, so use the same type in Tmp.
Rearrange the fields in Tmp slightly so that sizeof(Tmp) stays the same
(at least on 64-bit systems).
|
|
Michael Forney needs this to run his
compiler on interesting programs.
|
|
Previously, we would skip ssa construction when
a temporary has a single definition. This is
only part of the ssa invariant: we must also
check that all uses are dominated by the single
definition. The new code does this.
In fact, qbe does not store all the dominators
for a block, so instead of walking the idom
linked list we use a rough heuristic and declare
conservatively that B0 dominates B1 when one of
the two conditions is true:
a. B0 is the start block
b. B0 is B1
Some measurements on a big file from Michael
Forney show that the code is still as fast as
before this patch.
|
|
|
|
They are now linear and can be
safely used with arguments that
have side-effects. This patch
also introduces an iscall()
macro and uses it to fix a
missing check for Ovacall in
liveness analysis.
|
|
The arm64 might have the same problem but it
is currently unable to handle them even in
instruction selection.
Thanks to Jean Dao for reporting the bug.
|
|
The stashing of constants in gas.c was also
changed to support 16-bytes constants.
|
|
|
|
Symbols in the source file are still limited in
length because the rest of the code assumes that
strings always fit in NString bytes.
Regardless, there is already a benefit because
comparing/copying symbol names does not require
using strcmp()/strcpy() anymore.
|
|
The previous heuristics were ad hoc and it was
hard to understand why they worked at all.
This patch can be summarized in three points:
1. When a register is freed (an instruction
assigns it), we try to find if a temporary
would like to be in it, and if we find one,
we move it in the newly freed register.
I call this an "eager move".
2. Temporaries now remember in what register
they were last allocated; this information
is stored in the field Tmp.visit, and
prevails on the field Tmp.hint when it is
set. (This makes having the same hint for
interfering temporaries not so disastrous.)
3. Blocks are now allocated in "onion" order,
from the innermost loop to the outermost.
This is the change I am the least sure
about; it should be evaluated thorougly.
|
|
Ori needs this. It should not cost much more memory at runtime,
only a minimal amount of address space.
|
|
The previous code was buggy. It would put a stack
pointer on the heap when handling "add $foo, 42".
The new code is more straightforward and hopefully
more correct. Only temporaries with a "stack"
alias class will have a slot pointer.
|