Age | Commit message (Collapse) | Author |
|
Crashing loads of uninitialized memory
proved to be a problem when implementing
unions using qbe. This patch introduces
a new UNDEF Ref to represent data that is
known to be uninitialized. Optimization
passes can make use of it to eliminate
some code. In the last compilation stages,
UNDEF is treated as the constant 0xdeaddead.
|
|
The copy elimination pass is not
complete. This patch improves
things a bit, but I think we still
have quite a bit of incompleteness.
We now consistently mark phis with
all arguments identical as copies.
Previously, they were inconsistently
eliminated by phisimpl(). An example
where they were not eliminated is
the following:
@blk2
%a = phi @blk0 %x, @blk1 %x
jnz ?, @blk3, @blk4
@blk3
%b = copy %x
@blk4
%c = phi @blk2 %a, @blk3 %b
In this example, neither %c nor %a
were marked as copies of %x because,
when phisimpl() is called, the copy
information for %b is not available.
The incompleteness is still present
and can be observed by modifying
the example above so that %a takes
a copy of %x through a back-edge.
Then, phisimpl()'s lack of copy
information about %b will prevent
optimization.
|
|
|
|
udiv %x, 1 == %x, and for each of sub, or, xor, sar, shr, and shl,
<op> %x, 0 == %x.
|
|
C99 6.5.2.5p6:
> If the compound literal occurs outside the body of a function,
> the object has static storage duration; otherwise, it has automatic
> storage duration associated with the enclosing block.
So, we can't use the address of a compound literal here. Instead,
just set p to NULL, and make the loop conditional on p being non-NULL.
Remarks from Quentin:
I made a cosmetic change to Michael's
original patch and merely pushed the
literal at toplevel.
|
|
The code used to see add 0, 10 as
a copy of 0.
|
|
When lowering pointer arithmetic, it is
natural for a C frontend to generate
those instructions.
|
|
The sparse data-flow analysis used for
copy elimination before this patch
could sometimes diverge. The core
reason for this behavior is that the
visitphi() function was not monotonic
in the following copy-of lattice:
top (represented as the temp
/ | \ itself)
x y z ...
\ | /
bot (represented as R)
This monotonicity defect could be
fixed by reverting 2f41ff03, but
then the pass would end up missing
some redundant phis.
This patch re-implements the pass
from scratch using a different
approach. The new algorithm should
get rid of all redundant copies. On
the other hand, it can run slower
than the monotonic sparse data-flow
analysis because, in the worst case,
an instruction in a phi cluster can
be visited as many times as there
are phis in the input program.
Thanks to Michael Forney for reviewing
and testing the new pass.
|
|
Compiler warned about comparison between signed and unsigned values.
|
|
|
|
|
|
|
|
The pass was not doing anything incorrect, but
it missed some opportunities to optimize. On
a copy heavy example I observed that, in the
output of the pass a phi of the following shape
remained:
%a =w phi @A %c, @B %a
Originally the phi for %a was:
%a =w phi @A %b, @B %a
Since %b was discovered a copy of %c, %a should
have been eliminated and replaced with %c.
I think the problem is that knowledge on the
first argument of the phi %a changes as the
algorithm progresses, a more detailed walk-
through follows.
In the first round of the algoritm, %a is
discovered to be a copy of its first argument
%b.
phi(%b, %a) -> %b
In the second round, %a is computed as the phi
of %c (since the first argument changed) and %b
(the result of the first iteration), in our
lattice, the glb of those two is bottom.
phi(%c, %b) -> %a (Bottom)
Finally, there is a third round in which we
compute %a as the phi of %a and %c, which again,
gives bottom.
phi(%c, %a) -> %a (Bottom)
The bug is not tied to a phi of a copy, for
example, if the first argument is speculated
to be a copy of 0 and then, this knowledge is
retracted; we will compute in sequence:
phi(0, %a) -> 0
phi(%b, 0) -> %a (Bottom)
phi(%b, %a) -> %a (Bottom)
The way I fixed this is by ignoring arguments
of the phi that were discovered to be copies of
the phi node itself. This will make the last
rounds above do the correct thing.
|
|
|
|
|
|
|
|
|
|
|