Age | Commit message | Author |
|
|
|
valloc is actually a POSIX function; reusing its
name prevents compilation on some systems.
|
|
The detection of empty permutations was incorrect
since the changes made to the vector routines.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
When rdx is used to return a value and is used as argument,
it is in the call defs and hence made dead by the loop
modified here. This is obviously erroneous behavior.
We instead rephrase the loop to make it clear that among
the caller-save registers, only the ones used by the call
must be live before the call.
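A minimal sketch of the rephrased rule, assuming registers are bits in a
64-bit mask. The names RegSet, Call, CALLER_SAVE, and both functions are
hypothetical and only illustrate the idea; the register list is not the
real ABI set.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t RegSet;            /* one bit per machine register (assumption) */

enum { RAX, RCX, RDX, RSI, RDI, RBX };

#define BIT(r) ((RegSet)1 << (r))
/* Hypothetical caller-save set, for illustration only. */
#define CALLER_SAVE (BIT(RAX) | BIT(RCX) | BIT(RDX) | BIT(RSI) | BIT(RDI))

typedef struct {
    RegSet use;   /* registers read by the call (arguments) */
    RegSet def;   /* registers written by the call (return values) */
} Call;

/* Illustrative reconstruction of the faulty order (assumption): uses are
 * added first, then every def is killed, so a register that is both an
 * argument and a return value (e.g. rdx) wrongly ends up dead. */
RegSet live_before_call_naive(const Call *c, RegSet live_after)
{
    return (live_after | c->use) & ~c->def;
}

/* Rephrased rule: among the caller-save registers, only the ones used by
 * the call are live before it; callee-save registers pass through. */
RegSet live_before_call(const Call *c, RegSet live_after)
{
    assert((c->use & ~CALLER_SAVE) == 0);   /* arguments live in caller-save regs */
    RegSet live = live_after & ~CALLER_SAVE;
    live |= c->use;
    return live;
}
```

With rdx in both use and def, the naive version drops it while the
rephrased one keeps it live.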
|
|
The IR generated by calls was very bulky because two
instructions were used for marking the live range of
a clobber.
This patch attempts to store the information of what
registers are use/def/clobber in the call instruction
itself, which leads to more compact code (even more so
once we have SSE registers). However, I find that
the amount of extra code needed is not really
reasonable. Fortunately it is not too invasive, thus
if the complexity creeps in, it should be easy to
revert.
|
|
|
|
|
|
revert commit d0e9e3ada106dfe8dcda7a0099b341000f00afb2.
|
|
|
|
I could never figure out a correct version
without the tests. I'm now fairly confident
dopm() will work as we need it to.
|
|
|
|
|
|
The old code was broken for sure, this one might
be. I have to create a test bench for the dopm
function. It would also test the parallel move
lowering (pmgen and folks).
|
|
The first one was not so bad: when a parallel move
clobbers a machine register in use, we used to
free the temporary t* using it, mark the register as
unavailable, and allocate a new location for
t*. But this fails when all the registers are in use.
In that case, the destination of the move must be in
a register r1, so I require a swap of the register
to copy (used by t*) with r1 and update the map
accordingly.
I would like to move all the above logic into a function
dealing with clobbers in general.
The second bug is in the parallel move compiler; this
one was a little nastier and could have caused much
debugging pain. It would be reasonable to test it
the same way I did for the slota() allocator.
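For reference, a sketch of how a parallel move can be lowered to copies
and swaps in a way that is easy to test. This is not the dopm()/pmgen
code; seqmoves, runops, and Op are hypothetical names, and registers are
plain integers.

```c
#include <assert.h>
#include <stddef.h>

enum { COPY, SWAP };
typedef struct { int kind, src, dst; } Op;

/* Sequentialize the parallel move dst[i] <- src[i] for 0 <= i < n.
 * Destinations must be pairwise distinct; src[] is modified.
 * out needs room for n operations; returns how many were emitted. */
size_t seqmoves(int *src, const int *dst, size_t n, Op *out)
{
    size_t nout = 0, left = 0, i, j;
    int done[64] = {0};

    assert(n <= 64);
    for (i = 0; i < n; i++) {
        if (src[i] == dst[i])
            done[i] = 1;                /* self-moves need no code */
        else
            left++;
    }
    while (left > 0) {
        int progress = 0;
        for (i = 0; i < n; i++) {
            int blocked = 0;
            if (done[i])
                continue;
            /* dst[i] cannot be written while a pending move still reads it */
            for (j = 0; j < n; j++)
                if (!done[j] && j != i && src[j] == dst[i])
                    blocked = 1;
            if (blocked)
                continue;
            out[nout++] = (Op){COPY, src[i], dst[i]};
            done[i] = 1; left--; progress = 1;
        }
        if (progress || left == 0)
            continue;
        /* only cycles remain: break one with a swap */
        for (i = 0; done[i]; i++)
            ;
        out[nout++] = (Op){SWAP, src[i], dst[i]};
        done[i] = 1; left--;
        /* the old content of dst[i] now lives in src[i] */
        for (j = 0; j < n; j++)
            if (!done[j] && src[j] == dst[i]) {
                src[j] = src[i];
                if (src[j] == dst[j]) { done[j] = 1; left--; }
            }
    }
    return nout;
}

/* Apply the emitted operations to a register file, for testing. */
void runops(const Op *ops, size_t n, int *reg)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (ops[i].kind == COPY) {
            reg[ops[i].dst] = reg[ops[i].src];
        } else {
            int t = reg[ops[i].src];
            reg[ops[i].src] = reg[ops[i].dst];
            reg[ops[i].dst] = t;
        }
    }
}
```

A test bench can then run random permutations through seqmoves and
runops and compare the register file against the expected result.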
|
|
An invariant is that all registers allocated at some point
have a hint. This makes the code removed by this commit
dead because of the if condition testing for empty hints.
|
|
|
|
We only allocate the hinted register to a temporary if
that register is not used already. In the max example
it gives a better result and it does not seem to affect
the collatz test.
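A minimal sketch of that policy, assuming a bitmask of used registers and
a "lowest free register" fallback. pickreg, NREG, and NOHINT are
hypothetical names; the real fallback order may differ.

```c
#include <assert.h>
#include <stdint.h>

enum { NREG = 8, NOHINT = -1 };

/* Pick a register for a temporary: take the hint when that register is
 * free, otherwise fall back to the lowest free register.
 * Returns -1 when every register is in use (the caller must spill). */
int pickreg(uint32_t *used, int hint)
{
    int r;

    assert(hint == NOHINT || (hint >= 0 && hint < NREG));
    if (hint != NOHINT && !(*used & (1u << hint))) {
        r = hint;
    } else {
        for (r = 0; r < NREG && (*used & (1u << r)); r++)
            ;
        if (r == NREG)
            return -1;
    }
    *used |= 1u << r;
    return r;
}
```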
|
|
It could very well be that the temporary we
assign already got assigned to the right
register! Good things happen.
|
|
There was a typo that caused the same successor to
always be selected for register allocation hinting.
Also, I now attempt to prioritize hints over the successor's
choices as it appears to give slightly better results...
Now that I think about it, the code re-using the most
frequent successor block's assignment might be dead
because all registers have hints if they got assigned
once. To investigate.
|
|
|
|
|
|
|
|
|
|
This is possible because we know that they are
represented by different integers.
|
|
I've been skeptical since I introduced it; this commit
proves that it costs more than it helps. I've also fixed
a bad bug in rega() where I alloc'ed the wrong size for
internal arrays. Enums now have names so I can use them
to cast in gdb to get the name corresponding to a constant.
|
|
|
|
This gives a more uniform use of the registers.
|
|
|
|
The subtraction constrained the register allocator
to allocate different registers for the result and
the second operand; now we use a neg trick to compile
it down. The machinery that was set up is, regardless,
interesting and will have to be used for floating
point computations (division).
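One plausible reading of the neg trick, sketched for a two-address
target: when the result shares a register with the second operand,
negate it and add the first operand. lowersub and Ins are hypothetical
names, and this is an assumption about the lowering, not the actual
emitter code.

```c
#include <assert.h>

enum { MOV, SUB, NEG, ADD };
typedef struct { int op, dst, src; } Ins;   /* two-address: dst = dst <op> src */

/* Lower  dst = lhs - rhs  to two-address instructions on register numbers.
 * Writes at most 2 instructions to out and returns the count. */
int lowersub(int dst, int lhs, int rhs, Ins out[2])
{
    assert(dst >= 0 && lhs >= 0 && rhs >= 0);
    if (dst == lhs) {
        /* already in place: dst -= rhs (also correct when rhs == dst) */
        out[0] = (Ins){SUB, dst, rhs};
        return 1;
    }
    if (dst == rhs) {
        /* a mov of lhs into dst would destroy rhs: dst = -rhs, then dst += lhs */
        out[0] = (Ins){NEG, dst, dst};
        out[1] = (Ins){ADD, dst, lhs};
        return 2;
    }
    out[0] = (Ins){MOV, dst, lhs};
    out[1] = (Ins){SUB, dst, rhs};
    return 2;
}
```

With this, the allocator no longer needs the result and the second
operand to be in different registers.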
The first bug in rega broke the explicit loop
invariant: we were using register unavailability
information from other blocks. It's still unclear
how we got wrong results from that considering mappings
are all 0-initialized.
The second bug is a stupid one: one sizeof operator was
missing from a memcpy...
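The exact call is not shown here, so the following only illustrates the
general shape of that class of bug. copymap is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n map entries. memcpy counts bytes, so the element count must be
 * scaled by the element size. */
void copymap(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    /* Buggy form: memcpy(dst, src, n);
     * it copies n bytes, i.e. only n / sizeof(int) entries. */
    memcpy(dst, src, n * sizeof *src);
}
```

Writing the size as `n * sizeof *src` keeps it correct if the element
type changes later.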
|
|
|
|
This might not be a good idea. The problem was that
many spurious registers would be added to the Bits
data-structures during compilation (and would
always remain 0). However, doing the above
modification makes the handling of temps and regs
less uniform, which makes the code longer and not
really nicer. Also, additional Bits structures
are required to track the registers independently.
Overall this might be a bad idea; consider reverting.
|
|
|
|
The commutativity information only makes sense for
arithmetic expressions. To account for that, I introduced
a new tri-valued boolean type B3. Memory operations, for
example, will receive an undefined commutativity trit.
The code emitter was buggy when rega emitted instructions
like 'rax = add 1, rax'. This is now fixed using the
commutativity information (we rewrite it to 'rax = add
rax, 1').
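A sketch of how a three-valued commutativity flag can drive that
rewrite. The constant names, the commutes table, fixargs, and the operand
encodings are all hypothetical; only the B3 idea comes from this commit.

```c
#include <assert.h>

typedef enum { B3F, B3T, B3U } B3;      /* false, true, undefined */

enum { OADD, OSUB, OLOAD, NOP };
enum { RAX = 0, CON1 = 100 };           /* hypothetical operand encodings */

/* Commutativity only makes sense for arithmetic; loads get "undefined". */
static const B3 commutes[NOP] = { [OADD] = B3T, [OSUB] = B3F, [OLOAD] = B3U };

typedef struct { int op, to, arg[2]; } Ins;

/* A two-address target wants the destination as first argument.
 * If only the second argument matches and the op commutes, swap them.
 * Returns 1 when the instruction ends up as  to = op to, x. */
int fixargs(Ins *i)
{
    assert(i->op >= 0 && i->op < NOP);
    if (i->arg[0] == i->to)
        return 1;
    if (i->arg[1] == i->to && commutes[i->op] == B3T) {
        int t = i->arg[0];
        i->arg[0] = i->arg[1];
        i->arg[1] = t;
        return 1;
    }
    return 0;
}
```

Non-commutative operations such as sub are left untouched and need a
different fix.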
|
|
|
|
|
|
|