Age | Commit message | Author
2019-07-11 | minic: fix undefined symbol linkage issue | Sergei V. Rogachev
The mandel example uses SDL2 for graphics output. When GCC is used to assemble and link the resulting *.s file, the linker reports undefined symbols from the library. This can be fixed by passing the library flags to the compiler after the source file name.
2019-05-16 | Fix a few uses of gassym missed in 9e7e5bff | Michael Forney
2019-05-15 | arm64: Handle stack allocations larger than 4095 bytes | Michael Forney
In this case, the immediate is too large to use directly in the add/sub instructions, so move it into a temporary register first. Also, for clarity, rearrange the if-conditions so that they match the constraints of the instructions that immediately follow.
2019-05-15 | arm64: Handle truncd instruction | Michael Forney
2019-05-15 | arm64: Use 32-bit register name when loading 'b' or 'h' into 'l' | Michael Forney
The ldrb and ldrh instructions require a 32-bit register name for the destination and will clear the upper 32 bits of that register.
2019-05-15 | Allow specifying literal global names | Michael Forney
2019-05-14 | drop dead declaration | Quentin Carbonneaux
2019-05-14 | fix a bad bug in copy detection | Quentin Carbonneaux
The code used to see add 0, 10 as a copy of 0.
2019-05-05 | add asm diffing in test script | Quentin Carbonneaux
2019-05-05 | fuse epilog deduplication with jump threading | Quentin Carbonneaux
2019-05-05 | revert last commit | Quentin Carbonneaux
The same functionality can be implemented naturally in the cfg simplification pass.
2019-05-04 | emit only one epilog per function | Quentin Carbonneaux
Previously, each ret would lead to an epilog. This caused bloat for large functions with multiple return points.
2019-05-03 | gas: use .balign instead of .align | Quentin Carbonneaux
.align N can either mean align to the next multiple of N or align to the next multiple of 1<<N. Credit goes to Jorge Acereda Macià for reporting this issue.
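The two readings of `.align N` can be compared with a small C sketch. The helper names `align_to_bytes` and `align_to_pow2` are hypothetical and only illustrate the arithmetic; they are not part of QBE or gas.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of n bytes (n must be a power of two).
 * This is the byte-count reading, the one .balign always uses. */
static uint64_t align_to_bytes(uint64_t addr, uint64_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return (addr + n - 1) & ~(n - 1);
}

/* Round addr up to the next multiple of 1<<n: the power-of-two reading. */
static uint64_t align_to_pow2(uint64_t addr, unsigned n)
{
    assert(n < 64);
    return align_to_bytes(addr, (uint64_t)1 << n);
}
```

With an address of 10 and N = 8, the first reading yields 16 while the second yields 256, which is why a directive with a single fixed meaning is the safer choice.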
2019-05-02 | move fillloop() after fold() | Quentin Carbonneaux
SCCP is currently the one and only pass which seriously affects control flow; so we must compute loop costs afterwards.
2019-05-02 | detect ubiquitous simple copies | Quentin Carbonneaux
When lowering pointer arithmetic, it is natural for a C frontend to generate those instructions.
2019-05-02 | revert heuristic to reuse stack slots | Quentin Carbonneaux
The heuristic was bogus for at least two reasons (see below), and, looking at some generated code, it looks like some other issues are more pressing.
1. A stack slot of 4 bytes could be used for a temporary of 8 bytes.
2. Should 2 arguments of an operation end up spilled, the same slot could be allocated to both!
2019-04-30 | isel fix for amd64 memory stores | Quentin Carbonneaux
The value argument of store instructions was handled incorrectly.
2019-04-29 | fix folding of unsigned operations | Quentin Carbonneaux
This fixes bugs similar to the ones fixed in the previous commit. In the folding code the invariant is that when a result is 32 bits wide, the low 32 bits of 'x' are correct. The high bits can be anything.
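A minimal sketch of why unsigned operations need care under such an invariant, assuming 32-bit values are carried in 64-bit variables whose high bits may hold garbage. The function `fold_udiv` and its `is_wide` parameter are hypothetical names chosen for illustration, not QBE's folding code.

```c
#include <assert.h>
#include <stdint.h>

/* Fold an unsigned division at compile time.
 * is_wide != 0 means the operation is 64 bits wide; otherwise it is
 * 32 bits wide and only the low 32 bits of each operand are meaningful. */
static uint64_t fold_udiv(uint64_t l, uint64_t r, int is_wide)
{
    if (!is_wide) {
        /* Unlike add or mul, the low bits of a quotient depend on the
         * high bits of the operands, so drop the garbage first. */
        l = (uint32_t)l;
        r = (uint32_t)r;
    }
    assert(r != 0); /* division by zero is not folded in this sketch */
    return l / r;
}
```

Dividing `0xffffffff00000008` by 2 as a 32-bit operation gives 4 once the high bits are discarded; without the truncation the low 32 bits of the quotient would be `0x80000004`.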
2019-04-29 | fold: Make sure 32-bit constants get sign extended when necessary | Michael Forney
2019-04-29 | amd64: Use unordered compare for floating points | Michael Forney
This prevents an FE_INVALID exception when comparing with NaN.
2019-04-29 | add missing gas prefix | Quentin Carbonneaux
Thanks to Jorge Acereda Macià for catching this.
2019-04-26 | new large test to evaluate performance | Quentin Carbonneaux
This was generated by csmith and then compiled to qbe il by Michael Forney's C compiler.
2019-04-26 | update conaddr test to catch early segfaults | Quentin Carbonneaux
2019-04-26 | Fix config.h dependency when OBJDIR != obj | Michael Forney
2019-04-26 | amd64/isel: Error if alloc size doesn't fit in Tmp slot type | Michael Forney
2019-04-26 | Allow stack allocations larger than SHRT_MAX * 4 bytes | Michael Forney
Slots are stored as `int` in Fn, so use the same type in Tmp. Rearrange the fields in Tmp slightly so that sizeof(Tmp) stays the same (at least on 64-bit systems).
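The limit named in the subject line can be worked out with a short sketch. It assumes, based only on the "SHRT_MAX * 4" figure above, that a slot number counts 4-byte units; `SLOT_BYTES`, `slots_needed` and `slot_fits_short` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>

enum { SLOT_BYTES = 4 }; /* assumed slot granularity, taken from the subject line */

/* Number of 4-byte slots needed to hold an allocation of the given size. */
static long slots_needed(long bytes)
{
    assert(bytes >= 0);
    return (bytes + SLOT_BYTES - 1) / SLOT_BYTES;
}

/* Would the slot count still be representable in a short? */
static int slot_fits_short(long bytes)
{
    return slots_needed(bytes) <= SHRT_MAX;
}
```

On a platform where SHRT_MAX is 32767, the largest allocation that fits is 131068 bytes; anything larger needs a wider field such as `int`.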
2019-04-26 | restore some code from b4a98c | Quentin Carbonneaux
I had forgotten that %rip can only be used as base when there is no index. I also added a test which stresses addressing selection with and without constants.
2019-04-25 | cleanup amd64 constant addressing | Quentin Carbonneaux
We now emit correct code when the user refers to a specific constant address. I also made some comments clearer in the instruction selection pass and got rid of some apparently useless code.
2019-04-24 | Fix default config.h for arm64 | Michael Forney
2019-04-17 | avoid some gcc warnings | Quentin Carbonneaux
In this case, the potential truncations flagged by gcc are only affecting debug information.
2019-04-16 | bump NString and NPred | Quentin Carbonneaux
Michael Forney needs this to run his compiler on interesting programs.
2019-04-15 | handle big constants moves to slots | Quentin Carbonneaux
There is no flavor of mov which can set 8 bytes of memory to a constant not representable as an int32. The solution is simply to emit two movs of 4 bytes each.
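The effect of splitting one 8-byte store into two 4-byte stores can be modelled in C. This is only an illustration of the idea, written with explicit little-endian byte order to match amd64; the helper names are hypothetical and do not correspond to QBE's emitter.

```c
#include <assert.h>
#include <stdint.h>

/* Does v fit in a sign-extended 32-bit immediate? */
static int fits_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Store a 32-bit value in little-endian byte order. */
static void store32_le(unsigned char *p, uint32_t v)
{
    assert(p);
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Store a 64-bit constant as two 4-byte halves:
 * the low half at offset 0 and the high half at offset 4. */
static void store_const64(unsigned char *slot, uint64_t c)
{
    store32_le(slot, (uint32_t)c);
    store32_le(slot + 4, (uint32_t)(c >> 32));
}

/* Read back 8 bytes in little-endian order. */
static uint64_t load64_le(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}
```

Reading the slot back after `store_const64` returns the original constant, so the two narrow stores are equivalent to the single wide store that the instruction set lacks.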
2019-04-11 | properly detect ssa form | Quentin Carbonneaux
Previously, we would skip ssa construction when a temporary has a single definition. This is only part of the ssa invariant: we must also check that all uses are dominated by the single definition. The new code does this.
In fact, qbe does not store all the dominators for a block, so instead of walking the idom linked list we use a rough heuristic and declare conservatively that B0 dominates B1 when one of the two conditions is true:
  a. B0 is the start block
  b. B0 is B1
Some measurements on a big file from Michael Forney show that the code is still as fast as before this patch.
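The conservative dominance test described above could be sketched as follows. The `Block` type and both function names are hypothetical, and the sketch ignores instruction order inside a block; it is not QBE's implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Block { int id; } Block; /* hypothetical stand-in for a basic block */

/* Conservative answer to "does b0 dominate b1?".
 * A result of 1 is always right; a result of 0 may be a false negative. */
static int dominates_conservative(const Block *start, const Block *b0, const Block *b1)
{
    assert(start && b0 && b1);
    return b0 == start || b0 == b1;
}

/* A temporary is treated as already in ssa form when it has exactly one
 * definition and every use sits in a block that the definition's block
 * conservatively dominates. */
static int already_ssa(const Block *start, int ndef, const Block *def,
                       const Block *const *uses, size_t nuse)
{
    if (ndef != 1)
        return 0;
    for (size_t i = 0; i < nuse; i++)
        if (!dominates_conservative(start, def, uses[i]))
            return 0;
    return 1;
}
```

A false negative only sends the temporary through the regular ssa construction, so the heuristic trades a little extra work for not having to walk dominator information.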
2019-04-08 | make sure a spill slot is initialized | Quentin Carbonneaux
If an instruction does not have a result, the variable `s` is not set. This could lead to a bogus slot assignment.
2019-03-14 | Rearrange the fields in Ins so the bit-fields get packed together | Michael Forney
2019-03-13 | simple heuristic to reuse stack slots | Quentin Carbonneaux
On test/spill1.ssa, the stack frame of the function f() goes from 56 bytes to 40 bytes. That's a reduction of close to 30%.
This patch also opens the door to folding operations on spill slots. For example

    movl $15, %r15d
    addl -X(%rbp), %r15d
    movl %r15d, -X(%rbp)

should become

    add $15, -X(%rbp)

when %r15d is not used afterwards.
2019-03-12 | improve range-checking macros | Quentin Carbonneaux
They are now linear and can be safely used with arguments that have side-effects. This patch also introduces an iscall() macro and uses it to fix a missing check for Ovacall in liveness analysis.
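One common way to write a range check that evaluates its tested argument only once relies on unsigned wraparound. The macros below are a sketch under that assumption, with hypothetical names; they are not copied from QBE.

```c
#include <assert.h>

/* Naive form: x appears twice, so an argument such as *p++ is evaluated twice. */
#define INRANGE_NAIVE(x, l, u) ((l) <= (x) && (x) <= (u))

/* Linear form: x appears exactly once. If x < l, the unsigned subtraction
 * wraps around to a huge value, so the single comparison rejects it. */
#define INRANGE_LINEAR(x, l, u) \
    ((unsigned)(x) - (unsigned)(l) <= (unsigned)(u) - (unsigned)(l))

/* Count the elements of v[0..n) that lie in [lo, hi], using an argument
 * with a side effect (*p++) that is safe only with the linear form. */
static int count_in_range(const int *v, int n, int lo, int hi)
{
    int hits = 0;
    const int *p = v;

    assert(lo <= hi);
    while (n-- > 0)
        hits += INRANGE_LINEAR(*p++, lo, hi);
    return hits;
}
```

The bounds still appear more than once in the linear form, so only the tested value may carry side effects in this sketch.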
2019-03-12 | emit valid code for mem->mem copies | Quentin Carbonneaux
2019-03-09 | add a stress test for phi spilling | Quentin Carbonneaux
2019-03-09 | make sure phis are temporaries in rega | Quentin Carbonneaux
In fact, after spilling, a phi can be a temporary or a slot. I am now pondering whether this is a good idea or not because it causes annoying mem->mem movs after register allocation.
2019-03-08 | use a hash table to parse temporaries | Quentin Carbonneaux
2019-03-07 | fix in load elimination (vacall is a call) | Michael Forney
2019-03-01 | skip expensive ssa-building loop when possible | Quentin Carbonneaux
If a temporary is assigned exactly once (most are), there is no need to do any work to put it in ssa form. On an input file of ~35k loc, this makes the processing time go from 2.9 secs to 1.2 secs.
2019-02-28 | update copyright years | Quentin Carbonneaux
2019-02-27 | Let runtime crash on zero div, don't fold it. | Andrew Chambers
Remarks from Quentin:
It is an important decision to use Bot and not Top as the result of 'x / 0'. By using Bot, we refuse to give a warrant to the compiler that would allow meaningless subsequent decisions. An example follows.
Clang, on my computer, will build a program which prints "Ho" when fed the following C:

    int main()
    {
        puts(1/0 ? "Hi" : "Ho");
    }

On the other hand, a C compiler based on QBE will build a program which crashes, as one would expect.
See also https://c9x.me/notes/2014-09-10.html
2019-02-26 | new copy elimination pass | Quentin Carbonneaux
The sparse data-flow analysis used for copy elimination before this patch could sometimes diverge. The core reason for this behavior is that the visitphi() function was not monotonic in the following copy-of lattice:

         top      (represented as the temp
        / | \      itself)
       x  y  z ...
        \ | /
         bot      (represented as R)

This monotonicity defect could be fixed by reverting 2f41ff03, but then the pass would end up missing some redundant phis.
This patch re-implements the pass from scratch using a different approach. The new algorithm should get rid of all redundant copies. On the other hand, it can run slower than the monotonic sparse data-flow analysis because, in the worst case, an instruction in a phi cluster can be visited as many times as there are phis in the input program.
Thanks to Michael Forney for reviewing and testing the new pass.
2019-02-25 | prefer bigger amd64 addressing | Quentin Carbonneaux
Before, amatch() would prefer matching "o + b" to "o + s*i" and "b + s*i".
2019-02-21 | fix amd64 addressing mode matcher | Quentin Carbonneaux
The numberer made some arranging choices when numbering arguments of an instruction, but these decisions were ignored when matching. The fix is to reconcile numbering and matching.
2019-02-21 | doc: Aggregate types can be nested | Michael Forney
2019-02-21 | Fix assertion failure if temporary was spilled in all predecessors | Michael Forney
Since ce0ab53ed7, we skip over predecessors that spilled the temporary. However, if all predecessors spilled, then we might not have an entry in `rl`, triggering an assertion failure in the following loop.