~cnx/roux - Alternative QBE compiler

Age	Commit message (Collapse)	Author
2019-07-11	minic: fix undefined symbol linkage issue	Sergei V. Rogachev
	The mandel example uses SDL2 for graphics output. When GCC is used to assemble the resulting *.s file it shows linker's errors about undefined symbols from the library. This behavior can be fixed by moving the flags passed to the compiler after the source file name.
2019-05-16	Fix a few uses of gassym missed in 9e7e5bff	Michael Forney

2019-05-15	arm64: Handle stack allocations larger than 4095 bytes	Michael Forney
	In this case, the immediate is too large to use directly in the add/sub instructions, so move it into a temporary register first. Also, for clarity, rearrange the if-conditions so that they match the constraints of the instructions that immediately follow.
2019-05-15	arm64: Handle truncd instruction	Michael Forney

2019-05-15	arm64: Use 32-bit register name when loading 'b' or 'h' into 'l'	Michael Forney
	The ldrb and ldrh instructions require a 32-bit register name for the destination and will clear the upper 32-bits of that register.
2019-05-15	Allow specifying literal global names	Michael Forney

2019-05-14	drop dead declaration	Quentin Carbonneaux

2019-05-14	fix a bad bug in copy detection	Quentin Carbonneaux
	The code used to see add 0, 10 as a copy of 0.
2019-05-05	add asm diffing in test script	Quentin Carbonneaux

2019-05-05	fuse epilog deduplication with jump threading	Quentin Carbonneaux

2019-05-05	revert last commit	Quentin Carbonneaux
	The same functionality can be implemented naturally in the cfg simplification pass.
2019-05-04	emit only one epilog per function	Quentin Carbonneaux
	Previously, each ret would lead to an epilog. This caused bloat for large functions with multiple return points.
2019-05-03	gas: use .balign instead of .align	Quentin Carbonneaux
	.align N can either mean align to the next multiple of N or align to the next multiple of 1<<N. Credit goes to Jorge Acereda Maciá for reporting this issue.
2019-05-02	move fillloop() after fold()	Quentin Carbonneaux
	SCCP is currently the one and only pass which seriously affects control flow; so we must compute loop costs afterwards.
2019-05-02	detect ubiquitous simple copies	Quentin Carbonneaux
	When lowering pointer arithmetic, it is natural for a C frontend to generate those instructions.
2019-05-02	revert heuristic to reuse stack slots	Quentin Carbonneaux
	The heuristic was bogus for at least two reasons (see below), and, looking at some generated code, it looks like some other issues are more pressing. 1. A stack slot of 4 bytes could be used for a temporary of 8 bytes. 2. Should 2 arguments of an operation end up spilled, the same slot could be allocated to both!
2019-04-30	isel fix for amd64 memory stores	Quentin Carbonneaux
	The value argument of store instructions was handled incorrectly.
2019-04-29	fix folding of unsigned operations	Quentin Carbonneaux
	This fixes similar bugs than the ones fixed in the previous commit. In the folding code the invariant is that when a result is 32 bits wide, the low 32 bits of 'x' are correct. The high bits can be anything.
2019-04-29	fold: Make sure 32-bit constants get sign extended when necessary	Michael Forney

2019-04-29	amd64: Use unordered compare for floating points	Michael Forney
	This prevents an FE_INVALID exception when comparing with NaN.
2019-04-29	add missing gas prefix	Quentin Carbonneaux
	Thanks to Jorge Acereda Maciá for catching this.
2019-04-26	new large test to evaluate performance	Quentin Carbonneaux
	This was generated by csmith and then compiled to qbe il by Michael Forney's C compiler.
2019-04-26	update conaddr test to catch early segfaults	Quentin Carbonneaux

2019-04-26	Fix config.h dependency when OBJDIR != obj	Michael Forney

2019-04-26	amd64/isel: Error if alloc size doesn't fit in Tmp slot type	Michael Forney

2019-04-26	Allow stack allocations larger than SHRT_MAX * 4 bytes	Michael Forney
	Slots are stored as `int` in Fn, so use the same type in Tmp. Rearrange the fields in Tmp slightly so that sizeof(Tmp) stays the same (at least on 64-bit systems).
2019-04-26	restore some code from b4a98c	Quentin Carbonneaux
	I had forgotten that %rip can only be used as base when there is no index. I also added a test which stresses addressing selection with and without constants.
2019-04-25	cleanup amd64 constant addressing	Quentin Carbonneaux
	We now emit correct code when the user refers to a specific constant address. I also made some comments clearer in the instruction selection pass and got rid of some apparently useless code.
2019-04-24	Fix default config.h for arm64	Michael Forney

2019-04-17	avoid some gcc warnings	Quentin Carbonneaux
	In this case, the potential truncations flagged by gcc are only affecting debug information.
2019-04-16	bump NString and NPred	Quentin Carbonneaux
	Michael Forney needs this to run his compiler on interesting programs.
2019-04-15	handle big constants moves to slots	Quentin Carbonneaux
	There is no flavor of mov which can set 8 bytes of memory to a constant not representable as an int32. The solution is simply to emit two movs of 4 bytes each.
2019-04-11	properly detect ssa form	Quentin Carbonneaux
	Previously, we would skip ssa construction when a temporary has a single definition. This is only part of the ssa invariant: we must also check that all uses are dominated by the single definition. The new code does this. In fact, qbe does not store all the dominators for a block, so instead of walking the idom linked list we use a rough heuristic and declare conservatively that B0 dominates B1 when one of the two conditions is true: a. B0 is the start block b. B0 is B1 Some measurements on a big file from Michael Forney show that the code is still as fast as before this patch.
2019-04-08	make sure a spill slot is initialized	Quentin Carbonneaux
	If an instruction does not have a result, the variable `s` is not set. This could lead to a bogus slot assignment.
2019-03-14	Rearrange the fields in Ins so the bit-fields get packed together	Michael Forney

2019-03-13	simple heuristic to reuse stack slots	Quentin Carbonneaux
	On test/spill1.ssa, the stack frame of the function f() goes from 56 bytes to 40 bytes. That's a reduction of close to 30%. This patch also opens the door to folding operations on spill slots. For example movl $15, %r15d addl -X(%rbp), %r15d movl %r15d, -X(%rbp) should become add $15, -X(%rbp) when %r15d is not used afterwards.
2019-03-12	improve range-checking macros	Quentin Carbonneaux
	They are now linear and can be safely used with arguments that have side-effects. This patch also introduces an iscall() macro and uses it to fix a missing check for Ovacall in liveness analysis.
2019-03-12	emit valid code for mem->mem copies	Quentin Carbonneaux

2019-03-09	add a stress test for phi spilling	Quentin Carbonneaux

2019-03-09	make sure phis are temporaries in rega	Quentin Carbonneaux
	In fact, after spilling, a phi can be a temporary or a slot. I am now pondering whether this is a good idea or not because it causes annoying mem->mem movs after register allocation.
2019-03-08	use a hash table to parse temporaries	Quentin Carbonneaux

2019-03-07	fix in load elimination (vacall is a call)	Michael Forney

2019-03-01	skip expensive ssa-building loop when possible	Quentin Carbonneaux
	If a temporary is assigned exactly once (most are), there is no need to do any work to put it in ssa form. On an input file of ~35k loc, this makes the processing time go from 2.9 secs to 1.2 secs.
2019-02-28	update copyright years	Quentin Carbonneaux

2019-02-27	Let runtime crash on zero div, don't fold it.	Andrew Chambers
	Remarks from Quentin: It is an important decision to use Bot and not Top as the result of 'x / 0'. By using Bot, we refuse to give a warrant to the compiler that would allow meaningless subsequent decisions. An example follows. Clang, on my computer, will build a program which prints "Ho" when fed the following C: int main() { puts(1/0 ? "Hi" : "Ho"); } On the other hand, a C compiler based on QBE will build a program which crashes, as one would expect. See also https://c9x.me/notes/2014-09-10.html
2019-02-26	new copy elimination pass	Quentin Carbonneaux
	The sparse data-flow analysis used for copy elimination before this patch could sometimes diverge. The core reason for this behavior is that the visitphi() function was not monotonic in the following copy-of lattice: top (represented as the temp / \| \ itself) x y z ... \ \| / bot (represented as R) This monotonicity defect could be fixed by reverting 2f41ff03, but then the pass would end up missing some redundant phis. This patch re-implements the pass from scratch using a different approach. The new algorithm should get rid of all redundant copies. On the other hand, it can run slower than the monotonic sparse data-flow analysis because, in the worst case, an instruction in a phi cluster can be visited as many times as there are phis in the input program. Thanks to Michael Forney for reviewing and testing the new pass.
2019-02-25	prefer bigger amd64 addressing	Quentin Carbonneaux
	Before, amatch() would prefer matching "o + b" to "o + si" and "b + si".
2019-02-21	fix amd64 addressing mode matcher	Quentin Carbonneaux
	The numberer made some arranging choices when numbering arguments of an instruction, but these decisions were ignored when matching. The fix is to reconcile numbering and matching.
2019-02-21	doc: Aggregate types can be nested	Michael Forney

2019-02-21	Fix assertion failure if temporary was spilled in all predecessors	Michael Forney
	Since ce0ab53ed7, we skip over predecessors that spilled the temporary. However, if all predecessors spilled, then we might not have an entry in `rl`, triggering an assertion failure in the following loop.