~cnx/roux - Alternative QBE compiler

Age	Commit message	Author
2023-06-06	implement line number info tracking	Thomas Bracht Laumann Jespersen
	Support "file" and "loc" directives. "file" takes a string (a file name) assigns it a number, sets the current file to that number and records the string for later. "loc" takes a single number and outputs location information with a reference to the current file.
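The bookkeeping described above can be sketched in C as follows. This is a minimal illustration, not the code from the commit: the names `dbgfile` and `dbgloc`, the fixed-size table, the reuse of a number for a repeated file name, and the choice of the GNU assembler `.loc` directive as output are all assumptions.

```c
#include <stdio.h>
#include <string.h>

enum { MAXFILES = 64 };             /* hypothetical table size */

static const char *files[MAXFILES]; /* recorded names; number = index + 1 */
static int nfiles;
static int curfile;                 /* number of the current file, 0 if none */

/* Handle a "file" directive: assign a number to name (reusing the
 * number if the name was seen before, which is an assumption) and
 * make it the current file. The caller keeps name alive.
 * Returns the file number, or 0 if the table is full. */
int
dbgfile(const char *name)
{
	int n;

	for (n = 0; n < nfiles; n++)
		if (strcmp(files[n], name) == 0)
			return curfile = n + 1;
	if (nfiles == MAXFILES)
		return 0;
	files[nfiles++] = name;
	return curfile = nfiles;
}

/* Handle a "loc" directive: format location information that
 * refers to the current file. Returns what snprintf returns. */
int
dbgloc(char *buf, size_t len, int line)
{
	return snprintf(buf, len, "\t.loc %d %d", curfile, line);
}
```

A caller would invoke `dbgfile` once per "file" directive and `dbgloc` for every "loc" that follows it.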
2023-04-02	amd64_apple: one more thread-local symbols fix	Quentin Carbonneaux
	We now treat thread-local symbols in Mems properly.
2023-04-02	amd64_apple: support thread-local addresses	Quentin Carbonneaux
	Non-store/load instructions were not lowered correctly for thread-local symbols. This is an attempt at a fix (cannot test for now).
2023-04-02	amd64_sysv: fix offsets in thread-local Oaddr	Quentin Carbonneaux

2023-04-02	amd64_sysv: thread-local support in Oaddr	Quentin Carbonneaux
	Thanks to Lassi Pulkkinen for flagging the issue and pointing me to Ulrich Drepper's extensive doc [1]. [1] https://people.redhat.com/drepper/tls.pdf
2023-03-22	rename blknew() to newblk()	Quentin Carbonneaux
	This is consistent with newtmp() and newcon().
2023-03-19	naming nit	Quentin Carbonneaux

2023-03-16	silence format warning more reliably	Quentin Carbonneaux

2023-03-15	silence some warnings	Quentin Carbonneaux

2022-12-25	new UNDEF Ref	Quentin Carbonneaux
	Crashing loads of uninitialized memory proved to be a problem when implementing unions using qbe. This patch introduces a new UNDEF Ref to represent data that is known to be uninitialized. Optimization passes can make use of it to eliminate some code. In the last compilation stages, UNDEF is treated as the constant 0xdeaddead.
2022-12-14	new blit instruction	Quentin Carbonneaux

2022-12-12	new rsval() helper for signed Refs	Quentin Carbonneaux
	The .val field is signed in RSlot. Add a new dedicated function to fetch it as a signed int.
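Fetching a signed value out of an unsigned bit-field amounts to a sign extension. The sketch below shows one way to do it; the type `PackedRef`, the 3/29 bit split, and the helper names are assumptions made for illustration and are not taken from the commit.

```c
#include <stdint.h>

enum { VALBITS = 29 };              /* assumed width of the value field */

/* Hypothetical packed reference: a small tag and an unsigned payload. */
typedef struct {
	unsigned int type : 3;
	unsigned int val : VALBITS;
} PackedRef;

/* Store a signed offset in the unsigned field by truncating its
 * two's complement representation to VALBITS bits. */
PackedRef
mkslot(int32_t off)
{
	PackedRef r;

	r.type = 0;
	r.val = (uint32_t)off & ((1u << VALBITS) - 1);
	return r;
}

/* Read the field back as a signed int: flipping the top bit of the
 * field and subtracting its weight sign-extends the value. */
int32_t
signedval(PackedRef r)
{
	int32_t sign = 1 << (VALBITS - 1);

	return ((int32_t)r.val ^ sign) - sign;
}
```

Reading `r.val` directly would yield a large positive number for any negative offset, which is the mistake a dedicated accessor guards against.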
2022-11-27	new hlt block terminator	Quentin Carbonneaux
	It is handy to express when the end of a block cannot be reached. If a hlt terminator is executed, it traps the program. We don't go the llvm way and specify execution semantics as undefined behavior.
2022-11-22	use a new struct for symbols	Quentin Carbonneaux
	Symbols are a useful abstraction that occurs in both Con and Alias. In this patch they get their own struct. This new struct packages a symbol name and a type; the type tells us where the symbol name must be interpreted (currently, in global memory or in thread-local storage). The refactor fixed a bug in addcon(), proving the value of packaging symbol names with their type.
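A minimal sketch of such a packaged symbol, with hypothetical names (`Symbol`, `SymKind`, `symeq`) that do not come from the commit. It shows why the pairing helps: two symbols with the same name are only the same symbol if they also live in the same kind of storage.

```c
#include <string.h>

/* Where a symbol name is to be interpreted (hypothetical names). */
enum SymKind { SYM_GLOBAL, SYM_THREAD };

typedef struct {
	enum SymKind kind;
	const char *name;
} Symbol;

/* Two symbols are equal only if both the kind and the name match. */
int
symeq(Symbol a, Symbol b)
{
	return a.kind == b.kind && strcmp(a.name, b.name) == 0;
}
```

Code that compares bare names would treat a global `x` and a thread-local `x` as identical; comparing whole `Symbol` values rules that out.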
2022-10-12	thread-local storage for amd64_apple	Quentin Carbonneaux
	It is quite similar to arm64_apple. Probably, the call that needs to be generated also provides extra invariants on top of the regular abi, but I have not checked that. Clang generates code that is a bit neater than qbe's because, on x86, a load can be fused in a call instruction! We do not bother with supporting these since we expect only sporadic use of the feature. For reference, here is what clang might output for a store to the second entry of a thread-local array of ints:
		movq _x@TLVP(%rip), %rdi
		callq *(%rdi)
		movl %ecx, 4(%rax)
2022-10-08	mark apple targets with a boolean	Quentin Carbonneaux
	It is more natural to branch on a flag than have different function pointers for high-level passes.
2022-10-08	"rel" fields become "reloc"	Quentin Carbonneaux

2022-10-08	add support for thread-local storage	Quentin Carbonneaux
	The apple targets are not done yet.
2022-10-03	fix case of Pool constants	Quentin Carbonneaux

2022-10-03	add new target-specific abi0 pass	Quentin Carbonneaux
	The general idea is to give abis a chance to talk before we've done all the optimizations. Currently, all targets eliminate {par,arg,ret}{sb,ub,...} during this pass. The forthcoming arm64_apple will, however, insert proper extensions during abi0. Moving forward abis can, for example, lower small-aggregates passing there so that memory optimizations can interact better with function calls.
2022-08-31	drop -G flag and add target amd64_apple	Quentin Carbonneaux
	apple support is more than assembly syntax in case of arm64 machines, and apple syntax is currently useless in all cases but amd64; rather than having a -G option that only makes sense with amd64, we add a new target amd64_apple
2022-06-14	do not fold cnst+cnst in amd64's isel	Quentin Carbonneaux
	This may cause invalid assembly to be generated and is not all that useful anyway after constant folding has run.
2022-03-17	amd64: restore previous name of amd64_sysv target	Michael Forney

2022-03-15	new -t? flag to print default target	Quentin Carbonneaux

2022-03-14	dynamic stack allocs for arm64	Quentin Carbonneaux
	I also moved some isel logic that would have been repeated a third time in util.c.
2022-03-08	flag types defined as unions	Quentin Carbonneaux
	The risc-v abi needs to know if a type is defined as a union or not. We cannot use nunion to obtain this information because the risc-v abi made the unfortunate decision of treating union { int i; } differently from int i; So, instead, I introduce a single bit flag 'isunion'.
2022-03-08	cosmetics	Quentin Carbonneaux

2022-02-02	shared linkage logic for func/data	Quentin Carbonneaux

2022-01-28	amd64/isel: nits	Quentin Carbonneaux

2022-01-28	implement float -> unsigned casts	Bor Grošelj Simić
	amd64 lacks an instruction for this so it has to be implemented with float -> signed casts. The approach is borrowed from llvm.
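The general idea of building an unsigned conversion out of signed ones can be sketched in C. This is a branching illustration under stated assumptions, not the instruction sequence used by the commit or by llvm: the function name `dtoul` is made up, and the input is assumed to satisfy 0 <= x < 2^64 (negative values and NaN are out of scope).

```c
#include <stdint.h>

/* Convert a double in [0, 2^64) to an unsigned 64-bit integer using
 * only double -> signed conversions. */
uint64_t
dtoul(double x)
{
	const double two63 = 9223372036854775808.0; /* 2^63 */

	if (x < two63)
		/* The value fits in a signed 64-bit integer. */
		return (uint64_t)(int64_t)x;
	/* Otherwise shift the value down by 2^63 (exact for doubles in
	 * this range), convert, and put the top bit back. */
	return (uint64_t)(int64_t)(x - two63) | ((uint64_t)1 << 63);
}
```

The subtraction is exact because every double at or above 2^63 is a multiple of 2048, so no rounding is introduced before the signed conversion.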
2022-01-28	implement unsigned -> float casts	Bor Grošelj Simić
	amd64 lacks an instruction for this so it has to be implemented with signed -> float casts:
	- Word casting is done by zero-extending the word to a long and then doing a regular signed cast.
	- Long casting is done by dividing by two with correct rounding if the highest bit is set and casting that to float, then adding 1 to mantissa with integer addition.
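The two cases above can be sketched in C for single-precision results. The function names are hypothetical. The final integer addition is modelled here as adding `1 << 23` to the bit pattern of the float, which increments the exponent field and so doubles the value; reading the last step of the commit message this way is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Word case: zero-extend to 64 bits, then do a signed conversion. */
float
uwtof(uint32_t w)
{
	return (float)(int64_t)(uint64_t)w;
}

/* Long case: when the top bit is set, halve the value first. */
float
ultof(uint64_t l)
{
	int64_t half;
	uint32_t bits;
	float f;

	if (l >> 63 == 0)
		return (float)(int64_t)l;   /* fits a signed conversion */
	/* Halve, folding the dropped bit back in as a sticky bit so
	 * that the rounding of the conversion stays correct. */
	half = (int64_t)((l >> 1) | (l & 1));
	f = (float)half;
	/* Double the result by bumping the exponent field with an
	 * integer addition on the raw bits. */
	memcpy(&bits, &f, sizeof bits);
	bits += (uint32_t)1 << 23;
	memcpy(&f, &bits, sizeof f);
	return f;
}
```

Bumping the exponent is safe here because the halved value is at least 2^62, so the float is normal and far from overflow.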
2022-01-23	Add a negation instruction	Eyal Sawady
	Necessary for floating-point negation, because `%result = sub 0, %operand` doesn't give the correct sign for 0/-0.
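The sign problem is easy to reproduce in C, assuming IEEE 754 arithmetic in the default rounding mode and no fast-math style compiler options. The helper names are made up for the example.

```c
#include <math.h>

/* Negation spelled as a subtraction from zero, like `sub 0, %operand`. */
double
negbysub(double x)
{
	return 0.0 - x;
}

/* True negation: only the sign bit changes. */
double
negbyflip(double x)
{
	return -x;
}
```

For an input of `0.0`, `negbysub` yields positive zero (0 - 0 is +0 under round-to-nearest) while `negbyflip` yields negative zero; `signbit` from `<math.h>` tells the two apart even though they compare equal.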
2021-11-22	reuse previous address constants in fold()	Michael Forney
	parseref() has code to reuse address constants, but this is not done in other passes such as fold or isel. Introduce a new function newcon() which takes a Con and returns a Ref for that constant, and use this whenever creating address constants. This is necessary to fix folding of address constants when one operand is already folded. For example, in
		%a =l add $x, 1
		%b =l add %a, 2
		%c =w loadw %b
	%a and %b were folded to $x+1 and $x+3 respectively, but then the second add is visited again since it uses %a. This gets folded to $x+3 as well, but as a new distinct constant. This results in %b getting labeled as bottom instead of either constant, disabling the replacement of %b by a constant in subsequent instructions (such as the loadw).
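The reuse of constants described above is a form of interning: a lookup that returns the existing entry when an equal constant is already known. A minimal sketch, with a hypothetical `AddrCon` type, table size, and `interncon` function that are not the ones from the commit:

```c
#include <string.h>

enum { MAXCON = 128 };              /* hypothetical table size */

/* Hypothetical address constant: a symbol plus an offset. */
typedef struct {
	const char *sym;
	long off;
} AddrCon;

static AddrCon cons[MAXCON];
static int ncons;

/* Return the index of the constant sym+off, adding it to the table
 * only if no equal constant exists. Returns -1 if the table is full. */
int
interncon(const char *sym, long off)
{
	int i;

	for (i = 0; i < ncons; i++)
		if (cons[i].off == off && strcmp(cons[i].sym, sym) == 0)
			return i;
	if (ncons == MAXCON)
		return -1;
	cons[ncons].sym = sym;
	cons[ncons].off = off;
	return ncons++;
}
```

With interning, folding the second add twice yields the same index for $x+3 both times, so a pass that compares constants by identity sees one value rather than two conflicting ones.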
2021-11-08	amd64: avoid reading past end of passed struct	Michael Forney
	If the size of the struct is not a multiple of 8, the actual struct size may be different from the size reserved on the stack. This fixes the case where the struct is passed in memory, but we still may over-read a struct passed in registers. A TODO is added for now.
2021-10-22	make variadic args explicit	Quentin Carbonneaux
	Some abis, like the riscv one, treat arguments differently depending on whether they are variadic or not. To prepare for the upcoming riscv target, we change the variadic call syntax and give meaning to the location of the '...' marker.
		# new syntax
		%ret =w call $f(w %regular, ..., w %variadic)
	By nature of their abis, the change is backwards compatible for existing targets.
2021-10-17	amd64/sysv: unbreak env calls	Quentin Carbonneaux
	Env calls were dysfunctional from the start. This fixes them on amd64, but they remain to be done on arm64. A new test shows how to use them.
2021-10-13	add size suffix to frame setup.	Andrew Chambers

2021-08-29	amd64/isel: fix floating point == and != result with NaN	Michael Forney
	On x86_64, ucomis[sd] sets ZF=1, PF=0, CF=0 for equal arguments. However, if the arguments are unordered it sets ZF=1, PF=1, CF=1, and there is no jump/flag instruction for ZF=1 & PF=0 or ZF=1 & CF=0. So, in order to correctly implement ceq[sd] on x86_64, we need to be a bit more creative. There are several options available, depending on whether the result of ceq[sd] is used with jnz, or with other instructions, or both.
	If the result is used for a conditional jump, both gcc and clang use a combination of jp and jnz:
		ucomisd %xmm1, %xmm0
		jp .Lfalse
		jnz .Lfalse
		...
		.Lfalse:
	If the result is used in other instructions or return, gcc does the following for x == y:
		ucomisd %xmm1, %xmm0
		setnp %al
		movzbl %al, %eax
		movl $0, %edx
		cmovne %edx, %eax
	This sets EAX to PF=0, then uses cmovne to clear it if ZF=0. It also takes care to avoid clobbering the flags register in case the result is also used for a conditional jump. Implementing this approach in QBE would require adding an architecture-specific instruction for cmovne.
	In contrast, clang does an additional compare, this time using cmpeqsd instead of ucomisd:
		cmpeqsd %xmm1, %xmm0
		movq %xmm0, %rax
		andl $1, %rax
	The cmpeqsd instruction does a floating point equality test, setting XMM0 to all 1s if they are equal and all 0s if they are not. However, we need the result in a non-XMM register, so it moves the result back then masks off all but the first bit.
	Both of these approaches are a bit awkward to implement in QBE, so instead, this commit does the following:
		ucomisd %xmm1, %xmm0
		setz %al
		movzbl %al, %eax
		setnp %cl
		movzbl %cl, %ecx
		andl %ecx, %eax
	This sets the result by anding the two flags, but has a side effect of clobbering the flags register. This was a problem in one of my earlier patches to fix this issue[0], in addition to being more complex than I'd hoped.
	Instead, this commit always leaves the ceq[sd] instruction in the block, even if the result is only used to control a jump, so that the above instruction sequence is always used. Then, since we now have ZF=!(ZF=1 & PF=0) for x == y, or ZF=!(ZF=0 | PF=1) for x != y, we can use jnz for the jump instruction.
	[0] https://git.sr.ht/~sircmpwn/qbe/commit/64833841b18c074a23b4a1254625315e05b86658
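The flag logic behind the chosen sequence can be modelled in C. This is an illustrative model of the flags only, not compiler code; the `EqFlags` type and the function names are made up for the example.

```c
#include <math.h>

/* The three flags that ucomisd sets. */
typedef struct {
	int zf, pf, cf;
} EqFlags;

/* Model of the flags after comparing a with b using ucomisd. */
EqFlags
ucomiflags(double a, double b)
{
	EqFlags f = { 0, 0, 0 };

	if (isnan(a) || isnan(b))
		f.zf = f.pf = f.cf = 1;     /* unordered */
	else if (a == b)
		f.zf = 1;                   /* equal */
	else if (a < b)
		f.cf = 1;                   /* less than */
	return f;
}

/* Equality as computed by the setz/setnp/and sequence: the result
 * is 1 only when ZF=1 and PF=0. */
int
ceqmodel(double a, double b)
{
	EqFlags f = ucomiflags(a, b);
	int setz = f.zf;
	int setnp = !f.pf;

	return setz & setnp;
}
```

Testing ZF alone would report NaN == NaN as true, since unordered operands also set ZF; anding with the negated parity flag removes exactly that case.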
2021-08-27	amd64/isel: fix floating < and <= result with NaN	Michael Forney
	When the two operands are Unordered (for instance if one of them is NaN), ucomisd sets ZF=1, PF=1, and CF=1. When the result is LessThan, it sets ZF=0, PF=0, and CF=1. However, jb[e]/setb[e] only checks that CF=1 [or ZF=1] which causes the result to be true for unordered operands. To fix this, change the operand swap condition for these two floating point comparison types: always rewrite x < y as y > x, and never rewrite x > y as y < x. Add a test to check the result of cltd, cled, cgtd, cged, ceqd, and cned with arguments that are LessThan, Equal, GreaterThan, and Unordered. Additionally, check three different implementations for equality testing: one that uses the result of ceqd directly, one that uses the result to control a conditional jump, and one that uses the result both as a value and for a conditional jump. For now, unordered equality tests are still broken so they are disabled.
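The effect of the swap can be modelled in C by comparing the "below" conditions (CF=1, or CF=1 or ZF=1) with the "above" conditions (CF=0 and ZF=0, or CF=0) applied to swapped operands. This is a model of the flags under the rules stated above, with made-up function names, not code from the commit.

```c
#include <math.h>

/* CF and ZF after ucomisd compares a with b; unordered sets both. */
static int
cfof(double a, double b)
{
	return isnan(a) || isnan(b) || a < b;
}

static int
zfof(double a, double b)
{
	return isnan(a) || isnan(b) || a == b;
}

/* x < y tested with "below" (CF=1): wrongly true when unordered. */
int
ltbelow(double x, double y)
{
	return cfof(x, y);
}

/* x < y rewritten as y > x and tested with "above" (CF=0 and ZF=0). */
int
ltabove(double x, double y)
{
	return !cfof(y, x) && !zfof(y, x);
}

/* x <= y tested with "below or equal" (CF=1 or ZF=1): same problem. */
int
lebelow(double x, double y)
{
	return cfof(x, y) || zfof(x, y);
}

/* x <= y rewritten as y >= x and tested with "above or equal" (CF=0). */
int
leabove(double x, double y)
{
	return !cfof(y, x);
}
```

Both pairs agree on ordered operands; they differ only when one operand is NaN, where the "above" forms correctly return 0.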
2021-08-23	amd64/emit.c: fix %x =k sub %x, %x	Eyal Sawady
	The negate trick is unnecessary and broken when the first arg is the result.
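The commit does not spell the trick out. Assuming it means computing `dst = src - dst` on a two-address machine as a negation of `dst` followed by an addition of `src`, the failure can be shown with a small register-file model in C. The function names and the register array are hypothetical.

```c
/* Assumed negate trick for dst = src - dst: negate dst in place,
 * then add src to it. r is the register file. */
void
subneg(long *r, int src, int dst)
{
	r[dst] = -r[dst];
	r[dst] += r[src];
}

/* Plain two-address subtraction dst = dst - src, usable when the
 * result register is the first operand. */
void
subplain(long *r, int src, int dst)
{
	r[dst] -= r[src];
}
```

With distinct registers `subneg` gives the right answer, but when `src` and `dst` are the same register the negation has already overwritten the operand, so x - x comes out as -2x instead of 0; `subplain` handles that case directly.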
2021-07-30	err when an address contains a sum $a+$b (afl)	Quentin Carbonneaux
	Reported by Alessandro Mantovani. These addresses are likely bogus, but they triggered an unwarranted assertion failure. We now raise a civilized error.
2021-07-28	handle fast locals in amd64 shifts (afl)	Quentin Carbonneaux
	Reported by Alessandro Mantovani. Although unlikely in real programs it was found that using the address of a fast local in amd64 shifts triggers assertion failures. We now err when the shift count is given by an address; but we allow shifting an address.
2021-07-28	fix amd64 addressing selection bug (afl)	Quentin Carbonneaux
	Reported by Alessandro Mantovani. Unlikely to be hit in practice because we don't add addresses to addresses.
		type :biggie = { l, l, l }
		function $repro(:biggie %p) {
		@start
			%x =l add %p, $a
			storew 42, %x
			ret
		}
2021-06-17	amd64: fix conditional jump when compare is swapped and used elsewhere	Michael Forney
	selcmp may potentially swap the arguments and return 1 indicating that the opposite operation should be used. However, if the compare result is used for a conditional jump as well as elsewhere, the original compare op is used instead of the opposite. To fix this, add a check to see whether the opposite compare should be used, regardless of whether selcmp() is done now, or later on during sel(). Bug report and test case from Charlie Stanton.
2021-03-18	spill: use stronger assertion for registers in use at start of function	Michael Forney

2020-08-06	amd64: Use member class for aggregate parameter temporary	Michael Forney
	Otherwise, we may end up using an integer and floating class for the same register, triggering an assertion failure:
		qbe: rega.c:215: pmrec: Assertion `KBASE(pm[i].cls) == KBASE(*k)' failed.
	Test case:
		type :T = { s }
		export function $d(:T %.1, s %.2) {
		@start
			call $c(s %.2)
			ret
		}
2020-08-06	Use a dynamic array for phi arguments	Michael Forney

2019-05-15	Allow specifying literal global names	Michael Forney

2019-05-05	revert last commit	Quentin Carbonneaux
	The same functionality can be implemented naturally in the cfg simplification pass.
2019-05-04	emit only one epilog per function	Quentin Carbonneaux
	Previously, each ret would lead to an epilog. This caused bloat for large functions with multiple return points.