~cnx/roux - Alternative QBE compiler

Age	Commit message	Author
2021-09-09	skip nx stack annotation on osx	Quentin Carbonneaux

2021-09-07	test: use architecture-neutral wrapper for calling vprintf	Michael Forney
	Different architectures use different types for va_list:

	x86_64 uses a 1-length array of struct type [0]:

		typedef struct {
			unsigned int gp_offset;
			unsigned int fp_offset;
			void *overflow_arg_area;
			void *reg_save_area;
		} va_list[1];

	aarch64 uses a struct type [1]:

		typedef struct {
			void *__stack;
			void *__gr_top;
			void *__vr_top;
			int __gr_offs;
			int __vr_offs;
		} va_list;

	Consequently, C functions which take a va_list as an argument, such as vprintf, may pass va_list in different ways depending on the architecture.

	On x86_64, va_list is an array type, so the parameter decays to a pointer and passing the address of the va_list is correct.

	On aarch64, the va_list struct is passed by value, but since it is larger than 16 bytes, the parameter is replaced with a pointer to caller-allocated memory. Thus, passing the address as an l argument happens to work.

	However, this pattern of passing the address of the va_list to vprintf doesn't extend to other architectures. On riscv64, va_list is defined as

		typedef void *va_list;

	which is not passed by reference. This means that tests that call vprintf using the address of a va_list (vararg1 and vararg2) will not work on riscv.

	To fix this while keeping the tests architecture-neutral, add a small wrapper function to the driver which takes a va_list *, and let the C compiler deal with the details of passing va_list by value.

	[0] https://c9x.me/compile/bib/abi-x64.pdf#figure.3.34
	[1] https://c9x.me/compile/bib/abi-arm64.pdf#%5B%7B%22num%22%3A63%2C%22gen%22%3A0%7D%2C%7B%22name%22%3A%22XYZ%22%7D%2C52%2C757%2C0%5D
	[2] https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#va_list-va_start-and-va_arg
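The wrapper idea from this commit can be sketched in C as follows. This is an illustration only: the names `print_va` and `format` are hypothetical, not the names used in the QBE test driver, and `vsnprintf` stands in for `vprintf` so the result can be checked.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: receives a pointer to a va_list and lets the
 * C compiler decide how the va_list itself is passed to vsnprintf. */
int print_va(char *buf, size_t n, const char *fmt, va_list *ap)
{
	return vsnprintf(buf, n, fmt, *ap);
}

/* Hypothetical variadic caller, playing the role of the test code
 * that only ever handles the address of its va_list. */
int format(char *buf, size_t n, const char *fmt, ...)
{
	va_list ap;
	int r;

	va_start(ap, fmt);
	r = print_va(buf, n, fmt, &ap);
	va_end(ap);
	return r;
}
```

Because the caller only passes `&ap`, the same call shape should work whether va_list is an array, a struct, or a plain pointer on the target.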
2021-09-07	test: assign result of print functions to temporary	Michael Forney
	Though I am not aware of any architecture where this matters, it is technically incorrect to call these stdio functions as if they had no result. The QBE documentation says

	> Unless the called function does not return a value, a return
	> temporary must be specified, even if it is never used afterwards.

	so we should follow it in the tests as well.
2021-08-30	skip jump arguments in rega	Quentin Carbonneaux
	On both amd64 & arm64, the jumps making it to rega won't have any argument.
2021-08-29	amd64/isel: fix floating point == and != result with NaN	Michael Forney
	On x86_64, ucomis[sd] sets ZF=1, PF=0, CF=0 for equal arguments. However, if the arguments are unordered it sets ZF=1, PF=1, CF=1, and there is no jump/flag instruction for ZF=1 & PF=0 or ZF=1 & CF=0.

	So, in order to correctly implement ceq[sd] on x86_64, we need to be a bit more creative. There are several options available, depending on whether the result of ceq[sd] is used with jnz, or with other instructions, or both.

	If the result is used for a conditional jump, both gcc and clang use a combination of jp and jnz:

			ucomisd %xmm1, %xmm0
			jp .Lfalse
			jnz .Lfalse
			...
		.Lfalse:

	If the result is used in other instructions or return, gcc does the following for x == y:

			ucomisd %xmm1, %xmm0
			setnp %al
			movzbl %al, %eax
			movl $0, %edx
			cmovne %edx, %eax

	This sets EAX to PF=0, then uses cmovne to clear it if ZF=0. It also takes care to avoid clobbering the flags register in case the result is also used for a conditional jump. Implementing this approach in QBE would require adding an architecture-specific instruction for cmovne.

	In contrast, clang does an additional compare, this time using cmpeqsd instead of ucomisd:

			cmpeqsd %xmm1, %xmm0
			movq %xmm0, %rax
			andl $1, %eax

	The cmpeqsd instruction does a floating point equality test, setting XMM0 to all 1s if they are equal and all 0s if they are not. However, we need the result in a non-XMM register, so it moves the result back then masks off all but the first bit.

	Both of these approaches are a bit awkward to implement in QBE, so instead, this commit does the following:

			ucomisd %xmm1, %xmm0
			setz %al
			movzbl %al, %eax
			setnp %cl
			movzbl %cl, %ecx
			andl %ecx, %eax

	This sets the result by anding the two flags, but has a side effect of clobbering the flags register. This was a problem in one of my earlier patches to fix this issue[0], in addition to being more complex than I'd hoped.

	Instead, this commit always leaves the ceq[sd] instruction in the block, even if the result is only used to control a jump, so that the above instruction sequence is always used. Then, since we now have ZF=!(ZF=1 & PF=0) for x == y, or ZF=!(ZF=0 | PF=1) for x != y, we can use jnz for the jump instruction.

	[0] https://git.sr.ht/~sircmpwn/qbe/commit/64833841b18c074a23b4a1254625315e05b86658
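The flag logic described above can be modelled in plain C. This is a sketch, not QBE code: `struct flags` and `ucomisd()` are hypothetical stand-ins that reproduce the flag values listed in the commit message, and `ceqd()`/`cned()` combine them the way the setz/setnp sequence does.

```c
#include <math.h>

/* Software model of the flags ucomisd sets (no real hardware access). */
struct flags { int zf, pf, cf; };

static struct flags ucomisd(double x, double y)
{
	struct flags f;

	if (isnan(x) || isnan(y))          /* unordered */
		f = (struct flags){1, 1, 1};
	else if (x == y)                   /* equal */
		f = (struct flags){1, 0, 0};
	else if (x < y)                    /* less than */
		f = (struct flags){0, 0, 1};
	else                               /* greater than */
		f = (struct flags){0, 0, 0};
	return f;
}

/* x == y: setz and setnp, anded together (ZF=1 & PF=0). */
int ceqd(double x, double y)
{
	struct flags f = ucomisd(x, y);
	return f.zf & !f.pf;
}

/* x != y: the complement condition (ZF=0 | PF=1). */
int cned(double x, double y)
{
	struct flags f = ucomisd(x, y);
	return !f.zf | f.pf;
}
```

Checking ZF alone would report NaN == NaN as true in this model, which is the bug the commit addresses.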
2021-08-27	amd64/isel: fix floating < and <= result with NaN	Michael Forney
	When the two operands are Unordered (for instance if one of them is NaN), ucomisd sets ZF=1, PF=1, and CF=1. When the result is LessThan, it sets ZF=0, PF=0, and CF=1. However, jb[e]/setb[e] only checks that CF=1 [or ZF=1] which causes the result to be true for unordered operands. To fix this, change the operand swap condition for these two floating point comparison types: always rewrite x < y as y > x, and never rewrite x > y as y < x. Add a test to check the result of cltd, cled, cgtd, cged, ceqd, and cned with arguments that are LessThan, Equal, GreaterThan, and Unordered. Additionally, check three different implementations for equality testing: one that uses the result of ceqd directly, one that uses the result to control a conditional jump, and one that uses the result both as a value and for a conditional jump. For now, unordered equality tests are still broken so they are disabled.
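A small C model can show why swapping the operands helps. It is a sketch under the flag values quoted in the message; `struct flags`, `ucomisd()`, `lt_setb()` and `lt_swapped()` are hypothetical names, not QBE functions.

```c
#include <math.h>

/* Software model of the flags ucomisd sets (no real hardware access). */
struct flags { int zf, pf, cf; };

static struct flags ucomisd(double a, double b)
{
	if (isnan(a) || isnan(b))
		return (struct flags){1, 1, 1};   /* unordered */
	if (a == b)
		return (struct flags){1, 0, 0};   /* equal */
	if (a < b)
		return (struct flags){0, 0, 1};   /* less than */
	return (struct flags){0, 0, 0};       /* greater than */
}

/* x < y tested with setb on (x, y): true whenever CF=1,
 * which wrongly includes the unordered case. */
int lt_setb(double x, double y)
{
	return ucomisd(x, y).cf;
}

/* x < y rewritten as y > x and tested with seta (CF=0 and ZF=0):
 * the unordered case sets CF=1, so the result is false. */
int lt_swapped(double x, double y)
{
	struct flags f = ucomisd(y, x);
	return !f.cf & !f.zf;
}
```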
2021-08-23	amd64/emit.c: fix %x =k sub %x, %x	Eyal Sawady
	The negate trick is unnecessary and broken when the first arg is the result.
2021-08-23	test: include exit status in test failure reason	Michael Forney
	This was intended, but was missing due to a typo in the test status variable.
2021-08-23	parsefields: fix padding calculation	Drew DeVault
	This was causing issues with aggregate types. A simple reproduction is:

		type :type.1 = align 8 { 24 }
		type :type.2 = align 8 { w 1, :type.1 1 }

	The size of type.2 should be 32, adding only 4 bytes of padding between the first and second field. Prior to this patch, 20 bytes of padding were added instead, causing the type to have a size of 48.

	Signed-off-by: Drew DeVault <sir@cmpwn.com>
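The expected size of 32 can be checked with a generic layout calculation. The sketch below is not the parsefields code; `struct field`, `align_up()` and `aggregate_size()` are hypothetical helpers that pad each field to its own alignment and then round the total up to the aggregate's alignment.

```c
#include <stddef.h>

/* One member of an aggregate: its size and alignment in bytes. */
struct field { size_t size, align; };

/* Round off up to the next multiple of align (a power of two). */
static size_t align_up(size_t off, size_t align)
{
	return (off + align - 1) & ~(align - 1);
}

/* Size of an aggregate whose fields are laid out in order. */
size_t aggregate_size(const struct field *f, size_t n, size_t align)
{
	size_t off = 0, i;

	for (i = 0; i < n; i++) {
		off = align_up(off, f[i].align);  /* padding before the field */
		off += f[i].size;
	}
	return align_up(off, align);          /* tail padding */
}
```

For type.2, the fields are a 4-byte word aligned to 4 and a 24-byte member aligned to 8: the word ends at offset 4, 4 bytes of padding bring the offset to 8, and the second member ends at 32.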
2021-08-02	copy: consider identity element for more instructions	Michael Forney
	udiv %x, 1 == %x, and for each of sub, or, xor, sar, shr, and shl, <op> %x, 0 == %x.
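The identities listed in this commit can be written as a small predicate. This is a sketch with a hypothetical `enum op` and function name, covering only the instructions named in the message; it deliberately looks at the right-hand constant only, since for example sub 0, %x is not %x.

```c
#include <stdint.h>

/* Hypothetical opcode names for the instructions named in the commit. */
enum op { UDIV, SUB, OR, XOR, SAR, SHR, SHL };

/* Returns 1 if "op %x, c" always equals %x for the constant c. */
int is_right_identity(enum op o, int64_t c)
{
	switch (o) {
	case UDIV:
		return c == 1;
	case SUB: case OR: case XOR:
	case SAR: case SHR: case SHL:
		return c == 0;
	}
	return 0;
}
```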
2021-08-02	gas: always emit GNU-stack note	Érico Nogueira
	In cases where stash was 0, gasemitfin exits immediately and the GNU-stack note isn't added to the asm output. This would result in an executable where GNU_STACK uses flags RWE instead of the desired RW.
2021-07-30	err when an address contains a sum $a+$b (afl)	Quentin Carbonneaux
	Reported by Alessandro Mantovani. These addresses are likely bogus, but they triggered an unwarranted assertion failure. We now raise a civilized error.
2021-07-29	load: handle all cases in cast()	Michael Forney
	Previously, all casts but d->w, d->s, l->s, s->d, w->d were supported. At least the first three can occur by storing to then loading from a slot, currently triggering an assertion failure. Though the other two might not be possible, they are easy enough to support as well. Fixes hare#360.
2021-07-28	handle fast locals in amd64 shifts (afl)	Quentin Carbonneaux
	Reported by Alessandro Mantovani. Although unlikely in real programs it was found that using the address of a fast local in amd64 shifts triggers assertion failures. We now err when the shift count is given by an address; but we allow shifting an address.
2021-07-28	fix buffer overflow in parser (afl)	Quentin Carbonneaux
	Reported by Alessandro Mantovani. Overly long function names would trigger out-of-bounds accesses.
2021-07-28	fix amd64 addressing selection bug (afl)	Quentin Carbonneaux
	Reported by Alessandro Mantovani. Unlikely to be hit in practice because we don't add addresses to addresses.

		type :biggie = { l, l, l }
		function $repro(:biggie %p) {
		@start
			%x =l add %p, $a
			storew 42, %x
			ret
		}
2021-06-17	amd64: fix conditional jump when compare is swapped and used elsewhere	Michael Forney
	selcmp may potentially swap the arguments and return 1 indicating that the opposite operation should be used. However, if the compare result is used for a conditional jump as well as elsewhere, the original compare op is used instead of the opposite. To fix this, add a check to see whether the opposite compare should be used, regardless of whether selcmp() is done now, or later on during sel(). Bug report and test case from Charlie Stanton.
2021-03-18	spill: use stronger assertion for registers in use at start of function	Michael Forney

2021-03-18	use toolchain to determine aarch64 sysroot path	Michael Forney

2021-03-18	Revert "arm64: try qemu-system-aarch64"	Michael Forney
	This reverts commit be3a67a7f5079f30b0ccc696d549fd03a2dbbad1. qemu-system-aarch64 is a full system emulator and is not suitable for running the qbe test suite (at least without a kernel and root filesystem).
2021-03-12	arm64: fix selcall call data for return of aggregate in memory	Michael Forney
	The no-op `copy R0` is necessary in order to trigger dopm in spill.c and rega.c, which assume that a call is always followed by one or more copies from registers. However, the arm64 ABI does not actually return the caller-passed pointer as in x86_64. This causes an assertion failure

		qbe: aarch64: Assertion failed: r == T.rglob || b == fn->start (spill.c: spill: 470)

	for the following test program

		type :t = { l 3 }
		function $f() {
		@start.1
		@start.2
			%ret =:t call $g()
			ret
		}

	The assertion failure only triggers when the block containing the call is not the first block, because the check is skipped for the first block (since some registers may have been used for arguments).

	To fix this, set R0 in the call data so that spill/rega can see that this dummy "return" register was generated by the call. This matches qbe's existing behavior when the function returns void, another case where no register is used for the function result.
2021-03-12	Arrange debug flag table to match pass order	Michael Forney
	This makes it easier to determine which flag to pass to show the desired debug info.
2021-03-02	disable pie for arm64 tests	Quentin Carbonneaux

2021-03-02	arm64: try qemu-system-aarch64	Reini Urban

2021-03-02	fix a couple asan complaints	Quentin Carbonneaux

2021-03-02	renaming in gas.c	Quentin Carbonneaux

2021-03-02	add data $name = section "section" ...	Drew DeVault
	This allows you to explicitly specify the section to emit the data directive for, allowing for sections other than .data: for example, .bss or .init_array.
2021-03-02	silence a gcc10 warning	Quentin Carbonneaux

2021-03-02	gas: emit GNU-stack note so that stack is not executable	Michael Forney
	GNU ld uses the presence of these notes to determine the flags of the final GNU_STACK program header. If they are present in every object, then the resulting executable's GNU_STACK uses flags RW instead of RWE. Reported by Érico Nogueira Rolim.
2021-03-02	arm64: handle stack offsets >=4096 in Oaddr	Michael Forney
	The immediate in the add instruction is only 12 bits. If the offset does not fit, we must move it into a register first.
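The choice between the two forms can be sketched as below. The function name, the use of x29 as the base register, and the exact assembly text are assumptions made for illustration; the sketch also assumes the offset fits a single mov (below 65536), which larger offsets would not.

```c
#include <stdio.h>

/* Hypothetical emitter: writes the instruction(s) computing x29 + off
 * into dst. Offsets below 4096 fit the 12-bit add immediate; larger
 * ones are first moved into the destination register. */
int emit_slot_addr(char *buf, size_t n, const char *dst, unsigned off)
{
	if (off < 4096)
		return snprintf(buf, n, "add %s, x29, #%u", dst, off);
	return snprintf(buf, n, "mov %s, #%u\nadd %s, x29, %s",
	                dst, off, dst, dst);
}
```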
2021-02-16	docs/llvm: Fix typo jeoparadized -> jeopardized	Thomas Bracht Laumann Jespersen

2020-10-05	fold: zero-initialize padding bits of constants	Michael Forney
	Otherwise, if a constant is stored as a float and retrieved as an int, the padding bits are uninitialized. This can result in the generation of invalid assembly:

		Error: suffix or operands invalid for `cvtsi2ss'

	Reported by Hiltjo Posthuma.
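The general idea of clearing the whole slot before writing a narrower value can be illustrated in C. This is not the code from fold.c; `pack_single()` is a hypothetical helper, and the sketch assumes a 4-byte float stored in an 8-byte slot.

```c
#include <string.h>

/* Store a single-precision constant into an 8-byte slot.
 * Assumes sizeof(float) == 4. */
void pack_single(float f, unsigned char slot[8])
{
	memset(slot, 0, 8);           /* zero all 8 bytes, padding included */
	memcpy(slot, &f, sizeof f);   /* then write the 4-byte payload */
}
```

Without the memset, the four bytes past the float would keep whatever was in the slot before, and reading the slot back as a 64-bit integer would yield an unpredictable value.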
2020-08-06	fix a typo in call's BNF	Quentin Carbonneaux
	Thanks to Jakob for pointing this out.
2020-08-06	amd64: Use member class for aggregate parameter temporary	Michael Forney
	Otherwise, we may end up using an integer and floating class for the same register, triggering an assertion failure:

		qbe: rega.c:215: pmrec: Assertion `KBASE(pm[i].cls) == KBASE(*k)' failed.

	Test case:

		type :T = { s }
		export function $d(:T %.1, s %.2) {
		@start
			call $c(s %.2)
			ret
		}
2020-08-06	rega: Fix allocation of multiple temporaries to the same register	Michael Forney

2020-08-06	arm64: Make sure SP stays aligned by 16	Michael Forney
	According to the ARMv8 overview document:

	> However if SP is used as the base register then the value of the
	> stack pointer prior to adding any offset must be quadword (16 byte)
	> aligned, or else a stack alignment exception will be generated.

	This manifests as a bus error on my system. To resolve this, just save registers two at a time with stp.
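The effect of saving registers in pairs can be sketched with a little arithmetic. The helper below is hypothetical and assumes 8-byte registers, with an odd register count leaving half of the last 16-byte pair unused.

```c
/* Bytes of stack used when nregs 8-byte registers are saved two at a
 * time, so that every SP adjustment is a multiple of 16. */
unsigned save_area_bytes(unsigned nregs)
{
	unsigned pairs = (nregs + 1) / 2;   /* round up to whole pairs */

	return 16 * pairs;
}
```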
2020-08-06	Move NPred in parse.c and decrease it	Michael Forney
	This now only limits the number of arguments when parsing the input SSA, which is usually a small fixed size (depending on the frontend).
2020-08-06	Use a dynamic array for phi arguments	Michael Forney

2019-11-25	copy: Fix use of compound literal outside its scope	Michael Forney
	C99 6.5.2.5p6:

	> If the compound literal occurs outside the body of a function,
	> the object has static storage duration; otherwise, it has automatic
	> storage duration associated with the enclosing block.

	So, we can't use the address of a compound literal here. Instead, just set p to NULL, and make the loop conditional on p being non-NULL.

	Remarks from Quentin: I made a cosmetic change to Michael's original patch and merely pushed the literal at toplevel.
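The two storage durations quoted above can be contrasted in a short example. It is a generic C99 illustration with hypothetical names, not the code from copy.c.

```c
#include <stddef.h>

/* File scope: this compound literal has static storage duration,
 * so the pointer stays valid for the whole program. */
static int *table = (int[]){1, 2, 3};

int sum_static(void)
{
	return table[0] + table[1] + table[2];
}

int sum_block(void)
{
	int *p = NULL;
	int s = 0;

	{
		/* Block scope: the literal lives only until the end of this
		 * block, so it is read inside the block and nowhere else. */
		p = (int[]){4, 5, 6};
		s = p[0] + p[1] + p[2];
	}
	p = NULL;   /* do not keep a pointer past the literal's lifetime */
	return s;
}
```

Moving the literal to file scope, as in the remark from Quentin, corresponds to the `table` case here.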
2019-07-11	minic: fix undefined symbol linkage issue	Sergei V. Rogachev
	The mandel example uses SDL2 for graphics output. When GCC is used to assemble the resulting *.s file, the linker reports undefined symbols from the library. This behavior can be fixed by moving the flags passed to the compiler after the source file name.
2019-05-16	Fix a few uses of gassym missed in 9e7e5bff	Michael Forney

2019-05-15	arm64: Handle stack allocations larger than 4095 bytes	Michael Forney
	In this case, the immediate is too large to use directly in the add/sub instructions, so move it into a temporary register first. Also, for clarity, rearrange the if-conditions so that they match the constraints of the instructions that immediately follow.
2019-05-15	arm64: Handle truncd instruction	Michael Forney

2019-05-15	arm64: Use 32-bit register name when loading 'b' or 'h' into 'l'	Michael Forney
	The ldrb and ldrh instructions require a 32-bit register name for the destination and will clear the upper 32-bits of that register.
2019-05-15	Allow specifying literal global names	Michael Forney

2019-05-14	drop dead declaration	Quentin Carbonneaux

2019-05-14	fix a bad bug in copy detection	Quentin Carbonneaux
	The code used to see add 0, 10 as a copy of 0.
2019-05-05	add asm diffing in test script	Quentin Carbonneaux

2019-05-05	fuse epilog deduplication with jump threading	Quentin Carbonneaux

2019-05-05	revert last commit	Quentin Carbonneaux
	The same functionality can be implemented naturally in the cfg simplification pass.