~cnx/roux - Alternative QBE compiler

Age	Commit message (Collapse)	Author
2023-03-13	refresh stale Tmp.link before use	Quentin Carbonneaux
	During coalescing, the resizing/ reordering of the sl[] array invalidates the indices stored in the 'visit' field of temps; we need to reset it before we can use it again.
2023-03-11	kill dead stores when coalescing slots	Quentin Carbonneaux
	This is necessary because, post fusion, dead stores may clobber data. A new test case exposes one such situation.
2022-12-15	bugfix in load elimination	Quentin Carbonneaux
	When checking if two slices represent the same range of memory we must check that offsets match. The bug was revealed by a harec test.
2022-12-14	new blit instruction	Quentin Carbonneaux

2022-10-12	thread-local storage for amd64_apple	Quentin Carbonneaux
	It is quite similar to arm64_apple. Probably, the call that needs to be generated also provides extra invariants on top of the regular abi, but I have not checked that. Clang generates code that is a bit neater than qbe's because, on x86, a load can be fused in a call instruction! We do not bother with supporting these since we expect only sporadic use of the feature. For reference, here is what clang might output for a store to the second entry of a thread-local array of ints: movq _x@TLVP(%rip), %rdi callq *(%rdi) movl %ecx, 4(%rax)
2022-10-03	new arm64_apple target	Quentin Carbonneaux
	Should make qbe work on apple arm-based hardware.
2022-08-31	fix some variadic calls in test/abi8.ssa	Quentin Carbonneaux

2022-08-31	regenerate test/vararg2.ssa	Quentin Carbonneaux
	- update the test generation script to match some manual changes - fix some variadic calls to printf - add a test case where an odd number of slots is used on the stack before varargs
2022-03-10	two new tests in abi5.ssa	Quentin Carbonneaux
	They are meant to exercise the hardware floating-point calling convention of the risc-v target.
2022-03-10	new abi stress test	Quentin Carbonneaux

2022-02-17	add rv64 backend	Michael Forney
	It is mostly complete, but still has a few ABI bugs when passing floats in structs, or when structs are passed partly in register, and partly on stack.
2022-02-17	test: add c[u]od checks to isel2 and add new integer compare test isel3	Michael Forney

2022-01-28	fix test/fpcnv (wrong spacing)	Quentin Carbonneaux

2022-01-28	implement float -> unsigned casts	Bor Grošelj Simić
	amd64 lacks instruction for this so it has to be implemented with float -> signed casts. The approach is borrowed from llvm.
2022-01-28	implement unsigned -> float casts	Bor Grošelj Simić
	amd64 lacks an instruction for this so it has to be implemented with signed -> float casts: - Word casting is done by zero-extending the word to a long and then doing a regular signed cast. - Long casting is done by dividing by two with correct rounding if the highest bit is set and casting that to float, then adding 1 to mantissa with integer addition
2021-10-28	new chacha20 test	Quentin Carbonneaux

2021-10-26	remove trailing whitespace from test/abi7.ssa	Michael Forney

2021-10-22	make variadic args explicit	Quentin Carbonneaux
	Some abis, like the riscv one, treat arguments differently depending on whether they are variadic or not. To prepare for the upcomming riscv target, we change the variadic call syntax and give meaning to the location of the '...' marker. # new syntax %ret =w call $f(w %regular, ..., w %variadic) By nature of their abis, the change is backwards compatible for existing targets.
2021-10-17	amd64/sysv: unbreak env calls	Quentin Carbonneaux
	Env calls were disfunctional from the start. This fixes them on amd64, but they remain to do on arm64. A new test shows how to use them.
2021-09-07	test: use architecture-neutral wrapper for calling vprintf	Michael Forney
	Different architectures use different types for va_list: x86_64 uses an 1-length array of struct type[0]: typedef struct { unsigned int gp_offset; unsigned int fp_offset; void overflow_arg_area; void reg_save_area; } va_list[1]; aarch64 uses a struct type[1] typedef struct { void __stack; void __gr_top; void __vr_top; int __gr_offs; int __vr_offs; } va_list; Consequently, C functions which takes a va_list as an argument, such as vprintf, may pass va_list in different ways depending on the architecture. On x86_64, va_list is an array type, so parameter decays to a pointer and passing the address of the va_list is correct. On aarch64, the va_list struct is passed by value, but since it is larger than 16 bytes, the parameter is replaced with a pointer to caller-allocated memory. Thus, passing the address as an l argument happens to work. However, this pattern of passing the address of the va_list to vprintf doesn't extend to other architectures. On riscv64, va_list is defined as typedef void va_list; which is not passed by reference. This means that tests that call vprintf using the address of a va_list (vararg1 and vararg2) will not work on riscv. To fix this while keeping the tests architecture-neutral, add a small wrapper function to the driver which takes a va_list *, and let the C compiler deal with the details of passing va_list by value. [0] https://c9x.me/compile/bib/abi-x64.pdf#figure.3.34 [1] https://c9x.me/compile/bib/abi-arm64.pdf#%5B%7B%22num%22%3A63%2C%22gen%22%3A0%7D%2C%7B%22name%22%3A%22XYZ%22%7D%2C52%2C757%2C0%5D [2] https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#va_list-va_start-and-va_arg$
2021-09-07	test: assign result of print functions to temporary	Michael Forney
	Though I am not aware of any architecture where this matters, it is technically incorrect to call these stdio functions as if they had no result. The QBE documentation says > Unless the called function does not return a value, a return > temporary must be specified, even if it is never used afterwards. so we should follow it in the tests as well.
2021-08-29	amd64/isel: fix floating point == and != result with NaN	Michael Forney
	On x86_64, ucomis[sd] sets ZF=1, PF=0, CF=0 for equal arguments. However, if the arguments are unordered it sets ZF=1, PF=1, CF=1, and there is no jump/flag instruction for ZF=1 & PF=0 or ZF=1 & CF=0. So, in order to correctly implement ceq[sd] on x86_64, we need to be a bit more creative. There are several options available, depending on whether the result of ceq[sd] is used with jnz, or with other instructions, or both. If the result is used for a conditional jump, both gcc and clang use a combination of jp and jnz: ucomisd %xmm1, %xmm0 jp .Lfalse jnz .Lfalse ... .Lfalse: If the result is used in other instructions or return, gcc does the following for x == y: ucomisd %xmm1, %xmm0 setnp %al movzbl %al, %eax movl $0, %edx cmovne %edx, %eax This sets EAX to PF=0, then uses cmovne to clear it if ZF=0. It also takes care to avoid clobbering the flags register in case the result is also used for a conditional jump. Implementing this approach in QBE would require adding an architecture-specific instruction for cmovne. In contrast, clang does an additional compare, this time using cmpeqsd instead of ucomisd: cmpeqsd %xmm1, %xmm0 movq %xmm0, %rax andl $1, %rax The cmpeqsd instruction doas a floating point equality test, setting XMM0 to all 1s if they are equal and all 0s if they are not. However, we need the result in a non-XMM register, so it moves the result back then masks off all but the first bit. Both of these approaches are a bit awkward to implement in QBE, so instead, this commit does the following: ucomisd %xmm1, %xmm0 setz %al movzbl %al, %eax setnp %cl movzbl %cl, %ecx andl %ecx, %eax This sets the result by anding the two flags, but has a side effect of clobbering the flags register. This was a problem in one of my earlier patches to fix this issue[0], in addition to being more complex than I'd hoped. Instead, this commit always leaves the ceq[sd] instruction in the block, even if the result is only used to control a jump, so that the above instruction sequence is always used. Then, since we now have ZF=!(ZF=1 & PF=0) for x == y, or ZF=!(ZF=0 \| PF=1) for x != y, we can use jnz for the jump instruction. [0] https://git.sr.ht/~sircmpwn/qbe/commit/64833841b18c074a23b4a1254625315e05b86658
2021-08-27	amd64/isel: fix floating < and <= result with NaN	Michael Forney
	When the two operands are Unordered (for instance if one of them is NaN), ucomisd sets ZF=1, PF=1, and CF=1. When the result is LessThan, it sets ZF=0, PF=0, and CF=1. However, jb[e]/setb[e] only checks that CF=1 [or ZF=1] which causes the result to be true for unordered operands. To fix this, change the operand swap condition for these two floating point comparison types: always rewrite x < y as y > x, and never rewrite x > y as y < x. Add a test to check the result of cltd, cled, cgtd, cged, ceqd, and cned with arguments that are LessThan, Equal, GreaterThan, and Unordered. Additionally, check three different implementations for equality testing: one that uses the result of ceqd directly, one that uses the result to control a conditional jump, and one that uses the result both as a value and for a conditional jump. For now, unordered equality tests are still broken so they are disabled.
2021-08-23	parsefields: fix padding calculation	Drew DeVault
	This was causing issues with aggregate types. A simple reproduction is: type :type.1 = align 8 { 24 } type :type.2 = align 8 { w 1, :type.1 1 } The size of type.2 should be 32, adding only 4 bytes of padding between the first and second field. Prior to this patch, 20 bytes of padding was added instead, causing the type to have a size of 48. Signed-off-by: Drew DeVault <sir@cmpwn.com>
2021-06-17	amd64: fix conditional jump when compare is swapped and used elsewhere	Michael Forney
	selcmp may potentially swap the arguments and return 1 indicating that the opposite operation should be used. However, if the compare result is used for a conditional jump as well as elsewhere, the original compare op is used instead of the opposite. To fix this, add a check to see whether the opposite compare should be used, regardless of whether selcmp() is done now, or later on during sel(). Bug report and test case from Charlie Stanton.
2019-04-30	isel fix for amd64 memory stores	Quentin Carbonneaux
	The value argument of store instructions was handled incorrectly.
2019-04-29	fix folding of unsigned operations	Quentin Carbonneaux
	This fixes similar bugs than the ones fixed in the previous commit. In the folding code the invariant is that when a result is 32 bits wide, the low 32 bits of 'x' are correct. The high bits can be anything.
2019-04-29	fold: Make sure 32-bit constants get sign extended when necessary	Michael Forney

2019-04-26	new large test to evaluate performance	Quentin Carbonneaux
	This was generated by csmith and then compiled to qbe il by Michael Forney's C compiler.
2019-04-26	update conaddr test to catch early segfaults	Quentin Carbonneaux

2019-04-26	restore some code from b4a98c	Quentin Carbonneaux
	I had forgotten that %rip can only be used as base when there is no index. I also added a test which stresses addressing selection with and without constants.
2019-03-12	emit valid code for mem->mem copies	Quentin Carbonneaux

2019-03-09	add a stress test for phi spilling	Quentin Carbonneaux

2019-02-21	fix amd64 addressing mode matcher	Quentin Carbonneaux
	The numberer made some arranging choices when numbering arguments of an instruction, but these decisions were ignored when matching. The fix is to reconcile numbering and matching.
2019-02-18	mark phi arguments as escaping	Quentin Carbonneaux
	Thanks to Michael Forney for spotting this oversight and providing the test case. Note: because esc() leaves ABot unchanged, the assertion "a->type == ABot" on line 122 remains valid.
2019-02-06	2 bug fixes in rega	Quentin Carbonneaux
	The worst one was that "part 3" of rega() could break the critical invariant that two interferring temporaries get assigned different registers. This is fixed by being careful when changing the register of a temporary based on predecessor blocks. Thanks to Michael Forney for reporting these bugs and helping with the analysis.
2017-09-22	mark printf call as variadic in test	Quentin Carbonneaux

2017-07-30	fix dynamic stack allocs for amd64	Quentin Carbonneaux
	The arm64 might have the same problem but it is currently unable to handle them even in instruction selection. Thanks to Jean Dao for reporting the bug.
2017-04-08	new abi test for arm64 HFAs	Quentin Carbonneaux

2017-04-08	enable arm64 tests	Quentin Carbonneaux
	The vararg tests had to be changed because va_list is 32-bit wide on arm. The astute reader will notice that the way we pass va_list values is wrong, we should be using the ':valist' type as defined below instead of 'l'. But eh, that works for now, because of the ABI. type :valist = align 8 { 32 }
2017-03-03	add another idiomatic C test (rega does no good)	Quentin Carbonneaux

2017-02-23	add simple idiomatic c test	Quentin Carbonneaux

2017-02-10	tests for variable argument lists	Quentin Carbonneaux

2017-02-10	support variable argument lists	Quentin Carbonneaux
	This change is backward compatible, calls to "variadic" functions (like printf) must now be annotated (with ...).
2016-12-12	new tests for the load optimization	Quentin Carbonneaux

2016-12-12	new eight queens test	Quentin Carbonneaux

2016-08-16	add support for unions in sysv abi	Quentin Carbonneaux

2016-04-22	refine fp conversion instructions	Quentin Carbonneaux

2016-04-13	add regression test for liveon bug	Quentin Carbonneaux

2016-04-13	add regression test for fixarg bug	Quentin Carbonneaux