~cnx/roux - Alternative QBE compiler

Age	Commit message	Author
2023-01-09	reorder some sections in doc v1.1	Quentin Carbonneaux

2022-12-27	ready for this jelly	Quentin Carbonneaux

2022-12-25	link pthread in tests	Quentin Carbonneaux

2022-12-25	new UNDEF Ref	Quentin Carbonneaux
	Crashing loads of uninitialized memory proved to be a problem when implementing unions using qbe. This patch introduces a new UNDEF Ref to represent data that is known to be uninitialized. Optimization passes can make use of it to eliminate some code. In the last compilation stages, UNDEF is treated as the constant 0xdeaddead.
2022-12-16	update documentation	Quentin Carbonneaux

2022-12-15	bugfix in load elimination	Quentin Carbonneaux
	When checking if two slices represent the same range of memory we must check that offsets match. The bug was revealed by a harec test.
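The check described above can be sketched as follows. This is a minimal illustration, not qbe's code: the `Slice` struct and the `samerange()` name are hypothetical, and the sketch assumes a slice is fully described by a base, a byte offset, and a size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified description of a memory slice. */
typedef struct {
	int base;     /* id of the base temporary or slot */
	int64_t off;  /* byte offset from the base */
	int size;     /* size of the access in bytes */
} Slice;

/* Two slices cover the same memory only if base, offset
 * and size all agree; comparing base and size alone would
 * confuse, e.g., two 4-byte fields of the same struct. */
static bool
samerange(Slice a, Slice b)
{
	assert(a.size > 0 && b.size > 0);
	return a.base == b.base
	    && a.off == b.off
	    && a.size == b.size;
}
```

With this definition, two 4-byte slices on the same base at offsets 0 and 4 are correctly reported as distinct ranges.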
2022-12-14	new blit instruction	Quentin Carbonneaux

2022-12-14	fix coalesce() to produce valid ssa	Quentin Carbonneaux
	When multiple stack slots are coalesced one 'alloc' instruction is kept in the il and the other ones are removed and have their uses replaced by the result of the selected one. To produce valid ssa, it must be ensured that the uses that get replaced are dominated by the selected 'alloc' instruction. This patch ensures dominance by moving the selected alloc up in the start block as necessary.
2022-12-12	treat retc as non-escaping	Quentin Carbonneaux
	We may well treat all rets as non-escaping since stack slots are destroyed upon function return.
2022-12-12	new rsval() helper for signed Refs	Quentin Carbonneaux
	The .val field is signed in RSlot. Add a new dedicated function to fetch it as a signed int.
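One way such a helper could work is sketched below. The layout (a 3-bit tag plus a 29-bit unsigned payload) and the name `signedval()` are assumptions made for illustration; the commit message only says that the value is stored in `.val` and must be read back as a signed int.

```c
#include <assert.h>

/* Assumed layout: 3-bit tag, 29-bit unsigned payload. */
typedef struct {
	unsigned type:3;
	unsigned val:29;
} Ref;

/* Sign-extend the 29-bit payload: flip the sign bit
 * (bit 28), then subtract that bit's weight. */
static int
signedval(Ref r)
{
	return (int)(r.val ^ 0x10000000u) - 0x10000000;
}
```

Storing `-8 & 0x1fffffff` in `val` and calling `signedval()` gives back -8, while small non-negative values pass through unchanged.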
2022-12-12	crash loads from uninitialized slots	Quentin Carbonneaux

2022-12-12	renamings in coalesce()	Quentin Carbonneaux

2022-12-12	zero msbs of 32-bit constants	Quentin Carbonneaux
	Some noisy assemblers complain when asked to do it themselves.
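A minimal sketch of the idea, assuming constants are held in a 64-bit integer regardless of their IL width. The helper name `zeromsbs32()` is hypothetical and not taken from qbe.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of a constant that is used at
 * 32-bit width, so the assembler never sees a sign-extended
 * 64-bit value it would have to truncate itself. */
static uint64_t
zeromsbs32(int64_t bits)
{
	uint64_t u = (uint64_t)bits & 0xffffffffu;

	assert(u <= 0xffffffffu);
	return u;
}
```

For instance, the 32-bit constant -1 is stored as 0xffffffffffffffff and would be emitted as 0xffffffff after masking.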
2022-11-27	new hlt block terminator	Quentin Carbonneaux
	It is handy to express when the end of a block cannot be reached. If a hlt terminator is executed, it traps the program. We don't go the llvm way and specify execution semantics as undefined behavior.
2022-11-24	cosmetics in mem.c	Quentin Carbonneaux

2022-11-22	use a new struct for symbols	Quentin Carbonneaux
	Symbols are a useful abstraction that occurs in both Con and Alias. In this patch they get their own struct. This new struct packages a symbol name and a type; the type tells us where the symbol name must be interpreted (currently, in global memory or in thread-local storage). The refactor fixed a bug in addcon(), proving the value of packaging symbol names with their type.
2022-11-22	rename Tmp.ins to be more descriptive	Quentin Carbonneaux

2022-11-21	fix allocation ordering bug in rega	Quentin Carbonneaux
	When we process one block, we start by allocating registers for all the temporaries live at the exit of the block. Before this patch we processed temps first, then in doblk() we would mark globally live registers allocated. This meant that temps could get wrongly assigned a live register. The fix is simple: we now process registers first at block exits, then allocate temps.
2022-11-21	recognize some phis as copies	Quentin Carbonneaux
	The copy elimination pass is not complete. This patch improves things a bit, but I think we still have quite a bit of incompleteness. We now consistently mark phis with all arguments identical as copies. Previously, they were inconsistently eliminated by phisimpl(). An example where they were not eliminated is the following:
	    @blk2
	        %a = phi @blk0 %x, @blk1 %x
	        jnz ?, @blk3, @blk4
	    @blk3
	        %b = copy %x
	    @blk4
	        %c = phi @blk2 %a, @blk3 %b
	In this example, neither %c nor %a were marked as copies of %x because, when phisimpl() is called, the copy information for %b is not available. The incompleteness is still present and can be observed by modifying the example above so that %a takes a copy of %x through a back-edge. Then, phisimpl()'s lack of copy information about %b will prevent optimization.
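The rule "a phi whose arguments are all identical is a copy" can be sketched as below. This is an illustration only: the `Phi` struct, the `copyof` table (mapping each temporary to the temporary it is known to copy) and `phicopy()` are hypothetical and much simpler than qbe's data structures.

```c
#include <assert.h>

enum { MaxArg = 8, NotACopy = -1 };

/* Hypothetical phi: a small list of argument temporaries. */
typedef struct {
	int narg;
	int arg[MaxArg];
} Phi;

/* If every argument resolves to the same temporary through
 * copyof[], return that temporary; otherwise NotACopy. */
static int
phicopy(const Phi *p, const int *copyof)
{
	int i, r;

	assert(p->narg > 0 && p->narg <= MaxArg);
	r = copyof[p->arg[0]];
	for (i = 1; i < p->narg; i++)
		if (copyof[p->arg[i]] != r)
			return NotACopy;
	return r;
}
```

Numbering %x, %a, %b, %c as 0 to 3, the phi for %a resolves to %x at once; once `copyof` records that %a and %b copy %x, the phi for %c resolves to %x as well. This only works if the information about %b is available when %c is examined, which is the ordering issue the message describes.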
2022-11-20	new slot coalescing pass	Quentin Carbonneaux
	This pass limits stack usage when many small aggregates are allocated on the stack. A fast liveness analysis figures out which slots interfere and the pass then fuses slots that do not interfere. The pass also kills stack slots that are only ever assigned. On the hare stdlib test suite, this fusion pass managed to reduce the total eligible slot bytes count by 84%. The slots considered for fusion must not escape and not exceed 64 bytes in size.
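The eligibility and interference conditions above can be sketched as follows. The sketch assumes each slot's liveness is summarized by a single inclusive range of instruction indices, which is a simplification; the `Slot` struct and the function names are hypothetical, and only the 64-byte limit and the no-escape requirement come from the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot summary. */
typedef struct {
	int size;         /* size in bytes */
	bool escapes;     /* does the slot address escape? */
	int first, last;  /* inclusive live range */
} Slot;

/* Only small, non-escaping slots are considered. */
static bool
eligible(const Slot *s)
{
	assert(s->size > 0);
	return !s->escapes && s->size <= 64;
}

/* Two slots interfere when their live ranges overlap. */
static bool
interfere(const Slot *a, const Slot *b)
{
	assert(a->first <= a->last && b->first <= b->last);
	return a->first <= b->last && b->first <= a->last;
}

/* Slots may share storage if both are eligible
 * and they are never live at the same time. */
static bool
canfuse(const Slot *a, const Slot *b)
{
	return eligible(a) && eligible(b) && !interfere(a, b);
}
```

Under these assumptions, a 16-byte slot live over [0, 5] can share storage with an 8-byte slot live over [6, 9], but not with one live over [4, 7].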
2022-11-20	export getalias()	Quentin Carbonneaux
	We will be using it in the new coalesce() pass.
2022-11-20	make multiple calls to fillalias() possible	Quentin Carbonneaux
	The asserts (a->type == ABot) made it impossible to run fillalias() multiple times. We now reset the Alias.type field of all temps before starting. Getting rid of the asserts would have been another option.
2022-11-20	stored bytes in Alias information	Quentin Carbonneaux
	Stack slots may have padding bytes, and if we want to have precise liveness information it's important that we are able to tell them apart. This patch extends fillalias() to remember for every slot what bytes were ever assigned. In case the slot address does not escape we know that only these bytes matter. To save space, we only store this information if the slot size is less than or equal to NBit. The Alias struct was reworked a bit to save some space. I am still not very satisfied with its layout though.
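Tracking stored bytes with one bit per byte could look like the sketch below. It assumes NBit is 64 (the width of the mask type used here), and `markstore()` is a hypothetical name; the message only states that the information is kept when the slot size is at most NBit.

```c
#include <assert.h>
#include <stdint.h>

enum { NBit = 64 };  /* assumed: bits in the mask type */

/* Record that bytes [off, off+size) of a slot were stored
 * to, by setting the corresponding bits in the mask. */
static uint64_t
markstore(uint64_t mask, int off, int size)
{
	uint64_t m;

	assert(off >= 0 && size > 0 && off + size <= NBit);
	m = size == NBit ? ~(uint64_t)0 : ((uint64_t)1 << size) - 1;
	return mask | (m << off);
}
```

For a slot holding a 1-byte field at offset 0 and a 4-byte field at offset 4, the resulting mask is 0xf1: bits 1 to 3 stay clear, identifying the padding bytes that liveness analysis can ignore.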
2022-11-20	argc does not leak its address argument	Quentin Carbonneaux

2022-11-20	make Alias.base an int	Quentin Carbonneaux
	We had the invariant that it'd always be a temporary.
2022-11-20	fill definition site in filluse()	Quentin Carbonneaux

2022-10-12	thread-local storage for amd64_apple	Quentin Carbonneaux
	It is quite similar to arm64_apple. Probably, the call that needs to be generated also provides extra invariants on top of the regular abi, but I have not checked that. Clang generates code that is a bit neater than qbe's because, on x86, a load can be fused in a call instruction! We do not bother with supporting these since we expect only sporadic use of the feature. For reference, here is what clang might output for a store to the second entry of a thread-local array of ints:
	    movq _x@TLVP(%rip), %rdi
	    callq *(%rdi)
	    movl %ecx, 4(%rax)
2022-10-12	thread-local storage for arm64_apple	Quentin Carbonneaux
	It is documented nowhere how this is supposed to work. It is also quite easy to have assertion failures pop in the linker when generating asm slightly different from clang's! The best source of information is found in LLVM's source code (AArch64ISelLowering.cpp). I paste it here for future reference:
	    /// Darwin only has one TLS scheme which must be capable of dealing with the
	    /// fully general situation, in the worst case. This means:
	    ///     + "extern __thread" declaration.
	    ///     + Defined in a possibly unknown dynamic library.
	    ///
	    /// The general system is that each __thread variable has a [3 x i64] descriptor
	    /// which contains information used by the runtime to calculate the address. The
	    /// only part of this the compiler needs to know about is the first xword, which
	    /// contains a function pointer that must be called with the address of the
	    /// entire descriptor in "x0".
	    ///
	    /// Since this descriptor may be in a different unit, in general even the
	    /// descriptor must be accessed via an indirect load. The "ideal" code sequence
	    /// is:
	    ///     adrp x0, _var@TLVPPAGE
	    ///     ldr x0, [x0, _var@TLVPPAGEOFF]   ; x0 now contains address of descriptor
	    ///     ldr x1, [x0]                     ; x1 contains 1st entry of descriptor,
	    ///                                      ; the function pointer
	    ///     blr x1                           ; Uses descriptor address in x0
	    ///                                      ; Address of _var is now in x0.
	    ///
	    /// If the address of _var's descriptor is known to the linker, then it can
	    /// change the first "ldr" instruction to an appropriate "add x0, x0, #imm" for
	    /// a slight efficiency gain.
	The call 'blr x1' above is actually special in that it trashes fewer registers than what the abi would normally permit. In qbe, I don't take advantage of this and lower the call like a regular call. We can revise this later on. Again, the source for this information is LLVM's source code:
	    // TLS calls preserve all registers except those that absolutely must be
	    // trashed: X0 (it takes an argument), LR (it's a call) and NZCV (let's not be
	    // silly).
2022-10-08	mark apple targets with a boolean	Quentin Carbonneaux
	It is more natural to branch on a flag than have different function pointers for high-level passes.
2022-10-08	fix asm comment position	Quentin Carbonneaux
	When emitting data detected as zero the comment appeared before the data directives were output.
2022-10-08	"rel" fields become "reloc"	Quentin Carbonneaux

2022-10-08	do not drop relocation kind in alias analysis	Quentin Carbonneaux

2022-10-08	add support for thread-local storage	Quentin Carbonneaux
	The apple targets are not done yet.
2022-10-03	flag bad vastart uses	Quentin Carbonneaux

2022-10-03	fix case of Pool constants	Quentin Carbonneaux

2022-10-03	new arm64_apple target	Quentin Carbonneaux
	Should make qbe work on apple arm-based hardware.
2022-10-03	refine width of parsb/ub/sh/uh ops	Quentin Carbonneaux

2022-10-03	add new target-specific abi0 pass	Quentin Carbonneaux
	The general idea is to give abis a chance to talk before we've done all the optimizations. Currently, all targets eliminate {par,arg,ret}{sb,ub,...} during this pass. The forthcoming arm64_apple will, however, insert proper extensions during abi0. Moving forward abis can, for example, lower small-aggregates passing there so that memory optimizations can interact better with function calls.
2022-10-03	parse sb,ub,sh,uh abi types	Quentin Carbonneaux

2022-09-15	Fix parsing of multiple globals in datadef	Ember Sawady
	Eg. data $a = { w $b $c }
2022-09-01	capitalize a label	Quentin Carbonneaux

2022-09-01	remove two unsigned	Quentin Carbonneaux
	We have a uint alias that we use everywhere else. I also added a todo about unhandled large offsets in arm64/emit.
2022-09-01	use direct bl calls on arm64	Quentin Carbonneaux
	This generates tidier code and is pic friendly because it lets the linker trampoline calls to dynlinked libs.
2022-08-31	drop -G flag and add target amd64_apple	Quentin Carbonneaux
	apple support is more than assembly syntax in case of arm64 machines, and apple syntax is currently useless in all cases but amd64; rather than having a -G option that only makes sense with amd64, we add a new target amd64_apple
2022-08-31	flag the default target in "qbe -h"	Quentin Carbonneaux

2022-08-31	fix some variadic calls in test/abi8.ssa	Quentin Carbonneaux

2022-08-31	regenerate test/vararg2.ssa	Quentin Carbonneaux
	- update the test generation script to match some manual changes
	- fix some variadic calls to printf
	- add a test case where an odd number of slots is used on the stack before varargs
2022-07-01	Reject multiple section definition for a symbol	Roberto E. Vargas Caballero

2022-07-01	Add qbe identifier in error strings	Roberto E. Vargas Caballero
	When qbe is used with other tools, it is a bit hard to identify which tool is generating the error. Adding an identifier at the beginning of the line makes it much easier to identify the tool generating the error.
2022-07-01	Makefile: Avoid double macro expansion in targets	Roberto E. Vargas Caballero
	The POSIX specification states:
	    string1 = [string2]
	    ...
	    Macro expansions in string1 of macro definition lines shall be evaluated when read. Macro expansions in string2 of macro definition lines shall be performed when the macro identified by string1 is expanded in a rule or command.
	It means that recursive macro expansion is not guaranteed to work in a portable Make. Also, since make is a declarative language, it makes more sense to declare your targets as a primary concern instead of deriving them from an informational macro like SRC that is only used in a rule command.