Age  Commit message  Author
2022-12-12  zero msbs of 32-bit constants  Quentin Carbonneaux
Some noisy assemblers complain when asked to do it themselves.
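A minimal sketch of the idea, not the actual patch: before a constant is printed as a 32-bit immediate, its upper 32 bits are cleared so the assembler never has to truncate it. The helper name `zero_msbs32` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 32 bits of a 64-bit constant,
 * as one would before emitting it as a 32-bit immediate. */
uint64_t
zero_msbs32(int64_t v)
{
	return (uint64_t)v & 0xffffffffu;
}
```

With this, a constant such as -1 is emitted as 0xffffffff instead of a 64-bit pattern that the assembler would have to narrow.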
2022-11-27  new hlt block terminator  Quentin Carbonneaux
It is handy to express when the end of a block cannot be reached. If a hlt terminator is executed, it traps the program. Unlike LLVM, we do not specify the execution semantics as undefined behavior.
2022-11-24  cosmetics in mem.c  Quentin Carbonneaux
2022-11-22  use a new struct for symbols  Quentin Carbonneaux
Symbols are a useful abstraction that occurs in both Con and Alias. In this patch they get their own struct. This new struct packages a symbol name and a type; the type tells us where the symbol name must be interpreted (currently, in global memory or in thread-local storage). The refactor fixed a bug in addcon(), proving the value of packaging symbol names with their type.
2022-11-22  rename Tmp.ins to be more descriptive  Quentin Carbonneaux
2022-11-21  fix allocation ordering bug in rega  Quentin Carbonneaux
When we process one block, we start by allocating registers for all the temporaries live at the exit of the block. Before this patch we processed temps first, then in doblk() we would mark globally live registers allocated. This meant that temps could get wrongly assigned a live register. The fix is simple: we now process registers first at block exits, then allocate temps.
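The ordering described above can be sketched as follows. This is a simplified illustration under stated assumptions, not qbe's rega code: registers are modeled as bits in a mask, and `NREG`, `pickreg` and `allocexit` are hypothetical names.

```c
#include <stdint.h>

enum { NREG = 8 };  /* hypothetical register count */

/* Return the lowest register not set in *used and mark it used;
 * return -1 if every register is taken. */
static int
pickreg(uint32_t *used)
{
	int r;

	for (r = 0; r < NREG; r++)
		if (!(*used & (1u << r))) {
			*used |= 1u << r;
			return r;
		}
	return -1;
}

/* Assign registers to the ntmp temporaries live at a block exit.
 * livereg is the set of registers that are themselves live there. */
void
allocexit(uint32_t livereg, int ntmp, int *out)
{
	uint32_t used = livereg;    /* step 1: reserve live registers first */
	int i;

	for (i = 0; i < ntmp; i++)  /* step 2: only then allocate temporaries */
		out[i] = pickreg(&used);
}
```

Because the live registers are reserved before any temporary is handled, no temporary can be handed a register that is already live at the exit.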
2022-11-21  recognize some phis as copies  Quentin Carbonneaux
The copy elimination pass is not complete. This patch improves things a bit, but I think we still have quite a bit of incompleteness.

We now consistently mark phis with all arguments identical as copies. Previously, they were inconsistently eliminated by phisimpl(). An example where they were not eliminated is the following:

    @blk2
        %a = phi @blk0 %x, @blk1 %x
        jnz ?, @blk3, @blk4
    @blk3
        %b = copy %x
    @blk4
        %c = phi @blk2 %a, @blk3 %b

In this example, neither %c nor %a was marked as a copy of %x because, when phisimpl() is called, the copy information for %b is not available.

The incompleteness is still present and can be observed by modifying the example above so that %a takes a copy of %x through a back-edge. Then, phisimpl()'s lack of copy information about %b will prevent optimization.
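The rule for phis can be sketched like this, assuming temporaries are small integers and a hypothetical `copyof` table maps each temporary to the value it is known to copy (itself when nothing is known). The function name `phicopy` is also an assumption, not qbe's.

```c
/* Return the value a phi is a copy of when all of its arguments
 * resolve to the same value through copyof[], or -1 otherwise. */
int
phicopy(const int *copyof, const int *arg, int narg)
{
	int i, r;

	if (narg == 0)
		return -1;
	r = copyof[arg[0]];
	for (i = 1; i < narg; i++)
		if (copyof[arg[i]] != r)
			return -1;
	return r;
}
```

On the example from the message, with %x, %a, %b, %c numbered 0 to 3, %a resolves to %x right away, and %c resolves to %x once the copy information for %b is recorded in the table.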
2022-11-20  new slot coalescing pass  Quentin Carbonneaux
This pass limits stack usage when many small aggregates are allocated on the stack. A fast liveness analysis figures out which slots interfere and the pass then fuses slots that do not interfere. The pass also kills stack slots that are only ever assigned. On the hare stdlib test suite, this fusion pass managed to reduce the total eligible slot bytes count by 84%. The slots considered for fusion must not escape and not exceed 64 bytes in size.
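A rough sketch of the fusion idea, under simplifying assumptions that are not taken from qbe's source: each slot has a single live range, slots arrive sorted by range start, and two slots interfere exactly when their ranges overlap. The `Slot` layout and the `fuse` function are hypothetical; only the 64-byte limit and the no-escape condition come from the message.

```c
enum { MaxFuse = 64 };  /* size limit quoted in the commit message */

typedef struct {
	int start, end;  /* live range [start, end) */
	int size;        /* bytes */
	int escapes;     /* nonzero if the slot address escapes */
	int group;       /* output: slot it was fused into, -1 if not eligible */
} Slot;

/* Greedily fuse eligible slots that do not interfere. Slots must be
 * sorted by start. A group representative's end and size fields are
 * updated to describe the whole group. Returns the number of bytes
 * needed for the eligible slots after fusion. */
int
fuse(Slot *s, int n)
{
	int i, j, total = 0;

	for (i = 0; i < n; i++) {
		s[i].group = -1;
		if (s[i].escapes || s[i].size > MaxFuse)
			continue;
		/* look for a group that is dead before this slot becomes live */
		for (j = 0; j < i; j++)
			if (s[j].group == j && s[j].end <= s[i].start)
				break;
		if (j < i) {
			s[i].group = j;
			if (s[i].end > s[j].end)
				s[j].end = s[i].end;
			if (s[i].size > s[j].size) {
				total += s[i].size - s[j].size;
				s[j].size = s[i].size;
			}
		} else {
			s[i].group = i;
			total += s[i].size;
		}
	}
	return total;
}
```

For three slots of 16, 8 and 32 bytes where the first and third never overlap, the sketch needs 40 bytes instead of 56.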
2022-11-20  export getalias()  Quentin Carbonneaux
We will be using it in the new coalesce() pass.
2022-11-20  make multiple calls to fillalias() possible  Quentin Carbonneaux
The asserts (a->type == ABot) made it impossible to run fillalias() multiple times. We now reset the Alias.type field of all temps before starting. Getting rid of the asserts would have been another option.
2022-11-20  stored bytes in Alias information  Quentin Carbonneaux
Stack slots may have padding bytes, and if we want to have precise liveness information it's important that we are able to tell them apart. This patch extends fillalias() to remember for every slot what bytes were ever assigned. In case the slot address does not escape we know that only these bytes matter. To save space, we only store this information if the slot size is less than or equal to NBit. The Alias struct was reworked a bit to save some space. I am still not very satisfied with its layout though.
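One way to picture the stored-bytes information is a bit mask with one bit per byte of the slot. The sketch below assumes a 64-bit mask, so `NBit` is taken to be 64 here; `markstored` is a hypothetical name and does not mirror the actual Alias layout.

```c
#include <stdint.h>

enum { NBit = 64 };  /* assumed: number of bits in the mask type */

/* Record that bytes [off, off+len) of a slot of slotsz bytes were
 * stored to, and return the updated mask. Slots larger than NBit
 * bytes are not tracked, so the mask is returned unchanged. */
uint64_t
markstored(uint64_t mask, int slotsz, int off, int len)
{
	int i;

	if (slotsz > NBit || off < 0)
		return mask;
	for (i = off; i < off + len && i < slotsz; i++)
		mask |= (uint64_t)1 << i;
	return mask;
}
```

For an 8-byte slot holding a one-byte field at offset 0 and a four-byte field at offset 4, the mask ends up as 0xf1, which leaves the three padding bytes unmarked.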
2022-11-20  argc does not leak its address argument  Quentin Carbonneaux
2022-11-20  make Alias.base an int  Quentin Carbonneaux
We had the invariant that it'd always be a temporary.
2022-11-20  fill definition site in filluse()  Quentin Carbonneaux
2022-10-12  thread-local storage for amd64_apple  Quentin Carbonneaux
It is quite similar to arm64_apple. Probably, the call that needs to be generated also provides extra invariants on top of the regular abi, but I have not checked that.

Clang generates code that is a bit neater than qbe's because, on x86, a load can be fused in a call instruction! We do not bother with supporting these since we expect only sporadic use of the feature.

For reference, here is what clang might output for a store to the second entry of a thread-local array of ints:

    movq _x@TLVP(%rip), %rdi
    callq *(%rdi)
    movl %ecx, 4(%rax)
2022-10-12  thread-local storage for arm64_apple  Quentin Carbonneaux
It is documented nowhere how this is supposed to work. It is also quite easy to have assertion failures pop up in the linker when generating asm slightly different from clang's!

The best source of information is found in LLVM's source code (AArch64ISelLowering.cpp). I paste it here for future reference:

    /// Darwin only has one TLS scheme which must be capable of dealing with the
    /// fully general situation, in the worst case. This means:
    ///     + "extern __thread" declaration.
    ///     + Defined in a possibly unknown dynamic library.
    ///
    /// The general system is that each __thread variable has a [3 x i64] descriptor
    /// which contains information used by the runtime to calculate the address. The
    /// only part of this the compiler needs to know about is the first xword, which
    /// contains a function pointer that must be called with the address of the
    /// entire descriptor in "x0".
    ///
    /// Since this descriptor may be in a different unit, in general even the
    /// descriptor must be accessed via an indirect load. The "ideal" code sequence
    /// is:
    ///     adrp x0, _var@TLVPPAGE
    ///     ldr x0, [x0, _var@TLVPPAGEOFF]   ; x0 now contains address of descriptor
    ///     ldr x1, [x0]                     ; x1 contains 1st entry of descriptor,
    ///                                      ; the function pointer
    ///     blr x1                           ; Uses descriptor address in x0
    ///                                      ; Address of _var is now in x0.
    ///
    /// If the address of _var's descriptor *is* known to the linker, then it can
    /// change the first "ldr" instruction to an appropriate "add x0, x0, #imm" for
    /// a slight efficiency gain.

The call 'blr x1' above is actually special in that it trashes fewer registers than what the abi would normally permit. In qbe, I don't take advantage of this and lower the call like a regular call. We can revise this later on. Again, the source for this information is LLVM's source code:

    // TLS calls preserve all registers except those that absolutely must be
    // trashed: X0 (it takes an argument), LR (it's a call) and NZCV (let's not be
    // silly).
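The descriptor protocol quoted above can be modeled in C as a rough illustration. The struct and function names are hypothetical, the thunk is a stand-in for the one the Darwin runtime installs, and an ordinary static variable plays the role of the thread-local; only the shape (three 64-bit words, the first being a function pointer called with the descriptor address) comes from the LLVM comment.

```c
#include <stdint.h>

typedef struct TlvDesc TlvDesc;
struct TlvDesc {
	void *(*thunk)(TlvDesc *);  /* first xword: function pointer */
	uint64_t w1, w2;            /* remaining words, opaque to the compiler */
};

static int var;  /* stand-in for the thread-local variable */

/* Stand-in for the runtime's thunk: returns the variable's address. */
static void *
fakethunk(TlvDesc *d)
{
	(void)d;
	return &var;
}

/* What the generated code does: load the function pointer from the
 * descriptor and call it with the descriptor address as argument. */
void *
tlvaddr(TlvDesc *d)
{
	return d->thunk(d);
}
```

The indirect call in `tlvaddr` corresponds to the `ldr x1, [x0]` and `blr x1` pair; since qbe lowers it like a regular call, the usual caller-saved registers are assumed clobbered.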
2022-10-08  mark apple targets with a boolean  Quentin Carbonneaux
It is more natural to branch on a flag than have different function pointers for high-level passes.
2022-10-08  fix asm comment position  Quentin Carbonneaux
When emitting data detected as zero the comment appeared before the data directives were output.
2022-10-08  "rel" fields become "reloc"  Quentin Carbonneaux
2022-10-08  do not drop relocation kind in alias analysis  Quentin Carbonneaux
2022-10-08  add support for thread-local storage  Quentin Carbonneaux
The apple targets are not done yet.
2022-10-03  flag bad vastart uses  Quentin Carbonneaux
2022-10-03  fix case of Pool constants  Quentin Carbonneaux
2022-10-03  new arm64_apple target  Quentin Carbonneaux
Should make qbe work on apple arm-based hardware.
2022-10-03  refine width of parsb/ub/sh/uh ops  Quentin Carbonneaux
2022-10-03  add new target-specific abi0 pass  Quentin Carbonneaux
The general idea is to give abis a chance to talk before we've done all the optimizations. Currently, all targets eliminate {par,arg,ret}{sb,ub,...} during this pass. The forthcoming arm64_apple will, however, insert proper extensions during abi0. Moving forward abis can, for example, lower small-aggregates passing there so that memory optimizations can interact better with function calls.
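For reference, the extensions that the sb, ub, sh and uh types call for can be written out in C. This is only a sketch of what such extensions compute on a 32-bit word; the function names are hypothetical and this is not how the abi0 pass itself is written.

```c
#include <stdint.h>

/* Sign-extend the low byte of w to 32 bits. */
int32_t
extsb(uint32_t w)
{
	return (int32_t)((w & 0xff) ^ 0x80) - 0x80;
}

/* Zero-extend the low byte of w to 32 bits. */
uint32_t
extub(uint32_t w)
{
	return w & 0xff;
}

/* Sign-extend the low half-word of w to 32 bits. */
int32_t
extsh(uint32_t w)
{
	return (int32_t)((w & 0xffff) ^ 0x8000) - 0x8000;
}

/* Zero-extend the low half-word of w to 32 bits. */
uint32_t
extuh(uint32_t w)
{
	return w & 0xffff;
}
```

The xor-and-subtract form avoids relying on implementation-defined narrowing conversions.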
2022-10-03  parse sb,ub,sh,uh abi types  Quentin Carbonneaux
2022-09-15  Fix parsing of multiple globals in datadef  Ember Sawady
E.g. data $a = { w $b $c }
2022-09-01  capitalize a label  Quentin Carbonneaux
2022-09-01  remove two unsigned  Quentin Carbonneaux
We have a uint alias that we use everywhere else. I also added a todo about unhandled large offsets in arm64/emit.
2022-09-01  use direct bl calls on arm64  Quentin Carbonneaux
This generates tidier code and is pic friendly because it lets the linker trampoline calls to dynlinked libs.
2022-08-31  drop -G flag and add target amd64_apple  Quentin Carbonneaux
Apple support is more than assembly syntax in the case of arm64 machines, and apple syntax is currently useless in all cases but amd64. Rather than having a -G option that only makes sense with amd64, we add a new target, amd64_apple.
2022-08-31  flag the default target in "qbe -h"  Quentin Carbonneaux
2022-08-31  fix some variadic calls in test/abi8.ssa  Quentin Carbonneaux
2022-08-31  regenerate test/vararg2.ssa  Quentin Carbonneaux
- update the test generation script to match some manual changes
- fix some variadic calls to printf
- add a test case where an odd number of slots is used on the stack before varargs
2022-07-01  Reject multiple section definition for a symbol  Roberto E. Vargas Caballero
2022-07-01  Add qbe identifier in error strings  Roberto E. Vargas Caballero
When qbe is used with other tools, it is a bit hard to identify which tool is generating an error. Adding an identifier at the beginning of the line makes it much easier to identify the tool generating the error.
2022-07-01  Makefile: Avoid double macro expansion in targets  Roberto E. Vargas Caballero
POSIX specification states:

    string1 = [string2]
    ...
    Macro expansions in string1 of macro definition lines shall be evaluated when read. Macro expansions in string2 of macro definition lines shall be performed when the macro identified by string1 is expanded in a rule or command.

It means that recursive macro expansion is not guaranteed to work in a portable Make. Also, as make is a declarative language, it makes more sense to declare your targets as a primary concern instead of deriving them from an informational macro like SRC that is only used in a rule command.
2022-06-29  Fix minor typos in IL doc  Simon Heath
2022-06-16  install with install -m755 v1.0  Quentin Carbonneaux
2022-06-14  tools/test.sh: Without a TARGET, use $CC if defined  Haelwenn (lanodan) Monnier
cc can be absent in Gentoo to make sure the right compiler is picked, for example when clang is preferred or when cross-compiling.
2022-06-14  Makefile: POSIXify  illiliti
Makefile now compatible with gmake, bmake, smake and pdpmake.
2022-06-14  do not fold cnst+cnst in amd64's isel  Quentin Carbonneaux
This may cause invalid assembly to be generated and is not all that useful anyway after constant folding has run.
2022-06-14  rv64: implement Oswap for floating-point types  Alexey Yerin
2022-06-14  refine assertion in liveness analysis  Quentin Carbonneaux
We were redundantly checking cardinality in a way that prevented fp regs from ever being globally live. We now check that the live regs after a return are exactly the globally live ones.
2022-05-12  install in /usr/local by default  Quentin Carbonneaux
2022-05-12  tighten function definition spec  Quentin Carbonneaux
2022-05-12  use an alias for \n in the il spec  Quentin Carbonneaux
2022-05-11  avoid folding overflowing divisions  Quentin Carbonneaux
Thanks to Paul Ouellette for reporting.
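A sketch of the kind of guard this implies, assuming the overflowing case is the usual one for signed division (the most negative value divided by -1). The function name `folddiv` and the extra division-by-zero check are assumptions for illustration, not taken from the patch.

```c
#include <stdint.h>

/* Try to fold the signed division l / r at compile time.
 * Returns 1 and stores the quotient in *res when folding is safe;
 * returns 0 when the division must be left for run time. */
int
folddiv(int64_t l, int64_t r, int64_t *res)
{
	if (r == 0)
		return 0;  /* division by zero: do not fold */
	if (l == INT64_MIN && r == -1)
		return 0;  /* quotient does not fit: do not fold */
	*res = l / r;
	return 1;
}
```

Leaving such a division unfolded keeps the compiler itself from performing an operation whose result is not representable.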
2022-05-11  document spacing in il reference  Quentin Carbonneaux