summary refs log tree commit diff
path: root/amd64/isel.c
AgeCommit message (Collapse)Author
2022-11-27new hlt block terminatorQuentin Carbonneaux
It is handy to express when the end of a block cannot be reached. If a hlt terminator is executed, it traps the program. We don't go the llvm way and specify execution semantics as undefined behavior.
2022-11-22use a new struct for symbolsQuentin Carbonneaux
Symbols are a useful abstraction that occurs in both Con and Alias. In this patch they get their own struct. This new struct packages a symbol name and a type; the type tells us where the symbol name must be interpreted (currently, in gobal memory or in thread-local storage). The refactor fixed a bug in addcon(), proving the value of packaging symbol names with their type.
2022-10-12thread-local storage for amd64_appleQuentin Carbonneaux
It is quite similar to arm64_apple. Probably, the call that needs to be generated also provides extra invariants on top of the regular abi, but I have not checked that. Clang generates code that is a bit neater than qbe's because, on x86, a load can be fused in a call instruction! We do not bother with supporting these since we expect only sporadic use of the feature. For reference, here is what clang might output for a store to the second entry of a thread-local array of ints: movq _x@TLVP(%rip), %rdi callq *(%rdi) movl %ecx, 4(%rax)
2022-10-08add support for thread-local storageQuentin Carbonneaux
The apple targets are not done yet.
2022-08-31drop -G flag and add target amd64_appleQuentin Carbonneaux
apple support is more than assembly syntax in case of arm64 machines, and apple syntax is currently useless in all cases but amd64; rather than having a -G option that only makes sense with amd64, we add a new target amd64_apple
2022-06-14do not fold cnst+cnst in amd64's iselQuentin Carbonneaux
This may cause invalid assembly to be generated and is not all that useful anyway after constant folding has run.
2022-03-14dynamic stack allocs for arm64Quentin Carbonneaux
I also moved some isel logic that would have been repeated a third time in util.c.
2022-01-28amd64/isel: nitsQuentin Carbonneaux
2022-01-28implement float -> unsigned castsBor Grošelj Simić
amd64 lacks instruction for this so it has to be implemented with float -> signed casts. The approach is borrowed from llvm.
2022-01-28implement unsigned -> float castsBor Grošelj Simić
amd64 lacks an instruction for this so it has to be implemented with signed -> float casts: - Word casting is done by zero-extending the word to a long and then doing a regular signed cast. - Long casting is done by dividing by two with correct rounding if the highest bit is set and casting that to float, then adding 1 to mantissa with integer addition
2022-01-23Add a negation instructionEyal Sawady
Necessary for floating-point negation, because `%result = sub 0, %operand` doesn't give the correct sign for 0/-0.
2021-11-22reuse previous address constants in fold()Michael Forney
parseref() has code to reuse address constants, but this is not done in other passes such as fold or isel. Introduce a new function newcon() which takes a Con and returns a Ref for that constant, and use this whenever creating address constants. This is necessary to fix folding of address constants when one operand is already folded. For example, in %a =l add $x, 1 %b =l add %a, 2 %c =w loadw %b %a and %b were folded to $x+1 and $x+3 respectively, but then the second add is visited again since it uses %a. This gets folded to $x+3 as well, but as a new distinct constant. This results in %b getting labeled as bottom instead of either constant, disabling the replacement of %b by a constant in subsequent instructions (such as the loadw).
2021-08-29amd64/isel: fix floating point == and != result with NaNMichael Forney
On x86_64, ucomis[sd] sets ZF=1, PF=0, CF=0 for equal arguments. However, if the arguments are unordered it sets ZF=1, PF=1, CF=1, and there is no jump/flag instruction for ZF=1 & PF=0 or ZF=1 & CF=0. So, in order to correctly implement ceq[sd] on x86_64, we need to be a bit more creative. There are several options available, depending on whether the result of ceq[sd] is used with jnz, or with other instructions, or both. If the result is used for a conditional jump, both gcc and clang use a combination of jp and jnz: ucomisd %xmm1, %xmm0 jp .Lfalse jnz .Lfalse ... .Lfalse: If the result is used in other instructions or return, gcc does the following for x == y: ucomisd %xmm1, %xmm0 setnp %al movzbl %al, %eax movl $0, %edx cmovne %edx, %eax This sets EAX to PF=0, then uses cmovne to clear it if ZF=0. It also takes care to avoid clobbering the flags register in case the result is also used for a conditional jump. Implementing this approach in QBE would require adding an architecture-specific instruction for cmovne. In contrast, clang does an additional compare, this time using cmpeqsd instead of ucomisd: cmpeqsd %xmm1, %xmm0 movq %xmm0, %rax andl $1, %rax The cmpeqsd instruction doas a floating point equality test, setting XMM0 to all 1s if they are equal and all 0s if they are not. However, we need the result in a non-XMM register, so it moves the result back then masks off all but the first bit. Both of these approaches are a bit awkward to implement in QBE, so instead, this commit does the following: ucomisd %xmm1, %xmm0 setz %al movzbl %al, %eax setnp %cl movzbl %cl, %ecx andl %ecx, %eax This sets the result by anding the two flags, but has a side effect of clobbering the flags register. This was a problem in one of my earlier patches to fix this issue[0], in addition to being more complex than I'd hoped. Instead, this commit always leaves the ceq[sd] instruction in the block, even if the result is only used to control a jump, so that the above instruction sequence is always used. Then, since we now have ZF=!(ZF=1 & PF=0) for x == y, or ZF=!(ZF=0 | PF=1) for x != y, we can use jnz for the jump instruction. [0] https://git.sr.ht/~sircmpwn/qbe/commit/64833841b18c074a23b4a1254625315e05b86658
2021-08-27amd64/isel: fix floating < and <= result with NaNMichael Forney
When the two operands are Unordered (for instance if one of them is NaN), ucomisd sets ZF=1, PF=1, and CF=1. When the result is LessThan, it sets ZF=0, PF=0, and CF=1. However, jb[e]/setb[e] only checks that CF=1 [or ZF=1] which causes the result to be true for unordered operands. To fix this, change the operand swap condition for these two floating point comparison types: always rewrite x < y as y > x, and never rewrite x > y as y < x. Add a test to check the result of cltd, cled, cgtd, cged, ceqd, and cned with arguments that are LessThan, Equal, GreaterThan, and Unordered. Additionally, check three different implementations for equality testing: one that uses the result of ceqd directly, one that uses the result to control a conditional jump, and one that uses the result both as a value and for a conditional jump. For now, unordered equality tests are still broken so they are disabled.
2021-07-30err when an address contains a sum $a+$b (afl)Quentin Carbonneaux
Reported by Alessandro Mantovani. These addresses are likely bogus, but they triggered an unwarranted assertion failure. We now raise a civilized error.
2021-07-28handle fast locals in amd64 shifts (afl)Quentin Carbonneaux
Reported by Alessandro Mantovani. Although unlikely in real programs it was found that using the address of a fast local in amd64 shifts triggers assertion failures. We now err when the shift count is given by an address; but we allow shifting an address.
2021-07-28fix amd64 addressing selection bug (afl)Quentin Carbonneaux
Reported by Alessandro Mantovani. Unlikely to be hit in practice because we don't add addresses to addresses. type :biggie = { l, l, l } function $repro(:biggie %p) { @start %x =l add %p, $a storew 42, %x ret }
2021-06-17amd64: fix conditional jump when compare is swapped and used elsewhereMichael Forney
selcmp may potentially swap the arguments and return 1 indicating that the opposite operation should be used. However, if the compare result is used for a conditional jump as well as elsewhere, the original compare op is used instead of the opposite. To fix this, add a check to see whether the opposite compare should be used, regardless of whether selcmp() is done now, or later on during sel(). Bug report and test case from Charlie Stanton.
2019-04-30isel fix for amd64 memory storesQuentin Carbonneaux
The value argument of store instructions was handled incorrectly.
2019-04-26amd64/isel: Error if alloc size doesn't fit in Tmp slot typeMichael Forney
2019-04-26restore some code from b4a98cQuentin Carbonneaux
I had forgotten that %rip can only be used as base when there is no index. I also added a test which stresses addressing selection with and without constants.
2019-04-25cleanup amd64 constant addressingQuentin Carbonneaux
We now emit correct code when the user refers to a specific constant address. I also made some comments clearer in the instruction selection pass and got rid of some apparently useless code.
2019-02-25prefer bigger amd64 addressingQuentin Carbonneaux
Before, amatch() would prefer matching "o + b" to "o + s*i" and "b + s*i".
2019-02-21fix amd64 addressing mode matcherQuentin Carbonneaux
The numberer made some arranging choices when numbering arguments of an instruction, but these decisions were ignored when matching. The fix is to reconcile numbering and matching.
2018-04-26more compiler warnings...Quentin Carbonneaux
2018-04-26Fix compiler warnings.Emil Skoeldberg
Compiler warned about comparison between signed and unsigned values.
2017-07-30fix dynamic stack allocs for amd64Quentin Carbonneaux
The arm64 might have the same problem but it is currently unable to handle them even in instruction selection. Thanks to Jean Dao for reporting the bug.
2017-06-06fix fp subtractions on amd64Quentin Carbonneaux
The stashing of constants in gas.c was also changed to support 16-bytes constants.
2017-06-06fix floating-point divisionQuentin Carbonneaux
It never worked until today.
2017-05-17intern symbol namesQuentin Carbonneaux
Symbols in the source file are still limited in length because the rest of the code assumes that strings always fit in NString bytes. Regardless, there is already a benefit because comparing/copying symbol names does not require using strcmp()/strcpy() anymore.
2017-04-08misc fixes for osxQuentin Carbonneaux
With the default toolchain, it looks like we have to make sure all symbols are loaded using rip-relative addressing.
2017-04-08prepare for multi-targetQuentin Carbonneaux
This big diff does multiple changes to allow the addition of new targets to qbe. The changes are listed below in decreasing order of impact. 1. Add a new Target structure. To add support for a given target, one has to implement all the members of the Target structure. All the source files where changed to use this interface where needed. 2. Single out amd64-specific code. In this commit, the amd64 target T_amd64_sysv is the only target available, it is implemented in the amd64/ directory. All the non-static items in this directory are prefixed with either amd64_ or amd64_sysv (for items that are specific to the System V ABI). 3. Centralize Ops information. There is now a file 'ops.h' that must be used to store all the available operations together with their metadata. The various targets will only select what they need; but it is beneficial that there is only *one* place to change to add a new instruction. One good side effect of this change is that any operation 'xyz' in the IL now as a corresponding 'Oxyz' in the code. 4. Misc fixes. One notable change is that instruction selection now generates generic comparison operations and the lowering to the target's comparisons is done in the emitter. GAS directives for data are the same for many targets, so data emission was extracted in a file 'gas.c'. 5. Modularize the Makefile. The Makefile now has a list of C files that are target-independent (SRC), and one list of C files per target. Each target can also use its own 'all.h' header (for example to define registers).