|
On x86_64, ucomis[sd] sets ZF=1, PF=0, CF=0 for equal arguments.
However, if the arguments are unordered it sets ZF=1, PF=1, CF=1,
and there is no jump/flag instruction for ZF=1 & PF=0 or ZF=1 & CF=0.
So, in order to correctly implement ceq[sd] on x86_64, we need to
be a bit more creative. There are several options available, depending
on whether the result of ceq[sd] is used with jnz, or with other
instructions, or both.
If the result is used for a conditional jump, both gcc and clang
use a combination of jp and jnz:
ucomisd %xmm1, %xmm0
jp .Lfalse
jnz .Lfalse
...
.Lfalse:
If the result is used in other instructions or return, gcc does the
following for x == y:
ucomisd %xmm1, %xmm0
setnp %al
movzbl %al, %eax
movl $0, %edx
cmovne %edx, %eax
This sets EAX to PF=0, then uses cmovne to clear it if ZF=0. It
also takes care to avoid clobbering the flags register in case the
result is also used for a conditional jump. Implementing this
approach in QBE would require adding an architecture-specific
instruction for cmovne.
In contrast, clang does an additional compare, this time using
cmpeqsd instead of ucomisd:
cmpeqsd %xmm1, %xmm0
movq %xmm0, %rax
andl $1, %rax
The cmpeqsd instruction doas a floating point equality test, setting
XMM0 to all 1s if they are equal and all 0s if they are not. However,
we need the result in a non-XMM register, so it moves the result
back then masks off all but the first bit.
Both of these approaches are a bit awkward to implement in QBE, so
instead, this commit does the following:
ucomisd %xmm1, %xmm0
setz %al
movzbl %al, %eax
setnp %cl
movzbl %cl, %ecx
andl %ecx, %eax
This sets the result by anding the two flags, but has a side effect
of clobbering the flags register. This was a problem in one of my
earlier patches to fix this issue[0], in addition to being more
complex than I'd hoped.
Instead, this commit always leaves the ceq[sd] instruction in the
block, even if the result is only used to control a jump, so that
the above instruction sequence is always used. Then, since we now
have ZF=!(ZF=1 & PF=0) for x == y, or ZF=!(ZF=0 | PF=1) for x != y,
we can use jnz for the jump instruction.
[0] https://git.sr.ht/~sircmpwn/qbe/commit/64833841b18c074a23b4a1254625315e05b86658
|
|
When the two operands are Unordered (for instance if one of them
is NaN), ucomisd sets ZF=1, PF=1, and CF=1. When the result is
LessThan, it sets ZF=0, PF=0, and CF=1.
However, jb[e]/setb[e] only checks that CF=1 [or ZF=1] which causes
the result to be true for unordered operands.
To fix this, change the operand swap condition for these two floating
point comparison types: always rewrite x < y as y > x, and never
rewrite x > y as y < x.
Add a test to check the result of cltd, cled, cgtd, cged, ceqd, and
cned with arguments that are LessThan, Equal, GreaterThan, and
Unordered. Additionally, check three different implementations for
equality testing: one that uses the result of ceqd directly, one
that uses the result to control a conditional jump, and one that
uses the result both as a value and for a conditional jump. For
now, unordered equality tests are still broken so they are disabled.
|