Age | Commit message (Collapse) | Author |
|
It is handy to express when
the end of a block cannot be
reached. If a hlt terminator
is executed, it traps the
program.
We don't go the llvm way and
specify execution semantics as
undefined behavior.
|
|
Symbols are a useful abstraction
that occurs in both Con and Alias.
In this patch they get their own
struct. This new struct packages
a symbol name and a type; the type
tells us where the symbol name
must be interpreted (currently, in
gobal memory or in thread-local
storage).
The refactor fixed a bug in
addcon(), proving the value of
packaging symbol names with their
type.
|
|
It is quite similar to arm64_apple.
Probably, the call that needs to be
generated also provides extra
invariants on top of the regular
abi, but I have not checked that.
Clang generates code that is a bit
neater than qbe's because, on x86,
a load can be fused in a call
instruction! We do not bother with
supporting these since we expect
only sporadic use of the feature.
For reference, here is what clang
might output for a store to the
second entry of a thread-local
array of ints:
movq _x@TLVP(%rip), %rdi
callq *(%rdi)
movl %ecx, 4(%rax)
|
|
The apple targets are not done yet.
|
|
apple support is more than assembly syntax
in case of arm64 machines, and apple syntax
is currently useless in all cases but amd64;
rather than having a -G option that only
makes sense with amd64, we add a new target
amd64_apple
|
|
This may cause invalid assembly to be generated
and is not all that useful anyway after constant
folding has run.
|
|
I also moved some isel logic
that would have been repeated
a third time in util.c.
|
|
|
|
amd64 lacks instruction for this so it has to be implemented with
float -> signed casts. The approach is borrowed from llvm.
|
|
amd64 lacks an instruction for this so it has to be implemented with
signed -> float casts:
- Word casting is done by zero-extending the word to a long and then doing
a regular signed cast.
- Long casting is done by dividing by two with correct rounding if the
highest bit is set and casting that to float, then adding
1 to mantissa with integer addition
|
|
Necessary for floating-point negation, because
`%result = sub 0, %operand` doesn't give the correct sign for 0/-0.
|
|
parseref() has code to reuse address constants, but this is not
done in other passes such as fold or isel. Introduce a new function
newcon() which takes a Con and returns a Ref for that constant, and
use this whenever creating address constants.
This is necessary to fix folding of address constants when one
operand is already folded. For example, in
%a =l add $x, 1
%b =l add %a, 2
%c =w loadw %b
%a and %b were folded to $x+1 and $x+3 respectively, but then the
second add is visited again since it uses %a. This gets folded to
$x+3 as well, but as a new distinct constant. This results in %b
getting labeled as bottom instead of either constant, disabling the
replacement of %b by a constant in subsequent instructions (such
as the loadw).
|
|
On x86_64, ucomis[sd] sets ZF=1, PF=0, CF=0 for equal arguments.
However, if the arguments are unordered it sets ZF=1, PF=1, CF=1,
and there is no jump/flag instruction for ZF=1 & PF=0 or ZF=1 & CF=0.
So, in order to correctly implement ceq[sd] on x86_64, we need to
be a bit more creative. There are several options available, depending
on whether the result of ceq[sd] is used with jnz, or with other
instructions, or both.
If the result is used for a conditional jump, both gcc and clang
use a combination of jp and jnz:
ucomisd %xmm1, %xmm0
jp .Lfalse
jnz .Lfalse
...
.Lfalse:
If the result is used in other instructions or return, gcc does the
following for x == y:
ucomisd %xmm1, %xmm0
setnp %al
movzbl %al, %eax
movl $0, %edx
cmovne %edx, %eax
This sets EAX to PF=0, then uses cmovne to clear it if ZF=0. It
also takes care to avoid clobbering the flags register in case the
result is also used for a conditional jump. Implementing this
approach in QBE would require adding an architecture-specific
instruction for cmovne.
In contrast, clang does an additional compare, this time using
cmpeqsd instead of ucomisd:
cmpeqsd %xmm1, %xmm0
movq %xmm0, %rax
andl $1, %rax
The cmpeqsd instruction doas a floating point equality test, setting
XMM0 to all 1s if they are equal and all 0s if they are not. However,
we need the result in a non-XMM register, so it moves the result
back then masks off all but the first bit.
Both of these approaches are a bit awkward to implement in QBE, so
instead, this commit does the following:
ucomisd %xmm1, %xmm0
setz %al
movzbl %al, %eax
setnp %cl
movzbl %cl, %ecx
andl %ecx, %eax
This sets the result by anding the two flags, but has a side effect
of clobbering the flags register. This was a problem in one of my
earlier patches to fix this issue[0], in addition to being more
complex than I'd hoped.
Instead, this commit always leaves the ceq[sd] instruction in the
block, even if the result is only used to control a jump, so that
the above instruction sequence is always used. Then, since we now
have ZF=!(ZF=1 & PF=0) for x == y, or ZF=!(ZF=0 | PF=1) for x != y,
we can use jnz for the jump instruction.
[0] https://git.sr.ht/~sircmpwn/qbe/commit/64833841b18c074a23b4a1254625315e05b86658
|
|
When the two operands are Unordered (for instance if one of them
is NaN), ucomisd sets ZF=1, PF=1, and CF=1. When the result is
LessThan, it sets ZF=0, PF=0, and CF=1.
However, jb[e]/setb[e] only checks that CF=1 [or ZF=1] which causes
the result to be true for unordered operands.
To fix this, change the operand swap condition for these two floating
point comparison types: always rewrite x < y as y > x, and never
rewrite x > y as y < x.
Add a test to check the result of cltd, cled, cgtd, cged, ceqd, and
cned with arguments that are LessThan, Equal, GreaterThan, and
Unordered. Additionally, check three different implementations for
equality testing: one that uses the result of ceqd directly, one
that uses the result to control a conditional jump, and one that
uses the result both as a value and for a conditional jump. For
now, unordered equality tests are still broken so they are disabled.
|
|
Reported by Alessandro Mantovani.
These addresses are likely bogus, but
they triggered an unwarranted assertion
failure. We now raise a civilized error.
|
|
Reported by Alessandro Mantovani.
Although unlikely in real programs it
was found that using the address of a
fast local in amd64 shifts triggers
assertion failures.
We now err when the shift count is
given by an address; but we allow
shifting an address.
|
|
Reported by Alessandro Mantovani.
Unlikely to be hit in practice
because we don't add addresses to
addresses.
type :biggie = { l, l, l }
function $repro(:biggie %p) {
@start
%x =l add %p, $a
storew 42, %x
ret
}
|
|
selcmp may potentially swap the arguments and return 1 indicating
that the opposite operation should be used. However, if the compare
result is used for a conditional jump as well as elsewhere, the
original compare op is used instead of the opposite.
To fix this, add a check to see whether the opposite compare should
be used, regardless of whether selcmp() is done now, or later on
during sel().
Bug report and test case from Charlie Stanton.
|
|
The value argument of store instructions was
handled incorrectly.
|
|
|
|
I had forgotten that %rip can only be
used as base when there is no index.
I also added a test which stresses
addressing selection with and without
constants.
|
|
We now emit correct code when the user
refers to a specific constant address.
I also made some comments clearer in
the instruction selection pass and got
rid of some apparently useless code.
|
|
Before, amatch() would prefer matching
"o + b" to "o + s*i" and "b + s*i".
|
|
The numberer made some arranging choices when
numbering arguments of an instruction, but these
decisions were ignored when matching. The fix
is to reconcile numbering and matching.
|
|
|
|
Compiler warned about comparison between signed and unsigned values.
|
|
The arm64 might have the same problem but it
is currently unable to handle them even in
instruction selection.
Thanks to Jean Dao for reporting the bug.
|
|
The stashing of constants in gas.c was also
changed to support 16-bytes constants.
|
|
It never worked until today.
|
|
Symbols in the source file are still limited in
length because the rest of the code assumes that
strings always fit in NString bytes.
Regardless, there is already a benefit because
comparing/copying symbol names does not require
using strcmp()/strcpy() anymore.
|
|
With the default toolchain, it looks like we have to
make sure all symbols are loaded using rip-relative
addressing.
|
|
This big diff does multiple changes to allow
the addition of new targets to qbe. The
changes are listed below in decreasing order
of impact.
1. Add a new Target structure.
To add support for a given target, one has to
implement all the members of the Target
structure. All the source files where changed
to use this interface where needed.
2. Single out amd64-specific code.
In this commit, the amd64 target T_amd64_sysv
is the only target available, it is implemented
in the amd64/ directory. All the non-static
items in this directory are prefixed with either
amd64_ or amd64_sysv (for items that are
specific to the System V ABI).
3. Centralize Ops information.
There is now a file 'ops.h' that must be used to
store all the available operations together with
their metadata. The various targets will only
select what they need; but it is beneficial that
there is only *one* place to change to add a new
instruction.
One good side effect of this change is that any
operation 'xyz' in the IL now as a corresponding
'Oxyz' in the code.
4. Misc fixes.
One notable change is that instruction selection
now generates generic comparison operations and
the lowering to the target's comparisons is done
in the emitter.
GAS directives for data are the same for many
targets, so data emission was extracted in a
file 'gas.c'.
5. Modularize the Makefile.
The Makefile now has a list of C files that
are target-independent (SRC), and one list
of C files per target. Each target can also
use its own 'all.h' header (for example to
define registers).
|