Age | Commit message | Author |
|
The maximum immediate offsets for 1, 2, 4, and 8 byte loads/stores are
4095, 8190, 16380, and 32760 respectively[0][1][2].
[0] https://developer.arm.com/documentation/dui0802/a/A64-Data-Transfer-Instructions/LDRB--immediate-
[1] https://developer.arm.com/documentation/dui0802/a/A64-Data-Transfer-Instructions/LDRH--immediate-
[2] https://developer.arm.com/documentation/dui0802/a/A64-Data-Transfer-Instructions/LDR--immediate-
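The limits above follow from the unsigned-offset encoding, where a
12-bit field holds the offset divided by the access size. A minimal
sketch of such a range check, assuming that encoding; the helper name
fits_ldst_imm is hypothetical and not taken from the source:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when `off` can be encoded as the unsigned scaled 12-bit
 * immediate of an A64 load/store that accesses `size` bytes
 * (size is 1, 2, 4, or 8). Hypothetical helper for illustration. */
static bool fits_ldst_imm(uint64_t off, unsigned size)
{
	if (off % size != 0)
		return false;      /* offset must be a multiple of the size */
	return off / size <= 4095; /* 12 bits hold offset / size */
}
```

With this check, fits_ldst_imm(32760, 8) holds while
fits_ldst_imm(32768, 8) does not.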
|
|
The recent changes in arm and riscv
typclass() set ngp to 1 when a struct
is returned via a caller-provided
buffer. This interacts badly with
selret(), which ends up declaring a gp
register live when none is set in
the returning sequence.
The fix is simply to set cty to zero
(all registers dead) in case a caller-
provided buffer is used.
|
|
|
|
The x9 register is used for
the env parameter.
|
|
I also moved some isel logic
that would have been repeated
a third time in util.c.
|
|
|
|
The riscv test abi8.ssa caught a bug
in the arm backend. It turns out we
were using the wrong class when loading
pointers to aggregates from the stack.
The fix is simple and mirrors what is
done in the riscv abi.
|
|
The risc-v abi needs to know if a
type is defined as a union or not.
We cannot use nunion to obtain this
information because the risc-v abi
made the unfortunate decision of
treating
union { int i; }
differently from
int i;
So, instead, I introduce a single
bit flag 'isunion'.
|
|
|
|
|
|
|
|
|
|
amd64 lacks an instruction for this so it has to be implemented with
float -> signed casts. The approach is borrowed from llvm.
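One common way to build a float -> unsigned cast out of signed casts
is sketched below in C. It is an illustration under assumptions, not
the code of this commit: the input is taken to lie in [0, 2^64), and
cvtt() is a hypothetical stand-in for the amd64 truncating
conversion, which produces INT64_MIN when the input is out of range.

```c
#include <stdint.h>

/* Stand-in for the amd64 truncating conversion, which yields
 * INT64_MIN for out-of-range inputs. Plain C leaves that case
 * undefined, so it is modelled explicitly here. */
static int64_t cvtt(double x)
{
	if (!(x >= -9223372036854775808.0 && x < 9223372036854775808.0))
		return INT64_MIN;
	return (int64_t)x;
}

/* double -> uint64_t using only signed conversions.
 * Assumes 0 <= x < 2^64. */
static uint64_t dtoul(double x)
{
	int64_t try0 = cvtt(x);           /* correct when x < 2^63 */
	int64_t mask = try0 < 0 ? -1 : 0; /* all ones if try0 overflowed */
	int64_t try1 = cvtt(x - 9223372036854775808.0); /* x - 2^63 */

	/* When try0 overflowed it is exactly the top bit, and try1
	 * supplies the remaining low bits. */
	return (uint64_t)try0 | ((uint64_t)try1 & (uint64_t)mask);
}
```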
|
|
amd64 lacks an instruction for this so it has to be implemented with
signed -> float casts:
- Word casting is done by zero-extending the word to a long and then doing
a regular signed cast.
- Long casting is done by dividing by two with correct rounding if the
highest bit is set and casting that to float, then adding
1 to the mantissa with integer addition.
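The long case can be sketched in C as follows. This is a hedged
illustration, not the emitted instruction sequence. It reads the
final integer addition as adding 1 at bit 52 of the double's bit
pattern, which bumps the exponent field and so doubles the value;
that reading, and the name ultof, are assumptions.

```c
#include <stdint.h>
#include <string.h>

/* uint64_t -> double using only a signed conversion. */
static double ultof(uint64_t x)
{
	uint64_t isbig = x >> 63; /* 1 when the highest bit is set */
	/* Halve only when needed; or-ing the lowest bit back in keeps
	 * the rounding of the halved value correct. */
	uint64_t half = (x >> isbig) | (x & 1);
	double d = (double)(int64_t)half; /* regular signed cast */
	uint64_t bits;

	memcpy(&bits, &d, sizeof bits);
	bits += isbig << 52;      /* integer add: doubles d if isbig */
	memcpy(&d, &bits, sizeof d);
	return d;
}
```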
|
|
Necessary for floating-point negation, because
`%result = sub 0, %operand` doesn't give the correct sign for 0/-0.
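The sign problem can be reproduced in C. The two helpers below are
hypothetical names used only for this illustration; the second one
negates by flipping the sign bit, which is one possible way to get
the right result for zero.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Negation written as a subtraction from zero. */
static double neg_by_sub(double x)
{
	return 0.0 - x; /* 0.0 - 0.0 is +0.0, not -0.0 */
}

/* Negation by flipping the sign bit of the representation. */
static double neg_by_flip(double x)
{
	uint64_t bits;

	memcpy(&bits, &x, sizeof bits);
	bits ^= UINT64_C(1) << 63;
	memcpy(&x, &bits, sizeof x);
	return x;
}
```

Under the default rounding mode, neg_by_sub(0.0) keeps a clear sign
bit while neg_by_flip(0.0) sets it.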
|
|
When slots are used with a large offset,
the emitter generates invalid assembly
code. That is caught later on by the
assembler, but it prevents compilation
of programs with large stack frames.
When a slot offset is too large to be
expressed as a constant offset to x29
(the frame pointer), emitins() inserts
a late Oaddr instruction to x16 and
replaces the large slot reference with
x16.
This change also gave me the opportunity
to refactor the save/restore logic for
callee-save registers.
This fixes the following Hare issue:
https://todo.sr.ht/~sircmpwn/hare/387
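The x16 fallback described above can be pictured with a small C
sketch that prints assembly text. It is not the emitins() code: the
helper name, the fixed destination x0, and the restriction to
offsets below 65536 (so that one mov can hold the offset) are all
assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns assembly for an 8-byte load from the slot at x29 + off.
 * Sketch only: offsets of 65536 or more are not handled and
 * yield an empty string. */
static const char *slot_load_asm(uint64_t off)
{
	static char buf[96];

	buf[0] = '\0';
	if (off % 8 == 0 && off / 8 <= 4095)
		/* fits the scaled 12-bit immediate of ldr */
		snprintf(buf, sizeof buf,
			"ldr x0, [x29, #%u]\n", (unsigned)off);
	else if (off < 65536)
		/* compute the address in x16, then load through it */
		snprintf(buf, sizeof buf,
			"mov x16, #%u\nadd x16, x29, x16\nldr x0, [x16]\n",
			(unsigned)off);
	return buf;
}
```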
|
|
If the size of the struct is not a multiple of 8, the actual struct
size may be different from the size reserved on the stack.
This fixes the case where the struct is passed in memory, but we
still may over-read a struct passed in registers. A TODO is added
for now.
|
|
Michael found a bug where some copies
from registers to memory in the arm64
abi clobber the stack. The test case
is:
type :T = { w }
function w $f() {
@start
%p =:T call $g()
%x =w loadw %p
ret %x
}
qbe will write 4 bytes out of bounds
when pulling the result struct from
its register. The same bug can be
observed if :T's definition is {w 3};
in this case qbe writes 16 bytes in
a slot of 12 bytes.
This patch changes stkblob() to use
the rounded argument size if it is
going to be restored from registers.
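The size rule can be sketched as below, assuming results arrive in
8-byte registers that are stored whole. The function name blob_size
and its from_regs parameter are hypothetical and do not reflect the
actual stkblob() signature.

```c
#include <stddef.h>

/* Bytes to reserve for a struct of `size` bytes. When the struct is
 * filled from 8-byte registers, whole registers get stored, so the
 * slot must cover the size rounded up to a multiple of 8. */
static size_t blob_size(size_t size, int from_regs)
{
	if (from_regs)
		return (size + 7) & ~(size_t)7;
	return size;
}
```

For the two cases above, blob_size(4, 1) is 8 and blob_size(12, 1)
is 16, which covers the bytes that were previously written out of
bounds.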
Relatedly, mem->reg loads for structs
with size < 16 and != 8 are treated
a bit sloppily both in the arm64 and
in the sysv abis. That is much less
harmful than the present bug.
|
|
Some arm64 abi tests have been failing
for some time now. This fixes them by
being a bit more careful with liveset
management in spill.c.
A late bsclr() call in spill.c may drop
legitimately live registers in e.g.,
R12 =w add R12, 1
While it hurts for regs, it does not
matter for ssa temps because those cannot
be both in the arguments & return (by the
ssa invariant). I added a check before
bsclr() to make sure we are clearing
only ssa temps.
One might be surprised that any ssa temp
may be live at this point. The reason why
this is the case is the special handling
of dead return values earlier in spill().
I think that it is the only case where
the return value can be (awkwardly) live
at the same time as the arguments, and I
think this never happens with registers
(i.e., we never have dead register-
assigning instructions). I added an
assert to check the latter invariant.
Finally, there was a simple bug in the
arm64 abi which I fixed: In case the return
happens via a pointer, x8 needs to be marked
live at the beginning of the function. This
was caught by test/abi4.ssa.
|
|
Tested-by: Thomas Bracht Laumann Jespersen <t@laumann.xyz>
Fixes: https://todo.sr.ht/~sircmpwn/hare/312
|
|
Fixes #467. It assumes that the stack won't need to grow beyond 2^32 bytes.
If that were to happen, we'd need one or at most two more `movk` instructions.
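A constant below 2^32 can be built from 16-bit pieces with one mov
and at most one `movk`. The C sketch below prints such a sequence;
the scratch register x16 and the helper name are assumptions for
illustration, not details taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns assembly that loads the 32-bit constant `v` into x16,
 * 16 bits at a time. */
static const char *load_const32(uint32_t v)
{
	static char buf[96];
	unsigned lo = v & 0xffff, hi = v >> 16;
	size_t n;

	snprintf(buf, sizeof buf, "mov x16, #%u\n", lo);
	if (hi != 0) {
		/* the upper half needs a second instruction */
		n = strlen(buf);
		snprintf(buf + n, sizeof buf - n,
			"movk x16, #%u, lsl #16\n", hi);
	}
	return buf;
}
```

A 64-bit constant would extend the same pattern with `lsl #32` and
`lsl #48` pieces.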
Signed-off-by: Sudipto Mallick <smlckz@disroot.org>
|
|
If registers spill onto the stack, we may end up with SSA like
S320 =l copy 0
after rega(). Handle this case in arm64 emit().
|
|
|
|
|
|
Some abis, like the riscv one, treat
arguments differently depending on
whether they are variadic or not.
To prepare for the upcoming riscv
target, we change the variadic call
syntax and give meaning to the
location of the '...' marker.
# new syntax
%ret =w call $f(w %regular, ..., w %variadic)
By nature of their abis, the change
is backwards compatible for existing
targets.
|
|
|
|
The no-op `copy R0` is necessary in order to trigger dopm in spill.c
and rega.c, which assume that a call is always followed by one or
more copies from registers. However, the arm64 ABI does not actually
return the caller-passed pointer as x86_64 does. This causes an
assertion failure
qbe: aarch64: Assertion failed: r == T.rglob || b == fn->start (spill.c: spill: 470)
for the following test program
type :t = { l 3 }
function $f() {
@start.1
@start.2
%ret =:t call $g()
ret
}
The assertion failure only triggers when the block containing the
call is not the first block, because the check is skipped for the
first block (since some registers may have been used for arguments).
To fix this, set R0 in the call data so that spill/rega can see
that this dummy "return" register was generated by the call. This
matches qbe's existing behavior when the function returns void,
another case where no register is used for the function result.
|
|
The immediate in the add instruction is only 12 bits. If the offset
does not fit, we must move it into a register first.
|
|
According to the ARMv8 overview document:
However if SP is used as the base register then the value of the stack
pointer prior to adding any offset must be quadword (16 byte) aligned,
or else a stack alignment exception will be generated.
This manifests as a bus error on my system.
To resolve this, just save registers two at a time with stp.
|
|
|
|
In this case, the immediate is too large to use directly in the add/sub
instructions, so move it into a temporary register first.
Also, for clarity, rearrange the if-conditions so that they match the
constraints of the instructions that immediately follow.
|
|
|
|
The ldrb and ldrh instructions require a 32-bit register name for the
destination and will clear the upper 32 bits of that register.
|
|
In this case, the potential truncations
flagged by gcc only affect debug
information.
|
|
|
|
|
|
The compiler warned about a comparison between signed and unsigned values.
|
|
The stashing of constants in gas.c was also
changed to support 16-byte constants.
|
|
Symbols in the source file are still limited in
length because the rest of the code assumes that
strings always fit in NString bytes.
Regardless, there is already a benefit because
comparing/copying symbol names does not require
using strcmp()/strcpy() anymore.
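One way to get this benefit is to intern names, so that each
distinct string maps to a small integer id. The sketch below is a
simplified illustration with a linear search and a fixed capacity;
the names intern_sym, sym_name, and MaxSyms are hypothetical and the
real code may be organized differently.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { MaxSyms = 1024 }; /* hypothetical fixed capacity */

static char *symtab[MaxSyms];
static uint32_t nsym;

/* Returns an id for `s`; equal strings always get equal ids. */
static uint32_t intern_sym(const char *s)
{
	uint32_t i;
	size_t n;

	for (i = 0; i < nsym; i++)
		if (strcmp(symtab[i], s) == 0)
			return i;
	if (nsym == MaxSyms)
		abort();
	n = strlen(s) + 1;
	symtab[nsym] = malloc(n);
	if (symtab[nsym] == NULL)
		abort();
	memcpy(symtab[nsym], s, n);
	return nsym++;
}

/* Maps an id back to its string. */
static const char *sym_name(uint32_t id)
{
	return symtab[id];
}
```

After interning, comparing two names is an integer comparison and
copying a name is an integer assignment.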
|
|
|