Age | Commit message | Author |
|
This was causing issues with aggregate types. A simple reproduction is:
type :type.1 = align 8 { 24 }
type :type.2 = align 8 { w 1, :type.1 1 }
The size of type.2 should be 32, adding only 4 bytes of padding between
the first and second field. Prior to this patch, 20 bytes of padding were
added instead, causing the type to have a size of 48.
Signed-off-by: Drew DeVault <sir@cmpwn.com>
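The expected size of 32 can be checked with a minimal layout sketch. This is not QBE's code: `align_up` and `layout` are hypothetical helpers that place each field at the next offset satisfying the field's own alignment, and the sizes and alignments are read off the reproduction above (a 4-byte `w`, and an opaque 24-byte `:type.1` aligned to 8).

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def layout(fields, min_align=1):
    """Place (size, alignment) fields in order; return (offsets, total_size)."""
    offset = 0
    struct_align = min_align
    offsets = []
    for size, alignment in fields:
        # Pad only as far as this field's own alignment requires.
        offset = align_up(offset, alignment)
        offsets.append(offset)
        offset += size
        struct_align = max(struct_align, alignment)
    # The total size is rounded up to the alignment of the whole aggregate.
    return offsets, align_up(offset, struct_align)


# :type.1 is an opaque 24-byte type aligned to 8 bytes.
TYPE1 = (24, 8)
# :type.2 holds one 4-byte word followed by one :type.1.
offsets, size = layout([(4, 4), TYPE1], min_align=8)
print(offsets, size)
```

The second field lands at offset 8, so 4 bytes of padding follow the word and the aggregate ends at byte 32.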
|
|
udiv %x, 1 == %x, and for each of sub, or, xor, sar, shr, and shl,
<op> %x, 0 == %x.
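These identities can be spot-checked with a small model of 32-bit arithmetic. The sketch below is an illustration only: the `OPS` table and `is_identity` are hypothetical names, and the shift semantics (count masked to 5 bits, `sar` treating the operand as signed) are assumptions of this model, not taken from QBE's folding code.

```python
MASK = 0xFFFFFFFF  # model 32-bit (w) values


def to_signed(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x


# Hypothetical 32-bit models of the operations named above.
OPS = {
    "udiv": lambda a, b: a // b,
    "sub": lambda a, b: (a - b) & MASK,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "sar": lambda a, b: (to_signed(a) >> (b & 31)) & MASK,
    "shr": lambda a, b: a >> (b & 31),
    "shl": lambda a, b: (a << (b & 31)) & MASK,
}


def is_identity(op, x):
    """Check that <op> x, neutral == x, with neutral 1 for udiv and 0 otherwise."""
    neutral = 1 if op == "udiv" else 0
    return OPS[op](x, neutral) == x


SAMPLES = [0, 1, 42, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
print(all(is_identity(op, x) for op in OPS for x in SAMPLES))
```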
|
|
In cases where stash was 0, gasemitfin exits immediately and the
GNU-stack note isn't added to the asm output. This would result in an
executable where GNU_STACK uses flags RWE instead of the desired RW.
|
|
Reported by Alessandro Mantovani.
These addresses are likely bogus, but
they triggered an unwarranted assertion
failure. We now raise a civilized error.
|
|
Previously, all casts but d->w, d->s, l->s, s->d, w->d were supported.
At least the first three can occur by storing to then loading from
a slot, currently triggering an assertion failure. Though the other
two might not be possible, they are easy enough to support as well.
Fixes hare#360.
|
|
Reported by Alessandro Mantovani.
Although unlikely in real programs, it
was found that using the address of a
fast local in amd64 shifts triggers
assertion failures.
We now err when the shift count is
given by an address; but we allow
shifting an address.
|
|
Reported by Alessandro Mantovani.
Overly long function names would
trigger out-of-bounds accesses.
|
|
Reported by Alessandro Mantovani.
Unlikely to be hit in practice
because we don't add addresses to
addresses.
type :biggie = { l, l, l }
function $repro(:biggie %p) {
@start
%x =l add %p, $a
storew 42, %x
ret
}
|
|
selcmp may potentially swap the arguments and return 1 indicating
that the opposite operation should be used. However, if the compare
result is used for a conditional jump as well as elsewhere, the
original compare op is used instead of the opposite.
To fix this, add a check to see whether the opposite compare should
be used, regardless of whether selcmp() is done now, or later on
during sel().
Bug report and test case from Charlie Stanton.
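The hazard can be illustrated with a toy model. It assumes that "opposite" means the comparison that gives the same answer once the two operands are swapped (for example `lt` becomes `gt`); the names `SWAPPED`, `select_compare`, and `evaluate` are hypothetical and do not mirror QBE's internals.

```python
import operator

COMPARE = {
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
    "eq": operator.eq, "ne": operator.ne,
}

# Comparison to use after the operands have been swapped.
SWAPPED = {"lt": "gt", "le": "ge", "gt": "lt", "ge": "le", "eq": "eq", "ne": "ne"}


def select_compare(op, a, b, swap):
    """Return (op, a, b) as emitted; when swapping, also switch the op."""
    if swap:
        return SWAPPED[op], b, a
    return op, a, b


def evaluate(op, a, b):
    return COMPARE[op](a, b)


# Correct: swapped operands with the swapped op give the original answer.
print(evaluate(*select_compare("lt", 1, 2, swap=True)))
# Buggy: swapped operands with the original op give the wrong answer.
print(evaluate("lt", 2, 1))
```

Every consumer of the compare result has to agree on which op goes with the swapped operands, which is the consistency the fix above restores.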
|
|
This reverts commit be3a67a7f5079f30b0ccc696d549fd03a2dbbad1.
qemu-system-aarch64 is a full system emulator and is not suitable
for running the qbe test suite (at least without a kernel and root
filesystem).
|
|
The no-op `copy R0` is necessary in order to trigger dopm in spill.c
and rega.c, which assume that a call is always followed by one or
more copies from registers. However, the arm64 ABI does not actually
return the caller-passed pointer as in x86_64. This causes an
assertion failure
qbe: aarch64: Assertion failed: r == T.rglob || b == fn->start (spill.c: spill: 470)
for the following test program
type :t = { l 3 }
function $f() {
@start.1
@start.2
%ret =:t call $g()
ret
}
The assertion failure only triggers when the block containing the
call is not the first block, because the check is skipped for the
first block (since some registers may have been used for arguments).
To fix this, set R0 in the call data so that spill/rega can see
that this dummy "return" register was generated by the call. This
matches qbe's existing behavior when the function returns void,
another case where no register is used for the function result.
|
|
This makes it easier to determine which flag to pass to show the
desired debug info.
|
|
This allows you to explicitly specify the section to emit the data
directive for, allowing for sections other than .data: for example, .bss
or .init_array.
|
|
GNU ld uses the presence of these notes to determine the flags of
the final GNU_STACK program header. If they are present in every
object, then the resulting executable's GNU_STACK uses flags RW
instead of RWE.
Reported by Érico Nogueira Rolim.
|
|
The immediate in the add instruction is only 12 bits. If the offset
does not fit, we must move it into a register first.
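A rough sketch of that decision follows. It is not QBE's emitter: `fits_imm12` and `lower_add` are hypothetical names, the strings are illustrative pseudo-assembly, the register names in the usage are arbitrary, and the model ignores the optional shifted form of the immediate. The `mov` line stands for "materialize the constant in a scratch register", which may take more than one real instruction.

```python
IMM12_MAX = (1 << 12) - 1  # largest value an unsigned 12-bit field can hold


def fits_imm12(value):
    """True if value can be encoded directly as a 12-bit unsigned immediate."""
    return 0 <= value <= IMM12_MAX


def lower_add(dst, base, offset, scratch):
    """Return illustrative assembly lines computing dst = base + offset."""
    if fits_imm12(offset):
        return [f"add {dst}, {base}, #{offset}"]
    # Offset too large for the immediate field: go through a register.
    return [
        f"mov {scratch}, #{offset}",
        f"add {dst}, {base}, {scratch}",
    ]


print(lower_add("x0", "x29", 4095, "x16"))
print(lower_add("x0", "x29", 4096, "x16"))
```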
|
|
Otherwise, if a constant is stored as a float and retrieved as an
int, the padding bits are uninitialized. This can result in the
generation of invalid assembly:
Error: suffix or operands invalid for `cvtsi2ss'
Reported by Hiltjo Posthuma.
|
|
Thanks to Jakob for pointing this out.
|
|
Otherwise, we may end up using an integer and floating class for the
same register, triggering an assertion failure:
qbe: rega.c:215: pmrec: Assertion `KBASE(pm[i].cls) == KBASE(*k)' failed.
Test case:
type :T = { s }
export
function $d(:T %.1, s %.2) {
@start
call $c(s %.2)
ret
}
|
|
According to the ARMv8 overview document:
> However if SP is used as the base register then the value of the stack
> pointer prior to adding any offset must be quadword (16 byte) aligned,
> or else a stack alignment exception will be generated.
This manifests as a bus error on my system.
To resolve this, just save registers two at a time with stp.
|
|
This now only limits the number of arguments when parsing the input SSA,
which is usually a small fixed size (depending on the frontend).
|
|
C99 6.5.2.5p6:
> If the compound literal occurs outside the body of a function,
> the object has static storage duration; otherwise, it has automatic
> storage duration associated with the enclosing block.
So, we can't use the address of a compound literal here. Instead,
just set p to NULL, and make the loop conditional on p being non-NULL.
Remarks from Quentin:
I made a cosmetic change to Michael's
original patch and merely pushed the
literal at toplevel.
|
|
The mandel example uses SDL2 for graphics
output. When GCC is used to assemble the
resulting *.s file, the linker reports
errors about undefined symbols from the
library.
This can be fixed by moving the flags
passed to the compiler after the source
file name.
|
|
In this case, the immediate is too large to use directly in the add/sub
instructions, so move it into a temporary register first.
Also, for clarity, rearrange the if-conditions so that they match the
constraints of the instructions that immediately follow.
|
|
The ldrb and ldrh instructions require a 32-bit register name for the
destination and will clear the upper 32 bits of that register.
|
|
The code used to see add 0, 10 as
a copy of 0.
|
|
The same functionality can be implemented
naturally in the cfg simplification pass.
|
|
Previously, each ret would lead to an
epilog. This caused bloat for large
functions with multiple return points.
|
|
.align N can either mean align to
the next multiple of N or align to
the next multiple of 1<<N.
Credit goes to Jorge Acereda Maciá
for reporting this issue.
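The difference between the two readings is easy to see numerically. The helpers below are hypothetical and only model the arithmetic; they say nothing about which reading a given assembler or target picks.

```python
def align_bytes(address, n):
    """Reading 1: round address up to the next multiple of n."""
    return -(-address // n) * n


def align_power(address, n):
    """Reading 2: round address up to the next multiple of 1 << n."""
    return align_bytes(address, 1 << n)


# The same directive argument gives very different padding.
print(align_bytes(10, 8))   # multiple of 8
print(align_power(10, 8))   # multiple of 256
```

GNU as also provides `.balign` (always a byte count) and `.p2align` (always a power of two), which avoid the ambiguity. Whether this commit switched to one of them is not stated in the message.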
|
|
SCCP is currently the one and only
pass which seriously affects control
flow; so we must compute loop costs
afterwards.
|
|
When lowering pointer arithmetic, it is
natural for a C frontend to generate
those instructions.
|
|
The heuristic was bogus for at least
two reasons (see below), and, looking
at some generated code, it looks like
some other issues are more pressing.
1. A stack slot of 4 bytes could be
used for a temporary of 8 bytes.
2. Should 2 arguments of an operation
end up spilled, the same slot
could be allocated to both!
|
|
The value argument of store instructions was
handled incorrectly.
|
|
This fixes bugs similar to the ones fixed
in the previous commit.
In the folding code the invariant is that
when a result is 32 bits wide, the low 32
bits of 'x' are correct. The high bits
can be anything.
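One way to picture this invariant is a comparison that ignores the high bits whenever the result is 32 bits wide. The sketch is a model only: `low32`, `fold_add`, and `same_result` are hypothetical names, and the 64-bit container for 'x' is an assumption of the model.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def low32(x):
    """Keep only the bits that are meaningful for a 32-bit result."""
    return x & MASK32


def fold_add(a, b):
    """Fold an addition in a 64-bit container; high bits may be garbage."""
    return (a + b) & MASK64


def same_result(x, y, wide):
    """Compare folded values, looking at the high bits only when wide."""
    return x == y if wide else low32(x) == low32(y)


garbage = 0xDEADBEEF00000000
x = fold_add(garbage | 5, 7)
print(same_result(x, 12, wide=False))
print(same_result(x, 12, wide=True))
```

Code that consumes a 32-bit folded value therefore has to mask or truncate it first, since the high bits carry no meaning.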
|
|
|