This function is incorrect in that it does not properly consider
CPTR_EL2.FPEN. We've already got another mechanism for raising
an FPU access trap: ARM_CP_FPU, so use that instead.
Remove CP_ACCESS_TRAP_FP_EL{2,3}, which becomes unused.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace a config-time define with a compile time condition
define (compatible with clang and gcc) that must be declared prior to
its usage. This avoids having a global configure time define, but also
prevents from bad usage, if the config header wasn't included before.
This can help to make some code independent from qemu too.
gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[ For the s390x parts I'm involved in ]
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As per the AArch64.S2Walk() pseudo-code in the ARMv8 ARM, the final
decision as to the output address's PA space based on the SA/SW/NSA/NSW
bits needs to take the input IPA's PA space into account, and not the
PA space of the result of the stage 2 walk itself.
Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220327093427.1548629-4-idan.horowitz@gmail.com
[PMM: fixed commit message typo]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As per the AArch64.SS2InitialTTWState() psuedo-code in the ARMv8 ARM the
initial PA space used for stage 2 table walks is assigned based on the SW
and NSW bits of the VSTCR and VTCR registers.
This was already implemented for the recursive stage 2 page table walks
in S1_ptw_translate(), but was missing for the final stage 2 walk.
Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220327093427.1548629-3-idan.horowitz@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
LPAE descriptors come in three forms:
* table descriptors, giving the address of the next level page table
* page descriptors, which occur only at level 3 and describe the
mapping of one page (which might be 4K, 16K or 64K)
* block descriptors, which occur at higher page table levels, and
describe the mapping of huge pages
QEMU's page-table-walk code treats block and page entries
identically, simply ORing in a number of bits from the input virtual
address that depends on the level of the page table that we stopped
at; we depend on the previous masking of descaddr with descaddrmask
to have already cleared out the low bits of the descriptor word.
This is not quite right: the address field in a block descriptor is
smaller, and so there are bits which are valid address bits in a page
descriptor or a table descriptor but which are not supposed to be
part of the address in a block descriptor, and descaddrmask does not
clear them. We previously mostly got away with this because those
descriptor bits are RES0; however with FEAT_BBM (part of Armv8.4)
block descriptor bit 16 is defined to be the nT bit. No emulated
QEMU CPU has FEAT_BBM yet, but if the host CPU has it then we might
see it when using KVM or hvf.
Explicitly zero out all the descaddr bits we're about to OR vaddr
bits into.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/790
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220304165628.2345765-1-peter.maydell@linaro.org
This feature widens physical addresses (and intermediate physical
addresses for 2-stage translation) from 48 to 52 bits, when using
4k or 16k pages.
This introduces the DS bit to TCR_ELx, which is RES0 unless the
page size is enabled and supports LPA2, resulting in the effective
value of DS for a given table walk. The DS bit changes the format
of the page table descriptor slightly, moving the PS field out to
TCR so that all pages have the same sharability and repurposing
those bits of the page table descriptor for the highest bits of
the output address.
Do not yet enable FEAT_LPA2; we need extra plumbing to avoid
tickling an old kernel bug.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220301215958.157011-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge tlbi_aa64_range_get_length and tlbi_aa64_range_get_base,
returning a structure containing both results. Pass in the
ARMMMUIdx, rather than the digested two_ranges boolean.
This is in preparation for FEAT_LPA2, where the interpretation
of 'value' depends on the effective value of DS for the regime.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220301215958.157011-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This feature widens physical addresses (and intermediate physical
addresses for 2-stage translation) from 48 to 52 bits, when using
64k pages. The only thing left at this point is to handle the
extra bits in the TTBR and in the table descriptors.
Note that PAR_EL1 and HPFAR_EL2 are nominally extended, but we don't
mask out the high bits when writing to those registers, so no changes
are required there.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220301215958.157011-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This feature is relatively small, as it applies only to
64k pages and thus requires no additional changes to the
table descriptor walking algorithm, only a change to the
minimum TSZ (which is the inverse of the maximum virtual
address space size).
Note that this feature widens VBAR_ELx, but we already
treat the register as being 64 bits wide.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220301215958.157011-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The original A.a revision of the AArch64 ARM required that we
force-extend the addresses in these registers from 49 bits.
This language has been loosened via a combination of IMPLEMENTATION
DEFINED and CONSTRAINTED UNPREDICTABLE to allow consideration of
the entire aligned address.
This means that we do not have to consider whether or not FEAT_LVA
is enabled, and decide from which bit an address might need to be
extended.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220301215958.157011-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This field controls the output (intermediate) physical address size
of the translation process. V8 requires to raise an AddressSize
fault if the page tables are programmed incorrectly, such that any
intermediate descriptor address, or the final translated address,
is out of range.
Add a PS field to ARMVAParameters, and properly compute outputsize
in get_phys_addr_lpae. Test the descaddr as extracted from TTBR
and from page table entries.
Restrict descaddrmask so that we won't raise the fault for v7.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220301215958.157011-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Many files use "qemu/log.h" declarations but neglect to include
it (they inherit it via "exec/exec-all.h"). "exec/exec-all.h" is
a core component and shouldn't be used that way. Move the
"qemu/log.h" inclusion locally to each unit requiring it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220207082756.82600-10-f4bug@amsat.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The exception caused by an SVC instruction may be taken to AArch32
Hyp mode for two reasons:
* HCR.TGE indicates that exceptions from EL0 should trap to EL2
* we were already in Hyp mode
The entrypoint in the vector table to be used differs in these two
cases: for an exception routed to Hyp mode from EL0, we enter at the
common 0x14 "hyp trap" entrypoint. For SVC from Hyp mode to Hyp
mode, we enter at the 0x08 (svc/hvc trap) entrypoint.
In the v8A Arm ARM pseudocode this is done in AArch32.TakeSVCException.
QEMU incorrectly routed both of these exceptions to the 0x14
entrypoint. Correct the entrypoint for SVC from Hyp to Hyp by making
use of the existing logic which handles "normal entrypoint for
Hyp-to-Hyp, otherwise 0x14" for traps like UNDEF and data/prefetch
aborts (reproduced here since it's outside the visible context
in the diff for this commit):
if (arm_current_el(env) != 2 && addr < 0x14) {
addr = 0x14;
}
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220117131953.3936137-1-peter.maydell@linaro.org
The calculation of the length of TLB range invalidate operations
in tlbi_aa64_range_get_length() is incorrect in two ways:
* the NUM field is 5 bits, but we read only 4 bits
* we miscalculate the page_shift value, because of an
off-by-one error:
TG 0b00 is invalid
TG 0b01 is 4K granule size == 4096 == 2^12
TG 0b10 is 16K granule size == 16384 == 2^14
TG 0b11 is 64K granule size == 65536 == 2^16
so page_shift should be (TG - 1) * 2 + 12
Thanks to the bug report submitter Cha HyunSoo for identifying
both these errors.
Fixes: 84940ed825 ("target/arm: Add support for FEAT_TLBIRANGE")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/734
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20211130173257.1274194-1-peter.maydell@linaro.org
Currently helper.c includes some code which is part of the arm
target's gdbstub support. This code has a better home: in gdbstub.c
and gdbstub64.c. Move it there.
Because aarch64_fpu_gdb_get_reg() and aarch64_fpu_gdb_set_reg() move
into gdbstub64.c, this means that they're now compiled only for
TARGET_AARCH64 rather than always. That is the only case when they
would ever be used, but it does mean that the ifdef in
arm_cpu_register_gdb_regs_for_features() needs to be adjusted to
match.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210921162901.17508-4-peter.maydell@linaro.org
Our current codegen for MVE always calls out to helper functions,
because some byte lanes might be predicated. The common case is that
in fact there is no predication active and all lanes should be
updated together, so we can produce better code by detecting that and
using the TCG generic vector infrastructure.
Add a TB flag that is set when we can guarantee that there is no
active MVE predication, and a bool in the DisasContext. Subsequent
patches will use this flag to generate improved code for some
instructions.
In most cases when the predication state changes we simply end the TB
after that instruction. For the code called from vfp_access_check()
that handles lazy state preservation and creating a new FP context,
we can usually avoid having to try to end the TB because luckily the
new value of the flag following the register changes in those
sequences doesn't depend on any runtime decisions. We do have to end
the TB if the guest has enabled lazy FP state preservation but not
automatic state preservation, but this is an odd corner case that is
not going to be common in real-world code.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210913095440.13462-4-peter.maydell@linaro.org
In v8A, the PSTATE.IL bit is set for various kinds of illegal
exception return or mode-change attempts. We already set PSTATE.IL
(or its AArch32 equivalent CPSR.IL) in all those cases, but we
weren't implementing the part of the behaviour where attempting to
execute an instruction with PSTATE.IL takes an immediate exception
with an appropriate syndrome value.
Add a new TB flags bit tracking PSTATE.IL/CPSR.IL, and generate code
to take an exception instead of whatever the instruction would have
been.
PSTATE.IL and CPSR.IL change only on exception entry, attempted
exception exit, and various AArch32 mode changes via cpsr_write().
These places generally already rebuild the hflags, so the only place
we need an extra rebuild_hflags call is in the illegal-return
codepath of the AArch64 exception_return helper.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210821195958.41312-2-richard.henderson@linaro.org
Message-Id: <20210817162118.24319-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[rth: Added missing returns; set IL bit in syndrome]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently we rely on all the callsites of cpsr_write() to rebuild the
cached hflags if they change one of the CPSR bits which we use as a
TB flag and cache in hflags. This is a bit awkward when we want to
change the set of CPSR bits that we cache, because it means we need
to re-audit all the cpsr_write() callsites to see which flags they
are writing and whether they now need to rebuild the hflags.
Switch instead to making cpsr_write() call arm_rebuild_hflags()
itself if one of the bits being changed is a cached bit.
We don't do the rebuild for the CPSRWriteRaw write type, because that
kind of write is generally doing something special anyway. For the
CPSRWriteRaw callsites in the KVM code and inbound migration we
definitely don't want to recalculate the hflags; the callsites in
boot.c and arm-powerctl.c have to do a rebuild-hflags call themselves
anyway because of other CPU state changes they make.
This allows us to drop explicit arm_rebuild_hflags() calls in a
couple of places where the only reason we needed to call it was the
CPSR write.
This fixes a bug where we were incorrectly failing to rebuild hflags
in the code path for a gdbstub write to CPSR, which meant that you
could make QEMU assert by breaking into a running guest, altering the
CPSR to change the value of, for example, CPSR.E, and then
continuing.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210817201843.3829-1-peter.maydell@linaro.org
In v7A, the HSTR register has a TJDBX bit which traps NS EL0/EL1
access to the JOSCR and JMCR trivial Jazelle registers, and also BXJ.
Implement these traps. In v8A this HSTR bit doesn't exist, so don't
trap for v8A CPUs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210816180305.20137-3-peter.maydell@linaro.org
Unlike A-profile, for M-profile the UDIV and SDIV insns can be
configured to raise an exception on division by zero, using the CCR
DIV_0_TRP bit.
Implement support for setting this bit by making the helper functions
raise the appropriate exception.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210730151636.17254-3-peter.maydell@linaro.org
Currently, our only caller is sve_zcr_len_for_el, which has
already masked the length extracted from ZCR_ELx, so the
masking done here is a nop. But we will shortly have uses
from other locations, where the length will be unmasked.
Saturate the length to ARM_MAX_VQ instead of truncating to
the low 4 bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20210723203344.968563-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>