Currently we give all the v7-and-up CPUs a PMU with 4 counters. This
means that we don't provide the 6 counters that are required by the
Arm BSA (Base System Architecture) specification if the CPU supports
the Virtualization extensions.
Instead of having a single PMCR_NUM_COUNTERS, make each CPU type
specify the PMCR reset value (obtained from the appropriate TRM), and
use the 'N' field of that value to define the number of counters
provided.
This means that we now supply 6 counters instead of 4 for:
Cortex-A9, Cortex-A15, Cortex-A53, Cortex-A57, Cortex-A72,
Cortex-A76, Neoverse-N1, '-cpu max'
This CPU goes from 4 to 8 counters:
A64FX
These CPUs remain with 4 counters:
Cortex-A7, Cortex-A8
This CPU goes down from 4 to 3 counters:
Cortex-R5
Note that because we now use the PMCR reset value of the specific
implementation, we no longer set the LC bit out of reset. This has
an UNKNOWN value out of reset for all cores with any AArch32 support,
so guest software should be setting it anyway if it wants it.
This change was originally landed in commit f7fb73b8cd (during
the 6.0 release cycle) but was then reverted by commit
21c2dd77a6 before that release because it did not work with KVM.
This version fixes that by creating the scratch vCPU in
kvm_arm_get_host_cpu_features() with the KVM_ARM_VCPU_PMU_V3 feature
if KVM supports it, and then only asking KVM for the PMCR_EL0 value
if the vCPU has a PMU.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: Added the correct value for a64fx]
Message-id: 20220513122852.4063586-1-peter.maydell@linaro.org
In commit 88ce6c6ee8 we switched from directly fishing the number
of breakpoints and watchpoints out of the ID register fields to
abstracting out functions to do this job, but we forgot to delete the
now-obsolete comment in define_debug_regs() about the relation
between the ID field value and the actual number of breakpoints and
watchpoints. Delete the obsolete comment.
Reported-by: CHRIS HOWARD <cvz185@web.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220513131801.4082712-1-peter.maydell@linaro.org
Give all the debug registers their correct names including the
index, rather than having multiple registers all with the
same name string, which is confusing when viewed over the
gdbstub interface.
Signed-off-by: CHRIS HOWARD <cvz185@web.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 4127D8CA-D54A-47C7-A039-0DB7361E30C0@web.de
[PMM: expanded commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Factor out the part of combine_cacheattrs() that is specific to
handling HCR_EL2.FWB == 0. This is the part where we combine the
memory type and cacheability attributes.
The "force Outer Shareable for Device or Normal Inner-NC Outer-NC"
logic remains in combine_cacheattrs() because it holds regardless
(this is the equivalent of the pseudocode EffectiveShareability()
function).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220505183950.2781801-3-peter.maydell@linaro.org
In the original Arm v8 two-stage translation, both stage 1 and stage
2 specify memory attributes (memory type, cacheability,
shareability); these are then combined to produce the overall memory
attributes for the whole stage 1+2 access. In QEMU we implement this
by having get_phys_addr() fill in an ARMCacheAttrs struct, and we
convert both the stage 1 and stage 2 attribute bit formats to the
same encoding (an 8-bit attribute value matching the MAIR_EL1 fields,
plus a 2-bit shareability value).
The new FEAT_S2FWB feature allows the guest to enable a different
interpretation of the attribute bits in the stage 2 descriptors.
These bits can now be used to control details of how the stage 1 and
2 attributes should be combined (for instance they can say "always
use the stage 1 attributes" or "ignore the stage 1 attributes and
always be Device memory"). This means we need to pass the raw bit
information for stage 2 down to the function which combines the stage
1 and stage 2 information.
Add a field to ARMCacheAttrs that indicates whether the attrs field
should be interpreted as MAIR format, or as the raw stage 2 attribute
bits from the descriptor, and store the appropriate values when
filling in cacheattrs.
We only need to interpret the attrs field in a few places:
* in do_ats_write(), where we know to expect a MAIR value
(there is no ATS instruction to do a stage-2-only walk)
* in S1_ptw_translate(), where we want to know whether the
combined S1 + S2 attributes indicate Device memory that
should provoke a fault
* in combine_cacheattrs(), which does the S1 + S2 combining
Update those places accordingly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220505183950.2781801-2-peter.maydell@linaro.org
Add only the system registers required to implement zero error
records. This means that all values for ERRSELR are out of range,
which means that it and all of the indexed error record registers
need not be implemented.
Add the EL2 registers required for injecting virtual SError.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220506180242.216785-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Drop el3_no_el2_cp_reginfo, el3_no_el2_v8_cp_reginfo, and the local
vpidr_regs definition, and rely on the squashing to ARM_CP_CONST
while registering for v8.
This is a behavior change for v7 cpus with Security Extensions and
without Virtualization Extensions, in that the virtualization cpregs
are now correctly not present. This would be a migration compatibility
break, except that we have an existing bug in which migration of 32-bit
cpus with Security Extensions enabled does not work.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220506180242.216785-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
More gracefully handle cpregs when EL2 and/or EL3 are missing.
If the reg is entirely inaccessible, do not register it at all.
If the reg is for EL2, and EL3 is present but EL2 is not,
either discard, squash to res0, const, or keep unchanged.
Per rule RJFFP, mark the 4 aarch32 hypervisor access registers
with ARM_CP_EL3_NO_EL2_KEEP, and mark all of the EL2 address
translation and tlb invalidation "regs" ARM_CP_EL3_NO_EL2_UNDEF.
Mark the 2 virtualization processor id regs ARM_CP_EL3_NO_EL2_C_NZ.
This will simplify cpreg registration for conditional arm features.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220506180242.216785-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Simplify freeing cp_regs hash table entries by using a single
allocation for the entire value.
This fixes a theoretical bug if we were to ever free the entire
hash table, because we've been installing string literal constants
into the cpreg structure in define_arm_vh_e2h_redirects_aliases.
However, at present we only free entries created for AArch32
wildcard cpregs which get overwritten by more specific cpregs,
so this bug is never exposed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220501055028.646596-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This function is incorrect in that it does not properly consider
CPTR_EL2.FPEN. We've already got another mechanism for raising
an FPU access trap: ARM_CP_FPU, so use that instead.
Remove CP_ACCESS_TRAP_FP_EL{2,3}, which becomes unused.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace a config-time define with a compile time condition
define (compatible with clang and gcc) that must be declared prior to
its usage. This avoids having a global configure time define, but also
prevents from bad usage, if the config header wasn't included before.
This can help to make some code independent from qemu too.
gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[ For the s390x parts I'm involved in ]
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220323155743.1585078-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As per the AArch64.S2Walk() pseudo-code in the ARMv8 ARM, the final
decision as to the output address's PA space based on the SA/SW/NSA/NSW
bits needs to take the input IPA's PA space into account, and not the
PA space of the result of the stage 2 walk itself.
Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220327093427.1548629-4-idan.horowitz@gmail.com
[PMM: fixed commit message typo]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As per the AArch64.SS2InitialTTWState() psuedo-code in the ARMv8 ARM the
initial PA space used for stage 2 table walks is assigned based on the SW
and NSW bits of the VSTCR and VTCR registers.
This was already implemented for the recursive stage 2 page table walks
in S1_ptw_translate(), but was missing for the final stage 2 walk.
Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220327093427.1548629-3-idan.horowitz@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
LPAE descriptors come in three forms:
* table descriptors, giving the address of the next level page table
* page descriptors, which occur only at level 3 and describe the
mapping of one page (which might be 4K, 16K or 64K)
* block descriptors, which occur at higher page table levels, and
describe the mapping of huge pages
QEMU's page-table-walk code treats block and page entries
identically, simply ORing in a number of bits from the input virtual
address that depends on the level of the page table that we stopped
at; we depend on the previous masking of descaddr with descaddrmask
to have already cleared out the low bits of the descriptor word.
This is not quite right: the address field in a block descriptor is
smaller, and so there are bits which are valid address bits in a page
descriptor or a table descriptor but which are not supposed to be
part of the address in a block descriptor, and descaddrmask does not
clear them. We previously mostly got away with this because those
descriptor bits are RES0; however with FEAT_BBM (part of Armv8.4)
block descriptor bit 16 is defined to be the nT bit. No emulated
QEMU CPU has FEAT_BBM yet, but if the host CPU has it then we might
see it when using KVM or hvf.
Explicitly zero out all the descaddr bits we're about to OR vaddr
bits into.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/790
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220304165628.2345765-1-peter.maydell@linaro.org
This feature widens physical addresses (and intermediate physical
addresses for 2-stage translation) from 48 to 52 bits, when using
4k or 16k pages.
This introduces the DS bit to TCR_ELx, which is RES0 unless the
page size is enabled and supports LPA2, resulting in the effective
value of DS for a given table walk. The DS bit changes the format
of the page table descriptor slightly, moving the PS field out to
TCR so that all pages have the same sharability and repurposing
those bits of the page table descriptor for the highest bits of
the output address.
Do not yet enable FEAT_LPA2; we need extra plumbing to avoid
tickling an old kernel bug.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220301215958.157011-17-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge tlbi_aa64_range_get_length and tlbi_aa64_range_get_base,
returning a structure containing both results. Pass in the
ARMMMUIdx, rather than the digested two_ranges boolean.
This is in preparation for FEAT_LPA2, where the interpretation
of 'value' depends on the effective value of DS for the regime.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220301215958.157011-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>