The mpidr register is 64 bit on arm64 and 32 bit on arm. Fix this by
extending the arm64 definition to include the top 32 bits.
To preserve KBI when MFCing split the value into two 32 bit values.
This will be cleaned up later only on main.
Reviewed by: bz
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36346
When the IDC flag is set in the cache type register we don't need to
clean the data cache to the point of unification. Previously we
supported this flag being set only when the DIC flags was also set.
Add a new handler for when this is not the case.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation, Ampere (hardware)
Differential Revision: https://reviews.freebsd.org/D36296
- Define PC_FREEL and _NPCM in terms of _NPCPV rather than via magic
numbers.
- Remove assertions about _NPC* values from pmap.c. This is less
relevant now that PC_FREEL and _NPCM are derived from _NPCPV.
- Add a helper inline function pc_is_full() which uses a loop to check
if pc_map is all zeroes. Use this to replace three places that
check for a full mask assuming there are only 3 entries in pc_map.
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36217
We need to be careful to not promote or demote the memory containing
the per-CPU structures as the exception handlers will dereference it
so any time it's invalid may cause recursive exceptions.
Add a new pmap function to set a flag in the pte marking memory that
cannot be promoted or demoted and use it to mark pcpu memory.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35434
Define PAGE_SIZE and PAGE_MASK based on PAGE_SHIFT. With this we only
need to set one value to change one value to change the page size.
While here remove the unused PAGE_MASK_* macros.
Sponsored by: The FreeBSD Foundation
The devmap variants used vm_offset_t for some reason, and a few places
explicitly cast bus addresses to vm_offset_t. (Probably those casts
along with similar casts for vm_size_t should just be removed and
instead permit the compiler to DTRT.)
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D35961
Add initial 16k page support on arm64. It is considered experimental,
with no guarantee of compatibility with a userspace or kernel modules
built with the current a 4k page size as code will likely try to pass
in a too small size when working with APIs that take a multiple of a
page, e.g. mmap.
As this is experimental, and because userspace and the kernel need to
have the PAGE_SIZE macro kept in sync there is no kernel option to
enable this. To test a new image should be built with the
PAGE_{SIZE,SHIFT,MASK} macros changed to the 16k versions.
There are currently known issues with loading modules from an old
loader as it can misalign them to load on a non-16k boundary.
Testing has shown good results in kernel workloads that allocate and
free large amounts of memory as only a quarter of the number of calls
into the VM subsystem are needed in the best case.
Reviewed by: markj
Tested by: gallatin
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34793
The field values are only valid when the ID_AA64PFR0_EL1.SVE or
ID_AA64PFR1_EL1.SME vields are non-zero. When this is not the case
the register is reserved as zero so is safe to read, but the SVEver
field will be incorrect so only print the decoded register when
the SVE or SME fields indicate it is valid.
Sponsored by: The FreeBSD Foundation
On arm64 all registers have a name that encodes op0, op1, CRn, CRm, and
op2 that are used to encode the register in the instruction. As some
registers we need to access may not be supportedby older compilers, or
are only supported when specific extensions are enabled support this
alternative form.
Sponsored by: The FreeBSD Foundation
To keep the vfp thread creation code in one place move into vfp.c. This
will also help with adding SVE support as it depends on VFP.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35615
Add support of ARM CMN-600 controller, PMU access functions only.
Add support of PMU counters of ARM CMN-600 controller.
Reviewed by: mhorne
Sponsored By: ARM
Differential Revision: https://reviews.freebsd.org/D32321
Summary:
It can be useful to see a summary of CPU caches on bootup. This is done
for most platforms already, so add this to arm64, in the form of (taken
from Apple M1 pro test):
L1 cache: 192KB (instruction), 128KB (data)
L2 cache: 12288KB (unified)
This is printed out per-CPU, only under bootverbose.
Future refinements could instead determine if a cache level is shared
with other cores (L2 is shared among cores on some SoCs, for instance),
and perform a better calculation to the full true cache sizes. For
instance, it's known that the M1 pro, on which this test was done, has 2
12MB L2 clusters, for a total of 24MB. Seeing each CPU with 12288KB L2
would make one think that there's 12MB * NCPUs, for possibly 120MB
cache, which is incorrect.
Sponsored by: Juniper Networks, Inc.
Reviewed by: #arm64, andrew
Differential Revision: https://reviews.freebsd.org/D35366
To enable it user-space needs to call feenableexcept().
FPE_FLTIDO has been added as the IDF bit can't be mapped to any existing
FPE code.
Reviewed by: andrew@
Differential revision: https://reviews.freebsd.org/D35247
MFC after: 2 weeks
All supported compilers (modern versions of GCC and clang) support
this.
Many places didn't have an #else so would just silently do the wrong
thing. Ancient versions of icc (the original motivation for this) are
no longer a compiler FreeBSD supports.
PR: 263102 (exp-run)
Reviewed by: brooks, imp
Differential Revision: https://reviews.freebsd.org/D34797
Following ARMARM sec D5.2.11, which says:
> Where an instruction results in an update to a System register,
> as is the case with the AT * address translation instructions,
> explicit synchronization must be performed before the result is
> guaranteed to be visible to subsequent direct reads of the
> PAR_EL1.
Reviewed By: andrew
MFC after: 3 weeks
Sponsored by: Ampere Computing
Differential Revision: https://reviews.freebsd.org/D34665
This includes adding support for NT_ARM_VFP for 32-bit binaries
running under aarch64 kernels both for ptrace(), and coredumps via the
kernel and gcore.
Reviewed by: andrew, markj
Sponsored by: University of Cambridge, Google, Inc.
Differential Revision: https://reviews.freebsd.org/D34448
It's unneeded as it was just used to align KERNBASE to a level 2
block start address. KERNBASE was already aligned correctly.
Sponsored by: The FreeBSD Foundation
To support cc -pg on arm64 we need to implement .mcount. As clang and
gcc think it is function like it just needs to load the arguments
to _mcount and call it.
On gcc the first argument is passed in x0, however this is missing on
clang so we need to load it from the stack. As it's the caller return
address this will be at a known location.
PR: 262709
Reviewed by: emaste (earlier version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34634
To allow for a future 16k or 64k page size we need to tell libkvm which
is being used. Add a flag field in unused space in minidumphdr and use
it to signal between the different options.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34548
They are in a different order to the TCR_TG1 values but appear to have
been copied incorrectly.
While here use TCR_TG0_4K in locore.S to make it explicit the userspace
page size is 4K.
Sponsored by: The FreeBSD Foundation
We assume the pointer returned from get_pcpu will be consistent even
if the thread is moved to a new CPU. Fix this by partially reverting
63c858a04d to make get_pcpu a function again.
Sponsored by: The FreeBSD Foundation
We need to not include the MI _san.h files when builing some parts of
the kernel. Fix these checks in the arm64 header files.
Sponsored by: The FreeBSD Foundation
This can be used by debuggers to find which bits in a virtual address
should be masked off to get a canonical address. This is currently used
by the Pointer Authentication Code support to get its mask. It could also
be used if we support Top Byte Ignore for the same purpose.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34302
We should clear the single step flag when entering a signal hander and
set it when returning. This fixes the ptrace__PT_STEP_with_signal test.
While here add support for userspace to set the single step bit as on
x86. This can be used by userspace for self tracing.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34170
While here clean up the names for the naming convention of the other
registers in this file.
Reviewed by: kib, mhorne (earlier version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34060
Pointer authentication allows userspace to add instructions to insert
a Pointer Authentication Code (PAC) into a register based on an address
and modifier and check if the PAC is correct. If the check fails it will
either return an invalid address or fault to the kernel.
As many of these instructions are a NOP when disabled and in earlier
revisions of the architecture this can be used, for example, to sign
the return address before pushing it to the stack making Return-oriented
programming (ROP) attack more difficult on hardware that supports them.
The kernel manages five 128 bit signing keys: 2 instruction keys, 2 data
keys, and a generic key. The instructions then use one of these when
signing the registers. Instructions that use the first four store the
PAC in the register being signed, however the instructions that use the
generic key store the PAC in a separate register.
Currently all userspace threads share all the keys within a process
with a new set of userspace keys being generated when executing a new
process. This means a forked child will share its keys with its parent
until it calls an appropriate exec system call.
In the kernel we allow the use of one of the instruction keys, the ia
key. This will be used to sign return addresses in function calls.
Unlike userspace each kernel thread has its own randomly generated.
Thread0 has a static key as does the early code on secondary CPUs.
This should be safe as there is minimal user interaction with these
threads, however we could generate random keys when the Armv8.5
Random number generation instructions are present.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31261
- Move busdma_lock_mutex to subr_bus_dma.c.
- Move _busdma_lock_dflt to subr_bus_dma.c. This function was named a
couple of different things previously. It is not a public API but
an internal helper used in place of a NULL pointer. The prototype
is in <sys/bus_dma.h> as not all backends include
<sys/bus_dma_internal.h>.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33694
When a DMA request using bounce pages completes, a swi is triggered to
schedule pending DMA requests using the just-freed bounce pages. For
a long time this bus_dma swi has been tied to a "virtual memory" swi
(swi_vm). However, all of the swi_vm implementations are the same and
consist of checking a flag (busdma_swi_pending) which is always true
and if set calling busdma_swi. I suspect this dates back to the
pre-SMPng days and that the intention was for swi_vm to serve as a
mux. However, in the current scheme there's no need for the mux.
Instead, remove swi_vm and vm_ih. Each bus_dma implementation that
uses bounce pages is responsible for creating its own swi (busdma_ih)
which it now schedules directly. This swi invokes busdma_swi directly
removing the need for busdma_swi_pending.
One consequence is that the swi now works on RISC-V which had previously
failed to invoke busdma_swi from swi_vm.
Reviewed by: imp, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33447
We only need to include sys/_atomic_subword.h on arm64 to provide
atomic_testandset_acq_long. Add an implementation in the arm64 atomic.h
based on the existing atomic_testandset macro.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33587
The header exports the following:
- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)
Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).
Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr. libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).
A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.
Reviewed by: kib (older version of x86/tls.h), jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33351