vm_offset_t pmap_quick_enter_page(vm_page_t m)
void pmap_quick_remove_page(vm_offset_t kva)
These will create and destroy a temporary, CPU-local KVA mapping of a specified page.
Guarantees:
--Will not sleep and will not fail.
--Safe to call under a non-sleepable lock or from an ithread
Restrictions:
--Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms
--Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page.
--MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page
The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm.
NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing.
Reviewed by: kib
Approved by: kib (mentor)
Differential Revision: http://reviews.freebsd.org/D3013
provide a semantic defined by the C11 fences with corresponding
memory_order.
atomic_thread_fence_acq() gives r | r, w, where r and w are read and
write accesses, and | denotes the fence itself.
atomic_thread_fence_rel() is r, w | w.
atomic_thread_fence_acq_rel() is the combination of the acquire and
release in single operation. Note that reads after the acq+rel fence
could be made visible before writes preceeding the fence.
atomic_thread_fence_seq_cst() orders all accesses before/after the
fence, and the fence itself is globally ordered against other
sequentially consistent atomic operations.
Reviewed by: alc
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
discrimination between different subarch binaries, at least for mips
and arm. Arm is implemented, mips is still tbd, so not currently
exported. aarch64 does not export this because aarch64 binaries use
different tags and flags than arm.
Differential Revision: https://reviews.freebsd.org/D2611
1. Align to a 64-bit address so 64-bit data will be correctly aligned.
2. Add a comment explaining why.
3. Remove an unneeded value from the struct.
This fixes an issue where the struct may not be correctly aligned on the
stack in the syscall function. This may lead to accesing a 64-bit value
at a non 64-bit. This will raise an exception and panic the kernel.
We have been lucky where on arm and armv6 both clang and gcc correctly
align the data, even without us asking to, however, on armeb with clang to
not be the case. This tells the compiler we really do need this to be
aligned.
Reported and tested by: jmg (on armeb with clang)
MFC after: 1 Week [1, 2]
Perform cache writebacks and invalidations in the correct (inner to outer
or vice versa) order, and add comments that explain that.
Consistantly use 'va' as the variable name for virtual addresses.
Submitted by: Michal Meloun <meloun@miracle.cz>
For consistency with the naming conventions used by the other
implementations kill armv7_sleep and keep armv7_cpu_sleep.
Differential Revision: https://reviews.freebsd.org/D2537
Submitted by: John Wehle
Reviewed by: ian@, andrew@
because the i386 pmap on which the new armv6 pmap is based had it, and in
r281707 pmap_lazyfix() was removed from the i386 pmap.
Discussed with: kib
Submitted by: Michal Meloun (via Svatopluk Kraus)
Offet for the power control register was specified incorrectly (it had
the same value as the prefetch control register.) This change corrects
the offset value to 0xF80, per the ARM PL310 documentation.
Submitted by: Steve Kiernan <stevek@juniper.net>
Obtained from: Juniper Networks, Inc.
Previously we used pmap_kremove(), but with ARM_NEW_PMAP it does the remove
in a way that isn't SMP-coherent (which is appropriate in some circumstances
such as mapping/unmapping sf buffers). With matching enter/remove routines
for device mappings, each low-level implementation can do the right thing.
Reviewed by: Svatopluk Kraus <onwahe@gmail.com>
the PMC_IN_KERNEL() macro definition.
Add missing macros to extract the return address (LR) from the trapframe.
Discussed with: andrew
Obtained from: Cambridge/L41
Sponsored by: DARPA, AFRL
MFC after: 2 weeks
This is pretty much a complete rewrite based on the existing i386 code. The
patches have been circulating for a couple years and have been looked at by
plenty of people, but I'm not putting anybody on the hook as having reviewed
this in any formal sense except myself.
After this has gotten wider testing from the user community, ARM_NEW_PMAP
will become the default and various dregs of the old pmap code will be
removed.
Submitted by: Svatopluk Kraus <onwahe@gmail.com>,
Michal Meloun <meloun@miracle.cz>
Each plaform performs virtual memory split between kernel and user space
and assigns kernel certain amount of memory space. However, is is sometimes
reasonable to change the default values. Such situation may happen on
systems where the demand for kernel buffers is high, many devices occupying
memory etc. This of course comes with the cost of decreasing user space
memory range so shall be used with care. Most embedded systems will not
suffer from this limtation but rather take advantage of this potential
since default behavior is left unchanged.
Submitted by: Wojciech Macek <wma@semihalf.com>
Reviewed by: imp
Obtained from: Semihalf
used by other places that expect to unwind the stack, e.g. dtrace and
stack(9).
As I have written most of this code I'm changing the license to the
standard FreeBSD license. I have received approval from the other
developers who have changed any of the affected code.
Approved by: ian, imp, rpaulo, eadler (all license change)
Switch the cache line size during invalidations/flushes
to be read from CP15 cache type register.
Submitted by: Wojciech Macek <wma@semihalf.com>
Reviewed by: ian, imp
Obtained from: Semihalf
the data the inline functions access together at the start of the bus_space
struct. The start-of part isn't so important, it's the grouping-together
that's the point: now all the most-accessed data should be in one cache line.
Suggested by: cognet
every operation to retrieve the bs_cookie value almost nothing actually uses.
The bus_space struct contains a private data pointer (poorly named bs_cookie,
now renamed to bs_privdata) which is used only by a few old armv4 xscale
implementations. The bus_space functions were all defined to take this
value as the first parameter instead of the bus_space_tag_t, requiring all
the inline macro and function expansions to dereference the tag to pass it
to another function, which never uses it. Now all the functions take the tag
as the first parameter and retrieve the privdata if they need it.
Also fix a couple bus_space_unmap() implementations that were calling
kva_free() instead of pmap_unmapdev().
Discussed with: cognet
that some #ifdef SMP code is also conditional on __ARM_ARCH >= 7; we don't
support SMP on armv6, but some drivers and modules are compiled with it
forced on via the compiler command line.
code in sys/kern/kern_dump.c. Most dumpsys() implementations are nearly
identical and simply redefine a number of constants and helper subroutines;
a generic implementation will make it easier to implement features around
kernel core dumps. This change does not alter any minidump code and should
have no functional impact.
PR: 193873
Differential Revision: https://reviews.freebsd.org/D904
Submitted by: Conrad Meyer <conrad.meyer@isilon.com>
Reviewed by: jhibbits (earlier version)
Sponsored by: EMC / Isilon Storage Division
mostly paves the way for the new pmap code, and shouldn't result in any
noticible behavior differences.
Submitted by: Svatopluk Kraus <onwahe@gmail.com>,
Michal Meloun <meloun@miracle.cz
The ancient gas we've been using interprets .align 0 as align to the
minimum required alignment for the current section. Clang's integrated
assembler interprets it as align to a byte boundary. Fortunately both
assemblers interpret a non-zero value as align to 2^N so just make sure
we have appropriate non-zero values everywhere.
The elftoolchain project includes these additional defines for various
userland programs. Given that arch-specific defines are still interesting
in the context of userland programs reading or writing ELF metadata, they
should be included in top-level ELF headers.
Remove duplicate defines from ARM and MIPS elf headers.
Submitted by: will (initial version)
Reviewed by: imp, will
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D844
are inline functions that handle all the routine maintenance operations
except the flush-all and invalidate-all routines which are required only
during early kernel init.
These inline functions should be very much faster than the old mechanism
that involved jumping through the big cpufuncs table, especially for
common operations such as invalidating a single TLB entry. Note that
nothing is calling these yet, this just is just required infrastructure
for upcoming changes to the pmap-v6 code.
mechanism defined for armv7 (and also present on some armv6 chips including
the arm1176 used on rpi). The information is parsed into a global cpuinfo
structure, which will be used by (upcoming) new cache and tlb maintenance
code to handle cpu-specific variations of the maintence sequences.
Submitted by: Svatopluk Kraus <onwahe@gmail.com>,
Michal Meloun <meloun@miracle.cz
code, passing a 0/1 flag that indicates which type of abort it was. This
sets the stage for unifying the handling of page faults in a single routine.
Submitted by: Svatopluk Kraus <onwahe@gmail.com>,
Michal Meloun <meloun@miracle.cz
If it seems like this is getting out of hand, I quite agree. I wonder if
it's safe, here in the 21st century, to lose the distinction between C and
ASM symbols?
around so that related things are more grouped together, rewrite comments.
No functional changes, this is all so that the functional changes in the
next commit will stand out.
'extra' entry points which are nested within or provide a synonym name
for another function. It's most likely not safe to be messing with the
IP and LR registers at anything other than the primary entry point to a
function. Anywhere beyond initial function entry, those registers may
be in use as scratch or variable registers.
- Eliminate unused irqframe
- Eliminate unused saframe
- Instead of splitting r4-sp storage between the stack and switchframe,
just put all the registers in switchframe and eliminate the un_32 struct.
Submitted by: Svatopluk Kraus <onwahe@gmail.com>,
Michal Meloun <meloun@miracle.cz>
If this feels like deja vu... the last time this was fixed in this file
only ARM_MMU_V6 was fixed, this time it's ARM_ARCH_V6 (and this time I
searched for other occurrances of pj4b in here).
It is automatically set when -fPIC is passed to the compiler.
Reviewed by: dim, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1179
and casuword(9), but do not mix value read and indication of fault.
I know (or remember) enough assembly to handle x86 and powerpc. For
arm, mips and sparc64, implement fueword() and casueword() as wrappers
around fuword() and casuword(), which means that the functions cannot
distinguish between -1 and fault.
On architectures where fueword() and casueword() are native, implement
fuword() and casuword() using fueword() and casuword(), to reduce
assembly code duplication.
Sponsored by: The FreeBSD Foundation
Tested by: pho
MFC after: 2 weeks (ia64 needs treating)
registers and use it in the ARMv7 CPU functions.
The sysreg.h file has been checked by hand, however it may contain errors
with the comments on when a register was first introduced. The ARMv7 cpu
functions have been checked by compiling both the previous and this version
and comparing the md5 of the object files.
Submitted by: Svatopluk Kraus <onwahe at gmail.com>
Submitted by: Michal Meloun <meloun at miracle.cz>
Reviewed by: ian, rpaulo
Differential Revision: https://reviews.freebsd.org/D795
In the fdt data we've written for ourselves, the interrupt properties
for GIC interrupts have just been a bare interrupt number. In standard
data that conforms to the published bindings, GIC interrupt properties
contain 3-tuples that describe the interrupt as shared vs private, the
interrupt number within the shared/private address space, and configuration
info such as level vs edge triggered.
The new gic_decode_fdt() function parses both types of data, based on the
#interrupt-cells property. Previously, each platform implemented a decode
routine and put a pointer to it into fdt_pic_table. Now they can just
list this function in their table instead if they use arm/gic.c.
header (Elf_Ehdr) to determine if a particular interpretor wants to
accept it or not. Use this mechanism to filter EABI arm on OABI arm
kernels, and vice versa. This method could also be used to implement
OABI on EABI arm kernels, if desired, or to allow a single mips kernel
to run o32, n32 and n64 binaries.
Differential Revision: https://reviews.freebsd.org/D609
By Richard Earnshaw at ARM
>
>GCC has for a number of years provides a set of pre-defined macros for
>use with determining the ISA and features of the target during
>pre-processing. However, the design was always somewhat cumbersome in
>that each new architecture revision created a new define and then
>removed the previous one. This meant that it was necessary to keep
>updating the support code simply to recognise a new architecture being
>added.
>
>The ACLE specification (ARM C Language Extentions)
>(http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.set.swdev/index.html)
>provides a much more suitable interface and GCC has supported this
>since gcc-4.8.
>
>This patch makes use of the ACLE pre-defines to map to the internal
>feature definitions. To support older versions of GCC a compatibility
>header is provided that maps the traditional pre-defines onto the new
>ACLE ones.
Stop using __FreeBSD_ARCH_armv6__ and switch to __ARM_ARCH >= 6 in the
couple of places in tree. clang already implements ACLE. Add a define
that says we implement version 1.1, even though the implementation
isn't quite complete.
The MD allocators were very common, however there were some minor
differencies. These differencies were all consolidated in the MI allocator,
under ifdefs. The defines from machine/vmparam.h turn on features required
for a particular machine. For details look in the comment in sys/sf_buf.h.
As result no MD code left in sys/*/*/vm_machdep.c. Some arches still have
machine/sf_buf.h, which is usually quite small.
Tested by: glebius (i386), tuexen (arm32), kevlo (arm32)
Reviewed by: kib
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
don't need any #ifdef stuff to use atomic_load/store_64() elsewhere in
the kernel. For armv4 the atomics are trivial to implement for kernel
code (just disable interrupts), less so for user mode, so this only has
the kernel mode implementations for now.
value shared across multiple cores is with atomic_load_64() and
atomic_store_64(), because the normal 64-bit load/store instructions
are not atomic on 32-bit arm. Luckily the ldrexd/strexd instructions
that are atomic are fairly cheap on armv6. Because it's fairly simple
to do, this implements all the ops for 64-bit, not just load/store.
Reviewed by: andrew, cognet
We have functions nested within functions, and places where we start a
function then never end it, we just jump to the middle of something else.
We tried to express this with nested ENTRY()/END() macros (which result
in .fnstart and .fnend directives), but it turns out there's no way to
express that nesting in ARM EHABI unwind info, and newer tools treat
multiple .fnstart directives without an intervening .fnend as an error.
These changes introduce two new macros, EENTRY() and EEND(). EENTRY()
creates a global label you can call/jump to just like ENTRY(), but it
doesn't emit a .fnstart. EEND() is a no-op that just documents the
conceptual endpoint that matches up with the same-named EENTRY().
This is based on patches submitted by Stepan Dyatkovskiy, but I made some
changes and added the EEND() stuff, so blame any problems on me.
Submitted by: Stepan Dyatkovskiy <stpworld@narod.ru>
handling. For statically linked apps this uses the __exidx_start/end
symbols set up by the linker. For dynamically linked apps it finds the
shared object that contains the given address and returns the location and
size of the exidx section in that shared object.
The dl_unwind_find_exidx() name is used by other BSD projects and Android,
and is mentioned in clang 3.5 comments as "the BSD interface" for finding
exidx data. GCC (in libgcc_s) expects the exact same API and functionality
to be provided by a function named __gnu_Unwind_Find_exidx(), so we provide
that with an alias ("strong reference").
Reviewed by: kib@
MFC after: 1 week
memory ordering model allows writes to different devices to complete out
of order, leading to a situation where the write that clears an interrupt
source at a device can complete after a write that unmasks and EOIs the
interrupt at the interrupt controller, leading to a spurious re-interrupt.
This adds a generic barrier function specific to the needs of interrupt
controllers, and calls that function from the GIC and TI AINTC controllers.
There may still be other soc-specific controllers that need to make the call.
Reviewed by: cognet, Svatopluk Kraus <onwahe@gmail.com>
MFC after: 3 days
platform code, it is expected these will be merged in the future when the
ARM code is more complete.
Until more boards can be tested only use this with the Raspberry Pi and
rrename the functions on the other SoCs.
Reviewed by: ian@
Here, "suitably endowed" means that the System Control Coprocessor
(#15) has Performance Monitoring Registers, including a CCNT (Cycle
Count) register.
The CCNT register is used in a way similar to the TSC register in
x86 processors by the get_cyclecount(9) function. The entropy-harvesting
thread is a heavy user of this function, and will benefit from not
having to call binuptime(9) instead.
One problem with the CCNT register is that it is 32-bit only, so
the upper 32-bits of the returned number are always 0. The entropy
harvester does not care, but in case any one else does, follow-up
work may include an interrup trap to increment an upper-32-bit
counter on CCNT overflow.
Another problem is that the CCNT register is not readable in user-mode
code; in can be made readable by userland, but then it is also
writable, and so is a good chunk of the PMU system. For that reason,
the CCNT is not enabled for user-mode access in this commit.
Like the x86, there is one CCNT per core, so they don't all run in
perfect sync.
Reviewed by: ian@ (an earlier version)
Tested by: ian@ (same earlier version)
Committed from: WANDBOARD-QUAD
On modern ARM SoCs the L2 cache controller sits between the CPU and the
AXI bus, and most on-chip memory-mapped devices are on the AXI bus. We
map the device registers using the 'Device' memory attribute, which means
the memory is not cached, but writes to it are buffered. Ensuring that a
write has made it all the way to a device may require that the L2
controller take some action.
There is currently only one implementation of the new function, for the
PL310 cache controller. It invokes a function that the controller
manual calls "cache sync" but it actually has nothing to do with cache at
all, it triggers a drain of all pending store buffer writes and it blocks
until they complete.
The sheeva and xscale L2 controllers (which predate the concept of Device
memory) don't seem to have a corresponding function. It appears that the
standard armv5 drain_writebuf function includes draining all the way
through the L2 controller.
This was added ca. 2004 for the purpose of ensuring the caches were in the
right state after the debugger set a breakpoint. kdb_cpu_sync_icache()
was added in 2007 to handle that situation, and now the wbinv_all is
actually harmful because the operation isn't broadcast to other cores.
using armv7_idcache_wbinv_all, because wbinv_all doesn't broadcast the
operation to other cores. In elf_cpu_load_file() use icache_sync_all()
and explain why it's needed (and why other sync operations aren't).
As part of doing this, all callers of cpu_icache_sync_all() were
inspected to ensure they weren't relying on the old side effect of
doing a wbinv_all along with the icache work.
* Save the required VFP registers on context switch. If the exception bit
is set we need to save and restore the FPINST register, and if the fp2v
bit is also set we need to save and restore FPINST2.
* Move saving and restoring the floating point control registers to C.
* Clear the fpexc exception and fp2v flags on a floating-point exception.
* Signal a SIGFPE if the fpexc exception flag is set on an undefined
instruction. This is how the ARM core signals to software there is a
floating-point exception.
which was added by cognet in 2012, so remove the no-longer-applicable
license stuff that referred to all the old contents, and put in a
standard 2-clause BSD license (to cover the 6 lines of useful code left
in here).
swi_exit code in exception.S instead of having its own inline expansion
of the DO_AST and PULLFRAME macros. That means that now all references
to the PUSH/PULLFRAME and DO_AST macros are localized to exception.S,
so move the macros themselves into there and remove them from asmacros.h
never actually ran on these chips (other than using SA1 support in an
emulator to do the early porting to FreeBSD long long ago). The clutter
and complexity of some of this code keeps getting in the way of other
maintenance, so it's time to go.
enabled. In vfp_discard(), if the state in the VFP hardware belongs to
the thread which is dying, NULL out pcpu fpcurthread to indicate the
state currently in the hardware belongs to nobody.
Submitted by: Juergen Weiss
Pointy hat to: me