Commit graph

8951 commits

Author SHA1 Message Date
Andrew Gallatin
9cb6ba29cb vm: centralize VM_BATCHQUEUE_SIZE definition
Remove the platform-specific definitions of VM_BATCHQUEUE_SIZE
for amd64 and powerpc64, and instead treat all 64-bit platforms
identically.  This has the effect of increasing the arm64
and riscv VM_BATCHQUEUE_SIZE to match that of other platforms.

Reviewed by: jhb, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37707
2023-01-21 14:30:00 -05:00
Robert Wing
27029bc08f vmm: fix use after free in ppt_detach()
The vmm module destroys the host_domain before unloading the ppt module
causing a use after free. This can happen when kldunload'ing vmm.

Reviewed by:	markj, jhb
Differential Revision:	https://reviews.freebsd.org/D38072
2023-01-20 11:25:27 +00:00
Robert Wing
c668e8173a vmm: take exclusive mem_segs_lock in vm_cleanup()
The consumers of vm_cleanup() are vm_reinit() and vm_destroy().

The vm_reinit() call path is, here vmmdev_ioctl() takes mem_segs_lock:
    vmmdev_ioctl()
    vm_reinit()
    vm_cleanup(destroy=false)

The call path for vm_destroy() is (mem_segs_lock not taken):
    sysctl_vmm_destroy()
    vmmdev_destroy()
    vm_destroy()
    vm_cleanup(destroy=true)

Fix this by taking mem_segs_lock in vm_cleanup() when destroy == true.

Reviewed by:	corvink, markj, jhb
Fixes:  67b69e76e8 ("vmm: Use an sx lock to protect the memory map.")
Differential Revision:	https://reviews.freebsd.org/D38071
2023-01-20 11:10:53 +00:00
Robert Wing
ccf32a68f8 vmm: take exclusive mem_segs_lock when (un)assigning ppt dev
PR:             268744
Reported by:    mmatalka@gmail.com
Reviewed by:	corvink, markj, jhb
Fixes:  67b69e76e8 ("vmm: Use an sx lock to protect the memory map.")
Differential Revision:	https://reviews.freebsd.org/D37962
2023-01-20 10:03:59 +00:00
Gordon Bergling
05187f2ffc amd64: Fix a common typo in source code comments
- s/comparision/comparison/

MFC after:	3 days
2023-01-19 14:27:18 +01:00
Alexander V. Chernikov
692e19cf51 netlink: add netlink to GENERIC@amd64
Netlink is a communication protocol defined in RFC 3549. It is async,
TLV-based protocol, providing 1-1 and 1-many communications between kernel
and userland. Netlink is currently used in Linux kernel to modify, read and
subscribe for nearly all networking states. Interface state, addresses, routes,
firewall, rules, fibs, etc, are controlled via Netlink.

Netlink support was added in D36002. It has got a number of improvements and
first customers since then:
* net/bird2 got netlink support, enabling route multipath in FreeBSD
* netlink-based devd notifications are being worked on ( D37574 ).
* linux(4) fully supports and depends on Netlink

Enabling Netlink in GENERIC targets two goals.
The first one is to provide stability for the third-party userland applications,
so they can rely on the fact that netlink always exists since 14.0 and potentially 13.2.
Loadable module makes life of the app delepers harder. For example, `net/bird2` can be
either build with netlink or rtsock support, but not both.

The second goal is to enable gradual conversion of the base userland tools
to use netlink(4) interfaces. Converting tools like netstat (D36529), route,
ifconfig one-by-one simplifies testing and addressing the feedback.
Othewise, switching all base to use netlink at once may be too big of a leap.

This change targets amd64, the other architectures will follow soon.

Differential Revision: https://reviews.freebsd.org/D37783
2023-01-13 10:22:40 +00:00
Konstantin Belousov
ad97b9bbfc amd64 pmap.h: make it easier to use the header for other consumers
Guard pmap_invlpg() definition with checks that only provide it when
both sys/pcpu.h and machine/cpufunc.h were already included.

Requested by:	Elliott Mitchell
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-01-06 01:30:29 +02:00
Konstantin Belousov
a2c08eba43 amd64: be more precise when enabling the AlderLake small core PCID workaround
In particular, do not enable the workaround if INVPCID is not supported
by the core.

Reported by:	"Chen, Alvin W" <Weike.Chen@Dell.com>
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37940
2023-01-06 01:30:29 +02:00
Konstantin Belousov
231d75568f Move INVLPG to pmap_quick_enter_page() from pmap_quick_remove_page().
If processor prefetches neighboring TLB entries to the one being accessed
(as some have been reported to do), then the spin lock does not prevent
the situation described in the "AMD64 Architecture Programmer's Manual
Volume 2: System Programming" rev. 3.23, "7.3.1 Special Coherency
Considerations".

Reported and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37770
2023-01-01 00:09:46 +02:00
Konstantin Belousov
cde70e312c amd64: for small cores, use (big hammer) INVPCID_CTXGLOB instead of INVLPG
A hypothetical CPU bug makes invalidation of global PTEs using INVLPG
in pcid mode unreliable, it seems.  The workaround is applied for all
CPUs with small cores, since we do not know the scope of the issue, and
the right fix.

Reviewed by:	alc (previous version)
Discussed with:	emaste, markj
Tested by:	karels
PR:	261169, 266145
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37770
2023-01-01 00:09:45 +02:00
Konstantin Belousov
45ac7755a7 amd64: identify small cores
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37770
2023-01-01 00:09:45 +02:00
Andrew Gallatin
1cac76c93f vm: reduce lock contention when processing vm batchqueues
Rather than waiting until the batchqueue is full to acquire the lock &
process the queue, we now start trying to acquire the lock using trylocks
when the batchqueue is 1/2 full. This removes almost all contention on the
vm pagequeue mutex for for our busy sendfile() based web workload.
It also greadly reduces the amount of time a network driver ithread
remains blocked on a mutex, and eliminates some packet drops under
heavy load.

So that the system does not loose the benefit of processing large
batchqueues, I've doubled the size of the batchqueues. This way, when
there is no contention, we process the same batch size as before.

This has been run for several months on a busy Netflix server, as well
as on my personal desktop.

Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37305
2022-12-14 14:34:07 -05:00
Alan Cox
f0878da03b pmap: standardize promotion conditions between amd64 and arm64
On amd64, don't abort promotion due to a missing accessed bit in a
mapping before possibly write protecting that mapping.  Previously,
in some cases, we might not repromote after madvise(MADV_FREE) because
there was no write fault to trigger the repromotion.  Conversely, on
arm64, don't pointlessly, yet harmlessly, write protect physical pages
that aren't part of the physical superpage.

Don't count aborted promotions due to explicit promotion prohibition
(arm64) or hardware errata (amd64) as ordinary promotion failures.

Reviewed by:	kib, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36916
2022-12-12 11:32:50 -06:00
John Baldwin
af3b48e101 vmm: Free vCPUs when destroying them.
Reported by:	andrew
Reviewed by:	corvink, andrew, markj
Differential Revision:	https://reviews.freebsd.org/D37649
2022-12-09 10:27:05 -08:00
John Baldwin
d212d6ebb4 vmm: Avoid infinite loop in vcpu_lock_all error case.
Reported by:	Coverity (CIDs 1501060,1501071)
Reviewed by:	corvink, markj, emaste
Differential Revision:	https://reviews.freebsd.org/D37648
2022-12-09 10:26:49 -08:00
John Baldwin
91980db1be vmm: Don't lock a vCPU for VM_PPTDEV_MSI[X].
These are manipulating state in a ppt(4) device none of which is
vCPU-specific.  Mark the vcpu fields in the relevant ioctl structures
as unused, but don't remove them for now.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37639
2022-12-09 10:26:23 -08:00
John Baldwin
62be9ffd82 vmm: VM_GET/SET_KERNEMU_DEV should run with the vCPU locked.
Reviewed by:	corvink, kib, markj
Differential Revision:	https://reviews.freebsd.org/D37638
2022-12-09 10:25:30 -08:00
John Baldwin
1f6db5d6b5 vmm: Remove stale comment for vm_rendezvous.
Support for rendezvous outside of a vcpu context (vcpuid of -1) was
removed in commit 949f0f47a4, and the vm, vcpuid argument pair was
replaced by a single struct vcpu pointer in commit d8be3d523d.

Reported by:	andrew
2022-11-30 13:06:46 -08:00
Bjoern A. Zeeb
4a8e4d1546 net80211: fix IEEE80211_DEBUG_REFCNT builds
Remove the KPI/KBI changes from ieee80211_node.h and always use the
macros to pass in __func__ and __LINE__ to the functions.
The actual implementations are prefixed by "_" rather than suffixed
by "_debug" as they no longer are "debug"-specific.

Some of the select functions were not actually using the passed in
func, line options; however they are calling other functions which
use them.  Directly call the internal implementation in those cases
passing the arguments on.

Use a file-local __debrefcnt_used define to mark the arguments __unused
in cases when we compile without IEEE80211_DEBUG_REFCNT and hope the
toolchain is intelligent enough to not pass them at all in those cases.

Also _ieee80211_free_node() now has a conflict so make the previous
_ieee80211_free_node() the new __ieee80211_free_node().

Add IEEE80211_DEBUG_REFCNT to the NOTES file on amd64 to keep exercising
the option.

Sponsored by:	The FreeBSD Foundation
X-MFC:		never
Discussed on:	freebsd-wireless
Reviewed by:	adrian
Differential Revision: https://reviews.freebsd.org/D37529
2022-11-29 21:20:37 +00:00
Corvin Köhne
7c326ab5bb
vmm: don't lock a mtx in the icr_low write handler
x2apic accesses are handled by a wrmsr exit. This handler is called in a
critical section. So, we can't lock a mtx in the icr_low handler.

Reported by:		kp, pho
Tested by:		kp, pho
Approved by:		manu (mentor)
Fixes:			c0f35dbf19 vmm: Use a cpuset_t for vCPUs waiting for STARTUP IPIs.
MFC after:		1 week
MFC with:		c0f35dbf19
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D37452
2022-11-23 09:00:04 +01:00
Corvin Köhne
fde8ce8892
vmm: remove unneccessary rendezvous assertion
When a vcpu sees that a rendezvous is in progress, it exits and tries to
handle the rendezvous. The vcpu doesn't check if it's part of the
rendezvous or not. If the vcpu isn't part of the rendezvous, the
rendezvous could be done before it reaches the assertion. This will
cause a panic.

The assertion isn't needed at all because vm_handle_rendezvous properly
handles a spurious rendezvous. So, we can just remove it.

PR:			267779
Reviewed by:		jhb, markj
Tested by:		bz
Approved by:		manu (mentor)
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D37417
2022-11-21 08:19:36 +01:00
Dmitry Chagin
2ee1a18d51 vmm: Fix build w/o KDTRACE_HOOKS.
Reviewed by:		imp
Differential revision:	https://reviews.freebsd.org/D37446
2022-11-20 18:00:55 +03:00
Cy Schubert
d487cba33d vmm: Fix non-INVARIANTS build
Reported by:	O. Hartmann <freebsd@walstatt-de.de>
Reviewed by:	jhb
Fixes:		58eefc67a1
Differential Revision:	https://reviews.freebsd.org/D37444
2022-11-18 13:20:13 -08:00
Mark Johnston
ca6b48f080 vmm: Restore the correct vm_inject_*() prototypes
Fixes:	80cb5d845b ("vmm: Pass vcpu instead of vm and vcpuid...")
Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D37443
2022-11-18 14:11:48 -05:00
John Baldwin
49fd5115a9 vmm: Trim some pointless #ifdef KTR.
Reported by:	markj
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37272
2022-11-18 10:25:39 -08:00
John Baldwin
ee98f99d7a vmm: Convert VM_MAXCPU into a loader tunable hw.vmm.maxcpu.
The default is now the number of physical CPUs in the system rather
than 16.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37175
2022-11-18 10:25:39 -08:00
John Baldwin
98568a005a vmm: Allocate vCPUs on first use of a vCPU.
Convert the vcpu[] array in struct vm to an array of pointers and
allocate vCPUs on first use.  This avoids always allocating VM_MAXCPU
vCPUs for each VM, but instead only allocates the vCPUs in use.  A new
per-VM sx lock is added to serialize attempts to allocate vCPUs on
first use.  However, a given vCPU is never freed while the VM is
active, so the pointer is read via an unlocked read first to avoid the
need for the lock in the common case once the vCPU has been created.

Some ioctls need to lock all vCPUs.  To prevent races with ioctls that
want to allocate a new vCPU, these ioctls also lock the sx lock that
protects vCPU creation.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37174
2022-11-18 10:25:38 -08:00
John Baldwin
c0f35dbf19 vmm: Use a cpuset_t for vCPUs waiting for STARTUP IPIs.
Retire the boot_state member of struct vlapic and instead use a cpuset
in the VM to track vCPUs waiting for STARTUP IPIs.  INIT IPIs add
vCPUs to this set, and STARTUP IPIs remove vCPUs from the set.
STARTUP IPIs are only reported to userland for vCPUs that were removed
from the set.

In particular, this permits a subsequent change to allocate vCPUs on
demand when the vCPU may not be allocated until after a STARTUP IPI is
reported to userland.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37173
2022-11-18 10:25:38 -08:00
John Baldwin
223de44c93 vmm devmem_mmap_single: Bump object reference under memsegs lock.
Reported by:	markj
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37273
2022-11-18 10:25:38 -08:00
John Baldwin
67b69e76e8 vmm: Use an sx lock to protect the memory map.
Previously bhyve obtained a "read lock" on the memory map for ioctls
needing to read the map by locking the last vCPU.  This is now
replaced by a new per-VM sx lock.  Modifying the map requires
exclusively locking the sx lock as well as locking all existing vCPUs.
Reading the map requires either locking one vCPU or the sx lock.

This permits safely modifying or querying the memory map while some
vCPUs do not exist which will be true in a future commit.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37172
2022-11-18 10:25:38 -08:00
John Baldwin
08ebb36076 vmm: Destroy mutexes.
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37171
2022-11-18 10:25:38 -08:00
John Baldwin
d5118d0fc4 vmm stat: Add a special nelems constant for arrays sized by vCPU count.
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37170
2022-11-18 10:25:38 -08:00
John Baldwin
58eefc67a1 vmm vmx: Allocate vpids on demand as each vCPU is initialized.
Compared to the previous version this does mean that if the system as
a whole runs out of dedicated vPIDs you might end up with some vCPUs
within a single VM using dedicated vPIDs and others using shared
vPIDs, but this should not break anything.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37169
2022-11-18 10:25:38 -08:00
John Baldwin
3f0f4b1598 vmm: Lookup vcpu pointers in vmmdev_ioctl.
Centralize mapping vCPU IDs to struct vcpu objects in vmmdev_ioctl and
pass vcpu pointers to the routines in vmm.c.  For operations that want
to perform an action on all vCPUs or on a single vCPU, pass pointers
to both the VM and the vCPU using a NULL vCPU pointer to request
global actions.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37168
2022-11-18 10:25:38 -08:00
John Baldwin
0cbc39d53d vmm ppt: Remove unused vcpu arg from MSI setup handlers.
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37167
2022-11-18 10:25:37 -08:00
John Baldwin
e42c24d56b vmm: Remove unused vcpuid argument from vioapic_process_eoi.
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37166
2022-11-18 10:25:37 -08:00
John Baldwin
d8be3d523d vmm: Use struct vcpu in the rendezvous code.
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37165
2022-11-18 10:25:37 -08:00
John Baldwin
949f0f47a4 vmm: Remove support for vm_rendezvous with a cpuid of -1.
This is not currently used.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37164
2022-11-18 10:25:37 -08:00
John Baldwin
9388bc1e3a vmm: Remove vcpuid from I/O port handlers.
No I/O ports are vCPU-specific (unlike memory which does have
vCPU-specific ranges such as the local APIC).

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37163
2022-11-18 10:25:37 -08:00
John Baldwin
80cb5d845b vmm: Pass vcpu instead of vm and vcpuid to APIs used from CPU backends.
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37162
2022-11-18 10:25:37 -08:00
John Baldwin
d3956e4673 vmm: Use struct vcpu in the instruction emulation code.
This passes struct vcpu down in place of struct vm and and integer
vcpu index through the in-kernel instruction emulation code.  To
minimize userland disruption, helper macros are used for the vCPU
arguments passed into and through the shared instruction emulation
code.

A few other APIs used by the instruction emulation code have also been
updated to accept struct vcpu in the kernel including
vm_get/set_register and vm_inject_fault.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37161
2022-11-18 10:25:37 -08:00
John Baldwin
28b561ad9d vmm: Add vm_gpa_hold_global wrapper function.
This handles the case that guest pages are being held not on behalf of
a virtual CPU but globally.  Previously this was handled by passing a
vcpuid of -1 to vm_gpa_hold, but that will not work in the future when
vm_gpa_hold is changed to accept a struct vcpu pointer.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37160
2022-11-18 10:25:36 -08:00
John Baldwin
0f435e6476 vmm: Add _KERNEL guards for io headers shared with userspace.
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37159
2022-11-18 10:25:36 -08:00
John Baldwin
2b4fe856f4 bhyve: Remove unused vm and vcpu arguments from vm_copy routines.
The arguments identifying the VM and vCPU are only needed for
vm_copy_setup.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37158
2022-11-18 10:25:36 -08:00
John Baldwin
3dc3d32ad6 vmm: Use struct vcpu with the vmm_stat API.
The function callbacks still use struct vm and and vCPU index.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37157
2022-11-18 10:25:36 -08:00
John Baldwin
950af9ffc6 vmm: Expose struct vcpu as an opaque type.
Pass a pointer to the current struct vcpu to the vcpu_init callback
and save this pointer in the CPU-specific vcpu structures.

Add routines to fetch a struct vcpu by index from a VM and to query
the VM and vcpuid from a struct vcpu.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37156
2022-11-18 10:25:36 -08:00
John Baldwin
d030f941e6 vmm: Use VLAPIC_CTR* in more places.
Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37155
2022-11-18 10:25:36 -08:00
John Baldwin
57e0119ef3 vmm vmx: Add VMX_CTR* wrapper macros.
These macros are similar to VCPU_CTR* but accept a single vmx_vcpu
pointer as the first argument instead of separate vm and vcpuid.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37154
2022-11-18 10:25:36 -08:00
John Baldwin
fca494dad0 vmm svm: Add SVM_CTR* wrapper macros.
These macros are similar to VCPU_CTR* but accept a single svm_vcpu
pointer as the first argument instead of separate vm and vcpuid.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37153
2022-11-18 10:25:36 -08:00
John Baldwin
869c8d1946 vmm: Remove the per-vm cookie argument from vmmops taking a vcpu.
This requires storing a reference to the per-vm cookie in the
CPU-specific vCPU structure.  Take advantage of this new field to
remove no-longer-needed function arguments in the CPU-specific
backends.  In particular, stop passing the per-vm cookie to functions
that either don't use it or only use it for KTR traces.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37152
2022-11-18 10:25:35 -08:00