vcpuHint has been expanded to 16 bit on host side to enable
interruptions to be routed to more CPUs. Guest side should align with
the change.
This change has been tested with hosts with 8-bit and 16-bit vcpuHint,
on both platforms host side can get correct value.
This driver is for ESXi product which only supports x86/x64. They are
little-endian. So there is no need to consider big-endian system.
PR: 264840
Reviewed by: imp@, Zhenlei Huang
The ipmi watchdog pretimeout action can trigger unintentionally in
certain rare, complicated situations. What we have seen at Netflix
is that the BMC can sometimes be sent a continuous stream of
writes to port 0x80, and due to what is a bug or misconfiguration
in the BMC software, this results in the BMC running out of memory,
becoming very slow to respond to KCS requests, and eventually being
rebooted by its own internal watchdog. While that is going on in
the BMC, back in the host OS, a number of requests are pending in
the ipmi request queue, and the kcs_loop thread is working on
processing these requests. All of the KCS accesses to process
those requests are timing out and eventually failing because the
BMC is responding very slowly or not at all, and the kcs_loop thread
is holding the IPMI_IO_LOCK the whole time that is going on.
Meanwhile the watchdogd process in the host is trying to pat the
BMC watchdog, and this process is sleeping waiting to get the
IPMI_IO_LOCK. It's not entirely clear why the watchdogd process
is sleeping for this lock, because the intention is that a thread
holding the IPMI_IO_LOCK should not sleep and thus any thread
that wants the lock should just spin to wait for it. My best guess
is that the kcs_loop thread is spinning waiting for the BMC to
respond for so long that it is eventually preempted, and during
the brief interval when the kcs_loop thread is not running,
the watchdogd thread notices that the lock holder is not running
and sleeps. When the kcs_loop thread eventually finishes processing
one request, it drops the IPMI_IO_LOCK and then immediately takes the
lock again so it can process the next request in the queue.
Because the watchdogd thread is sleeping at this point, the kcs_loop
always wins the race to acquire the IPMI_IO_LOCK, thus starving
the watchdogd thread. The callout for the watchdog pretimeout
would be reset by the watchdogd thread after its request to the BMC
watchdog completes, but since that request never processed, the
pretimeout callout eventually fires, even though there is nothing
actually wrong with the host.
To prevent this saga from unfolding:
- when kcs_driver_request() is called in a context where it can sleep,
queue the request and let the worker thread process it rather than
trying to process in the original thread.
- add a new high-priority queue for driver requests, so that the
watchdog patting requests will be processed as quickly as possible
even if lots of application requests have already been queued.
With these two changes, the watchdog pretimeout action does not trigger
even if the BMC is completely out to lunch for long periods of time
(as long as the watchdogd check command does not also get stuck).
Sponsored by: Netflix
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D36555
In non-Hyper-V systems during Hyper-V initialization, system
initialization was getting hung, as hyperv_identify(),
was returning successful irrespective of the type of the platform.
Reviewed by: andrew, whu
Fixes: 9729f076e4
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D37219
Assertions suggest that the loop in iommu_gas_fini_domain is executed
zero times, so remove it.
Reviewed by: alc, kib
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D37204
Maintain a pointer to an element in the domain map that is left of any
sufficiently large free gap in the tree and start the search for free
space there, rather than at the root of the tree. On find_space, move
that pointer to the leftmost leaf in the subtree of nodes with
free_down greater than or equal to the minimum allocation size before
starting the search for space from that pointer. On removal of a node
with address less than that pointer, update that pointer to point to
the predecessor or successor of the removed node.
In experiments with netperf streaming, this reduces by about 40% the
number of map entries examined in first-fit allocation.
Reviewed by: alc, kib
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D36624
This commit brings back the driver from FreeBSD commit
f187d6dfbf plus subsequent fixes from
upstream.
Relative to upstream this commit includes a few other small fixes such
as additional INET and INET6 #ifdef's, #include cleanups, and updates
for recent API changes in main.
Reviewed by: pauamma, gbe, kevans, emaste
Obtained from: git@git.zx2c4.com:wireguard-freebsd @ 3cc22b2
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36909
When flushing the UART, we need to drain manually if LSR_TEMT is
*not* asserted, aka. if the transmit FIFO is not empty.
Reported by: void <void@f-m.fm>
Fixes: c4b68e7e53 "ns8250: Check if flush via FCR succeeded"
Differential Revision: https://reviews.freebsd.org/D37185
- We depend on header polution to include sys/malloc.h. Include it
directly.
- Only define FDT-specific fuctions when building a FDT kernel.
Sponsored by: Innovate UK
This is the last part for ARM64 Hyper-V enablement. This includes
commone files and make file changes to enable the ARM64 FreeBSD
guest on Hyper-V. With this patch, it should be able to build
the ARM64 image and install it on Hyper-V.
Reviewed by: emaste, andrew, whu
Tested by: Souradeep Chakrabarti <schakrabarti@microsoft.com>
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D36744
psci_attach is way too late to provide the intended semantics for
psci_present. psci calls can be made immediately after psci_init(),
called way earlier at SI_SUB_CPU + SI_ORDER_FIRST, and we need it to
be valid as early as we can possibly call a psci function.
This fixes booting RPi3+4 with the in-review spintable patch;
rpi3-psci-monitor patches the FDT to add a PSCI node, but it doesn't
patch each cpus' enable-method. Because of this, we would stall the
boot while enabling CPU 1 as we saw a valid looking enable-method and
"no" functional PSCI and attempted to use the spintable rather than
simply not starting secondary APs.
Fixes: 2218070b2c ("psci: finish psci_present implementation")
Reported by: karels
Refactor the code to put split the MSR values for x86 and arm64
Hyper-V. Code not yet built. This is one of several patches for
the arm64 Hyper-V enablement.
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D37103
2782ed8f6c fixed the standalone module
build. REmove the now duplicate includes for opt_acpi.h and
opt_platform.h. Als remove the if_mdio.h again in both the Makefile
and the implementation file as it is not (currently) used.
X-MFC with: ba7319e909
MFC after: 70 days
New driver to ACPI generic event device, defined in ACPI spec.
Some ACPI power button may not work without this.
In qemu arm64 with "virt" machine, with ACPI firmware,
enable devd check devd message by
and invoke following command in qemu monitor
(qemu) system_powerdown
and make sure some power button input event appear.
(setting sysctl hw.acpi.power_button_state=S5 is not work,
because ACPI tree does not have \_S5 object.)
Reviewed by: andrew, hrs
Differential Revision: https://reviews.freebsd.org/D37032
This is the second part of the ARM64 Hyper-V enablement.
These changes here are mostly with Make, release changes and also
changes required in vmbus.c hyperv.c and common files in hyperv.
Reviewed by: whu
Tested by: Souradeep Chakrabarti <schakrabarti@microsoft.com>
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D36467
In ARM64 gen2 Hyper-V, use IRQ resource from vmbus_res, which is owning
the IRQ for current device tree. It allows the MMIO resource to be
successfully allocated for vmbus from parent acpi_syscontainer.
Reviewed by: whu
Tested by: Souradeep Chakrabarti <schakrabarti@microsoft.com>
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D37064
This add BUS_GET_DEVICE_PATH interface,
which shows device tree of openfirm/fdt.
In qemu-system-arm64 with "virt" machine with device-tree firmware,
% devctl getpath OFW cpu0
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D37031
The emulated UART in the Firecracker VMM (aka the implementation in the
rust-vmm/vm-superio project) includes FIFOs but does not implement the
FCR register, which is used by ns8250_flush to flush the FIFOs.
Check the LSR to see if there is still data in the FIFOs and call
ns8250_drain if necessary.
Discussed with: emaste, imp, jrtc27
Sponsored by: https://patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D36979
We assume that bus addresses from busdma are the same thing as
"physical" addresses in the Virtio specification; this seems to
be true for the platforms on which Virtio is currently supported.
For block devices which are limited to a single data segment per
I/O, we request PAGE_SIZE alignment from busdma; this is necessary
in order to support unaligned I/Os from userland, which may cross
a boundary between two non-physically-contiguous pages. On devices
which support more data segments per I/O, we retain the existing
behaviour of limiting I/Os to (max data segs - 1) * PAGE_SIZE.
Reviewed by: bryanv
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D36667