Commit d3f96f6610 removed <sys/queue.h>
and replaced it with the very broad <sys/systm.h>. However, none of
the changes to sysctl.h in that commit require anything defined in
<sys/systm.h>. On the other hand, <sys/sysctl.h> does still make use
of queue macros. Drop the include of <sys/systm.h> and re-add
<sys/queue.h>.
Reviewed by: imp, kib, asomers
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D37950
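As a small illustration of why a header that declares linked structures needs <sys/queue.h> directly, the sketch below uses the SLIST macros the way such a header might. The demo_* names are hypothetical and are not taken from <sys/sysctl.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical node type; this is not the real sysctl_oid layout. */
struct demo_node {
	int			value;
	SLIST_ENTRY(demo_node)	link;	/* expands only with <sys/queue.h> */
};

SLIST_HEAD(demo_list, demo_node);

/* Count the nodes on a list using only the queue macros. */
static int
demo_count(struct demo_list *head)
{
	struct demo_node *n;
	int count = 0;

	SLIST_FOREACH(n, head, link)
		count++;
	return (count);
}
```

Nothing here depends on <sys/systm.h>; the queue header alone is enough for the declarations to compile.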
Include the phase and argument field to make it easier to determine
at a glance where the failure originated.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D38091
It is the same as callout_stop(9) but the return values are different.
Reviewed by: hselasky
Approved by: hselasky
Differential Revision: https://reviews.freebsd.org/D38081
Some devices have CDC_CM descriptors that would point us to
the wrong interfaces. Add a quirk to ignore those (effectively preferring
the CDC_UNION descriptor).
Reviewed by: manu
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D37942
Add ACPI_RESOURCE_TYPE_FIXED_MEMORY32 to the PCI ECAM driver. This is
used on the Microsoft Dev Kit 2023 and reportedly the Lenovo x13s.
Reviewed by: Robert Clausecker <fuz@fuz.su> (Earlier version)
Tested by: Robert Clausecker <fuz@fuz.su> (Earlier version)
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38031
In set_fpcontext we only need a critical section around vfp_discard.
The remainder of the code can run without it.
While here add an assert to check the passed in thread is the
current thread as the code already assumes this.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D38000
If a thread enters a kernel FP context the PCB_FP_STARTED may be
unset when calling get_fpcontext even if the VFP unit has been used
by the current thread.
Reduce the use of this flag to just decide when to store the VFP state.
While here add an assert to check the assumption that the passed in
thread is the current thread and remove the unneeded critical section.
The latter is unneeded as the only place we would need it is in
vfp_save_state and this already has a critical section when needed.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D37998
The PCB_FP_STARTED is used to indicate that the current VFP context
has been used since either 1. the start of the thread, or 2. exiting
a kernel FP context.
When case 2 was added to the kernel this could cause incorrect results
to be returned when a thread exits the kernel FP context and fill_fpregs
is called before it has restored the VFP state, e.g. by trapping on a
userspace VFP instruction.
In both of the cases the base save area is still valid so reduce the
use of the PCB_FP_STARTED flag check to help decide if we need to
store the current thread's VFP state.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D37994
Commit 0ef3ca7ae3 initialized
thread0.td_kstack_pages to KSTACK_PAGES. Due to the lack of an
include of opt_kstack_pages.h it used the fallback value of 4 from
machine/param.h. This meant that increasing KSTACK_PAGES in the kernel
config resulted in a panic in _epoch_enter_preempt as the following
assertion was false during network stack setup:
MPASS((vm_offset_t)et >= td->td_kstack &&
(vm_offset_t)et + sizeof(struct epoch_tracker) <=
td->td_kstack + td->td_kstack_pages * PAGE_SIZE);
Switch to initializing with kstack_pages following other architectures.
Reviewed by: imp, markj
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D38049
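The failure mode can be sketched in userland. The code below is a minimal model of the bounds check, assuming a 4096-byte page and a hypothetical demo_thread type that carries only the two fields the assertion reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE	4096u	/* assumed page size for the sketch */

/* Stand-in for the two struct thread fields used by the assertion. */
struct demo_thread {
	uintptr_t	td_kstack;		/* base address of the stack */
	unsigned int	td_kstack_pages;	/* recorded size, in pages */
};

/*
 * Mirror of the MPASS condition: the object at [addr, addr + size) must lie
 * inside the stack range that the thread records for itself.
 */
static bool
demo_on_kstack(const struct demo_thread *td, uintptr_t addr, size_t size)
{
	uintptr_t end = td->td_kstack +
	    (uintptr_t)td->td_kstack_pages * DEMO_PAGE_SIZE;

	return (addr >= td->td_kstack && addr + size <= end);
}
```

If the stack really spans six pages but the thread records the fallback of four, an object placed in the fifth page fails the check even though it is on the stack.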
Commit 86a994d653 initialized
thread0.td_kstack_pages to KSTACK_PAGES. Due to the lack of an
include of opt_kstack_pages.h it used the fallback value of 4 from
machine/param.h. This meant that increasing KSTACK_PAGES in the kernel
config resulted in a panic in _epoch_enter_preempt as the following
assertion was false during network stack setup:
MPASS((vm_offset_t)et >= td->td_kstack &&
(vm_offset_t)et + sizeof(struct epoch_tracker) <=
td->td_kstack + td->td_kstack_pages * PAGE_SIZE);
Switch to initializing with kstack_pages following other architectures.
Reviewed by: imp, markj
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D38048
This file is indented with a mixture of tabs and spaces. No functional
change intended.
Reviewed by: melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38100
Some users of nlmsg_reserve_object() and nlmsg_reserve_data() are not
careful to fully initialize pad and reserved fields, allowing
uninitialized bytes to leak to userspace. For example, dump_nhgrp()
doesn't set nhm->resvd = 0.
Meanwhile, nlmsg_get_ns_buf() and nlmsg_get_ns_lbuf() zero-initialize
the buffer, so nlmsg_get_ns_mbuf() is inconsistent. Let's just make
them all behave the same here.
Reported by: KMSAN
Reviewed by: melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38098
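A minimal sketch of the idea, using a hypothetical demo_writer rather than the real struct nl_writer. For brevity it zeroes the bytes at reservation time, which is a simplification of zero-initializing the whole buffer when it is obtained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical writer; not the real struct nl_writer. */
struct demo_writer {
	unsigned char	*buf;
	size_t		 size;
	size_t		 offset;
};

/* Hypothetical message with a reserved field that callers tend to skip. */
struct demo_msg {
	unsigned char	family;
	unsigned char	resvd;
};

/*
 * Reserve len bytes and hand them out already zeroed, so that padding and
 * reserved fields never carry stale bytes even if the caller skips them.
 */
static void *
demo_reserve(struct demo_writer *w, size_t len)
{
	void *p;

	if (len > w->size - w->offset)
		return (NULL);
	p = w->buf + w->offset;
	memset(p, 0, len);
	w->offset += len;
	return (p);
}
```

With this shape a caller that only sets family still produces a message whose resvd byte is zero.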
This is the same error code as Linux.
As emaste@ noted in the review, FreeBSD defines the following errno
values in `sys/errno.h`:
* 56 is `EISCONN`
* 57 is `ENOTCONN`
Reviewed by: manu
Approved by: manu
Differential Revision: https://reviews.freebsd.org/D37935
It uses the `VM_MEMATTR_WRITE_BACK` flag on FreeBSD.
It replaces `ioremap_wb()` which doesn't exist in Linux. Perhaps it
existed in the past and was removed.
Reviewed by: emaste, manu
Approved by: emaste, manu
Differential Revision: https://reviews.freebsd.org/D37916
bz@ asked if the KBI breakage is a concern here. My answer was that this
is the first time in the DRM drivers in Linux 5.13 (the version I'm
working on) that this structure is initialized (as a variable local to
the function in this case), so it shouldn't be a problem for the DRM
drivers.
However, I can't speak for other drivers maintained outside of the src
tree.
Reviewed by: emaste, manu
Approved by: emaste, manu
Differential Revision: https://reviews.freebsd.org/D37913
Various handlers for SADB messages will allocate a new mbuf and populate
some structures in it. Some of these structures, such as struct
sadb_supported, contain small reserved fields that are not initialized
and are thus leaked to userspace.
Fix the problem by adding a helper to allocate zeroed mbufs. This
reduces code duplication and the overhead of zeroing these messages
isn't harmful.
Reviewed by: zlei, melifaro
Reported by: KMSAN
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D38068
Avoid including cdefs.h in system headers. Both headers now include
types.h, and we can assume that that pulls in cdefs.h (required for
__typeof usage in some of the atomic macro expansions).
No functional change intended.
Reviewed by: imp, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D38039
Some apps try to provide only the non-zero part of the required message
header instead of the full one. It happens when fetching routes or
interface addresses, where the first header byte is the family.
This behavior is "illegal" under the "strict" Netlink socket option,
however there are many applications out there doing things in the
"old" way.
Support this use case by copying the provided bytes into the temporary
zero-filled header and running the parser on this header instead.
Reported by: Goran Mekić <meka@tilda.center>
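The copy-into-zeroed-header step can be sketched as follows. The demo_hdr layout and the function name are hypothetical; only the technique (zero-fill, then copy the supplied prefix) comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size header; the first byte is the address family. */
struct demo_hdr {
	unsigned char	family;
	unsigned char	pad;
	unsigned char	type;
	unsigned char	flags;
};

/*
 * Build a full header from a possibly truncated payload: start from zeroes,
 * then copy whatever prefix the sender actually supplied.
 */
static struct demo_hdr
demo_expand_hdr(const void *data, size_t len)
{
	struct demo_hdr hdr;

	memset(&hdr, 0, sizeof(hdr));
	if (len > sizeof(hdr))
		len = sizeof(hdr);
	if (len > 0)
		memcpy(&hdr, data, len);
	return (hdr);
}
```

The parser then runs on the returned header, so a one-byte request that carries only the family is handled like a full header with the remaining fields set to zero.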
Currently `close(2)` erroneously returns `EOPNOTSUPP` for `PF_ROUTE` sockets.
This has happened since the rtsock socket implementation was made
self-contained (36b10ac2cd). Rtsock code marks the socket as connected in
`rts_attach()`. `soclose()` tries to disconnect such a socket using the
`.pr_disconnect` callback. Rtsock does not implement this callback, resulting
in the default method being substituted. This default method returns
`EOPNOTSUPP`, failing the `soclose()` logic.
This diff restores the previous behaviour by adding a custom `pr_disconnect()`
that returns `ENOTCONN`.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D38059
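The interaction can be modelled in a few lines of userland C. Everything prefixed demo_ is hypothetical, and the close path's special treatment of `ENOTCONN` is an assumption of this sketch, not a copy of `soclose()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_socket;
typedef int demo_disconnect_t(struct demo_socket *);

/* Hypothetical protocol switch with a single optional method. */
struct demo_protosw {
	demo_disconnect_t *pr_disconnect;
};

struct demo_socket {
	struct demo_protosw *so_proto;
	int		     so_connected;
};

/* Default substituted for protocols that leave the method unset. */
static int
demo_disconnect_notsupp(struct demo_socket *so)
{
	(void)so;
	return (EOPNOTSUPP);
}

/* Explicit method: "nothing to disconnect" is reported as ENOTCONN. */
static int
demo_rts_disconnect(struct demo_socket *so)
{
	(void)so;
	return (ENOTCONN);
}

/* Fill in defaults, as a registration step might. */
static void
demo_register(struct demo_protosw *pr)
{
	if (pr->pr_disconnect == NULL)
		pr->pr_disconnect = demo_disconnect_notsupp;
}

/* Close path that tolerates ENOTCONN but passes other errors through. */
static int
demo_close(struct demo_socket *so)
{
	int error = 0;

	if (so->so_connected) {
		error = so->so_proto->pr_disconnect(so);
		if (error == ENOTCONN)
			error = 0;
	}
	return (error);
}
```

In this model a protocol without its own method makes demo_close() fail with `EOPNOTSUPP`, while one that returns `ENOTCONN` closes cleanly.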
For NFSv4.1/4.2, when the client specifies SP4_NONE for
state protection in the ExchangeID operation arguments,
the server MUST allow the state management operations for
any user credentials. (I misread the RFC and thought that
SP4_NONE meant "at the server's discretion" and not MUST
be allowed.)
This means that the "sec=XXX" field of the "V4:" exports(5)
line only applies to NFSv4.0.
This patch fixes the server to always allow state management
operations for SP4_NONE, which is the only state management
option currently supported. (I have patches that add support
for SP4_MACH_CRED to the server. These will be in a future commit.)
In practice, this bug does not seem to have caused
interoperability problems.
MFC after: 2 weeks
nd6_resolve_slow() can be called without mbuf. If the LLE entry
is not reachable, nd6_resolve_slow() will add this NULL mbuf to
the holdchain via lltable_append_entry_queue, which will "append"
NULL to the end of the queue (effectively no-op) and bump la_numhold
value. When this entry gets freed, the kernel will panic due to the
inconsistency between the number of mbufs in the queue and the value
of la_numhold.
Fix the panic by checking that the mbuf is not NULL prior to inserting it
into the holdchain.
Reported by: kib
MFC after: 3 days
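A minimal model of the invariant, with hypothetical demo_pkt and demo_holdq types standing in for the mbuf and the LLE hold queue. The guard sits inside the append helper here for brevity; the message above describes checking before the insertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet and queue types; not the real mbuf or llentry. */
struct demo_pkt {
	struct demo_pkt *next;
};

struct demo_holdq {
	struct demo_pkt *head;
	struct demo_pkt *tail;
	int		 numhold;	/* must equal the chain length */
};

/* Append a packet; a NULL packet is skipped so the count stays accurate. */
static void
demo_hold(struct demo_holdq *q, struct demo_pkt *m)
{
	if (m == NULL)
		return;
	m->next = NULL;
	if (q->tail == NULL)
		q->head = m;
	else
		q->tail->next = m;
	q->tail = m;
	q->numhold++;
}

/* Walk the chain and report how many packets it really holds. */
static int
demo_chain_len(const struct demo_holdq *q)
{
	const struct demo_pkt *m;
	int n = 0;

	for (m = q->head; m != NULL; m = m->next)
		n++;
	return (n);
}
```

Without the NULL check the counter would be bumped while the chain stays unchanged, which is the mismatch that a later consistency check trips over.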
There is another case where SU code does ffs_syncvnode(dvp) for the
parent directory dvp while the child vnode vp is locked. Avoid the
issue by relocking and returning ERELOOKUP to indicate the need of
resync.
Reported by: jkim
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37997
There is no point in clearing just this flag. Flags are reset on the
struct mount re-allocation for reuse anyway.
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37966
The pfsync_defer_tmo() callout needs to set the correct vnet before it
can transmit packets. It used the rcvif in the mbuf to get this vnet,
but that doesn't work for locally originated traffic. In that case the
rcvif pointer is NULL, and the dereference leads to a panic.
Instead use the sc_sync_if, which is always set (if pfsync is enabled,
at least).
PR: 268246
MFC after: 2 weeks
In case the reset sequence fails (ena_destroy_device() followed by
ena_restore_device() calls) during ena_restore_device(), the driver
resources are being freed. After the clean-up, the timer service is
re-armed in order to try and re-initialize the driver state.
But, such an attempt would fail given that the resources are freed.
Moreover, this would actually cause either the system to fail or a
panic.
When the driver fails in ena_restore_device() procedure, the only
recovery is either unloading and loading the driver or instance
reboot.
This change removes the timer service re-arm in case of failure
in ena_restore_device().
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Fixes: 78554d0c70 ("ena: start timer service on attach")
Commit [1] first added the ena_tx_buffer.print_once member,
so that a message about a missing tx completion is printed only
once per packet (and not every second when the watchdog runs).
In this commit print_once is initialized to true, and is set back
to false after detecting a missing tx completion and printing
a warning about it to dmesg.
Commit [2] incorrectly reverses the values assigned to print_once.
The variable is initialized to be true but is checked to be false
when a missing tx completion is detected. This is never true, and
therefore the warning print for each missing tx completion is never
printed since this commit.
Commit [3] added time passed since last TX cleanup to the missing
tx completions per-packet print. However, due to the issue in commit
[2], this time is never printed.
This commit restores the values of ena_tx_buffer.print_once that commit
[2] erroneously reversed, bringing the per-packet missing tx completion
print back to life.
Also add a space after "." in the missing tx completion print.
[1] - 9b8d05b8ac ("Add support for Amazon Elastic Network Adapter (ENA) NIC")
[2] - 74dba3ad78 ("Split function checking for missing TX completion in ENA driver")
[3] - d8aba82b5c ("ena: Store ticks of last Tx cleanup")
Fixes: 74dba3ad78 ("Split function checking for missing TX completion in ENA driver")
Fixes: d8aba82b5c ("ena: Store ticks of last Tx cleanup")
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
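The intended print_once behaviour can be sketched as below. The demo_* names are hypothetical and the struct carries only the flag; the real ena_tx_buffer has more members.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-buffer state; only the flag matters here. */
struct demo_tx_buffer {
	bool print_once;
};

/* A freshly prepared buffer is allowed to warn once. */
static void
demo_tx_init(struct demo_tx_buffer *b)
{
	b->print_once = true;
}

/*
 * Called each time the watchdog finds the completion still missing.
 * Returns true only on the first detection, when the warning should print.
 */
static bool
demo_should_warn(struct demo_tx_buffer *b)
{
	if (!b->print_once)
		return (false);
	b->print_once = false;
	return (true);
}
```

If the test in demo_should_warn() were inverted while the initial value stayed true, the function would never return true, which matches the symptom described for commit [2].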
To attach to the hypervisor, kvmclock needs to write a per-CPU MSR.
When EARLY_AP_STARTUP is not defined, device attach happens too early:
APs are not yet spun up, so smp_rendezvous only runs the callback on the
local CPU. As a result, the timecounter only gets initialized on the
BSP, and then timekeeping is broken on SMP systems.
Implement handling for !EARLY_AP_STARTUP kernels: keep track of the CPU
on which device attach ran, and then use a SI_SUB_SMP SYSINIT to
register the rest of the CPUs with the hypervisor.
Reported by: Shrikanth R Kamath <kshrikanth@juniper.net>
Reviewed by: kib, jhb (earlier versions)
Sponsored by: Klara, Inc.
Sponsored by: Juniper Networks, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D37705
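The two-stage registration can be modelled roughly as follows. This is a userland sketch with hypothetical demo_* names and a fixed CPU count; a boolean array stands in for the per-CPU MSR write.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NCPU 4			/* assumed CPU count for the sketch */

static bool demo_registered[DEMO_NCPU];	/* per-CPU "MSR written" marker */
static int demo_attach_cpu = -1;	/* CPU that ran device attach */

/* Stand-in for the per-CPU MSR write that registers with the hypervisor. */
static void
demo_register_cpu(int cpu)
{
	demo_registered[cpu] = true;
}

/* Early attach: only the calling CPU is running, so remember which one. */
static void
demo_attach(int curcpu)
{
	demo_attach_cpu = curcpu;
	demo_register_cpu(curcpu);
}

/* Late hook, run once the other CPUs are up: register everyone else. */
static void
demo_late_register(void)
{
	int cpu;

	for (cpu = 0; cpu < DEMO_NCPU; cpu++) {
		if (cpu != demo_attach_cpu)
			demo_register_cpu(cpu);
	}
}
```

Remembering the attach CPU lets the late hook skip it, so no CPU is registered twice.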
Currently, prison_ip_restrict() returns true if the replacement
buffer was used, or if no buffer was provided and the allocation failed
so the call should be redone. The logic is confusing and may cause an
infinite loop since eb8dcdeac2.
Reviewed by: jamie, glebius
Approved by: kp (mentor)
Differential Revision: https://reviews.freebsd.org/D37918
And a possible infinite loop calling prison_ip_restrict() in
kern_jail_set() [2].
[1] It is possible that prisons do not have any IPv4 or IPv6 addresses.
[2] If prison_ip_restrict() is not provided with prison_ip, when it
allocates prison_ip successfully, then it should return false to
indicate that prison_ip_restrict() need not be redone later.
Reviewed by: glebius
Approved by: kp (mentor)
Fixes: eb8dcdeac2 jail: network epoch protection for IP address lists
Differential Revision: https://reviews.freebsd.org/D37906
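The return-value contract described in [2] can be illustrated with a userland model. All names are hypothetical, calloc() stands in for the kernel allocator, and the nowait_ok flag simulates whether an allocation that cannot sleep succeeds; none of this is the real prison_ip_restrict().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a jail's address list. */
struct demo_prison {
	int	*addrs;		/* current list, owned by the prison */
	bool	 nowait_ok;	/* simulates a no-sleep allocation result */
};

/*
 * Install a replacement list.  Returns true only when the caller must
 * allocate a buffer itself and call again; returns false once the work is
 * done, whether the buffer came from the caller or was allocated here.
 */
static bool
demo_restrict(struct demo_prison *pr, int **newp)
{
	int *buf = (newp != NULL) ? *newp : NULL;

	if (buf == NULL) {
		buf = pr->nowait_ok ? calloc(4, sizeof(*buf)) : NULL;
		if (buf == NULL)
			return (true);	/* redo with a caller buffer */
	} else {
		*newp = NULL;		/* buffer consumed */
	}
	free(pr->addrs);
	pr->addrs = buf;
	return (false);
}

/* Caller loop; returns how many times demo_restrict() was called. */
static int
demo_set(struct demo_prison *pr)
{
	int *buf = NULL;
	int calls = 0;

	for (;;) {
		calls++;
		if (!demo_restrict(pr, &buf))
			break;
		buf = calloc(4, sizeof(*buf));
		if (buf == NULL)
			break;
	}
	free(buf);
	return (calls);
}
```

Because a call that succeeds always returns false, the loop in demo_set() ends after at most two calls when the caller's own allocation works.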