Commit graph

142844 commits

Author SHA1 Message Date
Dmitry Chagin
8545bcff31 linux(4): Change timerfd_settime64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:50 +03:00
Dmitry Chagin
a1fd2911dd linux(4): Implement timer_settime64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:49 +03:00
Dmitry Chagin
9038a0b74c linux(4): Regen for timer_settime64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:49 +03:00
Dmitry Chagin
1508b1b6a0 linux(4): Change timer_settime64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:48 +03:00
Dmitry Chagin
783c1bd8cb linux(4): Implement timer_gettime64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:48 +03:00
Dmitry Chagin
1cccef6dff linux(4): Regen for timer_gettime64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:48 +03:00
Dmitry Chagin
ccec96033c linux(4): Change timer_gettime64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:47 +03:00
Dmitry Chagin
8c84ca657b linux(4): Implement sched_rr_get_interval_time64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:47 +03:00
Dmitry Chagin
cdddbb77c3 linux(4): Regen for sched_rr_get_interval_time64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:46 +03:00
Dmitry Chagin
7b520c0b3c linux(4): Change sched_rr_get_interval_time64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:45 +03:00
Hans Petter Selasky
a1c0442b41 xhci(4): Tweak USB port speed checks to allow newer super speed generations.
This allows setting the U1 and U2 port timeout values.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-05-04 09:26:39 +02:00
Hans Petter Selasky
d730333c80 xhci(4): Properly define all basic USB port speeds.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-05-04 09:26:38 +02:00
Marko Zec
d461deeaa4 VNET: Revert "ifnet: make if_index global"
This reverts commit 91f44749c6.

Devirtualization of V_if_index and V_ifindex_table was rushed into
the tree lacking proper context, discussion, and declaration of intent,
so I'm backing it out as harmful to VNET on the following grounds:

1) The change repurposed the decades-old and stable if_index KBI for
new, unclear goals which were omitted from the commit note.

2) The change opened up a new resource exhaustion vector where any vnet
could starve the system of ifnet indices, including vnet0.

3) To circumvent the newly introduced problem of separating ifnets
belonging to different vnets from the globalized ifindex_table, the
author introduced sysctl_ifcount() which does a linear traversal over
the (potentially huge) global ifnet list just to return a simple upper
bound on existing ifnet indices.

4) The change effectively led to nonuniform ifnet index allocation
among vnets.

5) The commit note clearly stated that the patch changed the implicit
if_index ABI contract where ifnet indices were assumed to be starting
from one.  The commit note also included a correct observation that
holes in interface indices were always allowed, but failed to declare
that the userland-observable ifindex tables could now include huge
empty spans even under modest operating conditions.

6) The author had an earlier proposal in the works which did not
affect per-vnet ifnet lists (D33265) but which he abandoned without
providing the rationale behind his decision to do so, at the expense
of sacrificing the vnet isolation contract and if_index ABI / KBI.

Furthermore, the author agreed to back out his changes himself and
to follow up with a proposal for a less intrusive alternative, but
later silently declined to act.  Therefore, I decided to resolve the
status-quo by backing this out myself.  This in no way precludes a
future proposal aiming to mitigate ifnet-removal related system
crashes or panics to be accepted, provided it would not unnecessarily
compromise the goal of as strict as possible isolation between vnets.

Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
2022-05-03 19:27:57 +02:00
Marko Zec
6c741ffbfa Revert "mbuf: do not restore dying interfaces"
This reverts commit 703e533da5.

Revert "ifnet/mbuf: provide KPI to serialize/restore m->m_pkthdr.rcvif"

This reverts commit e1882428dc.

Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
2022-05-03 19:11:40 +02:00
Marko Zec
894c574ed2 Revert "dummynet: use m_rcvif_serialize/restore when queueing packets"
This reverts commit 165746f4e4.

Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
2022-05-03 19:11:40 +02:00
Marko Zec
0fa5636966 Revert "netisr: serialize/restore m_pkthdr.rcvif when queueing mbufs"
This reverts commit 6871de9363.

Obtained from: github.com/glebius/FreeBSD/commits/backout-ifindex
2022-05-03 19:11:39 +02:00
Hans Petter Selasky
09dd1adfa4 xhci(4): Always add and evaluate the slot context.
Because the maximum number of endpoint contexts is stored there.

Tested by:	ehaupt@
PR:		262882
MFC after:	3 hours
Sponsored by:	NVIDIA Networking
2022-05-03 18:13:53 +02:00
Hans Petter Selasky
e276d28150 xhci(4): Only drop BULK and INTERRUPT endpoints to reset data toggle.
Only drop BULK and INTERRUPT endpoints, to reset the data toggle,
because for other endpoint types this is not critical.

Tested by:	ehaupt@
PR:		262882
MFC after:	3 hours
Sponsored by:	NVIDIA Networking
2022-05-03 18:13:53 +02:00
Rick Macklem
70910e4b55 nfscl: Acquire a refcount on "cred" for mirrored pNFS RPCs
When the NFSv4.1/4.2 client is doing a pnfs mount to
mirrored DS(s), asynchronous threads are used to do the
RPCs against the DS(s) concurrently.  If a DS is slow
to reply, it is possible for the "cred" to be free'd
before the asynchronous thread is done with it, causing
a panic/crash.

This patch fixes the problem by acquiring a refcount on
the "cred" while it is being used by the asynchronous thread
for a DS RPC.  This bug was found during a recent IETF
NFSv4 testing event.

This bug only affects "pnfs" mounts to mirrored pNFS
servers.

MFC after:	2 weeks
2022-05-03 07:22:15 -07:00
Hans Petter Selasky
d735d604f0 mlx5en(4): Use hard-coded 4K page size for RQ/SQ/CQ.
The page size specified for RQ, SQ and CQ is always in units of 4KBytes.
Make sure we subtract MLX5_ADAPTER_PAGE_SHIFT, 12, instead of PAGE_SHIFT
which may vary. This fixes support for using the mlx5en driver on systems
having non-4K page size.

Linux commit:
68cdf5d6e91068c98d6091b193dc7a5ab7dcf5eb

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-05-03 13:48:43 +02:00
Rick Macklem
271f6d52a6 nfsd: Fix session slot freeing for NFSv4.1/4.2
Without this patch the NFSv4.1/4.2 server erroneously
always frees session slot zero for callbacks.  This only
affects 4.1/4.2 mounts if the server has delegations
enabled or is a pNFS configuration.  Even for those
cases, the effect is mainly to only use slot 0 for
callbacks, serializing all of them.  There is a slight
chance that callbacks will fail if the client performs
them in a different order than received on the TCP
connection.

If this bug affects your server, you will see console
messages like:
  newnfs_request: Bad session slot

This patch fixes the problem.  Found during a recent
IETF NFSv4 testing event.

PR:	263728
MFC after:	2 weeks
2022-05-02 12:47:43 -07:00
Warner Losh
6f78dae849 cam: Remove redunant static __inline forward decls
Sponsored by:		Netflix
2022-05-02 09:30:07 -06:00
Eugene Grosbein
28903f396a ng_pppoe: introduce new sysctl net.graph.pppoe.lcp_pcp
This fixes incomplete commit 2e547442ab

New sysctl allows to mark transmitted PPPoE LCP Control
ethernet frames with needed 3-bit Priority Code Point (PCP) value.
Confirming driver like if_vlan(4) uses the value to fill
IEEE 802.1p class of service field.

This is similar to Cisco IOS "control-packets vlan cos priority"
command.

It helps to avoid premature disconnection of user sessions
due to control frame drops (LCP Echo etc.)
if network infrastructure has a botteleck at a switch
or the xdsl DSLAM.

See also:
https://sourceforge.net/p/mpd/discussion/44692/thread/c7abe70e3a/

Tested by:      Klaus Fokuhl at SourceForge
MFC after:      2 weeks
2022-05-02 21:57:12 +07:00
Hans Petter Selasky
6244b53e16 ibcore: Allow passing NULL-pointers to ib_umem_release()
FreeBSD commit b633e08c70 removed the
NULL-checks from the mlx4ib-driver.

This fixes the following NULL-pointer panic when unloading mlx4ib:
ib_umem_release()
mlx4_ib_destroy_qp()
ib_destroy_qp_user()
ipoib_transport_dev_cleanup()
ipoib_dev_cleanup()
ipoib_remove_one()
ib_unregister_client()
ipoib_cleanup_module()
linker_file_sysuninit()
linker_file_unload()
kern_kldunload()
amd64_syscall()

Linux commit:
836a0fbb3e76f704ad65ddfb57f00725245e509b

MFC after:	1 week
Submitted by:	dandan@lysator.liu.se
Sponsored by:	Lysator ACS
Sponsored by:	NVIDIA Networking
2022-05-02 13:11:06 +02:00
Warner Losh
1599fc904d iosched: Move bio_next() inside of the CAM_IOSCHED_DYNAMIC ifdef
bio_next() is only used by the dynamic scheduler, so move it under that
ifdef.

Sponsored by:		Netflix
2022-05-01 16:54:15 -06:00
Rick Macklem
47d75c29f5 nfsd: Add a sanity check to SecinfoNoname for file type
Robert Morris reported that, for the case of SecinfoNoname
with the Parent option, providing a non-directory could
cause a crash.

This patch adds a sanity check for v_type == VDIR for
this case, to avoid the crash.

Reported by:	rtm@lcs.mit.edu
PR:	260300
MFC after:	2 weeks
2022-05-01 13:41:31 -07:00
Warner Losh
d095d6a34c cam_xpt: Prefer bool to int where it's a boolean
In the places where we set an integer to 0 or 1 and then use it like a
boolean, replace int with bool and 0/1 with false/true. Left alone
places where this is a function argument or return value. No functional
changes intended.

Sponsored by:		Netflix
2022-05-01 12:09:42 -06:00
Warner Losh
d592c0db8b cam: add hw.cam.iosched.read_bias
Allow a global setting for the read_bias for the dynamic io
scheduler. This allows global policy to be set, in addition to the
existing per-drive policy. kern.cam.iosched.read_bias is a new tunable.

Sponsored by:		Netflix
Reviewed by:		chs
Differential Revision:	https://reviews.freebsd.org/D34365
2022-05-01 11:27:34 -06:00
Warner Losh
b65803ba57 cam iosched: default to no read bias in dynamic ioscheduling
When we're doing dynamic I/O scheduling, don't default to a read bias of
100. Default it to 0 so turning on dynamic scheduling only does
scheduling tweaks that are requested. The other limiters are off by
default, and need no further adjustment.

Sponsored by:		Netflix
2022-05-01 11:27:34 -06:00
Warner Losh
cc1572ddeb cam iosched: Remove write bias when read bias = 0
Change the meaning of read bias == 0 in the dynamic I/O scheduler. Prior
to this change, a read bias of 0 would mean prefer writes.  Now, when
read bias is 0, we queue all requests to the same queue removing the
bias. When it's non-zero, we still separate the queues we use so we can
bias reads vs writes for workloads that are read centric. These changes
restore the typical bias you get from disksort or ordered insertion at
the end of the list.

Sponsored by:		Netflix
2022-05-01 11:27:34 -06:00
Warner Losh
6c8ab086fe ada: Retry commands with retries left on CAM_SEL_TIMEOUT
The AHCI and ATA SIMs will return CAM_SEL_TIMEOUT when an underlying
device has stopped responding. This is usually seen after a timeouted
out command and can be a transient event. Rather than fail the
peripheral immediately after seeing this, queue a retry. For transient
events, this allows drives to continue to provide data, though with some
added latency, just like we do when we have some other kind of retriable
error. If the error isn't transient (the drive is truly gone), then
we'll discover that eventually and fail the transaction and invalidate
the drive like we do today.

This helps us avoid a panic at the end of camperiphfree when
CAM_PERIPH_NEW_DEV_FOUND is set. However, the deferred callback should
be queued to xpt_async_td instead of being made inline there. This issue
will be solved in a different patch that does that. PR 263703.

This also helps us avoid another bug where we can drop all references to
the device (causing us to go through camperiphfree and destroy the path)
while we have an I/O pending in the ata_da state machine (usually in
state ADA_STATE_RAHEAD with ATA_SETFEATURES ATA_SF_ENAB_RCACHE
command). It's not clear why the reference that we take out to do the
reprobe isn't effective at blocking this. By retrying this condition,
though we avoid this bug (at least more often, I don't have a good
reproduction test case, I just see this panic a few times a month at
work on systems that have transient disk errors on ahci connected SATA
SSDs). PR 263704. It's too soon to know how much this helps us avoid
this bug.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D34977
2022-05-01 11:08:56 -06:00
Rick Macklem
5218d82c81 nfscl: Add support for a NFSv4 AppendWrite RPC
For IO_APPEND VOP_WRITE()s, the code first does a
Getattr RPC to acquire the file's size, before it
can do the Write RPC.

Although NFS does not have an append write operation,
an NFSv4 compound can use a Verify operation to check
that the client's notion of the file's size is
correct, followed by the Write operation.

This patch modifies the NFSv4 client to use an Appendwrite
RPC, which does a Verify to check the file's size before
doing the Write.  This avoids the need for a Getattr RPC
to preceed this RPC and reduces the RPC count by half for
IO_APPEND writes, so long as the client knows the file's
size.

The nfsd structure was moved from the stack to be malloc()'d,
since the kernel stack limit was being exceeded.

While here, fix the types of a few variables, although
there should not be any semantics change caused by these
type changes.
2022-04-30 13:49:23 -07:00
Hans Petter Selasky
6eb6aeef7e uath(4): Fix incorrect byte-swapping and a buffer length check.
PR:			263638
Reported by:		Jeff Gibbons <jgibbons@protogate.com>
MFC after:		1 week
Sponsored by:		NVIDIA Networking
2022-04-30 11:23:07 +02:00
Bjoern A. Zeeb
00614c9c2d LinuxKPI: 802.11: fill in two more TODOs
Implement ieee80211_is_data_present() and a subset of
ieee80211_is_bufferable_mmpdu() which hopefully is good enough in
the compat code for now.
This is partly in preparation for some TXQ changes coming up soon.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-04-30 08:00:04 +00:00
Bjoern A. Zeeb
3540911bfd LinuxKPI: 802.11: use ieee80211_beacon_miss()
In ieee80211_beacon_loss() call into net80211::ieee80211_beacon_miss()
rather than manually bouncing our state.  That should give us the
ability to send a probereq and see if the AP is till there rather than
right away going to scan.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-04-30 07:57:34 +00:00
Alan Somers
2f6362484c fusefs: use the fsname mount option if set
The daemon can specify fsname=XXX in its mount options.  If so, the file
system should report f_mntfromname as XXX during statfs.  This will show
up in the output of commands like mount and df.

Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35090
2022-04-29 11:10:03 -06:00
Andrew Turner
5b651b501a Map the ACPI tables into the DMAP
When we try to load these tables via acpidump(8) we need them to be in
the DMAP for /dev/mem to access. Add the EFI ACPI reclaim memory type
to the list of memory we map into DMAP but not used by the kernel as
this is the recommended place to put these.

Sponsored by:	The FreeBSD Foundation
2022-04-29 13:11:02 +01:00
Warner Losh
9fb40baf60 cam_periph: Return ENXIO when peripheral is invalidated
When the peripheral is invalidated, no further I/O is possible. Signal
this up the stack with ENXIO now that upper layers of the stack
differentiate sometimes. In order for there to be further I/O, and new
open is required for any block device that a future periph might
instantiate for devices at this location that might return or otherwise
become available. The I/O scheduler flushes its I/O with the ENXIO error
for pending I/O that didn't make it to the device, so this makes the two
paths match.

MFC After:		3 days
Sponsored by:		Netflix
Reviewed by:		chs, mav
Differential Revision:	https://reviews.freebsd.org/D35093
2022-04-28 16:30:00 -06:00
Warner Losh
ca420b4ef2 mpr/mps: when sending reset on removal, include target in message
It's possible for muliple drives to be departing at the same time (if
the common power rail the share goes dark, for example). To understand
what's going on better, include target and handle in the messages
announcing the reset to allow matching with other corresponding events.

MFC After:		3 days
Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D35092
2022-04-28 16:30:00 -06:00
Alan Somers
45825a12f9 fusefs: fix FUSE_CREATE with file handles and fuse protocol < 7.9
Prior to fuse protocol version 7.9, the fuse_entry_out structure had a
smaller size.  But fuse_vnop_create did not take that into account when
working with servers that use older protocols.  The bug does not matter
for servers which don't use file handles or open flags (the only fields
affected).

PR:		263625
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
2022-04-28 15:13:09 -06:00
Warner Losh
c5041b4ee8 mpr/mps: Add comment explaining state transition
When we can't load a request due to a shortage of chains, we complete
the command's cm. However, to avoid an assert in mp?_complete_command,
we transition its state to INQUEUE. This transition is legitimate
because this is the only error path that terminates a cm before it's
enqueued and the only other alternative would be an additional transient
state that would add complexity w/o adding value. Add a comment
explainging all this because otherwise the transition can look a bit
weird.

Sponsored by:		Netflix
2022-04-28 11:19:39 -06:00
Robert Wing
65c87a6c81 geom_dev: extend kevent support for geom dev
Add support for the following NOTE events:
    NOTE_OPEN, NOTE_CLOSE, NOTE_CLOSE_WRITE, NOTE_READ, and NOTE_WRITE.

Differential Revision:	https://reviews.freebsd.org/D34777
2022-04-28 08:40:13 -08:00
Piotr Kubaj
25768526bb powerpc: enable wlan and ath modules in GENERIC64*
Reviewed by:	jhibbits (src)
Differential Revision: https://reviews.freebsd.org/D35089
2022-04-28 11:42:39 +02:00
Kornel Duleba
0923ff82fb Add USB ID and quirks for Huawei E3372
Set UQ_MSC_NO_INQUIRY and UQ_MSC_NO_GETMAXLUN quirks for mass storage,
which is the initial mode of this dongle.
The modem is shipped with at least two firmware versions: 10.X and 11.X,
without ability to update to the newer one.
The 11.X version works more or less fine, but the 10.X one resets after
receiving either an SCSI INQUIRY, or a get_max_lun command.
Since both of those are used for automatic quirk detection, this leads
to a reset cycle making the device somewhat unusable.

Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: hps, wma
Differential Revision: https://reviews.freebsd.org/D35076
2022-04-28 08:42:30 +02:00
Kornel Duleba
3ee943868c usb: Respect NO_INQUIRY quirk during device enumeration
Both usb_iface_is_cdrom and usb_msc_auto_quirk functions use SCSI INQUIRY
command to probe various properties of usb mass storage devices.
The problem here is that some very broken devices don't like this command.
Check if UQ_MSC_NO_INQUIRY quirk is set and skip cdrom and quirk
autodetection in that case.

Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: hps, wma
Differential Revision: https://reviews.freebsd.org/D35075
2022-04-28 08:42:26 +02:00
Alexander Motin
404f001161 CAM: Keep periph_links when restoring CCB in camperiphdone().
While recovery command executed, some other commands from the periph
may complete, that may affect periph_links of this CCB.  So restoring
original CCB we must keep current periph_links as more up to date.

I've found this triggering assertions with debug kernel and suspect
some memory corruptions otherwise when spun down disk receives two
or sometimes more concurrent requests.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2022-04-27 21:39:50 -04:00
Konstantin Belousov
6fe78ad434 subr_unit.c: make userspace tests buildable
by defining a placeholder for UNR_NO_MTX

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2022-04-28 03:00:14 +03:00
Konstantin Belousov
709783373e Fix another race between fork(2) and PROC_REAP_KILL subtree
where we might not yet see a new child when signalling a process.
Ensure that this cannot happen by stopping all reapping subtree,
which ensures that the child is not inside a syscall, in particular
fork(2).

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:35 +03:00
Konstantin Belousov
39794d80ad Fix a race between fork(2) and PROC_REAP_KILL subtree
by repeating iteration over the subtree until there are no new processes
to signal.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:35 +03:00
Konstantin Belousov
d1df347368 kern_procctl: add possibility to take stop_all_proc_block() around exec
stop_allo_proc_block() must be taken before proctree_lock.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D35014
2022-04-28 02:27:35 +03:00