Commit graph

139025 commits

Author SHA1 Message Date
Mark Johnston
e6c19aa94d sctp: Allow blocking on I/O locks even with non-blocking sockets
There are two flags to request a non-blocking receive on a socket:
MSG_NBIO and MSG_DONTWAIT.  They are handled a bit differently in that
soreceive_generic() and soreceive_stream() will block on the socket I/O
lock when MSG_NBIO is set, but not if MSG_DONTWAIT is set.  In general,
MSG_NBIO seems to mean, "don't block if there is no data to receive" and
MSG_DONTWAIT means "don't go to sleep for any reason".

SCTP's soreceive implementation did not allow blocking on the I/O lock
if either flag is set, but this violates an assumption in
aio_process_sb(), which specifies MSG_NBIO but nonetheless
expects to make progress if data is available to read.  Change
sctp_sorecvmsg() to block on the I/O lock only if MSG_DONTWAIT
is not set.

Reported by:	syzbot+c7d22dbbb9aef509421d@syzkaller.appspotmail.com
Reviewed by:	tuexen
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31915
2021-09-14 09:02:05 -04:00
Mark Johnston
fa0463c384 socket: De-duplicate SBLOCKWAIT() definitions
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-09-14 09:01:32 -04:00
Andrew Turner
fd860ace3b Remove the DN flag from the initial arm64 fpcr value
This fixes software that expects NaN values to propagate.

Sponsored by:	The FreeBSD Foundation
2021-09-14 12:52:48 +00:00
Andrew Turner
e4d89a633e Add support for gicv2m as a child of gicv3
On some systems, e.g. Parallels set to host a Linux VM under an M1 Mac,
there is a GICv2m as a child of the GICv3. We previously assumed the
GICv2m was always a child of a GICv2. Fix this by adding the needed
support to the GICv3 driver.

PR:		258136
Reported by:	trasz
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31768
2021-09-14 08:24:52 +01:00
Andrew Turner
c6f3076d44 Move the GICv2m msi handling to the parent
This is in preperation for adding support for the GICv2m driver as a
child of the GICv3 driver.

PR:             258136
Reported by:    trasz
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31767
2021-09-14 08:24:52 +01:00
Wojciech Macek
ba4d9d9d5b Revert "if_mvneta: Build the driver as a kernel module"
This reverts commit bcf5c7a8b1.
2021-09-14 11:49:59 +02:00
Hubert Mazur
bcf5c7a8b1 if_mvneta: Build the driver as a kernel module
Fix device detach and attach routine. Add required Makefile
to build as a module. Remove entry from GENERIC, since now
it can be loaded automatically.

Tested on EspressoBin.

Obtained from:		Semihalf
Reviewed by:		manu
Differential revision:	https://reviews.freebsd.org/D31581
2021-09-14 08:29:53 +02:00
Konstantin Belousov
1c56781cc9 amd64 wakeup: rework trampoline page allocation
There is no need to restrict trampoline page table to low 1M, it
should work with any pages below 4G.  Only wakeup code itself should
be below 1M.

Do not waste level 5 page when LA48 mode is used.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31931
2021-09-14 00:23:15 +03:00
Konstantin Belousov
2b6eec531a x86: duplicate acpi_wakeup.c per i386 and amd64
The file as is is the maze of #ifdef passages, all slightly different.
Divorcing i386 and amd64 version actually makes changing the code
easier, also no changes for i386 are planned.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31931
2021-09-14 00:23:14 +03:00
Krzysztof Galazka
abf774528d
ixl(4): Fix 2.5 and 5G speeds reporting and update shared code
Fix 2.5 and 5G speeds reporting and update shared code with recent
changes:
- Update expected FW API versions for X710 and X722 adapters
- Define pointers related to Preservation Rules Module
- Add definitions for Shadow RAM pointers to new modules: 5th and 6th
  FPA, and Preservation Rules Module.
- Add I40E_RX_PTYPE_PARSER_ABORTED definition, so the driver will know
  opcode for parser aborted packets.
- Add the new filter types needed for custom cloud filters.
- Add support for Minimum Rollback Revision
- Fix RX_ONLY mode for unicast promiscuous on VLAN
- Add EEE LPI status check for X722 adapters
- Fix PHY type identifiers for 2.5G and 5G adapters
- Fix update link data for X722
- Increase the timeout value for PF reset to give PF more time to finish
  reset if it is loaded with filters.
- Added support for Min Rollback Revision for 4 more X722 modules
- Fix reporting of Active Optical Cable media type
- Add flags and fields for double VLAN processing
- Fix potentially uninitialized variables in NVM code

Reviewed by:	kbowling@, mike.jakubik@gmail.com
Tested by:	gowtham.kumar.ks@intel.com
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D31565
2021-09-13 14:00:50 -07:00
Stefan Grundmann
e673ac3ffb libnv: Fix array unpack endianness logic
When a nvlist(9) is converted into a binary buffer by nvlist_pack(9),
the host endianness is encoded in the nvlist_header of the binary
buffer. The nvlist_unpack(9) function converts a given binary buffer
to an nvlist. In the conversion process the endianness encoded in the
nvlist_header is evaluated and -- should the encoded endianness differ
from the endianess of the decoding host -- endianness conversion is
applied to nvlist_header and nvpair_header elements as well as
to some nvpair values.

In 2015 @oshogbo extended libnv with array support (in 347a39b).
The unpacking code misses the possible need to convert the endianness
of the nvph_nitems element of nvpair_headers.

The patch (re)enables libnv to unpack nvlists regardless of the
endianness of the packing host.

Pull Request:	https://github.com/freebsd/freebsd-src/pull/528
2021-09-13 21:21:14 +02:00
John Baldwin
f63ddf465f cxgbei: Only convert "plain" TCP connections to ISCSI.
Reject attempts to convert a connection using a different ULP
mode: (e.g. DDP or TLS) to ISCSI.

Reported by:	Jithesh Arakkan @ Chelsio
Sponsored by:	Chelsio Communications
2021-09-13 09:57:54 -07:00
John Baldwin
b7caa81576 cxgbei: Return early for EBUSY error in icl_cxgbei_conn_handoff.
This permits unindenting almost half of the function.

Sponsored by:	Chelsio Communications
2021-09-13 09:57:54 -07:00
John Baldwin
9b1bb0aee6 cxgbei: Disable ISO for -SO cards without external memory.
Reported by:	Jithesh Arakkan @ Chelsio
Sponsored by:	Chelsio Communications
2021-09-13 09:57:54 -07:00
Konstantin Belousov
db2ba218d9 amd64 acpi_wakeup: map 1:1 whole low 4G for the trampoline page table
This is required since kernel text might be physically located
anywhere below 4G.

PR:	258432
Reported by:	Taku YAMAMOTO <taku@tackymt.homeip.net>
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:52:13 +03:00
Konstantin Belousov
ceca8ac1ce x86 acpi_install_wakeup_handler(): style
Do not use tab between type and variable name in local declarations.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:52:06 +03:00
Konstantin Belousov
e99255c8a6 amd64: do not touch low memory in acpi_wakeup_ap() if booted by UEFI
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:51:52 +03:00
Andrew Turner
b029ef7fe6 Restrict spsr updated in the arm64 set_regs*
When using ptrace(2) on arm64 to set registers in a 32-bit program we
need to take care to only set some of the fields. Follow the existing
arm64 path and only let the user set the flags fields. This is also the
case in the arm kernel so fixes a change in behaviour between the two.

While here update set_regs to only set spsr and elr once.

Sponsored by:	The FreeBSD Foundation
2021-09-13 16:32:50 +01:00
Alexander Motin
e76786909c Fix data race in scsi cd driver.
There is a data race between cdsysctlinit and cdcheckmedia.  Both
functions change softc->flags without synchronization.

Submitted by:	Arseny Smalyuk <smalukav@gmail.com>
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D31726
2021-09-13 08:59:51 -04:00
Wojciech Macek
6e93bdfff3 Revert "if_mvneta: Build the driver as a kernel module"
This reverts commit 41b0190cc4.
2021-09-13 12:55:15 +02:00
Hubert Mazur
41b0190cc4 if_mvneta: Build the driver as a kernel module
Fix device detach and attach routine. Add required Makefile
to build as a module. Remove entry from GENERIC, since now
it can be loaded automatically.

Tested on EspressoBin.

Obtained from:		Semihalf
Reviewed by:		manu
Differential revision:	https://reviews.freebsd.org/D31581
2021-09-13 11:44:31 +02:00
Hubert Mazur
ee1b7811a3 e6000sw: Build the driver as a kernel module
Fix detach routine.
Driver was tested on EspressoBin.
Remove it from GENERIC, since now it can be loaded automatically.

Obtained from:		Semihalf
Reviewed by:		manu
Differential revision:	https://reviews.freebsd.org/D31580
2021-09-13 11:42:16 +02:00
Hubert Mazur
1e1253510a e6000sw: Use taskqueue subsytem for MDIO polling
Previosuly the link status was pooled in an infinite loop in a separate
kproc. Use taskqueue subsytem instead. This is a prequisite for making
this driver work as a loadable module.

Obtained from:		Semihalf
Differential revision:	https://reviews.freebsd.org/D31579
2021-09-13 11:37:11 +02:00
Bartlomiej Grzesik
d00c1f7f2f sdhci: add sysctls to dump sdhci registers and capabilites
Add sysctls dev.sdhci.X.slotY.dumpregs and dev.sdhci.X.slotY.dumpcaps
which dumps sdhci registers or capabilities.

Obtained from:		Semihalf
Reviewed by:		mw
Differential revision:	https://reviews.freebsd.org/D31406
2021-09-13 10:00:25 +02:00
Wojciech Macek
6fa041d7f1 Measure latency of PMC interruptions
Add HWPMC events to measure latency.
Provide sysctl to choose the number of outstanding events which
trigger HWPMC event.

Obtained from:		Semihalf
Sponsored by:		Stormshield
Differential revision:	https://reviews.freebsd.org/D31283
2021-09-13 06:08:32 +02:00
Mark Johnston
b864b67a0d socket: Do not include control messages in FIONREAD return value
Some system software expects to be able to read at least the number of
bytes returned by FIONREAD.  When control messages are counted in this
return value, this assumption is violated.  Follow Linux and OpenBSD
here (as well as our own kevent(EVFILT_READ)) and only return the number
of data bytes available.

Reported by:	avg
MFC after:	2 weeks
2021-09-12 16:39:44 -04:00
Michael Tuexen
29545986bd sctp: avoid LOR
Don't lock the inp-info lock while holding an stcb lock.

MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D31921
2021-09-12 21:11:14 +02:00
Michael Tuexen
4181fa2a20 sctp: minor cleanup, no functional change
MFC after:	1 week
2021-09-12 19:21:15 +02:00
Rick Macklem
55089ef4f8 nfscl: Make vfs.nfs.maxcopyrange larger by default
As of commit 103b207536, the NFSv4.2 server will limit the size
of a Copy operation based upon a 1 second timeout.  The Linux 5.2
kernel server also limits Copy operation size to 4Mbytes.
As such, the NFSv4.2 client can attempt a large Copy without
resulting in a long RPC RTT for these servers.

This patch changes vfs.nfs.maxcopyrange to 64bits and sets
the default to the maximum possible size of SSIZE_MAX, since
a larger size makes the Copy operation more efficient and
allows for copying to complete with fewer RPCs.
The sysctl may be need to be made smaller for other non-FreeBSD
NFSv4.2 servers.

MFC after:	2 weeks
2021-09-11 15:36:32 -07:00
John Baldwin
b9485d76e3 Add EPOCH_TRACE to NOTES to get LINT coverage.
Sponsored by:	The FreeBSD Foundation
2021-09-11 13:05:44 -07:00
Mark Johnston
2884918c73 aio: Fix up the opcode in aiocb32_copyin()
With lio_listio(2), the opcode is specified by userspace rather than
being hard-coded by the system call (e.g., aio_readv() -> LIO_READV).
kern_lio_listio() calls aio_aqueue() with an opcode of LIO_NOP, which
gets fixed up when the aiocb is copied in.

When copying in a job request for vectored I/O, we need to dynamically
allocate a uio to wrap an iovec.  So aiocb_copyin() needs to get the
opcode from the aiocb and then decide whether an allocation is required.
We failed to do this in the COMPAT_FREEBSD32 case.  Fix it.

Reported by:	syzbot+27eab6f2c2162f2885ee@syzkaller.appspotmail.com
Reviewed by:	kib, asomers
Fixes:	f30a1ae8d5 ("lio_listio(2):  Allow LIO_READV and LIO_WRITEV.")
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31914
2021-09-11 12:58:41 -04:00
Mark Johnston
2d5c48eccd sctp: Tighten up locking around sctp_aloc_assoc()
All callers of sctp_aloc_assoc() mark the PCB as connected after a
successful call (for one-to-one-style sockets).  In all cases this is
done without the PCB lock, so the PCB's flags can be corrupted.  We also
do not atomically check whether a one-to-one-style socket is a listening
socket, which violates various assumptions in solisten_proto().

We need to hold the PCB lock across all of sctp_aloc_assoc() to fix
this.  In order to do that without introducing lock order reversals, we
have to hold the global info lock as well.

So:
- Convert sctp_aloc_assoc() so that the inp and info locks are
  consistently held.  It returns with the association lock held, as
  before.
- Fix an apparent bug where we failed to remove an association from a
  global hash if sctp_add_remote_addr() fails.
- sctp_select_a_tag() is called when initializing an association, and it
  acquires the global info lock.  To avoid lock recursion, push locking
  into its callers.
- Introduce sctp_aloc_assoc_connected(), which atomically checks for a
  listening socket and sets SCTP_PCB_FLAGS_CONNECTED.

There is still one edge case in sctp_process_cookie_new() where we do
not update PCB/socket state correctly.

Reviewed by:	tuexen
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31908
2021-09-11 10:15:21 -04:00
Ka Ho Ng
3703c18883 md: Add MD_MUSTDEALLOC support
This adds an option to detect if hole-punching is implemented by the
underlying file system.  If this flag is set, and if the underlying file
system does not support hole-punching, md(4) fails BIO_DELETE requests
with EOPNOTSUPP.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D31883
2021-09-11 20:04:52 +08:00
John Baldwin
4d4cf62e29 cxgbei: Handle errors in PDUs.
When a PDU with an error (bad padding, header digest, or data digest)
is received, log the error via ICL_WARN() and then reset the
connection via the ic_error callback.

While here, add per-rxq counters for errors.

Sponsored by:	Chelsio Communications
2021-09-10 15:10:00 -07:00
Mark Johnston
141fe2dcee aio: Interlock with listen(2)
soo_aio_queue() did not handle the possibility that the provided socket
is a listening socket.  Up until recently, to fix this one would have to
acquire the socket lock first and check, since the socket buffer locks
were destroyed by listen(2).

Now that the socket buffer locks belong to the socket, simply check
SOLISTENING(so) after acquiring them, and make listen(2) return an error
if any AIO jobs are enqueued on the socket.

Add a couple of simple regression test cases.

Note that this fixes things only for the default AIO implementation;
cxgbe(4)'s TCP offload has a separate pru_aio_queue implementation which
requires its own solution.

Reported by:	syzbot+c8aa122fa2c6a4e2a28b@syzkaller.appspotmail.com
Reported by:	syzbot+39af117d43d4f0faf512@syzkaller.appspotmail.com
Reported by:	syzbot+60cceb9569145a0b993b@syzkaller.appspotmail.com
Reported by:	syzbot+2d522c5db87710277ca5@syzkaller.appspotmail.com
Reviewed by:	tuexen, gallatin, jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31901
2021-09-10 17:21:11 -04:00
Mark Johnston
74a68313b5 socket: Add macros to lock socket buffers using socket references
Since commit c67f3b8b78 the sockbuf
mutexes belong to the containing socket.  Sockbufs contain a pointer to
a mutex, which by default is initialized to the corresponding mutexes in
the socket.  The SOCKBUF_LOCK() etc. macros operate on this pointer.
However, the pointer is clobbered by listen(2) so it's not safe to use
them unless one is sure that the socket is not a listening socket.

This change introduces a new set of macros which lock socket buffers
through the socket.  This is a bit cheaper since it removes the pointer
indirection, and allows one to safely lock socket buffers and then check
for a listening socket.

For MFC, these macros should be reimplemented in terms of the existing
socket buffer layout.

Reviewed by:	tuexen, gallatin, jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31900
2021-09-10 17:20:39 -04:00
Gleb Smirnoff
89042ff776 ng_l2tp: improve callout locking.
Apparently e62e4b8594 wasn't enough to close the race between
a queue being flushed by a packet and callout executing, because
the callouts used without a lock aren't 100% bulletproof. To close
the race use callout_init_mtx() for L2TP timers, and make sure that
all calls to ng_callout()/ng_uncallout() are done under the seq lock.

If used properly, a locked callout can be used transparently with
old netgraph KPI of ng_callout/ng_uncallout which predates locked
callouts.

While here, utilize ng_uncallout_drain() instead of ng_uncallout()
on the node shutdown.

PR:			241133
Reviewed by:		mjg, markj
Differential Revision:	https://reviews.freebsd.org/D31476
2021-09-10 11:27:19 -07:00
Gleb Smirnoff
0a76c63dd4 ng_l2tp: improve seq structure locking.
Cover few cases of access to seq without lock missed in 702f98951d.
There are no known bugs fixed with this change, however. With INVARIANTS
embed ng_l2tp_seq_check() into lock/unlock macros. Slightly reduce number
of locks/unlocks per packet keeping the lock between functions.

Reviewed by:		mjg, markj
Differential Revision:	https://reviews.freebsd.org/D31476
2021-09-10 11:27:13 -07:00
Gleb Smirnoff
b2954f0a8f netgraph: add ng_uncallout_drain().
Move shared code into ng_uncallout_internal(). While here add a comment
mentioning a problem with scheduled+executing callout.

Reviewed by:		mjg, markj
Differential Revision:	https://reviews.freebsd.org/D31476
2021-09-10 11:27:04 -07:00
Gleb Smirnoff
26cf4b53d9 netgraph: pass return value from callout_stop() unmodified to callers of
ng_uncallout. Most of them do not check it anyway, so very little node
changes are required.

Reviewed by:		mjg, markj
Differential Revision:	https://reviews.freebsd.org/D31476
2021-09-10 11:26:59 -07:00
Kristof Provost
9bdff593ea pf: fix NOINET6 builds
MFC after:	1 week
Sponsored by:	Modirum MDPay
2021-09-10 18:15:44 +02:00
Kristof Provost
b64f7ce98f pf: qid and pqid can be uint16_t
tag2name() returns a uint16_t, so we don't need to use uint32_t for the
qid (or pqid). This reduces the size of struct pf_kstate slightly. That
in turn buys us space to add extra fields for dummynet later.

Happily these fields are not exposed to user space (there are user space
versions of them, but they can just stay uint32_t), so there's no ABI
breakage in modifying this.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31873
2021-09-10 17:07:57 +02:00
Mark Johnston
6d042d7c86 wpi: Fix a lock leak in an error path in wpi_run()
PR:		258243
Reported by:	dinghao.liu@zju.edu.cn
MFC after:	1 week
2021-09-10 10:03:51 -04:00
orange30
f5777c123a net: Fix memory leaks upon arp_fillheader() failures
Free memory before return from arprequest_internal().  In in_arpinput(),
if arp_fillheader() fails, it should use goto drop.

Reviewed by:	melifaro, imp, markj
MFC after:	1 week
Pull Request:	https://github.com/freebsd/freebsd-src/pull/534
2021-09-10 09:45:26 -04:00
Kristof Provost
0a51d74c3a pf: fix synproxy to local
When we're synproxy-ing a connection that's going to us (as opposed to a
forwarded one) we wound up trying to send out the pf-generated tcp
packets through pf_intr(), which called ip(6)_output(). That doesn't
work all that well for packets that are destined for us, so in that case
we must call ip(6)_input() instead.

MFC after:	1 week
Sponsored by:   Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31853
2021-09-10 15:16:37 +02:00
Mark Johnston
10eb2a2bde ipsec: Validate the protocol identifier in ipsec4_ctlinput()
key_allocsa() expects to handle only IPSec protocols and has an
assertion to this effect.  However, ipsec4_ctlinput() has to handle
messages from ICMP unreachable packets and was not validating the
protocol number.  In practice such a packet would simply fail to match
any SADB entries and would thus be ignored.

Reported by:	syzbot+6a9ef6fcfadb9f3877fe@syzkaller.appspotmail.com
Reviewed by:	ae
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31890
2021-09-10 09:09:00 -04:00
Mark Johnston
b1e6a792d6 net: Enter a net epoch around protocol if_up/down notifications
When traversing a list of interface addresses, we need to be in a net
epoch section, and protocol ctlinput routines need a stable reference to
the address.

Reported by:	syzbot+3219af764ead146a3a4e@syzkaller.appspotmail.com
Reviewed by:	kp, melifaro
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31889
2021-09-10 09:07:40 -04:00
Kyle Evans
35aa1d6e45 kern: drop remaining references to removed makesyscalls.sh
This was accidentally omitted from the recent removal of makeyscalls.sh.

Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D30250
2021-09-09 19:40:54 -05:00
Colin Percival
cd165c8bf0 x86/tsc.c: Add TSLOG to test_tsc
On my benchmark system this takes ~ 14 ms; enough to be worth
recording in the boot time profile.
2021-09-09 17:02:15 -07:00
Vladimir Kondratyev
38d2e9314b hkbd(4): Fix build on 32bit platforms
MFC after:	2 weeks
2021-09-10 01:51:25 +03:00