Commit graph

145614 commits

Author SHA1 Message Date
Nick Reilly
bfeef0d32a pf: fix pfi_ifnet leak on interface removal
The detach of the interface and group were leaving pfi_ifnet memory
behind. Check if the kif still has references, and clean it up if it
doesn't

On interface detach, the group deletion was notified first and then a
change notification was sent. This would recreate the group in the kif
layer. Reorder the change to before the delete.

PR:		257218
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D37569
2022-12-14 10:19:01 +01:00
Mateusz Guzik
e6fc01f6be tcp: whack the stale declaration of rack_timer_stop
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-12-14 08:48:52 +00:00
Kristof Provost
a002c839ec if_ovpn: cleanup offsetof() use
Move the use of the `offsetof(struct ovpn_counters, fieldname) /
sizeof(uint64_t)` construct into a macro.
This removes a fair bit of code duplication and should make things a
little easier to read.

Reviewed by:	zlei
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37607
2022-12-14 06:48:59 +01:00
Kristof Provost
c357bf397f if_ovpn: include peer counters in a OVPN_NOTIF_DEL_PEER message
When we remove a peer userspace can no longer retrieve its counters. To
ensure that userspace can get a full count of the entire session we now
include the counters in the deletion message.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37606
2022-12-14 06:48:59 +01:00
Kristof Provost
92f0cf77db if_ovpn: allow peer lookup by vpn4/vpn6 address
Introduce two more RB_TREEs so that we can look up peers by their peer
id (already present) or vpn4 or vpn6 address.
This removes the last linear scan of the peer list.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37605
2022-12-14 06:48:59 +01:00
Kristof Provost
8b630fa9ef if_ovpn: implement OVPN_GET_PEER_STATS
Allow userspace to retrieve per-peer traffic stats.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37604
2022-12-14 06:48:58 +01:00
Kristof Provost
18a30fd39b if_ovpn: start tracking per-peer packets/bytes in/out
OpenVPN will introduce a mechanism to retrieve per-peer statistics.
Start tracking those so we can return them to userspace when queried.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37603
2022-12-14 06:48:58 +01:00
Kristof Provost
66de89d4c2 if_ovpn: remove OVPN_SEND_PKT
OpenVPN userspace no longer uses the ioctl interface to send control
packets. It instead uses the socket directly.
The use of OVPN_SEND_PKT was never released, so we can remove this
without worrying about compatibility.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37602
2022-12-14 06:48:58 +01:00
Gleb Smirnoff
1b5895c624 tcp: remove a 4.4BSD relic
The actual code to modify this counter was disabled in 2c37256e5a
and later removed in d0390e0570.
2022-12-13 20:21:45 -08:00
Li-Wen Hsu
d969aeab73
Complete retire cp(4)
Sponsored by:	The FreeBSD Foundation
2022-12-14 11:42:36 +08:00
Li-Wen Hsu
611c6b233d
Complete retire cp(4)
And fix the LINT build.

Sponsored by:	The FreeBSD Foundation

Fixes:	895992bb66
2022-12-14 11:38:55 +08:00
Gleb Smirnoff
5050df3f4a tcp: fix counter leak for SYN_RCVD state when syncache_socket() fails
The SYN_RCVD state count is tricky here due to default code path and TFO
being so different.  In the default case the count is incremented when a
syncache entry is added to the the database in syncache_insert().  Later
when connection transitions from syncache entry to a socket in
syncache_expand(), this counter is inherited by the tcpcb.  If socket or
tcpcb allocation failed in syncache_socket() failed the syncache_expand()
is responsible for decrement.  In the TFO case the syncache entry is not
inserted into database and count of SYN_RCVD is first incremented in the
syncache_tfo_expand() after successful socket allocation.  Thus, inside
syncache_socket() we can't tell whether we need to decrement in a case of
a failure or not.  The caller is responsible for this book keeping.

Fixes:	07285bb4c2
Differential revision:	https://reviews.freebsd.org/D37610
2022-12-13 19:31:05 -08:00
Mateusz Guzik
e6bc24b038 kref: switch internal type to atomic_t and bring back const to kref_read
This unbreak drm-kmod build.

the const is part of Linux API

Unfortunately drm-kmod uses hand-rolled refcount* calls on a kref
object. For now go the easy route of keeping it operational by casting
stuff internally.

The general goal here is to make FreeBSD refcount API use an opaque
type, hence the ongoing removal of hand-rolled accesses.

Reported by:	emaste
2022-12-13 20:46:58 +00:00
Ed Maste
895992bb66 retire cp(4) driver
Sync serial (e.g. T1/T1/G.703) interfaces are obsolete, this driver
includes obfuscated source, and has reported potential security issues.

Differential Revision:  https://reviews.freebsd.org/D33468
2022-12-13 15:24:52 -05:00
Ed Maste
76f6751844 retire ce(4) driver
Sync serial (e.g. T1/T1/G.703) interfaces are obsolete, this driver
includes obfuscated source, and has reported potential security issues.

Differential Revision:	https://reviews.freebsd.org/D33467
2022-12-13 15:24:25 -05:00
Ed Maste
20dfe27b2d Add deprecation notices to ce,cp sync serial drivers
And the related sconfig utility.  Sync serial (e.g. E1/T1) interfaces
are obsolete, and nobody responded to several inquires on the mailing
lists about use of these drivers.

Relnotes:	Yes
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23928
2022-12-13 14:59:08 -05:00
Mateusz Guzik
67e628b7a6 kref: replace hand-rolled atomic ops with refcount API
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D37608
2022-12-13 09:24:57 +00:00
Søren Schmidt
85e7d8e0ae Add driver for Motorcomm YT8511 GbE PHY
Partially from:	https://reviews.freebsd.org/D36093
2022-12-13 05:58:51 +00:00
Warner Losh
2e1e68cbae stand: Make ioctl declaration consistent
It typically had two args with an optional third from the userland
declaration in sys/ioccom.h. However, the funciton definition used a
non-optional char * argument. This mismatch is UB behavior (but worked
due to the calling convetions of all our machines).

Instead, add a declaration for ioctl to stand.h, make the third arg
'void *' which is a better match to the ... declaration before. This
prevents the convert int * -> char * errors as well. Make the ioctl
user-space declaration truly user-space specific (omit it in the
stand-alone build).

No functional change intended.

Sponsored by:		Netflix
Reviewed by:		emaste
Differential Revision:	https://reviews.freebsd.org/D37680
2022-12-12 21:46:34 -07:00
Alan Cox
f0878da03b pmap: standardize promotion conditions between amd64 and arm64
On amd64, don't abort promotion due to a missing accessed bit in a
mapping before possibly write protecting that mapping.  Previously,
in some cases, we might not repromote after madvise(MADV_FREE) because
there was no write fault to trigger the repromotion.  Conversely, on
arm64, don't pointlessly, yet harmlessly, write protect physical pages
that aren't part of the physical superpage.

Don't count aborted promotions due to explicit promotion prohibition
(arm64) or hardware errata (amd64) as ordinary promotion failures.

Reviewed by:	kib, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36916
2022-12-12 11:32:50 -06:00
Søren Schmidt
896d3e43b1 Add driver for Rockchip One Time Programmable (OTP) device.
This driver created the possibility to assign fixed MAC adresses to eqos devices.
2022-12-12 14:34:18 +00:00
Mark Johnston
5108879730 bridge: Fix a potential memory leak in bridge_enqueue()
A comment at the beginning of the function notes that we may be
transmitting multiple fragments as distinct packets.  So, the function
loops over all fragments, transmitting each mbuf chain.  If if_transmit
fails, we need to free all of the fragments, but m_freem() only frees an
mbuf chain - it doesn't follow m_nextpkt.

Change the error handler to free each untransmitted packet fragment, and
count each fragment as a separate error since we increment OPACKETS once
per fragment when transmission is successful.

Reviewed by:	zlei, kp
MFC after:	1 week
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D37635
2022-12-11 11:41:12 -05:00
Jason A. Harmening
0ef861e6f4 nullfs: adopt VV_CROSSLOCK
When the lower filesystem directory hierarchy is the same as the nullfs
mount point (admittedly not likely to be a useful situation in
practice), nullfs is subject to the exact deadlock between the busy
count drain and the covered vnode lock that VV_CROSSLOCK is intended
to address.

Reviewed by:	kib
Tested by:	pho
Differential Revision: https://reviews.freebsd.org/D37458
2022-12-10 22:02:39 -06:00
Jason A. Harmening
5cec725cd3 unionfs: allow recursion on covered vnode lock during mount/unmount
When taking the covered vnode lock during mount and unmount operations,
specify LK_CANRECURSE as the existing lock state of the covered vnode
is not guaranteed (AFAIK) either by assertion or documentation for
these code paths.

For the mount path, this is done only for completeness as the covered
vnode lock is not currently held when VFS_MOUNT() is called.
For the unmount path, the covered vnode is currently held across
VFS_UNMOUNT(), and the existing code only happens to work when unionfs
is mounted atop FFS because FFS sets LO_RECURSABLE on its vnode locks.

This of course doesn't cover a hypothetical case in which the covered
vnode may be held shared, but for the mount and unmount paths such a
scenario seems unlikely to materialize.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D37458
2022-12-10 22:02:38 -06:00
Jason A. Harmening
42442d7a6e Generalize the VV_CROSSLOCK logic in vfs_lookup()
When VV_CROSSLOCK is present, the lock for the vnode at the current
stage of lookup must be held across the VFS_ROOT() call for the
filesystem mounted at the vnode.  Since VV_CROSSLOCK implies that
the root vnode reuses the already-held lock, the possibility for
recursion should be made clear in the flags passed to VFS_ROOT().

For cases in which the lock is held exclusive, this means passing
LK_CANRECURSE.  For cases in which the lock is held shared, it
means clearing LK_NODDLKTREAT to allow VFS_ROOT() to potentially
recurse on the shared lock even in the presence of an exclusive
waiter.

That the existing code works for unionfs is due to a coincidence
of the current unionfs implementation.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D37458
2022-12-10 22:02:38 -06:00
Mateusz Guzik
bc2ccf0e4f mtx: retire PARTIAL_PICKUP_GIANT
It does not appear to have ever been used.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-12-11 03:26:23 +00:00
Mateusz Guzik
ebdf27b6f3 uipc: remove accept_mtx
It is unused since 779f106aa1 ("Listening sockets improvements.")

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-12-11 02:47:07 +00:00
John Baldwin
af3b48e101 vmm: Free vCPUs when destroying them.
Reported by:	andrew
Reviewed by:	corvink, andrew, markj
Differential Revision:	https://reviews.freebsd.org/D37649
2022-12-09 10:27:05 -08:00
John Baldwin
d212d6ebb4 vmm: Avoid infinite loop in vcpu_lock_all error case.
Reported by:	Coverity (CIDs 1501060,1501071)
Reviewed by:	corvink, markj, emaste
Differential Revision:	https://reviews.freebsd.org/D37648
2022-12-09 10:26:49 -08:00
John Baldwin
91980db1be vmm: Don't lock a vCPU for VM_PPTDEV_MSI[X].
These are manipulating state in a ppt(4) device none of which is
vCPU-specific.  Mark the vcpu fields in the relevant ioctl structures
as unused, but don't remove them for now.

Reviewed by:	corvink, markj
Differential Revision:	https://reviews.freebsd.org/D37639
2022-12-09 10:26:23 -08:00
John Baldwin
62be9ffd82 vmm: VM_GET/SET_KERNEMU_DEV should run with the vCPU locked.
Reviewed by:	corvink, kib, markj
Differential Revision:	https://reviews.freebsd.org/D37638
2022-12-09 10:25:30 -08:00
Konstantin Belousov
645510e62e Provide consistent prototype for swp_pager_meta_free()
This should fix 32bit build breakage.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2022-12-09 17:23:09 +02:00
Konstantin Belousov
0919f29d91 shmfd: account for the actually allocated pages
Return the value as stat(2) st_blocks.

Suggested and reviewed by:	markj (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:17:12 +02:00
Konstantin Belousov
37aea2649f tmpfs: for used pages, account really allocated pages, instead of file sizes
This makes tmpfs size accounting correct for the sparce files. Also
correct report st_blocks/va_bytes. Previously the reported value did not
accounted for the swapped out pages.

PR:	223015
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:17:12 +02:00
Konstantin Belousov
cd086696c2 vm_pager_allocate(): override resulting object type
For dynamically allocated pager type, which inherits the parent's alloc
method, type of the returned object is set to the parent's type
otherwise.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:17:03 +02:00
Konstantin Belousov
ec201dddfb vm_pager: add method to veto page allocation
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:15:37 +02:00
Konstantin Belousov
d537d1f12e vm_pager: add methods for page insertion and removal notifications
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:15:37 +02:00
Konstantin Belousov
d9dc64f158 tmpfs: make vm_object point to the tmpfs node instead of vnode
The vnode could be reclaimed and allocated again during the lifecycle of
the node, but the node cannot.  Also, referencing the node would allow
to reach it and tmpfs mount data from the object, regardless of the
state of the possibly absent vnode.

Still use swp_tmpfs for back-pointer, instead of using handle. Use of
named swap objects would incur taking the sw_alloc_sx on node allocation
and deallocation.

swp_tmpfs is renamed to swp_priv to remove the last bit of tmpfs in vm/.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:15:37 +02:00
Konstantin Belousov
baa1ccceef Make swap_pager_freespace() global
also make it return the count of the swap pages freed, which are not
simultaneously resident in the object.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:15:37 +02:00
Konstantin Belousov
83aff0f08c Add 'show tmpfs' ddb command
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:15:37 +02:00
Konstantin Belousov
e77f2f9dc6 tmpfs: minor style
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:15:37 +02:00
Konstantin Belousov
7ec4b29b08 uiomove_object: hide diagnostic under bootverbose
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:15:37 +02:00
Alexander V. Chernikov
1bcd230f95 netlink: add interface notification on link status / flags change.
* Add link-state change notifications by subscribing to ifnet_link_event.
 In the Linux netlink model, link state is reported in 2 places: first is
 the IFLA_OPERSTATE, which stores state per RFC2863.
 The second is an IFF_LOWER_UP interface flag. As many applications rely
 on the latter, reserve 1 bit from if_flags, named as IFF_NETLINK_1.
 This flag is mapped to IFF_LOWER_UP in the netlink headers. This is done
 to avoid making applications think this flag is actually
 supported / presented in non-netlink outputs.
* Add flag change notifications, by hooking into rt_ifmsg().
 In the netlink model, notification should include the bitmask for the
 change flags. Update rt_ifmsg() to include such bitmask.

Differential Revision: https://reviews.freebsd.org/D37597
2022-12-09 11:20:07 +00:00
Ka Ho Ng
e28932c643 vfs: Add spare fileops function pointer slots
This allows backporting of new fileops function pointers while
preserving KBI.

Bump __FreeBSD_version.

Sponsored by:	Juniper Networks, Inc.
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D37636
2022-12-08 20:45:47 -05:00
Mark Johnston
84d7fe4a6f kinst: Add per-CPU interrupt trampolines
In the common case, kinst emulates a traced instruction by copying it to
a trampoline, where it is followed by a jump back to the original code,
and pointing the interrupted thread's %rip at the trampoline.  In
particular, the trampoline is executed with the same CPU context as the
original instruction, so if interrupts are enabled at the point where
the probe fires, they will be enabled when the trampoline is
subsequently executed.

It can happen that an interrupt is raised while a thread is executing a
kinst trampoline.  In that case, it is possible that the interrupt
handler will trigger a kinst probe, so we must ensure that the thread
does not recurse and overwrite its trampoline before it is finished
executing the original contents, otherwise an attempt to trace code
called from interrupt handlers can crash the kernel.

To that end, add a per-CPU trampoline, used when the probe fired with
interrupts disabled.  Note that this is not quite complete since it does
not handle the possibility of kinst probes firing while executing an NMI
handler.

Also ensure that we do not trace instructions which set IF, since in
that case it is not clear which trampoline (the per-thread trampoline or
the per-CPU trampoline) we should use, and since such instructions are
rare.

Reported and tested by:	Domagoj Stolfa
Reviewed by:	christos
Fixes:		f0bc4ed144 ("kinst: Initial revision")
Differential Revision:	https://reviews.freebsd.org/D37619
2022-12-08 15:03:51 -05:00
Mark Johnston
778b74378f kinst: Make the provider ops table const
No functional change intended.
2022-12-08 15:03:41 -05:00
Mark Johnston
ad5d6f38c4 kinst: Correct a comment
Fixes:	f0bc4ed144 ("kinst: Initial revision")
2022-12-08 15:03:32 -05:00
Doug Rabson
5eeb4f737f imgact_binmisc: Optionally pre-open the interpreter vnode
This allows the use of chroot and/or jail environments which depend on
interpreters registed with imgact_binmisc to use emulator binaries from
the host to emulate programs inside the chroot.

Reviewed by:	imp
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D37432
2022-12-08 14:32:03 +00:00
John Baldwin
4eda35b636 atp: Fix mismatch in array bounds.
Reported by:	GCC -Warray-parameter
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D37552
2022-12-07 12:33:56 -08:00
John Baldwin
183783d34a if_rsu: Fix mismatches in array bounds.
Reported by:	GCC -Warray-parameter
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D37551
2022-12-07 12:33:41 -08:00