Its intent is to do the initialization of the future part of struct nameidata
which should be used across several namei() and VOPs. Right now it is NOP.
Reviewed by: mckusick
Discussed with: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30041
Now that the upper layers all go through a layer to tie into these
information functions that translates an sbuf into char * and len. The
current interface suffers issues of what to do in cases of truncation,
etc. Instead, migrate all these functions to using struct sbuf and these
issues go away. The caller is also in charge of any memory allocation
and/or expansion that's needed during this process.
Create a bus_generic_child_{pnpinfo,location} and make it default. It
just returns success. This is for those busses that have no information
for these items. Migrate the now-empty routines to using this as
appropriate.
Document these new interfaces with man pages, and oversight from before.
Reviewed by: jhb, bcr
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29937
The new buffer is somewhat larger, but there should be no functional
changes.
Reviewed By: kib, imp
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D30821
The i386 loader (and hopefully others to come) now passes tslog data
as a "preloaded module". Include this in the data returned by the
debug.tslog sysctl.
Reviewed by: kevans
armv6 and armv7 systems already were 1000Hz. The other armv5 were a
mix of 100 and 1000. This changes them to 1000. Should there be
issues, we can add options HZ=100 to the systems that have bad
performance at the drop of a hat.
mips is a lot more complicated. But most of the systems are already
1000HZ. The hardware exceptions are all fast enough to run at
1000Hz. MALTA is our primary emulator, and history has shown emulators
tend to like 100Hz better, so run those systems at 100Hz. As with arm,
any system that shows a huge performance regression can reverted to
100Hz easily.
This was going to be committed well in advance of the 13 branch, but
it was delayed and forgotten til now.
Discussed on: #bsdmips ages ago
Sponsored by: Netflix
The TOE driver might receive decrypted TLS records that are enqueued
to the socket buffer after ktls_try_toe() returns and before
ktls_enable_rx() locks the receive buffer to call sb_mark_notready().
In that case, sb_mark_notready() would incorrectly treat the decrypted
TLS record as an encrypted record and schedule it for decryption.
This always resulted in the connection being dropped as the data in
the control message did not look like a valid TLS header.
To fix, don't try to handle software decryption of existing buffers in
the socket buffer for TOE TLS in ktls_enable_rx(). If a TOE TLS
driver needs to decrypt existing data in the socket buffer, the driver
will need to manage that in its tod_alloc_tls_session method.
Sponsored by: Chelsio Communications
It seems that Linux does not dequeue siginfo for SIGCHLD when wait*(2)
reports status of the running process. In particular, sigwaitinfo(2)
and other signal querying syscalls can observe the siginfo after wait.
FreeBSD dequeued siginfo from the beginning, so we cannot change the
default ABI to be more compatible. Still, add a knob to enable to
change to the other behavior for debugging purposes.
Reported by: dchagin
Reviewed by: dchagin, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30675
Traditionally, BSD drops signals with the default action during send,
not even putting them to the destination process queue. This semantic
is not shared with other operating systems (Linux), which do queue
such signals. In particular, sigtimedwait(2) and related syscalls can
observe the delivery.
Add a global knob kern.sig_discard_ign which can be set to false to force
enqueuing of the signals with default action. Also add an ABI flag to
indicate that signals should be queued.
Note that it is not practical to run with the knob turned on, because almost
all software that care about the delivery of such signals, is aware of the
difference, and misbehaves if the signals are actually queued. The purpose
of the knob as is is to allow for easier diagnostic of the programs that
need the adjustments, to confirm the cause of problem.
Reported by: dchagin
Reviewed by: dchagin, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30675
Some code was using it already, but in many places we were testing
SO_ACCEPTCONN directly. As a small step towards fixing some bugs
involving synchronization with listen(2), make the kernel consistently
use SOLISTENING(). No functional change intended.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
uipc_ktls.c was missing opt_ratelimit.h, so it was
never noticing that RATELIMIT was enabled. Once it was
enabled, it failed to compile as ktls_modify_txrtlmt()
had accrued a compilation error when it was not being
compiled in.
Sponsored by: Netflix
The kern_poll_kfds() operates on clear kernel data, kfds points to an
array in the kernel, while kern_poll() operates on user supplied pollfd.
Move nfds check to kern_poll_maxfds().
No functional changes, it's for future use in the Linux emulation layer.
Reviewd by: kib
Differential Revision: https://reviews.freebsd.org/D30690
MFC after: 2 weeks
The eventfd code was written by me, rdivacky@ copyrigth applicable only
to epoll part of the Linuxulator code. Roman is ok to retire his copyright
from sys/kern/sys_eventfd.c and 'All rights reserved.' lines from
sys/compat/linux/linux_event.[c|h] and sys/kern/sys_eventfd.c files.
Reviewed by: kib, emaste
Approved by: rdivacky
Differential Revision: https://reviews.freebsd.org/D30677
MFC after: 2 weeks
It was meant to suppress only the printf(), not the subsequent injection
of Giant-protected thunks for various file operations.
Fixes: fbeb4ccac9
Reported by: pho
Tested by: pho
MFC after: 6 days
Pointy hat: markj
During boot we warn that the kbd and openfirm drivers are Giant-locked
and may be deleted. Generally, the warning helps signal that certain
old drivers are not being maintained and are subject to removal, but
this doesn't really apply to certain drivers which are harder to
detangle from Giant.
Add a flag, D_GIANTOK, that devices can specify to suppress the
misleading warning. Use it in the kbd and openfirm drivers.
Reviewed by: imp, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30649
This is aimed at preventing stacked filesystems like nullfs and unionfs
from "losing" their lower mounts due to forced unmount. Otherwise,
VFS operations that are passed through to the lower filesystem(s) may
crash or otherwise cause unpredictable behavior.
Introduce two new functions: vfs_pin_from_vp() and vfs_unpin().
which are intended to be called on the lower mount(s) when the stacked
filesystem is mounted and unmounted, respectively.
Much as registration in the mnt_uppers list previously did, pinning
will prevent even forced unmount of the lower FS and will allow the
stacked FS to freely operate on the lower mount either by direct
use of the struct mount* or indirect use through a properly-referenced
vnode's v_mount field.
vfs_pin_from_vp() is modeled after vfs_ref_from_vp() in that it uses
the mount interlock coupled with re-checking vp->v_mount to ensure
that it will fail in the face of a pending unmount request, even if
the concurrent unmount fully completes.
Adopt these new functions in both nullfs and unionfs.
Reviewed By: kib, markj
Differential Revision: https://reviews.freebsd.org/D30401
ca1ce50b2b ("vfs: add more safety against concurrent forced
unmount to vn_write") has a side effect of only checking MNT_SYNCHRONOUS
if O_FSYNC is set.
Reviewed By: mjg
Differential Revision: https://reviews.freebsd.org/D30610
Currently, this will still hash the default (all zero) hostuuid and
potentially arrive at a MAC address that has a high chance of collision
if another interface of the same name appears in the same broadcast
domain on another host without a hostuuid, e.g., some virtual machine
setups.
Instead of using the default hostuuid, just treat it as a failure and
generate a random LA unicast MAC address.
Reviewed by: bz, gbe, imp, kbowling, kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29788
Currently passing PDROP to the quisce_cpus() function does not make sense.
Add special meaning for it, by not waiting for the idle thread to schedule.
Also avoid allocating u_int[MAXCPU] on the stack.
Reviewed by: hselasky, markj
Sponsored by: Mellanox Technologies/NVidia Networking
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30468
kld_sx is dropped e.g. for executing sysinits, which allows user
to initiate kldunload while module is not yet fully initialized.
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D30456
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Return EBUSY instead and let caller to handle the issue.
For vgone()/vnode reclamation, caller first does vinvalbuf(V_SAVE),
which return EBUSY in case dirty buffers where not flushed. Then caller
calls vinvalbuf(0) due to non-zero return, which gets rid of all dirty
buffers without dependencies.
PR: 238565
Reviewed by: asomers, mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30555
Instead of requiring all implementations of vfs_quotactl to unbusy
the mount for Q_QUOTAON and Q_QUOTAOFF, add an "mp_busy" in/out param
to VFS_QUOTACTL(9). The implementation may then indicate to the caller
whether it needed to unbusy the mount.
Also, add stbool.h to libprocstat modules which #define _KERNEL
before including sys/mount.h. Otherwise they'll pull in sys/types.h
before defining _KERNEL and therefore won't have the bool definition
they need for mp_busy.
Reviewed By: kib, markj
Differential Revision: https://reviews.freebsd.org/D30556
Parts of libprocstat like to pretend they're kernel components for the
sake of including mount.h, and including sys/types.h in the _KERNEL
case doesn't fix the build for some reason. Revert both the
VFS_QUOTACTL() change and the follow-up "fix" for now.
Instead of requiring all implementations of vfs_quotactl to unbusy
the mount for Q_QUOTAON and Q_QUOTAOFF, add an "mp_busy" in/out param
to VFS_QUOTACTL(9). The implementation may then indicate to the caller
whether it needed to unbusy the mount.
Reviewed By: kib, markj
Differential Revision: https://reviews.freebsd.org/D30218
ktrace(2) may toggle trace points in any of
1. a single process
2. all members of a process group
3. all descendents of the processes in 1 or 2
In the first two cases, we do not permit the operation if the process is
being forked or not visible. However, in case 3 we did not enforce this
restriction for descendents. As a result, the assertions about the child
in ktrprocfork() may be violated.
Move these checks into ktrops() so that they are applied consistently.
Allow KTROP_CLEAR for nascent processes. Otherwise, there is a window
where we cannot clear trace points for a nascent child if they are
inherited from the parent.
Reported by: syzbot+d96676592978f137e05c@syzkaller.appspotmail.com
Reported by: syzbot+7c98fcf84a4439f2817f@syzkaller.appspotmail.com
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30481
Previously, a negative change list length would be treated the same as
an empty change list. A negative event list length would result in
bogus copyouts. Make kevent(2) return EINVAL for both cases so that
application bugs are more easily found, and to be more robust against
future changes to kevent internals.
Reviewed by: imp, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30480
ktrstructarray() may be used to create copies of kevent(2) change and
event arrays. It is called before parameter validation is done and so
should check for bogus array lengths before allocating a copy.
Reported by: syzkaller
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30479
This makes it possible to call __elfN(size_segments) and __elfN(puthdr)
from Linux coredump code.
Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D30455
This avoids creating a duplicate copy on the stack just to
append the trailer.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30139
This removes support for loadable software backends. The KTLS OCF
support is now always included in kernels with KERN_TLS and the
ktls_ocf.ko module has been removed. The software encryption routines
now take an mbuf directly and use the TLS mbuf as the crypto buffer
when possible.
Bump __FreeBSD_version for software backends in ports.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30138
This is intended for use in KTLS transmit where each TLS record is
described by a single mbuf that is itself queued in the socket buffer.
Using the existing CRYPTO_BUF_MBUF would result in
bus_dmamap_load_crp() walking additional mbufs in the socket buffer
that are not relevant, but generating a S/G list that potentially
exceeds the limit of the tag (while also wasting CPU cycles).
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30136
This function appends the contents of a single mbuf to an sglist
rather than an entire mbuf chain.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30135
This function doesn't only copy data into a uio but instead is a
variant of uiomove() similar to uiomove_fromphys().
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30444