Commit graph

11782 commits

Author SHA1 Message Date
Mark Johnston
50cf88be6f jail: Make prison_owns_vnet() operate on a prison instead of a ucred
This will be useful in an upcoming change.  No functional change
intended.

Reviewed by:	jamie
MFC after:	2 weeks
Sponsored by:	Stormshield
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D51524

(cherry picked from commit 748a4ea1caffca48c4949d5a7b964853c44fbdae)
2025-09-08 12:00:14 +02:00
Kristof Provost
4d2a165967 if_ovpn: support floating clients
If a client changes its IP address notify userspace of this.

The UDP filtering function supplies the remote IP address, so we check if the
address changed there. If so, we tag the packet with the new address. Once the
packet is decrypted (and as part of that, has had its signature checked) we
can commit to the address change. Take the write lock and notify userspace of
the change.

Reviewed by:	markj
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D51468

(cherry picked from commit 9c52600a5a150117b4396df3b868cf2516e1674c)
2025-09-08 10:20:17 +02:00
Olivier Certner
a29140025a Internal scheduling priorities: Always use symbolic ones
Replace priorities specified by a base priority and some hardcoded
offset value by symbolic constants.  Hardcoded offsets prevent changing
the difference between priorities without changing their relative
ordering, and is generally a dangerous practice since the resulting
priority may inadvertently belong to a different selection policy's
range.

Since RQ_PPQ is 4, differences of less than 4 are insignificant, so just
remove them.  These small differences have not been changed for years,
so it is likely they have no real meaning (besides having no practical
effect).  One can still consult the changes history to recover them if
ever needed.

No functional change (intended).

MFC after:      1 month
Event:          Kitchener-Waterloo Hackathon 202506
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D45390

(cherry picked from commit 8ecc41918066422d6788a67251b22d11a6efeddf)
2025-07-31 12:42:21 +02:00
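Note on a29140025a: the claim that differences of less than RQ_PPQ are insignificant can be illustrated with a rough sketch, assuming a priority is mapped to a run queue by integer division by RQ_PPQ. The value 4 comes from the commit message; runq_index() and the sample priority 80 are made up for illustration.

```c
#define	RQ_PPQ	4	/* priorities per run queue, per the commit message */

/* Hypothetical helper: map a priority to a run-queue index. */
static int
runq_index(int pri)
{
	return (pri / RQ_PPQ);
}
```

With a base priority that is a multiple of 4, offsets of 1 to 3 land in the same queue as the base, so they do not change which queue the thread is selected from.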
Lexi Winter
b69907d463 jail: add allow.routing jail permission
If allow.routing is set, the jail can modify the system routing table
even if it's not a VNET jail.

Reviewed by:	kevans, des, adrian
Approved by:	kevans (mentor), des (mentor)
Differential Revision:	https://reviews.freebsd.org/D49843

(cherry picked from commit 3a53fe2cc4b7076003163376a7db65e432f6283e)
2025-07-09 10:05:40 +02:00
Olivier Certner
de6a4418ee sysctl(9): Ease exporting struct sizes; Discourage doing that
Introduce two helpers, the more general SYSCTL_SIZEOF() and
a struct-specific one SYSCTL_SIZEOF_STRUCT() which prepends 'struct' in
the description and in the use of sizeof() but uses the raw structure
name as the knob's name.  The size of the object/structure is exported
under 'debug.sizeof'.

Existing knobs under 'debug.sizeof' were all converted to use the
helpers.

Add a note before the helpers discouraging the introduction of new
leaves for ad-hoc reasons.  List alternative means for developers to
obtain the size of arbitrary kernel structures easily (thanks to markj@
for providing these).

No functional change (intended).

Reviewed by:    kib, markj
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D50121

(cherry picked from commit 713abc9880aabe0ff924ff644bceb6ff404ed3cd)
(cherry picked from commit efce9f8a510b60736994e50288b78fc7b32b5d90)

Approved by:    re (cperciva)
2025-05-13 14:41:33 +02:00
Colin Percival
3ce28e0624 14.3: create releng/14.3 branch
Update from PRERELEASE to BETA1
Switch pkg(8) configuration to use the quarterly repository
Bump __FreeBSD_version

Approved by:	re (implicit)
Sponsored by:	Amazon
2025-05-02 00:00:00 +00:00
Olivier Certner
ad1f2e80c8 kassert.h: Explicitly include <sys/types.h> on _KERNEL
MFCing commit "kassert: Explicitly include <sys/_types.h>" (as
3772cb0620) was not enough as, on stable/14 and contrary to main,
<sys/kassert.h> also declares a 'panicked' variable as a 'bool' when
_KERNEL is defined, so 'bool' needs to be defined in this case for
<sys/kassert.h> to be includable standalone.

This is a direct commit to stable/14.

Fixes:          64ca7e040a ("queue(3): Debug macros: Finer control knobs, userland support")
Sponsored by:   The FreeBSD Foundation
2025-05-01 23:43:59 +02:00
Olivier Certner
542e14a59b queue(3): Wrap QMD_ASSERT()'s guard with __predict_false()
Such a guard is bound to be almost always false (obviously).

Reviewed by:    emaste (older version)
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D49974

(cherry picked from commit 613d66b5e17d92e5304fdc9abe4c62ba015ebf31)
2025-05-01 21:46:38 +02:00
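Note on 542e14a59b: a minimal userland sketch of wrapping an assertion guard in a branch hint. PREDICT_FALSE() and MY_ASSERT() are stand-ins invented here, not the actual <sys/queue.h> macros; the sketch assumes a GCC- or Clang-compatible compiler providing __builtin_expect().

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the kernel's __predict_false(); relies on a compiler builtin. */
#define	PREDICT_FALSE(exp)	__builtin_expect((exp), 0)

/* Hypothetical assertion macro in the spirit of QMD_ASSERT(). */
#define	MY_ASSERT(exp, msg) do {					\
	if (PREDICT_FALSE(!(exp))) {					\
		fprintf(stderr, "assertion failed: %s\n", (msg));	\
		abort();						\
	}								\
} while (0)

/* The failure branch is marked unlikely, so the hot path stays straight. */
static int
checked_div(int a, int b)
{
	MY_ASSERT(b != 0, "division by zero");
	return (a / b);
}
```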
Olivier Certner
64ca7e040a queue(3): Debug macros: Finer control knobs, userland support
Support enabling debugging macros for userland and _STANDALONE builds,
in addition to the already existing kernel support.  On runtime error,
panic() is used for kernel and _STANDALONE builds, and a simple
fprintf() + abort() combination for userland ones.  These can be
overridden if needed by defining the QMD_PANIC() and/or QMD_ASSERT()
macros.

The expansion of queue debug macros can now be controlled more finely
thanks to the QUEUE_MACRO_DEBUG_ASSERTIONS and
QUEUE_MACRO_NO_DEBUG_ASSERTIONS macros.  The first one serves to
forcibly enable debug code and the second to forcibly disable it.  These
are meant to be used as compile options only, and should normally not be
defined in a source file.  It is an error to have both of them defined.

If neither of the two above-mentioned macros is defined, an automatic
determination is performed.  When compiling kernel code,
QUEUE_MACRO_DEBUG_ASSERTIONS is defined if INVARIANTS has been defined
(as before).  For userland and _STANDALONE builds, no debug code is ever
automatically inserted even if NDEBUG is not defined, as doing so would
inflate code size and users may want to have working assert() calls
without this overhead by default.

In the manual page, document check code control under DIAGNOSTICS.
While here, rework the rest of the DIAGNOSTICS section a bit.

Reviewed by:    markj (older version)
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D49973

(cherry picked from commit 1c5fea9e8b563186b8f5773064458c4ecf2d7004)
2025-05-01 21:46:38 +02:00
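Note on 64ca7e040a: the selection rules described in the message can be sketched with the preprocessor as follows. QUEUE_MACRO_DEBUG_ASSERTIONS, QUEUE_MACRO_NO_DEBUG_ASSERTIONS, _KERNEL and INVARIANTS are taken from the message; QMD_ENABLED and queue_debug_enabled() are hypothetical names, and the real header may be organized differently.

```c
/* Defining both control macros is an error. */
#if defined(QUEUE_MACRO_DEBUG_ASSERTIONS) && \
    defined(QUEUE_MACRO_NO_DEBUG_ASSERTIONS)
#error "QUEUE_MACRO_DEBUG_ASSERTIONS and QUEUE_MACRO_NO_DEBUG_ASSERTIONS are mutually exclusive"
#endif

#if defined(QUEUE_MACRO_DEBUG_ASSERTIONS)
#define	QMD_ENABLED	1	/* forcibly enabled */
#elif defined(QUEUE_MACRO_NO_DEBUG_ASSERTIONS)
#define	QMD_ENABLED	0	/* forcibly disabled */
#elif defined(_KERNEL) && defined(INVARIANTS)
#define	QMD_ENABLED	1	/* automatic: kernel built with INVARIANTS */
#else
#define	QMD_ENABLED	0	/* automatic: userland and _STANDALONE */
#endif

/* Hypothetical accessor, only there to make the result observable. */
static int
queue_debug_enabled(void)
{
	return (QMD_ENABLED);
}
```

Compiled as plain userland code with neither control macro defined, queue_debug_enabled() returns 0, matching the "no debug code by default" rule above.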
Olivier Certner
3772cb0620 kassert: Explicitly include <sys/_types.h>
Include it as <sys/kassert.h> has direct references defined in it (to
'__va_list' and '__printflike' at least).

This is a step to make <sys/kassert.h> usable without the need to
explicitly include other headers.

Reviewed by:    imp, markj
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D49972

(cherry picked from commit 0f2090ccfeb6e3e1a2290300b53baedfb057c2b5)
2025-05-01 21:46:37 +02:00
Olivier Certner
49066283a4 queue(3): Consistent single space after all #define
Reviewed by:    markj
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D49970

(cherry picked from commit a4df0830d74dba9d20c01d8c108bddeb1ecd62cd)
2025-05-01 21:46:30 +02:00
Olivier Certner
c4ee6d4acb queue(3): New *_SPLIT_AFTER(), *_ASSERT_EMPTY(), *_ASSERT_NONEMPTY()
*_SPLIT_AFTER() allows splitting an existing queue in two.  It is the
missing block that enables arbitrary splitting and recombinations of
lists/queues together with *_CONCAT() and *_SWAP().

Add *_ASSERT_NONEMPTY(), used by *_SPLIT_AFTER().

Reviewed by:    markj
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D49608 (stailq)
Differential Revision:  https://reviews.freebsd.org/D49969 (rest)

(cherry picked from commit c02880233949b01fcfb2067962596f5c05553471)
2025-05-01 21:37:04 +02:00
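Note on c4ee6d4acb: the idea of splitting a queue after a given element can be sketched with a hand-rolled singly-linked tail queue. This is not the queue(3) macro implementation; struct node, struct list and list_split_after() are hypothetical names, and the sketch assumes the element passed in belongs to the list being split.

```c
#include <stddef.h>

struct node {
	int		 value;
	struct node	*next;
};

/* Head that tracks the address of the last 'next' field, as a tail queue does. */
struct list {
	struct node	 *first;
	struct node	**lastp;
};

static void
list_init(struct list *l)
{
	l->first = NULL;
	l->lastp = &l->first;
}

static void
list_insert_tail(struct list *l, struct node *n)
{
	n->next = NULL;
	*l->lastp = n;
	l->lastp = &n->next;
}

/*
 * Move everything after 'elm' from 'head' into 'rest', which is
 * re-initialized first.  'elm' becomes the last element of 'head'.
 */
static void
list_split_after(struct list *head, struct node *elm, struct list *rest)
{
	list_init(rest);
	if (elm->next != NULL) {
		rest->first = elm->next;
		rest->lastp = head->lastp;	/* old tail now ends 'rest' */
		elm->next = NULL;
		head->lastp = &elm->next;
	}
}
```

Splitting is constant time because only the two heads and one link are touched; no element is visited.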
John Baldwin
3a54ca5e28 new-bus: Add taskqueue_bus to process hot-plug device events
Use a system-wide taskqueue for hot-plug events.  This avoids possibly
blocking unrelated events on the thread taskqueue without requiring
multiple driver-specific taskqueues.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D49268

(cherry picked from commit 44d5f5ed1e959d8f2c22b6ee69c6a46a45ccdd8e)
2025-04-29 10:34:05 -04:00
Bjoern A. Zeeb
95e956722a mbuf: add mbuf information to KASSERTs
Be more consistent about printing the mbuf pointer in KASSERT messages.
This massively helps debugging and we were already doing a good job at
it.

Also replace some handrolled KASSERTs with M_ASSERTPKTHDR() for fewer
copies of the check logic.

In m_align(), move the msg into the KASSERT: since it was moved here in
ed6a66ca6c, the msg has only been used in one place.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	glebius, zlei
Differential Revision: https://reviews.freebsd.org/D49817

(cherry picked from commit 22d5d61f91eb70ced6a010d9a1d60f0ff33fff2f)
2025-04-29 10:49:27 +00:00
Bjoern A. Zeeb
a3b2d8e360 Bump __FreeBSD_version to 1402505 for LinuxKPI alloc routine changes
Also for iwlwifi firmware removal.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit 7acd5af48cf1870ec48d5910dff1a26466d98074)
2025-04-18 14:36:03 +00:00
Colin Percival
1bd29261b9 Correct CTLTYPE of SYSCTL_SBINTIME_MSEC etc
These should be CTLTYPE_S64, not CTLTYPE_INT, since they handle 64-bit
values.

Reviewed by:	imp
Fixes:	003ffd57fe ("Add sysctl_usec_to_sbintime [...]")
MFC after:	2 weeks
Sponsored by:	Amazon
Differential Revision:	https://reviews.freebsd.org/D49584

(cherry picked from commit e85aaed60eb061f31b2f1e5dc92b0ff0419b5fbf)
2025-04-15 20:13:13 -07:00
Gleb Smirnoff
fca3395674 cred: fix struct credbatch to use long for refcount
This structure collects count from multiple cred structures.  Of course it
can't use a smaller type.

PR:			283747
Reviewed by:		olce, mjg, markj
Differential Revision:	https://reviews.freebsd.org/D49562
Fixes:			37337709d3

(cherry picked from commit cd46e980134f6fc765b28ee9c8bf41e8fc1b0261)
2025-04-15 12:40:02 -07:00
Mark Johnston
c1aa97cf79 file: Add foffset_lock_pair()
This will be used in kern_copy_file_range(), which needs to lock two
offsets.

Reviewed by:	kib, rmacklem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D49440

(cherry picked from commit 12ecb0fe0afda8c051605045e446371ddd34741f)
2025-04-15 02:25:24 +00:00
Konstantin Belousov
70ba4df540 stat(2): add st_bsdflags field
(cherry picked from commit 86db734ae292fee58532f09b17b50438f6889cc8)
2025-04-09 03:53:17 +03:00
Olivier Certner
f9fa6cb391 cred: Hide internal flag CRED_FLAG_CAPMODE
This flag is used in field 'cr_flags', which is never directly visible
outside the kernel.  That field is however exported through 'struct
kinfo_proc' objects (field 'ki_cr_flags'), either from the kernel via
sysctls or from libkvm, and is supposed to contain exported flags
prefixed with KI_CRF_ (currently, KI_CRF_CAPABILITY_MODE and
KI_CRF_GRP_OVERFLOW, this second one being a purely userland one
signaling overflow of 'ki_groups').

Make sure that KI_CRF_CAPABILITY_MODE is the flag actually exported and
tested by userland programs, and hide the internal CRED_FLAG_CAPMODE.
As both flags are currently defined to the same value, this doesn't
change the KBI, but of course does change the KPI.  A code search via
GitHub and Google fortunately doesn't reveal any outside uses for
CRED_FLAG_CAPMODE.

While here, move assignment of 'ki_uid' to a more logical place in
kvm_proclist(), and definition of XU_NGROUPS as well in 'sys/ucred.h'
(no functional/interface changes intended).

Reviewed by:    mhorne
Approved by:    markj (mentor)
MFC after:      2 weeks
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D46909

(cherry picked from commit 09290c3a0c82524138973b14f393379edf733753)

A ports exp-run (PR 283410) showed one port to be affected
(sysutils/procs), which has been fixed upstream and in the ports tree.
All additional indirect references to CRED_FLAG_CAPMODE we found after
the code search mentioned in the original commit message are
automatically generated from our headers by FFI mechanisms, so
automatically disappear at recompilation (and the KBI is not changed, as
explained above, so recompilation is not needed).
2025-04-08 15:38:14 +02:00
Konstantin Belousov
c07ea8aefd kevent32/kinfo_knote32: remove __LP64__ predicate in definitions
(cherry picked from commit 2452bcd8913bb45ec269d0a3219ca8bfc0c7a183)
2025-04-07 04:28:22 +03:00
Konstantin Belousov
867ac6f5e9 struct kinfo_knote: add spare fields
(cherry picked from commit fe8ece34b446e92218a283ce5a7754784b6c53c1)
2025-04-07 04:28:22 +03:00
Konstantin Belousov
5e6b89bd56 Add NT_PROCSTAT_KQUEUES core note
(cherry picked from commit 5e7c43ff02dc0ec246582af24d3f4d03d5d55bf4)
2025-04-07 04:28:22 +03:00
Konstantin Belousov
c69b90d08b kinfo_knote: add knt_kq_fd member
(cherry picked from commit 9d1e7a7e8e8bde5b343226ce0fefc583932d5af3)
2025-04-07 04:28:21 +03:00
Konstantin Belousov
e260058f33 descriptors: add fget_remote_foreach()
(cherry picked from commit 4b69f1fab66db4fd3f874e78a457e317cd498d36)
2025-04-07 04:28:21 +03:00
Konstantin Belousov
e2af76d470 Add sysctl kern.proc.kqueue
(cherry picked from commit e60f608eb9cf3b38099948545934d699de9bbcea)
2025-04-07 04:28:20 +03:00
Mark Johnston
f2214e48d0 socket: Fix a race in the SO_SPLICE state machine
When so_splice() links two sockets together, it first attaches the
splice control structure to the source socket; at that point, the splice
is in the idle state.  After that point, a socket wakeup will queue up
work for a splice worker thread: in particular, so_splice_dispatch()
only queues work if the splice is idle.

Meanwhile, so_splice() continues initializing the splice, and finally
calls so_splice_xfer() to transfer any already buffered data.  This
assumes that the splice is still idle, but that's not true if some async
work was already dispatched.

Solve the problem by introducing an initial "under construction" state
for the splice control structure, such that wakeups won't queue any work
until so_splice() has finished.

While here, remove an outdated comment from the beginning of
so_splice_xfer().

Reported by:	syzkaller
Reviewed by:	gallatin
Fixes:		a1da7dc1cdad ("socket: Implement SO_SPLICE")
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D49437

(cherry picked from commit 574816356834cb99295b124be0ec34bd9e0b9c72)
2025-04-06 13:54:20 +00:00
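Note on f2214e48d0: a simplified, single-threaded sketch of the fix's idea, namely an initial state in which wakeups cannot queue work. The state names and helper functions are hypothetical, and the locking that the real code needs is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical state names; the commit message only describes the idea. */
enum splice_state {
	SPLICE_INIT,	/* "under construction": wakeups must not queue work */
	SPLICE_IDLE,	/* fully set up, no work pending */
	SPLICE_QUEUED,	/* work handed to a worker thread */
};

struct splice {
	enum splice_state state;
};

/* Called on socket wakeup: queue work only if the splice is idle. */
static bool
splice_dispatch(struct splice *sp)
{
	if (sp->state != SPLICE_IDLE)
		return (false);
	sp->state = SPLICE_QUEUED;
	return (true);
}

/* Called once setup has finished: from here on wakeups may queue work. */
static void
splice_finish_setup(struct splice *sp)
{
	if (sp->state == SPLICE_INIT)
		sp->state = SPLICE_IDLE;
}
```

A wakeup that arrives before splice_finish_setup() is simply not queued, so the setup path can still rely on the splice having no work in flight.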
Mark Johnston
51489b9ced ktrace: Use STAILQ_EMPTY_ATOMIC when checking for records in userret()
As in commit 36631977d8c9, this check is unlocked and may trigger
spurious assertion failures.  Use STAILQ_EMPTY_ATOMIC() here as well.
Fix nearby whitespace.

Reported by:	syzkaller
Reviewed by:	olce
Fixes:		34740937f7a4 ("queue: New debug macros for STAILQ")
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D49441

(cherry picked from commit e9a846468acfbba35ca40b888670559aaff7228d)
2025-04-06 13:54:13 +00:00
Christos Margiolis
1728d26682 sound: Implement AFMT_FLOAT support
Even though the OSS manual [1] advises against using AFMT_FLOAT, there
are applications that expect the sound driver to support it, and might
not work properly without it.

This patch adds AFMT_F32_LE|BE (as well as AFMT_FLOAT for OSS
compatibility) in sys/soundcard.h and implements AFMT_F32_LE|BE <->
AFMT_S32_LE|BE conversion functions. As a result, applications can
write/read floats to/from sound(4), but internally, because sound(4)
works with integers, we convert floating point samples to integer ones,
before doing any processing.

The reason for encoding/decoding IEEE754s manually, instead of using
fpu_kern(9), is that fpu_kern(9) is not supported by all architectures,
and also introduces significant overhead.

The IEEE754 encoding/decoding implementation has been written by Ariff
Abdullah [2].

[1] http://manuals.opensound.com/developer/AFMT_FLOAT.html
[2] https://people.freebsd.org/~ariff/utils/ieee754.c

PR:		157050, 184380, 264973, 280612, 281390
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D47638

(cherry picked from commit e1bbaa71d62c8681a576f9f5bedf475c7541bd35)
2025-04-06 02:28:14 +02:00
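Note on 1728d26682: to give an idea of what decoding IEEE 754 samples without the FPU involves, here is a minimal sketch that converts the bit pattern of a single-precision sample in the nominal range [-1.0, 1.0) to a signed 32-bit sample using integer operations only. It is not Ariff Abdullah's implementation nor the code in sound(4); f32_bits_to_s32() and its rounding and saturation choices are assumptions made for this example.

```c
#include <stdint.h>

/*
 * Zero and denormals map to 0.  Magnitudes of 1.0 and above, infinities
 * and NaNs saturate.  Other values are truncated towards zero.
 */
static int32_t
f32_bits_to_s32(uint32_t bits)
{
	int neg = (bits >> 31) != 0;
	int exp = (int)((bits >> 23) & 0xff);	/* biased exponent */
	uint32_t mant = bits & 0x7fffff;	/* 23-bit fraction */
	uint32_t mag;
	int shift;

	if (exp == 0)
		return (0);
	if (exp >= 127)		/* |x| >= 1.0, Inf or NaN */
		return (neg ? INT32_MIN : INT32_MAX);

	mant |= 0x800000;	/* implicit leading 1: 24-bit significand */
	shift = exp - 119;	/* (exp - 127) - 23 + 31: scale by 2^31 */
	if (shift >= 0)
		mag = mant << shift;	/* shift <= 7, result below 2^31 */
	else if (shift > -24)
		mag = mant >> -shift;
	else
		mag = 0;
	return (neg ? -(int32_t)mag : (int32_t)mag);
}
```

For instance, 0x3F000000 (0.5) becomes 1 << 30, half of the positive full scale.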
Olivier Certner
c1d7552ddd New setcred() system call and associated MAC hooks
This new system call allows setting all necessary credentials of
a process in one go: Effective, real and saved UIDs, effective, real and
saved GIDs, supplementary groups and the MAC label.  Its advantage over
standard credential-setting system calls (such as setuid(), seteuid(),
etc.) is that it enables MAC modules, such as MAC/do, to restrict the
set of credentials some process may gain in a fine-grained manner.

Traditionally, credential changes rely on setuid binaries that call
multiple credential system calls and in a specific order (setuid() must
be last, so as to remain root for all other credential-setting calls,
which would otherwise fail with insufficient privileges).  This
piecewise approach causes the process to transiently hold credentials
that are neither the original nor the final ones.  For the kernel to
enforce that only certain transitions of credentials are allowed, either
these possibly non-compliant transient states have to disappear (by
setting all relevant attributes in one go), or the kernel must delay
setting or checking the new credentials.  Delaying setting credentials
could be done, e.g., by having some mode where the standard system calls
contribute to building new credentials but without committing them.  It
could be started and ended by a special system call.  Delaying checking
could mean that, e.g., the kernel only verifies the credentials
transition at the next non-credential-setting system call (we just
mention this possibility for completeness, but are certainly not
endorsing it).

We chose the simpler approach of a new system call, as we don't expect
the set of credentials one can set to change often.  It has the
advantages that the traditional system calls' code doesn't have to be
changed and that we can establish a special MAC protocol for it, by
having some cleanup function called just before returning (this is
a requirement for MAC/do), without disturbing the existing ones.

The mac_cred_check_setcred() hook is passed the flags received by
setcred() (including the version) and both the old and new kernel's
'struct ucred' instead of 'struct setcred' as this should simplify
evolving existing hooks as the 'struct setcred' structure evolves.  The
mac_cred_setcred_enter() and mac_cred_setcred_exit() hooks are always
called in pairs around potential calls to mac_cred_check_setcred().
They allow MAC modules to allocate/free data they may need in their
mac_cred_check_setcred() hook, as the latter is called under the current
process' lock, rendering sleepable allocations impossible.  MAC/do is
going to leverage these in a subsequent commit.  A scheme where
mac_cred_check_setcred() could return ERESTART was considered but is
incompatible with proper composition of MAC modules.

While here, add missing includes and declarations for standalone
inclusion of <sys/ucred.h> both from kernel and userspace (for the
latter, it has been working thanks to <bsm/audit.h> already including
<sys/types.h>).

Reviewed by:    brooks
Approved by:    markj (mentor)
Relnotes:       yes
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D47618

(cherry picked from commit ddb3eb4efe55e57c206f3534263c77b837aff1dc)
2025-04-03 21:31:03 +02:00
Warner Losh
8130722962 posix: POSIX-1.2008 moved SA_* from XSI to base standard
Starting with POSIX-1.2008, "The SA_RESETHAND, SA_RESTART, SA_SIGINFO,
SA_NOCLDWAIT, and SA_NODEFER constants are moved from the XSI option to
the Base." Make them so visible.

PR: 275328
Sponsored by:		Netflix

(cherry picked from commit 06af7bd12a4a654f5c5e8da41cf329eee3aa61f6)
2025-03-14 11:51:03 -06:00
Mark Johnston
c813157a15 umtx: Add a helper for unlocked umtxq_busy() calls
This seems like a natural complement to umtxq_unbusy_unlocked().  No
functional change intended.

Reviewed by:	olce, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D49124

(cherry picked from commit b01495caac2eca73463f4a889936a19e4c1c5909)
2025-03-07 22:51:48 +00:00
Mark Johnston
14d4c1845d queue: Add atomic variants for *_EMPTY
In some places, these macros are used without a lock, under the
assumption that they are naturally atomic.  After commit
34740937f7a4 ("queue: New debug macros for STAILQ"), this assumption is
false.

Provide *_EMPTY_ATOMIC for such cases.  This lets us include extra debug
checks for the non-atomic case, and gives us a way to explicitly
annotate unlocked checks, which generally deserve extra scrutiny and
might otherwise raise reports from KCSAN.

Reviewed by:	kib, olce (previous version)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D48899

(cherry picked from commit d2870b8666f2438af400269c0f6a1a48031bb71e)
2025-03-07 22:51:48 +00:00
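Note on 14d4c1845d: what an explicitly atomic emptiness check could look like, sketched with C11 atomics in userland. The kernel macros use FreeBSD's own atomic(9) primitives rather than <stdatomic.h>; struct item, struct item_list and item_list_empty_atomic() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct item {
	struct item *next;
};

struct item_list {
	_Atomic(struct item *) first;
};

/*
 * Unlocked check: a single atomic load of the head pointer, and nothing
 * else.  The answer may already be stale when the caller looks at it.
 */
static bool
item_list_empty_atomic(struct item_list *l)
{
	return (atomic_load_explicit(&l->first, memory_order_relaxed) == NULL);
}
```

Keeping this variant separate leaves the regular, locked check free to run extra consistency assertions that would be racy without the lock.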
Rick Macklem
310646a92a param.h: Bump _FreeBSD_version for commit 9fed515190
2025-02-28 13:25:43 -08:00
Olivier Certner
9e0ef670d8 queue: Fix STAILQ_ASSERT_EMPTY()
The 'while' part corresponding to the 'do' was missing.

Did not notice the problem as later commits using it have been stashed
and never reworked up to now, and it is currently unused in the tree.

While here, fix spacing after the '#define' in the !(_KERNEL &&
INVARIANTS) part.

Fixes:          34740937f7a4 ("queue: New debug macros for STAILQ")
MFC after:      1 minute
Sponsored by:   The FreeBSD Foundation

(cherry picked from commit d3c4b002d1fd54ac69c1714e208051867ee56dc4)
2025-02-27 22:15:40 +01:00
John Baldwin
4aed8b3b61 new-bus: Add bus_(identify|attach|detach)_children
These correspond to the current implementations of
bus_generic_(probe|attach|detach) but with more accurate names and
semantics.  The intention is to deprecate bus_generic_(probe|attach)
and reimplement bus_generic_detach in a future commit.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D47673

(cherry picked from commit 46297859a74563dde6fc5bff9f9ecded9fb61ba6)

bus_generic_(probe|attach|detach) will not be changed in stable/14,
but providing the new APIs in stable/14 permits drivers to use the new
APIs.
2025-02-27 10:19:24 -05:00
John Baldwin
d172f42e4b Bump __FreeBSD_version for bus resource API change
Specifically, the change to remove redundant rid and type arguments
from bus_* when passing an allocated struct resource.

(cherry picked from commit a7b9f4d96e8bdc30db27ec7a193a8d8fdf7c652c)
2025-02-27 08:09:23 -05:00
John Baldwin
e14fd50dd8 new-bus: Introduce a simpler bus API for managing resources
Remove the 'type' and 'rid' arguments from the wrapper bus API
functions (e.g. bus_release_resource) that accept a struct resource.
The "new" versions extract the 'type' and/or 'rid' from the passed in
resource object via rman_get_type and rman_get_rid.

This commit adds the new API as functions with a _new suffix.  Wrapper
macros choose between the old and new functions based on the number of
arguments provided to the macro.  This commit does not change the ABI
but can be safely MFCd to older branches so long as older kernels use
rman_set_type when allocating resources.

Future commits will push the removal of these extraneous arguments
through the bus implementation.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D44124

(cherry picked from commit 9edb8d0aedef2f1e13ed1f8134deb3f8291d2fe9)
2025-02-27 08:09:23 -05:00
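Note on e14fd50dd8: choosing between an old and a new function based on the number of macro arguments can be done with a well-known variadic macro trick, sketched below. This only illustrates the technique and is not the macros from the commit; struct res, res_release_old(), res_release_new(), PICK_4TH() and res_release() are hypothetical.

```c
struct res {
	int type;
	int rid;
};

/* "Old" style: the caller repeats type and rid next to the resource. */
static int
res_release_old(int type, int rid, struct res *r)
{
	(void)r;
	return (type * 100 + rid);	/* dummy result to show what was used */
}

/* "New" style: type and rid are read from the resource itself. */
static int
res_release_new(struct res *r)
{
	return (res_release_old(r->type, r->rid, r));
}

/*
 * PICK_4TH() always yields its fourth argument.  Prepending the caller's
 * arguments shifts the candidate names, so the argument count selects one.
 */
#define	PICK_4TH(_1, _2, _3, NAME, ...)	NAME
#define	res_release(...)						\
	PICK_4TH(__VA_ARGS__, res_release_old, res_release_invalid,	\
	    res_release_new, 0)(__VA_ARGS__)
```

Here res_release(type, rid, r) expands to res_release_old(type, rid, r) and res_release(r) to res_release_new(r). A two-argument call selects the undeclared res_release_invalid and so fails to compile.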
John Baldwin
f3a9048114 rman: Add rman_get/set_type
This permits associating a resource type (e.g. SYS_RES_MEMORY) with a
struct resource.

I considered adding a new field to struct rman to store the type and
only providing rman_get_type as an accessor.  However, changing
'struct rman' is an ABI breakage.  I might revisit this in main, but
the current approach is MFC'able.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D44122

(cherry picked from commit b30a80b65587fb9fd4a5f012d606dbd0c6239a46)
2025-02-27 08:09:22 -05:00
Mark Johnston
b0f2df45e7 socket: Add an option to retrieve a socket's FIB number
The SO_SETFIB option can be used to set a socket's FIB number, but there
is no way to retrieve it.  Rename SO_SETFIB to SO_FIB and implement a
handler for it for getsockopt(2).

Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Sponsored by:	Stormshield
Differential Revision:	https://reviews.freebsd.org/D48834

(cherry picked from commit ee951eb59f2136a604e3fbb12abf8d8344da0c99)
2025-02-21 01:04:50 +00:00
Mark Johnston
a55bde2328 socket: Move SO_SETFIB handling to protocol layers
In particular, we store a FIB number in both struct socket and in struct
inpcb.  When updating the FIB number with setsockopt(SO_SETFIB), make
the update atomic.  This is required to support the new bind_all_fibs
mode, since in that mode changing the FIB of a bound socket is not
permitted.

This requires a bit more code, but avoids a layering violation in
sosetopt(), where we hard-code the list of protocol families that
implement SO_SETFIB.

Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Sponsored by:	Stormshield
Differential Revision:	https://reviews.freebsd.org/D48666

(cherry picked from commit caccbaef8e263b1d769e7bcac1c4617bdc12d484)
2025-02-21 01:04:50 +00:00
Warner Losh
1873177473 cdefs: Bump the defaults for 'all'
Bump default to POSIX at 202405, C at 2023 and xopen at 800...

Sponsored by:		Netflix
Reviewed by:		brooks
Differential Revision:	https://reviews.freebsd.org/D47578

(cherry picked from commit f95d9ec92122e6b4ef99c9a258f31b9564d327d3)
2025-02-20 09:13:25 -05:00
Mark Johnston
467fa302c3 clock: Add a long ticks variable, ticksl
For compatibility with Linux, it's useful to have a tick counter of
width sizeof(long), but our tick counter is an int.  Currently the
linuxkpi tries to paper over this difference, but this cannot really be
done reliably, so it's desirable to have a wider tick counter.  This
change introduces ticksl, keeping the existing ticks variable.

Follow a suggestion from kib to avoid having to maintain two separate
counters and to avoid converting existing code to use ticksl: change
hardclock() to update ticksl instead of ticks, and then use assembler
directives to make ticks and ticksl overlap such that loading ticks
gives the bottom 32 bits.  This makes it possible to use ticksl in the
linuxkpi without having to convert any native code, and without making
hardclock() more complicated or expensive.  Then, the linuxkpi can be
modified to use ticksl instead of ticks.

Reviewed by:	olce, kib, emaste
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D48383

(cherry picked from commit 6b82130e6c9add4a8892ca897df5a0ec04663ea2)
2025-02-14 19:25:18 +00:00
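Note on 467fa302c3: the relation between the two counters can be modeled in portable C as a truncation. The commit achieves the overlap with assembler directives rather than code like this; ticks_from_ticksl() is a hypothetical helper, and the final conversion to int relies on the usual two's complement wrap-around, which is implementation-defined in ISO C.

```c
/* What a reader of 'ticks' would observe for a given value of 'ticksl'. */
static int
ticks_from_ticksl(long ticksl)
{
	/* Keep the bottom 32 bits of the wide counter. */
	return ((int)(unsigned int)(unsigned long)ticksl);
}
```

On a platform with a 64-bit long, the wide counter keeps counting past 2^32 while the narrow view wraps around, which is the property the linuxkpi needs.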
Doug Moore
2ee9d7dcdb libkern: don't use MPASS
Using MPASS in libkern breaks buildworld.  Replace MPASS with KASSERT
in three places.

(cherry picked from commit 08f6f78f81e21b21dd002a9389436b0333cb3488)
2025-02-10 08:16:25 -06:00
Doug Moore
7bcc7a0b88 libkern: avoid local var in order_base_2()
order_base_2(n) is implemented with a variable, which keeps it from
being used at file scope. Implement it instead as ilog2(2*n-1), which
produces a different result when 2*n overflows, which appears unlikely
in practice.

Reviewed by:	bz
Differential Revision:	https://reviews.freebsd.org/D46826

(cherry picked from commit b7cbf741d55468ba34305a14ac3acc1c286af034)
2025-02-10 04:30:05 -06:00
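Note on 7bcc7a0b88: the identity order_base_2(n) = ilog2(2*n - 1) can be checked with a small sketch. The functions below are written for readability and are not the libkern macros; being functions, they also cannot be used at file scope, which is precisely what the macro form in the commit makes possible. The names ilog2_u64() and order_base_2_u64() are hypothetical.

```c
#include <assert.h>

/* Loop-based floor(log2(n)); n must be non-zero. */
static int
ilog2_u64(unsigned long long n)
{
	int l = 0;

	assert(n != 0);
	while (n >>= 1)
		l++;
	return (l);
}

/* Smallest k such that 2^k >= n, computed as ilog2(2*n - 1). */
static int
order_base_2_u64(unsigned long long n)
{
	return (ilog2_u64(2 * n - 1));
}
```

For n = 5, 2*n - 1 is 9 and ilog2(9) is 3, and indeed 2^3 = 8 is the smallest power of two not below 5. As the message notes, the result differs once 2*n overflows the argument type.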
Doug Moore
6f309b9d56 log2: move log2 functions from linuxkpi to libkern
Linux has a header file that defines an ilog2 function and some simple
functions/macros that use it: roundup_pow_of_two, is_power_of_2,
rounddown_pow_of_two, and order_base_2.  This change moves three of
those simple functions (all but is_power_of_2) from linuxkpi to
libkern.  It also deletes a few implementations of these functions
that have previously been copied into code for various device drivers,
so that they can use the libkern version.  The is_power_of_2 macro was
not moved because powerof2 in param.h provides almost the same service
already (except that they disagree about whether 0 is a power of two).

Since the linux definitions of these functions were copied into
FreeBSD 11 years ago, linux has improved them, and this change
provides those improvements.  In particular, a giant table of log
values for evaluating ilog2 for constant values is no longer
necessary.

Reviewed by:	alc, markj (previous version)
Differential Revision:	https://reviews.freebsd.org/D45536

(cherry picked from commit c8b0c33b03ac072413b27bed2bdae2ae27426f3a)
2025-02-10 04:29:23 -06:00
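Note on 6f309b9d56: the disagreement about zero mentioned above can be reproduced with function versions of the two tests. They follow the usual definitions of powerof2() in param.h and of Linux's is_power_of_2(), but the function names are made up and the originals should be checked in the respective headers.

```c
#include <stdbool.h>

/* Same test as the powerof2() macro: true for 0 as well. */
static bool
powerof2_style(unsigned long x)
{
	return (((x - 1) & x) == 0);
}

/* Same test as Linux's is_power_of_2(): 0 is explicitly rejected. */
static bool
is_power_of_2_style(unsigned long n)
{
	return (n != 0 && (n & (n - 1)) == 0);
}
```

Both agree on every non-zero input; only the argument 0 tells them apart.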
Doug Moore
4ed1837853 libkern: add ilog2 macro
The kernel source contains several definitions of an ilog2 function;
some are slower than necessary, and one of them is incorrect.
Eliminate them all and define an ilog2 macro in libkern to replace
them, in a way that is fast, correct for all argument types, and, in a
GENERIC kernel, includes a check for an invalid zero parameter.

Folks at Microsoft have verified that having a correct ilog2
definition for their MANA driver doesn't break it.

Reviewed by:	alc, markj, mhorne (older version), jhibbits (older version)
Differential Revision:	https://reviews.freebsd.org/D45170
Differential Revision:	https://reviews.freebsd.org/D45235

(cherry picked from commit b0056b31e90029553894d17c441cbb2c06d31412)
2025-02-10 04:27:12 -06:00
Mark Johnston
0db4588bbe thread: Simplify sanitizer integration with thread creation
fork() may allocate a new thread in one of two ways: from UMA, or cached
in a freed proc that was just allocated from UMA.  In either case, KASAN
and KMSAN need to initialize some state; in particular they need to
initialize the shadow mapping of the new thread's stack.

This is done differently between KASAN and KMSAN, which is confusing.
This patch improves things a bit:
- Add a new thread_recycle() function, which moves all kernel stack
  handling out of kern_fork.c, since it doesn't really belong there.
- Then, thread_alloc_stack() has only one local caller, so just inline
  it.
- Avoid redundant shadow stack initialization: thread_alloc()
  initializes the KMSAN shadow stack (via kmsan_thread_alloc()) even
  though vm_thread_new() already did that.
- Add kasan_thread_alloc(), for consistency with kmsan_thread_alloc().

No functional change intended.

Reviewed by:	khng
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D44891

(cherry picked from commit 800da341bc4a35f4b4d82d104b130825d9a42ffa)
2025-02-07 14:46:53 +00:00
Rick Macklem
7773d509d8 fs: Add new VFCF_xxx flags for va_filerev
Richard Kojedzinszky <richard@kojedz.in> reported a problem via
email, where the Linux NFSv4.2 client did not detect a change in a
directory on a FreeBSD NFSv4.2 server.

Adding support for the NFSv4.2 change_attr_type attribute seems
to have fixed the problem. This requires that the server file system
indicate if it increments va_filerev by one, since that file attribute
is used for the NFSv4.2 change attribute.  Fuse requires an indication
that va_filerev is based on ctime.

This patch adds VFCF_FILEREVINC and VFCF_FILEREVCT to indicate this.

A future patch to the NFS server will use these flags.

(cherry picked from commit 1cd455f39d886f27c33f7726f79fc4cc566da7b3)
2025-01-28 14:38:52 -08:00
Olivier Certner
aab45924bd smr: Load to accept pointers to const pointers
Pointers passed to the smr_entered_load() and smr_serialized_load()
macros are in the end used as arguments to atomic_load_*ptr(), so
convert them to the now acceptable 'const uintptr_t *' ones (instead of
'uintptr_t *'), making these macros accept pointers to constant
pointers.

Reviewed by:    kib
MFC after:      4 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D48497

(cherry picked from commit 1ac0afaa962bc847294d5e6bf1e749b7ffa78cfd)
2025-01-27 19:19:57 +01:00