Since newnfs_copycred() calls crsetgroups() which in turn calls
crextend() which might do a malloc(M_WAITOK), newnfs_copycred()
cannot be called with a mutex held. Fortunately, the malloc()
call is rarely done, since XU_GROUPS is 16 and the NFS client
uses a maximum of 17 (only 17 groups will cause the malloc() to
be called). Further, it is only a problem if the malloc() tries
to sleep(). As such, this bug does not seem to have caused
problems in practice.
This patch fixes the one place in the NFS client where
newnfs_copycred() is called while a mutex is held by moving the
call to after where the mutex is released.
Found by inspection while working on an experimental patch.
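The shape of the fix can be sketched in userland C. Everything named
below (struct owner, struct cred_src, copy_groups_may_sleep(),
get_groups()) is hypothetical, and a pthread mutex stands in for the
kernel mutex; this is an illustration of the pattern, not the NFS client
code itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_GROUPS 17	/* assumption: mirrors the 17-group maximum above */

struct cred_src {		/* hypothetical credential snapshot */
	int ngroups;
	int groups[MAX_GROUPS];
};

struct owner {			/* hypothetical mutex-protected structure */
	pthread_mutex_t lock;
	bool lock_held;		/* bookkeeping for this sketch only */
	struct cred_src cred;
};

/* Stand-in for a copy that may allocate and therefore may sleep. */
static int *
copy_groups_may_sleep(const struct owner *o, const struct cred_src *src)
{
	int *dst;

	assert(!o->lock_held);	/* must not run with the mutex held */
	dst = malloc(sizeof(int) * (size_t)src->ngroups);
	if (dst != NULL)
		memcpy(dst, src->groups, sizeof(int) * (size_t)src->ngroups);
	return (dst);
}

static int *
get_groups(struct owner *o)
{
	struct cred_src snap;

	pthread_mutex_lock(&o->lock);
	o->lock_held = true;
	snap = o->cred;		/* cheap, non-sleeping snapshot under the lock */
	o->lock_held = false;
	pthread_mutex_unlock(&o->lock);

	/* The possibly sleeping copy happens only after the unlock. */
	return (copy_groups_may_sleep(o, &snap));
}
```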
(cherry picked from commit 501bdf3001190686bf55d9d333cb533858c2cf2f)
During testing at a recent IETF NFSv4 Bakeathon, a non-FreeBSD
server was rebooted. After the reboot, the FreeBSD client sent
an Open/Claim_previous with a Getattr after the Open in the same
compound. The Open/Claim_previous was done to recover the Open
and a Delegation for a file. The Open succeeded, but the
Getattr after the Open failed with NFSERR_DELAY. This resulted
in the FreeBSD client retrying the entire RPC over and over again,
until the server's recovery grace period ended. Since the Open
succeeded, there was no need to retry the entire RPC.
This patch modifies the NFSv4 client side recovery Open/Claim_previous
RPC reply handling to deal with this case. With this patch, the
Getattr reply of NFSERR_DELAY is ignored and the successful Open
reply is processed.
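A simplified sketch of the reply handling described above follows. The
struct compound_reply layout, the recover_action values and
handle_reclaim_reply() are hypothetical; the treatment of Getattr errors
other than NFSERR_DELAY is an assumption of this sketch and is not taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NFS_OK		0
#define NFSERR_DELAY	10008	/* NFS4ERR_DELAY as numbered in the NFSv4 RFCs */

struct compound_reply {		/* hypothetical, heavily simplified reply */
	int open_status;
	int getattr_status;
};

enum recover_action { RECOVER_DONE, RECOVER_RETRY, RECOVER_FAIL };

/*
 * Decide what to do with an Open/Claim_previous + Getattr reply.
 * *have_attrs reports whether the Getattr results may be used.
 */
static enum recover_action
handle_reclaim_reply(const struct compound_reply *rep, bool *have_attrs)
{
	*have_attrs = false;

	if (rep->open_status == NFSERR_DELAY)
		return (RECOVER_RETRY);	/* the Open itself must be redone */
	if (rep->open_status != NFS_OK)
		return (RECOVER_FAIL);

	/* The Open succeeded, so the RPC does not need to be retried. */
	if (rep->getattr_status == NFS_OK)
		*have_attrs = true;
	else if (rep->getattr_status != NFSERR_DELAY)
		return (RECOVER_FAIL);	/* assumption: other errors are fatal */

	/* Getattr NFSERR_DELAY is ignored; the Open reply is processed. */
	return (RECOVER_DONE);
}
```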
This bug will not normally affect users, since this non-FreeBSD
server is not widely used (it may not even have shipped to any
customers).
(cherry picked from commit 14bbf4fe5abb20f1126168e66b03127ae920f78e)
For NFSv4.1/4.2, there are two new options for the Open operation.
These two options use the file handle for the file instead of the
file handle for the directory plus a file name. By doing so, the
client code is simplified (it no longer needs the "nfsv4node" structure
attached to the NFS vnode). It also avoids problems caused by another
NFS client (or process running locally in the NFS server) doing a
rename or remove of the file name between the Lookup and Open.
Unfortunately, there was a bug (fixed recently by commit X)
in the NFS server which mis-parsed the Claim_Deleg_Cur_FH
arguments. To allow this patch to work with the broken FreeBSD
NFSv4.1/4.2 server, NFSMNTP_BUGGYFBSDSRV is defined and is set
when a correctly formatted Claim_Deleg_Cur_FH fails with NFSERR_EXPIRED.
(This is what the old, broken NFS server does, since it erroneously
uses the Getattr arguments as a stateID.) Once this flag is set,
the client fills in a stateID, to make the broken NFS server happy.
Tested at a recent IETF NFSv4 Bakeathon.
(cherry picked from commit 196787f79e67374527a1d528a42efa8b31acd9af)
The argument 's' of getpeerid(3) must be a connected UNIX-domain socket,
so document it.
PR: 248614
Differential Revision: https://reviews.freebsd.org/D42629
(cherry picked from commit fa9f74220146233b7224da7c94870540dc39ae68)
The tty_rubchar() code handling backspaces for UTF-8 characters didn't
properly check whether the beginning of the current line was reached.
This resulted in a kernel panic in ttyinq_unputchar() when prodded with
certain malformed UTF-8 sequences.
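The missing bound can be illustrated with a small standalone sketch.
rub_utf8_len() and its buffer/offset parameters are hypothetical and do
not reflect the actual ttyinq interface; the point is only that the
backwards scan over continuation bytes stops at the start of the line.

```c
#include <assert.h>
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
#define IS_UTF8_CONT(b)	(((unsigned char)(b) & 0xC0) == 0x80)

/*
 * Return how many bytes one backspace should erase from buf, where the
 * current line occupies the offsets [line_start, end).  The scan never
 * moves before line_start, even for malformed input that consists only
 * of continuation bytes.
 */
static size_t
rub_utf8_len(const char *buf, size_t line_start, size_t end)
{
	size_t pos = end;

	if (pos <= line_start)
		return (0);	/* nothing left to erase on this line */
	pos--;
	while (pos > line_start && IS_UTF8_CONT(buf[pos]))
		pos--;		/* skip back over continuation bytes */
	return (end - pos);
}
```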
PR: 275009
Reviewed by: christos
Differential Revision: https://reviews.freebsd.org/D42564
(cherry picked from commit c6d7be214811c315d234d64c6cbaa92d4f55d2c1)
128f63cedc14 and 9e589b093857 added proper UTF-8 backspacing handling in
the tty(4) driver, which is enabled by setting the new IUTF8 flag
through stty(1). Since the default locale is UTF-8, and the feature
itself is important enough, enable IUTF8 by default.
Related discussion:
https://lists.freebsd.org/archives/freebsd-arch/2023-November/000534.html
Reviewed by: imp, bojan.novkovic_fer.hr
Differential Revision: https://reviews.freebsd.org/D42464
(cherry picked from commit bb830e346bd50545e9868a1802d631afb6b50bb0)
Dummynet re-injects an mbuf with MTAG_IPFW_RULE added, and the same mtag
is used by divert(4) as parameters for packet diversion.
If, according to the pf rule set, a packet should go through dummynet
first and through ipdivert afterwards, the mentioned mtag must be removed
after dummynet so that ipdivert does not mistake it for its own input
parameters.
ipfw consumes this mtag at the very beginning, which gives the same
behavior as clearing the tag after dummynet.
And after fabf705f4b5a pf passes parameters to ipdivert using its
personal MTAG_PF_DIVERT mtag.
PR: 274850
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D42609
(cherry picked from commit fe3bb40b9e807d4010617de1ef040ba3aa623487)
When support for fpu_kern_enter/fpu_kern_leave was added to powerpc,
set_mcontext was updated to handle Altivec state restoration in the same
way as the FPU state, by lazily restoring the context on the first
trap. However, the function was not correctly updated to unconditionally
clear the PCB_VEC and PSL_VEC bits from the pcb's flags and srr1
respectively which can sometimes result in a mismatch between a
process's MSR[VEC] state and its pcb_flags.
Fix this by simply clearing the VEC flags unconditionally in
set_mcontext, which is already done for FPU/VSX.
Fixes: a6662c37b6ffe ("powerpc: Implement fpu_kern_enter/fpu_kern_leave")
Reviewed by: alfredo
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42417
(cherry picked from commit 270f75cf3433807d124cdf1f0072ab801532f425)
Summary:
Provide an implementation of fpu_kern_enter/fpu_kern_leave for PPC to
enable FPU, VSX, and Altivec usage in-kernel. The functions currently
only support FPU_KERN_NOCTX, but this is sufficient for ossl(4) and many
other users of the API.
This patchset has been tested on powerpc64le using a modified version of
the in-tree tools/tools/crypto/cryptocheck.c tool to check for FPU/Vec
register clobbering along with a follow-up patch to enable ossl(4) on
powerpc64*.
Reviewed by: jhibbits
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D41540
Relnotes: yes
(cherry picked from commit a6662c37b6ffee46e18be5f7570149edc64c1d0b)
In function 'wg_aip_add()', the error path that returns ENOMEM when
(node == NULL) forgets to unlock the radix tree, and thus may lead
to a deadlock.
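The general pattern for avoiding this kind of leak is a single exit path
that always drops the lock. The sketch below uses a pthread mutex and a
hypothetical struct table / table_add(); it is not the WireGuard code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct table {			/* hypothetical lock-protected table */
	pthread_mutex_t lock;
	int count;
};

static int
table_add(struct table *t, void *node)
{
	int error = 0;

	pthread_mutex_lock(&t->lock);
	if (node == NULL) {
		error = ENOMEM;
		goto out;	/* error path still goes through the unlock */
	}
	t->count++;
out:
	pthread_mutex_unlock(&t->lock);
	return (error);
}
```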
PR: 275001
Reviewed by: kp
MFC after: 1 week
(cherry picked from commit dcc4d2939f789a6d1f272ffeab2068ba2b7525ea)
In D40102 we implemented support for transport over IPv6 but the
documentation was not updated to reflect the new feature.
Clarify what is available and how it can be used.
MFC after: 1 week
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D42505
(cherry picked from commit 81d4c786209bfa3752c25b2564eb363027f5d914)
Without this patch, a NFSv4 Readdir operation acquires the vnode for
each entry in the directory. If only the Type, Fileid, Mounted_on_fileid
and ReaddirError attributes are requested by a client, acquiring the vnode
is not necessary for non-directories. Directory vnodes must be acquired
to check for server file system mount points.
This patch avoids acquiring the vnode, as above, resulting in a 3-8%
improvement in Readdir RPC RTT for some simple tests I did.
Note that only non-rdirplus NFSv4 mounts will benefit from this change.
Tested during a recent IETF NFSv4 Bakeathon testing event.
(cherry picked from commit cd5edc7db261fb228be4044e6fdd38850eb4e9c4)
Preemptively address a collision with LIBFDT (to be added in the future)
from src.libnames.mk, which gets included via bsd.progs.mk. No
functional change intended.
Reviewed by: imp
MFC after: 1 week
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D42486
(cherry picked from commit b247ff70e8f1b4bf184b9fc85d2908ec4db2d1ab)
Use the new UMA_ALIGN_CACHE_AND_MASK() facility, which guarantees cache
line alignment and, simultaneously, a minimum of 32 bytes of alignment
(the 5 lower bits are always 0).
For the record, to this day, here's a (possibly non-exhaustive) list of
synchronization primitives using lower bits to store flags in pointers
to thread structures:
- lockmgr, rwlock and sx all use the 5 bits directly.
- rmlock indirectly relies on sx, so can use the 5 bits.
- mtx (non-spin) relies on the 3 lower bits.
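To illustrate why the alignment matters, here is a minimal userland
sketch of storing flags in the low bits of a 32-byte-aligned pointer.
The PTR_* macros and ptr_pack()/ptr_unpack_*() helpers are hypothetical
and are not the encoding used by any of the primitives listed above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PTR_ALIGN	32u
#define PTR_FLAG_MASK	((uintptr_t)(PTR_ALIGN - 1))	/* low 5 bits: 0x1f */

/* Combine an aligned pointer and up to 5 flag bits into one word. */
static uintptr_t
ptr_pack(void *p, unsigned flags)
{
	assert(((uintptr_t)p & PTR_FLAG_MASK) == 0);	/* needs 32-byte alignment */
	assert((flags & ~PTR_FLAG_MASK) == 0);		/* flags must fit in 5 bits */
	return ((uintptr_t)p | flags);
}

static void *
ptr_unpack_ptr(uintptr_t word)
{
	return ((void *)(word & ~PTR_FLAG_MASK));
}

static unsigned
ptr_unpack_flags(uintptr_t word)
{
	return ((unsigned)(word & PTR_FLAG_MASK));
}
```

If the allocator returned pointers with less than 32 bytes of alignment,
the first assertion in ptr_pack() would fail, since flag bits would
overlap address bits.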
Reviewed by: markj, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42266
(cherry picked from commit 7d1469e555bdce32b3dfc898478ae5564d5072b1)
To be used for structures for which we want to enforce that pointers to
them have some number of lower bits always set to 0, while still
ensuring we benefit from cache line alignment to avoid false sharing
between structures and fields within the structures (provided they are
properly ordered).
First candidate consumer that comes to mind is 'struct thread', see next
commit.
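One way to think about combining the two requirements: when both the
cache line mask and the requested minimum mask have the form 2^k - 1,
their bitwise OR equals the larger of the two, so a single mask
satisfies both. The helper names below are hypothetical and this sketch
is not presented as the macro's definition.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Combine two alignment masks.  Assumes each mask is one less than a
 * power of two, in which case the OR is simply the stricter mask.
 */
static unsigned
align_cache_and_mask(unsigned cache_mask, unsigned min_mask)
{
	return (cache_mask | min_mask);
}

/* Round size up to the alignment described by mask (alignment = mask + 1). */
static size_t
roundup_to_mask(size_t size, unsigned mask)
{
	return ((size + mask) & ~(size_t)mask);
}
```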
Reviewed by: markj, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42265
(cherry picked from commit 733e0abd2897289e2acf70f7c72e31a5a560394a)
Substituting 'uma_align_cache' by the appropriately named accessor
uma_get_cache_align_mask() made apparent that dma_get_cache_alignment()
was off by one, since it was defined to be the mask derived from the
alignment value.
Reviewed by: markj, bz
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42264
(cherry picked from commit 2c7dd66d09a1b92a4698232996cded6e5315b3bd)
New function check_align_mask() asserts (under INVARIANTS) that the mask
fits in a (signed) integer (see the comment) and that the corresponding
alignment is a power of two.
Use check_align_mask() in uma_set_align_mask() and also in uma_zcreate()
to replace the KASSERT() there (that was checking only for a power of
2).
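The two conditions can be expressed compactly. The sketch below uses a
hypothetical align_mask_is_valid() that returns a boolean instead of
asserting, and assumes the mask is passed as an unsigned int; it is not
the kernel function.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * A mask is acceptable if it fits in a signed int and if mask + 1 (the
 * alignment) is a power of two.  For such masks, mask and mask + 1
 * share no set bits.
 */
static bool
align_mask_is_valid(unsigned mask)
{
	if (mask > (unsigned)INT_MAX)
		return (false);		/* highest bit set: does not fit */
	return ((mask & (mask + 1)) == 0);
}
```

Note that the alignment is the mask plus one, so a mask of 31 describes
32-byte alignment and a mask of 0 describes no alignment constraint.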
Reviewed by: kib, markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42263
(cherry picked from commit 87090f5e5a7b927a2ab30878435f6dcba0705a1d)
In uma_set_align_mask(), ensure that the passed value doesn't have its
highest bit set, which would lead to problems since keg/zone alignment
is internally stored as signed integers. Such big values do not make
sense anyway and indicate some programming error. A future commit will
introduce checks for this case and other ones.
Reviewed by: kib, markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42262
(cherry picked from commit 3d8f548b9e5772ff6890bdc01f7ba7b76203857d)
There's no point in setting 'arm_dcache_align_mask' before the
function's end.
Reviewed by: markj, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42261
(cherry picked from commit 1bce6f951a902f03bfb354f5b11473a0d12b3d7d)
Having a special value of -1 that is resolved internally to
'uma_align_cache' provides no significant advantages and prevents
changing that variable to an unsigned type, which is natural for an
alignment mask. So remove it and replace its use with a call to
uma_get_cache_align_mask(). The small overhead of the added function call is
irrelevant since UMA_ALIGN_CACHE is only used when creating new zones,
which is not performance critical.
Reviewed by: markj, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42259
(cherry picked from commit e557eafe7233f8231c1f5f5b098e4bab8e818645)
Create the uma_get_cache_align_mask() accessor and put it in a separate
private header so as to minimize namespace pollution in header/source
files that need only this function and not the whole 'uma.h' header.
Make sure the accessors have '_mask' as a suffix, so that callers are
aware that the real alignment is the power of two that is the mask plus
one. Rename the stem to something more explicit. Rename
uma_set_cache_align_mask()'s single parameter to 'mask'.
Hide 'uma_align_cache' to ensure that it cannot be set in any other way
than by a call to uma_set_cache_align_mask(), which will perform sanity
checks in a further commit. While here, rename it to
'uma_cache_align_mask'.
This is also in preparation for some further changes, such as improving
the sanity checks, eliminating internal resolving of UMA_ALIGN_CACHE and
changing the type of the 'uma_cache_align_mask' variable.
Reviewed by: markj, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42258
(cherry picked from commit dc8f7692fd1de628814f4eaf4a233dccf4c92199)
This allows us to figure out how many states each hashrow contains. That
can be important to know when debugging performance issues.
A simple probe could be:

    dtrace -n 'pf:purge:state:rowcount { @counts["states per row"] = quantize(arg1); }'
    dtrace: description 'pf:purge:state:rowcount ' matched 1 probe
    ^C

      states per row
               value  ------------- Distribution ------------- count
                  -1 |                                         0
                   0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 8257624
                   1 |                                         14321
                   2 |                                         0

MFC after: 1 week
Sponsored by: Modirum MDPay
(cherry picked from commit 0d2ab4a4ced0f153a6b6a58ca3cfa6efbeeec7a2)
This is consistent with version numbers used in releng/13.2.
PR: 275051
Reviewed by: bapt
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D42562
(cherry picked from commit 21e9018ae19662db643a21064150da866bc7beb4)
When print-ip-demux.c was introduced in ee67461e, the pfsync_ip_print
function was missed, causing tcpdump to treat pfsync packets on network
interfaces as an unknown protocol.
MFC after: 1 week
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D42504
(cherry picked from commit 85247ee6a2ba1c2dd0053e9be9055efa4be1438e)
KIOXIA CD8 SSDs routinely take ~25 seconds to delete a non-empty
namespace. In some cases, such as hot-plug, it takes longer, triggering
a timeout and controller resets after just 30 seconds. Linux has had a
separate 60-second timeout for the admin queue for many years. This
patch does the same, which also keeps the two systems consistent.
Sponsored by: iXsystems, Inc.
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42454
(cherry picked from commit 8d6c0743e36e3cff9279c40468711a82db98df23)
In a recent email list discussion related to NFSv4 mount problems
against a non-FreeBSD NFSv4 server, the reporter of the issue noted
that the server had replied 10068 (NFSERR_RETRYUNCACHEDREP). This
did not seem related to the mount problem, but I had never seen this
error before. It indicates that an RPC retry after a new TCP
connection has been established failed because the server did not
cache the reply. Since this should only happen for idempotent
operations, redoing the RPC should be safe.
This patch modifies the NFSv4.1/4.2 client to redo the RPC instead
of considering the server error fatal. It should only affect the
unusual case where TCP connections to NFSv4 servers are breaking
without the NFSv4 server rebooting.
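The redo behavior can be sketched as a small retry loop. rpc_with_redo(),
the rpc_fn callback type and the max_tries bound are hypothetical (the
patch description does not say whether retries are bounded), and
fake_rpc() is only a stub used to exercise the loop.

```c
#include <assert.h>

#define NFSERR_RETRYUNCACHEDREP	10068	/* value quoted in the report above */

typedef int (*rpc_fn)(void *arg);	/* hypothetical RPC callback */

/*
 * Redo the RPC while the server answers NFSERR_RETRYUNCACHEDREP, up to
 * max_tries attempts in total.  Any other result is returned as is.
 */
static int
rpc_with_redo(rpc_fn do_rpc, void *arg, int max_tries)
{
	int error, tries = 0;

	do {
		error = do_rpc(arg);
	} while (error == NFSERR_RETRYUNCACHEDREP && ++tries < max_tries);
	return (error);
}

/* Stub: fails with NFSERR_RETRYUNCACHEDREP until *remaining reaches 0. */
static int
fake_rpc(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (NFSERR_RETRYUNCACHEDREP);
	}
	return (0);
}
```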
MFC after: 2 weeks
(cherry picked from commit c4e298251ab01665f5bb3edeb740a51331818a45)
Commit 4692906480 made e6000sw's
implementation of miibus_(read|write)reg assume that the softc lock is
held. I presume that is to avoid lock recursion in e6000sw_attach() ->
e6000sw_attach_miibus() -> mii_attach() -> MIIBUS_READREG().
However, the lock assertion in e6000sw_readphy_locked() can fail if a
different driver uses the interface to probe registers. Work around the
problem by providing implementations which lock the softc if it is not
already locked.
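The workaround follows a "lock only if not already held" pattern, which
can be sketched as below. The structure and function names are
hypothetical and the lock is simulated with a boolean in a
single-threaded setting; a real driver would instead ask whether the
current thread owns the lock.

```c
#include <assert.h>
#include <stdbool.h>

struct softc {			/* hypothetical driver state */
	bool locked;		/* simulated lock; single-threaded sketch only */
	int reg[4];
};

static void
sc_lock(struct softc *sc)
{
	assert(!sc->locked);	/* recursion is not allowed */
	sc->locked = true;
}

static void
sc_unlock(struct softc *sc)
{
	assert(sc->locked);
	sc->locked = false;
}

/* Variant that requires the caller to hold the lock. */
static int
readphy_locked(struct softc *sc, int r)
{
	assert(sc->locked);
	return (sc->reg[r]);
}

/* Wrapper that takes the lock only when the caller does not hold it. */
static int
readphy(struct softc *sc, int r)
{
	bool took = false;
	int v;

	if (!sc->locked) {
		sc_lock(sc);
		took = true;
	}
	v = readphy_locked(sc, r);
	if (took)
		sc_unlock(sc);	/* restore the state the caller had */
	return (v);
}
```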
PR: 274795
Fixes: 4692906480 ("e6000sw: add readphy and writephy wrappers")
Reviewed by: kp, imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42466
(cherry picked from commit 725962a9f4c050b21488edd58d317e87c76d6f66)
This should make crash reports a bit more useful without having to ask
for additional information.
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42465
(cherry picked from commit 3e356fb885f6a742c496cefd71d8d33564de542a)
Allow userspace to retrieve low and high water marks, as well as the
current number of half open states.
MFC after: 1 week
Sponsored by: Modirum MDPay
(cherry picked from commit a6173e94635b03aa7aab90a67785c8c3e7c6247b)
The loader tunable 'security.mac.veriexec.block_unlink' has already
been flagged with CTLFLAG_RDTUN, so there is no need to re-fetch it with
TUNABLE_INT_FETCH.
While here, move the definition of the sysctl knob out of the function
body, which is more common in FreeBSD.
No functional change intended.
Reviewed by: stevek
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42132
(cherry picked from commit bb8d4411e0c668415538f66fb25e6b38bb910cdd)
The dead_bpf_if is not supposed to be written to. Make it const so that
an erroneous write to it makes the kernel panic instead of silently
corrupting memory.
No functional change intended.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42189
(cherry picked from commit 7a974a649848e1222a49d0d49726d06bd5c1dbd9)
Headers from src/include were in the runtime-dev package but
subdirectories of src/include ended up in utilities-dev by default.
Neither package is a good choice - the headers in src/include are not
useful without the libraries contained in clibs-dev.
This moves the standard C headers to clibs-dev (C++ headers are already
in this package). While working on this, I found that various clang
libraries and headers were also bundled into utilities-dev by default
so these are also moved to clang-dev.
I also added a FreeBSD-build-essential meta package to make it simple to
install all the toolchain parts.
PR: 254173
Reviewed by: manu
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41815
(cherry picked from commit 78847e1e592789dc85bddf4d2f1d9a8ce4614ff1)
The driver has a tunable hw.xn.enable_lro which is intended to control
whether LRO is enabled. This is currently non-functional - even if it is
set to zero, the driver still requests LRO support from the backend.
This change fixes the feature so that if enable_lro is set to zero, LRO
no longer appears in the interface capabilities and LRO is not requested
from the backend.
PR: 273046
MFC after: 1 week
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D41439
(cherry picked from commit da4b0d6eb06d730487d48e15d2d5e10c56266fd9)
When the cross-mount walking logic in vfs_lookup() was factored into
a separate function, the main cross-mount traversal loop was changed
from a do...while loop conditional on the current vnode having
VIRF_MOUNTPOINT set to an unconditional for(;;) loop. For the
unionfs 'crosslock' case in which the vnode may be re-locked, this
meant that continuing the loop upon finding inconsistent
v_mountedhere state would no longer branch to a check that the vnode
is in fact still a mountpoint. This would in turn lead to over-
iteration and, for INVARIANTS builds, a failed assert on the next
iteration.
Fix this by restoring the previous loop behavior.
Reported by: pho
Tested by: pho
Fixes: 80bd5ef070
(cherry picked from commit 586fed0b03561558644eccc37f824c7110500182)