Once a new vnode is visible from the mountpoint hash, we should set its
state from VSTATE_UNINITIALIZED to VSTATE_CONSTRUCTED. I do not think
this affects correctness at all, but the bug trips a check in
vop_unlock_debugpost(), previously hidden under options DEBUG_VFS_LOCKS.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D51720
(cherry picked from commit e2ac6de6e4edc1c6e7bfdfb0ec8fcf62f46d503f)
Replace priorities specified by a base priority and some hardcoded
offset value with symbolic constants. Hardcoding offsets prevents changing
the difference between priorities without changing their relative
ordering, and is generally a dangerous practice since the resulting
priority may inadvertently belong to a different selection policy's
range.
Since RQ_PPQ is 4, differences of less than 4 are insignificant, so just
remove them. These small differences have not been changed for years,
so it is likely they have no real meaning (besides having no practical
effect). One can still consult the change history to recover them if
ever needed.
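
As a rough illustration of why offsets smaller than RQ_PPQ tend not to
matter, the sketch below assumes the run queue index is simply the
priority divided by RQ_PPQ, and that base priorities are multiples of
RQ_PPQ. The helper runq_index() and the sample priority values are
hypothetical, not taken from the kernel sources.

```c
#include <assert.h>

/* Priorities per run queue; the text above states this is 4. */
#define RQ_PPQ 4

/*
 * Hypothetical helper: map a thread priority to a run queue index,
 * assuming the index is the priority divided by RQ_PPQ.
 */
static int
runq_index(int pri)
{
	assert(pri >= 0);
	return (pri / RQ_PPQ);
}
```

Under these assumptions, runq_index(120) and runq_index(123) select the
same queue, while runq_index(124) selects the next one.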
No functional change (intended).
MFC after: 1 month
Event: Kitchener-Waterloo Hackathon 202506
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45390
(cherry picked from commit 8ecc41918066422d6788a67251b22d11a6efeddf)
We set MNTK_LOOKUP_SHARED on p9fs mounts, but disable shared locking of
vnodes (i.e., LK_SHARED requests are automatically translated to
LK_EXCLUSIVE).
Reviewed by: kib
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D50759
This truncation is mostly harmless today, but fix it anyway to avoid
pain later down the road.
Reviewed by: olce, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D50417
(partially cherry picked from commit 0d224af399a66f00a5b33e5512fc018062cabf1d)
REMOVE doesn't work properly in the face of hard links. Use UNLINKAT
instead, which is implemented by qemu and bhyve and lets the client
specify the name being removed.
PR: 282432
Reviewed by: dfr
Differential Revision: https://reviews.freebsd.org/D47438
This code is using the vnode after it has been released and causing a
panic when a p9fs shared volume is unmounted. In fact, it seems like it's
just duplicated code left behind from a bad merge.
PR: 279887
Reported by: Michael Dexter
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1323
Mostly copied from smbfs. This driver in its current state has the exact
same issue that prevents the generic putpages implementation from
working.
Sponsored by: https://www.patreon.com/valpackett
Reviewed by: dfr
Differential Revision: https://reviews.freebsd.org/D45639
MFC after: 3 months
The lib9p implementation takes a strict interpretation of the Twalk RPC
call and returns an error for attempts to look up ".". The workaround is
to fake the lookup locally.
Reviewed by: Val Packett <val@packett.cool>
MFC after: 3 months
This is derived from swills@'s fork of the Juniper virtfs with many
changes by me including bug fixes, style improvements, clearer layering
and more consistent logging. The filesystem is renamed to p9fs to better
reflect its function and to prevent possible future confusion with
virtio-fs.
Several updates and fixes from Juniper have been integrated into this
version by Val Packett and these contributions along with the original
Juniper authors are credited below.
To use this with bhyve, add 'virtio_p9fs_load=YES' to loader.conf. The
bhyve virtio-9p device allows access from the guest to files on the host
by mapping a 'sharename' to a host path. It is possible to use p9fs as a
root filesystem by adding this to /boot/loader.conf:
vfs.root.mountfrom="p9fs:sharename"
for non-root filesystems add something like this to /etc/fstab:
sharename /mnt p9fs rw 0 0
In both examples, substitute the share name used on the bhyve command
line.
The 9P filesystem protocol relies on stateful file opens which map
protocol-level FIDs to host file descriptors. The FreeBSD vnode
interface doesn't really support this and we use heuristics to guess the
right FID to use for file operations. This can be confused by privilege
lowering and does not guarantee that the FID created for a given file
open is always used for file operations, even if the calling process is
using the file descriptor from the original open call. Improving this
would involve changes to the vnode interface which is out-of-scope for
this import.
Differential Revision: https://reviews.freebsd.org/D41844
Reviewed by: kib, emaste, dch
MFC after: 3 months
Co-authored-by: Val Packett <val@packett.cool>
Co-authored-by: Ka Ho Ng <kahon@juniper.net>
Co-authored-by: joyu <joyul@juniper.net>
Co-authored-by: Kumara Babu Narayanaswamy <bkumara@juniper.net>
Introduce two helpers, the more general SYSCTL_SIZEOF() and
a struct-specific one SYSCTL_SIZEOF_STRUCT() which prepends 'struct' in
the description and in the use of sizeof() but uses the raw structure
name as the knob's name. The size of the object/structure is exported
under 'debug.sizeof'.
Existing knobs under 'debug.sizeof' were all converted to use the
helpers.
Add a note before the helpers discouraging the introduction of new
leaves for ad-hoc reasons. List alternative means for developers to
obtain the size of arbitrary kernel structures easily (thanks to markj@
for providing these).
No functional change (intended).
Reviewed by: kib, markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D50121
(cherry picked from commit 713abc9880aabe0ff924ff644bceb6ff404ed3cd)
(cherry picked from commit efce9f8a510b60736994e50288b78fc7b32b5d90)
Approved by: re (cperciva)
Otherwise we end up searching for a device using an uninitialized key,
tripping up KMSAN.
MFC after: 2 weeks
(cherry picked from commit 48d6b52add36cc09e7fb1fbec44ab66c0742f320)
Certain NFSv4.1 callbacks are not currently supported/used
by the FreeBSD client. Without this patch, NFS4ERR_NOTSUPP
is replied for the callbacks. Since NFSv4.1 does not specify
all of these callbacks as optional, I think it is preferable
to reply NFS_OK or NFS4ERR_REJECT_DELEG instead of NFS4ERR_NOTSUPP.
This patch changes the reply status for these unsupported
callbacks, which the client has no use for.
I am not aware of any NFSv4.1 servers that will perform
any of these callbacks against the FreeBSD client at this time.
(cherry picked from commit 56c8c19046c46cb9e89c2a3967f8cf2cd0ace428)
Commits f5aff1871d32 and 7e26f1c21049 moved the delegation
and layout high water variables into the clientID structure.
This patch uses those variables to implement the
CB_RECALL_ANY NFSv4.1/4.2 callback.
This patch only affects NFSv4.1/4.2 mounts to non-FreeBSD
NFS servers that use CB_RECALL_ANY. The Linux knfsd is
one example of such a server.
(cherry picked from commit 8dc0889f56dd6ac5c33ce79337a971af4b9ff127)
Commit f5aff1871d32 moved the delegation high water
variables into the clientID structure, so that they are now
per mount instead of global. This patch does the
same for the layout highwater variables. It happens
that the layout highwater variables are not actually
used. This patch changes the code to use them.
This is needed to add support
for the CB_RECALL_ANY callback in a future commit.
This patch only affects NFSv4.1/4.2 mounts with the "pnfs"
mount option. The effect on these mounts will be minimal,
since layouts are returned when they are stale and this
normally ensures that the highwater mark is never hit.
(cherry picked from commit 7e26f1c21049b5a1a2f490d8ac1909ccb24f0db2)
Without this patch, the variables used to maintain a high
water limit for delegations are global and apply to all
mounts. This patch moves them into the clientID structure,
which makes them per mount. This is needed to add support
for the CB_RECALL_ANY callback in a future commit.
The only effect of this patch is an increase in the
total number of delegations held if there are multiple NFSv4
mounts to NFSv4 servers with delegations enabled.
Since the default of NFSCLDELEGHIGHWATER is fairly small,
this should not have a significant impact.
(cherry picked from commit f5aff1871d3273b3cd3621ea5d3e37cdd807e66f)
The callback CB_RECALL_SLOT is required for NFSv4.1/4.2.
Fortunately, there do not appear to be any extant
NFSv4.1/4.2 servers that use it. Since commit b97a478896e9
fixed handling of session slot shrinking, this patch
adds support for CB_RECALL_SLOT, which shrinks the
number of session slots as well.
(cherry picked from commit 4517fbfd4251180147082f94253c4347fa44f570)
It was reported on freebsd-fs@ that unrolling a tarball
failed to set the correct modify time if delegations
were being issued.
This patch fixes the problem.
This bug only affects NFSv4 mounts where delegations
are being issued. Not running the nfscbd or disabling
delegations on the NFSv4 server avoids the problem.
(cherry picked from commit b616d997cb48eaafe13069eecd95f0495b2358eb)
When an NFSv4.1/4.2 server reduces the size of the slot table
for a session as indicated by a smaller value for sr_target_highest_slot
in a Sequence reply, the sequence numbers for the slots no
longer in use must be re-initialized. This is needed since the
slot table may be grown again by the server later.
The RFC did not make clear the need for the sequence numbers to be
re-initialized when a shrink/grow of the slot table size happens,
but this has now been confirmed as correct behaviour.
The patch adds the code that does this re-initialization.
I am not currently aware of an NFSv4.1/4.2 server where the
session slots fail if this is not done, but there may be such
a case.
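
The re-initialization described above can be sketched as follows. This
is a simplified userland model, not the kernel code: the structure
sketch_session, the function sketch_set_target_slot(), the table size
and the initial sequence value of 0 are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAXSLOT 64	/* hypothetical table capacity */

/* Hypothetical, simplified session state; not the kernel's structure. */
struct sketch_session {
	uint32_t slot_seq[SKETCH_MAXSLOT];	/* per-slot sequence numbers */
	int	 highest_slot;			/* highest slot currently usable */
};

/*
 * Apply a new target highest slot taken from a Sequence reply.  When the
 * table shrinks, reset the sequence numbers of the slots that fall out of
 * use, so that they start afresh if the server later grows the table.
 */
static void
sketch_set_target_slot(struct sketch_session *sp, int target)
{
	assert(target >= 0 && target < SKETCH_MAXSLOT);
	for (int i = target + 1; i <= sp->highest_slot; i++)
		sp->slot_seq[i] = 0;	/* assumed initial value */
	sp->highest_slot = target;
}
```

Growing the table needs no extra work in this model, since the slots
brought back into use were already reset when the table shrank.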
(cherry picked from commit b97a478896e9523245c2041b064121ccb1f70426)
Richard Kojedzinszky reported an intermittent problem where
the Linux NFSv4.2 client would sometimes not see changes done
to a directory by another client, although the change attribute
for the directory had changed.
A test patch that added the change_attr_type attribute to the
server and always returned NFS4_CHANGE_TYPE_VERSION_COUNTER_NOPNFS
seems to have resolved the issue. Somewhat oddly, the Linux
knfsd server does not support this attribute but does not
seem to exhibit the stale caching problem.
This patch uses the VFCF_FILEREVINC flag on a file system (UFS, ZFS)
to return NFS4_CHANGE_TYPE_VERSION_COUNTER_NOPNFS. It also
returns NFS4_CHANGE_TYPE_TIME_METADATA if VFCF_FILEREVCT is set,
which may be useful for exported fuse file systems.
PR: 284186
(cherry picked from commit 709c18911ad70978d47198556c0fb1c0e703fb68)
Commit 026cdaa3b3a9 added a check for a nul or "/" in a file
name in a readdir reply. Unfortunately, the minimal testing
done on it did not detect a bug that can cause the client
to crash.
This patch fixes the code so that it does not crash.
Note that an NFS server will not normally return a file
name in a readdir reply that has a nul or "/" in it,
so the crash is unlikely.
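
For illustration only, a check of this kind might look like the sketch
below. The function sketch_name_ok() is hypothetical and is not the
client's actual code; it assumes the name from the reply is
length-counted rather than nul-terminated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical validator: return true if the name of length len contains
 * neither a nul byte nor a '/'.  The name is length-counted, so memchr()
 * is used instead of strchr().
 */
static bool
sketch_name_ok(const char *name, size_t len)
{
	assert(name != NULL);
	return (memchr(name, '\0', len) == NULL &&
	    memchr(name, '/', len) == NULL);
}
```

A caller would skip or reject any directory entry for which the check
returns false.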
PR: 283965
(cherry picked from commit f9f0a1d61c7b97c705246c747baec385e0592966)
Fix a leak of a fuse_ticket structure. The leak mostly affected
NFS-exported fuse file systems, and was triggered by a failure during
FUSE_LOOKUP.
Sponsored by: ConnectWise
(cherry picked from commit 969d1aa4dbfcbccd8de965f7761203208bf04e46)
The FUSE_NO_OPEN_SUPPORT and FUSE_NO_OPENDIR_SUPPORT flags
are only meant to indicate kernel features, and should be ignored
if they appear in the FUSE_INIT reply flags.
Also fix the corresponding test cases.
Reviewed by: Alan Somers <asomers@FreeBSD.org>
Signed-off-by: CismonX <admin@cismon.net>
Pull Request: https://github.com/freebsd/freebsd-src/pull/1509
(cherry picked from commit f0f596bd955e5b48c55db502e79fc652ac8970d3)
Unusually, the FUSE_NOTIFY_INVAL_INODE and FUSE_NOTIFY_INVAL_ENTRY
messages are fully asynchronous. The server sends them to the kernel
unsolicited. That means that unlike every other fuse message coming
from the server, these two arrive at a potentially unbusied mountpoint.
So they must explicitly busy it. Otherwise a page fault could result if
the mountpoint were being unmounted.
Reported by: JSML4ThWwBID69YC@protonmail.com
(cherry picked from commit 989998529387b4d98dfaa6c499ad88b006f78de8)
Re-ordering the fields suppresses the trailing padding which was causing
the structure to overflow 'struct fid'.
While here, re-indent in a more visually pleasing way.
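
The effect of field ordering on padding can be seen in the sketch below.
Both structures are hypothetical and only mimic the general shape of a
file handle identifier; the field names and types are assumptions, not
the layout changed by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the 64-bit member sits between smaller ones. */
struct sketch_fid_before {
	uint16_t len;
	uint64_t ino;	/* preceded by padding on typical ABIs */
	uint32_t gen;	/* may be followed by trailing padding */
};

/* Same data plus an explicit filler, ordered to avoid hidden padding. */
struct sketch_fid_after {
	uint16_t len;
	uint16_t unused;
	uint32_t gen;
	uint64_t ino;
};

static_assert(sizeof(struct sketch_fid_after) <=
    sizeof(struct sketch_fid_before),
    "reordering must not grow the structure");
```

On common LP64 ABIs the first layout is expected to be larger than the
second, although the exact sizes depend on the platform's alignment
rules.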
Reviewed by: rmacklem, emaste, markj
Approved by: markj (mentor)
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D47955
(cherry picked from commit 8ae6247aa966989412bd75fc7c26728690b9e944)
Sponsored by: The FreeBSD Foundation
File system specific *fid structures are copied into the generic
struct fid defined in sys/mount.h.
As such, they cannot be larger than struct fid.
This patch packs the structure and checks via a _Static_assert().
Reviewed by: markj
MFC after: 2 weeks
(cherry picked from commit bfc8e3308bee23d0f7836d57f32ed8d47da02627)
As the 'gen' field in 'struct tarfs_node' (and then 'struct tarfs_fid')
is filled with arc4random() which returns an unsigned int, change its
type in both structures. This allows reordering fields in 'struct
tarfs_fid' to reduce its size, finally avoiding the use of '__packed' to
ensure it fits into 'struct fid'.
While here, remove the 'data0' field which wasn't necessary from the
start.
Reviewed by: markj, rmacklem, des
Approved by: markj (mentor)
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D47954
(cherry picked from commit cf0ede720391de986e350f23229da21c13bc7e9d)
Sponsored by: The FreeBSD Foundation
File system specific *fid structures are copied into the generic
struct fid defined in sys/mount.h.
As such, they cannot be larger than struct fid.
This patch packs the structure and checks via a _Static_assert().
Reviewed by: markj
MFC after: 2 weeks
(cherry picked from commit 4db1b113b15158c7d134df83e7a7201cf46d459b)
Change 'struct tmpfs_fid_data' to behave consistently with the private
structure other FSes use. In a nutshell, make it a full alias of
'struct fid', instead of just using it to fill 'fid_data'. This implies
adding a length field at start (aliasing 'fid_len' of 'struct fid'), and
filling 'fid_len' with the full size of the aliased structure.
To ensure that the new 'struct tmpfs_fid_data' is smaller than 'struct
fid', which the compile-time assert introduced in commit
91b5592a1e1af974 ("fs: Add static asserts for the size of fid
structures") checks (and thus was not strong enough when added), use
'__packed'.
A consequence of this change is that copying the 'struct tmpfs_fid_data'
into a stack-allocated variable becomes unnecessary; we simply rely on
the compiler emitting the proper code on seeing '__packed' (and on the
start of 'struct tmpfs_fid_data' being naturally aligned, which is
normally guaranteed by kernel's malloc() and/or inclusion in 'struct
fhandle').
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D47956
(cherry picked from commit 1ccbdf561f417f9fe802131d5b1756ac45fd314d)
As a process really changes credentials at the moment proc_set_cred() or
proc_unset_cred() is called, these functions are the proper locations to
perform the update of the new and old real users' process count (using
chgproccnt()).
Before this change, change_ruid() instead would perform that update,
although it operates only on a passed credential which is a priori not
tied to the calling process (or not to any process at all). This was
arguably a flaw of commit b1fc0ec1a7, r77183, based on its commit
message, and in particular the portion "(...) In each case, the call now
acts on a credential not a process (...)".
Fixing this makes using change_ruid() more natural when building
candidate credentials that in the end are not applied to a process,
e.g., because of some intervening privilege check. Also, it removes
a hack around this unwanted process count change in unionfs.
We also introduce the new proc_set_cred_enforce_proc_lim() so that
callers can respect the per-user process limit, and will use it for the
upcoming setcred(). We plan to change all callers of proc_set_cred() to
call this new function instead at some point. In the meantime, both
proc_set_cred() and the new function will coexist.
As detailed in a comment in proc_set_cred_enforce_proc_lim(), checking
against the process limit is currently flawed as the kernel doesn't
really maintain the number of processes per UID (besides RLIMIT_NPROC,
this in fact also applies to RLIMIT_KQUEUES, RLIMIT_NPTS, RLIMIT_SBSIZE
and RLIMIT_SWAP). The applied limit is currently that of the old real
UID. Root (or a process granted with PRIV_PROC_LIMIT) is not subject to
this limit.
Approved by: markj (mentor)
Fixes: b1fc0ec1a7
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D46923
(cherry picked from commit d2be7ed63affd8af5fe6203002b7cc3cbe7f7891)
This permits the mask bits to control the upper 3 bits used for setuid,
setgid, and sticky permissions. While here, clarify the manpage language
as non-Rockridge volumes with extended attributes can also supply users
and groups along with permissions.
Reviewed by: olce
Fixes: 82f2275b73e5 cd9660: Add support for mask,dirmask,uid,gid options
Differential Revision: https://reviews.freebsd.org/D47357
(cherry picked from commit c1ad5b4b10c5e426d3d782b7216a038187419a1e)
File system specific *fid structures are copied into the generic
struct fid defined in sys/mount.h.
As such, they cannot be larger than struct fid.
This patch adds _Static_assert()s to check for this.
ZFS and fuse already have _Static_assert()s.
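
A compile-time size check of this kind can be sketched as below. The
structures are userland stand-ins: sketch_fid only approximates the
generic structure, and sketch_fs_fid and sketch_fid_copy() are
hypothetical names that do not correspond to any particular file system.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userland stand-in for the generic structure; sizes are assumptions. */
#define SKETCH_MAXFIDSZ 16
struct sketch_fid {
	uint16_t fid_len;
	uint16_t fid_data0;
	char	 fid_data[SKETCH_MAXFIDSZ];
};

/* Hypothetical file-system-specific structure that aliases it. */
struct sketch_fs_fid {
	uint16_t len;
	uint16_t pad;
	uint32_t gen;
	uint64_t ino;
};

/* Fails the build, instead of overflowing at run time, if it grows. */
_Static_assert(sizeof(struct sketch_fs_fid) <= sizeof(struct sketch_fid),
    "sketch_fs_fid must fit in sketch_fid");

/* Copy the specific structure into the generic one. */
static void
sketch_fid_copy(struct sketch_fid *dst, const struct sketch_fs_fid *src)
{
	assert(src->len <= sizeof(*dst));
	memcpy(dst, src, sizeof(*src));
}
```

Because the assertion is evaluated by the compiler, a later change that
enlarges the specific structure is caught before the copy can overrun
the generic one.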
(cherry picked from commit 91b5592a1e1af97480d615cf508be05b5674d2f3)
- cd_ino_t can be dropped since ino_t is now 64 bits wide.
- ISOFSMNT_ROOT is unused (and defined only for the kernel).
No functional change intended.
Reviewed by: olce, imp, kib, emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D47880
(cherry picked from commit 96e69c8e3167a57b8b37317796eba3068d12a771)
File system specific *fid structures are copied into the generic
struct fid defined in sys/mount.h.
As such, they cannot be larger than struct fid.
This patch packs the structure and checks via a _Static_assert().
Reported by: Kevin Miller <mas@0x194.net>
Reviewed by: olce, imp, kib, emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D47879
(cherry picked from commit 205659c43d87bd42c4a0819fde8f81e8ebba068e)
Without this patch, an all upper case user domain name
(as specified by nfsuserd(8)) would not work.
I believe this was done so that Kerberos realms were
not confused with user domains.
Now, RFC8881 specifies that the user domain name is a
DNS name. As such, all upper case names should work.
This patch fixes this case so that it works. The custom
comparison function is no longer needed.
PR: 282620
(cherry picked from commit 0347ddf41f4226c0351d2d1d78f09e8300ebac93)
RFC8275 defines a new attribute as an extension to NFSv4.2
called MODE_UMASK. This patch adds support for this attribute
to the NFSv4.2 client and server.
Since FreeBSD applies the umask above the VFS/VOP layer,
this attribute does not actually have any effect on the
handling of ACL inheritance, which is what it is designed for.
However, future changes to NFSv4.2 require support of it,
so this patch does that, resulting in behaviour identical to
the mode attribute already supported.
(cherry picked from commit 2477e88b8d4328535357bc62409f673a551be179)
This is mostly to reduce the diff with CheriBSD which adds additional
constants to enum uio_rw, but also matches the normal style used for
uio_segflg.
Reviewed by: kib, emaste
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D45142
(cherry picked from commit 473c90ac04cec0abbb414978c53e9c259c9129e8)
We cannot unconditionally access nfsd's VNET variables in
'sys/kern/vfs_export.c' nor 'sys/fs/nfsserver/nfs_nfsdsubs.c', as they
may not have been compiled in depending on build options.
So, forget about the extra mile of using the configured default group
and use the hardcoded GID_NOGROUP (which differs only on systems running
nfsuserd(8) and with a non-default GID for their "nogroup" group).
Reported by: rpokala, bapt (MINIMAL compile breakage)
Reported by: cy, David Wolfskill (panics caused by mountd(8))
Approved by: markj (mentor)
Fixes: cfbe7a62dc62 ("nfs, rpc: Ensure kernel credentials have at least one group")
(cherry picked from commit 5169d4307eb9c8b7bb0bd46d600012bcc12cbdae)
Approved by: markj (mentor)
This fixes several bugs where some 'struct ucred' in the kernel,
constructed from user input (via nmount(2)) or obtained from other
servers (e.g., gssd(8)), could have an unfilled 'cr_groups' field and
whose 'cr_groups[0]' (or 'cr_gid', which is an alias) was later
accessed, causing an uninitialized access giving random access rights.
Use crsetgroups_fallback() to enforce a fallback group when possible.
For NFS, the chosen fallback group is that of the NFS server in the
current VNET (NFSD_VNET(nfsrv_defaultgid)).
There does not seem to be any sensible fallback available in rpc code
(sys/rpc/svc_auth.c, svc_getcred()) on AUTH_UNIX (TLS or not), so just
fail credential retrieval there. Stock NSS sources, rpc.tlsservd(8) or
rpc.tlsclntd(8) provide non-empty group lists, so will not be impacted.
Discussed with: rmacklem (by mail)
Approved by: markj (mentor)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D46918
(cherry picked from commit cfbe7a62dc62e8a5d7520cb5eb8ad7c4a9418e26)
Approved by: markj (mentor)