385 commits

Author SHA1 Message Date
Olivier Certner
c1d7552ddd
New setcred() system call and associated MAC hooks
This new system call allows setting all necessary credentials of
a process in one go: Effective, real and saved UIDs, effective, real and
saved GIDs, supplementary groups and the MAC label.  Its advantage over
standard credential-setting system calls (such as setuid(), seteuid(),
etc.) is that it enables MAC modules, such as MAC/do, to restrict the
set of credentials some process may gain in a fine-grained manner.

Traditionally, credential changes rely on setuid binaries that call
multiple credential system calls and in a specific order (setuid() must
be last, so as to remain root for all other credential-setting calls,
which would otherwise fail with insufficient privileges).  This
piecewise approach causes the process to transiently hold credentials
that are neither the original nor the final ones.  For the kernel to
enforce that only certain transitions of credentials are allowed, either
these possibly non-compliant transient states have to disappear (by
setting all relevant attributes in one go), or the kernel must delay
setting or checking the new credentials.  Delaying setting credentials
could be done, e.g., by having some mode where the standard system calls
contribute to building new credentials but without committing them.  It
could be started and ended by a special system call.  Delaying checking
could mean that, e.g., the kernel only verifies the credentials
transition at the next non-credential-setting system call (we just
mention this possibility for completeness, but are certainly not
endorsing it).

We chose the simpler approach of a new system call, as we don't expect
the set of credentials one can set to change often.  It has the
advantages that the traditional system calls' code doesn't have to be
changed and that we can establish a special MAC protocol for it, by
having some cleanup function called just before returning (this is
a requirement for MAC/do), without disturbing the existing ones.
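
The transient-state problem described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: 'struct creds', transition_allowed() and the two counting functions are hypothetical names invented for illustration, and the policy (a process may only keep its credentials or move to one exact target) is a deliberately simple stand-in for what a MAC module might enforce.

```c
#include <stdbool.h>

/* Hypothetical, simplified credential set; not the kernel's 'struct ucred'. */
struct creds {
	unsigned uid;	/* stands in for the real/effective/saved UIDs */
	unsigned gid;	/* stands in for the real/effective/saved GIDs */
};

/*
 * Hypothetical policy: the only states accepted are the starting
 * credentials and one exact target.
 */
static bool
transition_allowed(const struct creds *from, const struct creds *to,
    const struct creds *target)
{
	if (to->uid == from->uid && to->gid == from->gid)
		return (true);
	return (to->uid == target->uid && to->gid == target->gid);
}

/*
 * Piecewise change: the GID is set first and the UID last, mirroring the
 * traditional ordering.  Returns how many visited states the policy rejects.
 */
int
piecewise_violations(struct creds start, struct creds target)
{
	struct creds cur = start;
	int violations = 0;

	cur.gid = target.gid;	/* first call: old UID, new GID */
	if (!transition_allowed(&start, &cur, &target))
		violations++;
	cur.uid = target.uid;	/* second call: final state */
	if (!transition_allowed(&start, &cur, &target))
		violations++;
	return (violations);
}

/* One-go change: only the final state is ever visible to the policy. */
int
atomic_violations(struct creds start, struct creds target)
{
	return (transition_allowed(&start, &target, &target) ? 0 : 1);
}
```

Starting from UID 0/GID 0 with a target of UID 1001/GID 1001, the piecewise path passes through the state (UID 0, GID 1001), which this policy rejects, whereas the one-go change only ever presents the permitted final state.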

The mac_cred_check_setcred() hook is passed the flags received by
setcred() (including the version) and both the old and new kernel's
'struct ucred' instead of 'struct setcred' as this should simplify
evolving existing hooks as the 'struct setcred' structure evolves.  The
mac_cred_setcred_enter() and mac_cred_setcred_exit() hooks are always
called in pairs around potential calls to mac_cred_check_setcred().
They allow MAC modules to allocate/free data they may need in their
mac_cred_check_setcred() hook, as the latter is called under the current
process' lock, rendering sleepable allocations impossible.  MAC/do is
going to leverage these in a subsequent commit.  A scheme where
mac_cred_check_setcred() could return ERESTART was considered but is
incompatible with proper composition of MAC modules.
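
The enter/check/exit ordering described in this paragraph can be sketched in user space as follows. All names here (struct scratch, policy_setcred_enter(), policy_check_setcred(), policy_setcred_exit(), setcred_sketch()) are hypothetical and do not reflect the real hook signatures; the sketch only assumes that allocation is allowed in the "enter" step and not in the "check" step.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-call data a policy might need in its check hook. */
struct scratch {
	unsigned *allowed_uids;
	size_t count;
};

/* "enter" step: runs before any lock is held, so it may allocate. */
int
policy_setcred_enter(struct scratch *s, const unsigned *uids, size_t count)
{
	s->allowed_uids = NULL;
	s->count = 0;
	if (count == 0)
		return (0);
	s->allowed_uids = malloc(count * sizeof(*uids));
	if (s->allowed_uids == NULL)
		return (ENOMEM);
	memcpy(s->allowed_uids, uids, count * sizeof(*uids));
	s->count = count;
	return (0);
}

/* "check" step: models running under a lock; reads preallocated data only. */
int
policy_check_setcred(const struct scratch *s, unsigned new_uid)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->allowed_uids[i] == new_uid)
			return (0);
	return (EPERM);
}

/* "exit" step: always paired with "enter"; releases the scratch data. */
void
policy_setcred_exit(struct scratch *s)
{
	free(s->allowed_uids);
	s->allowed_uids = NULL;
	s->count = 0;
}

/* Driver showing the ordering: enter, check (lock held), exit. */
int
setcred_sketch(const unsigned *allowed, size_t count, unsigned new_uid)
{
	struct scratch s;
	int error;

	error = policy_setcred_enter(&s, allowed, count);
	if (error != 0)
		return (error);
	/* The process lock would be acquired here. */
	error = policy_check_setcred(&s, new_uid);
	/* The process lock would be released here. */
	policy_setcred_exit(&s);
	return (error);
}
```

The point of the split is that the check step never needs to allocate: everything it reads was prepared in the enter step and is released in the exit step, whether or not the check succeeded.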

While here, add missing includes and declarations for standalone
inclusion of <sys/ucred.h> both from kernel and userspace (for the
latter, it has been working thanks to <bsm/audit.h> already including
<sys/types.h>).

Reviewed by:    brooks
Approved by:    markj (mentor)
Relnotes:       yes
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D47618

(cherry picked from commit ddb3eb4efe55e57c206f3534263c77b837aff1dc)
2025-04-03 21:31:03 +02:00
Olivier Certner
731dc8994c
MAC: syscalls: mac_label_copyin(): 32-bit compatibility
Needed by the upcoming setcred() system call.  More generally, this is a
step towards supporting 32-bit compatibility for MAC-related system calls.

Reviewed by:    brooks
Approved by:    markj (mentor)
MFC after:      2 weeks
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D47878

(cherry picked from commit 3bdc5ba2ac760634056c66c3c98b6b3452258a5b)
2025-01-16 19:06:56 +01:00
Olivier Certner
c2bf375d10
MAC: syscalls: Split mac_set_proc() into reusable pieces
This is in preparation for enabling the new setcred() system call to set
a process' MAC label.

No functional change (intended).

MFC after:      2 weeks
Approved by:    markj (mentor)
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D46905

(cherry picked from commit 8a4d24a39098ed8170a37ca2aa83bf1da1976de1)
2025-01-16 19:06:56 +01:00
Olivier Certner
f0bd9df3e6
MAC: syscalls: Factor out common label copy-in code
Besides simplifying existing code, this will later enable the new
setcred() system call to copy MAC labels.

MFC after:      2 weeks
Approved by:    markj (mentor)
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D46904

(cherry picked from commit 2e593dd3b5e1c515d57b3d3f929e544a6622b04a)
2025-01-16 19:06:56 +01:00
Olivier Certner
62d3e81935
MAC: mac_policy.h: Declare common MAC sysctl and jail parameters' nodes
Do this only when the headers for these functionalities were included
prior to this one.  Indeed, if they need to be included, style(9)
mandates they should have been so before this one.

Remove the common MAC sysctl declaration from
<security/mac/mac_internal.h>, as it is now redundant (all its includers
also include <security/mac/mac_policy.h>).

Remove such local declarations from all policies' files.

Reviewed by:    jamie
Approved by:    markj (mentor)
MFC after:      5 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D46903

(cherry picked from commit db33c6f3ae9d1231087710068ee4ea5398aacca7)

The original changes in 'sys/security/mac_grantbylabel/mac_grantbylabel.c' were
removed as MAC/grantbylabel has not been MFCed.
2025-01-16 19:06:55 +01:00
Olivier Certner
66fb52a272
MAC: Define a common 'mac' node for MAC's jail parameters
To be used by MAC/do.

Reviewed by:    jamie
Approved by:    markj (mentor)
MFC after:      5 days
Relnotes:       yes
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D46899

(cherry picked from commit 5041b20503dbb442cc9ebd0a6e4db26905102c72)
2025-01-16 19:06:54 +01:00
Olivier Certner
f878893bdc
MAC: 'kernel_mac_support' module: Make an outdated comment more generic
No functional change.

Reviewed by:    jamie
Approved by:    markj (mentor)
MFC after:      5 days
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D46898

(cherry picked from commit 90678c892d7b3a90339b7fc19fde16c64fe3cb70)
2025-01-16 19:06:54 +01:00
Michael Tuexen
66c7d5365a MAC: improve handling of listening sockets
so_peerlabel can only be used when the socket is not listening.

Reviewed by:		markj
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46755

(cherry picked from commit 2fb778fab893b4a8a86ecfa20acf2e23bb2cdae8)
2024-10-31 12:32:36 +01:00
Michael Tuexen
406d75a581 MAC: improve consistency in error handling
Whenever mac_syncache_init() returns an error, ensure that
*label = NULL. This simplifies the error handling by the caller.

Reviewed by:		rscheff
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46701

(cherry picked from commit 3f2792166aeed4baf07d351bcb12a9d196c443eb)
2024-10-31 12:24:34 +01:00
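
The error-handling convention in the commit above (on failure, the output pointer is left as NULL) can be sketched as follows. This is an illustration only: 'struct label' and label_init() are hypothetical names, and the 'fail' parameter merely simulates a later initialization step reporting an error.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical label object; not the kernel's 'struct label'. */
struct label {
	int slots[4];
};

/*
 * Hypothetical initializer.  Whenever it returns an error, *label is
 * NULL, so the caller has nothing to clean up and can test either the
 * return value or the pointer.
 */
int
label_init(struct label **label, int fail)
{
	*label = calloc(1, sizeof(**label));
	if (*label == NULL)
		return (ENOMEM);
	if (fail) {
		/* Simulated failure of a later initialization step. */
		free(*label);
		*label = NULL;
		return (ENOMEM);
	}
	return (0);
}
```

A caller can then write "if (label_init(&l, ...) != 0) return;" without a separate cleanup path, since no partially initialized object escapes.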
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Shivank Garg
215bab7924 mac_ipacl: new MAC policy module to limit jail/vnet IP configuration
The mac_ipacl policy module enables fine-grained control over IP address
configuration within VNET jails from the base system.
It allows the root user to define rules governing IP addresses for
jails and their interfaces using the sysctl interface.

Requested by:	multiple
Sponsored by:	Google, Inc. (GSoC 2019)
MFC after:	2 months
Reviewed by:	bz, dch (both earlier versions)
Differential Revision: https://reviews.freebsd.org/D20967
2023-07-26 00:07:57 +00:00
Steve Kiernan
8deb442cf7 mac: Honor order when registering MAC modules.
Ensure MAC modules are inserted in the order in which they are registered.

Reviewed by:	markj
Obtained from:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D39589
2023-04-18 15:36:27 -04:00
Mark Johnston
cab1056105 kdb: Modify securelevel policy
Currently, sysctls which enable KDB in some way are flagged with
CTLFLAG_SECURE, meaning that you can't modify them if securelevel > 0.
This is so that KDB cannot be used to lower a running system's
securelevel, see commit 3d7618d8bf.  However, the newer mac_ddb(4)
restricts DDB operations which could be abused to lower securelevel
while retaining some ability to gather useful debugging information.

To enable the use of KDB (specifically, DDB) on systems with a raised
securelevel, change the KDB sysctl policy: rather than relying on
CTLFLAG_SECURE, add a check of the current securelevel to kdb_trap().
If the securelevel is raised, only pass control to the backend if MAC
specifically grants access; otherwise simply check to see if mac_ddb
vetoes the request, as before.

Add a new secure sysctl, debug.kdb.enter_securelevel, to override this
behaviour.  That is, the sysctl lets one enter a KDB backend even with a
raised securelevel, so long as it is set before the securelevel is
raised.

Reviewed by:	mhorne, stevek
MFC after:	1 month
Sponsored by:	Juniper Networks
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D37122
2023-03-30 10:45:00 -04:00
Justin Hibbits
30af2c131b IfAPI: Add if_get/setmaclabel() and use it.
Summary:
Port the MAC modules to use the IfAPI APIs as part of this.

Sponsored by:	Juniper Networks, Inc.
Reviewed by:	glebius
Differential Revision: https://reviews.freebsd.org/D38197
2023-01-31 15:02:15 -05:00
Mateusz Guzik
85dac03e30 vfs: stop using NDFREE
It provides nothing but a branchfest and next to no consumers want it
anyway.

Tested by:	pho
2022-12-19 08:07:23 +00:00
Allan Jude
5031550134 Bump MAC_VERSION to 5
2449b9e5fe introduced API changes
that require ensuring that loadable MAC modules use the matching API.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
2022-10-07 15:24:32 +00:00
Mateusz Guzik
a75d1ddd74 vfs: introduce V_PCATCH to stop abusing PCATCH
2022-09-17 15:41:37 +00:00
Gleb Smirnoff
e7d02be19d protosw: refactor protosw and domain static declaration and load
o Assert that every protosw has pr_attach.  Now this structure is
  only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw.  This was suggested
  in 1996 by wollman@ (see 7b187005d1), and later reiterated
  in 2006 by rwatson@ (see 6fbb9cf860).
o Make struct domain hold a variable sized array of protosw pointers.
  For most protocols these pointers are initialized statically.
  Those domains that may have loadable protocols have spacers. IPv4
  and IPv6 have 8 spacers each (andre@ dff3237ee5).
o For inetsw and inet6sw leave a comment noting that many protosw
  entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36232
2022-08-17 11:50:32 -07:00
Mateusz Guzik
60dae3b83b mac: cheaper check for mac_pipe_check_read
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D36082
2022-08-17 14:21:25 +00:00
Mateusz Guzik
92b5b97cb0 mac: s/0/false/ in macros denoting probe enablement
No functional changes.
2022-08-11 22:11:24 +00:00
Mitchell Horne
2449b9e5fe mac: kdb/ddb framework hooks
Add three simple hooks to the debugger allowing for a loaded MAC policy
to intervene if desired:
 1. Before invoking the kdb backend
 2. Before ddb command registration
 3. Before ddb command execution

We extend struct db_command with a private pointer and two flag bits
reserved for policy use.

Reviewed by:	markj
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D35370
2022-07-18 22:06:13 +00:00
Dmitry Chagin
31d1b816fe sysent: Get rid of bogus sys/sysent.h include.
Where appropriate hide sysent.h under proper condition.

MFC after:	2 weeks
2022-05-28 20:52:17 +03:00
Mateusz Guzik
7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Mateusz Guzik
f77697dd9f mac: cheaper check for ifnet_create_mbuf and ifnet_check_transmit
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-29 15:06:45 +02:00
Gleb Smirnoff
08d9c92027 tcp_input/syncache: acquire only read lock on PCB for SYN,!ACK packets
When packet is a SYN packet, we don't need to modify any existing PCB.
Normally SYN arrives on a listening socket, we either create a syncache
entry or generate syncookie, but we don't modify anything with the
listening socket or associated PCB. Thus create a new PCB lookup
mode - rlock if listening. This removes the primary contention point
under SYN flood - the listening socket PCB.

Sidenote: when SYN arrives on a synchronized connection, we still
don't need write access to PCB to send a challenge ACK or just to
drop. There is only one exception - tcptw recycling. However,
existing entanglement of tcp_input + stacks doesn't allow making
this change small. Consider this patch a first approach to the problem.

Reviewed by:	rrs
Differential revision:	https://reviews.freebsd.org/D29576
2021-04-12 08:25:31 -07:00
Robert Watson
a92c6b24c0 Add a comment on why the call to mac_vnode_relabel() might be in the wrong
place -- in the VOP rather than vn_setexttr() -- and that it is for historic
reasons.  We might wish to relocate it in due course, but this way at least
we document the asymmetry.
2021-02-27 16:25:26 +00:00
Mateusz Guzik
6b3a9a0f3d Convert remaining cap_rights_init users to cap_rights_init_one
semantic patch:

@@

expression rights, r;

@@

- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)
2021-01-12 13:16:10 +00:00
Mateusz Guzik
77589de8aa mac: cheaper check for mac_vnode_check_readlink
2021-01-08 13:57:10 +00:00
Mateusz Guzik
33f3e81df5 cache: combine fast path enabled status into one flag
Tested by:	pho
2021-01-06 07:28:06 +00:00
Mateusz Guzik
89744405e6 pipe: allow for lockless pipe_stat
Pipes get stat'ed all the time and this avoidably contributed to contention.
The pipe lock is only held to accommodate MAC and to check the type.

Since there is normally no probe for pipe stat, depessimize this by having a
flag.

The pipe_state field gets modified with locks held all the time and it's not
feasible to convert them to use atomic store. Move the type flag away to a
separate variable as a simple cleanup and to provide stable field to read.
Use short for both fields to avoid growing the struct.

While here short-circuit MAC for pipe_poll as well.
2020-11-19 06:30:25 +00:00
Andriy Gapon
137d26e8a3 mac_framework.h: fix build with DEBUG_VFS_LOCKS and !MAC
I have such a custom kernel configuration and its build failed with:
linking kernel.full
ld: error: undefined symbol: mac_vnode_assert_locked
>>> referenced by mac_framework.h:556 (/usr/devel/git/apu2c4/sys/security/mac/mac_framework.h:556)
>>>               tmpfs_vnops.o:(mac_vnode_check_stat)
>>> referenced by mac_framework.h:556 (/usr/devel/git/apu2c4/sys/security/mac/mac_framework.h:556)
>>>               vfs_default.o:(mac_vnode_check_stat)
>>> referenced by mac_framework.h:556 (/usr/devel/git/apu2c4/sys/security/mac/mac_framework.h:556)
>>>               ufs_vnops.o:(mac_vnode_check_stat)
2020-09-03 20:30:52 +00:00
Mateusz Guzik
e5ecee7440 security: clean up empty lines in .c and .h files
2020-09-01 21:26:00 +00:00
Mateusz Guzik
4ec34a908b mac: even up all entry points to the same scheme
- use a macro for checking whether the site is enabled
- expand it to 0 if mac is not compiled in to begin with
2020-08-06 00:23:06 +00:00
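
The scheme in the commit above (an "enabled" macro per entry point that collapses to a constant when MAC is not compiled in) might look roughly like the sketch below. The names SKETCH_MAC, sketch_stat_enabled(), sketch_stat_hook() and sketch_check_stat() are made up for illustration and are not the identifiers used in the tree.

```c
#include <stdbool.h>

/* Define SKETCH_MAC to model a build with MAC support compiled in. */
#ifdef SKETCH_MAC
static bool sketch_stat_flag;	/* set when some policy installs the hook */
#define	sketch_stat_enabled()	(sketch_stat_flag)
#else
/* Without MAC the test is a constant, so the call below is dead code. */
#define	sketch_stat_enabled()	false
#endif

static int sketch_hook_calls;	/* counts slow-path invocations */

/* Stand-in for the expensive path that consults the policies. */
static int
sketch_stat_hook(void)
{
	sketch_hook_calls++;
	return (0);
}

/* Entry point: every call site uses the same "enabled" test first. */
int
sketch_check_stat(void)
{
	if (!sketch_stat_enabled())
		return (0);
	return (sketch_stat_hook());
}
```

Built without SKETCH_MAC, sketch_check_stat() always returns 0 and never reaches the hook, which is the effect the constant expansion is meant to have.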
Mateusz Guzik
18f67bc413 vfs: add a cheaper entry for mac_vnode_check_access
2020-08-05 07:34:45 +00:00
Mateusz Guzik
5b0acaf75f Fix tinderbox build after r363714
2020-07-30 22:56:57 +00:00
Mateusz Guzik
fad6dd772d vfs: elide MAC-induced locking on rename if there are no relevant hooks
2020-07-29 17:05:31 +00:00
Mateusz Guzik
07d2145a17 vfs: add the infrastructure for lockless lookup
Reviewed by:    kib
Tested by:      pho (in a patchset)
Differential Revision:	https://reviews.freebsd.org/D25577
2020-07-25 10:32:45 +00:00
Mateusz Guzik
3ea3fbe685 vfs: fix vn_poll performance with either MAC or AUDIT
The code would unconditionally lock the vnode to audit or call the
MAC hook, even if neither wanted to do anything. Pre-check the state
to avoid locking in the common case of nothing to do.

Note this code should not normally be executed anyway, as vnodes
always return ready. However, poll1/2 from will-it-scale use regular
files for benchmarking, presumably to focus on the interface itself,
as the vnode handler is supposed to do almost nothing.

This in particular fixes poll2 which passes 128 fds.

$ ./poll2_processes -s 10
before: 134411
after:  271572
2020-07-16 14:09:18 +00:00
Mateusz Guzik
ab06a30517 vfs: fix MAC/AUDIT mismatch in vn_poll
Auditing would not be performed without MAC compiled in.
2020-07-16 14:04:28 +00:00
Jason A. Harmening
407a5b7953 mac_policy: Remove mac_policy_sx
This lock was made unnecessary by the addition of mac_policy_rms in r356120.

Reviewed by:	mjg, kib
Differential Revision:	https://reviews.freebsd.org/D24283
2020-04-04 04:03:10 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Mateusz Guzik
6ebab6bad2 vfs: use mac fastpath for lookup, open, read, write, mmap 2020-02-13 22:22:55 +00:00
Mateusz Guzik
91061084d1 mac: implement fast path for checks
All checking routines walk a linked list of all modules in order to determine
if given hook is installed. This became a significant problem after mac_ntpd
started being loaded by default.

Implement a way to perform checks for select hooks by testing a boolean.

Use it for priv_check and priv_grant, which are constantly called from priv_check.

The real fix would use hotpatching, but the above provides a way to know when
to do it.
2020-02-13 22:19:17 +00:00
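
The fast path described in the commit above can be sketched as a boolean that is consulted before any list walk. This is a simplified user-space model, not the framework's code: struct policy, policy_register(), priv_check_slow(), priv_check(), deny_odd() and the list_walks counter are all hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*check_fn)(int arg);

/* Hypothetical policy record; a NULL hook means "not implemented". */
struct policy {
	check_fn priv_check;
	struct policy *next;
};

static struct policy *policy_list;
static bool priv_check_enabled;		/* the fast-path flag */
static unsigned long list_walks;	/* counts slow-path list walks */

/* Append a policy and raise the flag if it implements the hook. */
void
policy_register(struct policy *p)
{
	struct policy **pp;

	for (pp = &policy_list; *pp != NULL; pp = &(*pp)->next)
		;
	p->next = NULL;
	*pp = p;
	if (p->priv_check != NULL)
		priv_check_enabled = true;
}

/* Slow path: walk every registered policy, keeping the first error. */
int
priv_check_slow(int arg)
{
	int error = 0;

	list_walks++;
	for (struct policy *p = policy_list; p != NULL; p = p->next) {
		if (p->priv_check != NULL) {
			int e = p->priv_check(arg);

			if (e != 0 && error == 0)
				error = e;
		}
	}
	return (error);
}

/* Fast path: a single boolean test when no policy installs the hook. */
int
priv_check(int arg)
{
	if (!priv_check_enabled)
		return (0);
	return (priv_check_slow(arg));
}

/* Example hook used to exercise the sketch: refuses odd arguments. */
int
deny_odd(int arg)
{
	return ((arg % 2 != 0) ? EPERM : 0);
}
```

As long as only policies without the hook are registered, priv_check() never walks the list; once a policy such as one using deny_odd() is registered, every call takes the slow path.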
Mateusz Guzik
b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00
Mateusz Guzik
deb2e577a2 mac: use a sleepable rmlock instead of an sx lock
If any non-static modules are loaded (and mac_ntpd tends to be), the lock is
taken all the time all over the kernel. On platforms like arm64 this results in
an avoidable significant performance degradation. Since write-locking is almost
never needed, use a primitive optimized towards read-locking.

Sample result of building the kernel on tmpfs 11 times:
stock           11142.80s user 6704.44s system 4924% cpu 6:02.42 total
patched         11118.95s user 2374.94s system 4547% cpu 4:56.71 total
2019-12-27 11:23:32 +00:00
Doug Moore
83704cc236 Instead of looking up a predecessor or successor to the current map
entry, when that entry has been seen already, keep the
already-looked-up value in a variable and use that instead of looking
it up again.

Approved by: alc, markj (earlier version), kib (earlier version)
Differential Revision: https://reviews.freebsd.org/D22348
2019-11-20 16:06:48 +00:00
Doug Moore
7cdcf86360 Define wrapper functions vm_map_entry_{succ,pred} to act as wrappers
around entry->{next,prev} when those are used for ordered list
traversal, and use those wrapper functions everywhere. Where the next
field is used for maintaining a stack of deferred operations, #define
defer_next to make that different usage clearer, and then use the
'right' pointer instead of 'next' for that purpose.

Approved by: markj
Tested by: pho (as part of a larger patch)
Differential Revision: https://reviews.freebsd.org/D22347
2019-11-13 15:56:07 +00:00
Doug Moore
2288078c5e Define macro VM_MAP_ENTRY_FOREACH for enumerating the entries in a vm_map.
In case the implementation ever changes from using a chain of next pointers,
then changing the macro definition will be necessary, but changing all the
files that iterate over vm_map entries will not.

Drop a counter in vm_object.c that would have an effect only if the
vm_map entry count was wrong.

Discussed with: alc
Reviewed by: markj
Tested by: pho (earlier version)
Differential Revision:	https://reviews.freebsd.org/D21882
2019-10-08 07:14:21 +00:00
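
The idea behind an iteration macro like the one introduced above can be sketched with a simplified circular list. The types and the macro name MAP_ENTRY_FOREACH below are hypothetical stand-ins, assuming a sentinel header entry whose 'next' pointer starts the chain; they are not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified map entry and map with a sentinel header. */
struct entry {
	int start;
	struct entry *next;
};

struct map {
	struct entry header;	/* sentinel; the list is circular */
};

/*
 * Iteration macro: callers never touch 'next' directly, so a change of
 * representation only requires changing this definition.
 */
#define	MAP_ENTRY_FOREACH(it, map)					\
	for ((it) = (map)->header.next; (it) != &(map)->header;	\
	    (it) = (it)->next)

/* Example consumer: count the entries of a map. */
int
map_count(struct map *m)
{
	struct entry *e;
	int n = 0;

	MAP_ENTRY_FOREACH(e, m)
		n++;
	return (n);
}
```

Code such as map_count() stays unchanged if the traversal later stops following a chain of next pointers; only the macro body has to be updated.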
Doug Moore
83ea714f4f vm_map_simplify_entry considers merging an entry with its two
neighbors, and is used in a way so that if entries a and b cannot be
merged, we consider them twice, first not-merging a with its successor
b, and then not-merging b with its predecessor a. This change replaces
vm_map_simplify_entry with vm_map_try_merge_entries, which compares
two adjacent entries only, and uses it to avoid duplicated
merge-checks.

Tested by: pho
Reviewed by: alc
Approved by: markj (implicit)
Differential Revision: https://reviews.freebsd.org/D20814
2019-08-25 07:06:51 +00:00