7970 commits

Author SHA1 Message Date
Franco Fichtner
5dc500ba7a route: protect against unattached AF deep down #207
For pppoe/ng interfaces we sometimes enter ip6_tryforward() with
a NULL pointer array, and IN6_LINKMTU() glosses over the fact
that this is not a valid destination because the if_afdata structure
is not initialized.

While here remove the RT_LINK_IS_UP macro since nothing outside
of nhop is using it.

This is probably a side effect generator, but fixing one spot
instead of the general case would leave other holes in the stack.
Do not return a route destination if the address families were not
yet attached.
2025-03-03 10:35:30 +01:00
Zhenlei Huang
accbbd1a64 carp: Fix checking IPv4 multicast address
An IPv4 address stored in `struct in_addr` is in network byte order but
`IN_MULTICAST` wants host order.
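
The byte-order mismatch described above can be reproduced in userland. The following is a minimal sketch, not the code from the commit: the helper names `in_addr_is_multicast` and `text_is_multicast` are hypothetical, and only the standard `IN_MULTICAST()`, `ntohl()` and `inet_pton()` interfaces are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>  /* AF_INET */
#include <netinet/in.h>  /* struct in_addr, IN_MULTICAST(), ntohl() */
#include <arpa/inet.h>   /* inet_pton() */

/*
 * Hypothetical helper (not from the commit): report whether an IPv4
 * address stored in network byte order is a multicast address.
 * IN_MULTICAST() expects a host-order value, so convert first.
 */
static bool
in_addr_is_multicast(struct in_addr addr)
{
	return (IN_MULTICAST(ntohl(addr.s_addr)));
}

/* Hypothetical helper: parse dotted-quad text and classify it. */
static bool
text_is_multicast(const char *text)
{
	struct in_addr addr;

	assert(text != NULL);
	/* inet_pton() stores the address in network byte order. */
	if (inet_pton(AF_INET, text, &addr) != 1)
		return (false);
	return (in_addr_is_multicast(addr));
}
```

For example, `text_is_multicast("224.0.0.18")` is true and `text_is_multicast("192.0.2.1")` is false. On a little-endian host, passing `addr.s_addr` to `IN_MULTICAST()` without `ntohl()` would test the wrong byte.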

PR:		284872
Reported by:	Steven Perreau
Reported by:	Brett Merrick <brett.merrick@itcollective.nz>
Reviewed by:	Franco Fichtner <franco@opnsense.org>, ae, kp, glebius
Tested by:	Steven Perreau
Fixes:		137818006d carp: support unicast
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D49053

(cherry picked from commit 1776633438f24df09cb9815650891bcef0152874)
2025-02-25 09:11:42 +01:00
Kristof Provost
eb2415e79d pfil: set PFIL_FWD for IPv4 forwarding
Just like we already do for IPv6 set the PFIL_FWD flag when we're forwarding
IPv4 traffic. This allows firewalls to make more precise decisions.

Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D48824
2025-02-05 11:21:18 +01:00
Kristof Provost
ebed92a975 netinet: enter epoch in garp_rexmit()
garp_rexmit() is a callback, so is not in net_epoch, which
arprequest_internal() expects.
Enter and exit the net_epoch.

PR:		284073
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 38fdcca05d09b4d5426a253d3c484f9481a73ac2)
2025-01-29 08:12:03 +01:00
Mark Johnston
c9a9bef302 ip: Defer checks for an unspecified dstaddr until after pfil hooks
To comply with LINCE certification, it's necessary to ensure that
packets to 0.0.0.0/::0 are dropped and logged by the firewall.  Such
packets are dropped by ip_input() and ip6_input() before reaching pfil
hooks; reorder the checks to give firewalls a chance to drop the packets
themselves, as this gives better observability.

Note that ip_forward() and ip6_forward() ensure that such packets are
not forwarded; they are passed back unmodified.
2025-01-08 08:34:07 +01:00
Kristof Provost
b18d147f48 pfil: PFIL_PASS never frees the mbuf
pfil hooks (i.e. firewalls) may pass, modify or free the mbuf passed
to them. (E.g. when rejecting a packet, or when gathering up packets
for reassembly).

If the hook returns PFIL_PASS the mbuf must still be present. Assert
this in pfil_mem_common() and ensure that ipfilter follows this
convention. pf and ipfw already did.
Similarly, if the hook returns PFIL_DROPPED or PFIL_CONSUMED the mbuf
must have been freed (or now be owned by the firewall for further
processing, like packet scheduling or reassembly).

This allows us to remove a few extraneous NULL checks.
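
The ownership rule can be illustrated with a small userland model. This is only a sketch under stated assumptions: every `toy_` type, value and function below is hypothetical and stands in for the kernel's mbuf and pfil definitions, which differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; the real kernel types and values differ. */
struct toy_mbuf { int len; };
enum toy_pfil_result { TOY_PFIL_PASS, TOY_PFIL_DROPPED, TOY_PFIL_CONSUMED };

typedef enum toy_pfil_result (*toy_hook_t)(struct toy_mbuf **mp);

/* Hook that lets every packet through untouched. */
static enum toy_pfil_result
toy_hook_pass(struct toy_mbuf **mp)
{
	(void)mp;
	return (TOY_PFIL_PASS);
}

/* Hook that rejects every packet: it frees the buffer itself. */
static enum toy_pfil_result
toy_hook_drop(struct toy_mbuf **mp)
{
	free(*mp);
	*mp = NULL;
	return (TOY_PFIL_DROPPED);
}

/*
 * Run one hook and enforce the rule from the commit message:
 * PASS means the buffer must still be present; any other result
 * means the caller no longer owns it.
 */
static enum toy_pfil_result
toy_run_hook(toy_hook_t hook, struct toy_mbuf **mp)
{
	enum toy_pfil_result rv = hook(mp);

	if (rv == TOY_PFIL_PASS)
		assert(*mp != NULL);
	else
		*mp = NULL;	/* freed, or owned by the firewall now */
	return (rv);
}
```

Because the invariant is asserted in one place, a caller of `toy_run_hook()` can use the buffer after `TOY_PFIL_PASS` without a further NULL check.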

Suggested by:	tuexen
Reviewed by:	tuexen, zlei
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D43617
2024-12-11 13:34:57 +01:00
Xavier Beaudouin
eac0f46922 Add UDP encapsulation of ESP in IPv6
This patch provides UDP encapsulation of ESP packets over IPv6.
Ports the IPv4 code to IPv6 and adds support for IPv6 in udpencap.c
As required by the RFC and unlike in IPv4 encapsulation,
UDP checksums are calculated.

Co-authored-by:	Aurelien Cazuc <aurelien.cazuc.external@stormshield.eu>
Sponsored-by:	Stormshield
Sponsored-by:	Wiktel
Sponsored-by:	Klara, Inc.

Fix KASSERT in 80044c78 causing build failures

Move the KASSERT to where struct ip6_hdr is populated

Fixes:		80044c785cb040a2cf73779d23f9e1e81a00c6c3
Reported-by:	bapt
Reviewed-by:	markj
Sponsored-by:	Klara, Inc.
2024-12-11 13:34:57 +01:00
Franco Fichtner
f935a066bc pf|ipfw|netinet6?: shared IP forwarding
This removes the if_output calls in the pf(4) code that escape further
processing by deferring the forwarding execution to the network stack
using on/off style sysctls for both IPv4 and IPv6.

Also see: https://reviews.freebsd.org/D8877
2024-12-11 13:34:55 +01:00
Stephan de Wit
5e72057985 rss: add sysctl enable toggle
This commit also includes the original refactoring changes

This change allows the kernel to operate with the default netisr cpu-affinity settings while having RSS compiled in. Normally, RSS changes quite a bit of the behaviour of the kernel dispatch service - this change allows for reducing impact on incompatible hardware while preserving the option to boost throughput speeds based on packet flow CPU affinity.

Make sure to compile the following options in the kernel:

    options  RSS

As well as setting the following sysctls:

    net.inet.rss.enabled: 1
    net.isr.bindthreads: 1
    net.isr.maxthreads: -1 (automatically sets it to the number of CPUs)

And optionally (to force a 1:1 mapping between CPUs and buckets):

    net.inet.rss.bits: 3 (for 8 CPUs)
    net.inet.rss.bits: 2 (for 4 CPUs)

etc.
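
The two examples above (3 bits for 8 CPUs, 2 bits for 4 CPUs) suggest that the bucket count is 2^bits. Assuming that relationship holds, the value can be computed as sketched below. The helper `rss_bits_for_cpus` is hypothetical, and rounding up for CPU counts that are not a power of two is an assumption, not something the commit states.

```c
#include <assert.h>

/*
 * Hypothetical helper: smallest number of bits b with (1 << b) >= ncpus,
 * i.e. a candidate net.inet.rss.bits value that yields at least one
 * bucket per CPU.  Assumes the bucket count is 2^bits.
 */
static unsigned int
rss_bits_for_cpus(unsigned int ncpus)
{
	unsigned int bits = 0;

	assert(ncpus > 0);
	/* Stop at 31 so the shift stays within the range of unsigned int. */
	while (bits < 31 && (1u << bits) < ncpus)
		bits++;
	return (bits);
}
```

With this sketch, `rss_bits_for_cpus(8)` is 3 and `rss_bits_for_cpus(4)` is 2, matching the values listed above.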

Set pin_default_swi to 0 by default in the RSS case.
2024-12-11 11:10:51 +01:00
Franco Fichtner
8dcdc32e49 dummynet: passin after dispatch
Based on a patch originally found in m0n0wall, expanded
to IPv6 and aligned with FreeBSD's IP input path.

The limit may not be correctly accounted for on the WAN
interface due to dummynet counting the packet again even
though it was already processed.

The problem here is that there's no proper way to reinject
the packet at the point where it was previously removed
from so we make the assumption that ip input was already
done (including pfil) and more or less directly move to
packet output processing.

While here move the passin label up to take the extra check
but avoiding a second label.  Also remove the spurious tag
read for forward check since we don't use it and we should
really trust the mbuf flag.
2024-12-11 11:10:50 +01:00
Michael Tuexen
35874d28c8 sctp: fix debug message
(cherry picked from commit 518a1163d0aa73b26da1dd1a4bb186042ea3c66e)
(cherry picked from commit 0e8faabc270f89fbc54bbc118b2ebe2a38364375)

Approved by:	re (cperviva)
2024-11-07 01:03:09 +01:00
Michael Tuexen
a0bc4ec08b sctp: improve handling of address changes
Identify interfaces consistently by the pair of the ifn pointer
and the index.
This avoids a use after free when the ifn and/or index was reused.

Reported by:	bz, pho, and others

(cherry picked from commit 523913c94371ab50a8129cbab820394d25f7a269)
(cherry picked from commit 331db93815afb49b01f269aeff0fe899acd47455)

Approved by:	re (cperviva)
2024-11-07 01:02:52 +01:00
Michael Tuexen
2a6bd6e37b sctp: garbage collect two unused functions
(cherry picked from commit 470a63cde4285ea4a317b0bba966514c11f4ed5b)
(cherry picked from commit e3f26ce52b71d4005e666ced22c0855dbc70b28e)

Approved by:	re (cperviva)
2024-11-07 01:02:33 +01:00
Michael Tuexen
bb6af83fe4 sctp: don't consider the interface name when removing an address
Checking the interface name can not be done consistently, so
don't do it.

(cherry picked from commit bf11fdaf0d095fecca61fa8b457d06e27fae5946)
(cherry picked from commit 66628552a38751ed5c395858d1754660557674cd)

Approved by:	re (cperviva)
2024-11-07 01:02:12 +01:00
Michael Tuexen
33197f22b5 sctp: editorial cleanup
Improve consistency, no functional change intended.

(cherry picked from commit d839cf2fbb47c52d5153fb366c51bd6f6a3dd0fd)
(cherry picked from commit 107704217b)

Approved by: 	re (cperviva)
2024-11-07 01:01:23 +01:00
Michael Tuexen
d27f63fa8c sctp: another cleanup
No functional change intended.

(cherry picked from commit d08713dcdb158b2f55a885e7cfbbe410272c55a2)
2024-10-31 12:44:02 +01:00
Michael Tuexen
abbfa0cb48 sctp: cleanup the addition of addresses which are already known
No functional change intended.

(cherry picked from commit a05620b0f67fe526350bf386882262ca8005533f)
2024-10-31 12:43:29 +01:00
Michael Tuexen
676b45d04b sctp: further cleanup
(cherry picked from commit 02478e65910ab1ef53511ebb2271cdcf0e9a14cf)
2024-10-31 12:42:52 +01:00
Michael Tuexen
129057d5fa sctp garbage collect sctp_update_ifn_mtu
(cherry picked from commit ce5b5361d4d1b3868631baa6870ba6e1e6ec8330)
2024-10-31 12:42:13 +01:00
Michael Tuexen
18a20a4306 sctp: cleanup
No functional change intended.

(cherry picked from commit e4ac0183a1a846ef6556c9876dab76c06f5fea9c)
2024-10-31 12:41:41 +01:00
Michael Tuexen
8689398f09 sctp: improve debug output
(cherry picked from commit ce20b48a60fbae275085237dd48075d426f00d37)
2024-10-31 12:41:06 +01:00
Michael Tuexen
bbb73d8941 sctp: check locking requirements
Actually assert the locking instead of describing it in a comment.
No functional change intended.

(cherry picked from commit 4466a97e83fd9484cb22dd2867b6972f6b185e8b)
2024-10-31 12:40:21 +01:00
Michael Tuexen
ebdee305b1 sctp: make sctp_free_ifn() static
It is not used outside of the file.
No functional change intended.

(cherry picked from commit e1a09d1e9df30347c279604191a04ce2ef20bf0c)
2024-10-31 12:40:03 +01:00
Michael Tuexen
efcaa63aca sctp: cleanup sctp_delete_ifn
The address lock is always held, so no need for the second
parameter.
No functional change intended.

(cherry picked from commit 2e9761eb80f3e58c116efc10c739ed0d8497c1d6)
2024-10-31 12:38:39 +01:00
Michael Tuexen
b785f83e98 tcp: small cleanup
No functional change intended.

Reviewed by:		cc, glebius, markj, rscheff
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46850

(cherry picked from commit 2eacb0841c7dfc92030abc433e53cd31383a0648)
2024-10-31 12:37:05 +01:00
Michael Tuexen
67e4692998 tcp: improve mbuf handling when processing SYN segments
When the sysctl-variable net.inet.ip.accept_sourceroute is non-zero,
an mbuf would be leaked when processing a SYN-segment containing an
IPv4 strict or loose source routing option, when the on-stack
syncache entry is used or there is an error related to processing
TCP MD5 options.
Fix this by freeing the mbuf whenever an error occurred or the
on-stack syncache entry is used.

Reviewed by:		markj, rscheff
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46839

(cherry picked from commit 01eb635d12953e24ee5fae69692c28e4aab4f0f6)
2024-10-31 12:36:12 +01:00
Michael Tuexen
9a3bb25bab tcp: whitespace cleanup
No functional change intended.

Reported by:	markj
Sponsored by:	Netflix, Inc.

(cherry picked from commit a2e4f45480c248036b002904ddbceef20ba7c523)
2024-10-31 12:35:24 +01:00
Michael Tuexen
00c3c39fcc tcp: improve ref count handling when processing SYN
Don't leak a reference count for so->so_cred when processing an
incoming SYN segment with an on-stack syncache entry and the
sysctl variable net.inet.tcp.syncache.see_other is false.

Reviewed by:		cc, markj, rscheff
Sponsored by:		Netflix, Inc.
Pull Request:		https://reviews.freebsd.org/D46793

(cherry picked from commit cbc9438f0505bd971e9eba635afdae38a267d76e)
2024-10-31 12:34:33 +01:00
Michael Tuexen
2f5ac48d9b tcp: improve MAC error handling for SYN segments
Don't leak a maclabel when SYN segments are processed which results
in an error due to MD5 signature handling.
Tweak the #ifdef MAC to allow additional upcoming changes.

Reviewed by:		markj
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46766

(cherry picked from commit 78e1b031d2e8ef0e1cbc8874891f5476dc7868bc)
2024-10-31 12:33:35 +01:00
Michael Tuexen
8df12a277f tcp: make tcp_lro_flush() static
tcp_lro_flush() is not used anymore outside of tcp_lro.c. Therefore
make it static.

Reviewed by:		rscheff, glebius, Peter Lei
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46435

(cherry picked from commit e06cf0fc5dd626c34acdef308b696b4995371a4b)
2024-10-31 12:20:35 +01:00
Michael Tuexen
003f1ebcbc tcp: improve consistency of syncache_respond() failure handling
When the initial sending of the SYN ACK segment using
syncache_respond() fails, it is handled as a permanent error.
To improve consistency, apply this policy in all cases, where
syncache_respond() is called. These include
* timer based retransmissions of the SYN ACK
* retransmitting a SYN ACK in response to a SYN retransmission
* sending of challenge ACKs in response to received RST segments
In these cases, fall back to SYN cookies, if enabled.
While there, also improve consistency of the TCP stats counters.

Reviewed by:		cc, glebius (earlier version)
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46428

(cherry picked from commit ef438f7706be48f1cf7fd4c8a60329e1619cfe30)
2024-10-31 12:17:53 +01:00
Michael Tuexen
2e45166856 tcp rack, bbr: improve handling of soft errors
Do not report an error, if it is stored as a soft error. This avoids,
for example, the dropping of TCP connections using an interface,
while enabling or disabling LRO on that interface.

Reviewed by:		cc
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46427

(cherry picked from commit b2044c4557443bbce974101f04e2b465d1bbe769)
2024-10-31 12:16:29 +01:00
Zhenlei Huang
1821145f28 tcp cc: Remove a stray semicolon
MFC after:	1 week

(cherry picked from commit 2f395cfda8b5c1dc267e9cd4d99d7d0862fb4fca)
2024-10-31 12:40:18 +08:00
Richard Scheffenegger
6b2977c597 tcp: fix duplicate retransmissions when RTO happens during SACK loss recovery
When snd_nxt doesn't track snd_max, partial SACK ACKs may elicit
unexpected duplicate retransmissions. This is usually masked by
LRO not necessarily ACKing every individual segment, and prior
to RFC6675 SACK loss recovery, harder to trigger even when an
RTO happens while SACK loss recovery is ongoing.

Address this by improving the logic when to start a SACK loss recovery
and how to deal with a RTO, as well as improvements to the adjusted
congestion window during transmission selection.

Reviewed By:	tuexen, cc, #transport
Sponsored by:	NetApp, Inc.
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D43355

(cherry picked from commit 440f4ba18e3ab7be912858bbcb96a419fcf14809)
2024-10-18 09:51:38 +02:00
Ed Maste
ae3d7e27ab sctp: propagate cap rights on sctp_peeloff
PR:		201052
Reviewed by:	oshogbo, tuexen
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D46884

(cherry picked from commit 91a9e4e01dab7a740b8e3b7c39c59a537e71e5d2)
2024-10-17 12:29:21 -04:00
Richard Scheffenegger
8fee873d78 tcp: keep syncache flags when updating ECN info
While processing the ECN flags of an incoming packet, the code
incorrectly cleared all other syncache flags.

Reported by: tuexen
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D46694

(cherry picked from commit 0a05ea1f56e65ec0477d56daf5ed623087464082)
2024-09-22 18:24:36 +02:00
Mark Johnston
59f3eb3b71 netinet: Explicitly disallow connections to the unspecified address
If the V_connect_ifaddr_wild sysctl says that we shouldn't infer a
destination address, return an error.  Otherwise it's possible for use
of an unspecified foreign address to trigger a subsequent assertion
failure, for example in in_pcblookup_hash_locked().

Similarly, if no interface addresses are assigned, fail quickly upon an
attempt to connect to the unspecified address.

Reported by:	Shawn Webb <shawn.webb@hardenedbsd.org>
MFC after:	2 weeks
Reviewed by:	zlei, allanjude, emaste
Differential Revision:	https://reviews.freebsd.org/D46454

(cherry picked from commit 0c605af3f9d9e66be6af0a3bbc36dbedc5dfe516)
2024-09-20 11:39:16 +00:00
Kristof Provost
4acf9ba16d netinet: fix LINT-NOINET build failure
Sponsored by:	Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 3b62f3350017ab6722ebe8e4fccd9ba76acbb214)
2024-09-07 01:52:31 +01:00
Mark Johnston
8ae58e0edb netinet: Add a sysctl to allow disabling connections to INADDR_ANY
See the discussion in Bugzilla PR 280705 for context.

PR:		280705
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D46259

(cherry picked from commit 417b35a97b7669eb0bf417b43e97cccbedbce6f9)
2024-09-03 14:54:42 +00:00
Michael Tuexen
2dcd21ddab tcp: fix format of sysctl variable
The format for CTLTYPE_UINT is "IU" instead of "UI" as specified
in sysctl.9.

Reviewed by:		cc, zlei
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46408

(cherry picked from commit 498286d4e807d6b9e4caad22b96ebca7f16e9b18)
2024-08-30 08:32:11 +02:00
Michael Tuexen
bb31f15d17 sctp: fix format of sysctl variables
(cherry picked from commit a1d9ce19b13f220c5738e6aa58cf0c3750a05526)
2024-08-30 08:31:35 +02:00
Michael Tuexen
6c0fb6c5ac tcp: improve consistency of SYN-cache handling
Originally, a SYN-cache entry was always allocated and later freed,
when not needed anymore. Then the allocation was avoided, when no
SYN-cache entry was needed, and a copy on the stack was used.
But the logic regarding freeing was not updated.
This patch doesn't re-check conditions (which may have changed) when
deciding to insert or free the entry, but uses the result of
the earlier check.
This simplifies the code and also improves consistency.

Reviewed by:		glebius
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46410

(cherry picked from commit e41364711ca3f7e214f9607ebedf62e03e51633d)
2024-08-30 08:30:54 +02:00
Michael Tuexen
e0bcb3aa4f tcp: initialize the LRO hash table with correct size
There will be at most lro_entries entries in the LRO hash table. So no
need to take lro_mbufs into account, which only results in the
LRO hash table being too large and therefore wasting memory.

Reviewed by:		rrs
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46378

(cherry picked from commit aa6c490bf80fcef15cfc0d3f562fae19ef2375aa)
2024-08-30 08:29:50 +02:00
Michael Tuexen
118ab70d57 tcp: fix list iteration in tcp_lro_flush_active()
Use LIST_FOREACH_SAFE(), since the list element is removed from
the list in the loop body, zeroed out and inserted in the free list.
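
The reason a "safe" iteration is needed can be shown with a hand-written list. This is a sketch, not the code from tcp_lro.c: `struct entry`, `list_push` and `flush_active` are hypothetical, and the loop writes out by hand the step that the `*_FOREACH_SAFE` macros perform, namely saving the next pointer before the body runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry type; real LRO entries carry far more state. */
struct entry {
	int id;
	struct entry *next;
};

/* Push an entry onto the front of a singly linked list. */
static void
list_push(struct entry **head, struct entry *e)
{
	e->next = *head;
	*head = e;
}

/*
 * Move every entry from 'active' to 'free_list' and return the count.
 * The successor is saved in 'tmp' before the body touches the entry,
 * because list_push() overwrites e->next.
 */
static int
flush_active(struct entry **active, struct entry **free_list)
{
	struct entry *e, *tmp;
	int moved = 0;

	for (e = *active; e != NULL; e = tmp) {
		tmp = e->next;			/* save before it is clobbered */
		e->id = 0;			/* zero out the entry */
		list_push(free_list, e);	/* rewrites e->next */
		moved++;
	}
	*active = NULL;
	return (moved);
}
```

A loop that read `e->next` only after `list_push()` would follow the free list instead of the active list, so in this sketch it would stop after moving a single entry.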

Reviewed by:		rrs
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46383

(cherry picked from commit 64443828bbe7c571db8d8731758ec8c4b8364c86)
2024-08-30 08:29:03 +02:00
Kristof Provost
88e1bc0669 mcast: fix leaked igmp packets on multicast cleanup
When we release a multicast address (e.g. on interface shutdown) we may
still have packets queued in inm_scq. We have to free those, or we'll
leak memory.

Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D43033

(cherry picked from commit c2e340452c147b551180f2a1600ae76491342b0e)
2024-08-26 09:46:21 -06:00
Kristof Provost
31ad232d85 Revert "mcast: fix memory leak in imf_purge()"
This reverts commit fa03d37432caf17d56a931a9e6f5d9b06f102c5b.

This commit caused us to not send IGMP leave messages if the inpcb went
away. In other words: we freed pending packets whenever the socket
closed rather than when the interface (or address) goes away.

Reviewed by:	glebius
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D43032

(cherry picked from commit c196e43243b83840cc9f3d1dadc7dacb3b0f556f)
2024-08-26 09:45:58 -06:00
Eugene Grosbein
2441180265 libalias: fix subtle racy problem in outside-inside forwarding
sys/netinet/libalias/alias_db.c has internal static function UseLink()
that passes a link to CleanupLink() to verify if the link has expired.
If so, UseLink() may return NULL.

_FindLinkIn()'s usage of UseLink() is not quite correct.

Assume there is "redirect_port udp" configured to forward incoming
traffic for specific port to some internal address.
Such a rule creates partially specified permanent link.

After the first such incoming packet, libalias creates a new fully specified
temporary LINK_UDP with a default timeout of 60 seconds.
Also, in case of low traffic libalias may assign "timestamp"
for this new temporary link way in the past because
LibAliasTime is updated seldom and can keep old value
for tens of seconds, and it will be used for the temporary link.

It may happen that next incoming packet for redirected port
passed to _FindLinkIn() results in a call to UseLink()
that returns NULL due to detected expiration.
Immediate return of NULL results in broken translation:
either a packet is dropped (deny_incoming mode) or delivered to
original destination address instead of internal one.

Fix it with additional check for NULL to proceed with a search
for original partially specified link. In case of UDP,
it also recreates temporary fully specified link
with a call to ReLink().

Practical examples are "redirect_port udp" rules for unidirectional
SYSLOG protocol (port 514) or some low volume VPN encapsulated in UDP.

Thanks to Peter Much for initial analysis and first version of a patch.

Reported by:	Peter Much <pmc@citylink.dinoex.sub.org>
PR:		269770

(cherry picked from commit 8132e959099f0c533f698d8fbc17386f9144432f)
(cherry picked from commit e5b85380836378c9e321a4e6d300591e6faf622a)
2024-08-25 13:31:24 +07:00
Michael Tuexen
f18b9b2b95 ddb: update printing of t_flags and tflags2
Update the ddb printing of t_flags and t_flags2 to the current state of
definitions in tcp_var.h.

Reviewed by:		cc
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46222

(cherry picked from commit 093d9b46f4720392e53c171eaabfd7a6a8101170)
2024-08-13 19:05:06 +02:00
Michael Tuexen
b97b3dead5 Revert "ddb: update printing of t_flags and tflags2"
This reverts commit d3c1df53f5.
2024-08-13 19:02:20 +02:00
Michael Tuexen
d3c1df53f5 ddb: update printing of t_flags and tflags2
Update the ddb printing of t_flags and t_flags2 to the current state of
definitions in tcp_var.h.

Reviewed by:		cc
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D46222

(cherry picked from commit 093d9b46f4720392e53c171eaabfd7a6a8101170)
2024-08-13 15:54:17 +02:00