opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-06-03 22:02:58 -04:00

Author	SHA1	Message	Date
Franco Fichtner	9cb6d71f6a	netinet6: routed label was misplaced, checking non-shared-forward case which caused nh to be NULL just below the label.	2024-07-20 21:30:32 +02:00
Kristof Provost	f257b8d7e1	pfil: PFIL_PASS never frees the mbuf pfil hooks (i.e. firewalls) may pass, modify or free the mbuf passed to them. (E.g. when rejecting a packet, or when gathering up packets for reassembly). If the hook returns PFIL_PASS the mbuf must still be present. Assert this in pfil_mem_common() and ensure that ipfilter follows this convention. pf and ipfw already did. Similarly, if the hook returns PFIL_DROPPED or PFIL_CONSUMED the mbuf must have been freed (or now be owned by the firewall for further processing, like packet scheduling or reassembly). This allows us to remove a few extraneous NULL checks. Suggested by: tuexen Reviewed by: tuexen, zlei Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D43617	2024-07-18 13:13:55 +02:00
Xavier Beaudouin	18b8a9d5d3	Add UDP encapsulation of ESP in IPv6 This patch provides UDP encapsulation of ESP packets over IPv6. Ports the IPv4 code to IPv6 and adds support for IPv6 in udpencap.c As required by the RFC and unlike in IPv4 encapsulation, UDP checksums are calculated. Co-authored-by: Aurelien Cazuc <aurelien.cazuc.external@stormshield.eu> Sponsored-by: Stormshield Sponsored-by: Wiktel Sponsored-by: Klara, Inc. Fix KASSERT in 80044c78 causing build failures Move the KASSERT to where struct ip6_hdr is populated Fixes: 80044c785cb040a2cf73779d23f9e1e81a00c6c3 Reported-by: bapt Reviewed-by: markj Sponsored-by: Klara, Inc.	2024-06-10 14:33:01 +02:00
Stephan de Wit	31ce49c7ab	rss: add sysctl enable toggle This commit also includes the original refactoring changes This change allows the kernel to operate with the default netisr cpu-affinity settings while having RSS compiled in. Normally, RSS changes quite a bit of the behaviour of the kernel dispatch service - this change allows for reducing impact on incompatible hardware while preserving the option to boost throughput speeds based on packet flow CPU affinity. Make sure to compile the following options in the kernel: options RSS As well as setting the following sysctls: net.inet.rss.enabled: 1 net.isr.bindthreads: 1 net.isr.maxthreads: -1 (automatically sets it to the number of CPUs) And optionally (to force a 1:1 mapping between CPUs and buckets): net.inet.rss.bits: 3 (for 8 CPUs) net.inet.rss.bits: 2 (for 4 CPUs) etc. Set pin_default_swi to 0 by default in the RSS case.	2024-06-03 11:06:55 +02:00
Franco Fichtner	d8394bcdaa	pf\|ipfw\|netinet6?: shared IP forwarding This removes the if_output calls in the pf(4) code that escape further processing by deferring the forwarding execution to the network stack using on/off style sysctls for both IPv4 and IPv6. Also see: https://reviews.freebsd.org/D8877	2024-06-03 11:06:55 +02:00
Franco Fichtner	38104a2f6e	dummynet: passin after dispatch Based on a patch originally found in m0n0wall, expanded to IPv6 and aligned with FreeBSD's IP input path. The limit may not be correctly accounted for on the WAN interface due to dummynet counting the packet again even though it was already processed. The problem here is that there's no proper way to reinject the packet at the point where it was previously removed from so we make the assumption that ip input was already done (including pfil) and more or less directly move to packet output processing. While here move the passin label up to take the extra check but avoiding a second label. Also remove the spurious tag read for forward check since we don't use it and we should really trust the mbuf flag.	2024-06-03 11:06:53 +02:00
Lexi Winter	6df9fa1c6b	sys/netinet6/in6_pcb.c: fix compile without INET in6_mapped_sockaddr() and in6_mapped_peeraddr() both define a local variable named 'inp', but in the non-INET case, this variable is set and never used, causing a compiler error: /src/freebsd/src/lf/sys/netinet6/in6_pcb.c:547:16: error: variable 'inp' set but not used [-Werror,-Wunused-but-set-variable] 547 \| struct inpcb inp; \| ^ /src/freebsd/src/lf/sys/netinet6/in6_pcb.c:573:16: error: variable 'inp' set but not used [-Werror,-Wunused-but-set-variable] 573 \| struct inpcb inp; Fix this by guarding all the INET-specific logic, including the variable definition, behind #ifdef INET. While here, tweak formatting in in6_mapped_peeraddr() so both functions are the same. Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/1155 (cherry picked from commit 042fb58d009e7efc5b334b68fffbef9b1f620ec8) (cherry picked from commit `f30c2d86c3`) Approved-by: re (cperciva)	2024-05-21 14:08:41 -06:00
Mike Karels	096a438138	in6.h: expose s6_addr* definitions to user level The only element of in6_addr that is specified in RFC 3493 or in POSIX.1-2017 is s6_addr, implemented via a #define to a union member. However, FreeBSD and other BSD systems have additional definitions for the other union members, s6_addr{8,16,32} which are defined for the kernel and loader. Some Linux applications also use them, and they seem to be allowed by the RFC and POSIX. Remove the current ifdefs, exposing the additional fields to user level, and replace with #if __BSD_VISIBLE. Add an explanatory comment expanding on the previous "nonstandard" comment. Reviewed by: bz Differential Revision: https://reviews.freebsd.org/D44979 Approved by: re (cperciva) (cherry picked from commit eb3dbf2dbe22ed6d4df54aebbf23f5b555a21cf1) (cherry picked from commit `a5a2e963f9`)	2024-05-14 15:01:43 -05:00
Gleb Smirnoff	d6e1ae659b	carp: check CARP status in in_localip_fib(), in6_localip_fib() Don't report a BACKUP CARP address as local. These two functions are used only by source address validation for input packets, controlled by sysctls net.inet.ip.source_address_validation and net.inet6.ip6.source_address_validation. For this purpose we definitely want to treat BACKUP addresses as non local. This change is conservative and doesn't modify compat in_localip() and in6_localip(). They are used more widely than the FIB-aware versions. The change would modify the notion of ipfw(4) 'me' keyword. There might be other consequences as in_localip() is used by various tunneling protocols. PR: 277349 (cherry picked from commit 56f7860087eec14b4a65310b70bd704e79e1b48c)	2024-03-28 12:35:45 -07:00
Mark Johnston	93f523ab36	netinet: Remove stale references to Giant from comments MFC after: 1 week (cherry picked from commit bbf86c65d04d6013fd3f7b6d74a341256c4e7336)	2024-02-03 14:10:36 -05:00
Gordon Bergling	a8dc27290f	netinet6: Fix two typos in source code comments - s/adddress/address/ (cherry picked from commit 496432f192165b8700da4b0ab8ebdd253002e265)	2024-01-25 07:46:35 +01:00
John Baldwin	9c50c9b776	sys: Use mbufq_empty instead of comparing mbufq_len against 0 Reviewed by: bz, emaste Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D43338 (cherry picked from commit 8cb9b68f5821e45c63ee08d8ee3029ca523ac174)	2024-01-18 14:37:29 -08:00
Mark Johnston	e4db787bb8	frag6: Add another use of frag6_rmqueue() No functional change intended. Reviewed by: kp, bz MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D43256 (cherry picked from commit 8d01ecd8e9da5192a8b2dfb6c7d58b4aae9ea358)	2024-01-11 09:22:14 -05:00
Michael Tuexen	a4925f0f8c	udplite: make socketoption available on IPv6 sockets This patch allows the IPPROTO_UDPLITE-level socket options UDPLITE_SEND_CSCOV and UDPLITE_RECV_CSCOV to be used on AF_INET6 sockets in addition to AF_INET sockets. Reviewed by: ae, rscheff MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D42430 (cherry picked from commit 03c3a70abe5e9fa259b954de78ae69229fa9c99f)	2024-01-10 20:22:52 -05:00
Gleb Smirnoff	2bfe735277	inpcb: reorder inpcb destruction First, merge in_pcbdetach() with in_pcbfree(). The comment for in_pcbdetach() was no longer correct. Then, make sure we remove the inpcb from the hash before we commit any destructive actions on it. There are a couple of functions that rely on the hash lock skipping SMR + inpcb lock to lookup an inpcb. Although there are no known functions that similarly rely on the global inpcb list lock, also do list removal before destructive actions. PR: 273890 Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D43122 (cherry picked from commit a13039e2709277b1c3b159e694cc909a5e044151)	2024-01-08 16:29:38 -08:00
Mark Johnston	e4ca8864c2	frag6: Reduce code duplication The code which removes a fragment queue from the per-VNET hash table was duplicated three times. Factor it out into a function. No functional change intended. Reviewed by: kp, bz MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D43228 (cherry picked from commit 0736a38072b52204289c669770a34d0b801a8a7e)	2024-01-07 11:55:40 -05:00
Mark Johnston	213077d6e4	frag6: Drop unneeded casts from malloc calls No functional change intended. MFC after: 1 week (cherry picked from commit f12a9a4c041a4dbce7dccc85aa5fad155e137d7e)	2024-01-07 11:55:32 -05:00
Andrey V. Elsukov	9be802c04b	Avoid IPv6 source address selection on accepting TCP connections When an application listens on an IPv6 TCP socket, due to ipfw forwarding tag it may handle connections for addresses that do not belong to the jail or even current host (transparent proxy). Syncache code can successfully handle TCP handshake for such connections. When syncache finally accepts connection it uses in6_pcbconnect() to properly initialize new connection info. For IPv4 this scenario just works, but for IPv6 it fails when local address doesn't belong to the jail. This check occurs when in6_pcbladdr() applies IPv6 SAS algorithm. We need IPv6 SAS when we are connection initiator, but in the above case connection is already established and both source and destination addresses are known. Use unused argument to notify in6_pcbconnect() when we don't need source address selection. This will fix `ipfw fwd` to jailed IPv6 address. When we are connection initiator, we still use IPv6 SAS algorithm and apply all related restrictions. MFC after: 1 month Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D41685 (cherry picked from commit 0bf5377b6b9642acc85355062b921a07604b7c04)	2023-10-30 20:12:50 +03:00
Zhenlei Huang	da2b630c12	netinet6: Add sysctl flag CTLFLAG_TUN to loader tunables The following sysctl variables are actually loader tunables. Add sysctl flag CTLFLAG_TUN to them so that `sysctl -T` will report them correctly. 1. net.inet6.ip6.auto_linklocal 2. net.inet6.ip6.accept_rtadv 3. net.inet6.ip6.no_radr No functional change intended. Reviewed by: glebius MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D41928 (cherry picked from commit 03dac3e37993801dab4418087bfedacce0526e66)	2023-10-02 08:49:37 +08:00
Michael Tuexen	c3179e6660	sctp: cleanup cdefs.h include	2023-08-18 15:25:34 +02:00
Warner Losh	685dc743dc	sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s]__FBSDID$"\$FreeBSD\$"$;?\s*\n/	2023-08-16 11:54:36 -06:00
Warner Losh	dfc016587a	sys: Remove $FreeBSD$: two-line .c pattern Remove /^#include\s+<sys/cdefs.h>.*$\n\s+__FBSDID$"\$FreeBSD\$"$;\n/	2023-08-16 11:54:30 -06:00
Warner Losh	71625ec9ad	sys: Remove $FreeBSD$: one-line .c comment pattern Remove /^/[/]\s\$FreeBSD\$.*\n/	2023-08-16 11:54:24 -06:00
Warner Losh	2ff63af9b8	sys: Remove $FreeBSD$: one-line .h pattern Remove /^\s\+\s\$FreeBSD\$.$\n/	2023-08-16 11:54:18 -06:00
Warner Losh	95ee2897e9	sys: Remove $FreeBSD$: two-line .h pattern Remove /^\s\\n \*\s+\$FreeBSD\$$\n/	2023-08-16 11:54:11 -06:00
Michael Tuexen	9ade2745db	sctp: remove duplicate code No functional change intended. MFC after: 1 week	2023-08-08 13:05:39 +02:00
Michael Tuexen	c7587f7a3f	sctp: cleanup No functional change intended. MFC after: 1 week	2023-08-08 12:40:51 +02:00
Jonathan T. Looney	ff3d1a3f9d	frag6: Avoid a possible integer overflow in fragment handling Reviewed by: kp, markj, bz Approved by: so Security: FreeBSD-SA-23:06.ipv6 Security: CVE-2023-3107	2023-08-01 15:45:41 -04:00
Gleb Smirnoff	e3ba0d6add	inpcb: do not copy so_options into inp_flags2 Since `f71cb9f748` socket stays connected with inpcb through latter's lifetime and there is no reason to complicate things and copy these flags. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D41198	2023-07-26 20:35:42 -07:00
Marius Strobl	e82d7b2952	gif(4): Revert in{,6}_gif_output() misalignment handling The code added in `c89c8a1029` in order to compensate possible misalignment caused by prepending the IP4/6 header with an EtherIP one got broken at some point by a rewrite of gif(4). For better or worse, `8018ac153f` relaxed the alignment of struct ip from 32 bit to 16 bit, though. As a result, a 16 bit offset of the IPv4 header induced by the addition of the 16 bit EtherIP one no longer is a problem in the first place. The alignment of struct ip6_hdr currently is even only 8 bit, making it even less problematic with regards to possible misalignment. Thus, remove the code for handling misalignment in in{,6}_gif_output() altogether again. While at it, replace the 3 bcopy(9) calls in gif(4) with memcpy(9) as there's no need to handle overlap here.	2023-07-26 13:14:22 +02:00
Shivank Garg	215bab7924	mac_ipacl: new MAC policy module to limit jail/vnet IP configuration The mac_ipacl policy module enables fine-grained control over IP address configuration within VNET jails from the base system. It allows the root user to define rules governing IP addresses for jails and their interfaces using the sysctl interface. Requested by: multiple Sponsored by: Google, Inc. (GSoC 2019) MFC after: 2 months Reviewed by: bz, dch (both earlier versions) Differential Revision: https://reviews.freebsd.org/D20967	2023-07-26 00:07:57 +00:00
Kristof Provost	9c9a76dc68	mld: always commit state changes on leaving Resolve a race condition where we'd lose the Solicited-node multicast group subscription if we assigned the same IPv6 address twice. PR: 233683 Reviewed by: ae MFC after: 1 week Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D41124	2023-07-24 16:47:34 +02:00
Konstantin Belousov	bc310a95c5	ip output: ensure that mbufs are mapped if ipsec is enabled Ipsec needs access to packet headers to determine if a policy is applicable. It seems that typically IP headers are mapped, but the code arguably needs to check this before blindly accessing them. Then, operations like m_unshare() and m_makespace() are not yet ready for unmapped mbufs. Ensure that the packet is mapped before calling into IPSEC_OUTPUT(). PR: 272616 Reviewed by: jhb, markj Sponsored by: NVidia networking MFC after: 1 week Differential revision: https://reviews.freebsd.org/D41112	2023-07-21 21:51:13 +03:00
Kristof Provost	b8039bf5b3	Fix MINIMAL build Pre-declare struct ucred, to fix build issues on the MINIMAL config: In file included from /usr/src/sys/netpfil/pf/pfsync_nv.c:40: /usr/src/sys/netinet6/ip6_var.h:384:31: error: declaration of 'struct ucred' will not be visible outside of this function [-Werror,-Wvisibility] struct ip6_pktopts , struct ucred , int); ^ /usr/src/sys/netinet6/ip6_var.h:408:28: error: declaration of 'struct ucred' will not be visible outside of this function [-Werror,-Wvisibility] struct inpcb , struct ucred , int, struct in6_addr , int ); ^ 2 errors generated.	2023-07-14 09:18:43 +02:00
Alexander V. Chernikov	bb06a80cf6	netinet[6]: make in[6]_control use ucred instead of td. Reviewed by: markj, zlei Differential Revision: https://reviews.freebsd.org/D40793 MFC after: 2 weeks	2023-07-01 06:52:24 +00:00
Andrey V. Elsukov	0cd2d88d8d	carp: use nd6log() macro to log debug messages Obtained from: Yandex LLC Sponsored by: Yandex LLC	2023-06-28 13:27:37 +03:00
Mark Johnston	6775ef4188	netinet6: Implement in6_cksum_partial() using m_apply() This ensures that in6_cksum_partial() can be applied to unmapped mbufs, which can happen at least when icmp6_reflect() quotes a packet. The basic idea is to restructure in6_cksum_partial() to operate on one mbuf at a time. If the buffer length is odd or unaligned, an extra residual byte may be returned, to be incorporated into the checksum when processing the next buffer. PR: 268400 Reviewed by: cy MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D40598	2023-06-23 09:55:43 -04:00
Alexander V. Chernikov	e32221a15f	netinet6: make IPv6 fragment TTL per-VNET configurable. Having it configurable adds more flexibility, especially for the systems with low amount of memory. Additionally, it allows to speedup frag6/ tests execution. Reviewed by: kp, markj, bz Differential Revision: https://reviews.freebsd.org/D35755 MFC after: 2 weeks	2023-06-01 12:04:49 +00:00
Alexander V. Chernikov	a77facd273	ifnet: consistently call hooks when the interface gets up. Some context on the current IPv6 interface setup & address management: There are two data paths for IPv6 initialisation in context of assigning LL addresses: 1) Userland explicitly requests IFF_UP for the interface w/o any addresses. if_up() then calls in6_if_up(), which calls in6_ifattach(). The latter sets up some initial ND/IN6 state and disables IPv6 for the interface if it’s not loopback. If the interface is loopback, then it adds ::1/128 and LL addresses via in6_ifattach_loopback(). Then, devd notification is generated (if the VNET is the default one), which triggers rc.network ifconfig_up(), causing ifdisabled to be removed via SIOCSIFINFO_IN6 from ifconfig. The kernel SIOCSIFINFO_IN6 handler calls in6_if_up() once again and it assigns the interface link-local address. 2) Userland adds IPv4 or IPv6 address to the interface. SIOCAIFADDR[_IN6] kernel handler calls IPv4/IPv6 protocol handler to add the address. Both then call if_ioctl() with SIOCSIFADDR. Ethernet/loopback ioctl handlers silently set IFF_UP for the interface. Finally, if.c:ifioctl() wrapper code compares old and new interface flags and, if IFF_UP is added, it explicitly calls in6_if_up(), which adds link-local address if either the original address is IPv6 or the interface is loopback. In the latter case, “formal” interface-up notifications are missing. The kernel does not trigger event handler event, does not call carp hook and does not provide any userland notification. This diff unifies the event handling in both scenarios, providing the necessary notifications to the kernel and userland. Reviewed By: kp Differential Revision: https://reviews.freebsd.org/D40332 MFC after: 2 weeks	2023-06-01 11:44:19 +00:00
Doug Rabson	5ab151574c	netinet*: Fix redirects for connections from localhost Redirect rules use PFIL_IN and PFIL_OUT events to allow packet filter rules to change the destination address and port for a connection. Typically, the rule triggers on an input event when a packet is received by a router and the destination address and/or port is changed to implement the redirect. When a reply packet on this connection is output to the network, the rule triggers again, reversing the modification. When the connection is initiated on the same host as the packet filter, it is initially output via lo0 which queues it for input processing. This causes an input event on the lo0 interface, allowing redirect processing to rewrite the destination and create state for the connection. However, when the reply is received, no corresponding output event is generated; instead, the packet is delivered to the higher level protocol (e.g. tcp or udp) without reversing the redirect, the reply is not matched to the connection and the packet is dropped (for tcp, a connection reset is also sent). This commit fixes the problem by adding a second packet filter call in the input path. The second call happens right before the handoff to higher level processing and provides the missing output event to allow the redirect's reply processing to perform its rewrite. This extra processing is disabled by default and can be enabled using pfilctl: pfilctl link -o pf:default-out inet-local pfilctl link -o pf:default-out6 inet6-local PR: 268717 Reviewed-by: kp, melifaro MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40256	2023-05-31 11:11:05 +01:00
Mark Johnston	a306ed50ec	inpcb: Restore missing validation of local addresses for jailed sockets When looking up a listening socket, the SMR-protected lookup routine may return a jailed socket with no local address. This happens when using classic jails with more than one IP address; in a single-IP classic jail, a bound socket's local address is always rewritten to be that of the jail. After commit `7b92493ab1`, the lookup path failed to check whether the jail corresponding to a matched wildcard socket actually owns the address, and would return the match regardless. Restore the omitted checks. Fixes: `7b92493ab1` ("inpcb: Avoid inp_cred dereferences in SMR-protected lookup") Reported by: peter Reviewed by: bz Differential Revision: https://reviews.freebsd.org/D40268	2023-05-30 15:15:48 -04:00
Alexander V. Chernikov	b50e1465e8	routing: plug mbuf leak for the packets hitting IPv6 blackhole route Reported by: Dmitriy Smirnov <fox@sage.su> Tested by: Dmitriy Smirnov <fox@sage.su> MFC after: 1 day	2023-05-17 09:06:04 +00:00
Warner Losh	4d846d260e	spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix	2023-05-12 10:44:03 -06:00
Ed Maste	b73183d1a2	ipv6: disable RFC 4620 nodeinfo by default RFC 4620 is an experimental RFC that can be used to request information about a host, including: - the fully-qualified or single-component name - some set of the Responder's IPv6 unicast addresses - some set of the Responder's IPv4 unicast addresses This is not something that should be made available by default. PR: 257709 Submitted by: ruben@verweg.com Reviewed by: melifaro Relnotes: Yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D39778	2023-04-26 13:47:59 -04:00
Mark Johnston	7b92493ab1	inpcb: Avoid inp_cred dereferences in SMR-protected lookup The SMR-protected inpcb lookup algorithm currently has to check whether a matching inpcb belongs to a jail, in order to prioritize jailed bound sockets. To do this it has to maintain a ucred reference, and for this to be safe, the reference can't be released until the UMA destructor is called, and this will not happen within any bounded time period. Changing SMR to periodically recycle garbage is not trivial. Instead, let's implement SMR-synchronized lookup without needing to dereference inp_cred. This will allow the inpcb code to free the inp_cred reference immediately when a PCB is freed, ensuring that ucred (and thus jail) references are released promptly. Commit `220d892129` ("inpcb: immediately return matching pcb on lookup") gets us part of the way there. This patch goes further to handle lookups of unconnected sockets. Here, the strategy is to maintain a well-defined order of items within a hash chain so that a wild lookup can simply return the first match and preserve existing semantics. This makes insertion of listening sockets more complicated in order to make lookup simpler, which seems like the right tradeoff anyway given that bind() is already a fairly expensive operation and lookups are more common. In particular, when inserting an unconnected socket, in_pcbinhash() now keeps the following ordering: - jailed sockets before non-jailed sockets, - specified local addresses before unspecified local addresses. Most of the change adds a separate SMR-based lookup path for inpcb hash lookups. When a match is found, we try to lock the inpcb and re-validate its connection info. In the common case, this works well and we can simply return the inpcb. If this fails, typically because something is concurrently modifying the inpcb, we go to the slow path, which performs a serialized lookup. Note, I did not touch lbgroup lookup, since there the credential reference is formally synchronized by net_epoch, not SMR. In particular, lbgroups are rarely allocated or freed. I think it is possible to simplify in_pcblookup_hash_wild_locked() now, but I didn't do it in this patch. Discussed with: glebius Tested by: glebius Sponsored by: Klara, Inc. Sponsored by: Modirum MDPay Differential Revision: https://reviews.freebsd.org/D38572	2023-04-20 12:13:06 -04:00
Mark Johnston	3e98dcb3d5	inpcb: Move inpcb matching logic into separate functions These functions will get some additional callers in future revisions. No functional change intended. Discussed with: glebius Tested by: glebius Sponsored by: Modirum MDPay Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38571	2023-04-20 12:13:06 -04:00
Mark Johnston	fdb987bebd	inpcb: Split PCB hash tables Currently we use a single hash table per PCB database for connected and bound PCBs. Since we started using net_epoch to synchronize hash table lookups, there's been a bug, noted in a comment above in_pcbrehash(): connecting a socket can cause an inpcb to move between hash chains, and this can cause a concurrent lookup to follow the wrong linkage pointers. I believe this could cause rare, spurious ECONNREFUSED errors in the worst case. Address the problem by introducing a second hash table and adding more linkage pointers to struct inpcb. Now the database has one table each for connected and unconnected sockets. When inserting an inpcb into the hash table, in_pcbinhash() now looks at the foreign address of the inpcb to figure out which table to use. This ensures that queue linkage pointers are stable until the socket is disconnected, so the problem described above goes away. There is also a small benefit in that in_pcblookup_*() can now search just one of the two possible hash buckets. I also made the "rehash" parameter of in(6)_pcbconnect() unused. This parameter seems confusing and it is simpler to let the inpcb code figure out what to do using the existing INP_INHASHLIST flag. UDP sockets pose a special problem since they can be connected and disconnected multiple times during their lifecycle. To handle this, the patch plugs a hole in the inpcb structure and uses it to store an SMR sequence number. When an inpcb is disconnected - an operation which requires the global PCB database hash lock - the write sequence number is advanced, and in order to reconnect, the connecting thread must wait for readers to drain before reusing the inpcb's hash chain linkage pointers. raw_ip (ab)uses the hash table without using the corresponding accessors. Since there are now two hash tables, it arbitrarily uses the "connected" table for all of its PCBs. This will be addressed in some way in the future. inp iterators which specify a hash bucket will only visit connected PCBs. This is not really correct, but nothing in the tree uses that functionality except raw_ip, which as mentioned above places all of its PCBs in the "connected" table and so is unaffected. Discussed with: glebius Tested by: glebius Sponsored by: Klara, Inc. Sponsored by: Modirum MDPay Differential Revision: https://reviews.freebsd.org/D38569	2023-04-20 12:13:06 -04:00
Mateusz Guzik	f5a365e51f	inet6: protect address manipulation with a lock This is a total hack/bare minimum which follows inet4. Otherwise 2 threads removing the same address can easily crash. Reviewed by: kp Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D39317	2023-03-30 08:46:38 +00:00
Justin Hibbits	bb55bb1740	inet6: Include if_private.h in one more netstack file ip6_input() and ip6_destroy() both directly reference ifnet members. This file was missed in `3d0d5b21` Fixes: `3d0d5b21` ("IfAPI: Explicitly include <net/if_private.h>...") Sponsored by: Juniper Networks, Inc.	2023-03-24 10:25:35 -04:00
Kristof Provost	b52b61c0b6	pf: distinguish forwarding and output cases for pf_refragment6() Re-introduce PFIL_FWD, because pf's pf_refragment6() needs to know if we're ip6_forward()-ing or ip6_output()-ing. ip6_forward() relies on m->m_pkthdr.rcvif, at least for link-local traffic (for in6_get_unicast_scopeid()). rcvif is not set for locally generated traffic (e.g. from icmp6_reflect()), so we need to call the correct output function. Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D39061	2023-03-16 10:59:04 +01:00
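
Note on 31ce49c7ab (rss: add sysctl enable toggle): the message gives net.inet.rss.bits as 3 for 8 CPUs and 2 for 4 CPUs when a 1:1 mapping between CPUs and buckets is wanted. A minimal user-space sketch of that arithmetic follows, assuming the bucket count is 1 << bits. The helper name rss_bits_for_cpus is hypothetical and is not a kernel function.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the kernel: return the smallest
 * number of RSS bits such that (1 << bits) buckets cover ncpus CPUs.
 * Assumption: the bucket count is 1 << net.inet.rss.bits.
 */
static unsigned int
rss_bits_for_cpus(unsigned int ncpus)
{
	unsigned int bits = 0;

	assert(ncpus > 0);
	/* Cap at 31 so the shift below stays well defined. */
	while (bits < 31 && (1u << bits) < ncpus)
		bits++;
	return (bits);
}
```

For 8 CPUs this yields 3 and for 4 CPUs it yields 2, matching the values in the commit message. When the CPU count is not a power of two the result rounds up, so the mapping is no longer strictly 1:1.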


2334 commits
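
Note on 6775ef4188 (netinet6: Implement in6_cksum_partial() using m_apply()): the message describes processing one buffer at a time and carrying an extra residual byte into the next buffer when a length is odd. The sketch below illustrates that idea in plain user-space C for the 16-bit one's-complement Internet checksum. It is not the kernel code: struct cksum_state, cksum_update and cksum_final are hypothetical names, and alignment handling and the IPv6 pseudo-header are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical running state carried from one buffer to the next. */
struct cksum_state {
	uint64_t sum;      /* unfolded sum of big-endian 16-bit words */
	int      have_odd; /* nonzero if a residual byte is pending */
	uint8_t  odd;      /* residual high-order byte of the next word */
};

/* Add one buffer to the running sum, pairing any pending residual byte. */
static void
cksum_update(struct cksum_state *st, const uint8_t *buf, size_t len)
{
	size_t i = 0;

	assert(st != NULL && (buf != NULL || len == 0));
	if (st->have_odd && len > 0) {
		/* Complete the word started by the previous buffer. */
		st->sum += ((uint64_t)st->odd << 8) | buf[0];
		st->have_odd = 0;
		i = 1;
	}
	for (; i + 1 < len; i += 2)
		st->sum += ((uint64_t)buf[i] << 8) | buf[i + 1];
	if (i < len) {
		/* Odd byte left over: keep it for the next buffer. */
		st->odd = buf[i];
		st->have_odd = 1;
	}
}

/* Fold the sum to 16 bits and return its one's complement. */
static uint16_t
cksum_final(const struct cksum_state *st)
{
	uint64_t sum = st->sum;

	if (st->have_odd)
		sum += (uint64_t)st->odd << 8; /* pad the last byte with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}
```

Feeding the same bytes in one call or split at an odd offset gives the same result, which is the property the residual byte preserves.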