For pppoe/ng interfaces sometimes we enter ip6_tryforward() with
a NULL pointer array and IN6_LINKMTU() glancing over the fact
that this is not a valid destination since if_afdata structure
is not initialized.
While here remove the RT_LINK_IS_UP macro since nothing outside
of nhop is using it.
This is probably a side effect generator, but fixing one spot
instead of the general case would leave other holes in the stack.
Do not return a route destination if the address families were not
yet attached.
To comply with LINCE certification, it's necessary to ensure that
packets to 0.0.0.0/::0 are dropped and logged by the firewall. Such
packets are dropped by ip_input() and ip6_input() before reaching pfil
hooks; reorder the checks to give firewalls a chance to drop the packets
themselves, as this gives better observability.
Note that ip_forward() and ip6_forward() ensure that such packets are
not forwarded; they are passed back unmodified.
pfil hooks (i.e. firewalls) may pass, modify or free the mbuf passed
to them. (E.g. when rejecting a packet, or when gathering up packets
for reassembly).
If the hook returns PFIL_PASS the mbuf must still be present. Assert
this in pfil_mem_common() and ensure that ipfilter follows this
convention. pf and ipfw already did.
Similarly, if the hook returns PFIL_DROPPED or PFIL_CONSUMED the mbuf
must have been freed (or now be owned by the firewall for further
processing, like packet scheduling or reassembly).
This allows us to remove a few extraneous NULL checks.
Suggested by: tuexen
Reviewed by: tuexen, zlei
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D43617
This patch provides UDP encapsulation of ESP packets over IPv6.
Ports the IPv4 code to IPv6 and adds support for IPv6 in udpencap.c
As required by the RFC and unlike in IPv4 encapsulation,
UDP checksums are calculated.
Co-authored-by: Aurelien Cazuc <aurelien.cazuc.external@stormshield.eu>
Sponsored-by: Stormshield
Sponsored-by: Wiktel
Sponsored-by: Klara, Inc.
Fix KASSERT in 80044c78 causing build failures
Move the KASSERT to where struct ip6_hdr is populated
Fixes: 80044c785cb040a2cf73779d23f9e1e81a00c6c3
Reported-by: bapt
Reviewed-by: markj
Sponsored-by: Klara, Inc.
This removes the if_output calls in the pf(4) code that escape further
processing by defering the forwarding execution to the network stack
using on/off style sysctls for both IPv4 and IPv6.
Also see: https://reviews.freebsd.org/D8877
This commit also includes the original refactoring changes
This change allows the kernel to operate with the default netisr cpu-affinity settings while having RSS compiled in. Normally, RSS changes quite a bit of the behaviour of the kernel dispatch service - this change allows for reducing impact on incompatible hardware while preserving the option to boost throughput speeds based on packet flow CPU affinity.
Make sure to compile the following options in the kernel:
options RSS
As well as setting the following sysctls:
net.inet.rss.enabled: 1
net.isr.bindthreads: 1
net.isr.maxthreads: -1 (automatically sets it to the number of CPUs)
And optionally (to force a 1:1 mapping between CPUs and buckets):
net.inet.rss.bits: 3 (for 8 CPUs)
net.inet.rss.bits: 2 (for 4 CPUs)
etc.
Set pin_default_swi to 0 by default in the RSS case.
Based on a patch originally found in m0n0wall, expanded
to IPv6 and aligned with FreeBSD's IP input path.
The limit may not be correctly accounted for on the WAN
interface due to dummynet counting the packet again even
though it was already processed.
The problem here is that there's no proper way to reinject
the packet at the point where it was previously removed
from so we make the assumption that ip input was already
done (including pfil) and more or less directly move to
packet output processing.
While here move the passin label up to take the extra check
but avoiding a second label. Also remove the spurious tag
read for forward check since we don't use it and we should
really trust the mbuf flag.
If the V_connect_ifaddr_wild sysctl says that we shouldn't infer a
destination address, return an error. Otherwise it's possible for use
of an unspecified foreign address to trigger a subsequent assertion
failure, for example in in_pcblookup_hash_locked().
Similarly, if no interface addresses are assigned, fail quickly upon an
attempt to connect to the unspecified address.
Reported by: Shawn Webb <shawn.webb@hardenedbsd.org>
MFC after: 2 weeks
Reviewed by: zlei, allanjude, emaste
Differential Revision: https://reviews.freebsd.org/D46454
(cherry picked from commit 0c605af3f9d9e66be6af0a3bbc36dbedc5dfe516)
See the discussion in Bugzilla PR 280705 for context.
PR: 280705
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D46259
(cherry picked from commit 417b35a97b7669eb0bf417b43e97cccbedbce6f9)
The nd6 code listens for RTM_DELETE events so that it can mark the
corresponding default router as inactive in the case where the default
route is deleted. A subsequent RA from the router may then reinstall
the default route.
Commit fedeb08b6a broke this for non-multipath routes, as
rib_decompose_notification() only invokes the callback for multipath
routes. Restore the old behaviour. Also ensure that we update the
router only for RTM_DELETE notifications, lost in commit 2259a03020.
Reviewed by: bz
Fixes: fedeb08b6a ("Introduce scalable route multipath.")
Fixes: 2259a03020 ("Rework part of routing code to reduce difference to D26449.")
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Sponsored by: Bell Tower Integration
Differential Revision: https://reviews.freebsd.org/D46020
(cherry picked from commit a48df53e4249499be3e8779dd30888a405aa81ae)
Zero means limit is disabled, so the value doesn't need to be checked
against jitter value.
Fixes: ac44739fd834f51cacb26485a4140fd482e20150
Fixes: a03aff88a14448c3084a0384082ec996d7213897
(cherry picked from commit 4399e055ea610cdefa1470ad1ee614dd81ba5e56)
Use counter_ratecheck() instead of racy and slow ppsratecheck. Use a
separate counter for every currently known type of ICMPv6. Provide logging
of ratelimit events. Provide jitter to counter open UDP port detection.
Reviewed by: tuexen, zlei
Differential Revision: https://reviews.freebsd.org/D44482
(cherry picked from commit a03aff88a14448c3084a0384082ec996d7213897)
Most of them can be declared as static after the move out of in6_proto.c.
Keeping sysctl(9) declarations with their text descriptions next to the
variable declaration create self-documenting code. There should be no
functional changes.
Differential Revision: https://reviews.freebsd.org/D44481
(cherry picked from commit 4f96be33fe7676c69c5abb476bb09bba0c63a3f4)
The generation of ICMP6_ECHO_REPLY bypasses icmp6_error(), thus rate
limit was not applied.
Reviewed by: tuexen, zlei
Differential Revision: https://reviews.freebsd.org/D44480
(cherry picked from commit 32aeee8ce7e72738fff236ccd5629d55035458f8)
in6_mapped_sockaddr() and in6_mapped_peeraddr() both define a local
variable named 'inp', but in the non-INET case, this variable is set
and never used, causing a compiler error:
/src/freebsd/src/lf/sys/netinet6/in6_pcb.c:547:16: error:
variable 'inp' set but not used [-Werror,-Wunused-but-set-variable]
547 | struct inpcb *inp;
| ^
/src/freebsd/src/lf/sys/netinet6/in6_pcb.c:573:16: error:
variable 'inp' set but not used [-Werror,-Wunused-but-set-variable]
573 | struct inpcb *inp;
Fix this by guarding all the INET-specific logic, including the variable
definition, behind #ifdef INET.
While here, tweak formatting in in6_mapped_peeraddr() so both functions
are the same.
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1155
(cherry picked from commit 042fb58d009e7efc5b334b68fffbef9b1f620ec8)
The only element of of in6_addr that is specified in RFC 3493 or
in POSIX.1-2017 is s6_addr, implemented via a #define to a union
member. However, FreeBSD and other BSD systems have additional
definitions for the other union members, s6_addr{8,16,32} which
are defined for the kernel and loader. Some Linux applications
also use them, and they seem to be allowed by the RFC and POSIX.
Remove the current ifdefs, exposing the additional fields to user
level, and replace with #if __BSD_VISIBLE. Add an explanatory
comment expanding on the previous "nonstandard" comment.
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D44979
(cherry picked from commit eb3dbf2dbe22ed6d4df54aebbf23f5b555a21cf1)
Don't report a BACKUP CARP address as local. These two functions are used
only by source address validation for input packets, controlled by sysctls
net.inet.ip.source_address_validation and
net.inet6.ip6.source_address_validation. For this purpose we definitely
want to treat BACKUP addresses as non local.
This change is conservative and doesn't modify compat in_localip() and
in6_localip(). They are used more widely than the FIB-aware versions.
The change would modify the notion of ipfw(4) 'me' keyword. There might
be other consequences as in_localip() is used by various tunneling
protocols.
PR: 277349
(cherry picked from commit 56f7860087eec14b4a65310b70bd704e79e1b48c)
This patch allows the IPPROTO_UDPLITE-level socket options
UDPLITE_SEND_CSCOV and UDPLITE_RECV_CSCOV to be used on
AF_INET6 sockets in addition to AF_INET sockets.
Reviewed by: ae, rscheff
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42430
(cherry picked from commit 03c3a70abe5e9fa259b954de78ae69229fa9c99f)
First, merge in_pcbdetach() with in_pcbfree(). The comment for
in_pcbdetach() was no longer correct. Then, make sure we remove
the inpcb from the hash before we commit any destructive actions
on it. There are couple functions that rely on the hash lock
skipping SMR + inpcb lock to lookup an inpcb. Although there are
no known functions that similarly rely on the global inpcb list
lock, also do list removal before destructive actions.
PR: 273890
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D43122
(cherry picked from commit a13039e2709277b1c3b159e694cc909a5e044151)
The code which removes a fragment queue from the per-VNET hash table was
duplicated three times. Factor it out into a function. No functional
change intended.
Reviewed by: kp, bz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43228
(cherry picked from commit 0736a38072b52204289c669770a34d0b801a8a7e)
When an application listens IPv6 TCP socket, due to ipfw
forwarding tag it may handle connections for addresses that do not
belongs to the jail or even current host (transparent proxy).
Syncache code can successfully handle TCP handshake for such connections.
When syncache finally accepts connection it uses in6_pcbconnect() to
properly initlize new connection info.
For IPv4 this scenario just works, but for IPv6 it fails when
local address doesn't belongs to the jail. This check occurs when
in6_pcbladdr() applies IPv6 SAS algorithm.
We need IPv6 SAS when we are connection initiator, but in the above
case connection is already established and both source and destination
addresses are known.
Use unused argument to notify in6_pcbconnect() when we don't need
source address selection. This will fix `ipfw fwd` to jailed IPv6
address.
When we are connection initiator, we stil use IPv6 SAS algorithm and
apply all related restrictions.
MFC after: 1 month
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D41685
(cherry picked from commit 0bf5377b6b9642acc85355062b921a07604b7c04)
The following sysctl variables are actually loader tunables. Add sysctl
flag CTLFLAG_TUN to them so that `sysctl -T` will report them correctly.
1. net.inet6.ip6.auto_linklocal
2. net.inet6.ip6.accept_rtadv
3. net.inet6.ip6.no_radr
No functional change intended.
Reviewed by: glebius
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D41928
(cherry picked from commit 03dac3e37993801dab4418087bfedacce0526e66)
Since f71cb9f748 socket stays connnected with inpcb through latter's
lifetime and there is no reason to complicate things and copy these
flags.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D41198
The code added in c89c8a1029 in order
to compensate possible misalignment caused by prepending the IP4/6
header with an EtherIP one got broken at some point by a rewrite of
gif(4). For better or worse, 8018ac153f
relaxed the alignment of struct ip from 32 bit to 16 bit, though. As
a result, a 16 bit offset of the IPv4 header induced by the addition
of the 16 bit EtherIP one no longer is a problem in the first place.
The alignment of struct ip6_hdr currently is even only 8 bit, making
it even less problematic with regards to possible misalignment.
Thus, remove the code for handling misalignment in in{,6}_gif_output()
altogether again.
While at it, replace the 3 bcopy(9) calls in gif(4) with memcpy(9) as
there's no need to handle overlap here.
The mac_ipacl policy module enables fine-grained control over IP address
configuration within VNET jails from the base system.
It allows the root user to define rules governing IP addresses for
jails and their interfaces using the sysctl interface.
Requested by: multiple
Sponsored by: Google, Inc. (GSoC 2019)
MFC after: 2 months
Reviewed by: bz, dch (both earlier versions)
Differential Revision: https://reviews.freebsd.org/D20967
Resolve a race condition where we'd lose the Solicited-node multicast
group subscription if we assigned the same IPv6 address twice.
PR: 233683
Reviewed by: ae
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D41124
Ipsec needs access to packet headers to determine if a policy is
applicable. It seems that typically IP headers are mapped, but the code
is arguably needs to check this before blindly accessing them. Then,
operations like m_unshare() and m_makespace() are not yet ready for
unmapped mbufs.
Ensure that the packet is mapped before calling into IPSEC_OUTPUT().
PR: 272616
Reviewed by: jhb, markj
Sponsored by: NVidia networking
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D41112
Pre-declare struct ucred, to fix build issues on the MINIMAL config:
In file included from /usr/src/sys/netpfil/pf/pfsync_nv.c:40:
/usr/src/sys/netinet6/ip6_var.h:384:31: error: declaration of 'struct ucred' will not be visible outside of this function [-Werror,-Wvisibility]
struct ip6_pktopts *, struct ucred *, int);
^
/usr/src/sys/netinet6/ip6_var.h:408:28: error: declaration of 'struct ucred' will not be visible outside of this function [-Werror,-Wvisibility]
struct inpcb *, struct ucred *, int, struct in6_addr *, int *);
^
2 errors generated.
This ensures that in6_cksum_partial() can be applied to unmapped mbufs,
which can happen at least when icmp6_reflect() quotes a packet.
The basic idea is to restructure in6_cksum_partial() to operate on one
mbuf at a time. If the buffer length is odd or unaligned, an extra
residual byte may be returned, to be incorporated into the checksum when
processing the next buffer.
PR: 268400
Reviewed by: cy
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D40598
Having it configurable adds more flexibility, especially
for the systems with low amount of memory.
Additionally, it allows to speedup frag6/ tests execution.
Reviewed by: kp, markj, bz
Differential Revision: https://reviews.freebsd.org/D35755
MFC after: 2 weeks
Some context on the current IPv6 interface setup & address management:
There are two data path for IPv6 initialisation in context of assigning
LL addresses:
1) Userland explicitly requests IFF_UP for the interface w/o any addresses.
if_up() then calls in6_if_up(), which calls in6_ifattach().
The latter sets up some initial ND/IN6 state and disables IPv6 for the
interface if it’s not loopback. If the interface is loopback, then it
adds ::1/128 and LL addresses via in6_ifattach_loopback().
Then, devd notification is generated (if the VNET is the default one),
which triggers rc.network ifconfig_up(), causing ifdisabled to be removed
via SIOCSIFINFO_IN6 from ifconfig. The kernel SIOCSIFINFO_IN6 handler
calls in6_if_up() once again and it assigns the interface link-local address.
2) Userland adds IPv4 or IPv6 address to the interface. SIOCAIFADDR[_IN6]
kernel handler calls IPv4/IPv6 protocol handler to add the address.
Both then call if_ioctl() with SIOCSIFADDR. Ethernet/loopback ioctl handlers
silently sets IFF_UP for the interface. Finally, if.c:ifioctl() wrapper code
compares old and new interface flags and, if IFF_UP is added, it explicitly
calls in6_if_up(), which adds link-local address if either the original
address is IPv6 or the interface is loopback.
In the latter case, “formal” interface-up notifications are missing.
The kernel does not trigger event handler event, does not call carp hook
and does not provide any userland notification.
This diff unifies the event handling in both scenarios, providing the
necessary notifications to the kernel and userland.
Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D40332
MFC after: 2 weeks