In some setups we end up with multiple states created for a single
packet, which in turn can mean we run the packet through dummynet
multiple times. That's not expected or intended. Mark each packet when
it goes through dummynet, and do not pass packets through dummynet if
they're marked as having already passed through.
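The marking idea can be sketched in isolation as follows. This is a minimal illustration and not the pf code: struct pkt, PKT_FLAG_DUMMYNET_DONE and maybe_dummynet() are hypothetical names made up for this sketch, and the shaper is reduced to a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag; the real change marks the packet itself. */
#define	PKT_FLAG_DUMMYNET_DONE	0x01u

/* Hypothetical stand-in for a packet. */
struct pkt {
	uint32_t	flags;
	unsigned int	dn_passes;	/* times the packet was shaped */
};

/*
 * Hand the packet to the (simulated) shaper at most once.
 * Returns true if this call shaped the packet.
 */
static bool
maybe_dummynet(struct pkt *p)
{
	if (p->flags & PKT_FLAG_DUMMYNET_DONE)
		return (false);		/* already shaped, skip */
	p->flags |= PKT_FLAG_DUMMYNET_DONE;
	p->dn_passes++;
	return (true);
}
```

Even if several states match the same packet, only the first call shapes it.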
See also: https://redmine.pfsense.org/issues/14854
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D44365
If we redirect a packet to localhost and it gets dummynet'd it may be
re-injected later (e.g. when delayed) which means it will be passed
through ip_input() again. ip_input() will then reject the packet because
it's directed to the loopback address, but did not arrive on a loopback
interface.
Fix this by having pf set the rcvif to V_loif if we redirect to
loopback.
See also: https://redmine.pfsense.org/issues/15363
Sponsored by: Rubicon Communications, LLC ("Netgate")
Apply the fixes from c6f1116357904 and b8ef285f6cc6a to IPv6 as well.
Ensure that when dummynet re-injects it does so in the correct direction, and
uses the correct dummynet pipes.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Ensure that we pick the correct dummynet pipe (i.e. forward vs. reverse
direction) when applying route-to.
We mark the processing as outbound so that dummynet will re-inject in
the correct phase of processing after it's done with the packet, but
that will cause us to pick the wrong pipe number. Reverse them so that
the incorrect decision ends up picking the correct pipe.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D44366
If we apply a route-to to an inbound packet pf_route() may hand that
packet over to dummynet. Dummynet may then delay the packet, and later
re-inject it. This re-injection (in dummynet_send()) needs to know
if the packet was inbound or outbound, to call the correct path for
continued processing.
That's done based on the pf_pdesc we pass along (through
pf_dummynet_route() and pf_pdesc_to_dnflow()). In the case of pf_route()
on inbound packets that may be wrong, because we're called in the input
path, and didn't update pf_pdesc->dir.
This can manifest in issues with fragmented packets. For example, a
fragmented packet will be re-fragmented in pf_route(), and if dummynet
makes different decisions for some of the fragments (that is, it delays
some and allows others to pass through directly) this will break.
The packets that pass through dummynet without delay will be transmitted
correctly (through the ifp->if_output() call in pf_route()), but
the delayed packets will be re-injected in the input path (and not
the output path, as they should be). These packets will pass through
pf_test(PF_IN) as they're tagged PF_MTAG_FLAG_DUMMYNET. However,
this tag is then removed and the packet will be routed and enter
pf_test(PF_OUT) where pf_reassemble() will hold them indefinitely
(as some fragments have been transmitted directly, and will never hit
pf_test(PF_OUT)).
The fix is simple: we must update pf_pdesc->dir to PF_OUT before we
pass the packet to dummynet.
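A minimal sketch of why the recorded direction matters, under the assumption that re-injection looks only at that field. The types and functions here (struct pdesc, reinject_path(), route_then_shape()) are hypothetical stand-ins, not the pf or dummynet interfaces.

```c
/* Hypothetical direction and resume-path identifiers. */
enum dir { DIR_IN = 1, DIR_OUT = 2 };
enum path { PATH_INPUT, PATH_OUTPUT };

/* Hypothetical stand-in for the packet descriptor passed along. */
struct pdesc {
	enum dir	dir;
};

/* A delayed packet resumes on the path chosen from the recorded direction. */
static enum path
reinject_path(const struct pdesc *pd)
{
	return (pd->dir == DIR_OUT ? PATH_OUTPUT : PATH_INPUT);
}

/*
 * Routing a packet that arrived inbound: record that processing is now
 * outbound before the shaper sees the descriptor.
 */
static enum path
route_then_shape(struct pdesc *pd)
{
	pd->dir = DIR_OUT;
	return (reinject_path(pd));
}
```

Without the assignment, a descriptor created in the input path would send delayed fragments back through input processing, while undelayed fragments go straight out.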
See also: https://redmine.pfsense.org/issues/15156
Reviewed by: rcm
Sponsored by: Rubicon Communications, LLC ("Netgate")
As per RFC 4960 section 3.3.7, an ABORT terminates the connection fully. We
should move the state to CLOSED rather than CLOSING.
Suggested by: Oliver Thomas
See also: https://redmine.pfsense.org/issues/15924
Sponsored by: Rubicon Communications, LLC ("Netgate")
All of the do_cmd() calls are in dummynet.c and specify the socket
option at compile time; none of these removed cases are used in ipfw
after the v3 work.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D53378
(cherry picked from commit 0e2e0fb955adf15a217949bc4cc337d53d2c7259)
(cherry picked from commit 6b1e5d4d20a94b5bebd726eb6d1df8dca2738f8e)
IP_DUMMYNET_GET is no longer used in ipfw(1).
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D53348
(cherry picked from commit 28e52dea96809c7904e498759ee1f79bda929a82)
(cherry picked from commit 73c105268cc6138015241b080bc7945c6cde0fa6)
The failed allocation in the error pertains to IP_FW_XADD, not
IP_FW_ADD.
Reviewed by: ae
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D53359
(cherry picked from commit 498e56142660c8dd864c878e820252358c9a15cf)
(cherry picked from commit c22437c8b574878241a3c897a095ae6939e66743)
Dummynet v3 switched to IP_DUMMYNET3 but did not update these
warnings/errors.
Fixes: cc4d3c30ea ("Bring in the most recent version of ipfw and dummynet, developed")
Sponsored by: The FreeBSD Foundation
Differential Revision: sbin/ipfw/ipfw2.c
(cherry picked from commit 1f95a517880bae5fc0a9fe4463a8f2ec36ed734a)
(cherry picked from commit a5dd21c7dd1f3c8103c2fc6a1caa5635d70671aa)
Virtual Functions have access to a limited number of registers,
and their bus space is smaller. Use KASSERT to detect out-of-bounds
accesses, and eliminate them to avoid kernel panics in production
environments.
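A userspace sketch of a bounds-checked register read, with assert() standing in for KASSERT. struct bar, reg_in_bounds() and rd32() are hypothetical names for illustration; the driver's real accessors differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a mapped register window. */
struct bar {
	const uint8_t	*base;
	size_t		size;	/* smaller for a VF than for a PF */
};

/* True if a 32-bit access at offset reg stays inside the window. */
static bool
reg_in_bounds(const struct bar *b, size_t reg)
{
	return (b->size >= sizeof(uint32_t) &&
	    reg <= b->size - sizeof(uint32_t));
}

/* Read a 32-bit register, asserting that the offset is valid. */
static uint32_t
rd32(const struct bar *b, size_t reg)
{
	uint32_t v;

	assert(reg_in_bounds(b, reg));	/* stand-in for KASSERT */
	memcpy(&v, b->base + reg, sizeof(v));
	return (v);
}
```

The check is written to avoid unsigned wrap-around when the offset is close to the end of the window.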
Signed-off-by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by: jmg
Tested by: mateusz.moga_intel.com
Approved by: kbowling (mentor), erj (mentor)
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D52976
(cherry picked from commit 2c02e6ca7154593d214b62578f67d9fe7db23d70)
Add device IDs and branding strings for E835 adapters.
This is a follow up for E830 adapters with Security Protocol
and Data Model (SPDM) support and RDMA support available
on 100 and 200Gbps links.
Signed-off-by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Approved by: kbowling (mentor), erj (mentor)
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D52782
(cherry picked from commit b202176dc76d862f886778439b96dd1243d8b999)
This is part 1 of the support for the new Intel Ethernet E610 family of devices.
Introduce new PCI device IDs:
• 57AE: Intel(R) E610 (Backplane)
• 57AF: Intel(R) E610 (SFP)
• 57B0: Intel(R) E610 (10 GbE)
• 57B1: Intel(R) E610 (2.5 GbE)
• 57B2: Intel(R) E610 (SGMII)
Key updates for E610 family:
• Firmware manages Link and PHY
• Implement new CSR-based Admin Command Interface (ACI) for SW-FW interaction
• Tested exclusively for x64 operating systems on E610-XT2/XT4 (10G) and E610-IT4 (2.5G)
• Enable link speeds above 1G: 2.5G, 5G and 10G
• NVM Recovery Mode and Rollback support
Signed-off-by: Yogesh Bhosale <yogesh.bhosale@intel.com>
Co-developed-by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Approved by: kbowling (mentor), erj (mentor)
Tested by: gowtham.kumar.ks_intel.com
Sponsored by: Intel Corporation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D50067
(cherry picked from commit dea5f973d0c8d29a79b433283d0a2de8f4615957)
This change reapplies the improvements from commit 89e7335 and adds
additional fixes and code optimizations on top of it.
The ixl driver supports up to 128 multicast filters in hardware. When this
limit is exceeded, the driver should enable multicast promiscuous mode.
When the count drops below 128, it should disable promiscuous mode and
restore individual filters.
The driver previously had problems that could corrupt the multicast filter list.
The main issue was that ixl_dis_multi_promisc() would attempt to disable
promiscuous mode without checking if it was actually enabled, potentially
corrupting existing filters. There was also no state tracking across driver
functions, leading to redundant operations.
This change adds an IXL_FLAGS_MC_PROMISC flag to track the multicast
promiscuous mode state. The flag is set when enabling promiscuous mode and
cleared when disabling it. Early return checks prevent redundant operations
when the mode is already in the desired state, avoiding filter corruption
and unnecessary hardware calls.
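The state tracking described above can be sketched as follows. This is a simplified model, not the ixl code: the sk_ names are hypothetical, hardware calls are reduced to a counter, and treating a count of exactly 128 as still fitting in hardware is an assumption of this sketch.

```c
#include <stdint.h>

#define	SK_FLAGS_MC_PROMISC	0x1u	/* hypothetical state flag */
#define	SK_MAX_MC_FILTERS	128u

/* Hypothetical softc: the state flag plus a counter of hardware calls. */
struct sk_softc {
	uint32_t	flags;
	unsigned int	hw_calls;
};

static void
sk_en_multi_promisc(struct sk_softc *sc)
{
	if (sc->flags & SK_FLAGS_MC_PROMISC)
		return;			/* already enabled */
	sc->hw_calls++;			/* simulated hardware call */
	sc->flags |= SK_FLAGS_MC_PROMISC;
}

static void
sk_dis_multi_promisc(struct sk_softc *sc)
{
	if ((sc->flags & SK_FLAGS_MC_PROMISC) == 0)
		return;			/* not enabled, leave filters alone */
	sc->hw_calls++;			/* simulated hardware call */
	sc->flags &= ~SK_FLAGS_MC_PROMISC;
}

/* Pick the mode that matches the current multicast address count. */
static void
sk_update_mc(struct sk_softc *sc, unsigned int mc_count)
{
	if (mc_count > SK_MAX_MC_FILTERS)
		sk_en_multi_promisc(sc);
	else
		sk_dis_multi_promisc(sc);
}
```

Repeated updates in the same mode touch the hardware only once.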
Signed-off-by: Yogesh Bhosale <yogesh.bhosale@intel.com>
PR: 283820
Approved by: kbowling (mentor)
Tested by: gowtham.kumar.ks_intel.com
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D52549
(cherry picked from commit 46a8a1f08f88c278e60ebb6daa7a551eb641c67b)
According to section 5.1.6.2.1 of version 1.3 of the virtio
specification, the driver MUST NOT set VIRTIO_NET_HDR_F_DATA_VALID in
the flags. So don't do that.
Reviewed by: Timo Völker
Differential Revision: https://reviews.freebsd.org/D53650
(cherry picked from commit 836b3cd9d7910aff5225e9e58189067ca03fae30)
Transmit segment offloading depends on transmit checksum offloading.
Enforce that constraint. This also fixes a bug, since if_hwassist bits
are from the CSUM_ space, not from the IFCAP_ space.
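A sketch of the constraint and of the two separate flag spaces. All CAP_ and HW_ constants and both functions are hypothetical; they only mimic the split between capability bits and hwassist bits and are not the real IFCAP_ or CSUM_ values.

```c
#include <stdint.h>

/* Hypothetical capability bits (what the administrator enables). */
#define	CAP_TXCSUM	0x01u
#define	CAP_TSO		0x02u

/* Hypothetical hwassist bits (what the stack may leave to the device). */
#define	HW_CSUM_TCP	0x10u
#define	HW_CSUM_UDP	0x20u
#define	HW_CSUM_TSO	0x40u

/* TSO cannot stay enabled without transmit checksum offloading. */
static uint32_t
normalize_caps(uint32_t capenable)
{
	if ((capenable & CAP_TXCSUM) == 0)
		capenable &= ~CAP_TSO;
	return (capenable);
}

/* Derive hwassist from capabilities, never by copying capability bits. */
static uint32_t
caps_to_hwassist(uint32_t capenable)
{
	uint32_t hwassist = 0;

	capenable = normalize_caps(capenable);
	if (capenable & CAP_TXCSUM)
		hwassist |= HW_CSUM_TCP | HW_CSUM_UDP;
	if (capenable & CAP_TSO)
		hwassist |= HW_CSUM_TSO;
	return (hwassist);
}
```

Recomputing hwassist from the normalized capabilities on every change keeps the two in sync.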
PR: 290773
Reviewed by: Timo Völker
Tested by: lg@efficientip.com
Differential Revision: https://reviews.freebsd.org/D53629
(cherry picked from commit 4c50ac68166caf7e08c5a9984d63fa91490fa50d)
These structures are copied out to userspace, and it's possible to leak
uninitialized stack bytes since these routines and their callers weren't
careful to clear them first. Add memsets to avoid this.
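The pattern looks roughly like this. struct reply_sketch and fill_reply() are hypothetical and only illustrate clearing a structure before its fields are filled in; they are not the structures touched by this change.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical structure that would be copied out to userspace. */
struct reply_sketch {
	uint8_t		kind;		/* followed by compiler padding */
	uint32_t	value;
	char		name[8];
};

/*
 * Clear the whole structure first, so padding and the unused tail of
 * name cannot carry stale stack contents.
 */
static void
fill_reply(struct reply_sketch *r, uint8_t kind, uint32_t value,
    const char *name)
{
	memset(r, 0, sizeof(*r));
	r->kind = kind;
	r->value = value;
	strncpy(r->name, name, sizeof(r->name) - 1);
}
```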
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by: kp, emaste
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D53342
(cherry picked from commit ff08916e9ac689e6ce734de72325fc2bd9495a35)
The handlers were not checking that the group names are nul-terminated.
Add checks for this.
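One way to express such a check, as a sketch. is_nul_terminated() is a hypothetical helper, not the function added by this change.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * True if the fixed-size buffer contains a terminating NUL within its
 * bounds, so it is safe to treat as a C string.
 */
static bool
is_nul_terminated(const char *buf, size_t size)
{
	return (memchr(buf, '\0', size) != NULL);
}
```

A handler would reject the request when this returns false, before passing the name to any string function.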
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by: zlei
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D53344
(cherry picked from commit 32919a34f17ac1af99dec7376f22a8393c251602)
Fix the htons byteorder of vxlan packets after
`vxlan_pick_source_port` picks a source port during encapsulation.
Reviewed by: zlei, kp, adrian
Differential Revision: https://reviews.freebsd.org/D53022
(cherry picked from commit 1cc316727ebae157b3d035d9fb1ad38310a80698)
The current IPFW version 3 dates to 2010 (commit cc4d3c30ea, "Bring in
the most recent version of ipfw and dummynet, developed").
The compat code for FreeBSD 8 and earlier has a number of issues and is
no longer needed, so remove it.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by: ae, glebius
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D53343
(cherry picked from commit c59aab9a5b3970b3ccec744f759e6cb87e938dbe)
(cherry picked from commit 9657c50cdd7741404d99881fdd9243175086ede1)
m_pullup() here will have freed the mbuf chain, but we pass back an
IP_FW_DENY without any signal that the outer loop should finish. Thus,
rule processing continues without an mbuf and there's a chance that we
conclude that the packet may pass (but there's no mbuf remaining)
depending on the rules that follow it.
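A simplified model of the failure mode. struct chain, pullup() and run_rules() are hypothetical, and the sketch returns from the loop directly where the real code signals the outer loop to finish.

```c
#include <stddef.h>
#include <stdlib.h>

enum verdict { V_PASS, V_DENY };

/* Hypothetical stand-in for a buffer chain. */
struct chain {
	size_t	len;
};

/* Like a pullup: on failure the chain is freed and NULL is returned. */
static struct chain *
pullup(struct chain *c, size_t need)
{
	if (c->len >= need)
		return (c);
	free(c);
	return (NULL);
}

/*
 * Evaluate rules that each need a number of contiguous bytes. Once the
 * chain is gone, stop immediately instead of running later rules.
 */
static enum verdict
run_rules(struct chain **cp, const size_t *need, size_t nrules)
{
	for (size_t i = 0; i < nrules; i++) {
		*cp = pullup(*cp, need[i]);
		if (*cp == NULL)
			return (V_DENY);	/* nothing left to inspect */
	}
	return (V_PASS);
}
```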
PR: 284606
Reviewed by: ae
(cherry picked from commit c0382512bfce872102d213b9bc2550de0bc30b67)
Both for the DIOCADDSTATE ioctl and for states imported through pfsync packets.
Add a test case to exercise this code path.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")
(cherry picked from commit faacc0d968816cf8714c974b6d8df6191cfb0e0d)
Unterminated strings in the anchor or name could cause crashes.
Validate them, and add a test case.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")
(cherry picked from commit 1da3c0ca5b1decaa9cf55859cd134bdcd1218116)
1 GiB is a convenient disk image size for testing. It is also the
installer's minimum size, but the minimum applies to the partition
rather than the whole disk. Testing with a 1 GiB image resulted in the
counterintuitive error "There is not enough free space on <disk> to
install FreeBSD (1.0 GB free, 1.0 GB required)."
Reduce the installer's minimum size slightly to support this case.
Reviewed by: brd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38881
(cherry picked from commit 57e12d397387542b13f175d4c0b8b5adca198690)
This is consistent with other operating systems and with bsdinstall's
UFS config and with bsdinstall's ZFS config prior to commit
0b7472b3d8.
PR: 290857
Fixes: 0b7472b3d8 ("Mount the EFI system partition (ESP) on newly-installed systems.")
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D53642
(cherry picked from commit 4109cdf0f817162cf3032aa589dd180dfa910025)
(cherry picked from commit 65e347d315449e8c28dbcb0c5bb64f79d822d024)
The second and third members of struct bsddialog_menuitem are `bool on`
and `unsigned int depth`. The newfs dialog options in bsdinstall's
partition tool had these two swapped, so the default selection did not
work.
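The mistake is easy to make because both orders compile. The sketch below uses a hypothetical struct menuitem_sketch that mirrors only the member order described above; the remaining members and both helper functions are assumptions for illustration.

```c
#include <stdbool.h>

/* Hypothetical mirror of the relevant member order. */
struct menuitem_sketch {
	const char	*prefix;
	bool		on;
	unsigned int	depth;
	const char	*name;
	const char	*desc;
};

/* Positional initializer with the two values swapped: no diagnostic. */
static struct menuitem_sketch
item_swapped(void)
{
	struct menuitem_sketch it = { "", 0, true, "UFS", "Unix File System" };

	return (it);
}

/* Designated initializers make the intent explicit. */
static struct menuitem_sketch
item_designated(void)
{
	struct menuitem_sketch it = {
		.prefix = "",
		.on = true,
		.depth = 0,
		.name = "UFS",
		.desc = "Unix File System",
	};

	return (it);
}
```

In the swapped form the item ends up not selected and with a depth of 1, which matches a default selection that does not work.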
PR: 290857
Reviewed by: asiciliano
Fixes: 50e244964e ("bsdinstall/partedit: Replace libdialog with libbsddialog")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D53639
(cherry picked from commit 4e36942420712c2ab6ebc2c646e61d47b2b68e7b)
(cherry picked from commit 980aa8d4cfdf57a1f99401fa4160c0d82c927d7c)
- Change vm_page_reclaim_contig[_domain] to return an errno instead
of a boolean. 0 indicates a successful reclaim, ENOMEM indicates
lack of available memory to reclaim, with any other error (currently
only ERANGE) indicating that reclamation is impossible for the
specified address range. Change all callers to only follow
up with vm_page_wait* in the ENOMEM case.
- Introduce vm_domainset_iter_ignore(), which marks the specified
domain as unavailable for further use by the iterator. Use this
function to ignore domains that can't possibly satisfy a physical
allocation request. Since WAITOK allocations run the iterators
repeatedly, this avoids the possibility of infinitely spinning
in domain iteration if no available domain can satisfy the
allocation request.
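The two points above can be modelled together. This sketch is not the VM code: struct dom_iter, iter_ignore(), try_domains() and the two sample callbacks are hypothetical, and the domain count is fixed at four for brevity.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define	NDOMAINS	4

/* Hypothetical iterator state: a bitmask of domains to skip. */
struct dom_iter {
	uint32_t	ignored;
};

static void
iter_ignore(struct dom_iter *it, int domain)
{
	it->ignored |= 1u << domain;
}

/*
 * Try every domain that is not ignored. reclaim() returns 0 on success,
 * ENOMEM if waiting may help, or another errno (e.g. ERANGE) if the
 * domain can never satisfy the request.
 * Returns 0, ENOMEM (caller may wait and retry) or ERANGE (give up).
 */
static int
try_domains(struct dom_iter *it, int (*reclaim)(int domain))
{
	bool want_wait = false;

	for (int d = 0; d < NDOMAINS; d++) {
		if (it->ignored & (1u << d))
			continue;
		int error = reclaim(d);
		if (error == 0)
			return (0);
		if (error == ENOMEM)
			want_wait = true;
		else
			iter_ignore(it, d);	/* never ask this domain again */
	}
	return (want_wait ? ENOMEM : ERANGE);
}

/* Sample callbacks for demonstration only. */
static int
sample_reclaim(int domain)
{
	return (domain == 2 ? ENOMEM : ERANGE);
}

static int
never_fits(int domain)
{
	(void)domain;
	return (ERANGE);
}
```

A caller that may sleep waits only on ENOMEM. Once every domain is ignored, the next pass returns ERANGE without calling reclaim at all, so the retry loop terminates.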
PR: 274252
Reported by: kevans
Tested by: kevans
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D42706
(cherry picked from commit 2619c5ccfe1f7889f0241916bd17d06340142b05)
MFCed as a prerequisite for further MFC of VM domainset changes. Based
on analysis, it would not hurt, and I have been using it in production
for months now.
Resolved the trivial conflict due to commit 718d1928f874 ("LinuxKPI:
make linux_alloc_pages() honor __GFP_NORETRY") having been MFCed before
this one.
This follows the commit 4cdc1f5421, which introduces the IFCAP_HWSTATS
capability.
Fixes: 4cdc1f5421 ("There are some high performance NICs that count statistics in hardware")
MFC after: 3 days
(cherry picked from commit 595acb29a35f36a4fc08b89d3a476f16c1d108b4)
(cherry picked from commit 6bcce275a5a9e10f8e5b990f8cfa2166aa49875a)
Historically this capability is IFCAP_NOMAP but it was renamed to
IFCAP_MEXTPG. Catch up with the change 3f43ada98c.
PR: 289545
Fixes: 3f43ada98c ("Catch up with 6edfd179c8: mechanically rename IFCAP_NOMAP to IFCAP_MEXTPG")
MFC after: 3 days
(cherry picked from commit 5017fdb728811fd3e15d7151524378f49a49aee1)
(cherry picked from commit 5f472754ba6f9cc95607956c6e2ad6483c9dd157)
Some options (in particular, -g) are processed immediately upon being
parsed. This will produce the wrong result in combination with -j since
we only attach to the jail after we're done parsing arguments. Solve
this by attaching to the jail immediately when -j is encountered. The
downside is that e.g. `ifconfig -j foo -j bar` would previously attach
to jail “bar”, whereas now it will attempt to attach to jail “foo”, and
if successful, attempt to attach to jail “bar” within jail “foo”. This
may be considered a feature.
PR: 289134
MFC after: 1 week
Reviewed by: zlei
Differential Revision: https://reviews.freebsd.org/D52501
(cherry picked from commit 18fd1443d205aed6be22966125a4820f77571948)
Note, it looks like this code may be unused since commit 4a77657cbc01
("ipfw: migrate ipfw to 32-bit size rule numbers"). In particular, it
looks like the ipfw_nat_*_ptr pointers are unused now.
Reviewed by: ae
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D53068
(cherry picked from commit 2df39ce5d4a8836ef5fd3c2666f48041042eff42)
When sending UDP packets:
* compute the checksum in the correct order. This only has an impact
if the length of the payload is odd.
* don't send packets with a checksum of zero, use 0xffff instead as
required.
When receiving UDP packets:
* don't do any computations when the checksum is zero.
* compute the checksum in the correct order. This only has an impact
if the length of the payload is odd.
* when computing the checksum, store the pseudo header checksum
* if the checksum is computed as zero, use 0xffff instead.
* also accept packets, when the checksum in the packet is the pseudo
header checksum.
The last point fixes a problem when the DHCP client runs in a VM,
the DHCP server runs on the host serving the VM and the network
interface supports transmit checksum offloading. Since dhclient
doesn't use UDP sockets but bpf devices to read the packets, the
checksum will be incorrect and only contain the checksum of the
pseudo header.
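A sketch of the transmit-side rules listed above: sum in network byte order, pad an odd trailing byte on the right, and never emit zero. sum16(), fold() and udp_tx_cksum() are hypothetical helpers, and the pseudo header contribution is passed in as an already accumulated sum rather than built from addresses.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Add the buffer to sum as big-endian 16-bit words. An odd trailing
 * byte is treated as the high byte of a word padded with zero.
 */
static uint32_t
sum16(const uint8_t *p, size_t len, uint32_t sum)
{
	while (len > 1) {
		sum += ((uint32_t)p[0] << 8) | p[1];
		p += 2;
		len -= 2;
	}
	if (len == 1)
		sum += (uint32_t)p[0] << 8;
	return (sum);
}

/* Fold the carries back in (one's-complement addition). */
static uint16_t
fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)sum);
}

/*
 * Checksum to place in an outgoing UDP header. pseudo is the unfolded
 * sum over the pseudo header. A result of zero is sent as 0xffff,
 * because zero on the wire means "no checksum".
 */
static uint16_t
udp_tx_cksum(uint32_t pseudo, const uint8_t *udp, size_t len)
{
	uint16_t c = (uint16_t)~fold(sum16(udp, len, pseudo));

	return (c == 0 ? 0xffff : c);
}
```

With the eight example bytes from RFC 1071 (00 01 f2 03 f4 f5 f6 f7) and a zero pseudo sum, this yields 0x220d.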
PR: 263229
Reviewed by: markj, Timo Völker
Tested by: danilo
Differential Revision: https://reviews.freebsd.org/D52394
(cherry picked from commit 187ee62c71f2be62870f26ae98de865e330121be)
When the SCTP, TCP, or UDP implementation sends a packet, it does not
compute the corresponding checksum but defers that. The network layer
determines whether the network interface selected for the packet
has the requested capability, and computes the checksum in software
if the selected network interface doesn't have the requested
capability.
Do this not only for packets being sent by the local SCTP, TCP,
and UDP stack, but also when forwarding packets. Furthermore, when
such packets are delivered to a local SCTP, TCP, or UDP stack, do not
compute or validate the checksum, since such packets have never been on
the wire.
This allows checksum offloading to be supported also in the case of local
virtual machines or jails.
Support for epair, vtnet, and tap interfaces will be added in
separate commits.
Reviewed by: kp, rgrimes, tuexen, manpages
Differential Revision: https://reviews.freebsd.org/D51475
(cherry picked from commit bcb298fa9e23c1192c5707086a67d3b396186abc)
This describes the current status of the implementation.
While there, be a bit more precise on how long the checksum
computation is delayed.
Reviewed by: Timo Völker, bcr
Differential Revision: https://reviews.freebsd.org/D51590
(cherry picked from commit fe35f275ab0240cb5ed05484c943293a71aadb5f)
Approved by: so
(cherry picked from commit 1dd66c6ac2c146f540b2ff825fbee442354aeee5)
(cherry picked from commit 7272e2d029c20c3144d7aa49500dc86d70344030)
While TCP disallows connect()ing a socket with SO_REUSEPORT_LB, UDP does
not. As a result, a connected UDP socket can be placed in the lbgroup
hash and thus receive datagrams from sources other than the connected
host.
Reported by: Amit Klein <amit.klein@mail.huji.ac.il>
Reported by: Omer Ben Simhon <omer.bensimhon@mail.huji.ac.il>
Reviewed by: glebius
Approved by: so
Security: FreeBSD-SA-25:09.netinet
Security: CVE-2025-24934
(cherry picked from commit 320ad3dec5ff1b37f6907a47961c18b9d77e6a53)
(cherry picked from commit e276759b368701a49e543c45d5d6ea08ed4fbc38)
The type of the variables promisc and allmulti was changed from int to bool
by commit [1].
[1] 7dce56596f Convert to if_foreach_llmaddr() KPI
MFC after: 3 days
(cherry picked from commit 80dfed11fc1c61ce9168db01dee263447619e859)
Keep the hwassist flags for transmit checksum offload and transmit
segment offload in sync with the enabled capabilities.
Reported by: Timo Völker
Reviewed by: Timo Völker
Differential Revision: https://reviews.freebsd.org/D52765
(cherry picked from commit f2575d56c8c9a8acad4a61a3586546dff4febce1)