opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-04-29 10:11:09 -04:00

Author	SHA1	Message	Date
Alexander V. Chernikov	9e88f47c8f	Unbreak LINT-NOINET[6] builds broken in r360191. Reported by: np	2020-04-23 06:55:33 +00:00
Michael Tuexen	8262311cbe	Improve input validation when processing AUTH chunks. Thanks to Natalie Silvanovich from Google for finding and reporting the issue found by her in the SCTP userland stack. MFC after: 3 days X-MFC with: https://svnweb.freebsd.org/changeset/base/360193	2020-04-22 21:22:33 +00:00
Michael Tuexen	97feba891d	Improve input validation when processing AUTH chunks. Thanks to Natalie Silvanovich from Google for finding and reporting the issue found by her in the SCTP userland stack. MFC after: 3 days	2020-04-22 12:47:46 +00:00
Alexander V. Chernikov	8d6708ba80	Convert TOE routing lookups to the new routing KPI. Reviewed by: np Differential Revision: https://reviews.freebsd.org/D24388	2020-04-22 07:53:43 +00:00
Richard Scheffenegger	bb410f9ff2	revert rS360143 - Correctly set up initial cwnd due to syzkaller panics found Reported by: tuexen Approved by: tuexen (mentor) Sponsored by: NetApp, Inc.	2020-04-22 00:16:42 +00:00
Richard Scheffenegger	73b7696693	Correctly set up the initial TCP congestion window in all cases, by adjusting snd_una right after the connection initialization, to include the one byte in sequence space occupied by the SYN bit. This does not change the regular ACK processing, while making the BYTES_THIS_ACK macro work properly. PR: 235256 Reviewed by: tuexen (mentor), rgrimes (mentor) Approved by: tuexen (mentor), rgrimes (mentor) MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D19000	2020-04-21 13:05:44 +00:00
Jonathan T. Looney	5d6e356cb0	Avoid calling protocol drain routines more than once per reclamation event. mb_reclaim() calls the protocol drain routines for each protocol in each domain. Some protocols exist in more than one domain and share drain routines. In the case of SCTP, it also uses the same drain routine for its SOCK_SEQPACKET and SOCK_STREAM entries in the same domain. On systems with INET, INET6, and SCTP all defined, mb_reclaim() calls sctp_drain() four times. On systems with INET and INET6 defined, mb_reclaim() calls tcp_drain() twice. mb_reclaim() is the only in-tree caller of the pr_drain protocol entry. Eliminate this duplication by ensuring that each pr_drain routine is only specified for one protocol entry in one domain. Reviewed by: tuexen MFC after: 2 weeks Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D24418	2020-04-16 20:17:24 +00:00
Alexander V. Chernikov	539642a29d	Add nhop parameter to rti_filter callback. One of the goals of the new routing KPI defined in r359823 is to entirely hide `struct rtentry` from the consumers. This will make it possible to improve routing subsystem internals and deliver more features much faster. This change is one of the ongoing changes to eliminate direct struct rtentry field accesses. Additionally, with the followup multipath changes, a single rtentry can point to multiple nexthops. With that in mind, convert rti_filter callback used when traversing the routing table to accept pair (rt, nhop) instead of nexthop. Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D24440	2020-04-16 17:20:18 +00:00
Richard Scheffenegger	d7ca3f780d	Reduce default TCP delayed ACK timeout to 40ms. Reviewed by: kbowling, tuexen Approved by: tuexen (mentor) MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D23281	2020-04-16 15:59:23 +00:00
Alexander V. Chernikov	9ac7c6cfed	Convert IP/IPv6 forwarding, ICMP processing and IP PCB laddr selection to the new routing KPI. Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D24245	2020-04-14 23:06:25 +00:00
Michael Tuexen	b89af8e16d	Improve the TCP blackhole detection. The principle is to reduce the MSS in two steps and try each candidate two times. However, if two candidates are the same (which is the case in TCP/IPv6), this candidate was tested four times. This patch ensures that each candidate actually reduced the MSS and is only tested 2 times. This reduces the time window of misclassifying a temporary outage as an MTU issue. Reviewed by: jtl MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D24308	2020-04-14 16:35:05 +00:00
Andrew Gallatin	23feb56348	KTLS: Re-work unmapped mbufs to carry ext_pgs in the mbuf itself. While the original implementation of unmapped mbufs was a large step forward in terms of reducing cache misses by enabling mbufs to carry more than a single page for sendfile, they are rather cache unfriendly when accessing the ext_pgs metadata and data. This is because the ext_pgs part of the mbuf is allocated separately, and almost guaranteed to be cold in cache. This change takes advantage of the fact that unmapped mbufs are never used at the same time as pkthdr mbufs. Given this fact, we can overlap the ext_pgs metadata with the mbuf pkthdr, and carry the ext_pgs meta directly in the mbuf itself. Similarly, we can carry the ext_pgs data (TLS hdr/trailer/array of pages) directly after the existing m_ext. In order to be able to carry 5 pages (which is the minimum required for a 16K TLS record which is not perfectly aligned) on LP64, I've had to steal ext_arg2. The only user of this in the xmit path is sendfile, and I've adjusted it to use arg1 when using unmapped mbufs. This change is almost entirely mechanical, except that we change mb_alloc_ext_pgs() to no longer allow allocating pkthdrs, the change to avoid ext_arg2 as mentioned above, and the removal of the ext_pgs zone. This change saves roughly 2% "raw" CPU (~59% -> 57%), or over 3% "scaled" CPU on a Netflix 100% software kTLS workload at 90+ Gb/s on Broadwell Xeons. In a follow-on commit, I plan to remove some hacks to avoid accessing ext_pgs fields of mbufs, since they will now be in cache. Many thanks to glebius for helping to make this better in the Netflix tree. Reviewed by: hselasky, jhb, rrs, glebius (early version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24213	2020-04-14 14:46:06 +00:00
Alexander V. Chernikov	6722086045	Plug netmask NULL check during route addition causing kernel panic. This bug was introduced by the r359823. Reported by: hselasky	2020-04-14 13:12:22 +00:00
Kristof Provost	1d126e9b94	carp: Widen epoch coverage Fix panics related to calling code which expects to be running inside the NET_EPOCH from outside that epoch. This leads to panics (with INVARIANTS) such as this one: panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/netinet/if_ether.c:373 cpuid = 7 time = 1586095719 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0090819700 vpanic() at vpanic+0x182/frame 0xfffffe0090819750 panic() at panic+0x43/frame 0xfffffe00908197b0 arprequest_internal() at arprequest_internal+0x59e/frame 0xfffffe00908198c0 arp_announce_ifaddr() at arp_announce_ifaddr+0x20/frame 0xfffffe00908198e0 carp_master_down_locked() at carp_master_down_locked+0x10d/frame 0xfffffe0090819910 carp_master_down() at carp_master_down+0x79/frame 0xfffffe0090819940 softclock_call_cc() at softclock_call_cc+0x13f/frame 0xfffffe00908199f0 softclock() at softclock+0x7c/frame 0xfffffe0090819a20 ithread_loop() at ithread_loop+0x279/frame 0xfffffe0090819ab0 fork_exit() at fork_exit+0x80/frame 0xfffffe0090819af0 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0090819af0 --- trap 0, rip = 0, rsp = 0, rbp = 0 --- Widen the NET_EPOCH to cover the relevant (callback / task) code. Differential Revision: https://reviews.freebsd.org/D24302	2020-04-12 16:09:21 +00:00
Alexander V. Chernikov	a666325282	Introduce nexthop objects and new routing KPI. This is the foundational change for the routing subsystem rearchitecture. More details and goals are available in https://reviews.freebsd.org/D24141 . This patch introduces the concept of nexthop objects and a new nexthop-based routing KPI. Nexthops are objects containing all necessary information for performing the packet output decision. Output interface, mtu, flags, gw address goes there. For most of the cases, these objects will serve the same role as the struct rtentry is currently serving. Typically there will be low tens of such objects for the router even with multiple BGP full-views, as these objects will be shared between routing entries. This allows storing more information in the nexthop. New KPI: struct nhop_object *fib4_lookup(uint32_t fibnum, struct in_addr dst, uint32_t scopeid, uint32_t flags, uint32_t flowid); struct nhop_object *fib6_lookup(uint32_t fibnum, const struct in6_addr *dst6, uint32_t scopeid, uint32_t flags, uint32_t flowid); These 2 functions are intended to replace all flavours of <in_|in6_>rtalloc[1]<_ign><_fib>, mpath functions and the previous fib[46]-generation functions. Upon successful lookup, they return a nexthop object which is guaranteed to exist within the current NET_EPOCH. If a longer lifetime is desired, one can specify NHR_REF as a flag and get a referenced version of the nexthop. Reference semantics closely resemble the rtentry ones, allowing sed-style conversion. Additionally, another 2 functions are introduced to support uRPF functionality inside a variety of our firewalls. Their primary goal is to hide the multipath implementation details inside the routing subsystem, greatly simplifying firewall implementations: int fib4_lookup_urpf(uint32_t fibnum, struct in_addr dst, uint32_t scopeid, uint32_t flags, const struct ifnet *src_if); int fib6_lookup_urpf(uint32_t fibnum, const struct in6_addr *dst6, uint32_t scopeid, uint32_t flags, const struct ifnet *src_if); All functions have a separate scopeid argument, paving the way to eliminating IPv6 scope embedding and allowing support for IPv4 link-locals in the future. Structure changes: * rtentry gets new 'rt_nhop' pointer, slightly growing the overall size. * rib_head gets new 'rnh_preadd' callback pointer, slightly growing overall sz. Old KPI: During the transition state old and new KPI will coexist. As there are another 4-5 decent-sized conversion patches, it will probably take a couple of weeks. To support both KPIs, fields not required by the new KPI (most of rtentry) have to be kept, resulting in the temporary size increase. Once conversion is finished, rtentry will notably shrink. More details: * architectural overview: https://reviews.freebsd.org/D24141 * list of the next changes: https://reviews.freebsd.org/D24232 Reviewed by: ae,glebius(initial version) Differential Revision: https://reviews.freebsd.org/D24232	2020-04-12 14:30:00 +00:00
Michael Tuexen	07ddae2822	Revert https://svnweb.freebsd.org/changeset/base/359809 The intended change was sp->next.tqe_next = NULL; sp->next.tqe_prev = NULL; which doesn't fix the issue I'm seeing and the committed fix is not the intended fix due to copy-and-paste. Thanks a lot to Conrad Meyer for making me aware of the problem. Reported by: cem	2020-04-12 09:31:36 +00:00
Michael Tuexen	9803dbb3ea	Zero out pointers for consistency. This was found by running syzkaller on an INVARIANTS kernel. MFC after: 3 days	2020-04-11 20:36:54 +00:00
Alexander V. Chernikov	4684d3cbcb	Remove per-AF radix_mpath initialization functions. Split their functionality by moving random seed allocation to SYSINIT and calling (new) generic multipath function from standard IPv4/IPv6 RIB init handlers. Differential Revision: https://reviews.freebsd.org/D24356	2020-04-11 07:37:08 +00:00
Warner Losh	28540ab153	Fix copyright year and eliminate the obsolete all rights reserved line. Reviewed by: rrs@	2020-04-08 17:55:45 +00:00
Michael Tuexen	f4cb790a35	Do more argument validation under INVARIANTS when starting/stopping an SCTP timer. MFC after: 1 week	2020-04-06 13:58:13 +00:00
Alexander V. Chernikov	66bc03d415	Use interface fib for proxyarp checks. Before the change, proxyarp checks for src and dst addresses were performed using default fib, breaking multi-fib scenario. PR: 245181 Submitted by: Scott Aitken (original version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D24244	2020-04-02 20:06:37 +00:00
Michael Tuexen	413c3db101	Allow the TCP blackhole detection to be disabled entirely, enabled only for IPv4, enabled only for IPv6, or enabled for both IPv4 and IPv6. The current blackhole detection might classify a temporary outage as an MTU issue and permanently reduce the MSS. Since the consequences of such a reduction due to a misclassification are much more drastic for IPv4 than for IPv6, allow the administrator to enable it for IPv6 only. Reviewed by: bcr@ (man page), Richard Scheffenegger Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D24219	2020-03-31 15:54:54 +00:00
Mark Johnston	9b1d850be8	Remove the "config" taskqgroup and its KPIs. Equivalent functionality is already provided by taskqueue(9), just use that instead. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2020-03-30 14:24:03 +00:00
Michael Tuexen	9aca687811	Small cleanup by using a variable just assigned. MFC after: 1 week	2020-03-28 22:35:04 +00:00
Michael Tuexen	25ec355353	Handle integer overflows correctly when converting msecs and secs to ticks and vice versa. These issues were caught by recently added panic() calls on INVARIANTS systems. Reported by: syzbot+b44787b4be7096cd1590@syzkaller.appspotmail.com Reported by: syzbot+35f82d22805c1e899685@syzkaller.appspotmail.com MFC after: 1 week	2020-03-28 20:25:45 +00:00
Ed Maste	c012cfe68a	sys/netinet: remove spurious doubled ;s	2020-03-27 23:10:18 +00:00
Michael Tuexen	d5d190f2f9	Some more uint32_t cleanups, no functional change. MFC after: 1 week	2020-03-27 21:48:52 +00:00
Michael Tuexen	239e5865df	Use uint32_t where it is expected to be used. No functional change. MFC after: 1 week	2020-03-27 11:08:11 +00:00
Michael Tuexen	7c63520c42	Remove an optimization, which was incorrect a couple of times and therefore doesn't seem worth keeping. In this case COOKIE chunks were not retransmitted anymore when the socket was already closed. MFC after: 1 week	2020-03-25 18:20:37 +00:00
Michael Tuexen	37686ccf08	Improve consistency in debug output. MFC after: 1 week	2020-03-25 18:14:12 +00:00
Michael Tuexen	24187cfe72	Revert https://svnweb.freebsd.org/changeset/base/357829 This introduces a regression reported by koobs@ when running a python test suite on a loaded system. This patch resulted in a failing accept() call, when the association was setup and gracefully shutdown by the peer before accept was called. So the following packetdrill script would fail: +0.0 socket(..., SOCK_STREAM, IPPROTO_SCTP) = 3 +0.0 bind(3, ..., ...) = 0 +0.0 listen(3, 1) = 0 +0.0 < sctp: INIT[flgs=0, tag=1, a_rwnd=15000, os=1, is=1, tsn=1] +0.0 > sctp: INIT_ACK[flgs=0, tag=2, a_rwnd=..., os=..., is=..., tsn=1, ...] +0.1 < sctp: COOKIE_ECHO[flgs=0, len=..., val=...] +0.0 > sctp: COOKIE_ACK[flgs=0] +0.0 < sctp: DATA[flgs=BE, len=116, tsn=1, sid=0, ssn=0, ppid=0] +0.0 > sctp: SACK[flgs=0, cum_tsn=1, a_rwnd=..., gaps=[], dups=[]] +0.0 < sctp: SHUTDOWN[flgs=0, cum_tsn=0] +0.0 > sctp: SHUTDOWN_ACK[flgs=0] +0.0 < sctp: SHUTDOWN_COMPLETE[flgs=0] +0.0 accept(3, ..., ...) = 4 +0.0 close(3) = 0 +0.0 recv(4, ..., 4096, 0) = 100 +0.0 recv(4, ..., 4096, 0) = 0 +0.0 close(4) = 0 Reported by: koobs@	2020-03-25 15:29:01 +00:00
Michael Tuexen	23e3c0880d	Use consistent debug output. MFC after: 1 week	2020-03-25 13:19:41 +00:00
Michael Tuexen	e056fafd92	Don't restore the vnet too early in error cases. MFC after: 1 week	2020-03-25 13:18:37 +00:00
Michael Tuexen	7522682e5e	Only call panic when building with INVARIANTS. MFC after: 1 week	2020-03-24 23:04:07 +00:00
Michael Tuexen	a412576e36	Another cleanup of the timer code. Also be more pedantic about the parameters of the timer start and stop routines. Several inconsistencies have been fixed in earlier commits. Now they will be caught when running an INVARIANTS system. MFC after: 1 week	2020-03-24 22:44:36 +00:00
Michael Tuexen	d084818d9d	Cleanup the file and add two ASSERT variants for locks, which will be used shortly. MFC after: 1 week	2020-03-23 12:17:13 +00:00
Michael Tuexen	a57fb68b92	More timer cleanups, no functional change. MFC after: 1 week	2020-03-21 16:12:19 +00:00
Michael Tuexen	fa8ceba9ca	Remove a set, but unused variable. MFC after: 1 week	2020-03-20 14:49:44 +00:00
Michael Tuexen	2bdebd0ce3	Add a missing NET_EPOCH_ENTER/NET_EPOCH_EXIT pair. This was affecting implicit connection setups via sendmsg(). Reported by: syzbot+febbe3383a0e9b700c1b@syzkaller.appspotmail.com Reported by: syzbot+dca98631455d790223ca@syzkaller.appspotmail.com Reported by: syzbot+5a71a7760d6bcf11b8cd@syzkaller.appspotmail.com Reported by: syzbot+da64217e140444c49f00@syzkaller.appspotmail.com	2020-03-19 23:07:52 +00:00
Michael Tuexen	6fb7b4fbdb	Consistently provide arguments for timer start and stop routines. This is another step in cleaning up timer handling. MFC after: 1 week	2020-03-19 21:01:16 +00:00
Michael Tuexen	e95b3d7faf	Cleanup the stream reset and asconf timer. MFC after: 1 week	2020-03-19 18:55:54 +00:00
Michael Tuexen	42078d5ada	The MTU candidates MUST be a multiple of 4, so make them so. MFC after: 1 week	2020-03-19 14:37:28 +00:00
Michael Tuexen	0554e01d8b	Handle the timers in a consistent sequence according to the definition of the timer type. Just a cleanup, no functional change intended. MFC after: 1 week	2020-03-17 19:20:12 +00:00
Andrew Gallatin	ee7a9e506e	Avoid a cache miss accessing an mbuf ext_pgs pointer when doing SW kTLS. For a Netflix 90Gb/s 100% TLS software kTLS workload, this reduces the CPI of tcp_m_copym() from ~3.5 to ~2.5 as reported by vtune. Reviewed by: jtl, rrs Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D23998	2020-03-16 14:03:27 +00:00
Michael Tuexen	7ca6e2963f	Use KMOD_TCPSTAT_INC instead of TCPSTAT_INC for RACK and BBR, since these are kernel modules. Also add a KMOD_TCPSTAT_ADD and use that instead of TCPSTAT_ADD. Reviewed by: jtl@, rrs@ MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D23904	2020-03-12 15:37:41 +00:00
Andrew Gallatin	98085bae8c	make lacp's use_numa hashing aware of send tags When I did the use_numa support, I missed the fact that there is a separate hash function for send tag nic selection. So when use_numa is enabled, ktls offload does not work properly, as it does not reliably allocate a send tag on the proper egress nic since different egress nics are selected for send-tag allocation and packet transmit. To fix this, this change: - refactors lacp_select_tx_port_by_hash() and lacp_select_tx_port() to make lacp_select_tx_port_by_hash() always called by lacp_select_tx_port() - pre-shifts flowids to convert them to hashes when calling lacp_select_tx_port_by_hash() - adds a numa_domain field to if_snd_tag_alloc_params - plumbs the numa domain into places where we allocate send tags In testing with NIC TLS setup on a NUMA machine, I see thousands of output errors before the change when enabling kern.ipc.tls.ifnet.permitted=1. After the change, I see no errors, and I see the NIC sysctl counters showing active TLS offload sessions. Reviewed by: rrs, hselasky, jhb Sponsored by: Netflix	2020-03-09 13:44:51 +00:00
Hiroki Sato	d726e6331b	Fix an issue of net.inet.igmp.stats handler. The header of (struct igmpstat) could be cleared by sysctl(3). This can be reproduced by "netstat -s -z -p igmp". PR: 244584 MFC after: 1 week	2020-03-07 08:41:10 +00:00
Michael Tuexen	9c04fdfd34	When using automatically generated flow labels and using TCP SYN cookies, use the same flow label for the segments sent during the handshake and after the handshake. This fixes a bug by making sure that sc_flowlabel is always stored in network byte order. Reviewed by: bz@ MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D23957	2020-03-04 16:41:25 +00:00
Bjoern A. Zeeb	d2b8fd0da1	Add new ICMPv6 counters for Anti-DoS limits. Add four new counters for ND6 related Anti-DoS measures. We split these out into a separate upfront commit so that we only change the struct size one time. Implementations using them will follow. PR: 157410 Reviewed by: melifaro MFC after: 2 weeks X-MFC: cannot really MFC this without breaking netstat Sponsored by: Netflix (initially) Differential Revision: https://reviews.freebsd.org/D22711	2020-03-04 16:20:59 +00:00
Michael Tuexen	6605e5791f	Don't send an uninitialised traffic class in the IPv6 header when sending a TCP segment from the TCP SYN cache (like a SYN-ACK). This fix initialises it to zero. This is correct for the ECN bits, but it does not honor the DSCP that an application might have set via the IPPROTO_IPV6 level socket option IPV6_TCLASS. That will be fixed separately. Reviewed by: Richard Scheffenegger MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D23900	2020-03-04 12:22:53 +00:00

6550 commits