If an error occurs while processing a TCP segment with some data and the FIN
flag, the back out of the sequence number advance does not take into account the
increase by 1 due to the FIN flag.
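A minimal sketch of the corrected back out, assuming the sequence number was advanced by the payload length plus one for the FIN. The helper name backout_snd_nxt() and its parameters are hypothetical and are not taken from the actual change:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/*
 * Hypothetical helper: undo the advance of a sequence number after an
 * error. The advance covered `len` payload bytes plus one sequence
 * number if the segment carried FIN, so both must be backed out.
 */
static tcp_seq
backout_snd_nxt(tcp_seq snd_nxt, uint32_t len, bool fin)
{
	if (fin)
		snd_nxt -= 1;		/* FIN consumes one sequence number */
	return (snd_nxt - len);		/* unsigned math wraps modulo 2^32 */
}
```

For example, a segment with 100 bytes of data and FIN that advanced the sequence number from 1000 to 1101 is backed out to 1000, not 1001.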
Reviewed By: jch, gnn, #transport, tuexen
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D2970
Combined changes to allow experimentation with net 0/8 (network 0),
240/4 (Experimental/"Class E"), and part of the loopback net 127/8
(all but 127.0/16). All changes are disabled by default, and can be
enabled by the following sysctls:
net.inet.ip.allow_net0=1
net.inet.ip.allow_net240=1
net.inet.ip.loopback_prefixlen=16
When enabled, the corresponding addresses can be used as normal
unicast IP addresses, both as endpoints and when forwarding.
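The effect of the three knobs can be sketched as an address classifier. This is an illustration only, not the kernel code: struct ip_exp_cfg and is_normal_unicast() are hypothetical names, addresses are in host byte order, and it assumes that 0.0.0.0, 255.255.255.255 and multicast remain special regardless of the settings:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three sysctl knobs. */
struct ip_exp_cfg {
	bool		allow_net0;		/* net.inet.ip.allow_net0 */
	bool		allow_net240;		/* net.inet.ip.allow_net240 */
	unsigned int	loopback_prefixlen;	/* net.inet.ip.loopback_prefixlen */
};

/* Return true if addr (host byte order) may be used as ordinary unicast. */
static bool
is_normal_unicast(uint32_t addr, const struct ip_exp_cfg *cfg)
{
	unsigned int plen = cfg->loopback_prefixlen;
	uint32_t lo_mask;

	if (addr == 0 || addr == 0xffffffffu)		/* assumed always special */
		return (false);
	if ((addr & 0xf0000000u) == 0xe0000000u)	/* 224/4 multicast */
		return (false);
	if ((addr & 0xff000000u) == 0)			/* 0/8 */
		return (cfg->allow_net0);
	if ((addr & 0xf0000000u) == 0xf0000000u)	/* 240/4 */
		return (cfg->allow_net240);

	/* Clamp to 8..32 (an assumption) so the shift below is well defined. */
	if (plen < 8)
		plen = 8;
	if (plen > 32)
		plen = 32;
	lo_mask = 0xffffffffu << (32 - plen);
	if ((addr & lo_mask) == 0x7f000000u)		/* still loopback */
		return (false);
	return (true);
}
```

With a prefix length of 16, 127.1.0.1 is treated as unicast while 127.0.0.1 stays loopback. With the assumed default of 8, all of 127/8 stays loopback.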
Add descriptions of the new sysctls to inet.4.
Add <machine/param.h> to vnet.h, as CACHE_LINE_SIZE is undefined in
various C files when in.h includes vnet.h.
The proposals motivating this experimentation can be found in
https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-0
https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-240
https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-127
Reviewed by: rgrimes, pauamma_gundo.com; previous versions melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D35741
The comment from Robert Watson doubts that this condition ever happens.
Our analysis confirms that. Also, we found that if you manage to create
such a connection with the help of some other bug, then after the "second
case" code is executed, the kernel will panic in another part of the stack.
Reviewed by: rrs, tuexen
Differential revision: https://reviews.freebsd.org/D35714
Some command definitions were forced to use DB_FUNC in order to specify
their required flags, CS_OWN or CS_MORE. Use the new macros to simplify
these.
Reviewed by: markj, jhb
MFC after: 3 days
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35582
o Retire SS_FDREF as it is basically a debug flag on top of already
existing soref()/sorele().
o Convert SS_PROTOREF into soref()/sorele().
o Change reference model for the listen queues, see below.
o Make sofree() private. The correct KPI to use is only sorele().
o Make soabort() respect the model and sorele() instead of sofree().
Note on listening queues. Until now the sockets on a queue had a zero
reference count, and a reference was given only upon accept(2). The
assumption was that there is no way to see the queued socket from anywhere
except its head. This is not true, since queued sockets already have pcbs,
which are linked at least into the global pcb lists. With this change we
take the reference right in sonewconn(), and on the accept(2) path we just
hand the existing reference to the file descriptor.
Differential revision: https://reviews.freebsd.org/D35679
The flag SS_NOFDREF is a private flag of the socket layer. It is also
supposed to be read with SOCK_LOCK() held, which we don't own here.
Reviewed by: rrs, tuexen
Differential revision: https://reviews.freebsd.org/D35663
Unlock the inp when handling TCP_MD5SIG socket options. tcp_ipsec_pcbctl
handles locking the inp when the option is being modified.
This was found by Claudio Jeker while working on the OpenBGPd port.
On 14 we get a panic when trying to call getsockopt; on 13.1 the process
locks up using 100% CPU.
Reviewed by: rscheff (transport), tuexen
MFC after: 3 days
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D35532
The KASSERT criteria need to be checked against the
send buffer so_snd in a subsequent version.
Reviewed By: tuexen, #transport
PR: 263445
MFC after: 1 week
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D35431
Missed another NULL dereference during KASSERTS after traversing
the scoreboard. While at it, scratch the goto by making the
traversal conditional, and remove duplicate checks using an
unconditional loop with all checks inside.
Reviewed By: hselasky
PR: 263445
MFC after: 1 week
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D35428
Add a few KASSERT()s to validate the sanity of SACK holes, and bail out
if a SACK hole is inconsistent, to avoid panicking non-invariant builds.
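A sketch of what such a consistency check could look like. The exact criteria used by the change are not stated here, so the invariant below (start before end, and the retransmit pointer between them) is an assumption, and struct hole and hole_is_sane() are hypothetical names:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Wraparound-safe comparisons on 32-bit sequence numbers. */
#define	SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)
#define	SEQ_LEQ(a, b)	((int32_t)((a) - (b)) <= 0)

/* Simplified stand-in for a SACK scoreboard hole. */
struct hole {
	tcp_seq	start;	/* first missing sequence number */
	tcp_seq	rxmit;	/* next sequence number to retransmit */
	tcp_seq	end;	/* one past the last missing sequence number */
};

/*
 * Assumed invariant: start < end and start <= rxmit <= end.
 * A caller would assert this in invariant builds and bail out
 * gracefully in other builds instead of dereferencing bad state.
 */
static bool
hole_is_sane(const struct hole *h)
{
	if (h == NULL)
		return (false);
	return (SEQ_LT(h->start, h->end) &&
	    SEQ_LEQ(h->start, h->rxmit) &&
	    SEQ_LEQ(h->rxmit, h->end));
}
```

The comparisons go through a signed 32-bit difference, so a hole that spans the sequence number wraparound is still accepted.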
Reviewed By: hselasky, glebius
PR: 263445
MFC after: 1 week
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D35387
By analogy with IP address matching, add a way to use ipfw radix
tables for MAC matching. This is implemented using a new ipfw table
with the mac:radix type. The src-mac and dst-mac lookup commands are
also added.
Usage example:
ipfw table 1 create type mac
ipfw table 1 add 11:22:33:44:55:66/48
ipfw add skipto tablearg src-mac 'table(1)'
ipfw add deny src-mac 'table(1, 100)'
ipfw add deny lookup dst-mac 1
Note: sysctl net.link.ether.ipfw=1 should be set to enable ipfw
filtering on L2.
Reviewed by: melifaro
Obtained from: Yandex LLC
MFC after: 1 month
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D35103
When the TCP sequence number subtracted is greater than 2**32 minus
the window size, or 2**31 minus the window size, the use of unsigned
long as an intermediate variable may result in an incorrect retransmit
length computation on all 64-bit platforms.
While at it create a helper macro to facilitate the computation of
the difference between two TCP sequence numbers.
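The problem can be illustrated in isolation. In this sketch SEQ_DIFF() is a hypothetical name for such a helper, and uint64_t stands in for unsigned long on a 64-bit platform. It only shows why the subtraction must be done in 32 bits and is not the actual kernel code:

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/*
 * Hypothetical helper: subtract in 32 bits so the result wraps modulo
 * 2^32, then interpret the result as a signed distance.
 */
#define	SEQ_DIFF(a, b)	((int32_t)((tcp_seq)(a) - (tcp_seq)(b)))

/*
 * Problematic pattern: widening both operands to 64 bits before the
 * subtraction loses the modulo 2^32 wraparound.
 */
static int64_t
diff_via_wide(tcp_seq a, tcp_seq b)
{
	uint64_t wa = a, wb = b;

	return ((int64_t)(wa - wb));
}
```

With a = 0x00000100 and b = 0xffffff00, which lie 512 bytes apart across the wrap, SEQ_DIFF(a, b) yields 512, while the 64-bit intermediate yields a value that is far out of range.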
Differential Revision: https://reviews.freebsd.org/D35388
Reviewed by: rscheff
MFC after: 3 days
Sponsored by: NVIDIA Networking
Mbufs leak when manually removing incomplete NDP records with pending packets via ndp -d.
It happens because lltable_drop_entry_queue() relies on the `la_numheld`
counter when dropping NDP entries (lles). It turned out the NDP code never
increased `la_numheld`, so the actual free never happened.
Fix the issue by introducing unified lltable_append_entry_queue(),
common for both ARP and NDP code, properly addressing packet queue
maintenance.
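A toy model of the invariant, assuming the drop routine trusts the counter: if every enqueue goes through one append helper that also bumps the counter, the drop loop sees all pending packets. The structures and function names below are hypothetical simplifications, not the lltable code:

```c
#include <stddef.h>

/* Simplified stand-ins for a queued packet and a link-layer entry. */
struct pkt {
	struct pkt	*next;
};

struct entry {
	struct pkt	*hold;		/* queue of pending packets */
	int		 numheld;	/* number of packets on the queue */
};

/* Single append path: link the packet at the tail and count it. */
static void
entry_append_queue(struct entry *e, struct pkt *p)
{
	struct pkt **pp = &e->hold;

	while (*pp != NULL)
		pp = &(*pp)->next;
	p->next = NULL;
	*pp = p;
	e->numheld++;
}

/*
 * Drop routine that relies on the counter. If an enqueue path forgot
 * to bump numheld, this loop would exit early and leak packets.
 */
static int
entry_drop_queue(struct entry *e)
{
	int dropped = 0;

	while (e->numheld > 0 && e->hold != NULL) {
		struct pkt *p = e->hold;

		e->hold = p->next;
		p->next = NULL;		/* real code would free the mbuf here */
		e->numheld--;
		dropped++;
	}
	return (dropped);
}
```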
Reviewed By: melifaro
Differential Revision: https://reviews.freebsd.org/D35365
MFC after: 2 weeks
In order to decrease ifdef INET/INET6s in the lltable implementation,
introduce the llt_post_resolved callback and implement protocol-dependent
code in the protocol-dependent part.
Reviewed By: melifaro
Differential Revision: https://reviews.freebsd.org/D35322
MFC after: 2 weeks
Accept send() calls only when the association is not being
shut down or the explicit message EOR mode is used and the
application provides follow-up data.
Reported by: syzbot+341e9ebd9d24ca7dc62a@syzkaller.appspotmail.com
MFC after: 3 days
Provide a sticky ARP flag for a network interface, which marks it as
"sticky", similar to what we have for bridges. Once an interface is
marked sticky, any address resolved using ARP will be saved as a
static one in the ARP table. Such functionality may be used to prevent
ARP spoofing or to decrease latencies in Ethernet networks.
The drawbacks include potential limitations in usage of ARP-based
load-balancers and high-availability solutions such as carp(4).
The implemented option is disabled by default, therefore should not
impact the default behaviour of the networking stack.
Sponsored by: Conclusive Engineering sp. z o.o.
Reviewed By: melifaro, pauamma_gundo.com
Differential Revision: https://reviews.freebsd.org/D35314
MFC after: 2 weeks
Ensure that a HB can be sent faster than a HB.Interval when performing
path verification of a reachable peer address.
Thanks to Alexander Funke for finding the issue and proposing a fix.
MFC after: 3 days
When sending path confirmation heartbeats, do not take HB.interval
into account when the path is still reachable.
Thanks to Alexander Funke for finding the issue and suggesting a fix.
MFC after: 3 days
If the interface does not support debugnet(4) we should bail early,
rather than having the user find this out at the time of the panic.
dumpon(8) already expects this return value and will print a helpful
error message.
Reviewed by: cem, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D35180
The physical address argument is essentially ignored by every dumper
method. In addition, the dump routines don't actually pass a real
address; every call to dump_append() passes a value of zero for
physical.
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35173
Since c67f3b8b78 the sockbuf mutexes belong to the containing socket,
and socket buffers just point to it. In 74a68313b5 macros that access
this mutex directly were added. Go over the core socket code and
eliminate code that reaches the mutex by dereferencing the sockbuf
compatibility pointer.
This change requires a KPI change, as some functions were given the
sockbuf pointer only without any hint if it is a receive or send buffer.
This change doesn't cover the whole kernel; many protocols still use
compatibility pointers internally. However, it allows operation of a
protocol that doesn't use them.
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35152
RACK was converted to microseconds quite some time ago, but in testing
we have found a spot that was missed. The idle reduce time is still based
on ticks, so it must be converted to microseconds before any comparison;
otherwise idle reduce will likely not happen.
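The unit mismatch can be sketched as follows. The helper names and the explicit hz parameter are assumptions made for illustration and do not come from the RACK code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a tick count to microseconds for a clock of hz ticks/second. */
static uint64_t
ticks_to_usec(uint32_t ticks, uint32_t hz)
{
	return (((uint64_t)ticks * 1000000u) / hz);
}

/*
 * Compare an idle time measured in microseconds against a threshold
 * that is configured in ticks. The threshold is converted first so
 * that both sides of the comparison use the same unit.
 */
static bool
idle_exceeds(uint64_t idle_usec, uint32_t thresh_ticks, uint32_t hz)
{
	return (idle_usec >= ticks_to_usec(thresh_ticks, hz));
}
```

With hz = 1000, a threshold of 1000 ticks corresponds to 1000000 microseconds, so an idle time of 500000 microseconds does not exceed it.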
Reviewed by: tuexen, thj
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D35066