Similar to dcfddc8dc091e7688abc8488a0307eba425fa7a2, replace the
simpler, inlined version with the full version.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D41690
(cherry picked from commit 897e564361624411c4e557e0817642e1477f0af4)
In particular, the kernel RPC layer used by the NFS client never
invokes pru_rcvd since it always reads data from the socket upcall
via MSG_SOCALLBCK which avoids calling pru_rcvd. As a result, on an
NFS client connection managed by t4_tom, RX credits were never
returned to the TOE connection to open the TCP window resulting in
connection hangs.
To fix, expand the set of conditions in do_rx_data where RX credits
are returned to match those in t4_rcvd_locked by calling the function
directly.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D41688
(cherry picked from commit dcfddc8dc091e7688abc8488a0307eba425fa7a2)
1. The tracing ifnet's name must match the nexus name. One way to do
this is to not use a unit number.
2. Do not hold the tracer lock around ether_ifattach or ether_ifdetach.
netlink calls back into the driver (see stack below) and that leads
to illegal lock recursion.
tracer_ioctl
if_ioctl
get_operstate_ether
get_operstate
dump_iface
rtnl_handle_ifevent
rtnl_handle_ifattach
if_attach_internal
if_attach
ether_ifattach
t4_cloner_create
MFC after: 3 days
Sponsored by: Chelsio Communications
(cherry picked from commit e203cb393fe0b963dd585d0576dcc6a47a28c386)
netlink(4) calls back into the driver during detach and it attempts to
start an internal synchronized op recursively, causing an interruptible
hang. Fix it by failing the ioctl if the VI has been marked as DOOMED
by cxgbe_detach.
Here's the stack for the hang for reference.
#6 begin_synchronized_op
#7 cxgbe_media_status
#8 ifmedia_ioctl
#9 cxgbe_ioctl
#10 if_ioctl
#11 get_operstate_ether
#12 get_operstate
#13 dump_iface
#14 rtnl_handle_ifevent
#15 rtnl_handle_ifnet_event
#16 rt_ifmsg
#17 if_unroute
#18 if_down
#19 if_detach_internal
#20 if_detach
#21 ether_ifdetach
#22 cxgbe_vi_detach
#23 cxgbe_detach
#24 DEVICE_DETACH
MFC after: 3 days
Sponsored by: Chelsio Communications
(cherry picked from commit 3814249f1e8dacfcd326dd7c416c528a1d88b6a1)
This change adds struct tcp_info fields corresponding to the following
struct tcpcb ones:
- snd_una
- snd_max
- rcv_numsacks
- rcv_adv
- dupacks
Note that while both tcp_fill_info() and fill_tcp_info_from_tcb() are
extended accordingly, no counterpart of rcv_numsacks is available in
the cxgbe(4) TOE PCB, though.
Sponsored by: NetApp, Inc. (originally)
This function actually only ever reads from the TCP PCB. Consequently,
also make the pointer to its TCP PCB parameter const.
Sponsored by: NetApp, Inc. (originally)
This is the list of changes since last release, taken from the release
notes of Chelsio Unified Wire 3.18.0.1.
Version : 1.27.4.0
Date : 07/05/2023
=======================================
Fixes
-----
BASE:
- Handle 40G to 100G cable change.
- Avoid unnecessary i2c read.
=======================================
Obtained from: Chelsio Communications
Sponsored by: Chelsio Communications
MFC after: 1 week
- Add new DB_DEFINE_TABLE and DB_DECLARE_TABLE macros to define new
command tables. DB_DECLARE_TABLE is intended for use in headers
similar to MALLOC_DECLARE and SYSCTL_DECL.
DB_DEFINE_TABLE takes three arguments, the name of the parent table,
the command name, and the name of the table itself, e.g.
DB_DEFINE_TABLE(show, foo, show_foo) defines a new "show foo" table.
- DB_TABLE_COMMAND, DB_TABLE_COMMAND_FLAGS, DB_TABLE_ALIAS, and
DB_ALIAS_FLAGS allow new commands and aliases to be defined. These
are similar to the existing DB_COMMAND, etc. except that they take
an initial argument giving the name of the parent table, e.g.:
DB_TABLE_COMMAND(show_foo, bar, db_show_foo_bar)
defines a new "show foo bar" command.
This provides a cleaner interface than the ad-hoc use of internal
macros like _DB_SET that was required previously (e.g. in cxgbe(4)).
This retires DB_FUNC macro as well as the internal _DB_FUNC macro.
Reviewed by: melifaro, kib, markj
Differential Revision: https://reviews.freebsd.org/D40819
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
These are the changes since the last update (copy-pasted from the
release notes for Chelsio Unified Wire v3.18.0.0):
====================
Version : 1.27.3.0
Date : 04/07/2023
Fixes
-----
BASE:
- Fixed a hang if module eeprom reads gives invalid data.
- KR backlplane no-fec link problem fixed.
OFLD:
- iscsi ddp errors fixed.
- iwarp connection abort in rare cases causing NIC traffic hang fixed.
ENHANCEMENTS
------------
BASE:
- Cisco GLC-TE 1G modules support added.
====================
Version : 1.27.1.0
Date : 12/02/2022
Fixes
-----
BASE:
- memwrite dsgl cannot be used for T5.
OFLD:
- Enabled FCoE in SO adapters.
- TOE-TLS crash fixed.
- iscsi hang fixed.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
t4_dump_stag to dump hw state for a known STAG.
t4_dump_all_stag to dump hw state for all valid STAGs. This routine
walks the entire STAG region looking for valid entries and this can take
a while for some configurations.
MFC after: 1 week
Sponsored by: Chelsio Communications
Each physical port has an associated loopback tx channel and anything
transmitted over that channel by the driver is looped back internally by
the hardware as if received on that physical port. This change allows
tracing filters to be installed in this loopback path.
MFC after: 1 week
Sponsored by: Chelsio Communications
Previously private to t4_sge.c, this allows other parts of the driver
(such as NIC TLS) to use these helpers directly.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D38578
If parse_pkt returns EINPROGRESS, return from cxgbe_transmit
without queueing the packet in a txq. Use this to move the call
to ethofld_transmit for packet pacing into parse_pkt.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D38577
The 0b70e3e78b changed the original design of a single entry point
into pfil(9) chains providing separate functions for the filtering
points that always provide mbufs and know the direction of a flow.
The motivation was to reduce branching. The logical continuation
would be to do the same for the filtering points that always provide
a memory pointer and retire the single entry point.
o Hooks now provide two functions: one for mbufs and optional for
memory pointers.
o pfil_hook_args() has a new member and pfil_add_hook() has a
requirement to zero out uninitialized data. Bump PFIL_VERSION.
o As it was before, a hook function for a memory pointer may realloc
into an mbuf. Such mbuf would be returned via a pointer that must
be provided in argument.
o The only hook that supports memory pointers is ipfw:default-link.
It is rewritten to provide two functions.
o All remaining uses of pfil_run_hooks() are converted to
pfil_mem_in().
o Transparent union of pfil_packet_t and tricks to fix pointer
alignment are retired. Internal pfil_realloc() reduces down to
m_devget() and thus is retired, too.
Reviewed by: mjg, ocochard
Differential revision: https://reviews.freebsd.org/D37977
After c034143269, if_cxgbe.ko fails to load if crypto is not compiled
in kernel, e.g., MINIMAL.
link_elf_obj: symbol hmac_init_ipad undefined
linker_load_file: /boot/kernel/if_cxgbe.ko - unsupported file type
kldload: an error occurred while loading module if_cxgbe. Please check dmesg(8) for more details.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D38482
Summary:
Keep TOEDEV() macro for backwards compatibility, and add a SETTOEDEV()
macro to complement with the new accessors.
Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D38199
Prior to Conrad's changes to replace session integer IDs with a
pointer to the driver-specific state in commit 1b0909d51a, the
driver had to find the softc pointer from the adapter before it could
locate the ccr_session structure for a completed request. Since
Conrad's changes, the ccr_session pointer can now be obtained directly
from the crp. Add a backpoint from ccr_session back to ccr_softc and
use this in place of the ccr_softc member in cxgbe's struct adapter.
Sponsored by: Chelsio Communications
For the TCP protocol inpcb storage specify allocation size that would
provide space to most of the data a TCP connection needs, embedding
into struct tcpcb several structures, that previously were allocated
separately.
The most import one is the inpcb itself. With embedding we can provide
strong guarantee that with a valid TCP inpcb the tcpcb is always valid
and vice versa. Also we reduce number of allocs/frees per connection.
The embedded inpcb is placed in the beginning of the struct tcpcb,
since in_pcballoc() requires that. However, later we may want to move
it around for cache line efficiency, and this can be done with a little
effort. The new intotcpcb() macro is ready for such move.
The congestion algorithm data, the TCP timers and osd(9) data are
also embedded into tcpcb, and temprorary struct tcpcb_mem goes away.
There was no extra allocation here, but we went through extra pointer
every time we accessed this data.
One interesting side effect is that now TCP data is allocated from
SMR-protected zone. Potentially this allows the TCP stacks or other
TCP related modules to utilize that for their own synchronization.
Large part of the change was done with sed script:
s/tp->ccv->/tp->t_ccv./g
s/tp->ccv/\&tp->t_ccv/g
s/tp->cc_algo/tp->t_cc/g
s/tp->t_timers->tt_/tp->tt_/g
s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g
Dependency side effect is that code that needs to know struct tcpcb
should also know struct inpcb, that added several <netinet/in_pcb.h>.
Differential revision: https://reviews.freebsd.org/D37127
Rather than requiring a socket to be created as a TLS socket from the
get go, switch a TOE socket from "plain" TOE to TLS mode when a
receive key is added to the socket.
The firmware is only able to switch a "plain" TOE connection to TLS
mode if the head of the pending socket data is the start of a TLS
record, so the connection is migrated to TLS mode as a multi-step
process.
When TOE TLS RX is enabled, the associated connection's receive side
is frozen via a flag in the TCB. The state of the socket buffer is
then examined to determine if the pending data in the socket buffer
ends on a TLS record boundary. If so, the connection is migrated to
TLS mode and unfrozen. Otherwise, the connection is unfrozen
temporarily until more data arrives. Once more data arrives, the
receive queue is frozen again and rechecked. This continues until the
connection is paused at a record boundary. Any records received
before TLS mode is enabled are decrypted as software records.
Note that this removes the 'rx_tls_ports' sysctl. TOE TLS offload for
receive is now enabled automatically on existing TOE connections when
using a KTLS-aware SSL library just as it was previously enabled
automatically for TLS transmit. This also enables TLS offload for TOE
connections which enable TLS after passing initial data in the clear
(e.g. STARTTLS with SMTP).
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D37351
The previous commit lost an implicit struct socket * cast. Use an
inline function instead as the macro is already rather long.
Fixes: e1401f7579 cxgbe: use standard sototcpcb() accessor macro to get socket's tcpcb
Sponsored by: Chelsio Communications
Mechanically cleanup INP_TIMEWAIT from the kernel sources. After
0d7445193a, this commit shall not cause any functional changes.
Note: this flag was very often checked together with INP_DROPPED.
If we modify in_pcblookup*() not to return INP_DROPPED pcbs, we
will be able to remove most of this checks and turn them to
assertions. Some of them can be turned into assertions right now,
but that should be carefully done on a case by case basis.
Differential revision: https://reviews.freebsd.org/D36400
This was added in 7cba15b16e in 2016 and firmwares at that time were
already setting up the iSCSI tag mask properly. Since then it has also
become possible to split the iSCSI region between multiple PCIE PFs but
the driver's calculation takes only its own PF's allocation into account
and that means this code is incorrect and not just a harmless no-op.
MFC after: 1 week
Sponsored by: Chelsio Communications
This is mostly cosmetic, but it also doesn't leave a gap of time where
no structures are valid. Instead, we permit the ISR to continue to
use the previous structure if the write to update cal_current isn't
yet visible.
Reviewed by: gallatin
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D36669
This uses fixed-point math already used elsewhere in the kernel for
sub-second time values. To avoid overflows this does require updating
the calibration once a second rather than once every 30 seconds. Note
that the cxgbe driver already queries multiple registers once a second
for the statistics timers. This version also uses fewer instructions
with no branches (for the math portion) in the per-packet fast path.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D36663
Coverity flagged this in its latest run but it is not a problem in
practice as the card's core clock would have to be > 4.2GHz for any
overflow to occur.
CID 1498303: Integer handling issues (OVERFLOW_BEFORE_WIDEN)
Potentially overflowing expression "sc->params.vpd.cclk * 1000U" with type "unsigned int" (32 bits, unsigned) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type "uint64_t" (64 bits, unsigned).
Reported by: Coverity Scan (CID 1498303)
Sponsored by: Chelsio Communications