The IBTA specification defines a new speed, NDR, which supports a
signaling rate of 100Gb per lane. The mlx5 IB driver translates link modes
reported by the ConnectX device to IB speed and width. Add translation of
the new 100Gb, 200Gb and 400Gb link modes to the NDR IB type and a width
of x1, x2 or x4, respectively.
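The translation described above can be sketched as a small lookup. This is
only an illustration: the enum names, the struct and translate_link_mode()
are hypothetical and are not the identifiers used by the mlx5 driver.

```c
#include <assert.h>

/* Hypothetical link-mode identifiers; the real driver decodes device state. */
enum link_mode {
	LINK_MODE_100G_1X,
	LINK_MODE_200G_2X,
	LINK_MODE_400G_4X,
	LINK_MODE_OTHER
};

/* Hypothetical speed codes, not the kernel's own constants. */
enum ib_speed { SPEED_UNKNOWN, SPEED_NDR };

struct ib_link {
	enum ib_speed speed;
	int width;	/* number of lanes: 1, 2 or 4; 0 if unknown */
};

/* Map a reported link mode to an IB speed and width. */
static struct ib_link
translate_link_mode(enum link_mode mode)
{
	struct ib_link link = { SPEED_NDR, 0 };

	switch (mode) {
	case LINK_MODE_100G_1X:
		link.width = 1;
		break;
	case LINK_MODE_200G_2X:
		link.width = 2;
		break;
	case LINK_MODE_400G_4X:
		link.width = 4;
		break;
	default:
		link.speed = SPEED_UNKNOWN;
		break;
	}
	/* An NDR result must always carry a valid lane count. */
	assert(link.speed != SPEED_NDR || link.width > 0);
	return (link);
}
```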
Linux commits:
f946e45f59ef01ff54ffb3b1eba3a8e7915e7326
MFC after: 1 week
Sponsored by: NVIDIA Networking
A panic has been observed on a system with an Intel X520 dual LAN
device. The panic is caused by a KASSERT() noticing that the amount
of VPD data copied out to the pciconf command does not match the
amount of data read from the device.
The cause of the size mismatch was VPD data that started with 0x82,
the VPD tag that indicates that a VPD ident follows, but with a length
of more than 255 characters, which happens to be the maximum ident
size supported by the API between kernel and the pciconf program.
The data provided did not resemble an actual VPD identifier, and it
can be assumed that the initial tag value 0x82 happens to be there
by accident.
An ident size of 255 far exceeds the sensible length of that data
element, which is in the order of at most 30 to 40 bytes.
This patch adds several consistency checks to the VPD parser, the
most critical being that ident lengths of more than 255 bytes are
rejected. Other checks reject VPD with more than one ident tag or
with an empty (zero length) ident string.
This patch prevents the panic that occurred when "pciconf -lV" was
executed on the affected system.
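A minimal sketch of the ident checks listed above, assuming the standard
PCI large-resource layout (tag byte followed by a 16-bit little-endian
length). The function name vpd_check_ident() and its return convention are
hypothetical and do not mirror the actual kernel parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VPD_TAG_IDENT	0x82	/* tag announcing a VPD ident string */
#define VPD_IDENT_MAX	255	/* largest ident the pciconf API can carry */

/*
 * Validate the ident header at the start of buf.
 * Returns 0 if the ident is acceptable and -1 if it must be rejected.
 * *seen_ident records whether an ident tag was already accepted.
 */
static int
vpd_check_ident(const uint8_t *buf, size_t len, int *seen_ident)
{
	size_t ident_len;

	assert(buf != NULL && seen_ident != NULL);
	if (len < 3 || buf[0] != VPD_TAG_IDENT)
		return (-1);
	if (*seen_ident)
		return (-1);		/* more than one ident tag */
	ident_len = (size_t)buf[1] | ((size_t)buf[2] << 8);
	if (ident_len == 0 || ident_len > VPD_IDENT_MAX)
		return (-1);		/* empty or oversized ident */
	if (ident_len > len - 3)
		return (-1);		/* ident runs past the data read */
	*seen_ident = 1;
	return (0);
}
```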
During the analysis of the issue and the VPD code it has been
found that the VPD parser uses a state machine that accepts tags
in any order and combination. This is a bad match for the actual
VPD data, which has a very simple structure that can be parsed
with a non-recursive direct descent parser (which always knows
exactly which token to expect next).
A much simpler VPD parser that performs many more consistency
checks and rejects invalid VPD has been proposed in review
https://reviews.freebsd.org/D34268.
Reported by: mikej at paymentallianceintl.com (Michael Jung)
Approved by: jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34255
The iicbus devinfo uses uint32_t for storing iic bus address and new method
should comply with this fact.
MFC with: 1bd3e8ba696633ccd7525030d951b58ade167814
Some IIC multifunction devices may have multiple I2C addresses per chip, but
only the primary address is listed in the DT (e.g. MAX776200). In this case,
the sub-devices for the secondary addresses must be created manually with
fixed OFW parameters (node, name, compatibility string, IIC address).
Add a bus method to the ofw_iicbus interface that does this.
MFC after: 4 weeks
* New error_flags that can be used from the error ithread and elsewhere
without a synch_op.
* Stop the adapter immediately in t4_fatal_err but defer most of the
rest of the handling to a task. The task is allowed to sleep, unlike
the ithread. Remove async_event_task as it is no longer needed.
* Dump the devlog, CIMLA, and PCIE_FW exactly once on any fatal error
involving the firmware or the CIM block. While here, dump some
additional info (see dump_cim_regs) for these errors.
* If both reset_on_fatal_err and panic_on_fatal_err are set then attempt
a reset first and do not panic the system if it is successful.
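The interaction of the two tunables in the last item can be sketched as
follows. The enum, the function fatal_err_policy() and the reset callbacks
are hypothetical, and panicking after a failed reset is an assumption drawn
from the wording above rather than something stated explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fatal_outcome { FATAL_STOPPED, FATAL_RECOVERED, FATAL_PANIC };

/* Decide what happens after the adapter has been stopped. */
static enum fatal_outcome
fatal_err_policy(bool reset_on_fatal_err, bool panic_on_fatal_err,
    bool (*try_reset)(void))
{
	if (reset_on_fatal_err) {
		assert(try_reset != NULL);
		/* A successful reset suppresses the panic. */
		if (try_reset())
			return (FATAL_RECOVERED);
	}
	if (panic_on_fatal_err)
		return (FATAL_PANIC);	/* assumption: failed reset falls through */
	return (FATAL_STOPPED);
}

/* Stand-ins for a reset attempt that succeeds or fails. */
static bool reset_ok(void) { return (true); }
static bool reset_fails(void) { return (false); }
```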
MFC after: 1 week
Sponsored by: Chelsio Communications
UEFI provides a protocol for accessing randomness. This is a good way
to gather early entropy, especially when there's no driver for the RNG
on the platform (as is the case on the Marvell Armada8k (MACCHIATObin)
for now).
If the entropy_efi_seed option is enabled in loader.conf (default: YES)
obtain 2048 bytes of entropy from UEFI and pass it to the kernel as a
"module" of name "efi_rng_seed" and type "boot_entropy_platform"; if
present, ingest it into the kernel RNG.
Submitted by: Greg V
Reviewed by: markm, kevans
Approved by: csprng (markm)
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D20780
Create a dedicated free state, in case the taskqueue worker is still pending,
to avoid re-activation of a freed send tag.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Use the send tag refcounting mechanism to refcount the RX- and TX- TLS
send tags. Then it is no longer needed to wait for refcounts to reach
zero when destroying RX- and TX- TLS send tags as a result of pending
data or WQE commands.
This also ensures that when TX-TLS and rate limiting is used at the same
time, the underlying SQ is not prematurely destroyed.
MFC after: 1 week
Sponsored by: NVIDIA Networking
callout *_sbt functions are used to reduce ping/timeout scheduling
overhead, while allowing later improvements in the functionality.
Keep similar 1000ms callouts while adding a 10 ms window, to allow
some kernel scheduling improvements.
Reviewed By: jhb
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D34222
These zones are cache zones used to allocate TLS offload contexts from
firmware. Releasing items from the cache is a sleepable operation due
to the need to await a response from the firmware command freeing the
tag, so items cannot be reclaimed from the zone in non-sleepable
contexts. Since the cache size is bounded by firmware limits, set
UMA_ZONE_UNMANAGED on these zones to prevent reclamation by uma_timeout()
and the low memory handler.
Reviewed by: hselasky, kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34142
The proper way to drop this kind of CQE is advancing rxq tail
without indicating the packet to the upper network layer.
MFC after: 2 weeks
Sponsored by: Microsoft
On architectures with strict alignment requirements (e.g. arm), clang 14
warns about a packed struct which encloses a non-packed union:
In file included from sys/dev/bwi/bwimac.c:79:
sys/dev/bwi/if_bwivar.h:308:7: error: field iv_val within 'struct bwi_fw_iv' is less aligned than 'union (unnamed union at sys/dev/bwi/if_bwivar.h:305:2)' and is usually due to 'struct bwi_fw_iv' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
} iv_val;
^
It appears to help if you also add __packed to the inner union (i.e.
iv_val). No change to the layout is intended.
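A reduced example of the pattern, using a stand-in struct rather than the
real bwi_fw_iv definition. PACKED is a local macro assumed to match what
__packed expands to on GCC and clang; the compile-time checks show that
packing the inner union leaves the layout unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for __packed on GCC and clang. */
#define PACKED	__attribute__((__packed__))

/* Hypothetical struct mirroring the shape described above. */
struct fw_iv {
	uint16_t ofs;
	union {
		uint32_t val32;
		uint16_t val16;
	} PACKED val;		/* inner union is packed as well */
} PACKED;

/* The union starts right after the 16-bit field, with no padding. */
_Static_assert(offsetof(struct fw_iv, val) == 2, "unexpected padding");
_Static_assert(sizeof(struct fw_iv) == 6, "unexpected struct size");

/* Accessor showing normal member access still works on the packed type. */
static uint32_t
fw_iv_val32(const struct fw_iv *iv)
{
	assert(iv != NULL);
	return (iv->val.val32);
}
```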
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34196
cxgbe_refresh_stats takes into account VI_SKIP_STATS but not
VI_INIT_DONE when deciding whether to read the hardware stats. But
before this change VI_SKIP_STATS was set only for VIs with VI_INIT_DONE.
That meant that cxgbe_refresh_stats always accessed the hardware for
uninitialized VIs, and this is a problem if the adapter is suspended or
in the middle of a reset.
Fix this by setting VI_SKIP_STATS on all VIs during suspend. While
here, ignore VI_INIT_DONE in vi_refresh_stats too to be consistent with
cxgbe_refresh_stats.
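The flag handling can be sketched as below. The flag values, struct vi and
both helper functions are hypothetical; only the flag names come from the
description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VI_INIT_DONE	0x1	/* hypothetical bit values */
#define VI_SKIP_STATS	0x2

struct vi {
	unsigned int flags;
};

/* Hardware stats are read only when VI_SKIP_STATS is clear. */
static bool
vi_should_read_hw_stats(const struct vi *vi)
{
	return ((vi->flags & VI_SKIP_STATS) == 0);
}

/* On suspend, mark every VI, initialized or not. */
static void
suspend_all_vis(struct vi *vis, size_t n)
{
	assert(vis != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		vis[i].flags |= VI_SKIP_STATS;
}
```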
MFC after: 1 week
Sponsored by: Chelsio Communications
Backport from Linux 5.17 (drivers/infiniband/hw/mlx5/fs.c)
This fixes creating flow rules from user-space after the
kernel space update based on Linux 5.7-rc1.
Sponsored by: NVIDIA Networking
This was missed in 74d6c131cb where other geom modules were annotated
with MODULE_VERSION. Again, the problem is the same: we can't detect
that geom_md is loaded into the kernel without it.
This was noticed in release builds on the cluster; mdconfig attempts to
load geom_md because it can't detect it in the kernel, but the cluster
config includes md(4) and does not build the kmod. This problem would
have been masked on hosts with the kmod built, as the kmod attempts to
register the g_md module and fails. With this commit, mdconfig will
no longer even try to load it.
Reported by: re (cperciva)
MFC after: 3 days
In the (extremely unlikely) case of vd->vd_height ==
vt_logo_sprite_height the vd_drawrect code would write outside of
frame-buffer memory.
MFC after: 1 week
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D34220
Having a single pool of worker threads adds extra complexity and
overhead. The software backend also uses per-connection kthreads.
Sponsored by: Chelsio Communications
Previously the driver was called to send PDUs to the NIC synchronously
from the icl_conn_pdu_queue_cb callback. However, this performed a
fair bit of work while holding the icl connection lock. Instead,
change the callback to add sent PDUs to a STAILQ and defer dispatching
of PDUs to the NIC to a helper thread similar to the scheme used in
the TCP iSCSI backend.
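The queue-then-dispatch split can be sketched as follows. This uses a
hand-rolled singly-linked tail queue as a stand-in for the STAILQ macros so
it has no BSD header dependency. All names are hypothetical, and locking and
the worker thread itself are omitted.

```c
#include <assert.h>
#include <stddef.h>

struct pdu {
	int id;
	struct pdu *next;
};

struct conn {
	struct pdu *head;	/* first queued PDU, or NULL */
	struct pdu **tailp;	/* where the next PDU gets linked */
};

static void
conn_init(struct conn *c)
{
	c->head = NULL;
	c->tailp = &c->head;
}

/* Callback path: only append, no dispatch while the lock is held. */
static void
conn_queue_pdu(struct conn *c, struct pdu *p)
{
	p->next = NULL;
	*c->tailp = p;
	c->tailp = &p->next;
}

/* Worker path: drain queued PDUs in FIFO order, recording their ids. */
static int
worker_dispatch(struct conn *c, int *sent_ids, int max)
{
	struct pdu *p;
	int n = 0;

	assert(max >= 0);
	while (n < max && (p = c->head) != NULL) {
		c->head = p->next;
		if (c->head == NULL)
			c->tailp = &c->head;
		sent_ids[n++] = p->id;	/* stand-in for sending to the NIC */
	}
	return (n);
}
```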
- Replace rx_flags int and the sole RXF_ACTIVE flag with a simple
rx_active bool.
- Add a pool of transmit worker threads for cxgbei.
- Fix worker thread exit to depend on the wakeup in kthread_exit()
to fix a race with module unload.
Reported by: mav
Sponsored by: Chelsio Communications
These headers originate with the Xen project and shouldn't be mixed with
the main portion of the FreeBSD kernel. Notably they shouldn't be the
target of clean-up commits.
Switch to use the headers in sys/contrib/xen.
Reviewed by: royger
There's no need to explicitly add linear mappings for the grant table
area, as the memory is allocated using xenmem_alloc and it should
already have a linear mapping that can be obtained using
rman_get_virtual.
While there also remove the return value of gnttab_map, since there's
no return value anymore.
Sponsored by: Citrix Systems R&D
Reviewed by: Elliott Mitchell <ehem+freebsd@m5p.com>
Differential revision: https://reviews.freebsd.org/D29602
sbcut() returns mbufs in reverse order, so it is not suitable for reading
data from the socket buffer. Instead, check for already-received data
in the receive worker thread before passing offload PDUs up to the
iSCSI layer. This uses soreceive() to read data from the socket and
is also able to use M_WAITOK since it now runs from a worker thread instead
of an interrupt thread.
Also, fix decoding of the data segment length for pre-offload PDUs.
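For reference, a sketch of decoding the data segment length, assuming the
standard iSCSI basic header segment layout (48 bytes, with
DataSegmentLength stored big-endian in bytes 5 to 7). The function name is
hypothetical and this is not the driver's code; the commit does not state
what the original decoding error was.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISCSI_BHS_SIZE	48	/* size of the basic header segment */

/* Extract the 24-bit big-endian DataSegmentLength from a BHS. */
static uint32_t
bhs_data_segment_len(const uint8_t *bhs, size_t len)
{
	assert(bhs != NULL && len >= ISCSI_BHS_SIZE);
	return (((uint32_t)bhs[5] << 16) | ((uint32_t)bhs[6] << 8) |
	    (uint32_t)bhs[7]);
}
```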
Reported by: Jithesh Arakkan @ Chelsio
Fixes: a8c4147edc cxgbei: Parse all PDUs received prior to enabling offload mode.
Sponsored by: Chelsio Communications
Summary:
This switch is based off of the AR8327/AR8337 external switch/PHY.
However unlike the AR8327/AR8337 it itself doesn't have any PHYs;
instead an external PHY connects to it using the PSGMII port.
Differential Revision: https://reviews.freebsd.org/D34112
Reviewed by: manu
This code is inspired by the ar40xx code in openwrt, which itself
is based on the Qualcomm QCA-SSDK. Both of these sources are, amusingly,
BSD licenced - and thus I have included some of the comments in the
hardware workaround paths to document some of the magic numbers.
This adds support for the IPQ4018/IPQ4019 MDIO bus. This is used to
talk to external PHYs and switches. (There's an internal switch
in the IPQ4018/IPQ4019 as well, but it's accessible via MMIO/AXI.)
Differential Revision: https://reviews.freebsd.org/D34110
Reviewed by: manu
A lot more generic cam related things were done in mmc_sim so this
simplifies the driver a lot.
Differential Revision: https://reviews.freebsd.org/D32154
Reviewed by: imp
There seem to be systems returning some garbage here. I still don't
know why, but at least I hope this check fixes the indefinite printf loop.
MFC after: 2 weeks
74cf7cae4d ("softclock: Use dedicated ithreads for running callouts.")
switched callouts away from the swi infrastructure. It turns out that
this was a major source of entropy in early boot, which we've now lost.
As a result, first boot on hardware without a 'fast' entropy source
would block waiting for fortuna to be seeded with little hope of
progressing without manual intervention.
Let's resolve it by explicitly harvesting entropy in callout_process()
if we've handled any callouts. cc/curthread/now seem to be reasonable
sources of entropy, so use those.
Discussed with: jhb (also proposed initial patch)
Reported by: many
Reviewed by: cem, markm (both csprng)
Differential Revision: https://reviews.freebsd.org/D34150
When FILEMON_SET_FD is used, the filemon handle effectively wraps the
passed file. In particular, the handle may be inherited by a child
process, or transferred over a unix domain socket, so we must verify
that the backing file permits this.
Reported by: syzbot+36e6be9e02735fe66ca8@syzkaller.appspotmail.com
Reviewed by: emaste
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34128
According to Broadcom, mixing 64-bit SGEs with 32-bit chain entries can
lead to IOC Fault code 0x40000d04. This fault code has been observed to
suddenly increase on certain machines when the OCA firmware images are
deployed. The hardware interprets all elements of a 64-bit SGE, even
ones marked as 32-bit. Depending on the other bits, this will just work,
but sometimes generates the above fault. Broadcom recommends against
mixing the two sizes, and the Linux and NetBSD drivers follow that advice.
Rework the chaining code to use MPI2_SGE_CHAIN64 instead of
MPI2_SGE_CHAIN32. Adjust MPS_SGC_SIZE from 8 to 12 to match the size of
the new structure. Flag the structure as being 64-bits now. Since
MPS_SGE64_SIZE and MPS_SGC_SIZE are the same now, mps_push_sge could be
simplified (after the same fashion of mpr). The various cases collapse
to whether or not there's room for the segments; if not, we need a
chain. However, these changes haven't been made yet, as the current code
handles those cases properly with the new defines.
Make chain_busaddr 64 bits wide, even though we ask for all allocations
to be below 4GB for this tag. Use it to set both parts of the CHAIN64
address rather than baking in the 4GB assumption. Add asserts around the
allocation to detect any BUSDMA bugs in allocation.
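Setting both halves of the chain address from a 64-bit bus address can be
sketched as below. The struct chain64 layout is a hypothetical 12-byte
stand-in, not the MPI2_SGE_CHAIN64 definition from the Broadcom headers,
and chain64_set_addr() is an illustrative helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 12-byte chain element with a split 64-bit address. */
struct chain64 {
	uint16_t length;
	uint8_t  next_chain_offset;
	uint8_t  flags;
	uint32_t addr_low;
	uint32_t addr_high;
};

_Static_assert(sizeof(struct chain64) == 12, "chain element must be 12 bytes");

/* Store the full bus address without assuming it is below 4GB. */
static void
chain64_set_addr(struct chain64 *sgc, uint64_t busaddr)
{
	assert(sgc != NULL);
	sgc->addr_low = (uint32_t)(busaddr & 0xffffffffu);
	sgc->addr_high = (uint32_t)(busaddr >> 32);
}
```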
Remove asserts and associated comment in mpi_pre_fw_download and
mpi_pre_fw_upload. The code does not, it seems, depend on this
invariant. The mpr driver has similar code, no asserts and also doesn't
depend on this.
Adjust comments to reflect the updated size.
Sponsored by: Netflix
Reviewed by: scottl, mav
Differential Revision: https://reviews.freebsd.org/D34016
If port resume fails, the USB device has likely been detached. Ignore such
errors, because otherwise the USB stack might keep trying to resume the
device forever before it proceeds to detach it.
MFC after: 1 week
Sponsored by: NVIDIA Networking