Coherently read the phase bit of the status completion record. We loop
over the completion record array, looking for all the transactions in
the same phase that have been completed. In doing that, we have to be
careful to read the status field first, and if it indicates a complete
record, we need to read and process that record. Otherwise, the host
might be overtaken by device when reading this completion record,
leading to a mistaken belief that the record is in phase. This leads to
the code using old values and looking at an already completed entry, which
has no current tracker.
To work around this problem, we read the status and make sure it is in
phase, we then re-read the entire completion record guaranteeing it's
complete, valid, and consistent . In addition we resync the dmatag to
reflect changes since the prior loop for the bouncing dma case.
Reviewed by: jrtc27@, chuck@
Found by: jrtc27 (this fix is based in part on her D30995 fix)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31002
(cherry picked from commit aa0ab681ae)
Remove __packed from nvme_command, nvme_completion and
nvme_dsm_trim. Add super-alignment to nvme_completion since it's always
at least that aligned in hardware (and in our existing uses of it
embedded in structures). It generates better code in
nvme_qpair_process_completions on riscv64 because otherwise the ABI
assumes a 4-byte alignment, and the same on all other platforms.
Reviewed by: jrtc27@, mav@, chuck@
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31001
(cherry picked from commit fea3cf1d6d)
Put the { on the same line as the struct nvme_foo when we define these
structures. It's FreeBSD standard and these were inconsistent.
Sponsored by: Netflix
(cherry picked from commit 80a75155e1)
Part of the nvme recovery process for errors is to reset the
card. Sometimes, this results in failing the entire controller. When nda
is in use, we free the sim, which will sleep until all the I/O has
completed. However, with only one thread, the request fail task never
runs once the reset thread sleeps here. Create two threads to allow I/O
to fail until it's all processed and the reset task can proceed.
This is a temporary kludge until I can work out questions that arose
during the review, not least is what was the race that queueing to a
failure task solved. The original commit is vague and other error paths
in the same context do a direct failure. I'll investigate that more
completely before committing changing that to a direct failure. mav@
raised this issue during the review, but didn't otherwise object.
Multiple threads, though, solve the problem in the mean time until other
such means can be perfected.
Reviewed by: jhb@
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30366
(cherry picked from commit f0f4712165)
nvme drives are configured early in boot. However, a number of the configuration
steps takes which take a while, so we defer those to a config intrhook that runs
before the root filesystem is mounted. At the same time, the PCI hot plug wakes
up and tests the status of the card. It may decide that the card has gone away
and deletes the child. As part of that process nvme_detach is called. If this
call happens after the config_intrhook starts to run, but before it is finished,
there's a race where we can tear down the device's soft state while the
config_intrhook is still using it. Use the new config_intrhook_drain to
disestablish the hook. Either it will be removed w/o running, or the routine
will wait for it to finish. This closes the race and allows safe hotplug at any
time, even very early in boot.
Sponsored by: Netflix, Inc
Reviewed by: jhb, mav
Differential Revision: https://reviews.freebsd.org/D29006
(cherry picked from commit 8423f5d4c1)
to only match the first vendor specific interface, if any.
PR: 253374
Sponsored by: Mellanox Technologies // NVIDIA Networking
(cherry picked from commit dab84426a6)
The fallback for __align_up() used by roundup2() uses __typeof__()
which doesn't work for bitfields. This fixes the build on GCC which
uses the fallback.
Reviewed by: arichardson, markj
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D28599
(cherry picked from commit 50a61f8db5)
DRIVER_OK status is set after device_attach() succeeds. For now postpone
disk_create to attach_completed() method.
Sponsored by: The FreeBSD Foundation
Reviewed by: grehan
Approved by: lwhsu (mentor)
Differential Revision: https://reviews.freebsd.org/D30049
(cherry picked from commit 4e1e1d667f)
The vmbus ISR needs to live in a trampoline. Dynamically allocating a
trampoline at driver initialization time poses some difficulties due to
the fact that the KENTER macro assumes that the offset relative to
tramp_idleptd is fixed at static link time. Another problem is that
native_lapic_ipi_alloc() uses setidt(), which assumes a fixed trampoline
offset.
Rather than fight this, move the Hyper-V ISR to i386/exception.s. Add a
new HYPERV kernel option to make this optional, and configure it by
default on i386. This is sufficient to make use of vmbus(4) after the
4/4 split. Note that vmbus cannot be loaded dynamically and both the
HYPERV option and device must be configured together. I think this is
not too onerous a requirement, since vmbus(4) was previously
non-functional.
Reported by: Harry Schmalzbauer <freebsd@omnilan.de>
Tested by: Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by: whu, kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 97993d1ebf)
This makes it easier to change the socket locking protocols. No
functional change intended.
Sponsored by: The FreeBSD Foundation
(cherry picked from commit a100217489)
Some code was using it already, but in many places we were testing
SO_ACCEPTCONN directly. As a small step towards fixing some bugs
involving synchronization with listen(2), make the kernel consistently
use SOLISTENING(). No functional change intended.
Sponsored by: The FreeBSD Foundation
(cherry picked from commit f4bb1869dd)
During boot we warn that the kbd and openfirm drivers are Giant-locked
and may be deleted. Generally, the warning helps signal that certain
old drivers are not being maintained and are subject to removal, but
this doesn't really apply to certain drivers which are harder to
detangle from Giant.
Add a flag, D_GIANTOK, that devices can specify to suppress the
misleading warning. Use it in the kbd and openfirm drivers.
Reviewed by: imp, jhb
Sponsored by: The FreeBSD Foundation
(cherry picked from commit fbeb4ccac9)
Reading EEPROM from Intel 4965AGN M2 takes 60 us which was causing panic
on system startup.
PR: 255465
Reviewed by: markj
(cherry picked from commit 03d4b58fee)
This fixes lose of evdev events after moused has been killed.
While here use bitwise operations for UMS_EVDEV_OPENED flag.
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D30342
(cherry picked from commit 05ab03a317)
This fixes inability to start USB xfers in a case when FIFO has been
already open()-ed but no read() or poll() calls has been issued yet.
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D30343
(cherry picked from commit 47791339f0)
The second set of USB transfer is requested by hkbd(4) and
should improve HID keyboard handling in kdb and panic contexts.
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D30486
(cherry picked from commit 9aa0e5af75)
Which happens when USB transfer setup is failed.
PR: 254974
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D30485
(cherry picked from commit e889a462d8)
The generated C output for aicasm_scan.l defines yylineno already, so
references to it from other files should use an extern declaration.
The STAILQ_HEAD use in aicasm_symbol.h also provided an identifier,
causing it to both define the struct type and define a variable of that
struct type, causing any C file including the header to define the same
variable. This variable is not used (and confusingly clashes with a
field name just below) and was likely caused by confusion when switching
between defining fields using similar type macros and defining the type
itself.
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30525
(cherry picked from commit 5e912f5fec)
It includes:
1)Newly added TMF feature.
2)Added newly Huawei & Inspur PCI ID's
3)Fixed smartpqi driver hangs in Z-Pool while running on FreeBSD12.1
4)Fixed flooding dmesg in kernel while the controller is offline during in ioctls.
5)Avoided unnecessary host memory allocation for rcb sg buffers.
6)Fixed race conditions while accessing internal rcb structure.
7)Fixed where Logical volumes exposing two different names to the OS it's due to the system memory is overwritten with DMA stale data.
8)Fixed dynamically unloading a smartpqi driver.
9)Added device_shutdown callback instead of deprecated shutdown_final kernel event in smartpqi driver.
10)Fixed where Os is crashed during physical drive hot removal during heavy IO.
11)Fixed OS crash during controller lockup/offline during heavy IO.
12)Fixed coverity issues in smartpqi driver
13)Fixed system crash while creating and deleting logical volume in a continuous loop.
14)Fixed where the volume size is not exposing to OS when it expands.
15)Added HC3 pci id's.
Reviewed by: Scott Benesh (microsemi), Murthy Bhat (microsemi), imp
Differential Revision: https://reviews.freebsd.org/D30182
(cherry picked from commit 9fac68fc38)
m_pullup() frees the input mbuf chain upon a failure. Set *mpp to NULL
in this case to ensure that the caller does not free the chain again.
PR: 255864
Submitted by: Lv Yunlong <lylgood@foxmail.com> (original version)
MFC after: 1 week
(cherry picked from commit 71776d6719)
Otherwise the resouce buffer may have been freed when
AcpiSetCurrentResources() is called, leading to a use-after-free.
PR: 255862
Submitted by: Lv Yunlong <lylgood@foxmail.com> (original version)
MFC after: 1 week
(cherry picked from commit 4cf3327528)
behind USB HUBs are detected and the USB reset counter logic will kick in
preventing enumeration of continuously failing ports.
Submitted by: phk@
Tested by: bz@
PR: 237666
Sponsored by: Mellanox Technologies // NVIDIA Networking
(cherry picked from commit e5ff940a81)
0 milliseconds and 2 seconds inclusivly. Some style fixes while at it.
The USB specification has minimum values and maximum values,
and not only minimum values.
Sponsored by: Mellanox Technologies // NVIDIA Networking
(cherry picked from commit 00e501d720)
says it should be max 10 milliseconds.
This may fix some USB enumeration issues:
> usbd_req_re_enumerate: addr=3, set address failed! (USB_ERR_IOERROR, ignored)
> usbd_setup_device_desc: getting device descriptor at addr 3 failed,
Found by: Zhichao1.Li@dell.com
Sponsored by: Mellanox Technologies // NVIDIA Networking
(cherry picked from commit 70ffaaa69c)
This is just clerical work to ease bug triage and may be used to set
expectations around the ability for anyone in the community to perform
testing and development on older parts.
Approved by: erj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D29876
(cherry picked from commit fdbcd35a75)
* Fix 82574 Link Status Changes, carrying the OTHER mask bit around as
needed.
* Move igb-class LSC re-arming out of FAST back into the handler.
* Clarify spurious/other interrupt re-arms in FAST.
In MSI-X mode, 82574 and igb-class devices use an interrupt filter to
handle Link Status Changes. We want to do LSC re-arms in the handler
to take advantage of autoclear (EIAC) single shot behavior.
82574 uses 'Other' in ICR and IMS for LSC interrupt types when in MSI-X
mode, so we need to set and re-arm the 'Other' bit during attach and
after ICR reads in the FAST handler if not an LSC or after handling on
LSC due to autoclearing.
This work was primarily done to address the referenced PR, but inspired
some clarification and improvement for igb-class devices once the
intentions of previous bug fix attempts became clearer.
PR: 211219
Reported by: Alexey <aserp3@gmail.com>
Tested by: kbowling (I210 lagg), markj (I210)
Approved by: markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D29943
(cherry picked from commit eea55de7b1)
This is just clerical work to ease bug triage and may be used to set
expectations around the ability for anyone in the community to perform
testing and development on older parts (this driver covers over 20 years
of silicon)
Reviewed by: erj
Approved by: markj
Sponsored by: Pink Floyd - Any Colour You Like (in kind)
Differential Revision: https://reviews.freebsd.org/D29872
(cherry picked from commit 0f6bea61ed)
There are a number of issues in the e1000 multicast filter handling
that have been present for a long time. Take the updated approach from
ixgbe(4) which does not have the issues.
The issues are outlined in the PR, in particular this solves crossing
over and under the hardware's filter limit, not programming the
hardware filter when we are above its limit, disabling SBP (show bad
packets) when the tunable is enabled and exiting promiscuous mode, and
an off-by-one error in the em_copy_maddr function.
PR: 140647
Reported by: jtl
Reviewed by: markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D29789
(cherry picked from commit 4b38eed76d)
We don't need to set the bits here since the if/else if/else statements
fully cover setting these bit pairs.
Reported by: markj
Reviewed by: markj, erj
Approved by: #intel_networking
MFC aftter: 1 week
Differential Revision: https://reviews.freebsd.org/D29827
(cherry picked from commit deecaa1445)
Allow new enclosure to replace previously existing one if there is
no completely unused table entry, same as it is done for devices.
If we can not process DPM due to corruption -- wipe it and restart
from scratch. Otherwise I don't see a way to recover persistence if
something go wrong and there is no BIOS to recover it for us.
Together this solves a problem that appeared when 9300-8i firmware
update to 16.00.10.00 somehow switched its mapping mode from Device
Persistence to Enclosure/Slot without wiping the DPM table. It made
HBA completely unusable, since overflowed and conflicting mapping
table was unable to map any of enclosures and so devices.
Also while there make some enclosure mapping errors more informative.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
(cherry picked from commit b99419aee4)
I saw a situation where the driver set CAM_AUTOSNS_VALID on a failed ccb
even though SRB_STATUS_AUTOSENSE_VALID was not set in the status.
The actual sense data remained all zeros.
The problem seems to be that create_storvsc_request() always sets
hv_storvsc_request::sense_info_len, so checking for sense_info_len != 0
is not enough to determine if any auto-sense data is actually available.
Sponsored by: CyberSecure
(cherry picked from commit 8afecefd57)
Attaching and detaching devices can be heavy-weight and detaching can
sleep waiting for events. For that reason using the system-wide
single-threaded taskqueue_thread is not really appropriate.
There is even a possibility for a deadlock if taskqueue_thread is used
for detaching.
In fact, there is an easy to reproduce deadlock involving nvme, pass
and a sudden removal of an NVMe device.
A pass peripheral would not release a reference on an nvme sim until
pass_shutdown_kqueue() is executed via taskqueue_thread. But the
taskqueue's thread is blocked in nvme_detach() -> ... -> cam_sim_free()
because of the outstanding reference.
Sponsored by: CyberSecure
Reviewed by: mav, imp
(cherry picked from commit 12588ce02d)
The boundary differentiating "lem" vs "em" class devices was wrong
after the iflib conversion of lem(4).
The Packet Buffer size for 82547 class chips was not set correctly
after the iflib conversion of lem(4).
These changes restore functionality on an 82547 for the submitter.
PR: 236119
Reported by: Jeff Gibbons <jgibbons@protogate.com>
Reviewed by: markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D29766
(cherry picked from commit bb1b375fa7)
This is a debugging tunable that shouldn't have retained this setting
after the initial iflib conversion of the driver
PR: 248934
Reported by: Franco Fichtner <franco@opnsense.org>
Reviewed by: markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D29768
(cherry picked from commit 548d8a131d)
Several protocol methods take a sockaddr as input. In some cases the
sockaddr lengths were not being validated, or were validated after some
out-of-bounds accesses could occur. Add requisite checking to various
protocol entry points, and convert some existing checks to assertions
where appropriate.
Reported by: syzkaller+KASAN
Reviewed by: tuexen, melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29519
(cherry picked from commit f161d294b9)
AIM (adaptive interrupt moderation) was part of BSD11 driver. Upon IFLIB
migration, AIM feature got lost. Re-introducing AIM back into IFLIB
based IXGBE driver.
One caveat is that in BSD11 driver, a queue comprises both Rx and Tx
ring. Starting from BSD12, Rx and Tx have their own queues and rings.
Also, IRQ is now only configured for Rx side. So, when AIM is
re-enabled, we should now consider only Rx stats for configuring EITR
register in contrast to BSD11 where Rx and Tx stats were considered to
manipulate EITR register.
Reviewed by: gallatin, markj
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D27344
(cherry picked from commit 64881da478)
The _ext event notification includes the address being added/removed and
that gives the driver an easy way to ignore non-IPv6 addresses. Remove
'tom' from the handler's name while here, it was moved out of t4_tom a
long time ago.
Sponsored by: Chelsio Communications
(cherry picked from commit f4ba035bca)
There is no need to panic in if_transmit if the checksums requested are
inconsistent with the frame being transmitted. This typically indicates
that the kernel and driver were built with different INET/INET6 options,
or there is some other kernel bug. The driver should just throw away
the requests that it doesn't understand and move on.
Sponsored by: Chelsio Communications
(cherry picked from commit b9820bca18)