Commit graph

42038 commits

Author SHA1 Message Date
John Baldwin
cb63a64b8c acpi_pcib: Rename decoded_bus_range to get_decoded_bus_range
While here, change the return value to bool.

Discussed by:	gibbs

(cherry picked from commit f6c2774fe415f3b79c551b8075c159d6a7d4d0bf)
2023-10-24 12:19:59 -07:00
John Baldwin
76b37faac8 acpi_pcib: Trust decoded bus range from _CRS over _BBN
Currently if _BBN doesn't match the first bus in the decoded bus range
from _CRS for a Host to PCI bridge, the driver fails to attach as a
defensive measure.

There is now firmware in the field where these do not match, and the
_BBN values are clearly wrong, so rather than failing attach, trust
the range from _CRS over _BBN.
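
The selection rule described above can be sketched roughly as follows.
This is a simplified userland model, not the driver code: struct
bus_range and choose_bus() are hypothetical names introduced only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a bus range decoded from _CRS. */
struct bus_range {
	int start;
	int end;
};

/*
 * Pick the bus number for a Host to PCI bridge.  When _CRS decoded a
 * bus range, its first bus wins even if _BBN disagrees; _BBN is only
 * used as a fallback when no range was decoded.
 */
static int
choose_bus(bool have_crs_range, const struct bus_range *crs, int bbn)
{
	if (!have_crs_range)
		return (bbn);
	assert(crs != NULL);
	if (crs->start != bbn)
		fprintf(stderr, "_BBN %d does not match _CRS range %d-%d; "
		    "using _CRS\n", bbn, crs->start, crs->end);
	return (crs->start);
}
```

The mismatch is reported rather than treated as fatal, which mirrors the
change from failing attach to trusting _CRS.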

Co-authored-by:	Justin Gibbs <gibbs@FreeBSD.org>
Reported by:	gibbs
Reviewed by:	imp (earlier version)
Differential Revision:	https://reviews.freebsd.org/D42231

(cherry picked from commit 22a6678b627b39ceb94f7323be1010e928d92494)
2023-10-24 12:00:57 -07:00
John Baldwin
bfa1565246 Trim various $FreeBSD$
Approved by:	markj (cddl/contrib changes)
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D41961

(cherry picked from commit f53355131f65d64e7643d734dbcd4fb2a5de20ed)
2023-10-24 11:22:23 -07:00
John Baldwin
efb26b3fe6 Update a couple of tools to not embed __FBSDID in generated files
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D41955

(cherry picked from commit 99159b076a278d1feb0e18ae99fd866c90443893)
2023-10-24 10:09:55 -07:00
John Baldwin
5db9e9e296 Remove a few more stray __FBSDID uses
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D41954

(cherry picked from commit 16837d353cdde87672d08112610e51e4121c4e50)
2023-10-24 10:08:46 -07:00
John Baldwin
62ba76f3d1 videomode: Regenerate files
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D41953

(cherry picked from commit fc3cc652e500bd8e33b4b77449d167f1df073acb)
2023-10-24 10:08:29 -07:00
John Baldwin
c6dc53e7c8 videomode/devlist2h.awk: Don't include $FreeBSD$ in generated files
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D41952

(cherry picked from commit bd524e2ddb77e1c691f308359ab917414ecb8bed)
2023-10-24 10:08:19 -07:00
Wei Hu
c81166b018 Hyper-V: vmbus: check if signaling host is needed in vmbus_rxbr_read
It is observed that netvsc's send rings could stall on the latest
Azure Boost platforms. This is because the vmbus_rxbr_read() routine
doesn't check whether the host is waiting for more room to put data,
which leaves the host side sleeping forever on this vmbus channel. The
problem was only observed on the latest platform because the host
requests larger buffer ring room to be available, which makes
the issue much more likely to occur.

Fix this by adding a check in the vmbus_rxbr_read() call and signaling
the host in the callers if the check indicates that it is needed.
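
A minimal sketch of such a check, assuming a simplified ring model. The
struct rx_ring layout, the pending_send_sz field, and the exact signaling
condition are assumptions made for illustration and are not taken from
the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical model of a receive ring shared with the host. */
struct rx_ring {
	uint32_t size;            /* total data bytes in the ring */
	uint32_t used;            /* bytes written by the host, not yet read */
	uint32_t pending_send_sz; /* room the host is waiting for; 0 if none */
};

/*
 * Consume 'len' bytes and report whether the caller should signal the
 * host: the host was waiting for room, there was too little room before
 * this read, and there is enough room now.
 */
static bool
rx_ring_read(struct rx_ring *r, uint32_t len)
{
	uint32_t free_before, free_after;

	assert(len <= r->used);
	free_before = r->size - r->used;
	r->used -= len;
	free_after = r->size - r->used;

	if (r->pending_send_sz == 0)
		return (false);
	return (free_before < r->pending_send_sz &&
	    free_after >= r->pending_send_sz);
}
```

The read routine only reports the condition; the actual signaling is left
to the caller, as in the fix described above.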

Reported by:	NetApp
Tested by:	whu
Sponsored by:	Microsoft

(cherry picked from commit 49fa9a64372b087cfd66459a20f4ffd25464b6a3)
2023-10-20 10:03:35 +00:00
Sumit Saxena
10bbea2e25 mpi3mr: Move creation of watchdog to interrupt config hook
Move creation of watchdog process from just before we configure the
interrupt config hook into the config hook itself. This prevents it
from racing the config intr hook and doing an extra reset of the
card. This extra reset is usually harmless, but sometimes it can prevent
discovery of devices if done at just the wrong time. This can lead to no
disks being registered in a box full of disks, for example. Starting it
later eliminates this race, making discovery reliable.

Reviewed by: imp

(cherry picked from commit 7e02c7074c4c6df77b860e0dbcd032a2ea04b98b)
2023-10-19 15:21:12 -06:00
John Hall
01619a8faf smartpqi: Change alignment for dma tags
Problem: Under certain I/O conditions, a program doing large block disk
reads can cause a controller to crash.

Root Cause: The SCSI read request and destination address in the BDMA
descriptor is incorrect, causing the BDMA engine in the controller to
assert.

Fix: Change the alignment for creating bus_dma_tags in the driver from
PAGE_SIZE (4k) to 1, which allows the controller to manage its own
address range for BDMA transactions.

Risk: Medium

Exposure: This reverts a change first made to support NVMe drives on
Excalibur. At that time a 4k alignment was necessary. This no longer
seems to be the case.

PR: 259541
Reported by: Ka Ho Ng <khng@freebsd.org>
Reviewed by: imp
Differential Revision:	https://reviews.freebsd.org/D41619

(cherry picked from commit f07b267d8cc87e88be3c78aa69504b5ebc6571ee)
2023-10-19 15:21:11 -06:00
John F. Carr
1ad148a68a smartpqi: Drop spinlock before freeing memory
pqisrc_free_device frees the device softc with the os spinlock
held. This causes crashes when devices are removed because the memory
free might sleep (which is prohibited with spin locks held). Drop the
spinlock before releasing the memory.
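
The pattern can be sketched in userland as follows. This is an analogy
only: a pthread mutex stands in for the kernel spin lock, and dev_table,
struct dev, and remove_device() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_DEVS 4

struct dev {
	int id;
};

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;
static struct dev *dev_table[MAX_DEVS];

/*
 * Unlink the device while the lock is held, but release the memory only
 * after the lock has been dropped, so the free never runs under the lock.
 */
static void
remove_device(int slot)
{
	struct dev *d;

	assert(slot >= 0 && slot < MAX_DEVS);
	pthread_mutex_lock(&dev_lock);
	d = dev_table[slot];
	dev_table[slot] = NULL;
	pthread_mutex_unlock(&dev_lock);

	free(d);	/* free(NULL) is a no-op */
}
```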

MFC After: 2 days
PR: 273289
Reviewed by: imp

(cherry picked from commit b064a4c9eed5b1dd2a40fc4fd2cb7e738b681547)
2023-10-19 15:21:11 -06:00
Emmanuel Vadot
d7d51aad8f iicbus: pmic: rk8xx: Fix logic in clock-output-names detection
Pointy hat to:	manu (probably)

(cherry picked from commit 66946511380bf088c96a7517ba9b018c943655c6)
2023-10-18 16:33:38 +02:00
Emmanuel Vadot
26bf8fff64 xilinx: reset: Remove debug printfs
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 257405d707d77bc55b38e7c2bb83b8a9247a86ae)
2023-10-18 16:32:37 +02:00
Emmanuel Vadot
b4cd14485a i2c: Add Microcrystal RV3032 RTC driver
This is a simple RTC driver for the rv3032 from Microcrystal.
Just the basic functionality is implemented (no timer, alarm, etc.).

Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D41995

(cherry picked from commit 1d6a6a524409662992ca96bc91ae69b2a2a5ff35)
2023-10-18 16:32:19 +02:00
Emmanuel Vadot
7b824791e7 i2c: Add cadence iic driver
This IP is found in Xilinx SoCs; it has only been tested on ZynqMP (arm64)
so only enable it there for now.

Differential Revision:	https://reviews.freebsd.org/D41994

(cherry picked from commit 137b58e4d2044adc200d13c8989d3746a0a4bd7f)
2023-10-18 16:32:17 +02:00
Emmanuel Vadot
c133589105 iicbus: Move opencores i2c driver into controller subdirectory
Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D41914

(cherry picked from commit 125f5c5b48b1fdccf364b821ce48bfdbd9687ed1)
2023-10-18 16:32:15 +02:00
Emmanuel Vadot
96edbfe36b iicbus: Move i2c sensors drivers into new sensor subdirectory
No reason that they should live directly under iicbus

Sponsored by:   Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D41913

(cherry picked from commit 7c569caa0a6fffa7e1cc0a7f61e986dbc7c59074)
2023-10-18 16:32:14 +02:00
Emmanuel Vadot
cd2f6226e7 iicbus: Move ADC drivers into a new adc subfolder
No reason that they should live directly under iicbus

Sponsored by:   Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D41911

(cherry picked from commit 06589d6e029c6ff64a7816d743e0a508abe6193b)
2023-10-18 16:32:11 +02:00
Emmanuel Vadot
26dd10a4f3 iicbus: Move adm1030 and adt746x to new pwm subdirectory
Those are (mainly) PWM controllers, so move them under a new subdirectory.

Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D41910

(cherry picked from commit 22d7dd834bc5cd189810e414701e3ad1e98102e4)
2023-10-18 16:32:10 +02:00
Emmanuel Vadot
8d715e2f49 iicbus: Move Silergy pmic/regulators under pmic/silergy subdirectory
Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D41909

(cherry picked from commit 062944cc4227e7bd002e4de2be48ec9b710bfaa5)
2023-10-18 16:32:08 +02:00
Emmanuel Vadot
5e25c410bd iicbus: Move remaining rtc driver into rtc subfolder
No reason that they should live directly under iicbus

Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D41908

(cherry picked from commit 2f16049c985a364e2bd2b256f5bef9af17e10c62)
2023-10-18 16:32:07 +02:00
Emmanuel Vadot
a5b3cbe6cc iicbus: Move twsi under a new controller subdirectory
The folder is a mess so start moving stuff into sub-directories.

Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D41907

(cherry picked from commit 580d00f42fdd94ce43583cc45fe3f1d9fdff47d4)
2023-10-18 16:32:05 +02:00
Emmanuel Vadot
1e685c7dd5 if_cgem: Rewrite clock part
- pclk and hclk are mandatory so always try to get them.
  Don't make it fatal if it fails as some platforms (like Zynq) don't
  have a proper clock driver.
- Always use pclk for the reference clock.
- Try to get all the possible clocks and enable them.

Reviewed-by:	mhorne
Tested-by:	Milan Obuch <bsd@dino.sk>
Differential Revision:	https://reviews.freebsd.org/D41857
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 4c52dde5bda099936d43820da84e569dccc6f475)
2023-10-18 16:31:13 +02:00
Emmanuel Vadot
770b790eab if_cgem: Cleanup compatible and add new ones
- Remove cdns,gem: it's the generic binding, but every platform that includes
  it needs a platform-specific driver setup, so remove it.
- Remove cdns,macb: it's the generic binding for Atmel AT91, which we don't support.
- Remove cadence,gem: it's not an official binding and seems to be used only in some
  obscure ARM11 SoC.
- Note that the cdns,zynq* are deprecated
- Add the new Xilinx compatible for zynq and zynqmp

Reviewed-by:	mhorne
Tested-by:	skibo, Milan Obuch <bsd@dino.sk>
Differential Revision:	https://reviews.freebsd.org/D41856
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit bdbbbbb32104569fccd786d9cc07d17f6231a713)
2023-10-18 16:31:11 +02:00
Emmanuel Vadot
0af2307d59 sdhci: fdt: Correctly export clock per the binding
The binding says that we can have one or two clocks to export.
The first one is the actual sdclock while the second is the sample clock.
Both have the same parent, clk_xin.
Correctly export the clocks for RK3399 and ZynqMP.
No need to use a high ID as before, we have our own clock domain so use
ids starting at 1 as all exported clocks should be.

Reviewed-by:	bz
Differential Revision:	https://reviews.freebsd.org/D41810
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 81a4fe38a6ce818bb7cba548bb2c697429fa9479)
2023-10-18 16:31:10 +02:00
Emmanuel Vadot
62b14a7531 sdhci: fdt: Always try to get the phy and the syscon
Per the bindings the phy and the syscon can always be present not just
for RK3399.

Reviewed-by:	bz
Differential Revision:	https://reviews.freebsd.org/D41809
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 0ee5d6fcfc63be48fd7c1b461917dfb880dc7f72)
2023-10-18 16:31:08 +02:00
Emmanuel Vadot
d6a1d41df4 sdhci: fdt: Always enable clock for ZynqMP and RK3399
Those two (in fact all of the supported ones in this driver except RK3568) always
need the clocks to be enabled.

Reviewed-by:	bz
Differential Revision:	https://reviews.freebsd.org/D41808
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 9377d7049c846d1e35c8fc8809c23e6413909fca)
2023-10-18 16:31:07 +02:00
Emmanuel Vadot
023ba06b4f sdhci: fdt: Remove sdhci_generic compatible string
This was used when we had our own DTS, it's not used anymore.

Reviewed-by:	bz
Differential Revision:	https://reviews.freebsd.org/D41807
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 8c7e747491ad636d6ee4069a74ddb24814870540)
2023-10-18 16:31:05 +02:00
Emmanuel Vadot
20d6c796fa arm64: zynqmp: Add clock driver
Add clock and reset drivers for the ZynqMP SoC.
The clocks are discovered by talking to the firmware as the topology isn't
fixed on this SoC.

Differential Revision:	https://reviews.freebsd.org/D41812
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 4e579ad047720775ab580b74192c7de8a3386fea)
2023-10-18 16:31:03 +02:00
Emmanuel Vadot
ab8f34675a arm64: zynqmp: Add firmware driver
The ZynqMP SoC has an MCU running firmware to control clocks, resets,
FPGA loading, etc.
Add a driver that can be used to communicate with it.
For now only the clock and reset part are implemented.

Differential Revision:	https://reviews.freebsd.org/D41811
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 9e88711f28dc9afa7d68ae8dd027d2399a2a290b)
2023-10-18 16:31:01 +02:00
Emmanuel Vadot
62ce4a798f cpufreq_dt: Find the closest frequency
When building the frequencies table we convert the value in the DTS to
megahertz and lose precision. While it's not a problem for most of the
DTS it is when the expected frequency value is strict down to the hertz.
So it's either we don't truncate the value and have some ugly and long
values in the sysctls or we just find the closest frequency.
Do the latter.
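
A minimal sketch of a closest-frequency lookup, assuming the table holds
frequencies in hertz. The function name closest_freq() is hypothetical
and the driver's actual lookup may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the table entry nearest to 'target'.  A target that lost
 * precision (for example, truncated to whole megahertz) still resolves
 * to the exact value stored in the table.
 */
static uint64_t
closest_freq(const uint64_t *freqs, size_t n, uint64_t target)
{
	uint64_t best, best_diff, diff;
	size_t i;

	assert(freqs != NULL && n > 0);
	best = freqs[0];
	best_diff = best > target ? best - target : target - best;
	for (i = 1; i < n; i++) {
		diff = freqs[i] > target ? freqs[i] - target :
		    target - freqs[i];
		if (diff < best_diff) {
			best = freqs[i];
			best_diff = diff;
		}
	}
	return (best);
}
```

With a table entry of 599999999 Hz, a request for 599 MHz (599000000 Hz)
selects that entry instead of failing an exact match.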

Reviewed by:	mmel
Differential Revision:	https://reviews.freebsd.org/D41762
Sponsored by:	Beckhoff Automation GmbH & Co. KG

(cherry picked from commit 17c17872ca98df0e2b9f9c7a2c41ef73f7dee21a)
2023-10-18 16:30:22 +02:00
Konstantin Belousov
23a55498a8 vkbd: correct ref count on cloned cdevs
(cherry picked from commit 6e92fc930943a85f311e986a02e2b3dae9e37126)
2023-10-16 10:16:01 +03:00
Mark Johnston
29de7af6ee mrsas: Fix callout locking in mrsas_complete_cmd()
callout_stop() requires the associated lock to be held.

This is a bit hacky, but I believe it's safe since the subsequent
mrsas_cmd_done() call will also acquire the SIM lock to stop a different
callout.

PR:		265484
Reviewed by:	imp
Tested by:	Jérémie Jourdin <jeremie.jourdin@advens.fr>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D39559

(cherry picked from commit 4640df1b0a49697840b81f6bcd269a483514c6aa)
2023-10-14 11:29:11 -04:00
Sk Razee
dc80e764d6 if_re: add Realtek Killer Ethernet E2600 IDs
PR:		274292
MFC after:	1 week
Reviewed by:	kp
Event:		Oslo Hackathon at Modirum

(cherry picked from commit 3c871489cdd6c5606b2b1125f66b0e9b8f39561f)
2023-10-13 09:25:32 +02:00
John Baldwin
969dc06e91 cxgbe t4_tls: Call t4_rcvd_locked from do_rx_tls_cmp
Similar to dcfddc8dc091e7688abc8488a0307eba425fa7a2, replace the
simpler, inlined version with the full version.

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D41690

(cherry picked from commit 897e564361624411c4e557e0817642e1477f0af4)
2023-10-11 08:10:32 -07:00
John Baldwin
cb2cd58dbd cxgbe t4_tls: Don't bother returning RX credits for a protocol receive error
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D41689

(cherry picked from commit 75af2d951cce7d51d2033405f96f083c01f39f04)
2023-10-11 08:10:32 -07:00
John Baldwin
bd8cecc466 cxgbe tom: Call t4_rcvd_locked from do_rx_data to return RX credits
In particular, the kernel RPC layer used by the NFS client never
invokes pru_rcvd since it always reads data from the socket upcall
via MSG_SOCALLBCK which avoids calling pru_rcvd.  As a result, on an
NFS client connection managed by t4_tom, RX credits were never
returned to the TOE connection to open the TCP window resulting in
connection hangs.

To fix, expand the set of conditions in do_rx_data where RX credits
are returned to match those in t4_rcvd_locked by calling the function
directly.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D41688

(cherry picked from commit dcfddc8dc091e7688abc8488a0307eba425fa7a2)
2023-10-11 08:10:32 -07:00
John-Mark Gurney
b7f5e99347 virtio_random: Pipeline fetching the data
Queue an initial fetch of data during attach and after every read
rather than synchronously fetching data and polling for completion.

If data has not been returned from a previous fetch during read,
just return EAGAIN rather than blocking.
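
The pipelining idea can be modeled roughly as below. This is a userland
sketch with hypothetical names (struct rng_softc, rng_queue_fetch(),
rng_complete(), rng_read()); the device completion is simulated by an
explicit call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define BUF_SIZE 16

struct rng_softc {
	unsigned char buf[BUF_SIZE];
	bool inflight;	/* a fetch has been queued to the device */
	bool ready;	/* the queued fetch has completed */
};

/* Queue a fetch without waiting for it (done at attach and after reads). */
static void
rng_queue_fetch(struct rng_softc *sc)
{
	sc->ready = false;
	sc->inflight = true;
}

/* Simulated device completion: fill the buffer and mark it ready. */
static void
rng_complete(struct rng_softc *sc, unsigned char fill)
{
	assert(sc->inflight);
	memset(sc->buf, fill, BUF_SIZE);
	sc->inflight = false;
	sc->ready = true;
}

/* Return data if the previous fetch finished, otherwise EAGAIN. */
static int
rng_read(struct rng_softc *sc, unsigned char *out, size_t len)
{
	if (!sc->ready)
		return (EAGAIN);
	if (len > BUF_SIZE)
		len = BUF_SIZE;
	memcpy(out, sc->buf, len);
	rng_queue_fetch(sc);	/* start the next fetch right away */
	return (0);
}
```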

Co-authored-by: John Baldwin <jhb@FreeBSD.org>

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D41656

(cherry picked from commit f1c5a2e3a625053e2b70d5b1777d849a4d9328f2)
2023-10-11 08:10:32 -07:00
John Baldwin
b53155d4df efirt: Move comment about fpu_kern_enter to where it is called
Reviewed by:	imp, kib, andrew, markj
Differential Revision:	https://reviews.freebsd.org/D41576

(cherry picked from commit 8173fa60ddb7e9a805dec9fef7bf07e74ae4144d)
2023-10-11 08:10:31 -07:00
Damien Broka
b973cdbb20 axge: Add support for AX88179A
The AX88179A has two firmware modes, one of which is backward
compatible with existing AX88178A/179 driver. The active firmware mode
can be controlled through a register.

Update axge(4) man page to mention 179A support and ensure that, when
bound to an AX88179A, the driver activates the compatible firmware mode.

Reviewed by:	markj
Pull Request:	https://github.com/freebsd/freebsd-src/pull/854
MFC after:	1 week

(cherry picked from commit 6962da914dd511349b219241e92b32329be76fc6)
2023-10-11 09:16:14 -04:00
Olivier Certner
6c59ac8c79 x86: AMD Zen2: Zenbleed chicken bit mitigation
Applies only to bare-metal Zen2 processors.  The system currently
automatically applies it to all of them.

Tunable/sysctl 'machdep.mitigations.zenbleed.enable' can be used to
forcibly enable or disable the mitigation at boot or run-time.  Possible
values are:

    0: Mitigation disabled
    1: Mitigation enabled
    2: Run the automatic determination.

Currently, value 2 is the default and has the same effect as value 1.
This might change in the future if we choose to take into account
microcode revisions in the automatic determination process.

The tunable/sysctl value is simply ignored on non-applicable CPU models,
which is useful to apply the same configuration on a set of machines
that do not all have Zen2 processors.  Trying to set it to any integer
value not listed above is silently equivalent to setting it to value 2
(automatic determination).
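
The value handling described above can be modeled as follows. This is an
illustrative sketch only: the enum and the zb_* function names are
hypothetical and do not come from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>

enum zb_state { ZB_NOT_APPLICABLE, ZB_DISABLED, ZB_ENABLED };

/* Any integer other than 0 or 1 is treated as 2 (automatic). */
static int
zb_normalize(int value)
{
	return ((value == 0 || value == 1) ? value : 2);
}

/* Resolve the effective state for a given CPU and tunable value. */
static enum zb_state
zb_resolve(bool cpu_applicable, int value)
{
	int policy;

	if (!cpu_applicable)
		return (ZB_NOT_APPLICABLE);	/* value is ignored */
	policy = zb_normalize(value);
	assert(policy >= 0 && policy <= 2);
	if (policy == 0)
		return (ZB_DISABLED);
	/* 1, or 2: automatic determination currently behaves like 1. */
	return (ZB_ENABLED);
}

/* Strings matching those reported by the state sysctl. */
static const char *
zb_state_str(enum zb_state s)
{
	switch (s) {
	case ZB_ENABLED:
		return ("Mitigation enabled");
	case ZB_DISABLED:
		return ("Mitigation disabled");
	default:
		return ("Not applicable");
	}
}
```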

The current mitigation state can be queried through sysctl
'machdep.mitigations.zenbleed.state', which returns "Not applicable",
"Mitigation enabled" or "Mitigation disabled".  Note that this state is
not guaranteed to be accurate in case of intervening modifications of
the corresponding chicken bit directly via cpuctl(4) (this includes the
cpucontrol(8) utility).  Resetting the desired policy through
'machdep.mitigations.zenbleed.enable' (possibly to its current value)
will reset the hardware state and ensure that the reported state is
again coherent with it.

Reviewed by:	kib
Sponsored by:   The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41817

(cherry picked from commit ebaea1bcd2eb0aa90937637ed305184b6fedc69b)
2023-10-10 09:34:31 -04:00
David Sloan
510404f2f4 nvme: Fix memory leak in pt ioctl commands
When running nvme passthrough commands through the ioctl interface
memory is mapped with vmapbuf() but not unmapped. This results in leaked
memory whenever a process executes an nvme passthrough command with a
data buffer. This can be replicated with a simple C function (error
checks skipped for brevity):

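#include <sys/ioctl.h>
#include <dev/nvme/nvme.h>
#include <stdint.h>
#include <stdlib.h>

void leak_memory(int nvme_ns_fd, uint16_t nblocks) {
	struct nvme_pt_command pt = {
		.cmd = {
			.opc = NVME_OPC_READ,
			.cdw12 = nblocks - 1,
		},
		.len = nblocks * 512, // Assumes devices with 512 byte lba
		.is_read = 1, // Reads and writes should both trigger leak
	};
	void *buf;

	// posix_memalign() takes (memptr, alignment, size); the 4096-byte
	// alignment is an assumption, the original call omitted it.
	posix_memalign(&buf, 4096, nblocks * 512);
	pt.buf = buf;
	ioctl(nvme_ns_fd, NVME_PASSTHROUGH_COMMAND, &pt);
	free(buf);
}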

Signed-off-by: David Sloan <david.sloan@eideticom.com>

PR:		273626
Reviewed by:	imp, markj
MFC after:	1 week

(cherry picked from commit 7ea866eb14f8ec869a525442c03228b6701e1dab)
2023-10-08 20:41:25 -04:00
Bjoern A. Zeeb
493d625543 net80211 / drivers: remove public use of ieee80211_node_incref()
ieee80211_node_incref() is the FreeBSD implementation of
ieee80211_ref_node().  Not being interested in the node returned
it was used as a shortcut in 3 drivers (ath, uath, wpi).
Replace the call with the public KPI of ieee80211_ref_node() and
ignore the result.
This leaves us with the single internal call going
ieee80211_ref_node() -> ieee80211_node_incref() and that should
help increase portability and limit the places to trace
for node reference operations.

Sponsored by:	The FreeBSD Foundation

(cherry picked from commit f156cd892b55c04a39fa06d1899e6e316de77f03)
2023-10-04 15:19:18 +00:00
Mark Johnston
1e8737f4e8 hdac: Defer interrupt allocation in hdac_attach()
hdac_attach() registers an interrupt handler before allocating various
driver resources which are accessed by the interrupt handler.  On some
platforms we observe what appear to be spurious interrupts upon a cold
boot, resulting in panics.

Partially work around the problem by deferring irq allocation until
after other resources are allocated.  I think this is not a complete
solution, but is correct and sufficient to work around the problems
reported in the PR.

PR:		268393
Tested by:	Alexander Sherikov <asherikov@yandex.com>
Tested by:	Oleh Hushchenkov <o.hushchenkov@gmail.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D41883

(cherry picked from commit 015daf5221f7588b9258fe0242cee09bde39fe21)
2023-10-04 09:41:52 -04:00
Alan Somers
c70a4185c6 mprutil: "fix user reply buffer (64)..." warnings
Depending on the card's firmware version, it may return different length
responses for MPI2_FUNCTION_IOC_FACTS.  But the first part of the
response contains the length of the rest, so query it first to get the
length and then use that to size the buffer for the full response.

Also, correctly zero-initialize MPI2_IOC_FACTS_REQUEST.  It only worked
by luck before.
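
The two-step query can be sketched like this. The reply layout, the
canned firmware reply, and the function names are hypothetical stand-ins
for illustration; they are not the MPI2 structures or the mprutil code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reply header: the first byte gives the total length. */
struct reply_hdr {
	uint8_t length;		/* total reply length in bytes */
	uint8_t function;
};

/* Canned "firmware" reply: 2-byte header followed by 6 payload bytes. */
static const unsigned char fw_reply[] = { 8, 3, 'F', 'A', 'C', 'T', 'S', '!' };

/* Simulated device query: copies at most 'buflen' bytes of the reply. */
static size_t
fake_ioc_query(void *buf, size_t buflen)
{
	size_t n = buflen < sizeof(fw_reply) ? buflen : sizeof(fw_reply);

	memcpy(buf, fw_reply, n);
	return (n);
}

/* Fetch the header first, then size the full buffer from its length. */
static void *
query_full_reply(size_t *lenp)
{
	struct reply_hdr hdr;
	void *full;

	memset(&hdr, 0, sizeof(hdr));		/* zero-initialize explicitly */
	fake_ioc_query(&hdr, sizeof(hdr));	/* pass 1: header only */
	assert(hdr.length >= sizeof(hdr));

	full = calloc(1, hdr.length);
	if (full == NULL)
		return (NULL);
	fake_ioc_query(full, hdr.length);	/* pass 2: full reply */
	*lenp = hdr.length;
	return (full);
}
```

Sizing the second buffer from the length reported by the first pass
avoids guessing a fixed reply size across firmware versions.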

PR:		264848
Reported by:	Julien Cigar <julien@perdition.city>
Sponsored by:	Axcient
Reviewed by:	scottl, imp
Differential Revision: https://reviews.freebsd.org/D38739

(cherry picked from commit 7d154c4dc64e61af7ca536c4e9927fa07c675a83)
2023-10-02 19:31:12 -06:00
Priit Trees
6d22a25725 vge: correct pause_frames sysctl description
Reviewed by:	emaste
Pull Request:	https://github.com/freebsd/freebsd-src/pull/806

(cherry picked from commit 0a5d2802b41fd216d9a345f749af1a6ccbe9f382)
2023-09-29 08:26:00 -04:00
Warner Losh
3cd49bc5b3 nvme: Suppress noise messages
When we're suspending, we get messages about waiting for the controller
to reset. These are in error: we're not waiting for it to reset. We enter
the recovery state as part of suspending, so we should suppress these as
a false positive.

Also remove a stray debug that's left over from earlier versions of
the recovery code that no longer makes sense.

Sponsored by:		Netflix

(cherry picked from commit 1d6021cd72689f54093af4ed77066a2f8abde664)
2023-09-28 15:05:15 -06:00
Warner Losh
81b118e842 nvme: Fix locking protocol violation to fix suspend / resume
Currently, when we suspend, we need to tear down all the qpairs. We call
nvme_admin_qpair_abort_aers with the admin qpair lock held, but the
tracker it will call for the pending AER also locks it (recursively)
hitting an assert. This routine is called without the qpair lock held
when we destroy the device entirely in a number of places. Add an assert
to this effect and drop the qpair lock before calling it.
nvme_admin_qpair_abort_aers then locks the qpair lock to traverse the
list, dropping it around calls to nvme_qpair_complete_tracker, and
restarting the list scan after picking it back up.
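
The traversal described above (drop the lock around each completion, then
restart the scan) can be sketched as follows. This is a simplified model
with hypothetical types and names; a boolean with assertions stands in
for the real qpair lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tracker {
	struct tracker *next;
	bool is_aer;
	bool done;
};

struct qpair {
	bool locked;		/* stand-in for the qpair lock */
	struct tracker *head;
};

static void
qp_lock(struct qpair *q)
{
	assert(!q->locked);	/* recursion would trip this assert */
	q->locked = true;
}

static void
qp_unlock(struct qpair *q)
{
	assert(q->locked);
	q->locked = false;
}

/* Completion takes the lock itself, so callers must not hold it. */
static void
complete_tracker(struct qpair *q, struct tracker *tr)
{
	struct tracker **pp;

	qp_lock(q);
	for (pp = &q->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == tr) {
			*pp = tr->next;
			break;
		}
	}
	tr->done = true;
	qp_unlock(q);
}

/* Abort all pending AERs; returns how many were completed. */
static int
abort_aers(struct qpair *q)
{
	struct tracker *tr;
	int n = 0;

	assert(!q->locked);	/* must be called without the lock held */
	qp_lock(q);
rescan:
	for (tr = q->head; tr != NULL; tr = tr->next) {
		if (!tr->is_aer)
			continue;
		qp_unlock(q);	/* drop the lock around the completion */
		complete_tracker(q, tr);
		n++;
		qp_lock(q);
		goto rescan;	/* the list changed: restart the scan */
	}
	qp_unlock(q);
	return (n);
}
```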

Note: If interrupts are still running, there's a tiny window for these
AERs: If one fires just an instant after we manually complete it, then
we'll be fine: we set the state of the queue to 'waiting' and we ignore
interrupts while 'waiting'. We know we'll destroy all the queue state
with these pending interrupts before looking at them again and we know
all the TRs will have been completed or rescheduled. So either way we're
covered.

Also, tidy up the failure case as well: failing a queue is a superset of
disabling it, so no need to call disable first. This solves some
locking issues with recursion since we don't need to recurse. Set the
qpair state of failed queues to RECOVERY_FAILED and stop scheduling the
watchdog. Assert we're not failed when we're enabling a qpair, since
failure currently is one-way. Make failure a little less verbose.

Next, kill the pre/post reset stuff. It's completely bogus since we
disable the qpairs, we don't need to also hold the lock through the
reset: disabling will cause the ISR to return early. This keeps us from
recursing on the recovery lock when resuming. We only need the recovery
lock to avoid a specific race between the timer and the ISR.

Finally, kill NVME_RESET_2X. It's been a major release since we put it
in and nobody has used it as far as I can tell. And it was a motivator
for the pre/post uglification.

These are all interrelated, so need to be done at the same time.

Sponsored by:		Netflix
Reviewed by:		jhb
Tested by:		jhb (made sure suspend / resume worked)
MFC After:		3 days
Differential Revision:	https://reviews.freebsd.org/D41866

(cherry picked from commit da8324a9258f1791cd10423103c1746646e33104)
2023-09-28 15:05:15 -06:00
Warner Losh
5e9b7d0e0e nvme: Give up when we've failed
Normally, we poll the device every so often to see if commands have
timed out. However, we'll go into the recovery state as part of failing
the drive. To account for all possibilities, if we're failed when we get
into the polling function, just stop polling: Party is over.

Sponsored by:		Netflix

(cherry picked from commit d95431624f934fe4740211738fc787808005b14e)
2023-09-28 15:05:14 -06:00
Warner Losh
c7cb2dcdf2 nvme: Add exclusion for ISR
Add a basically uncontended spinlock that we take out while the ISR is
running. This has two effects: First, when we get a timeout, we can
safely call the nvme_qpair_process_completions w/o racing any ISRs.
Second, we can use it to ensure that we don't reset the card while
the ISRs are active (right now we just sleep and hope for the best,
which usually is fine, but not always).

Sponsored by:		Netflix
MFC After:		2 weeks
Reviewed by:		chuck, gallatin
Differential Revision:	https://reviews.freebsd.org/D41452

(cherry picked from commit 8052b01e7e4113fa8296ce43c354116b0a1774b7)
2023-09-28 15:05:14 -06:00