Add vDSO support for timekeeping devices that support the KVM/XEN
paravirtual clock API.
Also, expose, in the userspace-accessible '<machine/pvclock.h>',
definitions that will be needed by 'libc' to support
'VDSO_TH_ALGO_X86_PVCLK'.
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31418
Add support for the KVM paravirtual clock device.
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D29733
A socket in the FIN_WAIT_1 state is marked disconnected by
do_close_con_rpl() even though there might still receive data pending.
This is because the socket at that point has set SBS_CANTRCVMORE which
causes the protocol layer to discard any data received before the FIN.
However, icl_cxgbei_conn_close needs to wait until all the data has
been discarded. Replace the wait for SS_ISDISCONNECTED with instead
waiting for final_cpl_received() to be called.
Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications
bus_activate_resource maps memory regions as uncacheable on x86, which
is more strict than required for regions allocated using xenmem_alloc,
so don't rely on bus_activate_resource and instead map the region
using pmap_mapdev_attr and VM_MEMATTR_XEN as the cache attribute.
Sponsored by: Citrix Systems R&D
Some simulators don't implement arbitrary sized memory access to the
virtio PCI registers. Follow Linux and use single byte accesses to read
and write to these registers.
Reviewed by: bryanv, emaste (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31424
Sanitizer instrumentation of course cannot automatically update shadow
state when devices write to host memory. KMSAN thus hooks into busdma,
both to update shadow state after a device write, and to verify that the
kernel does not publish uninitalized bytes to devices.
To implement this, when KMSAN is configured, each dmamap embeds a memory
descriptor describing the region currently loaded into the map.
bus_dmamap_sync() uses the operation flags to determine whether to
validate the loaded region or to mark it as initialized in the shadow
map.
Note that in cases where the amount of data written is less than the
buffer size, the entire buffer is marked initialized even when it is
not. For example, if a NIC writes a 128B packet into a 2KB buffer, the
entire buffer will be marked initialized, but subsequent accesses past
the first 128 bytes are likely caused by bugs.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31338
Reset the mmc owner before calling the bridge release host callback.
Some people are hitting the "mmc: host bridge didn't serialize us." panic as
the bridge is released before the mmc owner is reset.
Submitted by: luiz
Sponsored by: Rubicon Communications, LLC ("Netgate")
Simplify the setup of srrctl.BSIZEPKT on igb class NICs.
Improve the setup of rctl.BSIZE on lem and em class NICs.
Don't try to touch rfctl on lem class NICs.
Manipulate rctl.BSEX correctly on lem and em class NICs.
Approved by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31457
Both callout and taskqueue now have drain() routines not requiring
external locking. It allows to remove TASK flag and manual drain,
so the only thing remaining for lock to protect inside the callout
handler is ks_inq_length zero comparison, that can be lockless.
MFC after: 2 weeks
The toolchain can reorder symbols, meaning that changing
the order of DRIVER_MODULE macros is not enough
to ensure that miibus gets registered first.
Use DRIVER_MODULE_ORDERED instead to fix the problem properly.
Fixes: 5ad6d28cbe ("enetc: Support building the driver as a loadable module.")
Reported by: jhb
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).
Sponsored by: The FreeBSD Foundation
This has been present since the first revision of the file. The debugf
macros have always been unused so it doesn't actually do anything
useful, and besides, debugging should not be unconditionally turned on
for a production driver. Moreover, this breaks the riscv LINT kernel
build as sys/conf/NOTES includes options DEBUG, resulting in a macro
redefinition error. This does not show up in the arm64 LINT kernel build
since that has an explicit nooptions DEBUG, which is dubious and should
be revisited. Rather than copy such a hack to riscv's NOTES, fix this
specific instance of DEBUG breaking.
Fixes: 896e217a0e ("fu740_pci_dw: Add SiFive FU740 PCIe controller driver")
MFC after: 1 week
If we allocate a new window for a bridge rather than reusing an existing
one set up by firmware to cover all the devices then the new window only
includes the range needed for the first device to allocate the resource.
If a request comes in to adjust this resource in order to extend a
downstream window for another device then this will fail as the rman
doesn't have any space, so we must first grow the bridge's own window.
This is needed to support successfully attaching more than one PCI
device on SiFive's HiFive Unmatched, which has the following topology:
Root Port <---> Bridge <---> Bridge <-+-> Bridge <---> (Unused)
(pcib0) (pcib1) (pcib2) | (pcib3)
+-> Bridge <---> xHCI
| (pcib4)
+-> Bridge <---> M.2 E-key
| (pcib5)
+-> Bridge <---> M.2 M-key
| (pcib6)
+-> Bridge <---> x16 slot
(pcib7)
Without this, the xHCI endpoint successfully attaches but NVMe M.2 M-key
endpoint fails to attach as, when its adjacent bridge (pcib6) attempts
to allocate a window from its parent (pcib2) on the other side of the
switch, its parent attempts to grow its own window by calling
bus_adjust_resource on its own parent (pcib1) which fails to call the
root port device (pcib0) to request more memory to grow its own window.
Had the root port been directly connected to the switch without the
bridge in the middle then the existing code would have worked, but the
extra hop broke it.
Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31035
Add the ability to map named components from IORT to their
SMMU or ITS node in order to setup interrupts.
It is now possible to find a node by its name (substring) and
resource ID similar to PCI nodes.
This is needed by work on a driver for NXP's Second Generation
Data Path Acceleration Architecture (DPAA2).
Reviewed by: andrew
MFC after: 2 weeks
Differential Revision:: https://reviews.freebsd.org/D31267
ISO can be disabled before establishing a connection by setting
dev.tNnex.N.toe.iso to 0.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31223
Similar to TSO, iSCSI segmentation offload permits the upper layers to
submit a "large" virtual PDU which is split up into multiple segments
(PDUs) on the wire. Similar to how the TCP/IP headers are used as
templates for TSO, the BHS at the start of a large PDU is used as a
template to construct the specific BHS at the start of each PDU. In
particular, the DataSN is incremented for each subsequent PDU, and the
'F' flag is only set on the last PDU.
struct icl_conn has a new 'ic_hw_isomax' field which defaults to 0,
but can be set to the largest virtual PDU a backend supports. If this
value is non-zero, the iSCSI target and initiator use this size
instead of 'ic_max_send_data_segment_length' to determine the maximum
size for SCSI Data-In and SCSI Data-Out PDUs. Note that since PDUs
can be constructed from multiple buffers before being dispatched, the
target and initiator must wait for the PDU to be fully constructed
before determining the number of DataSN values were consumed (and thus
updating the per-transfer DataSN value used for the start of the next
PDU).
The target generates large PDUs for SCSI Data-In PDUs in
cfiscsi_datamove_in(). The initiator generates large PDUs for SCSI
Data-Out PDUs generated in response to an R2T.
Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31222
Create a struct icl_soft_conn which extends struct icl_conn and
move fields only used by icl_soft from struct icl_conn to
struct icl_soft_conn.
Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31414
This variable is set based on the exact CPU model detected. If this
value is set too small, it could lead to a NULL-dereference from an
improperly initialized pmc_rowindex_to_classdep array.
Though it has been fixed, this was previously the case for Broadwell.
Add two asserts to catch this in DEBUG kernels, as it represents a
configuration error that may be hard to uncover otherwise.
PR: 253687
Reported by: Zhenlei Huang <zlei.huang@gmail.com>
Sponsored by: The FreeBSD Foundation
It was written for Nehalem and Westmere, with minor but incomplete
updates for Sandy Bridge in 78d763a29b. The uncore architecture
changed significantly with this generation, bringing new layouts and
locations for some MSRs.
Misprogramming these MSRs in ucp_start_pmc() may panic the system, and
this is trivially reproducible via pmcstat(8) on at least Broadwell and
Haswell. Disable the class on these CPUs until it can be updated more
completely and leave a TODO comment detailing some of the work required.
Note that the nclasses value for Broadwell was already incorrect and
doesn't need changing.
The result is that any uncore events listed by pmcstat -L will no longer
be allocatable, but this is already the case for newer generations of
Intel CPUs.
PR: 253687
Reported by: Zhenlei Huang <zlei.huang@gmail.com>
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31389
After recent arm64 GENERIC config cleanup the ENETC MDIO
in NXP LS1028A SoC should support being loaded as a module.
Obtained from: Semihalf
Sponsored by: Alstom Group
Function level reset has to be done in attach in order to put the
hardware in a known state before configuring it.
The order of DRIVER_MODULEs was changed to ensure that the miibus driver
is loaded when mii_attach is called.
Obtained from: Semihalf
Sponsored by: Alstom Group
It is found on boards equipped with LS1028A SoC.
802.1q VLAN grouping is supported.
An external MDIO device is used for communicating with PHYs.
The driver is built as a module by default, it is not included
in GENERIC kernel config.
Submitted by: Lukasz Hajec <lha@semihalf.com>
Kornel Duleba <mindal@semihalf.com>
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30923
Felix switch found in LS1028A supports stripping VLAN tag on
ingress, instead of egress. The striptag flag excepts the latter
behaviour.
Add a new flag to support the feature.
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30922
The remote peer might send a FIN in the middle of a burst of data
PDUs. In the case of T6 with data PDU completion moderation, the
driver would not have seen these PDUs since the final PDU in the burst
was never received resulting in a stale rcv_nxt when the FIN is
received.
While here, invert the logic in the condition to be more readable and
always set tp->rcv_nxt from the sequence number in the CPL. This sets
the proper value of rcv_nxt for FINs on connections with data received
but not reported via a CPL (e.g. a partial iSCSI PDU burst interrupted
by a FIN).
Reported by: Jithesh Arakkan @ Chelsio
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D30871
o Arm CoreLink TM CMN-600 Coherent Mesh Network controller,
o Arm CoreLink DMC-620 Dynamic Memory Controller.
Sponsored by: Ampere Computing LLC
Submitted by: Klara Inc.
Recent firmware versions round down the value passed here by the MSS
and subsequently mishandle transmitted PDUs larger than the rounded
down value.
Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications
Add request submission status checks before checking req->ir_compcode,
otherwise it may be zero just because of initialization.
Add checks for req->ir_compcode errors in ipmi_reset_watchdog() and
ipmi_set_watchdog(). In first case explicitly check for 0x80, which
means timer was not previously set, that I found happening after BMC
cold reset. This change makes watchdog timer to recover instead of
permanently ignoring reset errors after BMC reset or upgraded.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Use of smp_rendezvous_cpus() instead of sched_bind() allows to not
block indefinitely if target CPU is running some thread with higher
priority, while all we need is single rdmsr/wrmsr instruction call.
I guess it should also be much cheaper than full thread migration.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
This ensures the TOE has finished processing any in-flight received
data before returning to the caller. The caller assumes it is safe to
free any open tasks or transfers (and associated buffers) after this
function returns.
Previously, data placed directly via DDP could be written to buffers
after the caller had freed the buffers.
Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications
After b48a2770d4, static POWER8 definitions became unnecessary,
as all of them (and much more) are already present in libpmc's
PMU events.
Submitted by: Leonardo Bianconi <leonardo.bianconi@eldorado.org.br> (initial version)
Reviewed by: kbowling, mhorne
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D31334
mac_veriexec sets its version to 1, but the mac_veriexec_shaX modules which depend on it expect MAC_VERIEXEC_VERSION = 2.
Be consistent and use MAC_VERIEXEC_VERSION everywhere.
This unbreaks loading of mac_veriexec modules at boot time.
Authored by: Kornel Duleba <mindal@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D31268
ALTQ only works on network drivers which use if_start (rather than
if_transmit). vtnet uses if_start if built with VTNET_LEGACY_TX. Default
to that the kernel is built with ALTQ enabled, to reduce user surprise.
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
As the Processor Version Register (PVR) is a 32-bit PowerPC
register, change mfpvr() return type to match it and avoid
type casts on its callers.
Suggested by: jhibbits
Reviewed by: jhibbits, imp
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D31332
xen_vector_callback_enabled is x86 specific and availability of
per-cpu event channel delivery differs on other architectures.
Introduce a new helper to check if there's support for per-cpu event
channel injection.
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29402
Presently suspend/resume and migration aren't supported on Xen/ARM. As
such this shouldn't ever occur.
This likely applies to future Xen architectures (RISC-V) and
xctrl_suspend() needs dependency on intr_machdep.h fixed.
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29599
This is no more or less than returning the smaller of two values. Since
this is what min() does, use that to shrink max_nr_grant_frames() down
to the single line.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29840
While x86 only register PV shutdown handler for PV guests. ARM guests
are always using HVM and requires the PV shutdown handler.
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29406
ARM guest is considered as HVM in Freebsd but they only support PV disk
(no emulation available).
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29403
ARM guest is considered as HVM but it only supports PV nics (no
emulation available).
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29405
For embedded devices reserved addresses will be known in advance. More
recently added devices will also likely be correctly updated. As a
result using any available address is reasonable on non-x86.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29304
Commit 1522652230 was implemented strictly for x86. Unfortunately
one of the pieces was mixed into a common area breaking other
architectures. For now disable these bits on !x86, this should be
cleaned up later.
Fixes: 1522652230 ('xen: fix dropping bitmap IPIs during resume')
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29306
The requirements for pages shared with Xen/other VMs may vary from
architecture to architecture. As such create a macro which various
architectures can use.
Remove a use of PAT_WRITE_BACK in xenstore.c. This is a x86-ism which
shouldn't have been present in a common area.
Original idea: Julien Grall <julien@xen.org>, 2014-01-14 06:44:08
Approach suggested by: royger
Reviewed by: royger, mhorne
Differential Revision: https://reviews.freebsd.org/D29351
Minor changes are necessary to make this processor-independent, but
moving the file out of x86 and into common is the first step (so
others don't add /more/ x86-isms).
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29042
While at it only output driver version to dmesg(8) when hardware is present.
Differential Revision: https://reviews.freebsd.org/D29100
MFC after: 1 week
Reviewed by: kib and markj
Sponsored by: NVIDIA Networking
Overview:
This is the first stage of a RDMA stack upgrade introducing kernel
changes only based on Linux 5.7-rc1.
This patch is based on about four main areas of work:
- Update of the IB uobjects system:
- The memory holding so-called AH, CQ, PD, SRQ and UCONTEXT objects
is now managed by ibcore. This also require some changes in the
kernel verbs API. The updated verbs changes are typically about
initialize and deinitialize objects, and remove allocation and
free of memory.
- Update of the uverbs IOCTL framework:
- The parsing and handling of user-space commands has been
completely refactored to integrate with the updated IB uobjects
system.
- Various changes and updates to the generic uverbs interfaces in
device drivers including the new uAPI surface.
- The mlx5_ib_devx.c in mlx5ib and related mlx5 core changes.
Dependencies:
- The mlx4ib driver code has been updated with the minimum changes
needed.
- The mlx5ib driver code has been updated with the minimum changes
needed including DV support.
Compatibility:
- All user-space facing APIs are backwards compatible after this
change.
- All kernel-space facing RDMA APIs are backwards compatible after
this change, with exception of ib_create_ah() and ib_destroy_ah()
which takes a new flag.
- The "ib_device_ops" structure exist, but only contains the driver ID
and some structure sizes.
Differences from Linux:
- Infiniband drivers must use the INIT_IB_DEVICE_OPS() macro to set
the sizes needed for allocating various IB objects, when adding
IB device instances.
Security:
- PRIV_NET_RAW is needed to use raw ethernet transmit features.
- PRIV_DRIVER is needed to use other privileged operations.
Based on upstream Linux, Torvalds (5.7-rc1):
8632e9b5645bbc2331d21d892b0d6961c1a08429
MFC after: 1 week
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31149
Sponsored by: NVIDIA Networking
Buggy SD card drivers may attach and detach a mmc(4) driver instance in
quick succession. In this case mmc(4) must disestablish its intrhook
callback during detach. Thus, this change adds a call to
config_intrhook_drain(), which blocks or does nothing if the intrhook is
running or has already ran (the SD card was plugged in), and
disestablishes the hook if it hasn't ran yet (the SD card was not
plugged in).
PR: 254373
Reviewed by: imp, manu, markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31262
When using SDIO the block size if per function and most of the time
not equal to MMC_SECTOR_SIZE, fix sdio on dwmmc by setting the correct
block size in the mmc registers.
MFC after: 1 month
Sponsored by: Diablotin Systems
This make it easier for script to get the hardware on which they are running.
MFC after: 1 month
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D31205
Reviewed by: imp
Should be ok on powerpc: jhibbits (over irc)
Due to a mis-merge, the changes committed to libpmc never called
pmu_parse_event(), or set pm->pm_ev. However, this field shouldn't be
used to carry the actual pmc event code anyway, as it is expected to
contain the index into the pmu event array (otherwise, it breaks event
name lookup in pmclog_get_event()). Add a new MD field,
pm_md.pm_md_config, to pass the raw event code to arm64_allocate_pmc().
Additionally, the change made to pmc_md_op_pmcallocate was incorrect, as
this is a union, not a struct. Restore the proper padding size.
Reviewed by: luporl, ray, andrew
Fixes: 28dd6730a5 ("libpmc: enable pmu_utils on arm64")
Fixes: 8cc3815f02 ("hwpmc_arm64: accept raw event codes...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31221
These two ioctls are not part of the current version of OSS and were
considered obsolete. However, their behaviour is not the same as their
old one, so this implementation is specific to FreeBSD.
Older OSS versions had the MUTE ioctls take and return an integer with
a value of 0 or 1, which meant that the _whole_ mixer is unmuted or
muted respectively. In my implementation, the ioctl takes and returns
a bitmask that tells us which devices are muted.
This allows us to mute and unmute only the devices we want, instead of the
whole mixer. The bitmask works the same way as in DEVMASK, RECMASK and
RECSRC.
Integrated the hardware volume feature with the new mute system.
Submitted by: Christos Margiolis <christos@freebsd.org>
Differential Revision: https://reviews.freebsd.org/D31130
MFC after: 1 week
Sponsored by: NVIDIA Networking
Currently we use the num-viewports property to decide how many outbound
regions there are we can use, defaulting to 2. However, Linux has
stopped using that and so it no longer appears in new device trees, such
as for the SiFive FU740. Instead, it's possible to just probe the
hardware directly.
Reviewed by: mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31030
This supersedes the old legacy mode where a viewport register was used
to mux multiple regions behind a single set of registers, and is used on
the SiFive FU740.
Reviewed by: mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31029
Currently we assume there is only one memory and one prefetch memory
window, and ignore the latter. However, the SiFive FU740 has two normal
memory windows.
As part of this, the viewports are rearranged. Previously the viewports
were memory, config then optionally I/O. Both to simplify the config
index calculation and to ensure it can always be mapped even if we have
too many memory windows for the number of viewports, config is moved to
being the first viewport.
This generalisation now also naturally supports mapping prefetch memory
windows.
Reviewed by: mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31028
Note that currently Linux's device tree uses the FU540's compatible
string, as does upstream U-Boot, but the U-Boot shipped with the board
based on an older patch series has the correct FU740 name. Thankfully
they are the same, at least as far as software is concerned.
Whilst here, fix a style(9) nit.
Reviewed by: philip, kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31034
The intention here is to reduce differences between em, igb, igc, ixgbe.
The main functional change is logical simplification in igb_rx_checksum
and getting interface caps from scctx instead of the ifp.
Reviewed by: gallatin, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30073
If a data PDU encounters an error such as a digest error, the firmware
will report that data PDU when completion moderation is active even if
it is not the final data PDU in a burst.
Sponsored by: Chelsio Communications
A non-placed PDU can be delivered by CPL_RX_ISCSI_CMP in the middle of
a burst of placed PDUs (received via DDP) in which case the rcv_nxt
will not match the start of the non-placed PDU.
Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications
The intention here is to reduce differences with D30072.
The only functional change is logical simplification in
ixgbe_rx_checksum.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D30074
It can be useful for system operators to see this kind of information
when correlating issues or requesting support from the OEM or Intel for
hardware and firmware issues.
Reviewed by: gallatin
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30178
This fixes a kernel panic when probing for vmd_bus on Intel TigerLake on
14-CURRENT. Apparently, vmd_bus is a type of PCI bus, but was registered
as a separate device class.
PR: 256915
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D31071
To guard against the ill effects of a spurious interrupt during
construction (or one that was bogusly pending), enable interrupts after
the qpair is completely constructed. Otherwise, we can die with null
pointer dereferences in nvme_qpair_process_completions. This has been
observed in at least one pre-release NVMe drive where the MSIX interrupt
fired while the queue was being created, before we'd started the NVMe
controller card.
The alternative of only turning on the interrupts after the rest was
tried, but was insufficient to work around this bug and made the code
more complicated w/o benefit.
Reviewed by: mav, chuck
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31182