Commit graph

120 commits

Author SHA1 Message Date
Stephen Hurd
3ab4a96085 Remove tx task spinning added in r333686
This caused issues with PASTE.  Just remove the reschedule since the DELAY()
should be enough for use cases such as pkt-gen which were failing before the
change.

Reported by:	Michio Honda
Sponsored by:	Limelight Networks
2018-06-08 21:49:19 +00:00
Eric Joyner
a06424ddd3 iflib: Record TCP checksum info in iflib when TCP checksum is requested
ixl(4) (when it switches over to using iflib) devices need the TCP header
length in order to do TCP checksum offload.

Reviewed by:	gallatin@, shurd@
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15558
2018-06-07 13:03:07 +00:00
Stephen Hurd
3e0e6330b5 iflib: mark irq allocation name parameter as constant
The *name parameter passed to iflib_irq_alloc_generic and
iflib_softirq_alloc_generic is never modified. Many places in code pass
string literals and thus should not be modified.

Mark the *name parameter as a const char * instead, so that we enforce
that the name is not modified before passing to bus_describe_intr()

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	kmacy
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15343
2018-05-29 21:56:39 +00:00
Matt Macy
6c3c319414 iflib: hold context lock across detach for drivers that need it 2018-05-29 18:03:43 +00:00
Eric Joyner
1d7ef1867a iflib: Add new shared flag: IFLIB_ADMIN_ALWAYS_RUN
ixl(4)'s nvmupdate utility expects the nvmupdate process to run
while the interface is down; these nvm update commands use the
admin queue, so the admin queue needs to be able to generate
interrupts and be processed while the interface is down.

So add a flag that ixl(4) sets that lets the entire admin task
run even when the interface is marked down/IFF_DRV_RUNNING isn't set.

With this change, nvmupdate should function like it did pre-iflib.

Reviewed by:	gallatin@, sbruno@
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15575
2018-05-26 00:46:08 +00:00
Matt Macy
f6cb0dea4c net: fix uninitialized variable warning 2018-05-19 19:00:04 +00:00
Matt Macy
46d0f824be net: fix set but not used 2018-05-19 05:27:49 +00:00
Matt Macy
2aa6f526de Fix !netmap build post r333686
Approved by:	sbruno
2018-05-16 22:25:47 +00:00
Stephen Hurd
5ee36c68e1 Work around lack of TX IRQs in iflib for netmap
When poll() is called via netmap, txsync is initially called,
and if there are no available buffers to reclaim, it waits for the driver
to notify of new buffers. Since the TX IRQ is generally not used in iflib
drivers, this ends up causing a timeout.

Work around this by having the reclaim DELAY(1) if it's initially unable
to reclaim anything, then schedule the tx task, which will spin by
continuously rescheduling itself until some buffers are reclaimed. In
general, the delay is enough to allow some buffers to be reclaimed, so
spinning is minimized.

Reported by:	Johannes Lundberg <johalun0@gmail.com>
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15455
2018-05-16 21:03:22 +00:00
Matt Macy
09f6ff4f1a iflib(9): Add support for cloning pseudo interfaces
Part 3 of many ...
The VPC framework relies heavily on cloning pseudo interfaces
(vmnics, vpc switch, vcpswitch port, hostif, vxlan if, etc).

This pulls in that piece. Some ancillary changes get pulled
in as a side effect.

Reviewed by:	shurd@
Approved by:	sbruno@
Sponsored by:	Joyent, Inc.
Differential Revision:	https://reviews.freebsd.org/D15347
2018-05-11 20:08:28 +00:00
Stephen Hurd
ac88e6da11 iflib: print message when iflib_tx_structures_setup fails
Print a message when iflib_tx_structures_setup fails, like we do for
iflib_rx_structures_setup.

Now that we always print a message from within
iflib_qset_structures_setup when it fails, stop printing one in
iflib_device_register() at the call site.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15300
2018-05-08 17:15:10 +00:00
Stephen Hurd
6108c01395 iflib: cleanup queues when iflib_device_register fail
Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15299
2018-05-08 16:56:02 +00:00
Andrew Gallatin
1f7ce05d1d Fix an off-by-one error when deciding to request a tx interrupt
The canonical check for whether or not a ring is drainable is
TXQ_AVAIL() > MAX_TX_DESC() + 2.  Use this same construct here,
in order to avoid a potential off-by-one error where we might otherwise
fail to request an interrupt.

Reviewed by:	mmacy
Sponsored by:	Netflix
2018-05-07 18:11:22 +00:00
Mark Johnston
9461882562 Add netdump support to iflib.
em(4) and igb(4) were tested by me, and ixgbe(4) and bnxt(4) were
tested by sbruno.

Reviewed by:	mmacy, shurd
MFC after:	1 month
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D15262
2018-05-06 00:57:52 +00:00
Matt Macy
1ae4848cd0 fix gcc8 warnings
Approved by:	sbruno
2018-05-04 18:57:05 +00:00
Stephen Hurd
b89827a052 iflib: fix invalid free during queue allocation failure
In r301567, code was added to cleanup to prevent memory leaks for the
Tx and Rx ring structs. This code carefully tracked txq and rxq, and
made sure to free them properly during cleanup.

Because we assigned the txq and rxq pointers into the ctx->ifc_txqs and
ctx->ifc_rxqs, we carefully reset these pointers to NULL, so that
cleanup code would not accidentally free the memory twice.

This was changed by r304021 ("Update iflib to support more NIC designs"),
which removed this resetting of the pointers to NULL, because it re-used
the txq and rxq pointers as an index into the queue set array.

Unfortunately, the cleanup code was left alone. Thus, if we fail to
allocate DMA or fail to configure the queues using the drivers ifdi
methods, we will attempt to free txq and rxq. These variables would now
incorrectly point to the wrong location, resulting in a page fault.

There are a number of methods to correct this, but ultimately the root
cause was that we reuse the txq and rxq pointers for two different
purposes.

Instead, when allocating, store the returned pointer directly into
ctx->ifc_txqs and ctx->ifc_rxqs. Then, assign this to txq and rxq as
index pointers before starting the loop to allocate each queue.
Drop the cleanup code for txq and rxq, and only use ctx->ifc_txqs and
ctx->ifc_rxqs.

Thus, we no longer need to free txq or rxq under any error flow, and
intsead rely solely on the pointers stored in ctx->ifc_txqs and
ctx->ifc_rxqs. This prevents the invalid free(), and ensures that we
still properly cleanup after ourselves as before when failing to
allocate.

Submitted by:	Jacob Keller
Reviewed by:	gallatin, sbruno
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D15285
2018-05-04 15:20:34 +00:00
Stephen Hurd
4d613f5d04 iflib: remove unused brscp pointer from iflib_queues_alloc
This pointer was no longer written to as of r315217. Since nothing writes
to the variable, remove it.

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	gallatin, kmacy, sbruno
Differential Revision:	https://reviews.freebsd.org/D15284
2018-05-04 15:11:16 +00:00
Stephen Hurd
aa8a24d37f Allow iflib NIC drivers to sleep rather than busy wait
Since the move to SMP NIC driver locking has had to go through serious
contortions using mtx around long running hardware operations. This moves
iflib past that.

Individual drivers may now sleep when appropriate.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14983
2018-05-03 17:02:31 +00:00
Andrew Gallatin
f759470750 Fix iflib_encap() EFBIG handling bugs
1) Don't give up if m_collapse() fails.  Rather than giving up, try
m_defrag() immediately.

2) Fix a leak where, if the NIC driver rejected the defrag'ed chain
as having too many segments, we would fail to free the chain.

Reviewed by:  Matthew Macy <mmacy@mattmacy.io> (this version of patch)
Submitted by: Matthew Macy <mmacy@mattmacy.io> (early version of leak fix)
2018-04-30 23:53:27 +00:00
Stephen Hurd
0b75ac7796 iflib: Fix queue distribution when there are no threads
Previously, if there are no threads, all queues which targeted
cores that share an L2 cache were bound to a single core. The intent is
to distribute them across these cores.

Reported by:	olivier
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15120
2018-04-18 15:34:18 +00:00
Sean Bruno
7b610b60df Restore r332389 after resolution of locking fixes.
Add one extra lock initialization to iflib_register() that was missed
in the git<->phab conversion.

Split out flag manipulation from general context manipulation in iflib

To avoid blocking on the context lock in the swi thread and risk potential
deadlocks, this change protects lighter weight updates that only need to
be consistent with each other with their own lock.

Submitted by:   Matthew Macy <mmacy@mattmacy.io>
Reviewed by:    shurd
Sponsored by:   Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D14967
2018-04-12 14:35:37 +00:00
Vincenzo Maffione
2ff91c175e netmap: align codebase to the current upstream (commit id 3fb001303718146)
Changelist:
    - Turn tx_rings and rx_rings arrays into arrays of pointers to kring
      structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
      vtnet and ptnet drivers to cope with the change.
    - Generalize the nm_config() callback to accept a struct containing many
      parameters.
    - Introduce NKR_FAKERING to support buffers sharing (used for netmap
      pipes)
    - Improved API for external VALE modules.
    - Various bug fixes and improvements to the netmap memory allocator,
      including support for externally (userspace) allocated memory.
    - Refactoring of netmap pipes: now linked rings share the same netmap
      buffers, with a separate set of kring pointers (rhead, rcur, rtail).
      Buffer swapping does not need to happen anymore.
    - Large refactoring of the control API towards an extensible solution;
      the goal is to allow the addition of more commands and extension of
      existing ones (with new options) without the need of hacks or the
      risk of running out of configuration space.
      A new NIOCCTRL ioctl has been added to handle all the requests of the
      new control API, which cover all the functionalities so far supported.
      The netmap API bumps from 11 to 12 with this patch. Full backward
      compatibility is provided for the old control command (NIOCREGIF), by
      means of a new netmap_legacy module. Many parts of the old netmap.h
      header has now been moved to netmap_legacy.h (included by netmap.h).

Approved by:	hrs (mentor)
2018-04-12 07:20:50 +00:00
Mateusz Guzik
66def52613 iflib: fix up a mismerge in r332419
Lead to crashes on boot while in ifconfig.

Submitted by: Matthew Macy <mmacy@mattmacy.io>
2018-04-12 04:11:37 +00:00
Stephen Hurd
90d72813d2 Properly initialize ifc_nhwtxqs.
Also, since ifc_nhwrxqs is only used in one place, remove it from the struct.
This was preventing iflib_dma_free() from being called via
iflib_device_detach().

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
2018-04-11 21:41:59 +00:00
Sean Bruno
7feb881918 Revert r332389 as it is causing panics for various users and we need
to add some more test cases.
2018-04-11 17:26:53 +00:00
Stephen Hurd
5c1d8c4b73 Split out flag manipulation from general context manipulation in iflib
To avoid blocking on the context lock in the swi thread and risk potential
deadlocks, this change protects lighter weight updates that only need to
be consistent with each other with their own lock.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14967
2018-04-10 19:48:24 +00:00
Brooks Davis
541d96aaaf Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size).  This is believed to be sufficent to
fully support ifconfig on 32-bit systems.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14900
2018-03-30 18:50:13 +00:00
Mark Johnston
18628b74bd Clamp IFLIB_RX_COPY_THRESH to MHLEN in iflib_rxd_pkt_get().
If one has added fields to struct mbuf such that MHLEN is smaller than
this threshold (128), iflib_rxd_pkt_get() may otherwise overrun the
internal mbuf buffer while copying.

Reviewed by:	mmacy
MFC after:	3 days
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14843
2018-03-25 23:23:19 +00:00
Stephen Hurd
226fb85d19 iflib: stop timer callout when stopping
iflib_timer has been seen running after the interface had been removed.
This change prevents that.

Submitted by:	matt.macy@joyent.com
2018-03-02 18:48:07 +00:00
Navdeep Parhar
7cb7c6e37a Catch up with the removal of nktr_slot_flags from upstream netmap. No
functional impact intended.

Submitted by:	Vincenzo Maffione <v.maffione@gmail.com>
2018-02-20 21:42:45 +00:00
Stephen Hurd
a4e5960730 IFLIB: do not remove dmamap on buffer unload
Dmamap is created only on IFC attach. If we remove it on
buffer release, we won't be able to do ifconfig down&up. Only destroy
when in detach.

Reported by:	wma
Reviewed by:	wma
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14060
2018-02-20 18:33:45 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Pedro F. Giffuni
443133416b net*: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:21:51 +00:00
Stephen Hurd
9c58cafaa3 Don't pass rids to taskqgroup_attach()
As everywhere else, we want to pass rman_get_start(irq->ii_res).  This
caused set affinity errors when not using MSI-X vectors (legacy and MSI
interrupts).

Reported by:	sbruno
Sponsored by:	Limelight Networks
2017-12-27 20:42:30 +00:00
Stephen Hurd
ca03863c85 Remove assertion that's not true for !EARLY_AP_STARTUP
gtask->gt_taskqueue is NULL when EARLY_AP_STARTUP is not enabled.
Remove assertion to allow this config to work.

Reported by:	oleg
Sponsored by:	Limelight Networks
2017-12-27 19:14:15 +00:00
Stephen Hurd
de13095409 Fix indentation.
Sponsored by:	Limelight Networks
2017-12-27 19:12:32 +00:00
Konstantin Belousov
97755e83f5 Fix build for kernels with SCHED_4BSD.
Sponsored by:	The FreeBSD Foundation
2017-12-21 23:05:13 +00:00
Stephen Hurd
25ac1dd5c7 Don't call tcp_lro_rx() unless hardware verified TCP/UDP csum
It seems that tcp_lro_rx() doesn't verify TCP checksums, so
if there are bad checksums in the packets caused by invalid data, the
invalid data will pass through without errors.

This was noticed with the igb driver and a specific internet host:
fetch http://www.mpfr.org/mpfr-current/mpfr-3.1.6.tar.xz -o test.bin && sha256 test.bin
Would result in a different value sometimes.

This ends up making LRO require RXCSUM to be enabled, and RXCSUM to
support TCP and UDP checksums.

PR:		224346
Reported by:	gjb
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13561
2017-12-21 01:22:36 +00:00
Li-Wen Hsu
40cf51c438 Add missing ;
Approved by:	kevlo
2017-12-20 06:08:16 +00:00
Stephen Hurd
b103855e18 Support attaching tx queues to cpus
This will attempt to use a different thread/core on the same L2
cache when possible, or use the same cpu as the rx thread when not.
If SMP isn't enabled, don't go looking for cores to use. This is mostly
useful when using shared TX/RX queues.

Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12446
2017-12-20 01:03:34 +00:00
Stephen Hurd
96fc97c81f Update Matthew Macy contact info
Email address has changed, uses consistent name (Matthew, not Matt)

Reported by:	Matthew Macy <mmacy@mattmacy.io>
Differential Revision:	https://reviews.freebsd.org/D13537
2017-12-19 17:59:00 +00:00
Stephen Hurd
06c47d48dc Increment encap_pad_mbuf_fail when m_dup() fails in padding
Previously, the counter was only incremented when m_append() failed.  Since
the function can also fail on m_dup() now, increment the counter there as
well.

Sponsored by:	Limelight Networks
2017-12-11 20:01:28 +00:00
Stephen Hurd
04993890dc Free mbuf chain when m_dup fails
Fix memory leak where mbuf chain wasn't free()d if iflib_ether_pad()
has a failure in m_dup().

Reported by:	"Ryan Stone" <rysto32@gmail.com>
Sponsored by:	Limelight Networks
2017-12-08 19:50:06 +00:00
Stephen Hurd
a15fbbb8fe Handle read-only mbufs in iflib ether pad function
If ethernet padding is enabled, and a read-only mbuf is passed,
it would modify the mbuf using m_append(). Instead, call m_dup() and
append to the new packet.

Reported by:	Pyun YongHyeon
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13414
2017-12-08 18:43:31 +00:00
Stephen Hurd
d14c853ba3 iflib: Support to padding Ethernet frames to a min size
Some bnxt devices do not correctly send frames smaller than
52 bytes (without CRC), so add a quirk that will pad frames to an
arbitrary size before passing off to the encap routine.

Reported by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13269
2017-12-05 21:00:31 +00:00
Stephen Hurd
fe1bcada9b Avoid calling CURVNET_[SET|RESTORE] for each packet
The LRO possible test was calling CURVNET_SET once for IPv4 or IPv6 for
each packet in a chain. Only call it once per chain instead.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	cem, ae
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13368
2017-12-05 20:43:24 +00:00
Stephen Hurd
a027c8e958 Add support for SIOCGIFXMEDIA to iflib
SIOCGIFXMEDIA is required for extended ethernet media types,
but iflib did not support it.

Reported by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13312
2017-12-01 17:58:20 +00:00
Stephen Hurd
772593dbd9 Fix comment introduced in r326369
The code uses the set of all CPUs, it doesn't zero out the set.

Sponsored by:	Limelight Networks
2017-11-29 18:21:17 +00:00
Stephen Hurd
e516b5353b Ensure that ctx->ifc_cpus is always initialized
If a device didn't support MSI-X, ctx->ifc_cpus would not be initialized,
but the IRQ allocation routines still uses the value.  Move the
initialization to common code.

Sponsored by:	Limelight Networks
2017-11-29 18:14:57 +00:00
Stephen Hurd
7274b2f6be Fix off-by-one error in bit_nclear() usage
bit_nclear() takes the bit numbers for the start and end bits, not the start
and a count.  This was resulting in memory corruption past the end of the
bitstr_t.

Sponsored by:	Limelight Networks
2017-11-20 21:57:04 +00:00