Commit graph

291 commits

Author SHA1 Message Date
Vincenzo Maffione
ee0005f11f netmap: simplify parameter passing
Changes imported from the netmap github.
2021-01-24 21:59:02 +00:00
Vincenzo Maffione
3005e10ddb netmap: vtnet: fix RX initialization after netmap_reset()
At device reset, we must not publish those netmap receive buffers
that are owned by userspace (nm_kr_rxspace).

MFC after:	1 week
2021-01-11 21:38:32 +00:00
Vincenzo Maffione
55f0ad5fde netmap: restore hwofs and support it in iflib
Restore the hwofs functionality temporarily disabled by
7ba6ecf216 to prevent issues with iflib.
This patch brings the necessary changes to iflib to
enable howfs to allow interface restarts without
disrupting netmap applications actively using its
rings.
After this change, it becomes possible for multiple
non-cooperating netmap applications to use non-overlapping
subsets of the available netmap rings without clashing
with each other.

PR:		252453
MFC after:	1 week
2021-01-10 22:51:15 +00:00
Vincenzo Maffione
bb714db6d3 netmap: vtnet: enable/disable krings on any interface reinit
See 3d65fd97e8 for a detailed explanation.

PR:             252453
MFC after:      1 week
2021-01-10 14:10:09 +00:00
Vincenzo Maffione
9ac59d42c0 netmap: vtnet: stop krings during interface reset
Similarly to what done for iflib in 1d238b07d5,
this patch prevents access to the krings during the interface
reset triggered by netmap_register().

MFC after:	1 week
2021-01-09 22:34:52 +00:00
Vincenzo Maffione
7ba6ecf216 netmap: refactor netmap_reset
The netmap_reset() function is meant to be called by the driver
when they initialize (or re-initialize) a hardware ring.
However, since the introduction of support for opening (in
netmap mode) a subset of the available rings, netmap_reset()
may be called multiple times on actively used rings, causing
both kring and netmap ring to transition to an inconsistent
state.
This changes improves the situation by resetting all the
indices fields of the kring to 0, as expected after the
reinitialization of a hardware ring.

PR:	    252518
MFC after:  1 week
2021-01-09 22:07:24 +00:00
Vincenzo Maffione
1d238b07d5 netmap: iflib: stop krings during interface reset
When different processes open separate subsets of the
available rings of a same netmap interface, a device
reset may be performed while one of the processes
is actively using some rings (e.g., caused by another
process executing a nmport_open()).
With this patch, such situation will cause the
active process to get a POLLERR, so that it can
have a chance to detect the situation.
We also guarantee that no process is running a txsync
or rxsync (ioctl or poll) while an iflib device reset
is in progress.

PR:	    252453
MFC after:  1 week
2021-01-09 21:01:46 +00:00
Vincenzo Maffione
174f809da5 netmap: fix mutex double unlock bug
https://github.com/luigirizzo/netmap/pull/733

Submitted by:	 brian90013
MFC after:	3 days
2020-10-22 20:21:11 +00:00
Vincenzo Maffione
b7d6913862 netmap: use FreeBSD guards for epoch calls
EPOCH calls are FreeBSD specific. Use guards to protect these, so
that the code can compile under Linux.

MFC after:	1 week
2020-08-24 20:28:21 +00:00
Vincenzo Maffione
ff48ef48ac netmap: fix parsing of legacy nmr->nr_ringid
Code was checking for NETMAP_{SW,HW}_RING in req->nr_ringid which
had already been masked by NETMAP_RING_MASK. Therefore, the comparisons
always failed and set NR_REG_ALL_NIC. Check against the original nmr
structure.

Submitted by:	bpoole@packetforensics.com
Reported by:	bpoole@packetforensics.com
Reviewed by:	giuseppe.lettieri@unipi.it
Approved by:	vmaffione
MFC after:	1 week
2020-08-18 08:03:28 +00:00
Vincenzo Maffione
16f224b5f8 netmap: vtnet: fix races in vtnet_netmap_reg()
The nm_register callback needs to call nm_set_native_flags()
or nm_clear_native_flags() once the device has been stopped.
However, in the current implementation this is not true,
as the device is stopped by vtnet_init_locked(). This causes
race conditions where the driver crashes as soon as it
dequeues netmap buffers assuming they are mbufs (or the other
way around).
To fix the issue, we extend vtnet_init_locked() with a second
argument that, if not zero, will set/clear the netmap flags.
This results in a huge simplification of the nm_register
callback itself.
Also, use netmap_reset() to check if a ring is going to be
re-initialized in netmap mode.

MFC after:	1 week
2020-06-14 20:47:31 +00:00
Vincenzo Maffione
6682323732 netmap: introduce netmap_kring_on()
This function returns NULL if the ring identified by
queue id and direction is in netmap mode. Otherwise
return the corresponding kring.
Use this function to replace vtnet_netmap_queue_on().

MFC after:	1 week
2020-06-11 20:35:28 +00:00
Vincenzo Maffione
e8c07b1246 netmap: vtnet: clean up rxsync disabled logs
MFC after:	1 week
2020-06-03 17:47:32 +00:00
Vincenzo Maffione
1b6d5a80a6 netmap: vtnet: fix race condition in rxsync
This change prevents a race that happens when rxsync dequeues
N-1 rx packets (with N being the size of the netmap rx ring).
In this situation, the loop exits without re-enabling the
rx interrupts, thus causing the VQ to stall.

MFC after:	1 week
2020-06-03 17:46:21 +00:00
Vincenzo Maffione
2d769e25b1 netmap: vtnet: add vtnrx_nm_refill index to receive queues
The new index tracks the next netmap slot that is going
to be enqueued into the virtqueue. The index is necessary
to prevent the receive VQ and the netmap rx ring from going
out of sync, considering that we never enqueue N slots, but
at most N-1. This change fixes a bug that causes the VQ
and the netmap ring to go out of sync after N-1 packets
have been received.

MFC after:	1 week
2020-06-03 17:42:17 +00:00
Vincenzo Maffione
06f6997eb5 netmap: vale: fix disabled logs
MFC after:	1 week
2020-06-03 05:49:19 +00:00
Vincenzo Maffione
81d2cade1c netmap: vtnet: remove leftover memory barriers
MFC after:	1 week
2020-06-03 05:48:42 +00:00
Vincenzo Maffione
9ec71596c0 netmap: if_vtnet: avoid netmap ring wraparound
netmap assumes the one "slot" is left unused to distinguish
the empty ring and full ring conditions. This assumption was
violated by vtnet_netmap_rxq_populate().

MFC after:	1 week
2020-06-01 16:14:29 +00:00
Vincenzo Maffione
36f2d67026 netmap: if_vtnet: replace vtnet_free_used()
The functionality contained in this function is duplicated,
as it is already available in vtnet_txq_free_mbufs()
and vtnet_rxq_free_mbufs().

MFC after:	1 week
2020-06-01 16:12:09 +00:00
Vincenzo Maffione
c9de157d36 netmap: vtnet: fix RX virtqueue initialization bug
The vtnet_netmap_rxq_populate() function erroneously assumed
that kring->nr_hwcur = 0, i.e. the kring was in the initial
state. However, this is not always the case: for example,
when a vtnet reinit is triggered by some changes in the
interface flags or capenable.
This patch changes the behaviour of vtnet_netmap_kring_refill()
so that it always starts publishing the netmap buffers starting
from the current value of kring->nr_hwcur.

MFC after:	1 week
2020-06-01 16:10:44 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Gleb Smirnoff
6c3e93cb5a Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process
incoming packets in taskqueue context.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D23518
2020-02-11 18:57:07 +00:00
Vincenzo Maffione
723180da59 netmap: improve netmap(4) and vale(4) man pages
Clean up obsolete sysctl descriptions and add missing ones.

PR:		243838
Reviewed by:	bcr
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23546
2020-02-07 19:26:26 +00:00
Vincenzo Maffione
de27b30340 netmap_mem_unmap: fix NULL pointer dereference
MFC after:	3 days
2020-01-26 21:34:46 +00:00
Gleb Smirnoff
a44700782e In netmap() call ether_input() within the network epoch. 2020-01-23 01:35:02 +00:00
Vincenzo Maffione
2ec213aba4 netmap: disable passthrough with no hypervisor support
The netmap passthrough subsystem requires proper support in the
hypervisor. In particular, two PCI device ids (from the Red Hat
PCI vendor id 0x1b36) need to be assigned to the two netmap
virtual devices. We then disable these devices until the ids have
not been assigned, in order to avoid conflicts with other
virtual devices emulated by upstream QEMU.

PR:	241774
MFC after:	3 days
2020-01-13 21:47:23 +00:00
Jeff Roberson
3cf3b4e641 Make page busy state deterministic on free. Pages must be xbusy when
removed from objects including calls to free.  Pages must not be xbusy
when freed and not on an object.  Strengthen assertions to match these
expectations.  In practice very little code had to change busy handling
to meet these rules but we can now make stronger guarantees to busy
holders and avoid conditionally dropping busy in free.

Refine vm_page_remove() and vm_page_replace() semantics now that we have
stronger guarantees about busy state.  This removes redundant and
potentially problematic code that has proliferated.

Discussed with:	markj
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D22822
2019-12-22 06:56:44 +00:00
Vincenzo Maffione
c7c7805531 add valectl to the system commands
The valectl(4) program is used to manage vale(4) switches.
Add it to the system commands so that it can be used right away.
This program was previously called vale-ctl, and stored in
tools/tools/netmap

Reviewed by:	hrs, bcr, lwhsu, kevans
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22146
2019-10-31 21:01:34 +00:00
Vincenzo Maffione
484456b2d8 netmap: enter NET_EPOCH on generic txsync
After r353292, netmap generic adapter on if_vlan interfaces panics on
asserting the NET_EPOCH. In more detail, this happens when
nm_os_generic_xmit_frame() is called, that is in the generic txsync
routine.
Fix the issue by entering the NET_EPOCH during the generic txsync.
We amortize the cost of entering/exiting over a whole batch of
transmissions.

PR:		241489
Reported by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
2019-10-28 19:00:27 +00:00
Vincenzo Maffione
760fa2ab5d netmap: minor misc improvements
- use ring->head rather than ring->cur in lb(8)
 - use strlcat() rather than strncat()
 - fix bandwidth computation in pkt-gen(8)

MFC after:	1 week
2019-10-20 14:15:45 +00:00
Vincenzo Maffione
f8bc74e2f4 tap: add support for virtio-net offloads
This patch is part of an effort to make bhyve networking (in particular TCP)
faster. The key strategy to enhance TCP throughput is to let the whole packet
datapath work with TSO/LRO packets (up to 64KB each), so that the per-packet
overhead is amortized over a large number of bytes.
This capability is supported in the guest by means of the vtnet(4) driver,
which is able to handle TSO/LRO packets leveraging the virtio-net header
(see struct virtio_net_hdr and struct virtio_net_hdr_mrg_rxbuf).
A bhyve VM exchanges packets with the host through a network backend,
which can be vale(4) or if_tap(4).
While vale(4) supports TSO/LRO packets, if_tap(4) does not.
This patch extends if_tap(4) with the ability to understand the virtio-net
header, so that a tapX interface can process TSO/LRO packets.
A couple of ioctl commands have been added to configure and probe the
virtio-net header. Once the virtio-net header is set, the tapX interface
acquires all the IFCAP capabilities necessary for TSO/LRO.

Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D21263
2019-10-18 21:53:27 +00:00
Jeff Roberson
0012f373e4 (4/6) Protect page valid with the busy lock.
Atomics are used for page busy and valid state when the shared busy is
held.  The details of the locking protocol and valid and dirty
synchronization are in the updated vm_page.h comments.

Reviewed by:    kib, markj
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:        https://reviews.freebsd.org/D21594
2019-10-15 03:45:41 +00:00
Mark Johnston
fee2a2fa39 Change synchonization rules for vm_page reference counting.
There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator.  In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well.  These references are protected by the page lock, which must
therefore be acquired for many per-page operations.  This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter.  A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held.  As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed.  The former
requires that either the object lock or the busy lock is held.  The
latter no longer has a return value and may free the page if it releases
the last reference to that page.  vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate.  vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold().  It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler.  vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state).  In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths.  In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock.  In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped.  The DRM ports have been updated to
accomodate the KPI changes.

Reviewed by:	jeff (earlier version)
Tested by:	gallatin (earlier version), pho
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20486
2019-09-09 21:32:42 +00:00
Vincenzo Maffione
253b2ec199 netmap: import changes from upstream (SHA 137f537eae513)
- Rework option processing.
 - Use larger integers for memory size values in the
   memory management code.

MFC after:	2 weeks
2019-09-01 14:47:41 +00:00
Vincenzo Maffione
df4e516f0f netmap: remove obsolete file
The netmap_pt.c module has become obsolete after
the refactoring that added netmap_kloop.c.
Remove it and unlink it from the build system.

MFC after:	1 week
2019-08-25 20:16:03 +00:00
Vincenzo Maffione
d7143780ce netmap: fix bug introduced by r349752
r349752 introduced a NULL pointer reference bug
in the emulated netmap code.

Reported by:	lwhsu
MFC after:	3 days
2019-07-13 08:08:25 +00:00
Vincenzo Maffione
5d47236b18 netmap: Remove pointer leakage in netmap_mem2.c
PR:		238641
Submitted by:	Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed by:	vmaffione
MFC after:	1 week
2019-07-04 21:31:49 +00:00
Vincenzo Maffione
5fe59a51dd netmap: fix kernel pointer printing in netmap_generic.c
Print the adapter name rather than the address of the adapter
to avoid kernel address leakage.

PR:		Bug 238642
Submitted by:	Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed by:	vmaffione
MFC after:	1 week
2019-07-04 21:11:45 +00:00
Vincenzo Maffione
23ced94451 netmap: fix two panics with emulated adapter
This patch fixes 2 panics. The first one is due to the current VNET not
being set in the emulated adapter transmission path. The second one
is caused by the M_PKTHDR flag not being set when preallocated mbufs
are recycled in the transmit path.

Submitted by:	aleksandr.fedorov@itglobal.com
Reviewed by:	vmaffione
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20824
2019-07-01 20:37:35 +00:00
Conrad Meyer
04e0c883c5 Add two missing eventhandler.h headers
These are obviously missing from the .c files, but don't show up in any
tinderbox configuration (due to latent header pollution of some kind).  It
seems some configurations don't have this pollution, and the includes are
obviously missing, so go ahead and add them.

Reported by:	Peter Jeremy <peter AT rulingia.com>
X-MFC-With:	r347984
2019-05-21 00:04:19 +00:00
Vincenzo Maffione
d337c8c731 netmap: align if_ptnet to the changes introduced by r347233
This removes non-functional SCTP checksum offload support.
More information in the log message of r347233.

MFC after:	2 weeks
2019-05-17 20:29:31 +00:00
Vincenzo Maffione
d12354a56c netmap: add support for multiple host rings
Some applications forward from/to host rings most or all the
traffic received or sent on a physical interface. In this
cases it is desirable to have more than a pair of RX/TX host
rings, and use multiple threads to speed up forwarding.
This change adds support for multiple host rings. On registering
a netmap port, the user can specify the number of desired receive
and transmit host rings in the nr_host_tx_rings and nr_host_rx_rings
fields of the nmreq_register structure.

MFC after:	2 weeks
2019-03-18 12:22:23 +00:00
Vincenzo Maffione
352a2062c9 netmap: remove redundant call to nm_set_native_flags()
This redundant call was introduced by mistake in r343772.

MFC after:	3 days
Sponsored by:	Sunny Valley Networks
2019-02-25 09:57:06 +00:00
Vincenzo Maffione
45100257c6 netmap: don't schedule kqueue notify task when kqueue is not used
This change adds a counter (kqueue_users) to keep track of how many
kqueue users are referencing a given struct nm_selinfo.
In this way, nm_os_selwakeup() can schedule the kevent notification
task only when kqueue is actually being used.
This is important to avoid wasting CPU in the common case where
kqueue is not used.

Reviewed by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D19177
2019-02-18 14:21:41 +00:00
Vincenzo Maffione
1ef2a88149 netmap: revert netmap_attach_ext() to pre-r343772
Reported by:	marius
MFC after:	1 week
2019-02-07 11:28:53 +00:00
Vincenzo Maffione
75f4f3ed51 netmap: refactor logging macros and pipes
Changelist:
    - Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr
      and nm_prlim, to avoid possible naming conflicts.
    - Add netmap_krings_mode_commit() helper function and use that
      to reduce code duplication.
    - Refactor pipes control code to export some functions that
      can be reused by the veth driver (on Linux) and epair(4).
    - Add check to reject API requests with version less than 11.
    - Small code refactoring for the null adapter.

MFC after:	1 week
2019-02-05 12:10:48 +00:00
Vincenzo Maffione
5faab77822 netmap: upgrade sync-kloop support
Add SYNC_KLOOP_MODE option, and add support for direct mode, where application
executes the TXSYNC and RXSYNC in the context of the ioeventfd wake up callback.

MFC after:	5 days
2019-02-02 22:39:29 +00:00
Vincenzo Maffione
19c4ec08ad netmap: fix lock order reversal related to kqueue usage
When using poll(), select() or kevent() on netmap file descriptors,
netmap executes the equivalent of NIOCTXSYNC and NIOCRXSYNC commands,
before collecting the events that are ready. In other words, the
poll/kevent callback has side effects. This is done to avoid the
overhead of two system call per iteration (e.g., poll() + ioctl(NIOC*XSYNC)).

When the kqueue subsystem invokes the kqueue(9) f_event callback
(netmap_knrw), it holds the lock of the struct knlist object associated
to the netmap port (the lock is provided at initialization, by calling
knlist_init_mtx).
However, netmap_knrw() may need to wake up another netmap port (or even
the same one), which means that it may need to call knote().
Since knote() needs the lock of the struct knlist object associated to
the to-be-wake-up netmap port, it is possible to have a lock order reversal
problem (AB/BA deadlock).

This change prevents the deadlock by executing the knote() call in a
per-selinfo taskqueue, where it is possible to hold a mutex.

Reviewed by:	aleksandr.fedorov_itglobal.com
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18956
2019-01-30 15:51:55 +00:00
Vincenzo Maffione
a56136a1ba netmap: add notifications on kloop stop
On sync-kloop stop, send a wake-up signal to the kloop, so that
waiting for the timeout is not needed.
Also, improve logging in netmap_freebsd.c.

MFC after:	3 days
2019-01-29 10:28:50 +00:00
Vincenzo Maffione
aa4dd64dfe netmap: fix crash with monitors and VALE ports
Crash report described here:
    https://github.com/luigirizzo/netmap/issues/583
Fixed by providing dummy sync callback in case it is missing.
2019-01-24 22:09:26 +00:00
Vincenzo Maffione
f79ba6d75b netmap: improvements to the netmap kloop (CSB mode)
Changelist:
    - Add the proper memory barriers in the kloop ring processing
      functions.
    - Fix memory barriers usage in the user helpers (nm_sync_kloop_appl_write,
      nm_sync_kloop_appl_read).
    - Fix nm_kr_txempty() helper to look at rhead rather than rcur. This
      is important since the kloop can read a value of rcur which is ahead
      of the value of rhead (see explanation in nm_sync_kloop_appl_write)
    - Remove obsolete ptnetmap_guest_write_kring_csb() and
      ptnet_guest_read_kring_csb(), and update if_ptnet(4) to use those.
    - Prepare in advance the arguments for netmap_sync_kloop_[tr]x_ring(),
      to make the kloop faster.
    - Provide kernel and user implementation for nm_ldld_barrier() and
      nm_ldst_barrier()

MFC after:	2 weeks
2019-01-23 14:51:36 +00:00
Vincenzo Maffione
8c9874f5b1 netmap: fix knote() argument to match the mutex state
The nm_os_selwakeup function needs to call knote() to wake up kqueue(9)
users. However, this function can be called from different code paths,
with different lock requirements.
This patch fixes the knote() call argument to match the relavant lock state.
Also, comments have been updated to reflect current code.

PR:	https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219846
Reported by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18876
2019-01-23 14:21:23 +00:00
Vincenzo Maffione
58e185425a netmap: fix txsync check in netmap poll
To check if txsync can be skipped, it is necessary to look for
unseen TX space. However, this means comparing ring->cur
against ring->tail, rather than ring->head against ring->tail
(like nm_ring_empty() does).
This change also adds some more comments to explain the optimization
performed at the beginning of netmap_poll().

MFC after:	3 days
Sponsored by:	Sunny Valley Networks
2018-12-22 16:23:42 +00:00
Vincenzo Maffione
e1ed1fbdea netmap: fix bug in netmap_poll() optimization
The bug was introduced by r339639, although it is present in the upstream
netmap code since 2015. It is due to resetting the want_rx variable to
POLLIN, rather than resetting it to POLLIN|POLLRDNORM.
It only affects select(), which uses POLLRDNORM. poll() is not affected,
because it uses POLLIN.
Also, it only affects FreeBSD, because Linux skips the optimization
implemented by the piece of code where the bug occurs.

MFC after:	3 days
Sponsored by:	Sunny Valley Networks
2018-12-22 15:15:45 +00:00
Vincenzo Maffione
77a2baf551 netmap: move buf_size validation code to its own function
This code validates the netmap buf_size against the interface MTU
and maximum descriptor size, to make sure the values are consistent.
Moving this functionality to its own function is needed because this
function is also called by Linux-specific code.

MFC after:	3 days
2018-12-21 11:50:14 +00:00
Vincenzo Maffione
c52382bd40 netmap: pipes: make sure both ends use the same number of slots 2018-12-21 11:32:55 +00:00
Vincenzo Maffione
dde885de95 netmap: fix warning in netmap_kloop.c
Reported by:	markj
MFC after:	3 days
2018-12-12 16:32:15 +00:00
Vincenzo Maffione
2605ddfce9 netmap: remove dead code obsoleted by iflib
The iflib subsystem implements netmap support in a driver-independent
way (sys/net/iflib.c). We can therefore remove the headers that
used to implement netmap support for all the drivers now supported
by iflib (em, igb, ixl, ixgbe, lem).

MFC after:	1 week
2018-12-07 11:47:42 +00:00
Vincenzo Maffione
89a9a5b5c9 netmap: netmap_transmit should honor bpf packet tap hook
This allows tcpdump to capture outbound kernel packets while
in netmap mode

Submitted by:	Marc de la Gueronniere <mdelagueronniere@verisign.com>
Reviewed by:	vmaffione
MFC after:	1 week
Sponsored by:	Verisign, Inc.
Differential Revision:	https://reviews.freebsd.org/D17896
2018-12-06 09:45:25 +00:00
Vincenzo Maffione
b6e66be22b netmap: align codebase to the current upstream (760279cfb2730a585)
Changelist:
  - Replace netmap passthrough host support with a more general
    mechanism to call TXSYNC/RXSYNC from an in-kernel event-loop.
    No kernel threads are used to use this feature: the application
    is required to spawn a thread (or a process) and issue a
    SYNC_KLOOP_START (NIOCCTRL) command in the thread body. The
    kernel loop is executed by the ioctl implementation, which returns
    to userspace only when a different thread calls SYNC_KLOOP_STOP
    or the netmap file descriptor is closed.
  - Update the if_ptnet driver to cope with the new data structures,
    and prune all the obsolete ptnetmap code.
  - Add support for "null" netmap ports, useful to allocate netmap_if,
    netmap_ring and netmap buffers to be used by specialized applications
    (e.g. hypervisors). TXSYNC/RXSYNC on these ports have no effect.
  - Various fixes and code refactoring.

Sponsored by:	Sunny Valley Networks
Differential Revision:	https://reviews.freebsd.org/D18015
2018-12-05 11:57:16 +00:00
Vincenzo Maffione
d55913f555 netmap: set IFCAP_NETMAP in if_capabilities
Revision r307394 removed (by mistake) the code that sets IFCAP_NETMAP
in if_capabilities on netmap_attach. This patch reverts this change.

Reviewed by:	np
Approved by:	gnn (mentor)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D17987
2018-11-28 14:07:34 +00:00
Vincenzo Maffione
2e42b74a6f vtnet: fix netmap support
netmap(4) support for vtnet(4) was incomplete and had multiple bugs.
This commit fixes those bugs to bring netmap on vtnet in a functional state.

Changelist:
  - handle errors returned by virtqueue_enqueue() properly (they were
    previously ignored)
  - make sure netmap XOR rest of the kernel access each virtqueue.
  - compute the number of netmap slots for TX and RX separately, according to
    whether indirect descriptors are used or not for a given virtqueue.
  - make sure sglist are freed according to their type (mbufs or netmap
    buffers)
  - add support for mulitiqueue and netmap host (aka sw) rings.
  - intercept VQ interrupts directly instead of intercepting them in txq_eof
    and rxq_eof. This simplifies the code and makes it easier to make sure
    taskqueues are not running for a VQ while it is in netmap mode.
  - implement vntet_netmap_config() to cope with changes in the number of queues.

Reviewed by:	bryanv
Approved by:	gnn (mentor)
MFC after:	3 days
Sponsored by:	Sunny Valley Networks
Differential Revision:	https://reviews.freebsd.org/D17916
2018-11-14 15:39:48 +00:00
Bjoern A. Zeeb
be01db051f Remove redundant redeclaration of netmap_vp_reg().
This should unbreak sparc64 and powerpc LINT builds.
2018-10-24 14:14:49 +00:00
Vincenzo Maffione
2a7db7a63d netmap: align codebase to the current upstream (sha 8374e1a7e6941)
Changelist:
    - Move large parts of VALE code to a new file and header netmap_bdg.[ch].
      This is useful to reuse the code within upcoming projects.
    - Improvements and bug fixes to pipes and monitors.
    - Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to
      handle differences between FreeBSD and Linux.
    - Introduce some new helper functions to handle more host rings and fake
      rings (netmap_all_rings(), netmap_real_rings(), ...)
    - Added new sysctl to enable/disable hw checksum in emulated netmap mode.
    - nm_inject: add support for NS_MOREFRAG

Approved by:	gnn (mentor)
Differential Revision:	https://reviews.freebsd.org/D17364
2018-10-23 08:55:16 +00:00
David Bright
53e992cfb9 Fix several memory leaks.
The libkqueue tests have several places that leak memory by using an
idiom like:

puts(kevent_to_str(kevp));

Rework to save the pointer returned from kevent_to_str() and then
free() it after it has been used.

Reported by:	asomers (pointer to Coverity), Coverity
CID:		1296063, 1296064, 1296065, 1296066, 1296067, 1350287, 1394960
Sponsored by:	Dell EMC
2018-08-14 19:12:45 +00:00
Matt Macy
24a7d6d3a6 netmap and iflib drivers, silence unused var warnings 2018-05-19 05:57:26 +00:00
Matt Macy
3535fae847 netmap: compare e1 with e2, not with itself 2018-05-19 05:37:18 +00:00
Matt Macy
cfa866f6a1 netmap: pull fix for 32-bit support from upstream
Approved by:	sbruno
2018-05-18 03:38:17 +00:00
Brooks Davis
1315f9b59f Fix build on 32-bit systems. 2018-04-13 19:43:23 +00:00
Vincenzo Maffione
2ff91c175e netmap: align codebase to the current upstream (commit id 3fb001303718146)
Changelist:
    - Turn tx_rings and rx_rings arrays into arrays of pointers to kring
      structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
      vtnet and ptnet drivers to cope with the change.
    - Generalize the nm_config() callback to accept a struct containing many
      parameters.
    - Introduce NKR_FAKERING to support buffers sharing (used for netmap
      pipes)
    - Improved API for external VALE modules.
    - Various bug fixes and improvements to the netmap memory allocator,
      including support for externally (userspace) allocated memory.
    - Refactoring of netmap pipes: now linked rings share the same netmap
      buffers, with a separate set of kring pointers (rhead, rcur, rtail).
      Buffer swapping does not need to happen anymore.
    - Large refactoring of the control API towards an extensible solution;
      the goal is to allow the addition of more commands and extension of
      existing ones (with new options) without the need of hacks or the
      risk of running out of configuration space.
      A new NIOCCTRL ioctl has been added to handle all the requests of the
      new control API, which cover all the functionalities so far supported.
      The netmap API bumps from 11 to 12 with this patch. Full backward
      compatibility is provided for the old control command (NIOCREGIF), by
      means of a new netmap_legacy module. Many parts of the old netmap.h
      header has now been moved to netmap_legacy.h (included by netmap.h).

Approved by:	hrs (mentor)
2018-04-12 07:20:50 +00:00
Vincenzo Maffione
4f80b14ce2 netmap: align codebase to upstream version v11.4
Changelist:
  - remove unused nkr_slot_flags
  - new nm_intr adapter callback to enable/disable interrupts
  - remove unused sysctls and document the other sysctls
  - new infrastructure to support NS_MOREFRAG for NIC ports
  - support for external memory allocator (for now linux-only),
    including linux-specific changes in common headers
  - optimizations within netmap pipes datapath
  - improvements on VALE control API
  - new nm_parse() helper function in netmap_user.h
  - various bug fixes and code clean up

Approved by:	hrs (mentor)
2018-04-09 09:24:26 +00:00
Vincenzo Maffione
46023447b6 netmap: align if_ptnet guest driver to the upstream code (commit 0e15788)
The change upgrades the driver to use the split Communication Status
Block (CSB) format. In this way the variables written by the guest
and read by the host are allocated in a different cacheline than
the variables written by the host and read by the guest; this is
needed to avoid cache thrashing.

Approved by:	hrs (mentor)
2018-04-04 21:31:12 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Pedro F. Giffuni
26c1d774b5 dev: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
2018-01-13 22:30:30 +00:00
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
Gleb Smirnoff
e8fd18f306 Shorten list of arguments to mbuf external storage freeing function.
All of these arguments are stored in m_ext, so there is no reason
to pass them in the argument list.  Not all functions need the second
argument, some don't even need the first one.  The second argument
lives in next cache line, so not dereferencing it is a performance
gain.  This was discovered in sendfile(2), which will be covered by
next commits.

The second goal of this commit is to bring even more flexibility
to m_ext mbufs, allowing to create more fields in m_ext, opaque to
the generic mbuf code, and potentially set and dereferenced by
subsystems.

Reviewed by:	gallatin, kbowling
Differential Revision:	https://reviews.freebsd.org/D12615
2017-10-09 20:35:31 +00:00
Luiz Otavio O Souza
4dd4446129 Restore the changes done in r313982: Replace zero with NULL for pointers.
Spotted by:	Harry Schmalzbauer
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-07-21 03:59:56 +00:00
Luiz Otavio O Souza
a02dbe4ca1 Do not allow the use of the loopback interface in netmap.
The generic support in netmap send the packets using if_transmit() and the
loopback do not support packets coming from if_transmit()/if_start().

This avoids the use of the loopback interface and the subsequent crash that
happens when the application send packets to the loopback interface.

Details in:	https://github.com/luigirizzo/netmap/issues/322
Reported by:	Vincenzo Maffione <v.maffione@gmail.com>
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-07-21 03:28:35 +00:00
Luiz Otavio O Souza
c3e9b4db8c Update the current version of netmap to bring it in sync with the github
version.

This commit contains mostly refactoring, a few fixes and minor added
functionality.

Submitted by:	Vincenzo Maffione <v.maffione at gmail.com>
Requested by:	many
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-06-12 22:53:18 +00:00
Pedro F. Giffuni
4d24901ac9 sys/dev: Replace zero with NULL for pointers.
Makes things easier to read, plus architectures may set NULL to something
different than zero.

Found with:	devel/coccinelle
MFC after:	3 weeks
2017-02-20 03:43:12 +00:00
Mark Johnston
4c55b4e8da Unbreak the gcc build of netmap.
This fixes several LINT targets.

Reviewed by:	Vincenzo Maffione
2017-02-14 21:36:18 +00:00
Sean Bruno
25a3341048 Fix panic on mb_free_ext() due to NULL destructor.
This used to happen because of the SET_MBUF_DESTRUCTOR() called
on unregif.

Submitted by:	Vincenzo Maffione <v.maffione@gmail.com>
2017-01-12 16:24:10 +00:00
Adrian Chadd
67ca1051e0 [netmap] call RLOCK /and/ RUNLOCK.
Reported by: olivier
2017-01-02 06:36:12 +00:00
Adrian Chadd
869d88787d [netmap] fix locking regressions
* Firmware oriented NICs may need to sleep in their configuration paths.
  Use RLOCK instead of WLOCK to allow this to again occur.

  This fixes netmap on cxgbe.

* Change the worker lock to a normal mutex rather than a spin lock.
  Drivers shouldn't be doing netmap work from the fast interrupt
  handlers, so it's not required to be a spinlock.

Submitted by:	luigi, Vincenzo Maffione <v.maffione@gmail.com>
Reviewed by:	jhb
2016-12-30 14:47:46 +00:00
Ed Maste
54c7693f2c netmap: add cast to fix powerpc64 LINT kernel
Attempt to fix powerpc64 LINT kernel broken by r308000. Netmap's use of
a uint64_t wchan seems odd, but in the interest of minimizing this
change just cast through uintptr_t to silence the compiler warning.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D8669
2016-11-30 02:00:30 +00:00
Sean Bruno
c371f1143c The buffer address is always overwritten in the extended descriptor format,
we have to refresh it ... always.  This fixes problems reported in NetMap
with em(4) devices after conversion to extended descriptor format in
svn r293331.

Submitted by:	luigi@
Reported by:	franco@opnsense.org
MFC after:	2 days
2016-10-28 13:37:58 +00:00
Luigi Rizzo
844a6f0c53 Various fixes for ptnet/ptnetmap (passthrough of netmap ports). In detail:
- use PCI_VENDOR and PCI_DEVICE ids from a publicly allocated range
  (thanks to RedHat)
- export memory pool information through PCI registers
- improve mechanism for configuring passthrough on different hypervisors
Code is from Vincenzo Maffione as a follow up to his GSOC work.
2016-10-27 09:46:22 +00:00
Ed Maste
984ff0d910 netmap: fix kernel build on GCC-using architectures
GCC produced a multiple declaration warning from the
SYSCTL_DECL(_dev_netmap).
2016-10-21 13:51:47 +00:00
Sepherosa Ziehau
ffaa5deb38 netmap: Unbreak LINT-VIMAGE building
Sponsored by:	Microsoft
2016-10-21 06:32:45 +00:00
Sepherosa Ziehau
e3f94e5133 netmap: Unbreak i386 LINT building
Sponsored by:	Microsoft
2016-10-21 06:05:16 +00:00
Luigi Rizzo
a2a7409151 remove stale and unused code from various files
fix build on 32 bit platforms
simplify logic in netmap_virt.h

The commands (in net/netmap.h) to configure communication with the
hypervisor may be revised soon.
At the moment they are unused so this will not be a change of API.
2016-10-18 16:18:25 +00:00
Luigi Rizzo
6ad42d71b2 remove trailing whitespace. No code changes. 2016-10-18 15:41:57 +00:00
Sean Bruno
225d33ff5c Restore svn r306772 that was overwritten by netmap import at svn r307394
#include <sys/selinfo.h> should be here as all drivers that support
netmap need to use this file regardless.
2016-10-18 14:48:41 +00:00
Luigi Rizzo
a9e644cd24 add two missing files for the netmap import 2016-10-16 15:22:17 +00:00
Luigi Rizzo
37e3a6d349 Import the current version of netmap, aligned with the one on github.
This commit, long overdue, contains contributions in the last 2 years
from Stefano Garzarella, Giuseppe Lettieri, Vincenzo Maffione, including:
+ fixes on monitor ports
+ the 'ptnet' virtual device driver, and ptnetmap backend, for
  high speed virtual passthrough on VMs (bhyve fixes in an upcoming commit)
+ improved emulated netmap mode
+ more robust error handling
+ removal of stale code
+ various fixes to code and documentation (some mixup between RX and TX
  parameters, and private and public variables)

We also include an additional tool, nmreplay, which is functionally
equivalent to tcpreplay but operating on netmap ports.
2016-10-16 14:13:32 +00:00
Sean Bruno
87de0cd185 Move netmap selinfo.h in to sensible location.
netmap_kern.h currently requires all drivers including it to include
selinfo.h.

Submitted by:	mmacy@nextbsd.org
Reviewed by:	gnn
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D5334
2016-10-06 17:54:34 +00:00
Eric Joyner
ff9b61ca07 Fix linker warnings (errors on gcc) that resulted from r304510.
The variables that are extern in the netmap header file should be
defined in ixl_txrx.c (the file that is included in both ixl(4)/ixlv(4),
not in the main driver source files.

Reported by:	ed@, dim@, ngie@
2016-09-01 01:08:18 +00:00
Jean-Sébastien Pédron
bd937497ea Consistently use device_t
Several files use the internal name of `struct device` instead of
`device_t` which is part of the public API. This patch changes all
`struct device *` to `device_t`.

The remaining occurrences of `struct device` are those referring to the
Linux or OpenBSD version of the structure, or the code is not built on
FreeBSD and it's unclear what to do.

Submitted by:	Matthew Macy <mmacy@nextbsd.org> (previous version)
Approved by:	emaste, jhibbits, sbruno
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D7447
2016-08-09 19:32:06 +00:00
Eitan Adler
cef367e6a1 Don't repeat the the word 'the'
(one manual change to fix grammar)

Confirmed With: db
Approved by: secteam (not really, but this is a comment typo fix)
2016-05-17 12:52:31 +00:00
Pedro F. Giffuni
453130d9bf sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00