12 commits

Author SHA1 Message Date
Shailend Chand
3afdae4885 gve: Fix TX livelock
Before this change the transmit taskqueue would enqueue itself when it
cannot find space on the NIC ring with the hope that eventually space
would be made. This results in the following livelock that only occurs
after passing ~200Gbps of TCP traffic for many hours:

                        100% CPU
┌───────────┐wait on  ┌──────────┐         ┌───────────┐
│user thread│  cpu    │gve xmit  │wait on  │gve cleanup│
│with mbuf  ├────────►│taskqueue ├────────►│taskqueue  │
│uma lock   │         │          │ NIC ring│           │
└───────────┘         └──────────┘  space  └─────┬─────┘
     ▲                                           │
     │      wait on mbuf uma lock                │
     └───────────────────────────────────────────┘

Further details about the livelock are available on
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560.

After this change, the transmit taskqueue no longer spins till there is
room on the NIC ring. It instead stops itself and lets the
completion-processing taskqueue wake it up.
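
The stop-and-wake pattern described above can be sketched as a single-threaded toy model. This is only an illustration, not the driver's code: `tx_ring`, `tx_xmit`, and `tx_cleanup` are hypothetical names, and a real implementation would also need atomics or barriers so that the stop flag and the ring state are not raced between the two taskqueues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy ring; none of these names come from the gve driver. */
struct tx_ring {
	size_t capacity;   /* total descriptor slots */
	size_t in_flight;  /* slots posted but not yet completed */
	bool   stopped;    /* set by xmit when it finds the ring full */
	size_t wakeups;    /* how many times cleanup restarted xmit */
};

/*
 * Transmit side: post up to `pending` packets. When the ring is full,
 * mark the queue stopped and return instead of re-enqueueing to retry.
 * Returns the number of packets posted.
 */
static size_t
tx_xmit(struct tx_ring *r, size_t pending)
{
	size_t sent = 0;

	r->stopped = false;
	while (sent < pending) {
		if (r->in_flight == r->capacity) {
			r->stopped = true;  /* yield the CPU; cleanup wakes us */
			break;
		}
		r->in_flight++;
		sent++;
	}
	return (sent);
}

/*
 * Completion side: reclaim `done` slots and, if xmit had stopped itself,
 * wake it. Returns true when a wakeup was issued.
 */
static bool
tx_cleanup(struct tx_ring *r, size_t done)
{
	assert(done <= r->in_flight);
	r->in_flight -= done;
	if (r->stopped && done > 0) {
		r->stopped = false;
		r->wakeups++;  /* stands in for enqueueing the xmit task */
		return (true);
	}
	return (false);
}
```

The point of the sketch is that `tx_xmit` never loops waiting for space: it gives up the CPU, so the cleanup side (and anything cleanup is waiting on) can make progress.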

Since I'm touching the transmit taskqueue I've also corrected the name
of a counter and also fixed a bug where EINVAL mbufs were not being
freed and were instead living forever on the bufring.

Signed-off-by: Shailend Chand <shailend@google.com>
Reviewed-by: markj
MFC-after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D47138

(cherry picked from commit 40097cd67c0d52e2b288e8555b12faf02768d89c)
2024-11-20 21:41:08 +00:00
Shailend Chand
1bda36a393 gve: Add DQO QPL support
DQO is the descriptor format for our next generation virtual NIC.
It is necessary to make full use of the hardware bandwidth on many
newer GCP VM shapes.

This patch extends the previously introduced DQO descriptor format
with a "QPL" mode. QPL stands for Queue Page List and refers to
the fact that the hardware cannot access arbitrary regions of the
host memory and instead expects a fixed bounce buffer comprising
a list of pages.

The QPL aspects are similar to the already existing GQI queue
format, in that the mbufs being input in the Rx path have
external storage in the form of vm pages attached to them, and
in the Tx path we always copy the mbuf payload into QPL pages.

Signed-off-by: Shailend Chand <shailend@google.com>
Reviewed-by: markj
MFC-after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D46691

(cherry picked from commit 2348ac893d10f06d2d84e1e4bd5ca9f1c5da92d8)
2024-11-20 21:41:08 +00:00
Shailend Chand
c7aea09126 gve: Add DQO RDA support
DQO is the descriptor format for our next generation virtual NIC.
It is necessary to make full use of the hardware bandwidth on many
newer GCP VM shapes.

One major change with DQO from its predecessor GQI is that it uses
dual descriptor rings for both TX and RX queues.

The TX path uses a descriptor ring to send descriptors to HW, and
receives packet completion events on a TX completion ring.

The RX path posts buffers to HW using an RX descriptor ring and
receives incoming packets on an RX completion ring.
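
As a rough illustration of the dual-ring idea on the TX side, the sketch below pairs a descriptor ring with a completion ring. The struct layouts, `RING_SIZE`, and the function names are hypothetical and far simpler than the real DQO formats; `device_step` merely stands in for the hardware.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8  /* hypothetical; power of two so masking wraps indices */

/* Hypothetical layouts; the real DQO descriptor formats differ. */
struct tx_desc { uint16_t tag; uint16_t len; };
struct tx_cmpl { uint16_t tag; };

struct tx_queue {
	struct tx_desc desc[RING_SIZE];  /* driver -> device */
	struct tx_cmpl cmpl[RING_SIZE];  /* device -> driver */
	uint32_t desc_tail;  /* next descriptor slot the driver fills */
	uint32_t desc_head;  /* next descriptor the "device" consumes */
	uint32_t cmpl_tail;  /* next completion slot the "device" fills */
	uint32_t cmpl_head;  /* next completion the driver reads */
};

/* Driver: post one descriptor. Returns -1 when no slot is free. */
static int
txq_post(struct tx_queue *q, uint16_t tag, uint16_t len)
{
	if (q->desc_tail - q->cmpl_head == RING_SIZE)
		return (-1);  /* every slot is still awaiting a completion */
	q->desc[q->desc_tail & (RING_SIZE - 1)] = (struct tx_desc){ tag, len };
	q->desc_tail++;  /* a real driver would ring a doorbell here */
	return (0);
}

/* Simulated device: consume one descriptor and write its completion. */
static int
device_step(struct tx_queue *q)
{
	uint16_t tag;

	if (q->desc_head == q->desc_tail)
		return (0);
	assert(q->cmpl_tail - q->cmpl_head < RING_SIZE);
	tag = q->desc[q->desc_head & (RING_SIZE - 1)].tag;
	q->desc_head++;
	q->cmpl[q->cmpl_tail & (RING_SIZE - 1)].tag = tag;
	q->cmpl_tail++;
	return (1);
}

/* Driver: read one completion, if any. Returns 1 and sets *tag on success. */
static int
txq_reap(struct tx_queue *q, uint16_t *tag)
{
	if (q->cmpl_head == q->cmpl_tail)
		return (0);
	*tag = q->cmpl[q->cmpl_head & (RING_SIZE - 1)].tag;
	q->cmpl_head++;
	return (1);
}
```

In this model a descriptor slot is only reused once its completion has been reaped, which is why `txq_post` compares against `cmpl_head` rather than the device's private `desc_head`.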

In GQI-QPL, the hardware could not access arbitrary regions of
guest memory, which is why there was a pre-negotiated bounce buffer
(QPL: Queue Page List). DQO-RDA has no such limitation.

"RDA" is in contrast to QPL and stands for "Raw DMA Addressing" which
just means that HW does not need a fixed bounce buffer and can DMA
arbitrary regions of guest memory.

A subsequent patch will introduce the DQO-QPL datapath that uses the
same descriptor format as in this patch, but will have a fixed
bounce buffer.

Signed-off-by: Shailend Chand <shailend@google.com>
Reviewed-by: markj
MFC-after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D46690

(cherry picked from commit d438b4ef0cfc6986b93d0754f49ebf3ead50f269)
2024-11-20 21:41:08 +00:00
Zhenlei Huang
6b1f530935 net: Remove unneeded NULL check for the allocated ifnet
Change 4787572d05 made if_alloc_domain() never fail, so the wrappers
if_alloc(), if_alloc_dev(), and if_gethandle() never fail either.

No functional change intended.

Reviewed by:	kp, imp, glebius, stevek
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D45740

(cherry picked from commit aa3860851b9f6a6002d135b1cac7736e0995eedc)
2024-07-12 20:03:37 +08:00
Shailend Chand
224e20ceb1 gve: Make gve_free_qpls idempotent
This fixes a panic caused by double free.
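
A minimal sketch of what an idempotent free routine can look like, assuming a simplified state layout. `struct priv`, `struct qpl`, `alloc_qpls`, and `free_qpls` are hypothetical stand-ins and do not reproduce the driver's gve_free_qpls.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified driver state. */
struct qpl { void *pages; };
struct priv { struct qpl *qpls; size_t num_qpls; };

static int
alloc_qpls(struct priv *p, size_t n)
{
	assert(p->qpls == NULL);  /* must not already be allocated */
	p->qpls = calloc(n, sizeof(*p->qpls));
	if (p->qpls == NULL)
		return (-1);
	p->num_qpls = n;
	for (size_t i = 0; i < n; i++)
		p->qpls[i].pages = malloc(4096);  /* NULL is fine: free(NULL) is a no-op */
	return (0);
}

static void
free_qpls(struct priv *p)
{
	if (p->qpls == NULL)
		return;  /* already freed, or never allocated */
	for (size_t i = 0; i < p->num_qpls; i++) {
		free(p->qpls[i].pages);
		p->qpls[i].pages = NULL;
	}
	free(p->qpls);
	p->qpls = NULL;  /* clearing the pointer makes a second call a no-op */
	p->num_qpls = 0;
}
```

Clearing the pointer and count after freeing is what allows the routine to be called from more than one teardown path without a double free.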

PR:	kern/279410
Differential Revision: https://reviews.freebsd.org/D45489

(cherry picked from commit b81cbb12410b000074483899e61e9e767ba3ec1d)
2024-06-20 22:44:34 -07:00
Shailend Chand
04ada3cc2b gve: Make LRO work for jumbo packets
Each Rx descriptor points to a packet buffer of size 2K, which means
that MTUs greater than 2K see multi-descriptor packets. The TCP-hood of
such packets was being incorrectly determined by looking for a flag on
the last descriptor instead of the first descriptor.
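
To make the multi-descriptor case concrete, here is a small sketch that classifies a packet from its first descriptor only. The 2K buffer size comes from the text above; the flag bits, struct layouts, and function names are hypothetical and do not match the real descriptor format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE  2048u  /* per-descriptor buffer size, per the text */
#define RX_FLAG_TCP  0x1u   /* hypothetical flag bit */
#define RX_FLAG_MORE 0x2u   /* hypothetical "packet continues" bit */

struct rx_desc { uint16_t len; uint16_t flags; };
struct rx_pkt { size_t len; bool is_tcp; size_t ndesc; };

/* Number of 2K buffers needed to hold a frame of the given length. */
static size_t
rx_descs_needed(size_t frame_len)
{
	return ((frame_len + RX_BUF_SIZE - 1) / RX_BUF_SIZE);
}

/*
 * Walk the descriptors of one packet. The TCP decision is taken from the
 * first descriptor only; later descriptors just contribute length.
 */
static struct rx_pkt
rx_assemble(const struct rx_desc *d, size_t n)
{
	struct rx_pkt p = { 0, false, 0 };

	assert(d != NULL && n > 0);
	for (size_t i = 0; i < n; i++) {
		if (i == 0)
			p.is_tcp = (d[i].flags & RX_FLAG_TCP) != 0;
		p.len += d[i].len;
		p.ndesc++;
		if ((d[i].flags & RX_FLAG_MORE) == 0)
			break;  /* last descriptor of this packet */
	}
	return (p);
}
```

With a 9000-byte MTU, for example, `rx_descs_needed(9000)` is 5, so a check made on the last descriptor would look at a descriptor that need not carry the flag.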

Also fixed and bumped the version number.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D41754

(cherry picked from commit 5f62584a9adb7887bae33af617cfa4f43017abf8)
2023-09-14 03:50:15 -04:00
Shailend Chand
543cf924bc gve: Simplify tx loop over buffer ring
Reviewed by:	markj
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D41281
2023-08-12 01:01:53 -07:00
Shailend Chand
74861578d9 gve: Fix Tx tcpdump panic
Ringing the doorbell before making the BPF call can result in the
mbuf being freed before the BPF call.
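
The ordering hazard can be shown with a toy model in which the doorbell completes instantly and the packet is freed right away, which is the worst case for the driver. `struct pkt`, `ring_doorbell`, and `tap` are hypothetical stand-ins for the mbuf, the doorbell write plus completion processing, and the BPF hook respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf. */
struct pkt { bool freed; int taps; };

/* Worst case: the device completes at once and cleanup frees the packet. */
static void
ring_doorbell(struct pkt *p)
{
	p->freed = true;
}

/* Stand-in for the BPF hook: fails if the packet is already gone. */
static int
tap(struct pkt *p)
{
	assert(p != NULL);
	if (p->freed)
		return (-1);  /* would be a use-after-free in a real driver */
	p->taps++;
	return (0);
}

/* Safe order: hand the packet to the tap, then notify the device. */
static int
xmit_tap_first(struct pkt *p)
{
	int rc = tap(p);

	ring_doorbell(p);
	return (rc);
}

/* Unsafe order: after the doorbell the packet no longer belongs to us. */
static int
xmit_doorbell_first(struct pkt *p)
{
	ring_doorbell(p);
	return (tap(p));
}
```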

Reviewed-by:		markj
MFC-after:		3 days
Differential Revision: https://reviews.freebsd.org/D41189
2023-07-26 22:36:42 -07:00
Xin LI
1177a6c8dc gve: Unobfuscate code by using nitems directly for loop.
While there, also make MODULE_PNP_INFO to reflect that the device
description is provided.

Reported-by:	jrtc27
Reviewed-by:	jrtc27, imp
Differential Revision: https://reviews.freebsd.org/D40430
2023-06-06 21:14:30 -07:00
Xin LI
1bbdfb0b43 gve: Add PNP info to PCI attachment of gve(4) driver.
Reviewed-by:		imp
Differential Revision:	https://reviews.freebsd.org/D40429
2023-06-05 21:05:55 -07:00
Xin LI
4d779448ad gve: Fix build on i386 and enable LINT builds.
Reviewed-by:	imp
Differential Revision: https://reviews.freebsd.org/D40419
2023-06-04 16:35:00 -07:00
Shailend Chand
54dfc97b0b Add gve, the driver for Google Virtual NIC (gVNIC)
gVNIC is a virtual network interface designed specifically for
Google Compute Engine (GCE). It is required to support per-VM Tier_1
networking performance, and for using certain VM shapes on GCE.

The NIC supports TSO, Rx and Tx checksum offloads, and RSS.
It does not currently do hardware LRO, and thus the software-LRO
in the host is used instead. It also supports jumbo frames.

For each queue, the driver negotiates a set of pages with the NIC to
serve as a fixed bounce buffer; this precludes the use of iflib.

Reviewed-by: 		markj
MFC-after:		2 weeks
Differential Revision: https://reviews.freebsd.org/D39873
2023-06-02 14:31:54 -07:00