Commit graph

210 commits

Author SHA1 Message Date
Arthur Kiyanovski
a1685d2560 ena: Bump driver version to v2.8.1
Changes since 2.8.0:

Bug Fixes:
* Fix LLQ normal width misconfiguration
* Check for errors when detaching children first, not last

Minor Changes:
* Remove \n from sysctl description

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D50041

(cherry picked from commit 59b30c1a864ee8a22c2e9912301cb88674f714c9)
2025-05-01 17:51:52 +00:00
David Arinzon
3f4a674a8e ena: Fix misconfiguration when requesting regular LLQ
Patch 0a33c047a443 introduced new values to
hw.ena.force_large_llq_header. The default value of 2 means no
preference, while 0 and 1 act as the previous false and true
respectively, which allowed forcefully setting regular or large LLQ.

There are 2 ways to force the driver to select regular LLQ:

1. Setting hw.ena.force_large_llq_header = 0 via sysctl.
2. Turning on ena express, which makes the recommendation by the FW to
   be regular LLQ.

When the device supports large LLQ but the driver is forced to
regular LLQ, llq_config->llq_ring_entry_size_value is never initialized
and since it is a variable allocated on the stack, it stays garbage.

Since this variable is involved in calculating max_entries_in_tx_burst,
it could cause the maximum burst size to be zero. This causes the driver
to ignore the real maximum burst size of the device, leading to driver
resets in devices that have a maximum burst size (Nitro v4 and on. see
[1] for more information).

In case the garbage value is 0, the calculation of
max_entries_in_tx_burst divides by 0 and causes kernel panic.

The patch modifies the logic to take into account all use-cases and
ensure that the relevant fields are properly initialized.

[1]: https://docs.aws.amazon.com/ec2/latest/instancetypes/ec2-nitro-instances.html

Fixes: 0a33c047a443 ("ena: Support LLQ entry size recommendation from device")
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D50040

(cherry picked from commit 56c45700f2ae15755358f2da8266247613c564df)
2025-05-01 17:51:52 +00:00
John Baldwin
1ab14e185a Check for errors when detaching children first, not last
These detach routines in these drivers all ended with 'return
(bus_generic_detach())' meaning that if any child device failed to
detach, the parent driver was left in a mostly destroyed state, but
still marked attached.  Instead, bus drivers should detach child
drivers first and return errors before destroying driver state in the
parent.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D47387

(cherry picked from commit d412c07617eb35435668b024bc2cecda05f57f1f)
2025-02-27 10:17:49 -05:00
Zhenlei Huang
1f6936723e ena: Remove \n from sysctl description
sysctl(8) prints a newline after the description, no need for this extra
newline.

MFC after:	1 week

(cherry picked from commit f2233ac33ab64f9bb03370c097af97f26dd0fca1)
2024-12-06 21:51:41 +08:00
osamaabb
6bf02434bd ena: Update driver version to v2.8.0
Features:
* Add support for device request reset message over AENQ
* Support LLQ entry size recommendation from device
* Support max large LLQ depth from the device
* Expand PHC infrastructures
* Configuration notification support

Bug Fixes:
* Fix leaking ifmedia resources on detach
* Fix netmap socket chain unmapping issue
* Properly reinit netmap structs upon sysctl changes
* Correctly count missing TX completions

Minor Changes:
* Add reset reason for corrupted TX/RX completion descriptors
* Add reset reason for missing admin interrupts
* Improve reset reason statistics
* Update licenses

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit ce4cc746bb4171a131ab9099947a500c0de18ff4)
2024-10-31 14:54:12 +00:00
Osama Abboud
7a39823d8e ena: Fix leaking ifmedia resources on detach
ifmedia_add() allocates an ifmedia_entry during ena_attach.
Current code doesn't release this memory during ena_detach()

This commit calls ifmedia_removeall() to properly free the
allocated memory during ena_detach().

Also, in case ena_attach fails, we need to detach ifmedia
which was allocated within ena_setup_ifnet().

This bug was first described in:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278100

Reviewed by: zlei
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 449496eb28daec8d5b852fa4be1e337c2957345c)
2024-10-31 14:54:12 +00:00
Osama Abboud
86ec26e7a9 ena: Support max large LLQ depth from the device
Large LLQ depth size is currently calculated by dividing the maximum
possible size of LLQ by 2.
In newer paltforms, starting from r8g the size of BAR2,
which contains LLQ, will be increased, and the maximum depth of
wide LLQ will be set according to a value set by the device, instead of
hardcoded division by 2.

The new value will be stored by the device in max_wide_llq_depth field
for drivers that expose ENA_ADMIN_LLQ_FEATURE_VERSION_1 or higher to
the device.

There is an assumption that max_llq_depth >= max_wide_llq_depth, since
they both use the same bar, and if it is possible to have a wide LLQ
of size max_wide_llq_depth, it is possible to have a normal LLQ of the
same size, since it will occupy half of the space.

Also moved the large LLQ case calculation of max_tx_queue_size
before its rounddown.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit d0419551d96c8f995bdf6388a8e69684be33f9b5)
2024-10-31 14:54:11 +00:00
Osama Abboud
4d18878ada ena: Support LLQ entry size recommendation from device
This commit adds support for receiving LLQ entry size recommendation
from the device. The driver will use the recommended entry size, unless
the user specifically chooses to use regular or large LLQ entry.

Also added enum ena_llq_header_size_policy_t and llq_plociy field in
order to support the new feature.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit b1c38df05d79c81ee1e9fd0942774820a4ffcb63)
2024-10-31 14:54:11 +00:00
Osama Abboud
ebb857f4ce ena: Add support for device request reset message over AENQ
This commit adds a handler for the new aenq message
ENA_ADMIN_DEVICE_REQUEST_RESET,
which in turn causes the driver to trigger reset of a new type:
ENA_REGS_RESET_DEVICE_REQUEST. Also adds counting of such occurrences in
a new statistic for it.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 705879424bc76fcc925e78eb7643dbf4bd9a11eb)
2024-10-31 14:54:11 +00:00
Osama Abboud
9aa14351c1 ena: Reinit netmap adapter struct upon sysctl changes
When attaching ENA driver, ena_netmap_attach() is invoked which, in turn
calls netmap_attach which, initializes a struct netmap_adapter,
allocating the struct's netmap_ring and the struct selinfo.

When we change the interface number of queues we need to reinit the
netmap adapter struct as well, so we need to detach it in order to free
the memory allocated by netmap_attach and allocate new memory based on
the new parameters like number of rings, ring size etc...

Without detaching and attaching the netmap interface, if we're to change
the number of queues from 8 to 2 for example and try to enable netmap,
the kernel will panic since the original netmap struct within the
kernel's possession still thinks that the driver has 8 queues which will
eventually cause a non-allocated virtual address access fault.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit f9c9c01de87e0440380b939c684d9939d48ce175)
2024-10-31 14:54:11 +00:00
Osama Abboud
8b43095ad8 ena: Clear NS_MOREFRAG flag for last netmap slot
When processing packets within the rx-flow
ena_netmap_rx_load_desc doesn't know the number of descriptors, so it
sets NS_MOREFRAG to all the slots to indicate that there are more
fragments for this packet.
The code calls ena_netmap_rx_load_desc() for every descriptor in
this packet to map the relevant buffer into the netmap shared memory.
After ena_netmap_rx_load_desc() calls, we need to unset the NS_MOREFRAG
for the last fragment to indicate that this is the last fragment,
so we explicitly turn off NS_MOREFRAG flag.
Current code overrides all other flags and sets NS_BUF_CHANGED.
This patch unsets the relevant flag only.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 2f17afd19a3534dc1755c52edb0c2f70ea0eb1e4)
2024-10-31 14:54:11 +00:00
Osama Abboud
b0830d2b6b ena: Handle wrap around for prefetch in netmap
Netmap index wraps around based on the number of netmap kernel ring
slots.
Currently the driver prefetches the next slot using nm_i + 1 which may
be wrong since it does not handle wrap around.
This patch fixes that by using the kernel API for fetching the next
netmap index.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit ce20b51cb71bfb548fcaafc4bacb8290460f03d5)
2024-10-31 14:54:11 +00:00
Osama Abboud
b47eb2848a ena: Properly unmap last socket chain in netmap
In case ena_com_prepare_tx() fails within the netmap tx flow,
the driver will unmap the last socket chain.
Currently, the driver unmaps the wrong socket within
ena_netmap_unmap_last_socket_chain().

Illustration of the flow:

1- ena_netmap_tx_frames()
2- ena_netmap_tx_frame()
3- ena_netmap_tx_map_slots()
3.1- Map slot
3.2- Advance to the next socket
4- ena_com_prepare_tx()
4.1- ena_com_prepare_tx() fails
5- ena_netmap_unmap_last_socket_chain()

In step 5, where the driver unmaps the socket, the netmap
index already points at the next entry, meaning we're unmapping the
wrong socket in case ena_com_prepare_tx() fails.
In order to fix that, the driver should first update the netmap index to
point at the previous entry and only then update the socket parameters.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit f236e544a2ff685ae151f3232e3785a6a9aab035)
2024-10-31 14:54:11 +00:00
Osama Abboud
d38362c0ee ena: Make global counters style unified
This commit changes the code so all global counters will have the
same line break.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 90953d2f829a45669b0c88a6a03da481f2d54e46)
2024-10-31 14:54:11 +00:00
Osama Abboud
cc489c17d2 ena: Trigger reset when mbuf is NULL error happens
The mbuf is NULL issue happens when the device sends the driver
a completion with a wrong request id.
Trigger a reset whenever this happens.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit da73e3a7d08c215a5d8530fea9a9730f8ac3709f)
2024-10-31 14:54:11 +00:00
Osama Abboud
db0c751ed7 ena: Add differentiation for missing TX completions reset
This commit adds differentiation for a reset caused by missing tx
completions, by verifying if the driver didn't receive tx
completions caused by missing interrupts.
The cleanup_running field was added to ena_ring because
cleanup_task.ta_pending is zeroed before ena_cleanup() runs.

Also ena_increment_reset_counter() API was added in order to support
only incrementing the reset counter.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit a33ec635d1f6d574d54e6f6d74766d070183be4c)
2024-10-31 14:54:11 +00:00
osamaabb
a20c06c6f1 ena: Set ena_min_poll_delay_us default value
This commit sets the default value for ena_min_poll_delay_us to 100.

This commit does not change the behavior of the driver, the delay is
calculated as MAX(ENA_MIN_ADMIN_POLL_US, delay_us), where the first
field is already defined as 100.
The second parameter, delay_us is taken from ena_min_poll_delay_us
which is currently unset - 0.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 637ff00f2f9bd6c8509d0e2ac8959c7a23f09650)
2024-10-31 14:54:11 +00:00
Osama Abboud
a0594d1f65 ena: Add reset reason for missing admin interrupt
There can be cases when we trigger reset if an admin interrupt
is missing.
In order to identify this use-case specifically,
this commit adds a new reset reason.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 274319acb48424958242d55e1b0c7d4528da7f70)
2024-10-31 14:54:10 +00:00
Osama Abboud
e445e3afde ena: Add reset reason for corrupted RX cdescs
RX completion descriptors may sometimes contain errors due
to corruption. Upon identifying such a case, the driver will
trigger a reset with an explicit reset reason
ENA_REGS_RESET_RX_DESCRIPTOR_MALFORMED.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 4af71159db3cd4a37055b2b3d982ec53703c5c3d)
2024-10-31 14:54:10 +00:00
Osama Abboud
189bc23fd0 ena: Add reset reason for corrupted TX cdescs
TX completion descriptors may sometimes contain errors due
to corruption. Upon identifying such a case, the driver will
trigger a reset with an explicit reset reason
ENA_REGS_RESET_TX_DESCRIPTOR_MALFORMED.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 38727218460008a500fbc18f08c90082ed678895)
2024-10-31 14:54:10 +00:00
Osama Abboud
89940eed91 ena: Improve reset reason statistics
The driver uses different reset reasons.
Some of them are counted and presented in the driver statistics.
There are cases where statistics are counted on a ring level,
but these are zeroed after a reset procedure takes place.

This commit makes the following changes:
1. Add statistics for the unrepresented reset reasons.
2. Add reset reasons which are counted on a ring level,
to be also global for better tracking.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 89ce3f6314f6feba0e6626be51832d44df611218)
2024-10-31 14:54:10 +00:00
Osama Abboud
b1718de62f ena: Update license signatures to 2024
This commit updates all the license signatures to 2024.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 8d6806cd08c093fc001db1f94cf122368b8d1549)
2024-10-31 14:54:10 +00:00
Osama Abboud
740fb85ba2 ena: Add configuration notifications interface support
This commit is part of the effort of notifying the user of non-optimal
or performance impacting practices.
A new interface is serving as a communication channel
between the device and the driver. One of the goals of this channel is
to create a new mechanism of notifying the driver and user in case of
sub-optimal configuration using a bitmap.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 8cd86b51be4ab0fe70bad4830e608d56db5c850f)
2024-10-31 14:54:10 +00:00
Osama Abboud
e31fe28929 ena: Count all currently missing TX completions in check
Currently we count all of the newly added and already existing
missing tx completions in each iteration of
check_missing_comp_in_tx_queue() causing duplicate counts
to missing_tx_comp stat.

This commit adds a new counter new_missed_tx within the relevant
function which only counts the newly added missing tx completions
in each iteration of check_missing_comp_in_tx_queue().
This will allow us to update missing_tx_comp stat accurately without
counting duplicates.

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 1f67704e2cd85a507776312b52dc63d8690b9260)
2024-10-31 14:54:10 +00:00
Osama Abboud
418d3190d7 ena: Fix customer metrics deallocation statement place
Upstream commit [1] made if_alloc_domain() never fail, then also do the
wrappers if_alloc(), if_alloc_dev(), and if_gethandle().

Upstream commit [2] removed the NULL check conducted by the driver.
This commit also removes err_customer_metrics_alloc goto label.

Commit [2] leaves behind a floating free() statement that
deallocates customer_metrics_array. This commit places the
deallocation statement where it belongs.

[1] commit 4787572d05 ("ifnet: make if_alloc_domain() never fail")
[2] commit aa3860851b9f ("net: Remove unneeded NULL check for the allocated ifnet")

Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.

(cherry picked from commit 5517ca8486bfbf4d0cd369898f3e4d10cd614a9a)
2024-10-31 14:54:10 +00:00
Zhenlei Huang
8c2d777a22 ena(4): Stop checking for failures from malloc(M_WAITOK)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D45852

(cherry picked from commit 51971340bd3ff41591adbbfca68e9e753f6eb135)
2024-09-30 12:44:21 +08:00
Zhenlei Huang
6b1f530935 net: Remove unneeded NULL check for the allocated ifnet
Change 4787572d05 made if_alloc_domain() never fail, then also do the
wrappers if_alloc(), if_alloc_dev(), and if_gethandle().

No functional change intended.

Reviewed by:	kp, imp, glebius, stevek
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D45740

(cherry picked from commit aa3860851b9f6a6002d135b1cac7736e0995eedc)
2024-07-12 20:03:37 +08:00
Osama Abboud
9015208bff ena: Update driver version to v2.7.0
Features:
* Introduce customer and SRD metrics through sysctl
* Introduce spreading IRQs to CPUs capability using sysctl
* Upgrade ena-com to v2.7.0

Bug Fixes:
* Remove outdated APIs

Minor Changes:
* Introduce a shared stats sample interval for all stats

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 4e2688cc762d94b190029f0c5efab9c4bb5521be)
2024-01-14 21:18:11 +00:00
Osama Abboud
eb29118a2f ena: Update the license dating to 2023
Some of the files are using outdated linceses.
Update the license to be 2023.

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 246aa273244e91a30d70997a3be790a29f9eb29c)
2024-01-14 21:18:11 +00:00
Osama Abboud
9757b07f18 ena: Support srd metrics with sysctl
This commit introduces SRD metrics through sysctl.
The metrics can be queried using the following sysctl node:
sysctl dev.ena.<device index>.ena_srd_info

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 36d42c862f4a5643f6e2395e8d7b7e5c4580499a)
2024-01-14 21:18:11 +00:00
Osama Abboud
538f9ea8fa ena: Support customer metric with sysctl
This commit adds sysctl support for customer metrics.
Different customer metrics can be found in the following sysctl node:
sysctl dev.ena.<device index>.customer_metrics

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit f97993ad7b9c9e4787bd198d11f348c271af513e)
2024-01-14 21:18:11 +00:00
Osama Abboud
79c446badb ena: Introduce shared sample interval for all stats
Rename sample_interval node to stats_sample_interval and move
it up in the sysctl tree to make it clear that it's relevant for
all the stats and not only ENI metrics (Currently, sample interval node
is found under eni_metrics node).

Path to node:
dev.ena.<device_index>.stats_sample_interval

Once this parameter is set it will set the sample interval for all the
stats node including SRD/customer metrics.

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 5b925280d9a54eaefd85827bf6e84049aea8fa98)
2024-01-14 21:18:11 +00:00
Osama Abboud
9d773b0d5f ena: Remove CQ tail pointer update API
This commit removes the usage of this API from the freebsd driver since
the relevant functionality is not supported by the device.

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 2835752e075f2fa3edcb596df8306c570ec4cae6)
2024-01-14 21:18:10 +00:00
Osama Abboud
00916b6d29 ena: Update ena_com_update_intr_reg API usage
This commit fixes the usage of this function to be compatible with the
new API introduced by ena-com update to v2.7.0

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 72e34ebdd08854dc896f267b0461e241c4040241)
2024-01-14 21:18:10 +00:00
Arthur Kiyanovski
4526ef3d05 ena: Change measurement unit of time since last tx cleanup to ms
This commit:
1. Sets the time since last cleanup to milliseconds.
2. Fixes incorrect indentations.

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit 9272e45c04c0d4fcb5d767e962783f3ab192f64e)
2024-01-14 21:18:10 +00:00
Osama Abboud
71cd348d24 ena: Add sysctl support for spreading IRQs
This commit allows spreading IO IRQs over different CPUs through sysctl.
Two sysctl nodes are introduced:
1- base_cpu: servers as the first CPU to which the first IO IRQ
will be bound.
2- cpu_stride: sets the distance between every two CPUs to which every
two consecutive IO IRQs are bound.

For example for doing the following IO IRQs / CPU binding:

IRQ idx |  CPU
----------------
   1    |   0
   2    |   2
   3    |   4
   4    |   6

Run the following commands:
sysctl dev.ena.<device index>.irq_affinity.base_cpu=0
sysctl dev.ena.<device_index>.irq_affinity.cpu_stride=2

Also introduced rss_enabled field, which is intended to replace
'#ifdef RSS' in multiple places, in order to prevent code duplication.

We want to bind interrupts to CPUs in case of rss set OR in case
the newly defined sysctl paremeter is set. This requires to remove a
couple of '#ifdef RSS' as well in the structs, since we'll be using the
relevant parameters in the CPU binding code.

Approved by: cperciva (mentor)
MFC after: 2 weeks
Sponsored by: Amazon, Inc.

(cherry picked from commit f9e1d9471077109c19fd7d6dc9c1d35432efdede)
2024-01-14 21:18:10 +00:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Arthur Kiyanovski
ac40021c93 ena: Update driver version to v2.6.3
Bug Fixes:
* Initialize statistics before the interface is available
* Fix driver unload crash

Minor Changes:
* Mechanically convert ena(4) to DrvAPI
* Remove usage of IFF_KNOWSEPOCH

MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2023-07-04 15:58:47 +02:00
Arthur Kiyanovski
c59a5fbd8a ena: Fix driver unload crash
When ena_detach is called, we first call ether_ifdetach(),
which destroys internal addresses of ifp. One such address
is ifp->if_addr->ifa_addr. Then during ena_destroy_device(),
if_link_state_change() is called, eventually trying to access
ifp->if_addr->ifa_addr->sa_family. This causes an access
to garbage memory and crashes the kernel.

Ticket [1] was opened to the FreeBSD community to add null
check in the code of if_link_state_change().
A fix was submitted in commit [2], however it was noted
that it is our driver's responsibilty to not call
if_link_state_change() after calling ether_ifdetach().

This commit makes sure if_link_state_change() is not called
after ether_ifdetach().

[1]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270813
[2]: https://reviews.freebsd.org/D39614

Fixes: 32f63fa7f9 ("Split ENA reset routine into restore and destroy stages")
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2023-07-04 15:57:15 +02:00
Osama Abboud
b9e80b5280 ena: Initialize statistics before the interface is available
In [1], the FBSD community exposed a bug in the fbsd/ena driver.

Bug description:
----------------
Current function call order is as follows:

1. ena_attach()
1.1. ena_setup_ifnet()
1.1.1. Registration of ena_get_counter()
1.1.2. ether_ifattach(ifp, adapter->mac_addr);
1.2. Statistics allocation and initialization.

At point 1.1.2, when ether_ifattach() returns, the interface is available,
and stats can be read before they are allocated, leading to kernel panic.

Also fixed a potential memory leak by freeing the stats since they were
not freed in case the following calls failed.

Fix:
----
This commit moves the statistics allocation and initialization to happen
before ena_setup_ifnet()

[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268934

Fixes: 9b8d05b8ac ("Add support for Amazon Elastic Network Adapter (ENA) NIC")
Fixes: 30217e2dff ("Rework counting of hardware statistics in ENA driver")
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2023-07-04 15:51:16 +02:00
Gleb Smirnoff
a6b55ee6be net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH
Expect that drivers call into the network stack with the net epoch
entered. This has already been the fact since early 2020. The net
interrupts, that are marked with INTR_TYPE_NET, were entering epoch
since 511d1afb6b. For the taskqueues there is NET_TASK_INIT() and
all drivers that were known back in 2020 we marked with it in
6c3e93cb5a. However in e87c494015 we took conservative approach
and preferred to opt-in rather than opt-out for the epoch.

This change not only reverts e87c494015 but adds a safety belt to
avoid panicing with INVARIANTS if there is a missed driver. With
INVARIANTS we will run in_epoch() check, print a warning and enter
the net epoch.  A driver that prints can be quickly fixed with the
IFF_NEEDSEPOCH flag, but better be augmented to properly enter the
epoch itself.

Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing
some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the
new IFF_NEEDSEPOCH.  But the tcp_lro_flush_all() asserts the presence
of network epoch.  Indeed, all NIC drivers that support LRO already
provide the epoch, either with help of INTR_TYPE_NET or just running
NET_EPOCH_ENTER() in their code.

Reviewed by:		zlei, gallatin, erj
Differential Revision:	https://reviews.freebsd.org/D39510
2023-04-17 09:08:35 -07:00
Justin Hibbits
7583c633e0 Mechanically convert ena(4) to DrvAPI
Reviewed by: mw
Differential Revision: https://reviews.freebsd.org/D37837
2023-01-13 17:09:17 +01:00
Arthur Kiyanovski
e5de1d8dad ena: Update driver version to v2.6.2
Bug Fixes:
* Remove timer service re-arm on ena_restore_device failure.
* Re-Enable per-packet missing tx completion print

Minor Changes:
* Switch driver owners from Semihalf to Amazon in man file.

MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Pull Request: https://github.com/freebsd/freebsd-src/pull/637
2023-01-13 17:07:04 +01:00
David Arinzon
c4a85b8d68 ena: Remove timer service re-arm on ena_restore_device failure
In case the reset sequence fails (ena_destroy_device() followed by
ena_restore_device() calls) during ena_restore_device(), the driver
resources are being freed. After the clean-up, the timer service is
re-armed in order to try and re-initialize the driver state.
But, such an attempt would fail given that the resources are freed.
Moreover, this would actually cause either the system to fail or a
panic.
When the driver fails in ena_restore_device() procedure, the only
recovery is either unloading and loading the driver or instance
reboot.

This change removes the timer service re-arm in case of failure
in ena_restore_device().

MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Fixes: 78554d0c70 ("ena: start timer service on attach")
2023-01-13 17:06:43 +01:00
Arthur Kiyanovski
f01b2cd98e ena: Re-Enable per-packet missing tx completion print
Commit [1] first added the ena_tx_buffer.print_once member,
so that a message about a missing tx completion is printed only
once per packet (and not every second when the watchdog runs).
In this commit print_once is initialized to true, and is set back
to false after detecting a missing tx completion and printing
a warning about it to dmesg.

Commit [2] incorrectly reverses the values assigned to print_once.
The variable is initialized to be true but is checked to be false
when a missing tx completion is detected. This is never true, and
therefore the warning print for each missing tx completion is never
printed since this commit.

Commit [3] added time passed since last TX cleanup to the missing
tx completions per-packet print. However, due to the issue in commit
[2], this time is never printed.

This commit reverses back the values assigned to ena_tx_buffer.print_once
erroneously by commit [2], bringing back to life the missing tx
completion per-packet print.

Also add a space after "." in the missing tx completion print.

[1] - 9b8d05b8ac ("Add support for Amazon Elastic Network Adapter (ENA) NIC")
[2] - 74dba3ad78 ("Split function checking for missing TX completion in ENA driver")
[3] - d8aba82b5c ("ena: Store ticks of last Tx cleanup")

Fixes: 74dba3ad78 ("Split function checking for missing TX completion in ENA driver")
Fixes: d8aba82b5c ("ena: Store ticks of last Tx cleanup")
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2023-01-13 17:06:42 +01:00
Michal Krawczyk
25b64933a4 ena: Update driver version to v2.6.1
Minor version update which improves styling of a printouts, fixes
the KASAN and KMSAN kernel builds and LLQ reconfiguration after the
device reset.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-07-06 17:06:21 +02:00
Michal Krawczyk
3324e304c1 ena: Fix LLQ descriptor reconfiguration
After the device reset, the LLQ configuration descriptor wasn't passed
to the hardware. On a 6-generation AWS instances (like C6gn), it is
required to pass the LLQ descriptor after the device reset, otherwise
the hardware will be missing the LLQ configuration resulting in
performance degradation.

This patch reconfigures the LLQ each time the ena_device_init() is
called. This means that the LLQ descriptor will be passed during the
initial configuration and after a reset.

The ena_map_llq_mem_bar() function call was moved before the
ena_device_init() call, to make sure that the mem bar is available.

Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-07-06 17:06:21 +02:00
Michal Krawczyk
38d036e91a ena: Align req_id and qid print order
In most places, the req_id is printed first, and the qid is printed as a
second. To align the driver, one printout was reworked and the print
order of those variables was changed.

Suggested by: rpokala
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
2022-07-06 17:06:21 +02:00
Mark Johnston
b72f1f4516 ena: Make first_interrupt a uint8_t
We do not have atomic(9) routines for bools, and it is not guaranteed
that sizeof(bool) is 1.

This fixes the KASAN and KMSAN kernel builds, which fail because the
compiler refuses to silently cast a _Bool * to a uint8_t * when calling
the atomic(9) sanitizer interceptors.

Reviewed by:	Dawid Górecki <dgr@semihalf.com>
MFC after:	2 weeks
Fixes:	0ac122c388 ("ena: Use atomic_load/store functions for first_interrupt variable")
Differential Revision:	https://reviews.freebsd.org/D35683
2022-07-01 11:00:58 -04:00