Header ice_rss.h uses the kernel RSS interface if option RSS is defined.
However when ice_rss.h is included by ice_lib.h there is no prior
inclusion of ice_opts.h to set RSS causing ifdef RSS to always fail. Add
ice_opts.h to the top of ice_lib.h (like ice_iflib.h) so RSS can be
defined when ice_rss.h is parsed.
With that in place, compilation fails due to a missing definition of
ICE_DEFAULT_RSS_HASH_CONFIG. It is defined in ice_rss.h only when RSS is
not defined. Since this define is not part of the kernel RSS interface
but ice-specific, it should always be defined. Move its definition
outside of ifdef RSS.
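The resulting header layout can be sketched roughly as follows. This is
an illustration only, not the actual ice_rss.h: the macro value, the
ICE_USES_KERNEL_RSS marker and the helper function are placeholders made
up for the sketch.

```c
#include <stdint.h>

/*
 * Driver-specific default: defined unconditionally, outside of
 * #ifdef RSS. The value here is a placeholder, not the real one.
 */
#define ICE_DEFAULT_RSS_HASH_CONFIG	0x3fu

#ifdef RSS
/* With option RSS, the real header would use the kernel RSS interface. */
#define ICE_USES_KERNEL_RSS	1
#else
/* Without option RSS, the driver falls back to its own defaults. */
#define ICE_USES_KERNEL_RSS	0
#endif

/* Hypothetical helper: compiles whether or not RSS is defined. */
static inline uint32_t
ice_sketch_default_hash_config(void)
{
	return (ICE_DEFAULT_RSS_HASH_CONFIG);
}
```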
PR: 255309
Reviewed by: mhorne, erj (earlier version)
MFC after: 3 days
Pull Request: https://github.com/freebsd/freebsd-src/pull/1460
(cherry picked from commit 6e5650896f)
Under conditions that are not yet understood, the i2c bus can run into
trouble. Although nothing about the current link state has changed, the
driver would react by going into a link down state and start busy
looping on up to 4 cores. Even if there was a valid link, such spinning
on a CPU by a kernel thread would wreak havoc on existing and
new connections.
This patch does the following:
1. If such a bus failure occurs, we keep the last known link state.
2. Prevent busy looping by implementing the lockmgr() facility to
be able to sleep while the i2c code waits on the i2c ISR. We cap
this with a timeout.
3. Pin the admin queues to the last CPU in the system, to prevent
other scenarios where busy looping might occur from landing on CPU
0, which especially seems to cause a lot of issues.
Given the design constraints both in hardware and in software,
the lockmgr() seems to be the only viable option, even though
FreeBSD explicitly forbids sleeping in callout context, but
fails to explain why this is or offer alternatives.
axgbe: revert allocating admin queues to last CPU
The issue was resolved in 52454a1e5b.
Scheduled threads such as CARP are now no longer pinned to CPU 0, making sure
they always get their time slice even if CPUs are blocked.
Since the I/O expander chip does not do a reset when soft power
cycling, the driver will first turn off all LEDs when initializing,
although no specific routine seems to be called when powering down.
This means that the LEDs will stay on until the driver has booted up,
after which the driver will be in a consistent state.
Initially, RSF (Receive Queue Store and Forward) was disabled for
unknown reasons, but the cut-through mode that's enabled as a result
seems to send 0 length packets up to the DMA when the RX queue is
full.
Since the iflib interface needs axgbe_pci_init() and its phy starting capabilities, no data was passed in its absence.
With the NULL check of the axgbe_miibus we also resort back to an MDIO read as a module might be capable of both
clause 22 and clause 45 methods of communication.
With the move of phy_stop() to if_detach() in d50d4e8cd4, it's better to prevent reconfiguring the phy should the pci_init() callout trigger more than once.
The autonegotiation code path for gigabit SFP modules contained a bug that caused
a report of LINK_ERR when an external SFP PHY was present. Fixing this issue
did not result in a link, however: it turned out that while autonegotiation interrupts
were happening, its resulting status cannot be correctly determined in all cases. In these
specific cases we have no other option than to assume a module has negotiated to 1Gbit/s.
PHY-specific configuration has been delegated to the miibus driver, if an external PHY is present.
It's possible that the i2c bus does not recognize a PHY on the first pass, so in all cases we
retry up to a maximum of 5 times during each link poll pass to ensure we didn't miss the presence
of an external PHY.
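The bounded retry can be sketched as below. This is not the driver code:
the callback type, the helper names and the test double are hypothetical,
and only the limit of 5 attempts per pass comes from the description
above.

```c
#include <stdbool.h>

#define PHY_PROBE_RETRIES	5	/* maximum attempts per link poll pass */

/* Hypothetical probe callback: returns true when a PHY answers on the bus. */
typedef bool (*phy_probe_fn)(void *ctx);

/*
 * Call the probe up to PHY_PROBE_RETRIES times and stop at the first
 * success. Returns the number of attempts used, or 0 if no PHY was found.
 */
static int
probe_external_phy(phy_probe_fn probe, void *ctx)
{
	for (int attempt = 1; attempt <= PHY_PROBE_RETRIES; attempt++) {
		if (probe(ctx))
			return (attempt);
	}
	return (0);
}

/* Test double: fails until the counter pointed to by ctx reaches zero. */
static bool
flaky_probe(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return (false);
	}
	return (true);
}
```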
This commit also addresses link issues on both 100 mbit and 1Gb fiber modules. Not all of these modules
have the correct data set according to SFF-8472, as such we first check for gigabit compliance and
the associated baudrate, otherwise we resort back to determining what type of fiber module is plugged
in by checking the baudrate, cable length and wavelength and setting the MAC speed accordingly.
It is possible for a machine to boot into a state in which the configuration register,
responsible for controlling whether an I/O signal is considered an input or output,
contains randomized values. It was assumed this was programmed by the BIOS.
If I/O is reversed, it's possible for the driver to think an SFPP module has been inserted
when there is none, leading to unrecoverable I2C errors.
The configuration register should contain a state which is determined and provided by the BIOS,
hence no hard-coded values are programmed here.
Executing `ifconfig -v` without a module placed stalls for a second per
interface because the timeout is set to a static 10. This patch makes
sure this is only allowed once per insertion.
Some iichid(4) child devices, currently hkbd(4) only, open the parent
device in their attach handlers. That breaks the internal iichid(4) state,
leading to rejection of any incoming data at the software and hardware
levels. Fix it by adding an extra state check to the iichid(4) attach
handler.
Approved by: re (cperciva)
Reported by: many
Submitted by: trasz (initial version)
PR: 280290
MFC after: 3 days
(cherry picked from commit 018cb11cb7)
(cherry picked from commit c53ec86f0e)
386BSD provided a MD function sysbeep. This took two arguments (pitch
and period). Pitch was jammed into the PIT's divisor directly (which
means the argument was expected to sound a tone at '1193182 / pitch'
Hz). FreeBSD inherited this interface.
In commit e465985885 (svn 177642, Mar 26 2008), phk changed this
function to take a tone to sound in hz. He converted all in-tree
instances of 1193182 / hz to just hz (and kept the few misguided folks
that passed hz directly unchanged -- this was part of what motivated the
change). He converted the places where we pre-computed the 8254 divisor
from being pitch to 1193182 / pitch (since that converts the divisor to
the frequency and the interfaces that were exposed to userland exposed
it in these units in places, continuing the tradition inherited from SCO
System V/386 Unix in spots).
In 2009, Ed Schouten was contracted by the FreeBSD Foundation to write /
finish newcons. This work was done in perforce and was imported into
subversion in user/ed/newcons in revision 199072
(https://svnweb.freebsd.org/base?view=revision&revision=199072) which
was later imported into FreeBSD by ray@ (Aleksandr Rybalko).
From that earliest import into svn to this date, we ring the bell
with:
sysbeep(1193182 / VT_BELLPITCH, VT_BELLDURATION);
where VT_BELLPITCH was defined to be 800. This results in a bell
frequency of 1491Hz, more or less today. This is similar to the
frequency that syscons and pcvt used (1493Hz and 1500Hz respectively).
This in turn was inherited from 386BSD, it seems, which used the hard
coded value 0x31b which is 795 -> 1500Hz.
This '800' was intended to be the bell tone (eg 800Hz) and this
interface was one that wasn't converted. The most common terminal prior
to the rise of PCs was the VT100, which had an approximately 800Hz
bell. Ed Schouten has confirmed that the original intent was 800Hz and
changing this was overlooked after the change to -current was made.
This restores that original intent and makes the bell less obnoxious in
the process.
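The arithmetic above can be checked with a short sketch. The helper name
is made up for illustration; the constants 1193182, 800 and 0x31b are
the ones quoted in this message.

```c
#define PIT_CLOCK_HZ	1193182u	/* 8254 input clock used in the text */

/* Tone, in whole Hz, produced when 'divisor' is loaded into the PIT. */
static unsigned int
pit_divisor_to_hz(unsigned int divisor)
{
	return (PIT_CLOCK_HZ / divisor);
}

/*
 * What the old vt(4) call actually requested: 800 was treated as a
 * divisor rather than as a frequency.
 */
static unsigned int
old_vt_bell_hz(void)
{
	return (pit_divisor_to_hz(800));
}
```

With integer division, a divisor of 800 yields 1491 and the 386BSD value
0x31b (795) yields 1500, matching the figures given above.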
Reviewed by: des, adrian
Differential Revision: https://reviews.freebsd.org/D32594
Sponsored by: Netflix
(cherry picked from commit ba48d52ca6)
This change was accidentally reverted in 80f21bb039.
(cherry picked from commit 2416be588e)
(cherry picked from commit 1c9f1cb4f0)
Approved by: re (cperciva)
Changes to acpi_gpiobus.c handle discovering and parsing the _AEI
objects and storing necessary data in device ivars. A new gpioaei.c
file implements the device, which simply requests an interrupt when
the pin is triggered and invokes the appropriate _Exx or _Lxx ACPI
method.
This makes the GPIO "power button" work on arm64 Graviton systems,
allowing EC2 "Stop"/"Reboot" instance calls to be handled cleanly.
(Prior to this change, those requests would time out after 4 minutes
and the instance would be forcibly killed.)
Reviewed by: imp, andrew, Ahmad Khalifa
Approved by: re (kib)
MFC after: 3 days
Sponsored by: Amazon
Differential Revision: https://reviews.freebsd.org/D47253
Co-authored-by: Andrew Turner <andrew@FreeBSD.org>
(cherry picked from commit 9709bda03c)
(cherry picked from commit c2cd78d944)
GPIO interrupts work just fine and will be used shortly. We still
do not support GPIO_INTR_SHAREABLE however, so leave that within
the NOT_YET scope.
Reviewed by: andrew
Approved by: re (kib)
MFC after: 1 week
Sponsored by: Amazon
Differential Revision: https://reviews.freebsd.org/D47251
(cherry picked from commit 2d4219919a)
(cherry picked from commit 1f69417607)
This allows acpi_gpiobus to override the method and fall back to the
generic gpiobus_read_ivar function if needed.
Reviewed by: andrew
Approved by: re (kib)
MFC after: 1 week
Sponsored by: Amazon
Differential Revision: https://reviews.freebsd.org/D47250
(cherry picked from commit bc0d10d01c)
(cherry picked from commit fffdfe2f67)
AWS Graviton [1234] systems have a bug in their ACPI where they mark
the PL061's GPIO pins as needing to be configured in PullUp mode (in
fact the PL061 has no pullup/pulldown resistors); this flag needs to
be removed in order for _AEI objects to be handled on these systems.
Reviewed by: Ali Saidi
Approved by: re (kib)
MFC after: 1 week
Sponsored by: Amazon
Differential Revision: https://reviews.freebsd.org/D47239
(cherry picked from commit 2f3f867ac6)
(cherry picked from commit 5fa51c3653)
ACPI sleep states are only implemented on x86 systems, so having the
ACPI power button attempt to enter "S5" (or other state as configured
via the hw.acpi.power_button_state sysctl) is not useful.
On non-x86 systems, implement the power button with a call to
shutdown_nice(RB_POWEROFF)
to shut down the system.
Reviewed by: Andrew
Tested on: Graviton 2
Approved by: re (kib)
MFC after: 2 weeks
Sponsored by: Amazon
Differential Revision: https://reviews.freebsd.org/D47094
(cherry picked from commit f41ef9d80b)
(cherry picked from commit e177e64294)
Right now flags is 0 at the point of this "=" -> "|=" change, so the
change has no effect yet, but it will matter when the NOT_YET section
above becomes effective.
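A minimal sketch of the difference, using made-up flag bits rather than
the real GPIO flag names:

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define SKETCH_FLAG_FROM_EARLIER	0x01u	/* set by an earlier section */
#define SKETCH_FLAG_NEW			0x02u	/* set at the changed line */

/* Old behaviour: plain assignment discards whatever was set before. */
static uint32_t
assign_flag(uint32_t flags)
{
	flags = SKETCH_FLAG_NEW;
	return (flags);
}

/* New behaviour: OR-ing in the bit keeps earlier flags intact. */
static uint32_t
or_in_flag(uint32_t flags)
{
	flags |= SKETCH_FLAG_NEW;
	return (flags);
}
```

Both forms agree while the incoming value is 0, and diverge as soon as an
earlier section has set a bit.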
Approved by: re (kib)
MFC after: 2 weeks
Sponsored by: Amazon
(cherry picked from commit c808132731)
(cherry picked from commit 7c8f273bfb)
This currently only implements the address space handler and attempts to
configure pins with flags obtained from ACPI.
Reviewed by: wulf
Approved by: re (kib)
MFC after: 1 month
Pull Request: https://github.com/freebsd/freebsd-src/pull/1359
(cherry picked from commit 92adaa5862)
(cherry picked from commit 14887d2c86)
This allows iavf to load on E830 devices since those devices place their MSI-X
BAR at a different location than in previous 800 series products.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D46952
(cherry picked from commit e53a21abdf)
ifmedia_add() allocates an ifmedia_entry during ena_attach.
Current code doesn't release this memory during ena_detach().
This commit calls ifmedia_removeall() to properly free the
allocated memory during ena_detach().
Also, in case ena_attach fails, we need to detach ifmedia
which was allocated within ena_setup_ifnet().
This bug was first described in:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278100
Reviewed by: zlei
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 449496eb28)
Large LLQ depth size is currently calculated by dividing the maximum
possible size of LLQ by 2.
In newer platforms, starting from r8g, the size of BAR2,
which contains LLQ, will be increased, and the maximum depth of
wide LLQ will be set according to a value set by the device, instead of
hardcoded division by 2.
The new value will be stored by the device in max_wide_llq_depth field
for drivers that expose ENA_ADMIN_LLQ_FEATURE_VERSION_1 or higher to
the device.
There is an assumption that max_llq_depth >= max_wide_llq_depth, since
they both use the same bar, and if it is possible to have a wide LLQ
of size max_wide_llq_depth, it is possible to have a normal LLQ of the
same size, since it will occupy half of the space.
Also moved the large LLQ case calculation of max_tx_queue_size
before its rounddown.
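The selection described above can be sketched as follows. The function
names and the boolean parameter are assumptions made for the sketch; the
driver derives this from the negotiated LLQ feature version, and its
actual code may be structured differently.

```c
#include <stdint.h>

/* Round down to the nearest power of two (0 stays 0). */
static uint32_t
sketch_rounddown_pow2(uint32_t v)
{
	uint32_t p = 1;

	if (v == 0)
		return (0);
	while (p <= v / 2)
		p *= 2;
	return (p);
}

/*
 * Pick the wide-LLQ tx queue size first, then round it down.
 * device_reports_wide_depth is a stand-in for "the device filled in
 * max_wide_llq_depth".
 */
static uint32_t
sketch_wide_llq_tx_queue_size(uint32_t max_llq_depth,
    uint32_t max_wide_llq_depth, int device_reports_wide_depth)
{
	uint32_t size;

	if (device_reports_wide_depth)
		size = max_wide_llq_depth;	/* value set by the device */
	else
		size = max_llq_depth / 2;	/* legacy hardcoded halving */

	return (sketch_rounddown_pow2(size));
}
```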
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit d0419551d9)
This commit adds support for receiving LLQ entry size recommendation
from the device. The driver will use the recommended entry size, unless
the user specifically chooses to use regular or large LLQ entry.
Also added enum ena_llq_header_size_policy_t and llq_policy field in
order to support the new feature.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit b1c38df05d)
This commit adds a handler for the new aenq message
ENA_ADMIN_DEVICE_REQUEST_RESET,
which in turn causes the driver to trigger reset of a new type:
ENA_REGS_RESET_DEVICE_REQUEST. Also adds counting of such occurrences in
a new statistic for it.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 705879424b)
When attaching the ENA driver, ena_netmap_attach() is invoked, which in
turn calls netmap_attach(), which initializes a struct netmap_adapter,
allocating the struct's netmap_ring and the struct selinfo.
When we change the interface number of queues we need to reinit the
netmap adapter struct as well, so we need to detach it in order to free
the memory allocated by netmap_attach and allocate new memory based on
the new parameters like number of rings, ring size etc...
Without detaching and attaching the netmap interface, if we're to change
the number of queues from 8 to 2 for example and try to enable netmap,
the kernel will panic since the original netmap struct within the
kernel's possession still thinks that the driver has 8 queues which will
eventually cause a non-allocated virtual address access fault.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit f9c9c01de8)
When processing packets within the rx flow,
ena_netmap_rx_load_desc() doesn't know the number of descriptors, so it
sets NS_MOREFRAG on all the slots to indicate that there are more
fragments for this packet.
The code calls ena_netmap_rx_load_desc() for every descriptor in
this packet to map the relevant buffer into the netmap shared memory.
After ena_netmap_rx_load_desc() calls, we need to unset the NS_MOREFRAG
for the last fragment to indicate that this is the last fragment,
so we explicitly turn off NS_MOREFRAG flag.
Current code overrides all other flags and sets NS_BUF_CHANGED.
This patch unsets the relevant flag only.
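A small sketch of the two behaviours. The SK_* constants and the slot
struct are local stand-ins for the netmap definitions, so the values
should not be read as authoritative.

```c
#include <stdint.h>

/* Local stand-ins for netmap slot flags (values are for the sketch). */
#define SK_NS_BUF_CHANGED	0x0001u
#define SK_NS_REPORT		0x0002u
#define SK_NS_MOREFRAG		0x0020u

/* Stand-in for the part of struct netmap_slot that matters here. */
struct sketch_slot {
	uint16_t flags;
};

/* Old behaviour: every other flag on the last fragment is lost. */
static void
finish_packet_old(struct sketch_slot *last)
{
	last->flags = SK_NS_BUF_CHANGED;
}

/* New behaviour: only NS_MOREFRAG is cleared on the last fragment. */
static void
finish_packet_new(struct sketch_slot *last)
{
	last->flags &= (uint16_t)~SK_NS_MOREFRAG;
}
```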
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 2f17afd19a)
Netmap index wraps around based on the number of netmap kernel ring
slots.
Currently the driver prefetches the next slot using nm_i + 1 which may
be wrong since it does not handle wrap around.
This patch fixes that by using the kernel API for fetching the next
netmap index.
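The wrap-around can be illustrated with a local stand-in that mirrors
the semantics of netmap's nm_next() helper, where lim is the last valid
slot index. The function name below is made up for the sketch.

```c
#include <stdint.h>

/*
 * Stand-in for a wrap-aware "next index" helper: 'lim' is the last
 * valid index of the ring, so the index after 'lim' is 0.
 */
static uint32_t
sketch_nm_next(uint32_t i, uint32_t lim)
{
	return (i == lim ? 0 : i + 1);
}
```

For an 8-slot ring (lim = 7), the slot after index 7 is 0, whereas
nm_i + 1 would give 8, which is outside the ring.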
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit ce20b51cb7)
In case ena_com_prepare_tx() fails within the netmap tx flow,
the driver will unmap the last socket chain.
Currently, the driver unmaps the wrong socket within
ena_netmap_unmap_last_socket_chain().
Illustration of the flow:
1- ena_netmap_tx_frames()
2- ena_netmap_tx_frame()
3- ena_netmap_tx_map_slots()
3.1- Map slot
3.2- Advance to the next socket
4- ena_com_prepare_tx()
4.1- ena_com_prepare_tx() fails
5- ena_netmap_unmap_last_socket_chain()
In step 5, where the driver unmaps the socket, the netmap
index already points at the next entry, meaning we're unmapping the
wrong socket in case ena_com_prepare_tx() fails.
In order to fix that, the driver should first update the netmap index to
point at the previous entry and only then update the socket parameters.
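The ordering issue can be sketched with a toy ring. The struct and
function names are hypothetical and greatly simplified compared to the
driver; the point is only that the index must be stepped back before the
entry is unmapped.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_RING_SLOTS	4u

struct sketch_tx_ring {
	bool mapped[SKETCH_RING_SLOTS];
	uint32_t nm_i;		/* next entry to be used */
};

static uint32_t
ring_next(uint32_t i)
{
	return (i == SKETCH_RING_SLOTS - 1 ? 0 : i + 1);
}

static uint32_t
ring_prev(uint32_t i)
{
	return (i == 0 ? SKETCH_RING_SLOTS - 1 : i - 1);
}

/* Steps 3.1 and 3.2: map the current entry, then advance the index. */
static void
sketch_map_slot(struct sketch_tx_ring *r)
{
	r->mapped[r->nm_i] = true;
	r->nm_i = ring_next(r->nm_i);
}

/* Step 5, fixed: step back to the previous entry first, then unmap it. */
static void
sketch_unmap_last(struct sketch_tx_ring *r)
{
	r->nm_i = ring_prev(r->nm_i);
	r->mapped[r->nm_i] = false;
}
```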
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit f236e544a2)
This commit changes the code so all global counters will have the
same line break.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 90953d2f82)
The "mbuf is NULL" issue happens when the device sends the driver
a completion with a wrong request id.
Trigger a reset whenever this happens.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit da73e3a7d0)
This commit differentiates resets caused by missing tx completions by
checking whether the driver failed to receive the tx completions
because of missing interrupts.
The cleanup_running field was added to ena_ring because
cleanup_task.ta_pending is zeroed before ena_cleanup() runs.
Also ena_increment_reset_counter() API was added in order to support
only incrementing the reset counter.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit a33ec635d1)
This commit sets the default value for ena_min_poll_delay_us to 100.
This commit does not change the behavior of the driver: the delay is
calculated as MAX(ENA_MIN_ADMIN_POLL_US, delay_us), where the first
parameter is already defined as 100.
The second parameter, delay_us, is taken from ena_min_poll_delay_us,
which is currently unset (0).
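A short sketch of why the effective delay stays the same. The function
name and the SKETCH_MAX macro are made up; the value 100 is the one
stated above.

```c
#include <stdint.h>

#define ENA_MIN_ADMIN_POLL_US	100u	/* value stated above */
#define SKETCH_MAX(a, b)	((a) > (b) ? (a) : (b))

/* Effective admin poll delay for a given ena_min_poll_delay_us. */
static uint32_t
effective_poll_delay_us(uint32_t ena_min_poll_delay_us)
{
	return (SKETCH_MAX(ENA_MIN_ADMIN_POLL_US, ena_min_poll_delay_us));
}
```

An unset value of 0 and the new default of 100 both yield 100; only a
value above 100 changes the result.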
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 637ff00f2f)
There can be cases when we trigger reset if an admin interrupt
is missing.
In order to identify this use-case specifically,
this commit adds a new reset reason.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 274319acb4)
RX completion descriptors may sometimes contain errors due
to corruption. Upon identifying such a case, the driver will
trigger a reset with an explicit reset reason
ENA_REGS_RESET_RX_DESCRIPTOR_MALFORMED.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 4af71159db)
TX completion descriptors may sometimes contain errors due
to corruption. Upon identifying such a case, the driver will
trigger a reset with an explicit reset reason
ENA_REGS_RESET_TX_DESCRIPTOR_MALFORMED.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 3872721846)
The driver uses different reset reasons.
Some of them are counted and presented in the driver statistics.
There are cases where statistics are counted on a ring level,
but these are zeroed after a reset procedure takes place.
This commit makes the following changes:
1. Add statistics for the unrepresented reset reasons.
2. Add reset reasons which are counted on a ring level,
to be also global for better tracking.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 89ce3f6314)
This commit updates all the license signatures to 2024.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 8d6806cd08)
This commit is part of the effort of notifying the user of non-optimal
or performance impacting practices.
A new interface is serving as a communication channel
between the device and the driver. One of the goals of this channel is
to create a new mechanism of notifying the driver and user in case of
sub-optimal configuration using a bitmap.
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 8cd86b51be)
Currently we count all of the newly added and already existing
missing tx completions in each iteration of
check_missing_comp_in_tx_queue() causing duplicate counts
to missing_tx_comp stat.
This commit adds a new counter new_missed_tx within the relevant
function which only counts the newly added missing tx completions
in each iteration of check_missing_comp_in_tx_queue().
This will allow us to update missing_tx_comp stat accurately without
counting duplicates.
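The counting scheme can be sketched as below. The struct, its 'counted'
flag and the function name are assumptions for illustration; the driver
tracks already-reported buffers in its own way.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_tx_buf {
	bool overdue;	/* completion is past its deadline */
	bool counted;	/* already added to missing_tx_comp */
};

/*
 * One pass over the queue. Only buffers that became overdue since the
 * previous pass are counted; the running stat grows by that amount.
 */
static uint32_t
sketch_check_missing(struct sketch_tx_buf *bufs, size_t n,
    uint64_t *missing_tx_comp)
{
	uint32_t new_missed_tx = 0;

	for (size_t i = 0; i < n; i++) {
		if (!bufs[i].overdue || bufs[i].counted)
			continue;
		bufs[i].counted = true;
		new_missed_tx++;
	}
	*missing_tx_comp += new_missed_tx;
	return (new_missed_tx);
}
```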
Approved by: cperciva (mentor)
Sponsored by: Amazon, Inc.
(cherry picked from commit 1f67704e2c)