Provide macros to derive the various needed values and make it a bit
more clear the differences between em and igb.
The igb default EITR was not landing at the right offset.
Respect the 'max_interrupt_rate' tunable.
Sponsored by: BBOX.io
(cherry picked from commit 9bf9164fc8aad1ca828c725413e06462aef1e4dd)
The absolute and packet timers only apply to lem and em with some only
applying to the later.
This cleans up the sysctl tree to only show these where applicable and
stops writing to unexpected registers for igb.
Sponsored by: BBOX.io
(cherry picked from commit 1c578f1c93112d3d53820f8b6b411488b83f0ab2)
Based on sysinit_sub_id, SI_SUB_CLOCKS is after SI_SUB_CONFIGURE.
SI_SUB_CONFIGURE = 0x3800000, /* Configure devices */
At this stage, the variable “cold” will be set to 0.
SI_SUB_CLOCKS = 0x4800000, /* real-time and stat clocks*/
At this stage, the clock configuration will be done, and the real-time
clock can be used.
In the e1000 driver, if the API safe_pause_* are called between
SI_SUB_CONFIGURE and SI_SUB_CLOCKS stages, it will choose the wrong
clock source. The API safe_pause_* uses “cold” the value of which is
updated in SI_SUB_CONFIGURE, to decide if the real-time clock source is
ready. However, the real-time clock is not ready til the SI_SUB_CLOCKS
routines are done.
Obtained from: Juniper Networks
Differential Revision: https://reviews.freebsd.org/D42920
(cherry picked from commit 930a1e6f3d2dd629774f1b48b1acf7ba482ab659)
The write is only used to toggle the debug print function and this is
otherwise stateless.
(cherry picked from commit 5f6964d9fbf663f85ee60dae7dfff153b82759d8)
Bump to the current out of tree driver version since we only have some
gratuitous changes.
(cherry picked from commit ddfec1fb6814088abc5805f45c4a18c5731d51b9)
DPDK commit message
net/e1000/base: fix link power down
Current code is a result of work to reduce duplication between various
device models. However, the logic that was replaced did not exactly
match the new logic, and as a result the link power down was not
working correctly for some NICs, and the link remained up even when
the interface is down.
Fix it to correctly power down the link under all circumstances that
were supported by old logic.
Fixes: 44dddd1 ("net/e1000/base: remove duplicated codes")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Obtained from: DPDK (a8218d0)
(cherry picked from commit 811912c46b5886f1aa3bb7a51a6ec1270bc947a8)
The global hw.em.rx_process_limit knob has been replaced by the device-
specific dev.IF.N.iflib.rx_budget along with the conversion to iflib(4).
While at it, remove the - besides initialization of tx_process_limit -
unused {r,t}x_process_limit members.
(cherry picked from commit 0d6d28ce5650d1cd23dbe4bbac87fb103b3e2d3d)
In rS360398, a new iflib device method was added to opt out of VLAN
events needing an interface reset.
I am switching the default to not requiring a restart for:
* VLAN events
* unknown events
After fixing various bugs, I do not think this would be a common need
of hardware and it is undesirable from the user's perspective causing
link flaps and much slower VLAN configuration. Currently, there are no
other restart events besides VLAN events, and setting the
ifdi_needs_restart default to false will alleviate the need to churn
every driver if an odd event is added in the future for specific
hardware.
markj points out this could cause churn in the other direction; I will
solve that problem with an event registration system as he mentions in
the review should we need it in the future.
These drivers will opt into restart and need further inspection or work:
* ixv (needs code audit, 61a8231 fixed principal issue; re-init probably
not necessary)
* axgbe (needs code audit; re-init probably not necessary)
* iavf - (needs code audit; interaction with Malicious Driver Detection
mentioned in rS360398)
* mgb - no VLAN functions are currently implemented. Left a comment.
MFC after: 2 weeks
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41558
Since d49e83eac3, iflib(9) is ready
for this change.
While at it, make isc_driver_version strings (static) const where
not apparently un-const on purpose, too.
This reduces the size of the amd64 GENERIC by about 10 KiB.
This has been off by one in the FreeBSD drivers as far back as I've
looked. Emperically HW and SW emulations I have available don't seem to
mind. Noticed while debugging other issues.
MFC after: 3 days
Disable TSO on lem(4) and em(4) until a ring stall can be debugged.
I am not able to reproduce the issue on lem(4) but disabling there in
abundance of caution in case the issue is not specific to em(4).
Reported by: grog
Most em(4) devices now enjoy TSO and TSO6, matching NetBSD and Linux
defaults.
A prior commit automasks TSO on 10/100 Ethernet due to errata and other
bugs for IPv6 were fixed recently allowing this.
Mike Karels identified a performance anomaly on Intel 82574L devices.
These are multiqueue enabled on FreeBSD since the conversion to
iflib. I am investigating whether this can be fixed, in the mean time
MSI-X with checksum offloads remain default.
i219 SPT devices have an errata that downclocks the DMA engine, which
results in TSO not being able to acheive line rate. Therefore, it is
disabled on:
* Intel(R) I219-LM and I219-V SPT
* Intel(R) I219-LM and I219-V SPT-H (2)
* Intel(R) I219-LM and I219-V LBG (3)
* Intel(R) I219-LM and I219-V SPT (4)
* Intel(R) I219-LM and I219-V SPT (5)
Many lem(4) devices enjoy TSO, exceptions being 82542, 82543, 82547.
TSO6 may be possible for some chipsets but I am still working through
my testing matrix and that is hidden behind hw.em.unsupported_tso.
If you encounter issues, you may disable TSO with for example:
ifconfig em0 -tso -tso6.
I ask to be informed of any deviations from normal operation requiring
this.
Thanks to cc@ for access to emulab.net.
On a sample I219 system it saves about 16% CPU on IPv4 and 19% on IPv6.
iperf3 -Vc reported numbers:
total% user% system%
IPv4 TSO
21.3 7 14.4
21.4 6 15.4
21.5 6 15.5
IPv4 no TSO
36.8 5.4 31.4
38.5 5.1 33.5
38.2 5.7 32.6
IPv4 no TSO no TXCSUM
45.1 5.8 39.3
46 6.3 39.7
46.2 5.9 40.4
IPv6 TSO6
21.7 5.4 16.3
21.6 5.1 16.5
21.9 5.6 16.3
IPv6 no TSO6
41.2 5.2 36
41 5.1 36
40.8 5.2 35.7
IPv6 no TSO6 no TXCSUM6
49 5.9 43.1
48.8 4.9 43.9
49 5.6 43.4
Tested by: cc (lem(4)), karels (82574L)
MFC after: 3 months
Relnotes: yes
Sponsored by: BBOX.io
Differential Revision: https://reviews.freebsd.org/D41170
This feature masks TSO capability when a link comes up at 10 or 100mbit
due to errata on the chips. This behavior matches previous versions of
FreeBSD as well as NetBSD and Linux.
A tunable, hw.em.unsupported_tso may be set if the admin desires to
disabling automasking and configure TSO settings manually.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41170
Also disable IPV6 checksum offload.
Spell hw->mac.type < e1000_82543 as e1000_82542. Confusingly, chips
like 82540 and 82541 come later and do not have these issues. There
is no functional change here, as the enum was defined in such a way
it worked correctly. But this reads literally.
MFC after: 1 week
Explicitly set ipcss/ipcse/ipcso for IPv6 per intel SDM as indicated in
inline comments.
Fix and consolidate 82543/82547 hwcsum exemption.
While here rearrange and expand some commentary.
* em(4) obey administrative ifcaps for using hwcsum offload
* em(4) obey administrative ifcaps for hw vlan receive tagging
* em(4) add additional TSO6 ifcap, but disabled by default as is TSO4
* lem(4) obey administrative ifcaps for using hwcsum offload
* lem(4) add support for hw vlan receive tagging
* lem(4) Add ifcaps for TSO offload experimentation, but disabled by
default due to errata and possibly missing txrx code.
* lem(4) disable HWCSUM ifcaps by default on 82547 due to errata around
full duplex links. It may still be administratively enabled.
Reviewed by: markj (previous version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30072
* em(4) obey administrative ifcaps for using hwcsum offload
* em(4) obey administrative ifcaps for hw vlan receive tagging
* em(4) add additional TSO6 ifcap, but disabled by default as is TSO4
* lem(4) obey administrative ifcaps for using hwcsum offload
* lem(4) add support for hw vlan receive tagging
* lem(4) Add ifcaps for TSO offload experimentation, but disabled by
default due to errata and possibly missing txrx code.
* lem(4) disable HWCSUM ifcaps by default on 82547 due to errata around
full duplex links. It may still be administratively enabled.
Reviewed by: markj (previous version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30072
Always set TXD_CMD_IP for 82544
Otherwise set TXD_CMD_IP for IPv4, not IPv6
Reviewed by: markj (previous version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30072
VLAN 0 essentially means "Treat as untagged, but with priority bits",
and is used by some ISPs.
On igb/em interfaces we did not receive packets with VLAN tag 0 unless
vlanhwfilter was disabled.
This can be fixed by explicitly listing VLAN 0 in the hardware VLAN
filter (VFTA). Do this from em_setup_vlan_hw_support(), where we already
(re-)write the VFTA.
Reviewed by: kbowling
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D40046
Ungate DMA clock on TGP and later to avoid packet loss.
A similar fix appears in Linux 639e298f432fb058a9496ea16863f53b1ce935fe
This may be needed as far back as SPT but no confirmation from intel or
other OS yet.
Obtained from: OpenBSD (if_em_hw.c 1.116)
MFC after: 2 weeks
Sponsored by: BBOX.io
This call only makes sense for ich8lan, and the shared code does it in
e1000_setup_init_funcs() above this deletion.
Obtained from: DPDK
MFC after: 2 weeks
Sponsored by: BBOX.io
Pull Request: https://github.com/freebsd/freebsd-src/pull/539
Incrementing these to avoid confusion in users; we are on par with these
out of tree versions.
Reviewed by: erj
MFC after: 2 weeks
Sponsored by: BBOX.io
Pull Request: https://github.com/freebsd/freebsd-src/pull/540
Clear the rings before reset to avoid a HW hang.
Inspired by em-7.7.8 and DPDK (1fc9701238edcf0541289b9ae15565b6d9d7ab30)
Reviewed by: erj
MFC after: 2 weeks
Sponsored by: BBOX.io
Pull Request: https://github.com/freebsd/freebsd-src/pull/540
These include I219 (20) through I219 (23), which ends at Raptor Lake.
This also corrects a discrepancy where the (16) devices should be
mac type "e1000_pch_tgp" and not "e1000_pch_adp".
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
PR: 269224
Reviewed by: erj@
MFC after: 1 day
Relnotes: yes
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D38376
Extend the size of the local rx_buffer_size variable to account for
larger buffer sizes possible on 82580, i350 chips.
From i350 datasheet, 6.2.10 Initialization Control 4 (LAN Base Address
+ Offset 0x13):
When 4 ports are enabled maximum buffer size is 36 KB. When 2 ports are
enabled maximum buffer size is 72 KB. When only a single port is
enabled maximum buffer size is 144 KB.
and 8.3:
The overall available internal buffer size in the I350 for all ports is
144 KB for receive buffers and 80 KB for transmit Buffers. Disabled
ports memory can be shared between active ports and sharing can be
asymmetric. The default buffer size for each port is loaded from the
EEPROM on initialization.
From the reporter:
But for I350 when only 2 ports are used PBA size can be set as 72KB
(see datasheet RXPbsize or e1000_rxpbs_adjust_82580 function in
e1000_82575.c). In this case calculating the rx_buffer_size overflows
as 0x0048 << 10 = 73728 or 0x12000 pushed into u16. It is then set as
0x2000 or 8192.
PR: 263896
Reported by: hannula@gmail.com
Tested by: hannula@gmail.com
Approved by: markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D35167
Currently if an e1000 interface is set to a fixed media configuration,
for gigabit, it will participate in auto-negotiation as required by
IEEE 802.3-2018 Clause 37. However, if set to fixed media configuration
for 100 or 10, it does NOT participate in auto-negotiation.
By my reading of Clauses 28 and 37, while auto-negotiation is optional
for 100 and 10, it is not prohibited and is, in fact, "highly
recommended".
This patch enables auto-negotiation for fixed 100 and 10 media
configuration, in a similar manner to that already performed for 1000.
I.e., the patch enables advertising of just the manually configured
settings with the goal of allowing the remote end to match the manually
configured settings if it has them available.
To be clear, this patch does NOT allow an em(4) interface that has been
manually configured with specific media settings to respond to
auto-negotiation by then configuring different parameters to those that
were manually configured. The intent of this patch is to fully comply
with the requirements of Clause 37, but for 100 and 10.
The need for this has arisen on an em(4) link where the other end is
under a different administrative control and is set to full
auto-negotiation. Due to the cable length GigE is not working well. It
is desired to set the em(4) end to "media 100baseTX mediatype
full-duplex" which does work when both ends are configured that way.
Currently, because em(4) does not participate in autoneg for this
setting, the remote defaults to half-duplex - i.e., there's a duplex
mismatch and things don't work. With this patch, em(4) would inform the
remote that it has only 100baseTX full, the remote would match that and
it will work.
Approved by: erj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D34449