mirror of
https://github.com/opnsense/docs.git
synced 2026-06-09 08:32:07 -04:00
Troubleshooting / Performance: Add RSS documentation
parent
e7f2380702
commit
64b2813762
2 changed files with 131 additions and 0 deletions
|
|
@@ -76,3 +76,4 @@ be found in the list below.
|
|||
troubleshooting/gateways
|
||||
troubleshooting/network
|
||||
troubleshooting/openvpn
|
||||
troubleshooting/performance
|
||||
|
|
|
|||
130
source/troubleshooting/performance.rst
Normal file
|
|
@@ -0,0 +1,130 @@
|
|||
=======================================
|
||||
Performance
|
||||
=======================================
|
||||
|
||||
Receive-side scaling
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
|
||||
**Overview**
|
||||
=====================================================================================================================
|
||||
|
||||
RSS distributes packets over CPU cores using a hash function, computed either in hardware (offloaded to the NIC) or in software.
The idea is to take the TCP 4-tuple (source address, source port, destination address, destination port) of a packet as input, hash it using
an in-kernel defined key, and use the least significant bits (LSBs) of the resulting value as an index into a user-configurable indirection table.
The indirection table is loaded into the hardware during boot and is used by the NIC to decide which CPU to interrupt for a given packet.
As a result, packets with the same origin/destination (a.k.a. flows) are queued consistently on the same CPU.
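
The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a Toeplitz hash (a common choice for RSS), an arbitrary placeholder key instead of the in-kernel one, and a hypothetical helper ``flow_to_cpu`` with a simple round-robin indirection table. It is not the kernel or NIC implementation.

```python
import socket
import struct

# Arbitrary 40-byte key for illustration only; the kernel supplies the real key.
EXAMPLE_KEY = bytes(range(1, 41))


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                # XOR in the 32-bit key window that starts at this input bit.
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_to_cpu(src, sport, dst, dport, bits, table):
    """Hypothetical helper: map a TCP/IPv4 flow to a CPU via the table."""
    # Assumed input order: both addresses first, then both ports.
    data = (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!HH", sport, dport))
    # The low `bits` bits of the hash select the indirection table entry.
    bucket = toeplitz_hash(EXAMPLE_KEY, data) & ((1 << bits) - 1)
    return table[bucket]


BITS = 2    # 2 bits -> 4 buckets
NCPUS = 4
# Assumed table layout: buckets assigned to CPUs round-robin.
table = [bucket % NCPUS for bucket in range(1 << BITS)]

cpu = flow_to_cpu("192.0.2.10", 40000, "198.51.100.5", 443, BITS, table)
print(cpu)
```

Because the hash depends only on the 4-tuple and the key, every packet of the same flow lands in the same bucket, and therefore on the same CPU.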
|
||||
|
||||
.. note::
|
||||
|
||||
    **By default, RSS is disabled on OPNsense to prevent unexpected side effects.** Users have noted performance improvements,
    but performance degradation is also possible. Use this feature with care and see what works best
    for you.
|
||||
|
||||
**Driver support**
|
||||
=====================================================================================================================
|
||||
|
||||
Assuming you are using a modern NIC which supports multiple hardware queues and RSS, the configuration of a NIC will decide how and on which queue packets
|
||||
arrive on your system. This is also hardware dependent and will not be the same on every NIC. Should your driver support the option to enable/disable RSS,
|
||||
a sysctl tunable will be available.
|
||||
|
||||
A NIC may perform RSS without offering any way to configure it. To check whether RSS can be enabled or disabled, run:

.. code-block::

    sysctl -a | grep rss

This should list any drivers exposing the option via a tunable.
|
||||
|
||||
It is also possible that a driver does not expose this ability to the user, in which case you will want to look up whether the NIC/driver supports RSS at all using online
datasheets or a simple Google search. For example, igb enables RSS by default, but does not reflect this in any configuration parameter. However, it does use multiple queues:

.. code-block::

    # dmesg | grep vectors

    igb0: Using MSI-X interrupts with 5 vectors
    igb1: Using MSI-X interrupts with 5 vectors
    igb2: Using MSI-X interrupts with 5 vectors
    igb3: Using MSI-X interrupts with 5 vectors

A NIC like this will most likely have some form of packet filtering to distribute packets over the hardware queues. In fact, igb does RSS by default.
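
When checking several interfaces, the ``dmesg`` lines can be summarized with a small script. The following is a minimal sketch, assuming the message format matches the igb sample above exactly; the helper name ``msix_vectors`` is made up for this example, and other drivers may word the message differently. More than one vector only hints at multiple queues and does not prove that RSS is active.

```python
import re

# Matches lines such as "igb0: Using MSI-X interrupts with 5 vectors".
VECTOR_LINE = re.compile(r"^(\w+): Using MSI-X interrupts with (\d+) vectors$")


def msix_vectors(dmesg_text: str) -> dict:
    """Map each interface name to its reported MSI-X vector count."""
    vectors = {}
    for line in dmesg_text.splitlines():
        match = VECTOR_LINE.match(line.strip())
        if match:
            vectors[match.group(1)] = int(match.group(2))
    return vectors


# Sample input copied from the dmesg output shown above.
SAMPLE = """\
igb0: Using MSI-X interrupts with 5 vectors
igb1: Using MSI-X interrupts with 5 vectors
"""

counts = msix_vectors(SAMPLE)
# Interfaces with more than one vector are candidates for multi-queue operation.
multi = sorted(name for name, n in counts.items() if n > 1)
print(multi)
```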
|
||||
|
||||
For most NICs, RSS is the primary method of deciding which CPU to interrupt with a packet. NICs that do not implement any other type of filter and whose RSS feature
is missing or turned off will most likely interrupt only CPU 0 at all times, which will reduce potential throughput due to cache line migrations and lock contention.
Please keep system-wide RSS disabled if this is the case.
|
||||
|
||||
Last but not least, driver support for the in-kernel implementation of RSS is a must. Proper driver support ensures that the correct key
and indirection table are set in hardware. Drivers which support RSS according to the source code (but mostly untested):
|
||||
|
||||
* em
* igb -> tested & working
* axgbe -> tested & working
* netvsc
* ixgbe
* ixl
* cxgbe
* lio
* mlx5
* sfxge
|
||||
|
||||
**Kernel support**
|
||||
=====================================================================================================================
|
||||
|
||||
Internally, FreeBSD uses netisr as an abstraction layer for dispatching packets to the upper protocols. Within the implementation, the default setting is to restrict
|
||||
packet processing to one thread only. Since RSS now provides a way to keep flows local to a CPU, the following sysctls should be set in System->Settings->Tunables:
|
||||
|
||||
.. code-block::
|
||||
|
||||
net.isr.bindthreads = 1
|
||||
|
||||
This causes threads to be bound to a CPU.
|
||||
|
||||
.. code-block::
|
||||
|
||||
net.isr.maxthreads = -1
|
||||
|
||||
This assigns a workstream to each CPU core available.
|
||||
|
||||
Furthermore, the RSS implementation also provides a few necessary sysctls:
|
||||
|
||||
.. code-block::
|
||||
|
||||
net.inet.rss.enabled = 1
|
||||
|
||||
This enables RSS. It is disabled by default to prevent regressions on NICs that do not properly implement the RSS interface.
|
||||
|
||||
.. code-block::
|
||||
|
||||
net.inet.rss.bits = X
|
||||
|
||||
This value depends on the number of cores you have. By default, it is the number of bits needed to represent twice the number of cores (for example, 3 bits for 4 cores).
This is done on purpose to provide load balancing, though this is not currently implemented, so it is recommended to set this value to the number of bits
needed to represent the number of CPU cores. This means we use the following values:
|
||||
|
||||
* for 4-core systems, use ‘2’
|
||||
* for 8-core systems, use ‘3’
|
||||
* for 16-core systems, use ‘4’
|
||||
* Etc.
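
The rule behind this list can be written down as a small calculation. The sketch below is illustrative only: the function names are made up, and rounding up for core counts that are not a power of two is an assumption, since the text above only covers 4, 8 and 16 cores.

```python
def rss_bits(ncpus: int) -> int:
    """Number of bits needed to index ncpus buckets."""
    if ncpus < 2:
        # Assumption: RSS tuning is only of interest on multi-core systems.
        raise ValueError("ncpus must be at least 2")
    # Assumption: round up when ncpus is not a power of two.
    return (ncpus - 1).bit_length()


def rss_tunables(ncpus: int) -> dict:
    """Collect the tunables described in this section for a given core count."""
    return {
        "net.isr.bindthreads": 1,
        "net.isr.maxthreads": -1,
        "net.inet.rss.enabled": 1,
        "net.inet.rss.bits": rss_bits(ncpus),
    }


# Print the tunables in the same "name = value" form used in this section.
for name, value in rss_tunables(8).items():
    print(f"{name} = {value}")
```

For an 8-core system this yields ``net.inet.rss.bits = 3``, matching the list above.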
|
||||
|
||||
.. note::
|
||||
|
||||
Assume that all tunables set here require a reboot to properly apply them.
|
||||
|
||||
If RSS is enabled with the 'enabled' sysctl, the packet dispatching policy will move from ‘direct’ to ‘hybrid’. This will dispatch a packet directly in the current context when allowed;
otherwise it will queue the packet on the bound CPU on which it came in. Please note that this will increase the interrupt load as seen in ‘top -P’.
This simply means that packets are being processed with the highest priority in the CPU scheduler. It does not mean the CPU is under more load than normal.
|
||||
|
||||
The correct working of netisr can be verified by running:
|
||||
|
||||
.. code-block::
|
||||
|
||||
netstat -Q
|
||||
|
||||
**Note regarding IPS**
|
||||
=====================================================================================================================
|
||||
|
||||
When Suricata is running in IPS mode, Netmap is utilized to fetch packets off the line for inspection. By default, OPNsense has configured Suricata in such a way that the packet which
|
||||
has passed inspection will be re-injected into the host networking stack for routing/firewalling purposes. The current Suricata/Netmap implementation limits this re-injection to one thread only.
|
||||
Work is underway to address this issue since the new Netmap API (V14+) is now capable of increasing this thread count. Until then, no benefit is gained from RSS when using IPS.
|
||||