Commit graph

31 commits

Author SHA1 Message Date
Hans Petter Selasky
d2cbfbc57b mlx5/mlx4: Bump driver version to 3.7
While at it only output driver version to dmesg(8) when hardware is present.

Differential Revision:	https://reviews.freebsd.org/D29100
MFC after:	1 week
Reviewed by:	kib and markj
Sponsored by:	NVIDIA Networking
2021-07-28 13:47:05 +02:00
Hans Petter Selasky
b633e08c70 ibcore: Kernel space update based on Linux 5.7-rc1.
Overview:

This is the first stage of a RDMA stack upgrade introducing kernel
changes only based on Linux 5.7-rc1.

This patch is based on about four main areas of work:
- Update of the IB uobjects system:
  - The memory holding so-called AH, CQ, PD, SRQ and UCONTEXT objects
    is now managed by ibcore. This also require some changes in the
    kernel verbs API. The updated verbs changes are typically about
    initialize and deinitialize objects, and remove allocation and
    free of memory.

- Update of the uverbs IOCTL framework:
  - The parsing and handling of user-space commands has been
    completely refactored to integrate with the updated IB uobjects
    system.

- Various changes and updates to the generic uverbs interfaces in
  device drivers including the new uAPI surface.

- The mlx5_ib_devx.c in mlx5ib and related mlx5 core changes.

Dependencies:

- The mlx4ib driver code has been updated with the minimum changes
needed.

- The mlx5ib driver code has been updated with the minimum changes
needed including DV support.

Compatibility:

- All user-space facing APIs are backwards compatible after this
  change.

- All kernel-space facing RDMA APIs are backwards compatible after
  this change, with exception of ib_create_ah() and ib_destroy_ah()
  which takes a new flag.

- The "ib_device_ops" structure exist, but only contains the driver ID
  and some structure sizes.

Differences from Linux:

- Infiniband drivers must use the INIT_IB_DEVICE_OPS() macro to set
  the sizes needed for allocating various IB objects, when adding
  IB device instances.

Security:

- PRIV_NET_RAW is needed to use raw ethernet transmit features.
- PRIV_DRIVER is needed to use other privileged operations.

Based on upstream Linux, Torvalds (5.7-rc1):
8632e9b5645bbc2331d21d892b0d6961c1a08429

MFC after:	1 week
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31149
Sponsored by:	NVIDIA Networking
2021-07-28 13:28:29 +02:00
Hans Petter Selasky
c8301cbb0f mlx4: Map core_clock page to user space only when allowed
Currently when we map the hca_core_clock page to the user space,
there are vulnerable registers, one of which is semaphore, on
this page as well. If user read the wrong offset, it can modify the
above semaphore and hang the device.

Hence, mapping the hca_core_clock page to the user space only when
user required it specifically.

After this patch, mlx4 core_clock won't be mapped to user space by
default. Oppose to current state, where mlx4 core_clock is always mapped
to user space.

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:35 +02:00
Hans Petter Selasky
30416d4e82 mlx4ib and mlx5ib: Set slid to zero in Ethernet completion struct
IB spec says that a lid should be ignored when link layer is Ethernet,
for example when building or parsing a CM request message (CA17-34).
However, since ib_lid_be16() and ib_lid_cpu16()  validates the slid,
not only when link layer is IB, we set the slid to zero to prevent
false warnings in the kernel log.

Linux commit:
65389322b28f81cc137b60a41044c2d958a7b950

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:34 +02:00
Hans Petter Selasky
c3987b8ea7 ibcore: Declare ib_post_send() and ib_post_recv() arguments const
Since neither ib_post_send() nor ib_post_recv() modify the data structure
their second argument points at, declare that argument const. This change
makes it necessary to declare the 'bad_wr' argument const too and also to
modify all ULPs that call ib_post_send(), ib_post_recv() or
ib_post_srq_recv(). This patch does not change any functionality but makes
it possible for the compiler to verify whether the
ib_post_(send|recv|srq_recv) really do not modify the posted work request.

Linux commit:
f696bf6d64b195b83ca1bdb7cd33c999c9dcf514
7bb1fafc2f163ad03a2007295bb2f57cfdbfb630
d34ac5cd3a73aacd11009c4fc3ba15d7ea62c411

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:33 +02:00
Hans Petter Selasky
d92a9e5604 ibcore: Simplify ib_modify_qp_is_ok().
All callers to ib_modify_qp_is_ok() provides enum ib_qp_state makes the
checks of out-of-scope redundant. Let's remove them together with updating
function signature to return boolean result.

While at it remove unused "ll" parameter from ib_modify_qp_is_ok().

Linux commit:
19b1f54099b6ee334acbfbcfbdffd1d1f057216d
d31131bba5a1630304c55ea775c48cc84912ab59

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:32 +02:00
Hans Petter Selasky
4238b4a7a2 ibcore: Introduce ib_port_phys_state enum.
In order to improve readability, add ib_port_phys_state enum to replace
the use of magic numbers.

Linux commit:
72a7720fca37fec0daf295923f17ac5d88a613e1

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:31 +02:00
Bjoern A. Zeeb
1411f52fac mlx4/OFED: replace the struct net_device with struct ifnet
Given all the code does operate on struct ifnet, the last step in this
longer series of changes now is to rename struct net_device to
struct ifnet (that is what it was defined to in the LinuxKPi code).
While mlx4 and OFED are "shared" code the decision was made years ago
to not write it based on the netdevice KPI but the native ifnet KPI
for most of it.  This commit simply spells this out and with that
frees "struct netdevice" to be re-done on LinuxKPI to become a more
native/mixed implementation over time as needed by, e.g., wireless
drivers.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30515
2021-06-18 21:20:08 +00:00
Bjoern A. Zeeb
60afad6fc3 mlx4: replace LinuxKPI macros with ifnet functions
The LinuxKPI net_device actually is an ifnet;  in order to further
clean that up so we can extend "net_device" replace the few macros
inline in mlx4.

Sponsored by:	The FreeBSD Foundation
MFC after:	12 days
Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D30476
2021-05-27 12:26:01 +00:00
Bjoern A. Zeeb
5a461a86cf mlx4: remove no longer needed header
Remove linux/inetdevice.h as neither of the two inline functions there
are used here.

Sposored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	hselasky
X-D-R:		D29366 (extracted as further cleanup)
Differential Revision:	https://reviews.freebsd.org/D29428
2021-03-26 15:56:02 +00:00
Hans Petter Selasky
9a47ae044b Bump driver versions for mlx5en(4) and mlx4en(4).
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
2021-01-08 12:35:55 +01:00
Hans Petter Selasky
6c43a5e9c7 Include GID type when deleting GIDs from HW table under RoCE in mlx4ib.
Refer to the Linux commit mentioned below for a more detailed description.

Linux commit:
a18177925c252da7801149abe217c05b80884798

Requested by:	Isilon
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-11-10 12:58:25 +00:00
Hans Petter Selasky
1866c98e64 Infiniband clients must be attached and detached in a specific order in ibcore.
Currently the linking order of the infiniband, IB, modules decide in which
order the clients are attached and detached. For example one IB client may
use resources from another IB client. This can lead to a potential deadlock
at shutdown. For example if the ipoib is unregistered after the ib_multicast
client is detached, then if ipoib is using multicast addresses a deadlock may
happen, because ib_multicast will wait for all its resources to be freed before
returning from the remove method.

Fix this by using module_xxx_order() instead of module_xxx().

Differential Revision:	https://reviews.freebsd.org/D23973
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2020-07-06 08:50:11 +00:00
Hans Petter Selasky
cf59f7e108 Bump the Mellanox driver version numbers and the FreeBSD version number.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:15:07 +00:00
Hans Petter Selasky
6428c27faf Make sure to error out when arming the CQ fails in mlx4ib and mlx5ib.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:33:09 +00:00
Slava Shwartsman
0f3b263d83 mlx4/mlx5: Updated driver version to 3.5.0
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:25:34 +00:00
Hans Petter Selasky
f4546fa376 Add support for prio-tagged traffic for RDMA in ibcore.
When receiving a PCP change all GID entries are reloaded.
This ensures the relevant GID entries use prio tagging,
by setting VLAN present and VLAN ID to zero.

The priority for prio tagged traffic is set using the regular
rdma_set_service_type() function.

Fake the real network device to have a VLAN ID of zero
when prio tagging is enabled. This is logic is hidden inside
the rdma_vlan_dev_vlan_id() function which must always be used
to retrieve the VLAN ID throughout all of ibcore and the
infiniband network drivers.

The VLAN presence information then propagates through all
of ibcore and so incoming connections will have the VLAN
bit set. The incoming VLAN ID is then checked against the
return value of rdma_vlan_dev_vlan_id().

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:11:53 +00:00
Hans Petter Selasky
87e303056f Bump version information in mlx4ib(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 13:59:46 +00:00
Hans Petter Selasky
c9a80d0289 The mlx4ib(4) should not be loaded before the ibcore is initialized.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 13:58:58 +00:00
Hans Petter Selasky
54b55cbdd9 Disable unsupported disassociate ucontext functionality in mlx4ib(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 13:57:32 +00:00
Hans Petter Selasky
1456d97c01 Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space,
when the link layer is ethernet, requires resolving the remote L3
address into a L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example information like VLAN ID and SCOPE ID, is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:34:52 +00:00
Eitan Adler
02ca39cff2 sys/dev/mlx[45]: fix uses of 1 << 31
Reviewed by:		kib (D13858)
2018-01-12 06:36:44 +00:00
Hans Petter Selasky
8cc487045e Update mlx4ib(4) to Linux 4.9.
Sponsored by:	Mellanox Technologies
2017-11-13 10:49:18 +00:00
Hans Petter Selasky
d05554bb99 The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP can not be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device present. This resolves
interopability issues between iWarp and RoCE in ibcore

Reviewed by:		np @
Differential Revision:	https://reviews.freebsd.org/D12563
Sponsored by:		Mellanox Technologies
MFC after:		3 days
2017-10-20 08:20:15 +00:00
Hans Petter Selasky
f40410e868 Make sure on-stack buffer is properly aligned.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-07-31 12:03:45 +00:00
Hans Petter Selasky
b0259ad374 Fix broken usage of the mlx4_read_clock() function:
- return value has too small width
 - cycle_t is unsigned and cannot be less than zero

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-07-31 09:15:15 +00:00
Hans Petter Selasky
08c6504d07 Improve code readability and fix compilation error when using clang 4.x.
Found by:		emaste @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-15 18:31:09 +00:00
Ed Maste
f0473bfa82 mlx(4): remove date from log message
Further to r310425, go one step further and just remove the date.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D8888
2016-12-23 20:14:05 +00:00
Ed Maste
07b2e5c92a mlx: avoid use of __DATE__ to make build reproducible
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D8886
2016-12-22 18:26:21 +00:00
Dimitry Andric
a3228a75a0 After r310171, the kernel version of sscanf() has format string checking
enabled.  This results in a -Werror warning in mlx4ib:

    sys/dev/mlx4/mlx4_ib/mlx4_ib_sysfs.c:90:22: error: format specifies type 'unsigned long long *' but the argument has type 'u64 *' (aka 'unsigned long *') [-Werror,-Wformat]
            sscanf(buf, "%llx", &sysadmin_ag_val);
                         ~~~~   ^~~~~~~~~~~~~~~~

Change sysadmin_ag_val to unsigned long long to avoid the warning.

Reviewed by:	hselasky
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D8831
2016-12-18 15:21:38 +00:00
Hans Petter Selasky
97549c34ec Move the ConnectX-3 and ConnectX-2 driver from sys/ofed into sys/dev/mlx4
like other PCI network drivers. The sys/ofed directory is now mainly
reserved for generic infiniband code, with exception of the mthca driver.

- Add new manual page, mlx4en(4), describing how to configure and load
mlx4en.

- All relevant driver C-files are now prefixed mlx4, mlx4_en and
mlx4_ib respectivly to avoid object filename collisions when compiling
the kernel. This also fixes an issue with proper dependency file
generation for the C-files in question.

- Device mlxen is now device mlx4en and depends on device mlx4, see
mlx4en(4). Only the network device name remains unchanged.

- The mlx4 and mlx4en modules are now built by default on i386 and
amd64 targets. Only building the mlx4ib module depends on
WITH_OFED=YES .

Sponsored by:	Mellanox Technologies
2016-09-30 08:23:06 +00:00