Commit graph

138453 commits

Author SHA1 Message Date
Andrew Turner
dcfd605871 Add more arm64 external abort sources
These will be used when we support the Arm Reliability, Availability,
and Serviceability extension.

Sponsored by:	The FreeBSD Foundation
2021-08-04 18:50:42 +00:00
Alexander V. Chernikov
f3a3b06121 [lltable] Unify datapath feedback mechamism.
Use newly-create llentry_request_feedback(),
 llentry_mark_used() and llentry_get_hittime() to
 request datapatch usage check and fetch the results
 in the same fashion both in IPv4 and IPv6.

While here, simplify llentry_provide_feedback() wrapper
 by eliminating 1 condition check.

MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D31390
2021-08-04 22:52:43 +00:00
John Baldwin
c51e4962a3 Document kern.log_wakeups_per_second.
PR:		148680
MFC after:	2 weeks
2021-08-04 11:50:34 -07:00
Mitchell Horne
1b8db4b4e3 arm: enable stack-smashing protection
With current generation clang/llvm it can pass all of our tests in
libc/ssp.

While here, remove the extra MACHINE_CPUARCH check for mips. SSP is
included in BROKEN_OPTIONS for this architecture in src.opts.mk, which
is enough to ensure normal builds won't set SSP_CFLAGS.

Reviewed by:	kevans, imp, emaste
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31400
2021-08-04 15:23:22 -03:00
Mitchell Horne
8399d923a5 hwpmc_intel: assert for correct nclasses value
This variable is set based on the exact CPU model detected. If this
value is set too small, it could lead to a NULL-dereference from an
improperly initialized pmc_rowindex_to_classdep array.

Though it has been fixed, this was previously the case for Broadwell.
Add two asserts to catch this in DEBUG kernels, as it represents a
configuration error that may be hard to uncover otherwise.

PR:		253687
Reported by:	Zhenlei Huang <zlei.huang@gmail.com>
Sponsored by:	The FreeBSD Foundation
2021-08-04 15:23:22 -03:00
Mitchell Horne
4f35e8cba2 hwpmc: disable uncore class on Sandy Bridge and newer
It was written for Nehalem and Westmere, with minor but incomplete
updates for Sandy Bridge in 78d763a29b. The uncore architecture
changed significantly with this generation, bringing new layouts and
locations for some MSRs.

Misprogramming these MSRs in ucp_start_pmc() may panic the system, and
this is trivially reproducible via pmcstat(8) on at least Broadwell and
Haswell. Disable the class on these CPUs until it can be updated more
completely and leave a TODO comment detailing some of the work required.
Note that the nclasses value for Broadwell was already incorrect and
doesn't need changing.

The result is that any uncore events listed by pmcstat -L will no longer
be allocatable, but this is already the case for newer generations of
Intel CPUs.

PR:		253687
Reported by:	Zhenlei Huang <zlei.huang@gmail.com>
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31389
2021-08-04 15:23:22 -03:00
Konstantin Belousov
0ef5eee9d9 Add vn_lktype_write()
and remove repetetive code that calculates vnode locking type for write.

Reviewed by:	khng, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31405
2021-08-04 19:40:13 +03:00
Warner Losh
e94f1a0a37 Revert "arm: remove fslsdma from GENERIC"
The firmware was already in the tree when I did this commit, and I
missed the message. The bug was obsolete.

This reverts commit 9e3761d126.

PR:		237466
Sponsored by:	Netflix
2021-08-03 20:10:32 -06:00
Andrew Turner
8b3bd5a2b5 Only store the arm64 ID registers in the cpu_desc
There is no need to store a pointer to the CPU implementer and part
strings. Switch to load them directly into the sbuf used to print them
on boot.

While here print the machine ID register when we fail to determine the
implementer or part we are booting on.

Reviewed by:	markj, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31346
2021-08-03 01:42:40 +00:00
Andrew Turner
1a78f44cd2 Move setting arm64 HWCAP values to the ID tables
The HWCAPS values are based on the ID registers. Move setting these
to the existing ID register parsing code.

Previously we would need to handle all possible ID field values where
a HWCAP is set, however as most ID fields follow a scheme where when
the field increments it will only add new features meaning we only
need to check if the field is greater than when the HWCAP feature
was added.

While here stop setting HWCAP value that need kernel support, but this
support is missing.

Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31201
2021-08-03 01:42:39 +00:00
Gleb Smirnoff
29b4fa7876 hdaa: add missing break in hdac_pin_patch().
Fixes driver attach on my Thinkpad X1 Carbon, and likely on
many other ALC family devices.

Fixes:	ef790cc740
2021-08-03 03:09:32 -07:00
Kornel Duleba
dfcaa2c18b enetc_mdio: Support building the driver as a loadable module.
After recent arm64 GENERIC config cleanup the ENETC MDIO
in NXP LS1028A SoC should support being loaded as a module.

Obtained from: Semihalf
Sponsored by: Alstom Group
2021-08-03 12:07:49 +02:00
Kornel Duleba
5ad6d28cbe enetc: Support building the driver as a loadable module.
Function level reset has to be done in attach in order to put the
hardware in a known state before configuring it.
The order of DRIVER_MODULEs was changed to ensure that the miibus driver
is loaded when mii_attach is called.

Obtained from: Semihalf
Sponsored by: Alstom Group
2021-08-03 12:07:49 +02:00
Kornel Duleba
bfce69ae9a dtb: freescale: Add fsl-ls1028a-rdb to the build
With the recent inclusion of ENETC networking driver we now support this
board.

Obtained from: Semihalf
Sponsored by: Alstom Group
2021-08-03 12:07:49 +02:00
Marcin Wojtas
451bcf1b36 Introduce driver for Freescale Felix switch
It is found on boards equipped with LS1028A SoC.
802.1q VLAN grouping is supported.
An external MDIO device is used for communicating with PHYs.
The driver is built as a module by default, it is not included
in GENERIC kernel config.

Submitted by: Lukasz Hajec <lha@semihalf.com>
              Kornel Duleba <mindal@semihalf.com>
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30923
2021-08-03 12:07:49 +02:00
Kornel Duleba
f5b29d0f35 etherswitch: Add a new striptagingress port flag
Felix switch found in LS1028A supports stripping VLAN tag on
ingress, instead of egress. The striptag flag excepts the latter
behaviour.
Add a new flag to support the feature.

Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D30922
2021-08-03 12:07:48 +02:00
Kyle Evans
04cc0c393c malloc(9): provide missing malloc_aligned implementation
Pointy hat:	kevans
Fixes:	6162cf885c ("malloc(9): Document/complete aligned variants")
2021-08-02 21:12:39 -05:00
Warner Losh
d00131e154 clock_id: These symbols weren't in 4.4BSD, adjust copyright
Peter Wemm added the first CLOCK_* symbols in 0f5ed9f420 in 1997
after obtaining them from NetBSD. In NetBSD, jtc@netbsd.org committed
them in sys/sys/time.h rev 1.19 dated 1996/11/15, along with all the
system calls associated with 1003.1b. FreeBSD's values are, however,
different than NetBSD's today. The USL/UCB lawsuit was settled in 1994,
so these couldn't have been derived from material provided to University
of California covered in that settlement. This file does not need the
settlement disclaimer.

Furthermore, I rewrote most of the code (except the symbols and their
values) when merging it from time.h and sys/time.h. Most of the creative
content of the file is new, so update copyright to reflect that.

Reviewed by:	kaktus
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D31369
2021-08-02 15:50:32 -06:00
Adam Fenn
6162cf885c malloc(9): Document/complete aligned variants
Comments on a pending kvmclock driver suggested adding a
malloc_aligned() to complement malloc_domainset_aligned(); add it now,
and document both.

Reviewed by:	imp, kib, allanjude (manpages)
Differential Revision:	https://reviews.freebsd.org/D31004
2021-08-02 15:36:14 -05:00
Eric van Gyzen
428624130a Fix lockstat:::thread-spin dtrace probe with LOCK_PROFILING
The spinning start time is missing from the calculation due to a
misplaced #endif.  Return the #endif where it's supposed to be.

Submitted by:	Alexander Alexeev <aalexeev@isilon.com>
Reviewed by:	bdrewery, mjg
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D31384
2021-08-02 14:44:23 -05:00
John Baldwin
d59f1c49e2 cxgbe tom: Permit rcv_nxt mismatches on FIN for iSCSI connections on T6.
The remote peer might send a FIN in the middle of a burst of data
PDUs.  In the case of T6 with data PDU completion moderation, the
driver would not have seen these PDUs since the final PDU in the burst
was never received resulting in a stale rcv_nxt when the FIN is
received.

While here, invert the logic in the condition to be more readable and
always set tp->rcv_nxt from the sequence number in the CPL.  This sets
the proper value of rcv_nxt for FINs on connections with data received
but not reported via a CPL (e.g. a partial iSCSI PDU burst interrupted
by a FIN).

Reported by:	Jithesh Arakkan @ Chelsio
Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D30871
2021-08-02 09:41:27 -07:00
Kristof Provost
600745f1e2 pf: bound DIOCGETSTATES memory use
Similar to what we did earlier for DIOCGETSTATESV2 we only allocate
enough memory for a handful of states and copy those out, bit by bit,
rather than allocating memory for all states in one go.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-02 16:29:23 +02:00
Adam Fenn
8ca384eb1d devclass_alloc_unit: move "at" hint test to after device-in-use test
Only perform this expensive operation when the unit number is a
potential candidate (i.e. not already in use), thereby reducing device
scan time on systems with many devices, unit numbers, and drivers.

Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
X-NetApp-PR:	#61
Differential Revision:	https://reviews.freebsd.org/D31381
2021-08-02 11:27:17 -05:00
Alexander Motin
ca34553b6f sched_ule(4): Pre-seed sched_random().
I don't think it changes anything, but why not.

While there, make cpu_search_highest() use all 8 lower load bits for
noise, since it does not use cs_prefer and the code is not shared
with cpu_search_lowest() any more.

MFC after:	1 month
2021-08-02 10:55:28 -04:00
Aleksandr Rybalko
aed2afeb51 Ignore ResourceProducer flag for:
o Arm CoreLink TM CMN-600 Coherent Mesh Network controller,
o Arm CoreLink DMC-620 Dynamic Memory Controller.

Sponsored by:	Ampere Computing LLC
Submitted by:	Klara Inc.
2021-08-02 14:11:20 +03:00
Ka Ho Ng
df95cc76af vmm: Bump vmname buffer in struct vm to VM_MAX_NAMELEN + 1
In hw.vmm.create sysctl handler the maximum length of vm name is
VM_MAX_NAMELEN. However in vm_create() the maximum length allowed is
only VM_MAX_NAMELEN - 1 chars. Bump the length of the internal buffer to
allow the length of VM_MAX_NAMELEN for vm name.

MFC after:	3 days
Reviewed by:	grehan
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31372
2021-08-02 17:55:08 +08:00
Roger Pau Monné
82bf6a2566 xen/timer: fix amd64 LINT kernel build
On amd64 XENHVM depends on the xentimer device for PVH early startup,
so both should be added or removed together (like the current
dependency with xenpci). Fix this by adding xentimer to NOTES and
updating the comments on the config files. Note that on i386 there's
no such dependency between xentimer and XENHVM, since there's no PVH
support.

While there also fix the MINIMAL i386 build to include the xentimer,
so it keeps the same functionality as before xentimer was split from
XENHVM.

Reported by: lwhsu
PR: 257549
Fixes: ae59812748 ('xen/timer: make xen timer optional')
2021-08-02 10:33:35 +02:00
Hans Petter Selasky
9340ebd404 Add missing file to sys/conf/files after 469884cf04 .
Found by:	vishwin@
Differential Revision:	https://reviews.freebsd.org/D29921
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-08-02 08:25:04 +02:00
Alexander Motin
8bb173fb5b sched_ule(4): Use trylock when stealing load.
On some load patterns it is possible for several CPUs to try steal
thread from the same CPU despite randomization introduced.  It may
cause significant lock contention when holding one queue lock idle
thread tries to acquire another one.  Use of trylock on the remote
queue allows both reduce the contention and handle lock ordering
easier.  If we can't get lock inside tdq_trysteal() we just return,
allowing tdq_idled() handle it.  If it happens in tdq_idled(), then
we repeat search for load skipping this CPU.

On 2-socket 80-thread Xeon system I am observing dramatic reduction
of the lock spinning time when doing random uncached 4KB reads from
12 ZVOLs, while IOPS increase from 327K to 403K.

MFC after:	1 month
2021-08-01 22:42:01 -04:00
Alexander Motin
2668bb2add sched_ule(4): Reduce duplicate search for load.
When sched_highest() called for some CPU group returns nothing, idle
thread calls it for the parent CPU group.  But the parent CPU group
also includes the CPU group we've just searched, and unless there is
a race going on, it is unlikely we find anything new this time.

Avoid the double search in case of parent group having only two sub-
groups (the most prominent case). Instead of escalating to the parent
group run the next search over the sibling subgroup and escalate two
levels up after if that fail too.  In case of more than two siblings
the difference is less significant, while searching the parent group
can result in better decision if we find several candidate CPUs.

On 2-socket 40-core Xeon system I am measuring ~25% reduction of CPU
time spent inside cpu_search_highest() in both SMT (2x20x2) and non-
SMT (2x20) cases.

MFC after:	1 month
2021-08-01 22:07:51 -04:00
Konstantin Belousov
665895db26 amd64 pmap_vm_page_alloc_check(): loose the assert
Current expression checks that vm_page_alloc(9) never returns a page
belonging to the preload area.  This is not true if something was freed
from there, for instance a preloaded module was unloaded, or ucode update
freed.

Only check that we never allow to allocate a page belonging to the kernel
proper, check against _end.

Reported and tested by:	dhw
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-08-02 03:28:33 +03:00
Konstantin Kukushkin
a61c24ddb7 udp: Fix soroverflow SOCKBUF unlocking
We hold the SOCKBUF_LOCK so use soroverflow_locked here.
This bug may manifest as a non-killable process stuck in [*so_rcv].

Approved by:	scottl
Reviewed by:	Roy Marples <roy@marples.name>
Fixes:	7045b1603b
MFC after:  10 days
Differential Revision:	https://reviews.freebsd.org/D31374
2021-08-01 08:07:33 -07:00
Konstantin Belousov
1a55a3a729 amd64 pmap_vm_page_alloc_check(): print more data for failed assert
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-08-01 16:42:02 +03:00
Alexander V. Chernikov
5b42b494d5 Fix typo in rib_unsibscribe<_locked>().
Submitted by:	Zhenlei Huang<zlei.huang at gmail.com>
Differential Revision: https://reviews.freebsd.org/D31356
2021-08-01 13:29:52 +00:00
Alexander V. Chernikov
054948bd81 [multipath][nhops] Fix random crashes with high route churn rate.
When certain multipath route begins flapping really fast, it may
 result in creating multiple identical nexthop groups. The code
 responsible for unlinking unused nexthop groups had an implicit
 assumption that there could be only one nexthop group for the
 same combination of nexthops with weights. This assumption resulted
 in always unlinking the first "identical" group, instead of the
 desired one. Such action, in turn, produced a used-but-unlinked
 nhg along with freed-and-linked nhg, ending up in random crashes.

Similarly, it is possible that multiple identical nexthops gets
 created in the case of high route churn, resulting in the same
 problem when deleting one of such nexthops.

Fix by matching the nexthop/nexhop group pointer when deleting the item.

Reported by:	avg
MFC after:	1 week
2021-08-01 10:07:37 +00:00
Edward Tomasz Napierala
60fb9e10c7 cam: enable kern.cam.da.enable_uma_ccbs by default
This makes the da(4) driver use UMA for its CCBs by default,
like ada(4) already does.  Please let me know via email
if you notice any suspicious kernel messages,

Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31257
2021-08-01 09:40:42 +00:00
Bjoern A. Zeeb
22e20d852f LinuxKPI: fix bug in le32p_replace_bits()
Fix a bug that slipped in in 90707c4e44
using the correct field in le32p_replace_bits().

MFC after:	3 days
Reviewed by:	hselasky
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31352
2021-07-31 22:15:35 +00:00
Kevin Bowling
ff01d6343f igb: clean up igb_txrx comments
Reviewed by:	grehan
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D31227
2021-07-31 08:04:25 -07:00
Kevin Bowling
d02e436353 igc: sync igc_txrx with igb(4)
Reviewed by:	grehan
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31227
2021-07-31 08:00:16 -07:00
Konstantin Belousov
041b7317f7 Add pmap_vm_page_alloc_check()
which is the place to put MD asserts about allocated pages.

On amd64, verify that allocated page does not belong to the kernel
(text, data) or early allocated pages.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-31 16:53:42 +03:00
Konstantin Belousov
e18380e341 amd64: do not assume that kernel is loaded at 2M physical
Allow any 2M aligned contiguous location below 4G for the staging
area location.  It should still be mapped by loader at KERNBASE.

The assumption kernel makes about loader->kernel handoff with regard to
the MMU programming are explicitly listed at the beginning of hammer_time(),
where kernphys is calculated.  Now kernphys is the variable instead of
symbol designating the physical address.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-31 16:53:42 +03:00
Hans Petter Selasky
792b602a33 Bump the FreeBSD version after making FPU sections thread-safe in the LinuxKPI.
Differential Revision:	https://reviews.freebsd.org/D29921
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-07-31 15:40:57 +02:00
Hans Petter Selasky
469884cf04 LinuxKPI: Make FPU sections thread-safe and use the NOCTX flag.
Reviewed by:	kib
Submitted by:	greg@unrelenting.technology
Differential Revision:	https://reviews.freebsd.org/D29921
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-07-31 15:36:48 +02:00
Jung-uk Kim
97c0b5ab18 acpica: Import ACPICA 20210730
(cherry picked from commit 34cfdff1f386b2d7bf0a8ea873acf604753991e6)
2021-07-31 00:16:27 -04:00
Warner Losh
155f15118a clock_gettime: Add Linux aliases for CLOCK_*
Linux standardized what we call CLOCK_{REALTIME,MONOTONIC}_FAST as
CLOCK_{REALTIME,MONOTONIC}_COARSE. In addition, Linux spells
CLOCK_UPTIME as CLOCK_BOOTTIME.

Add aliases to time.h and document these new aliases in
clock_gettime(2).

Reviewed by:		vangyzen, kib (prior), dchagin (prior)
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D30988
2021-07-30 17:20:22 -06:00
Warner Losh
7b797ba27a time.h: reduce CLOCK_ namespace pollution, move to _clock_id.h
Attempt to comply with the strict namespace pollution requirements of
_POSIX_C_SOURCE. Add guards to limit visitbility of CLOCK_ and TIMER_
defines as appropriate. Only define the CLOCK_ variables relevant to the
specific standards. Move all the sharing to sys/_clock_id.h and make
time.h and sys/time.h both include that rather than copy due to the
now large number of clocks and compat defines.

Please note: The old time.h previously used these newer dates:
	CLOCK_REALTIME			199506
	CLOCK_MONOTONIC			200112
	CLOCK_THREAD_CPUTIME_ID		200112
	CLOCK_PROCESS_CPUTIME_ID	200112

but glibc defines all of these for 199309. glibc uses this date for all
these values, however, only CLOCK_REALTIME was in IEEE 1003.1b. Add a
comment about this to document it. A large number of programs and
libraries assume that these will be defined for _POSIX_C_SOURCE =
199309.

In addition, leak CLOCK_UPTIME_FAST for the pocl package until it can be
updated to use a simple CLOCK_MONOTONIC.

Reviewed by:		kib
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D31056
2021-07-30 17:20:22 -06:00
John Baldwin
d0d631d5f4 cxgbei: Round up the maximum PDU data length by the MSS for TXDATAPLEN_MAX.
Recent firmware versions round down the value passed here by the MSS
and subsequently mishandle transmitted PDUs larger than the rounded
down value.

Reported by:	Jithesh Arakkan @ Chelsio
Sponsored by:	Chelsio Communications
2021-07-30 13:27:24 -07:00
Mark Johnston
b2ed7e988a bus: Convert to the new interceptor scheme
This was missed in commit a90d053b84.

Fixes:		a90d053b84
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-07-30 15:15:27 -04:00
Kristof Provost
b69019c14c pf: remove DIOCGETSTATESNV
While nvlists are very useful in maximising flexibility for future
extensions their performance is simply unacceptably bad for the
getstates feature, where we can easily want to export a million states
or more.

The DIOCGETSTATESNV call has been MFCd, but has not hit a release on any
branch, so we can still remove it everywhere.

Reviewed by:	mjg
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31099
2021-07-30 11:45:28 +02:00
Alexander Motin
9d3b47abbb ipmi(4): Add more watchdog error checks.
Add request submission status checks before checking req->ir_compcode,
otherwise it may be zero just because of initialization.

Add checks for req->ir_compcode errors in ipmi_reset_watchdog() and
ipmi_set_watchdog().  In first case explicitly check for 0x80, which
means timer was not previously set, that I found happening after BMC
cold reset.  This change makes watchdog timer to recover instead of
permanently ignoring reset errors after BMC reset or upgraded.

MFC after:	2 weeks
Sponsored by:   iXsystems, Inc.
2021-07-29 23:39:04 -04:00