130967 commits

Author SHA1 Message Date
Jeff Roberson
c8ea36e881 Fix a recursion on the thread lock by acquiring it after calling rtp_to_pri().
Reported by:	swills
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D23495
2020-02-04 02:42:54 +00:00
Jeff Roberson
dc3915c8c6 Use STAILQ instead of TAILQ for bucket lists. We only need FIFO behavior
and this is more space efficient.

Stop queueing recently used buckets to the head of the list.  If the bucket
goes to a different processor the cache coherency will be more expensive.
We already try to encourage cache-hot behavior in the per-cpu layer.

Reviewed by:	rlibby
Differential Revision:	https://reviews.freebsd.org/D23493
2020-02-04 02:41:24 +00:00
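
The FIFO behavior described in dc3915c8c6 can be sketched with the STAILQ macros from <sys/queue.h>. This is only an illustration, not the UMA code: `struct bucket`, `bucket_enqueue` and `bucket_dequeue` are hypothetical names. The space saving comes from STAILQ_ENTRY holding a single pointer where TAILQ_ENTRY holds two.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a bucket; only the list linkage matters here. */
struct bucket {
	int id;
	STAILQ_ENTRY(bucket) link;	/* one pointer, versus two for TAILQ_ENTRY */
};

STAILQ_HEAD(bucket_list, bucket);

/* Append at the tail so that buckets leave in the order they arrived. */
static void
bucket_enqueue(struct bucket_list *bl, struct bucket *b)
{
	STAILQ_INSERT_TAIL(bl, b, link);
}

/* Take the oldest bucket from the head, or return NULL if the list is empty. */
static struct bucket *
bucket_dequeue(struct bucket_list *bl)
{
	struct bucket *b = STAILQ_FIRST(bl);

	if (b != NULL)
		STAILQ_REMOVE_HEAD(bl, link);
	return (b);
}
```

Nothing is ever inserted at the head, which matches the commit's decision to stop queueing recently used buckets there.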
Navdeep Parhar
87bbb3338e cxgbe(4): Add pfil(9) hooks to the driver's rx.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 01:09:02 +00:00
Navdeep Parhar
1486d2de9e cxgbe(4): Treat NIC rx as special and run its handler directly and not
via the t4_cpl_handler dispatch table.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 01:01:35 +00:00
Navdeep Parhar
46e1e307ed cxgbe(4): Retire the allow_mbufs_in_cluster optimization.
This simplifies the driver's rx fast path as well as the bookkeeping
code that tracks various rx buffer sizes and layouts.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 00:51:10 +00:00
Alex Richardson
febe2bd226 Set the LMA of the riscv kernel to the OpenSBI jump target by default
This allows us to boot FreeBSD RISCV on QEMU using the -kernel command line
option. When using that option, QEMU maps the kernel ELF file to the
addresses specified in the LMAs in the program headers.

Since version 4.2 QEMU ships with OpenSBI fw_jump by default so this allows
booting FreeBSD using the following command line:
qemu-system-riscv64 -bios default -kernel /.../boot/kernel/kernel -nographic -M virt

Without this change the -kernel option cannot be used since the LMAs start
at address zero and QEMU already maps a ROM to these low physical addresses.

For targets that require a different kernel LMA the make variable
KERNEL_LMA can be overridden in the config file. For example, adding
`makeoptions	KERNEL_LMA=0xc0200000` will create an ELF file that will be
loaded at 0xc0200000.

Before:
There are 4 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x001000 0xffffffc000000000 0x0000000000000000 0x75e598 0x8be318 RWE 0x1000
  DYNAMIC        0x71fb20 0xffffffc00071eb20 0x000000000071eb20 0x000100 0x000100 RW  0x8
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x0
  NOTE           0x693400 0xffffffc000692400 0x0000000000692400 0x000024 0x000024 R   0x4

After:

There are 4 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x001000 0xffffffc000000000 0x0000000080200000 0x734198 0x893e18 RWE 0x1000
  DYNAMIC        0x6f7810 0xffffffc0006f6810 0x00000000808f6810 0x000100 0x000100 RW  0x8
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x0
  NOTE           0x66ca70 0xffffffc00066ba70 0x000000008086ba70 0x000024 0x000024 R   0x4

Reviewed By:	br, mhorne (earlier version)
Differential Revision: https://reviews.freebsd.org/D23436
2020-02-04 00:06:16 +00:00
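
The "After" program headers in febe2bd226 are consistent with every segment's PhysAddr being its VirtAddr rebased from the kernel's virtual base onto the LMA. A small sketch of that arithmetic follows. The function name and `KERNEL_VBASE` are hypothetical, and both constants are simply read off the LOAD segment in the listing.

```c
#include <stdint.h>

/* Values copied from the LOAD segment in the "After" program headers. */
#define	KERNEL_VBASE	UINT64_C(0xffffffc000000000)	/* VirtAddr */
#define	KERNEL_LMA	UINT64_C(0x80200000)		/* PhysAddr */

/* Load (physical) address for a virtual address inside the kernel image. */
static uint64_t
kernel_load_addr(uint64_t vaddr)
{
	return (vaddr - KERNEL_VBASE + KERNEL_LMA);
}
```

Applied to the DYNAMIC segment, 0xffffffc0006f6810 maps to 0x808f6810, and the NOTE segment at 0xffffffc00066ba70 maps to 0x8086ba70, matching the PhysAddr column.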
Navdeep Parhar
d6f79b2710 cxgbe(4): Avoid ext_arg2 in rxb_free.
ext_arg2 is the only item in the third cacheline in an mbuf and could be
cold by the time rxb_free runs.  Put the information needed by rxb_free
in the same line as the refcount, which is very likely to be hot given
that rxb_free runs when the refcount is decremented and reaches 0.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:50:29 +00:00
Navdeep Parhar
44c6fea82b cxgbe(4): Do not use pack boundary > 512B unless it is explicitly
requested.

This is a tradeoff between PCIe efficiency during large packet rx and
packing efficiency during small packet rx.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:30:39 +00:00
Navdeep Parhar
a9c4062a9a cxgbe(4): Initialize the rx buffer's metadata on first-use and not on
allocation.

refill_fl doesn't touch any part of a freshly allocated cluster after
this change.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:25:12 +00:00
Navdeep Parhar
9087a3df60 cxgbe(4): Only checksummed TCP should be considered for LRO.
This avoids the per-packet nanouptime in tcp_lro_rx for traffic that's
not even TCP.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:06:42 +00:00
Mark Johnston
e489450589 Fix the !SMP case in sched_add() after r355779.
If the thread's lock is already that of the runqueue, don't recurse on
the queue lock.

Reviewed by:	jeff, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23492
2020-02-03 22:49:05 +00:00
Mateusz Guzik
8151b6e92a fd: partially unengrish the previous commit
2020-02-03 22:34:50 +00:00
Mateusz Guzik
e10f063b30 fd: streamline fget_unlocked
clang has the unfortunate property of paying little attention to prediction
hints when faced with a loop spanning the majority of the routine.

In particular fget_unlocked has an unlikely corner case where it starts almost
from scratch. Faced with this clang generates a maze of taken jumps, whereas
gcc produces jump-free code (in the expected case).

Work around the problem by providing a variant which only tries once and
resorts to calling the original code if anything goes wrong.

While here, note that the 'seq' parameter is almost never passed, so the few
callers that pass it are redirected to call the original code directly.
2020-02-03 22:32:49 +00:00
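
The "try once, then fall back" shape described in e10f063b30 might look roughly like the sketch below. It is not the fget_unlocked code: the table layout, the `in_flux` flag, and all function names are hypothetical, and `predict_false` is defined locally on top of the GCC/Clang `__builtin_expect` builtin.

```c
#include <stddef.h>

#define	predict_false(e)	__builtin_expect((e) != 0, 0)

struct entry {
	void *obj;
	int in_flux;		/* hypothetical "being modified" marker */
};

struct table {
	size_t n;
	struct entry *e;
};

static unsigned long slow_path_calls;	/* instrumentation for the sketch only */

/* Slow path: handles every case, including the ones the fast path rejects. */
static void *
lookup_slow(struct table *t, size_t i)
{
	slow_path_calls++;
	if (i >= t->n)
		return (NULL);
	/* Real code would retry until the entry is stable; here we just settle it. */
	t->e[i].in_flux = 0;
	return (t->e[i].obj);
}

/* Fast path: a single loop-free attempt that bails out on anything unusual. */
static inline void *
lookup(struct table *t, size_t i)
{
	struct entry *e;

	if (predict_false(i >= t->n))
		return (lookup_slow(t, i));
	e = &t->e[i];
	if (predict_false(e->in_flux || e->obj == NULL))
		return (lookup_slow(t, i));
	return (e->obj);
}
```

Because the fast path contains no loop, the prediction hints only have to steer two forward branches, which is the situation the commit message says clang handles well.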
Mateusz Guzik
52604ed792 fd: remove the seq argument from fget_unlocked
It is almost always NULL.
2020-02-03 22:27:55 +00:00
Mateusz Guzik
7f1566f884 fd: remove the seq argument from fget routines
It is almost always NULL.
2020-02-03 22:27:03 +00:00
Mateusz Guzik
4846218d8b seqc: provide seqc_read_any
2020-02-03 22:26:29 +00:00
Mateusz Guzik
0a1427c5ab ktrace: provide ktrstat_error
This eliminates a branch from its consumers trading it for an extra call
if ktrace is enabled for curthread. Given that this is almost never true,
the tradeoff is worth it.
2020-02-03 22:26:00 +00:00
Gleb Smirnoff
0017b2adac A couple of protocol drain routines (frag6_drain and sctp_drain) may send
packets, which is unexpected behaviour for a memory reclamation routine.
Regardless, we need to enter the network epoch to do that.
2020-02-03 20:48:57 +00:00
Warner Losh
a743528537 Fix a stray 'e' from my last commit.
2020-02-03 19:36:24 +00:00
Mark Johnston
36cb95c736 Disable the smallest UMA bucket size on 32-bit platforms.
With r357314, sizeof(struct uma_bucket) grew to 16 bytes on 32-bit
platforms, so BUCKET_SIZE(4) is 0.  This resulted in the creation of a
bucket zone for buckets with zero capacity.  A more general fix is
planned, but for now this bandaid allows 32-bit platforms to boot again.

PR:		243837
Discussed with:	jeff
Reported by:	pho, Jenkins via lwhsu
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2020-02-03 19:29:02 +00:00
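
The arithmetic behind 36cb95c736 can be worked through as below. The formula is an assumption inferred from the commit message (an allocation sized for n pointers, minus the bucket header, divided by the pointer size), not a copy of the UMA macro. The pointer and header sizes are passed as parameters so the 32-bit case can be evaluated on any host, and `bucket_capacity` is a hypothetical name.

```c
#include <stddef.h>

/*
 * Number of item pointers left in an allocation sized for n pointers once
 * the bucket header has been subtracted.  Assumed formula; see the lead-in.
 */
static size_t
bucket_capacity(size_t n, size_t ptr_size, size_t header_size)
{
	size_t total = ptr_size * n;

	if (total < header_size)
		return (0);
	return ((total - header_size) / ptr_size);
}
```

With 4-byte pointers and a 16-byte header, `bucket_capacity(4, 4, 16)` is (16 - 16) / 4 = 0, the zero-capacity bucket the commit describes, while the next size up, `bucket_capacity(8, 4, 16)`, still leaves room for 4 items.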
Kyle Evans
3d62f685d5 namei: preserve errors from fget_cap_locked
Most notably, we want to make sure we don't clobber any capabilities-related
errors. This is a regression from r357412 (O_SEARCH) that was picked up by
the capsicum tests.

PR:		243839
Reviewed by:	kib (committed form recommended by)
Tested by:	lwhsu
Differential Revision:	https://reviews.freebsd.org/D23479
2020-02-03 18:59:07 +00:00
Mark Johnston
a83c682b36 Dynamically select LSE-based atomic(9)s on arm64.
Once all CPUs are online, determine if they all support LSE atomics and
set lse_supported to indicate this.  For now the atomic(9)
implementations are still always inlined, though it would be preferable
to create out-of-line functions to avoid text bloat.  This was not done
here since big.little systems exist in which some CPUs implement LSE
while others do not, and ifunc resolution must occur well before this
scenario can be detected.  It does seem unlikely that FreeBSD will
ever run on such platforms, however, so converting atomic(9) to use
ifuncs is probably a good next step.

Add a LSE_ATOMICS arm64 kernel configuration option to unconditionally
select LSE-based atomic(9) implementations when the target system is
known.

Reviewed by:	andrew, kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation, Amazon (hardware)
Differential Revision:	https://reviews.freebsd.org/D23325
2020-02-03 18:23:50 +00:00
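
The runtime selection described in a83c682b36 can be sketched as an inlined wrapper that tests a flag and calls one of two variants. The sketch uses portable C11 atomics as stand-ins for the arm64 instruction sequences: a compare-and-swap retry loop plays the role of the LL/SC variant and a single fetch-add plays the role of the LSE variant. Only `lse_supported` and the `_llsc` suffix come from the commit messages; the function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static bool lse_supported;	/* set once all CPUs have been probed */

/* Stand-in for the LL/SC variant: a compare-and-swap retry loop. */
static uint32_t
fetchadd_32_llsc(_Atomic uint32_t *p, uint32_t v)
{
	uint32_t old = atomic_load(p);

	/* On failure, "old" is refreshed with the current value and we retry. */
	while (!atomic_compare_exchange_weak(p, &old, old + v))
		;
	return (old);
}

/* Stand-in for the LSE variant: one read-modify-write operation. */
static uint32_t
fetchadd_32_lse(_Atomic uint32_t *p, uint32_t v)
{
	return (atomic_fetch_add(p, v));
}

/* Inlined wrapper that picks a variant at run time. */
static inline uint32_t
fetchadd_32(_Atomic uint32_t *p, uint32_t v)
{
	if (lse_supported)
		return (fetchadd_32_lse(p, v));
	return (fetchadd_32_llsc(p, v));
}
```

Both variants return the previous value, so callers cannot tell which one ran; the branch in the wrapper is the cost of deciding after boot rather than at ifunc resolution time.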
Mark Johnston
920de6a15f Add LSE-based atomic(9) implementations.
These make use of the cas*, ld* and swp instructions added in ARMv8.1.
Testing shows them to be significantly more performant than LL/SC-based
implementations.

No functional change here since the wrappers still unconditionally
select the _llsc variants.

Reviewed by:	andrew, kib
MFC after:	1 month
Submitted by:	Ali Saidi <alisaidi@amazon.com> (original version)
Differential Revision:	https://reviews.freebsd.org/D23324
2020-02-03 18:23:35 +00:00
Mark Johnston
c1fced6800 Add wrappers for arm64 atomics.
Add a _llsc suffix for the existing LL/SC-based implementations and add
trivial wrappers.  This is in preparation for supporting LSE-based
atomic(9) implementations.

No functional change intended.

Reviewed by:	andrew, kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation, Amazon (hardware)
Differential Revision:	https://reviews.freebsd.org/D23323
2020-02-03 18:23:14 +00:00
Mark Johnston
3ad6c736cf Provide a single implementation for each of the arm64 atomic(9) ops.
Parameterize the macros by type width as well as acq/rel semantics.
This makes modifying the implementations much less tedious and
error-prone and makes it easier to support alternate LSE-based
implementations.  No functional change intended.

Reviewed by:	andrew, kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation, Amazon (hardware)
Differential Revision:	https://reviews.freebsd.org/D23322
2020-02-03 18:22:59 +00:00
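
Parameterizing by type width and by acquire/release semantics, as 3ad6c736cf describes, could be sketched with a single macro template like the one below. The GCC/Clang `__atomic_fetch_add` builtin stands in for the inline assembly, and the macro and function names (`DEFINE_ADD`, `sketch_add_*`) are hypothetical rather than the kernel's.

```c
#include <stdint.h>

/*
 * One template generates a fetch-and-add function for a given width and
 * memory ordering.  "name" is the ordering infix and may be empty.
 */
#define	DEFINE_ADD(bits, name, order)					\
static inline uint##bits##_t						\
sketch_add_##name##bits(uint##bits##_t *p, uint##bits##_t v)		\
{									\
	return (__atomic_fetch_add(p, v, order));			\
}

/* Instantiate the relaxed, acquire and release flavours for one width. */
#define	DEFINE_ADD_ALL(bits)						\
	DEFINE_ADD(bits, , __ATOMIC_RELAXED)				\
	DEFINE_ADD(bits, acq_, __ATOMIC_ACQUIRE)			\
	DEFINE_ADD(bits, rel_, __ATOMIC_RELEASE)

DEFINE_ADD_ALL(8)
DEFINE_ADD_ALL(16)
DEFINE_ADD_ALL(32)
DEFINE_ADD_ALL(64)
```

Four instantiation lines yield twelve functions (for example `sketch_add_8`, `sketch_add_acq_32`, `sketch_add_rel_64`), so changing the implementation means editing one template instead of twelve bodies.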
Chuck Silvers
62612737d6 With INVARIANTS, track all softdep dependency structures centrally
so that we can find them in dumps.

Approved by:	mckusick (mentor)
Sponsored by:	Netflix
2020-02-03 17:47:14 +00:00
Warner Losh
58aa35d429 Remove sparc64 kernel support
Remove all sparc64 specific files
Remove all sparc64 ifdefs
Remove indirect sparc64 ifdefs
2020-02-03 17:35:11 +00:00
Alexander Motin
c68c82324f Unblock kstat.zfs.misc.dbufstats sysctls.
It is not broken enough to justify hiding it after we spent time collecting it.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-02-03 17:10:40 +00:00
Mateusz Guzik
bcd1cf4f03 capsicum: faster cap_rights_contains
Instead of doing a 2-iteration loop (determined at runtime), take advantage
of the fact that the size is already known.

While here provide cap_check_inline so that fget_unlocked does not have to
do a function call.

Verified with the capsicum suite /usr/tests.
2020-02-03 17:08:11 +00:00
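
The idea in bcd1cf4f03 of replacing a runtime-bounded loop with a check of a known size can be sketched as follows. The two-word layout is an assumption taken from the "2 iteration loop" wording, and `struct rights` and `rights_contains` are hypothetical names, not the capsicum definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define	NRIGHTS_WORDS	2	/* assumed fixed size, per the commit message */

struct rights {
	uint64_t r[NRIGHTS_WORDS];
};

/* True if every bit set in "little" is also set in "big". */
static inline bool
rights_contains(const struct rights *big, const struct rights *little)
{
	/* Both words are checked explicitly; there is no loop counter to test. */
	return ((big->r[0] & little->r[0]) == little->r[0] &&
	    (big->r[1] & little->r[1]) == little->r[1]);
}
```

With the size fixed at compile time, the function reduces to two mask-and-compare pairs that are easy for the compiler to inline into a caller.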
Mateusz Guzik
2abdae33b1 tmpfs: inline tmpfs_update
It was generated to be just a jumping off point to tmpfs_itimes.

While here provide a dedicated variant for getattr since we normally don't
expect to need the update from that caller.
2020-02-03 17:06:21 +00:00
Andrew Turner
59606417c4 Remove the GICv3 ITS irq and replace it with an ID
In r357324 most of the use of gi_irq was moved to gi_lpi. Complete this
with the last few places we need the IRQ value and create gi_id for the
per-device value we need.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2020-02-03 14:38:19 +00:00
Mateusz Guzik
fee204544e fd: fix f_count acquire in fget_unlocked
The code was using a hand-rolled fcmpset loop, while in other places the same
count is manipulated with the refcount API.

This turned from a stylistic issue into a bug after the API got extended
to support flags. As a result the hand-rolled loop could bump the count high
enough to set the bit flag. Another bump + refcount_release would then free
the file prematurely.

The bug is only present in -CURRENT.
2020-02-03 14:28:31 +00:00
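
The failure mode described in fee204544e can be illustrated with a simplified, non-atomic model. It assumes a hypothetical layout in which the top bit of the word is a flag and the remaining bits are the count; it does not reproduce the refcount(9) implementation, and both function names are made up for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define	REF_FLAG	(UINT32_C(1) << 31)	/* hypothetical flag bit */
#define	REF_COUNT(v)	((v) & ~REF_FLAG)
#define	REF_COUNT_MAX	REF_COUNT(UINT32_MAX)

/* Hand-rolled acquire: treats the whole word as a counter. */
static bool
acquire_handrolled(uint32_t *cnt)
{
	if (*cnt == 0)
		return (false);		/* object is already dead */
	(*cnt)++;			/* can carry into REF_FLAG */
	return (true);
}

/* Flag-aware acquire: refuses to let the count overflow into the flag. */
static bool
acquire_checked(uint32_t *cnt)
{
	if (REF_COUNT(*cnt) == 0 || REF_COUNT(*cnt) == REF_COUNT_MAX)
		return (false);
	(*cnt)++;
	return (true);
}
```

Starting from a count of 0x7fffffff, the hand-rolled version sets the flag bit and leaves a count of zero, which is the state from which one more bump and a release would free the object early. The flag-aware version rejects the acquire and leaves the word untouched.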
Mateusz Guzik
f1fa1ba3d0 Fix up various vnode-related asserts which did not dump the used vnode
2020-02-03 14:25:32 +00:00
Andrew Turner
7877018c2c Use a unique name for the GICv3 ITS vmem
When there are multiple GICv3 ITS devices we don't know which vmem is for
which device. Use device_get_nameunit to get a per-device name.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2020-02-03 13:50:55 +00:00
Andrew Turner
a58fc7cb88 Disable the use of the quantum cache in the GICv3 ITS
This uses UMA to allocate space. It causes issues when there are multiple
ITS devices in the system where interrupts are not allocated from a low
address on some interrupt controllers. Disabling the quantum cache fixes
this on the Neoverse N1 SDP.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2020-02-03 13:47:41 +00:00
Warner Losh
d9e9979c02 On powerpc, we use ofw_syscons for device sc. That references the default
fonts. As a workaround, remove the static. vt is default on powerpc, but there
are a few old Macs that still fail with vt. sc is used as a workaround for those
machines, and the kernel fails to build w/o it.
2020-02-03 05:38:45 +00:00
Conrad Meyer
8e6b06be14 netinet/libalias: Fix typo in debug message
No functional change.

PR:		243831
Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Differential Revision:	https://reviews.freebsd.org/D23365
2020-02-03 05:19:44 +00:00
Pedro F. Giffuni
2a1481fbbf typo: Registration.
Pointed out by:	Dikshie Fauzie
2020-02-03 02:02:13 +00:00
Pedro F. Giffuni
ad2b6d4e9b ethernet: Minor cleanup.
Consistently use uppercase for ethertype hex numbers.
2020-02-03 01:08:15 +00:00
Ed Maste
2927ab0397 acpi_ibm: remove superfluous cast
Reported by:	kib
2020-02-02 20:56:18 +00:00
Pedro F. Giffuni
b33c19776b style(9): Fix spaces after #define.
No functional change.
2020-02-02 19:02:07 +00:00
Ed Maste
66671c1428 acpi_ibm: whitespace and wrapping cleanup
2020-02-02 19:01:16 +00:00
Pedro F. Giffuni
682397c263 ethernet: add some more Ethertypes.
Sort ETHERTYPE_FCOE, from r357414.
2020-02-02 18:33:20 +00:00
Pedro F. Giffuni
badbcf06e0 ethernet: add some more Ethertypes.
Add some types based on other BSDs and also add EtherCat and PROFINET, which
are IEC standards.

There is a public list (CSV format) at:
	https://standards.ieee.org/products-services/regauth/

MFC after:	2 weeks
2020-02-02 18:27:37 +00:00
Ed Maste
4382f0f7a9 acpi_ibm: whitespace fixup
2020-02-02 18:07:47 +00:00
Kyle Evans
6a5abb1ee5 Provide O_SEARCH
O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping
permissions checks on the directory itself after the initial open(). This is
close to the semantics we've historically applied for O_EXEC on a directory,
which is UB according to POSIX. Conveniently, O_SEARCH on a file is also
explicitly undefined behavior according to POSIX, so O_EXEC would be a fine
choice. The spec goes on to state that O_SEARCH and O_EXEC need not be
distinct values, but they're not defined to be the same value.

This was pointed out as an incompatibility with other systems that had made
its way into libarchive, which had assumed that O_EXEC was an alias for
O_SEARCH.

This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC
respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a
directory is checked in vn_open_vnode already, so for completeness we add a
NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not
re-check that when descending in namei.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23247
2020-02-02 16:34:57 +00:00
Kyle Evans
c887ac8324 zfs: light refactor to indicate cachedlookup in zfs_lookup
If we come from VOP_CACHEDLOOKUP, we must skip the VEXEC check as it will
have been done in the caller (vfs_cache_lookup). This is a part of D23247,
which may skip the earlier VEXEC check as well if the root fd was opened
with O_SEARCH.

This one required slightly more work as zfs_lookup may also be called
indirectly as VOP_LOOKUP or a couple of other places where we must do the
check.
2020-02-02 16:10:33 +00:00
Kyle Evans
bd11e674ec pseudofs: don't do VEXEC check in VOP_CACHEDLOOKUP
VOP_CACHEDLOOKUP should assume that the appropriate VEXEC check has been
done in the caller (vfs_cache_lookup), so it does not belong here.
2020-02-02 15:36:12 +00:00
Ed Maste
43c2dac0e5 Move ce enable to SOURCELESS_HOST
ce contains obfuscated code that runs on the host's processor
2020-02-02 14:41:09 +00:00
Mateusz Guzik
2568d5bb79 fd: sprinkle some predicts around fget
clang inlines fget -> _fget into kern_fstat and eliminates several checks,
but prior to this change it would assume fget_unlocked was likely to fail
and consequently avoidable jumps got generated.
2020-02-02 09:38:40 +00:00