138197 commits

Author SHA1 Message Date
Kristof Provost
231e83d342 pf: syncookie ioctl interface
Kernel side implementation to allow switching between on and off modes,
and allow this configuration to be retrieved.

MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31139
2021-07-20 10:36:13 +02:00
Kristof Provost
8e1864ed07 pf: syncookie support
Import OpenBSD's syncookie support for pf. This feature helps pf resist
TCP SYN floods by only creating states once the remote host completes
the TCP handshake rather than when the initial SYN packet is received.

This is accomplished by using the initial sequence numbers to encode a
cookie (hence the name) in the SYN+ACK response and verifying this on
receipt of the client ACK.
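
The cookie round trip described above can be sketched as follows. This is a minimal illustration, not pf's or OpenBSD's code: struct conn_tuple, cookie_hash(), syncookie_generate() and syncookie_validate() are hypothetical names, and the FNV-1a style mix is only a stand-in for the keyed cryptographic hash a real implementation would need.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical connection identifier; not pf's real structures. */
struct conn_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

/*
 * Toy keyed mix (FNV-1a style) over the tuple, the client's initial
 * sequence number and a secret. Assumption: a real implementation
 * would use a cryptographic keyed hash instead.
 */
static uint32_t
cookie_hash(const struct conn_tuple *t, uint32_t client_isn, uint32_t secret)
{
	uint32_t words[5] = {
		t->src_ip, t->dst_ip,
		((uint32_t)t->src_port << 16) | t->dst_port,
		client_isn, secret
	};
	uint32_t h = 2166136261u;

	for (int i = 0; i < 5; i++) {
		for (int b = 0; b < 4; b++) {
			h ^= (words[i] >> (8 * b)) & 0xff;
			h *= 16777619u;
		}
	}
	return (h);
}

/* Sequence number to send in the SYN+ACK; nothing is stored per connection. */
static uint32_t
syncookie_generate(const struct conn_tuple *t, uint32_t client_isn,
    uint32_t secret)
{
	return (cookie_hash(t, client_isn, secret));
}

/*
 * On the client's ACK: seq is client_isn + 1 and ack must be cookie + 1.
 * Recompute the cookie from the packet alone and compare.
 */
static bool
syncookie_validate(const struct conn_tuple *t, uint32_t seq, uint32_t ack,
    uint32_t secret)
{
	uint32_t client_isn = seq - 1;

	return (ack - 1 == cookie_hash(t, client_isn, secret));
}
```

Because the check recomputes the cookie from fields carried in the ACK, state only needs to be created once validation succeeds.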

Reviewed by:	kbowling
Obtained from:	OpenBSD
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31138
2021-07-20 10:36:13 +02:00
Kristof Provost
ee9c3d3803 pf: factor out pf_synproxy()
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31137
2021-07-20 10:36:13 +02:00
Navdeep Parhar
76c8902296 cxgbe(4): Initialize abs_id for ctrl and ofld queues.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-07-20 00:54:13 -07:00
Kevin Bowling
9fd0cda92d e1000: Add missing branch prediction
I missed this edit from the ixgbe review (D30074)

Reviewed by:	gallatin
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30073
2021-07-20 00:21:21 -07:00
Kevin Bowling
41f0225714 e1000: Clean up igb_txrx
The intention here is to reduce differences between em, igb, igc, ixgbe.

The main functional change is logical simplification in igb_rx_checksum
and getting interface caps from scctx instead of the ifp.

Reviewed by:	gallatin, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30073
2021-07-20 00:11:30 -07:00
Dmitry Chagin
2b38186330 Drop rdivacky@ "All rights reserved" from linux_event.
I got explicit permission from Roman.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30913
MFC after:		2 weeks
2021-07-20 10:06:16 +03:00
Dmitry Chagin
1ca6b15bbd Drop "All rights reserved" from my copyright statements.
Add email and fixup years while here.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30912
MFC after:		2 weeks
2021-07-20 10:05:50 +03:00
Dmitry Chagin
ae8330b448 linux(4): Add arch name to some printfs.
Reviewed by:		emaste
Differential revision:	https://reviews.freebsd.org/D30904
MFC after:		2 weeks
2021-07-20 10:05:08 +03:00
Dmitry Chagin
fe7409530c linprocfs: Fixup vDSO name in the procmaps after 9931033bbf.
Since sv_shared_page_base now points to the native sharedpage and
the process VA layout has changed as follows:
VDSOPAGE	(2 * PAGE_SIZE)
SHAREDPAGE	(PAGE_SIZE)
USRSTACK
fix up the vDSO name by calculating the start of the page relative to
the native sharedpage.

Differential revision:	https://reviews.freebsd.org/D30903
MFC after:		2 weeks
2021-07-20 10:04:20 +03:00
Dmitry Chagin
09cffde975 linux(4): Fixup the vDSO initialization order.
The vDSO initialisation order should be as follows:
- native abi init via exec_sysvec_init();
- vDSO symbols queued to the linux_vdso_syms list;
- linux_vdso_install();
- linux_exec_sysvec_init();

As exec_sysvec_init() is called with SI_ORDER_ANY (last) at SI_SUB_EXEC
order, move linux_vdso_install() and linux_exec_sysvec_init() to the
SI_SUB_EXEC+1 order.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D30902
MFC after:		2 weeks
2021-07-20 10:02:34 +03:00
Dmitry Chagin
a543556c81 linux(4): Constify vdso install/deinstall.
In order to reduce diff between arches constify vdso install/deinstall
functions like arm64.

Reviewed by:		emaste
Differential revision:	https://reviews.freebsd.org/D30901
MFC after:		2 weeks
2021-07-20 10:01:47 +03:00
Dmitry Chagin
9931033bbf linux(4): Almost complete the vDSO.
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, and has no process-private data.

The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid context-switch overhead, frequently used system calls can be
  implemented in the vDSO: for now gettimeofday, clock_gettime.

The first two have been implemented, so add the implementation of system
calls.

The system call implementation is based on the native timekeeping code,
with some limitations:
- ifunc can't be used, as the vDSO is mapped r/o into the process VA and
  rtld can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).

In case of any error, vDSO system calls fall back to the kernel system
calls. For unimplemented vDSO system calls, prototypes were added which
call the corresponding kernel system call.

Tested by:		trasz (arm64)
Differential revision:  https://reviews.freebsd.org/D30900
MFC after:              2 weeks
2021-07-20 10:01:18 +03:00
Dmitry Chagin
5fd9cd53d2 linux(4): Modify sv_onexec hook to return an error.
Temporarily add stubs to the Linux emulation layer which call the existing hook.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D30911
MFC after:		2 weeks
2021-07-20 09:56:25 +03:00
Dmitry Chagin
62ba4cd340 Call sv_onexec hook after the process VA is created.
For future use in the Linux emulation layer, call the sv_onexec hook right after
the new process address space is created. It's safe, as sv_onexec is used only
by the Linux ABI and linux_on_exec() does not depend on the state of the process VA.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D30899
MFC after:		2 weeks
2021-07-20 09:55:14 +03:00
Dmitry Chagin
b39fa4770d Remove bogus cast from exec_sysvec_init().
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D30910
MFC after:		2 weeks
2021-07-20 09:54:09 +03:00
Dmitry Chagin
21629e2a45 Modify exec_sysvec_init() to allow non-native abi to setup their sysentvecs.
For future use in the Linux emulation layer modify the exec_sysvec_init()
to allow non-native abi to fill sv_timekeep_base and sv_shared_page_obj.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D30898
MFC after:		2 weeks
2021-07-20 09:53:21 +03:00
Dmitry Chagin
815165be20 linux(4): Remove function prototypes from the vDSO.
In preparation for the vDSO code revision, get rid of incomplete vDSO methods
from locore, but leave the .note.Linux section commented out.
The .note.Linux section is used by the glibc rtld to get the kernel version,
which saves one system call. I'll try to implement it later, if I figure out
how to use it with jails.

MFC after:	2 weeks
2021-07-20 09:52:08 +03:00
Jessica Clarke
f221000127 elf: Remove R_RISCV_[GT]PREL_[IS] relocation defines
These were internal binutils relocations that have no way to be
generated in assembly nor will ever be seen in the output, and so should
never have been defined in the psABI in the first place. They have
therefore been removed from the spec as of [1], so do so here too.

[1] 44f98e0fd8
2021-07-20 06:13:43 +01:00
Rick Macklem
7685f8344d nfscl: Send stateid.seqid of 0 for NFSv4.1/4.2 mounts
For NFSv4.1/4.2, the client may set the "seqid" field of the
stateid to 0 in RPC requests.  This indicates to the server that
it should not check the "seqid" or return NFSERR_OLDSTATEID if the
"seqid" value is not up to date w.r.t. Open/Lock operations
on the stateid.  This "seqid" is incremented by the NFSv4 server
for each Open/OpenDowngrade/Lock/Locku operation done on the stateid.

Since a failure return of NFSERR_OLDSTATEID is of no use to
the client for I/O operations, it makes sense to set "seqid"
to 0 for the stateid argument for I/O operations.
This avoids server failure replies of NFSERR_OLDSTATEID,
although I am not aware of any case where this failure occurs.
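
As a rough sketch of the rule described above (not the nfscl code itself): io_stateid() and its minorvers parameter are hypothetical names. The struct follows the protocol's stateid4 layout of a 32-bit seqid plus a 12-byte opaque part.

```c
#include <stdint.h>

#define	NFS4_OTHER_SIZE	12	/* size of the opaque part of a stateid4 */

/* Mirrors the protocol's stateid4 layout; not the client's internal type. */
struct stateid4 {
	uint32_t	seqid;
	uint8_t		other[NFS4_OTHER_SIZE];
};

/*
 * Hypothetical helper: build the stateid argument for an I/O operation.
 * For minor version 1 and later, seqid is sent as 0 so the server does
 * not check it; for NFSv4.0 the current seqid is passed through.
 */
static struct stateid4
io_stateid(const struct stateid4 *current, int minorvers)
{
	struct stateid4 out = *current;

	if (minorvers >= 1)
		out.seqid = 0;
	return (out);
}
```

Only the seqid changes; the opaque part that identifies the state is copied untouched.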

This makes the FreeBSD NFSv4.1/4.2 client compatible with the
Linux NFSv4.1/4.2 client.

MFC after:	2 weeks
2021-07-19 17:35:39 -07:00
John Baldwin
b5e73dd952 cxgbei: Don't assert F for data completion PDUs.
If a data PDU encounters an error such as a digest error, the firmware
will report that data PDU when completion moderation is active even if
it is not the final data PDU in a burst.

Sponsored by:	Chelsio Communications
2021-07-19 15:36:31 -07:00
John Baldwin
4a7d15ebb6 cxgbei: Remove invalid assertion.
A non-placed PDU can be delivered by CPL_RX_ISCSI_CMP in the middle of
a burst of placed PDUs (received via DDP) in which case the rcv_nxt
will not match the start of the non-placed PDU.

Reported by:	Jithesh Arakkan @ Chelsio
Sponsored by:	Chelsio Communications
2021-07-19 15:36:31 -07:00
Michael Tuexen
a730d82378 tcp: fix RACK and BBR when using VIMAGE enabled kernel
Fix a bug in VNET handling, which occurs when using specific NICs.
PR:			257195
Reviewed by:		rrs
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D31212
2021-07-20 00:29:18 +02:00
Ed Maste
79e6eb5c01 zfs: Remove zfs-images submodule
This can cause issues like 'No url found for submodule path' in
downstream or derived projects making use of submodules.

Reviewed by:	imp
2021-07-19 16:40:09 -04:00
Jessica Clarke
439097486b acpi: Fix a repeated comment typo
2021-07-19 17:19:23 +01:00
Jessica Clarke
52fba9a943 acpi: Fix a repeated vm_offset_t that should be a vm_size_t
The underlying types for both are the same so arguably this doesn't
really matter, but using the wrong type is still confusing and
technically incorrect.
2021-07-19 17:19:23 +01:00
Emmanuel Vadot
857ad3e4ff arm64: std.allwinner: Add aw_syscon
This was missed during the conversion of kernel configs
PR:		257278
Reported by:	 Manuel Stühn <freebsd@justmail.de>
2021-07-19 17:31:57 +02:00
Mateusz Guzik
9009d36afd pf: shrink struct pf_kstate
Makes room for a pointer.

Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-19 14:54:49 +02:00
Mateusz Guzik
f9aa757d8d pf: add a comment to pf_kstate concerning compat with pf_state_cmp
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-19 14:54:49 +02:00
Mateusz Guzik
144ec0713d pf: add a branch prediction to expire state check in pf_find_state
Reviewed by:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-19 14:54:49 +02:00
Mateusz Guzik
78e3a16861 arm: dedup counter(9) address calculation
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-19 10:46:25 +00:00
Mateusz Guzik
6799c73d1e arm: retire bcopy
It is obsolete since ba96f37758 ("Use __builtin for various mem*
and b* (e.g. bzero) routines.")

Discussed with:	cognet
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-19 09:53:02 +00:00
Mateusz Guzik
9ef5b65085 arm: bcmp -> memcmp
The bcmp symbol is not used, while memcmp as pulled from libkern does
a byte-by-byte comparison.

As it happens, bcmp as found in support.S is in fact a renamed memcmp, so
rename it back.

Discussed with:	cognet
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-07-19 09:52:23 +00:00
Kyle Evans
db0f264393 kenv: allow listing of static kernel environments
The early environment is typically cleared, so these new options
need the PRESERVE_EARLY_KENV kernel config(8) option. These environments
are reported as missing by kenv(1) if the option is not present in the
running kernel.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D30835
2021-07-18 23:06:19 -05:00
Kyle Evans
7a129c973b kern: add an option for preserving the early kenv
Some downstream configurations do not store secrets in the
early (loader/static) environments and desire a way to preserve these
for diagnostic reasons.  Provide an option to do so.

Reviewed by:	imp, jhb (earlier version)
Differential Revision:	https://reviews.freebsd.org/D30834
2021-07-18 23:05:48 -05:00
Emmanuel Vadot
d72d5ced80 arm64: std.allwinner: Fix mismerge
Re-add aw_r_intc and remove duplicate a10_codec
2021-07-18 18:35:47 +02:00
Emmanuel Vadot
0f2c633164 arm64: Add per SoC family kernel config
There are multiple reasons for this:
- This makes it easier to see which driver is needed for each SoC
- This makes it easier to create a custom config for one SoC
- This really reduces boot time (which some people might want)

Some explanation of the files:
- std.arm64 contains all standard kernel options
- std.dev contains all the standard kernel devices
- std.<soc> contains all drivers needed to boot on this SoC family
- <SOC> includes std.arm64, std.dev and std.<soc>
- GENERIC includes std.arm64, std.dev and all std.<soc>

Sponsored by:	Diablotin Systems
MFC After:	2 months
Reviewed by:	mmel, cognet, imp
Differential Revision:	      https://reviews.freebsd.org/D30474
2021-07-18 16:11:08 +02:00
Oskar Holmlund
8cdb4491c9 arm: TI AM335x fix gpio_pin numbers in lookup table.
gpio_pin values are calculated as [GPIO_BANK]*32 + GPIO_PIN.
The gpio_pin values are wrong for these pins.
As a consequence, the wrong pins are acquired and used.
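
The formula can be illustrated with a small helper. This is only a sketch: gpio_pin_number() and PINS_PER_BANK are hypothetical names, not identifiers from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define	PINS_PER_BANK	32	/* multiplier from the formula above */

/* Hypothetical helper: flat gpio_pin number from a bank and a pin within it. */
static uint32_t
gpio_pin_number(uint32_t bank, uint32_t pin)
{
	assert(pin < PINS_PER_BANK);
	return (bank * PINS_PER_BANK + pin);
}
```

For example, pin 17 in bank 1 maps to 1*32 + 17 = 49, so a table entry that lists the wrong bank is off by a multiple of 32.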

Approved by: manu (mentor)
Reported by: Martin Zakardissnehf
(martin.zakardissnehf@se.abb.com)
Differential revision: https://reviews.freebsd.org/D31164
2021-07-18 13:06:26 +02:00
Kevin Bowling
51e46835e1 ixgbe: Clean up ix_txrx
The intention here is to reduce differences with D30072.
The only functional change is logical simplification in
ixgbe_rx_checksum.

Reviewed by:	gallatin
Differential Revision:	https://reviews.freebsd.org/D30074
2021-07-17 23:25:56 -07:00
Neel Chauhan
76fffd0a86 vmd_bus: Fix typo in comment
Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D31210
2021-07-17 18:03:39 -07:00
Konstantin Belousov
dbaad75f28 zfs: add missed dependency of zfs module on zlib
Reviewed by:	mm
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31207
2021-07-18 01:44:22 +03:00
Warner Losh
abea0c6b0d cam: Mark the qos data as valid in xpd_done_direct() too.
Sponsored by:		Netflix
2021-07-17 16:12:00 -06:00
Kristof Provost
2c0d115bbc pf: locally originating connections with 'route-to' fail
Similar to the REPLY_TO shortcut (6d786845cf) we also can't shortcut
ROUTE_TO. If we do we will fail to apply transformations or update the
state, which can lead to premature termination of the connections.

PR:		257106
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31177
2021-07-17 14:28:07 +02:00
Kristof Provost
295f2d939d pf: Remove unused arguments from pf_send_tcp()
struct mbuf *replyto is not actually used (and only rarely provided).
The same applies to struct ifnet *ifp.

No functional change.

Reviewed by:	mjg
MFC after:	1 week
Sponsored by:   Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D31136
2021-07-17 15:18:15 +02:00
Kristof Provost
ef950daa35 pf: match keyword support
Support the 'match' keyword.
Note that support is limited to adding queuing information, so without
ALTQ support in the kernel setting match rules is pointless.

For the avoidance of doubt: this is NOT full support for the match
keyword as found in OpenBSD's pf. That could potentially be built on top
of this, but this commit is NOT that.

MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D31115
2021-07-17 12:01:08 +02:00
Rick Macklem
fad3f322ef param.h: Bump __FreeBSD_version to 1400026 for commit ee29e6f311
Commit ee29e6f311 changed the internal KAPI between the nfscommon
and nfsd modules.  Bump __FreeBSD_version to 1400026 since both
modules will need to be rebuilt from sources.
2021-07-16 15:13:27 -07:00
Rick Macklem
ee29e6f311 nfsd: Add sysctl to set maximum I/O size up to 1Mbyte
Since MAXPHYS now allows the FreeBSD NFS client
to do 1Mbyte I/O operations, add a sysctl called vfs.nfsd.srvmaxio
so that the maximum NFS server I/O size can be set up to 1Mbyte.
The Linux NFS client can also do 1Mbyte I/O operations.

The default of 128Kbytes for the maximum I/O size has
not been changed for two reasons:
- kern.ipc.maxsockbuf must be increased to support 1Mbyte I/O
- The limited benchmarking I can do actually shows a drop in I/O rate
  when the I/O size is above 256Kbytes.
However, daveb@spectralogic.com reports seeing an increase
in I/O rate for the 1Mbyte I/O size vs 128Kbytes using a Linux client.

Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D30826
2021-07-16 15:01:03 -07:00
Randall Stewart
db4d2d7222 tcp: When rack or bbr get a pullup failure in the common code, don't free the NULL mbuf.
There is a bug in the error path where rack_bbr_common does a m_pullup() and the pullup fails.
There is a stray mfree(m) after m is set to NULL. This is not a good idea :-)

Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31194
2021-07-16 13:59:57 -04:00
David Chisnall
cf98bc28d3 Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.
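
A handler that picks the number out of `si_value` might look like the sketch below. It makes several assumptions: the number is read from the `sival_int` member, the helper names are hypothetical, and simulate_denied() merely queues a signal to the current process with sigqueue() so the handler can be exercised without Capsicum. It does not reproduce the kernel's delivery path.

```c
#define	_POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Last number seen by the handler; sig_atomic_t is safe to write from it. */
static volatile sig_atomic_t last_denied_syscall = -1;

/*
 * SA_SIGINFO handler: record the value carried in si_value.
 * Assumption: the syscall number is in the sival_int member.
 */
static void
on_denied(int signo, siginfo_t *info, void *ucontext)
{
	(void)signo;
	(void)ucontext;
	last_denied_syscall = info->si_value.sival_int;
}

/* Install the handler for the given signal; returns 0 on success. */
static int
install_denied_handler(int signo)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_denied;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return (sigaction(signo, &sa, NULL));
}

/* Simulation only: queue a signal to ourselves carrying a made-up number. */
static int
simulate_denied(int signo, int syscall_no)
{
	union sigval v;

	v.sival_int = syscall_no;
	return (sigqueue(getpid(), signo, v));
}
```

A real handler would use the recorded number to decide which request to forward to the more-privileged process.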

This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-16 18:06:44 +01:00
Andrew Turner
ae47eecf87 Hide arm64 features that don't have a HWCAP
We should only export MSR fields if there is also a HWCAP, so that it doesn't
matter which one software uses.

Sponsored by:	The FreeBSD Foundation
2021-07-15 23:56:47 +00:00