Both modules provide many symbols used by various DTrace provider
modules, so just export everything.
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 8a693ccf86)
Previously, a page fault taken during copyin/out and related functions
would run the entire fault handler while permitting direct access to
user addresses. This could also leak across context switches (e.g. if
the page fault handler was preempted by an interrupt or slept for disk
I/O).
To fix, clear SUM in assembly after saving the original version of
SSTATUS in the supervisor mode trapframe.
Reviewed by: mhorne, jrtc27
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D29763
(cherry picked from commit 753bcca440)
These macros are not backend-specific but reference a
backend-independent field in struct icl_conn.
Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32858
(cherry picked from commit e900338c09)
Don't pass the same name to multiple mutexes while using unique types
for WITNESS. Just use the unique types as the mutex names.
Reviewed by: markj
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32740
(cherry picked from commit 4e057806cf)
The starting sequence number used to verify that TLS 1.0 CBC records
are encrypted in-order in the OCF layer was always set to 0 and not to
the initial sequence number from the struct tls_enable.
In practice, OpenSSL always starts TLS transmit offload with a
sequence number of zero, so this only matters for tests that use a
random starting sequence number.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D32676
(cherry picked from commit 4827bf76bc)
TLS 1.0 records are encrypted as one continuous CBC chain where the
last block of the previous record is used as the IV for the next
record. As a result, TLS 1.0 records cannot be encrypted out of order
but must be encrypted as a FIFO.
If the later pages of a sendfile(2) request complete before the first
pages, then TLS records can be encrypted out of order. For TLS 1.1
and later this is fine, but this can break for TLS 1.0.
To cope, add a queue in each TLS session to hold TLS records that
contain valid unencrypted data but are waiting for an earlier TLS
record to be encrypted first.
- In ktls_enqueue(), check if a TLS record being queued is the next
record expected for a TLS 1.0 session. If not, it is placed in
sorted order in the pending_records queue in the TLS session.
If it is the next expected record, queue it for SW encryption like
normal. In addition, check if this new record (really a potential
batch of records) was holding up any previously queued records in
the pending_records queue. Any of those records that are now in
order are also placed on the queue for SW encryption.
- In ktls_destroy(), free any TLS records on the pending_records
queue. These mbufs are marked M_NOTREADY, so they were not freed when the
socket buffer was purged in sbdestroy(). Instead, they must be
freed explicitly.
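The ordering logic described above can be sketched in user-space C as
follows. This is a minimal illustration, not the kernel code: struct
record, struct session and session_enqueue() are hypothetical names,
each record carries a single sequence number (the real code handles
batches of records), and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct record {
	uint64_t seqno;
	struct record *next;
};

struct session {
	uint64_t next_seqno;	/* next record allowed to be encrypted */
	struct record *pending;	/* parked records, sorted by seqno */
};

/*
 * Queue one record.  Returns a list (linked through 'next') of the
 * records that may now be encrypted in order, or NULL if the record
 * had to be parked on the pending list.
 */
static struct record *
session_enqueue(struct session *s, struct record *r)
{
	struct record **pp, *head, *tail;

	assert(s != NULL && r != NULL);
	if (r->seqno != s->next_seqno) {
		/* Out of order: insert into the sorted pending list. */
		for (pp = &s->pending;
		    *pp != NULL && (*pp)->seqno < r->seqno;
		    pp = &(*pp)->next)
			;
		r->next = *pp;
		*pp = r;
		return (NULL);
	}

	/* In order: release it plus any parked records it unblocks. */
	head = tail = r;
	r->next = NULL;
	s->next_seqno++;
	while (s->pending != NULL && s->pending->seqno == s->next_seqno) {
		tail->next = s->pending;
		tail = s->pending;
		s->pending = tail->next;
		tail->next = NULL;
		s->next_seqno++;
	}
	return (head);
}
```

In this sketch, submitting records 2, 1 and then 0 parks the first two
and releases all three, in order, once record 0 arrives.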
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D32381
(cherry picked from commit 9f03d2c001)
According to 11.4.8 in RFC 7143, ExpDataSN MUST be 0 if the response
code is not Command Completed, but we were requiring it to always be
the count of DataIn PDUs regardless of the response code.
In addition, at least one target (OCI Oracle iSCSI block device)
returns an ExpDataSN of 0 when returning a valid completion with an
error status (Check Condition) in response to a SCSI Inquiry. As a
workaround for this target, only warn without resetting the connection
for a 0 ExpDataSN for responses with a non-zero error status.
PR: 259152
Reported by: dch
Reviewed by: dch, mav, emaste
Fixes: 4f0f5bf995 iscsi: Validate DataSN values in Data-In PDUs in the initiator.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32650
(cherry picked from commit cdbc4a074b)
As is done in the target, require that DataSN values are consecutive
and in-order. If an out of order Data-In PDU is received, force a
session reconnect. In addition, when a SCSI Response PDU is received,
verify that the ExpDataSN field matches the count of Data-In PDUs
received for this command. If not, force a session reconnect.
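A minimal sketch of the two checks, assuming a per-command counter of
accepted Data-In PDUs. The names cmd_state, datain_accept() and
response_accept() are hypothetical and do not match the initiator's
real structures; a false return stands for "force a session reconnect".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-command state; not the initiator's real structure. */
struct cmd_state {
	uint32_t datasn;	/* count of Data-In PDUs accepted so far */
};

/* Accept a Data-In PDU only if its DataSN is the next expected value. */
static bool
datain_accept(struct cmd_state *cs, uint32_t pdu_datasn)
{
	assert(cs != NULL);
	if (pdu_datasn != cs->datasn)
		return (false);
	cs->datasn++;
	return (true);
}

/* Accept a SCSI Response only if ExpDataSN equals the Data-In count. */
static bool
response_accept(const struct cmd_state *cs, uint32_t expdatasn)
{
	assert(cs != NULL);
	return (expdatasn == cs->datasn);
}
```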
Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31594
(cherry picked from commit 4f0f5bf995)
This is not a functional change as the Poly1305 hash is the same
length as the GMAC hash length.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30137
(cherry picked from commit 4a92afae7f)
I missed updating this counter when rebasing the changes in
9c64fc4029 after the switch to
COUNTER_U64_DEFINE_EARLY in 1755b2b989.
Fixes: 9c64fc4029 Add Chacha20-Poly1305 as a KTLS cipher suite.
Sponsored by: Netflix
(cherry picked from commit 90972f0402)
This supports Chacha20-Poly1305 for both send and receive for TLS 1.2
and for send in TLS 1.3.
Reviewed by: gallatin
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27841
(cherry picked from commit 4dd6800e22)
Chacha20-Poly1305 for TLS is an AEAD cipher suite for both TLS 1.2 and
TLS 1.3 (RFCs 7905 and 8446). For both versions, Chacha20 uses the
server and client IVs as implicit nonces xored with the record
sequence number to generate the per-record nonce matching the
construction used with AES-GCM for TLS 1.3.
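As an illustration of that construction, the sketch below builds a
12-byte per-record nonce by XORing the big-endian 64-bit sequence
number into the low-order bytes of the implicit IV. The function name
tls_record_nonce() is hypothetical and this is not the KTLS code
itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SKETCH_NONCE_LEN	12

/*
 * Build the per-record nonce: the sequence number is treated as a
 * big-endian value left-padded with zeros to the IV length and then
 * XORed with the implicit IV.
 */
static void
tls_record_nonce(const uint8_t iv[SKETCH_NONCE_LEN], uint64_t seqno,
    uint8_t nonce[SKETCH_NONCE_LEN])
{
	int i;

	assert(iv != NULL && nonce != NULL);
	memcpy(nonce, iv, SKETCH_NONCE_LEN);
	for (i = 0; i < 8; i++)
		nonce[SKETCH_NONCE_LEN - 1 - i] ^= (uint8_t)(seqno >> (8 * i));
}
```

The first four bytes of the IV pass through unchanged because the
sequence number only covers the last eight bytes.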
Reviewed by: gallatin
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27839
(cherry picked from commit 9c64fc4029)
Previously the body of ktls_tick was a nop when NIC TLS was disabled,
but the callout was still scheduled, consuming power on otherwise-idle
systems with Chelsio T6 adapters. Now the callout only runs while NIC
TLS is enabled on at least one interface of an adapter.
Reported by: mav
Reviewed by: np, mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32491
(cherry picked from commit ef3f98ae47)
Create the initial pool of kprocs on demand when the first socket AIO
request is submitted instead. The pool of kprocs used for other AIO
requests is similarly created on first use.
Reviewed by: asomers
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32468
(cherry picked from commit d1b6fef075)
Rework if_epair(4) to no longer use netisr and dpcpu.
Instead use mbufq and swi_net.
This simplifies the code and seems to make it work better and
no longer hang.
Work largely by bz@, with minor tweaks by kp@.
Reviewed by: bz, kp
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D31077
(cherry picked from commit 3dd5760aa5)
This is how most SYSINITs are defined. Also annotate the dummy
parameter with __unused. No functional change intended.
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 2287ced2f5)
This reverts commit 9ef7df022a ("hyperv: Register hyperv_timecounter
later during boot") and adds a comment explaining why the timecounter
needs to be registered as early as it is.
PR: 259878
Fixes: 9ef7df022a ("hyperv: Register hyperv_timecounter later during boot")
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit ed6a9452be)
Hyper-V wants to register its MSR-based timecounter during
SI_SUB_HYPERVISOR, before SI_SUB_LOCK, since an emulated 8254 may not be
available for DELAY(). So we cannot use MTX_SYSINIT to initialize the
timecounter lock.
PR: 259878
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 3339950117)
Consider nodes 1 and 2, where node 1 has an advskew of 0 and node 2 has
an advskew of 100, making them master and backup respectively.
If net.inet.carp.demotion is set to a negative value on node 1, node 2
might become master while node 1 still retains its master status. Whether
or not node 2 becomes master seems to depend on the nodes' advskew and
what the demotion sysctl was set to on node 1.
The reason for node 2 becoming master seems to be that the calculated
advskew, taking demotion into account, is truncated to a single unsigned
byte when copied into the carp header for sending, while node 1 stays
master since it uses the whole non-truncated calculated advskew
when deciding whether to stay master.
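The truncation can be reproduced with plain integer arithmetic. The
sketch below assumes the effective skew is the configured advskew plus
the demotion value, capped at an upper bound. SKETCH_MAXSKEW and all
function names are hypothetical. The clamp in clamped_advskew() is only
one way to keep the local and on-wire values consistent and is not
claimed to be the committed fix.

```c
#include <assert.h>
#include <stdint.h>

#define	SKETCH_MAXSKEW	240	/* assumed upper bound, for illustration */

/* Effective skew as used locally: may go negative with demotion < 0. */
static int
effective_advskew(int advskew, int demotion)
{
	int v = advskew + demotion;

	return (v > SKETCH_MAXSKEW ? SKETCH_MAXSKEW : v);
}

/* What a one-byte header field ends up carrying. */
static uint8_t
wire_advskew(int effective)
{
	return ((uint8_t)effective);
}

/* Illustrative clamp so both views agree (an assumption, see above). */
static int
clamped_advskew(int advskew, int demotion)
{
	int v = advskew + demotion;

	if (v < 0)
		return (0);
	if (v > SKETCH_MAXSKEW)
		return (SKETCH_MAXSKEW);
	return (v);
}
```

With an advskew of 0 and a demotion of -10, node 1 locally sees -10 but
advertises 246, which is worse than node 2's 100, so node 2 takes over
while node 1 still considers itself the best candidate.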
PR: 259528
Reviewed by: donner, glebius
MFC after: 3 weeks
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D32759
(cherry picked from commit 1019354b54)
Add bcd2bin() in bcd.h.
Libkern does provide a bcd2bin(), which cannot be used directly, leaving
us with a conflict (see comment in file).
Rather than having code to re-define bcd2bin() for the LinuxKPI,
make sure libkern.h is always included before the LinuxKPI version.
Then only re-define our local LinuxKPI implementation. [1]
From the argument-truncating wrapper, call the libkern version.
If we change our libkern implementation in the future we can save
ourselves the remainder of the hassle. [2]
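The wrapper arrangement might look roughly like the sketch below. Both
function names are hypothetical stand-ins, and the int-taking helper
only approximates the shape of a libkern-style routine. The point is
that the uint8_t parameter truncates the argument before delegating.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a libkern-style helper taking an int (assumed shape). */
static int
sketch_libkern_bcd2bin(int bcd)
{
	/* Each nibble must be a decimal digit. */
	assert(bcd >= 0 && (bcd & 0x0f) <= 9 && (bcd >> 4) <= 9);
	return ((bcd >> 4) * 10 + (bcd & 0x0f));
}

/*
 * Stand-in for the LinuxKPI-facing wrapper: the uint8_t parameter
 * truncates the argument, then the work is delegated.
 */
static uint8_t
sketch_linux_bcd2bin(uint8_t bcd)
{
	return ((uint8_t)sketch_libkern_bcd2bin(bcd));
}
```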
Suggested by: Johannes Berg (johannes sipsolutions.net) [1]
Suggested by: ian [2]
Sponsored by: The FreeBSD Foundation
(cherry picked from commit 548ada00e5)
(cherry picked from commit ae2268efd5)
If a pNFS server's DS runs out of disk space, it replies
NFSERR_NOSPC to the client doing writing. For the Linux
client, it then sends a LayoutError RPC to the server to
tell it about the error and keeps retrying, doing repeated
LayoutGet and Write RPCs to the DS. The Linux client is
"stuck" until disk space on the DS is freed up.
For a mirrored server configuration, the first mirror that
ran out of space was taken offline. This does not make
much sense, since the other mirror(s) will run out of space
soon and the fix is a manual cleanup of disk space.
This patch changes the pNFS server to not disable a mirror
for the mirrored case when this occurs.
Further work is needed, since the Linux client expects the
MDS to reply NFSERR_NOSPC to LayoutGets once the DS is out
of space. Without this further change, the above mentioned
looping occurs.
Found during a recent IETF NFSv4 working group testing event.
(cherry picked from commit a7e014eee5)
When a mount with the "pnfs" and "nconnect" options specified
does an I/O operation, it erroneously uses a TCP connection
to the MDS when it is meant to be a DS operation and, as such,
needs to use a TCP connection to the DS. This patch fixes this.
When the "pnfs" and "nconnect" options are specified for a
NFSv4.1/4.2 mount, there probably should be N connections
established to each DS for I/O RPCs. This is a fair amount
of work and may be done in a future commit.
This problem was found during a recent IETF NFSv4 working
group testing event.
(cherry picked from commit 80e5955b08)
Leaf 1Fh is a preferred extended version of 0Bh. It is supported by
new Alder Lake CPUs, though it does not report anything new so far.
MFC after: 2 weeks
(cherry picked from commit 6badb512a9)
Although I was not able to cause a failure during testing, there
are places in nfscl_removedeleg() and nfscl_renamedeleg() where
I think a forced dismount could get hung. This patch fixes those.
This patch only affects forced dismount and only if the NFSv4
server is issuing delegations to the client.
Found by code inspection.
(cherry picked from commit f5d5164fb6)
We should clear firewall tags on loopback, icmp reflection, or if_epair
transmission. Left over tags can produce unexpected behaviour,
especially on if_epair where a and b interfaces can be in different
vnets, and have different firewall policies set.
MFC after: 3 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D32664
(cherry picked from commit 7fe0c3f8d3)
Remove all (non-persistent) tags when we transmit a packet. Real network
interfaces do not carry any tags either, and leaving tags attached can
produce unexpected results.
Reviewed by: bz, glebius
MFC after: 3 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D32663
(cherry picked from commit 62d2dcafb7)
I made a mistake in merging the final commits for the devctl changes. This
adds the 'hushed' variable and has the correct dates for the manuals.
Pointy hat to: imp
(cherry picked from commit 80f21bb039)
Generate VT events when the bell beeps. When coupled with disabling the
bell, this allows custom bells to be rung when we'd otherwise beep.
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D32656
(cherry picked from commit 4ac3d08a96)
Add the glue needed to listen to TP_SETBELLPD which teken uses to
inform its client drivers about the results of parsing
\e[=<pitch>;<duration>B. It converts these to a Hz value for the
tone/pitch of the bell and a duration in ms. There's some loss of
precision because <pitch> in the escape sequence is defined to be
(1193182 / pitch) Hz and <duration> is in 10ms units. Also note that
kbdcontrol also parses 'off' but then doesn't send the proper escape
sequence, leading me to wonder if that's another bug since teken
appears to parse that sequence properly and I've added code here to
treat that as the same as quiet or disabled.
In general, Hz from 100 to 2000 is good. Outside that range is possible,
but even at 100Hz the square wave is starting to sound bad and above
2000Hz the speaker may not respond.
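The conversion described above amounts to the following arithmetic.
This is a sketch only: the function names are hypothetical, and
treating a <pitch> of 0 as "quiet" is an assumption made here to avoid
dividing by zero.

```c
#include <assert.h>
#include <stdint.h>

#define	SKETCH_TIMER_FREQ	1193182U	/* constant from the escape definition */

/* Convert the <pitch> parameter to Hz; 0 is treated as quiet (assumption). */
static unsigned int
bell_pitch_to_hz(unsigned int pitch)
{
	if (pitch == 0)
		return (0);
	/* Integer division is one place where precision is lost. */
	return (SKETCH_TIMER_FREQ / pitch);
}

/* Convert the <duration> parameter (10 ms units) to milliseconds. */
static unsigned int
bell_duration_to_ms(unsigned int duration)
{
	return (duration * 10);
}
```

For example, a <pitch> of 1193 maps to 1000 Hz and a <duration> of 5
maps to 50 ms.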
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D32620
(cherry picked from commit 2533eca1c2)
Change the 'period' argument to 'duration' and change its type to
sbintime_t so we can more easily express different durations.
Reviewed by: tsoome, glebius
Differential Revision: https://reviews.freebsd.org/D32619
(cherry picked from commit 072d5b98c4)
Remove detritus copied, apparently, from sparc. utrap has never been
used on arm, so it's safe to just remove it.
Sponsored by: Netflix
(cherry picked from commit c47a4a2375)