This is all very long-standing bug stuff that is touchy and still poorly
documented. Ok, here goes.
The basic bug:
* deleting a VAP causes the RX path (and TX path too) to be restarted
without a full chip reset, which causes RX hangs on the AR9380 and later.
(ie, the ones with the newer DMA engine.)
The basic fix:
* do an RX flush when stopping RX in ath_vap_delete() to match what happens
when RX is stopped elsewhere. This ensures any pending frames are completed
and we restart at the right spot; it also ensures we don't push new RX buffers
into the hardware if we're stopping receive.
The other issues I found:
* Don't bother checking the RX packet ring in the deferred read taskqueue;
that's specifically supposed to be for completing frames rather than
just yanking them off the receive ring.
* Cancel/drain any pending deferred read taskqueue. This isn't done inside
any locks so we should be super careful here. This stops the hardware
being reprogrammed at the same time in another thread/CPU whilst we're
stopping RX.
* .. (yes, this should be better serialised, but that's for another day. maybe.)
* Add more debugging to trace what's going on here.
And the fun bit:
* Reinitialise the RX FIFO ONLY if we've been reset or stopped, rather than just
reset. I noticed that after all the above was done I was STILL seeing RXEOL.
RXEOL isn't enabled on the AR9380 so I'd only see it if I was sending TX frames
(ie a ping where it'd be transmitted but never received) so I was not being
spammed by RXEOL. So, as long as stuff is stopped, restart it.
This seems to be doing the right thing in both AP and STA modes.
What I should do next, if I ever get time:
* as I said above, serialise the receive stop/start to include taskqueues
* monitor RXEOL on the AR9380 and I keep seeing it spammed / lockups, just
go do a full chip reset to get things back on track. It sucks, but it
is better than nothing.
Tested:
* AR9380 AP/STA mode, adding/deleting a hostap VAP to trigger the TX/RX
queue stop/start; whilst also running an iperf through it. Lots of times.
Lots. Of.. Times.
I have to dig into why I'm seeing it on chips as late as the AR9380 era
stuff (as it's marked as an AR5416 bug, but who knows!) but i'm seeing
aggregate TX frames complete with no blockack bit set. So, everything
should be treated as a failure and do a hardware reset for good measure.
Tested:
* AR9380, STA mode
* AR9580 (5GHz), AP mode
I wasn't enforcing the maximum packet length when using static rates
so although the driver was enforcing it itself OK, the statistics were
sometimes going into the wrong bin.
Tested:
* AR9380, STA mode
- Consistently use 'void *' for key schedules / key contexts instead
of a mix of 'caddr_t', 'uint8_t *', and 'void *'.
- Add a ctxsize member to enc_xform similar to what auth transforms use
and require callers to malloc/zfree the context. The setkey callback
now supplies the caller-allocated context pointer and the zerokey
callback is removed. Callers now always use zfree() to ensure
key contexts are zeroed.
- Consistently use C99 initializers for all statically-initialized
instances of 'struct enc_xform'.
- Change the encrypt and decrypt functions to accept separate in and
out buffer pointers. Almost all of the backend crypto functions
already supported separate input and output buffers and this makes
it simpler to support separate buffers in OCF.
- Remove xform_userland.h shim to permit transforms to be compiled in
userland. Transforms no longer call malloc/free directly.
Reviewed by: cem (earlier version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24855
This change adds Hyper-V socket feature in FreeBSD. New socket address
family AF_HYPERV and its kernel support are added.
Submitted by: Wei Hu <weh@microsoft.com>
Reviewed by: Dexuan Cui <decui@microsoft.com>
Relnotes: yes
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D24061
Fix returning from xenstore device with locks held, which triggers the
following panic:
# cat /dev/xen/xenstore
^C
userret: returning with the following locks held:
exclusive sx evtchn_ringc_sx (evtchn_ringc_sx) r = 0 (0xfffff8000650be40) locked @ /usr/src/sys/dev/xen/evtchn/evtchn_dev.c:262
Note this is not a security issue since access to the device is
limited to root by default.
Sponsored by: Citrix Systems R&D
MFC after: 1 week
Previously the driver handled the bit within itself, but did not expose
the state change to net80211 and interface layers.
This change uses net80211 KPI for rfkill signaling.
The code is modeled after similar code in iwn and wpi.
Reviewed by: adrian
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24923
My preivous logic was a bit wrong. This caused transmissions that failed due
to a mix of short and long retries to count intermediate rates as OK if the
LONG retry count indicated some retries had made it to this intermediate rate,
but the SHORT retry count was the one that caused the whole transmit to fail.
Now status is passed in again - and this is the status for the whole transmission -
and then update_stats() does some quick math to see if the current transmission
series hit its long retry count or not before updating things as a success
or failure.
into account and remove the requirement that the MCS rate is "higher" if we're
considering a new rate.
Ok, another fun one.
* In order for reliable non-software retried higher MCS rates, the TX schedules
(inconsistently!) use hard-coded lower rates at the end of the schedule.
Now, hard-coded is a problem because (a) it means that aggregate formation
is limited by the SLOWEST rate, so I never formed large AMDU frames for
3 stream rates, and (b) if the AP disables lower rates as base rates, it
complains about "unknown rix" every frame you transmit at that rate.
So, for now just disable the third and fourth schedule entry for AMPDUs.
Now I'm forming 32k and 64k aggregates for the higher density MCS rates
much more reliably.
It would be much nicer if the rate schedule stuff wasn't fixed but instead
I'd just populate ath_rc_series[] when I fetch the rates. This is all a
holdover of ye olde pre-11n stuff and I really just need to nuke it.
But for now, ye hack.
* The check for "is this MCS rate better" based on MCS itself is just garbage.
It meant things like going MCS0->7 would be fine, and say 0->8->16 is fine,
(as they're equivalent encoding but 1,2,3 spatial streams), BUT it meant
going something like MCS7->11 would fail even though it's likely that
MCS11 would just be better, both for EWMA/BER and throughput.
So for now just use the average tx time. The "right" way for this comparison
would be to compare PHY bitrates rather than MCS / rate indexes, but I'm not
yet there. The bit rates ARE available in the PHY index, but honestly
I have a lot of other cleaning up to here before I think about that.
* Don't include the RTS/CTS retry count (and thus time) into the average tx time
caluation. It just makes temporarily failures make the rate look bad by
QUITE A LOT, as RTS/CTS exchanges are (a) long, and (b) mostly irrelevant
to the actual rate being tried. If we keep hitting RTS/CTS failures then
there's something ELSE wrong on the channel, not our selected rate.
* Fix formatting, cause reasons;
* Put back the "and the chosen rate is within 90% of the current rate" logic;
* Ensure the best rate and the current rate aren't the same; this ...
* ... fixes the packets_since_switch[] tracking to actually conut how many
frames since the rate switched, so now I know how stable stuff is; and
* Ensure that MCS can go up to a higher MCS at this or any other spatial stream.
My previous quick hack attempt was doing > rather than >= so you had to go
to both a higher root MCS rate (0..7) and spatial stream. Eg, you couldn't
go from MCS0 (1ss) to MCS8 (2ss) this way.
The best rate and switching rate logic still have a bunch more work to do
because they're still quite touchy when it comes to average tx time but at least
now it's choosing higher rates correctly when it wants to try a higher rate.
Tested:
* AR9380, STA mode
Some laptops don't send ACPI "lid status changed" notifications upon
opening the lid if the system was currently suspended. In r358219
this was partially fixed, updating the "lid_status" variable upon
resume even if there is no "status changed" notification from ACPI.
Unfortunately the fix in r358219 did not include notifying userland
via devd; this causes problems on systems using upowerd (e.g. KDE),
since upowerd remembers the most recent devd notification about the
lid status rather than querying the sysctl to get the current status.
This showed up as two symptoms when KDE's "When laptop lid closed: Sleep"
option is set:
1. 50% of the time, closing the lid would not trigger S3 sleep.
2. 50% of the time, plugging/unplugging AC power would trigger S3 sleep.
PR: 246477
MFC after: 3 days
My initial rate control code was .. suboptimal. I wanted to at least get MCS
rates sent, but it didn't do anywhere near enough to handle low signal level links
or remotely keep accurate statistics.
So, 8 years later, here's what I should've done back then.
* Firstly, I wasn't at all tracking packet sizes other than the two buckets
(250 and 1600 bytes.) So, extend it to include 4096, 8192, 16384, 32768 and
65536. I may go add 2048 at some point if I find it's useful.
This is important for a few reasons. First, when forming A-MPDU or AMSDU
aggregates the frame sizes are larger, and thus the TX time calculation
is woefully, increasingly wrong. Secondly, the behaviour of 802.11 channels
isn't some fixed thing, both due to channel conditions and radios themselves.
Notably, there was some observations done a few years ago on 11n chipsets
which noticed longer aggregates showed an increase in failed A-MPDU sub-frame
reception as you got further along in the transmit time. It could be due to
a variety of things - transmitter linearity, channel conditions changing,
frequency/phase drift, etc - but the observation was to potentially form
shorter aggregates to improve BER.
* .. and then modify the ath TX path to report the length of the aggregate sent,
so as the statistics kept would line up with the correct bucket.
* Then on the rate control look-up side - i was also only using the first frame
length for an A-MPDU rate control lookup which isn't good enough here.
So, add a new method that walks the TID software queue for that node to
find out what the likely length of data available is. It isn't ALL of the
data in the queue because we'll only ever send enough data to fit inside the
block-ack window, so limit how many bytes we return to roughly what ath_tx_form_aggr()
would do.
* .. and cache that in the first ath_buf in the aggregate so it and the eventual
AMPDU length can be returned to the rate control code.
* THEN, modify the rate control code to look at them both when deciding which bucket
to attribute the sent frame on. I'm erring on the side of caution and using the
size bucket that the lookup is based on.
Ok, so now the rate lookups and statistics are "more correct". However, MCS rates
are not the same as 11abg rates in that they're not a monotonically incrementing
set of faster rates and you can't assume that just because a given MCS rate fails,
the next higher one wouldn't work better or be a lower average tx time.
So, I had to do a bunch of surgery to the best rate and sample rate math.
This is the bit that's a WIP.
* First, simplify the statistics updates (update_stats()) to do a single pass on
all rates.
* Next, make sure that each rate average tx time is updated based on /its/ failure/success.
Eg if you sent a frame with { MCS15, MCS12, MCS8 } and MCS8 succeeded, MCS15 and MCS
12 would have their average tx time updated for /their/ part of the transmission,
not the whole transmission.
* Next, EWMA wasn't being fully calculated based on the /failures/ in each of the
rate attempts. So, if MCS15, MCS12 failed above but MCS8 didn't, then ensure
that the statistics noted that /all/ subframes failed at those rates, rather than
the eventual set of transmitted/sent frames. This ensures the EWMA /and/ average
TX time are updated correctly.
* When picking a sample rate and initial rate, probe rates aroud the current MCS
but limit it to MCS0..7 /for all spatial streams/, rather than doing crazy things
like hitting MCS7 and then probing MCS8 - MCS8 is basically MCS0 but two spatial
streams. It's a /lot/ slower than MCS7. Also, the reverse is true - if we're at
MCS8 then don't probe MCS7 as part of it, it's not likely to succeed.
* Fix bugs in pick_best_rate() where I was /immediately/ choosing the highest MCS
rate if there weren't any frames yet transmitted. I was defaulting to 25% EWMA and
.. then each comparison would accept the higher rate. Just skip those; sampling
will fill in the details.
So, this seems to work a lot better. It's not perfect; I'm still seeing a lot of
instability around higher MCS rates because there are bursts of loss/retransmissions
that aren't /too/ bad. But i'll keep iterating over this and tidying up my hacks.
Ok, so why this still something I'm poking at? rather than porting minstrel_ht?
ath_rate_sample tries to minimise airtime, not maximise throughput. I have
extended it with an EWMA based on sub-frame success/failures - high MCS rates
that have partially successful receptions still show super short average frame
times, but a /lot/ of retransmits have to happen for that to work.
So for MCS rates I also track this EWMA and ensure that the rates I'm choosing
don't have super crappy packet failures. I don't mind not getting lower
peak throughput versus minstrel_ht; instead I want to see if I can make "minimise
airtime" work well.
Tested:
* AR9380, STA mode
* AR9344, STA mode
* AR9580, STA/AP mode
This function is responsible for setting pc_domain in each pcpu
structure. Call it from the main function that starts APs, rather than
a separate SYSINIT. This makes it easier to close the window where
UMA's per-CPU slab allocator may be called while pc_domain is
uninitialized. In particular, the allocator uses pc_domain to allocate
domain-local pages, so allocations before this point end up using domain
0 for everything.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24757
__builtin_unreachable doesn't raise any compile-time warnings/errors on its
own, so problems with its usage can't be easily detected. While it would be
nice for this situation to change and compilers to at least add a warning
for trivial cases where local state means the instruction can't be reached,
this isn't the case at the moment and likely will not happen.
This commit adds an __assert_unreachable, whose intent is incredibly clear:
it asserts that this instruction is unreachable. On INVARIANTS builds, it's
a panic(), and on non-INVARIANTS it expands to __unreachable().
Existing users of __unreachable() are converted to __assert_unreachable,
to improve debuggability if this assumption is violated.
Reviewed by: mjg
Differential Revision: https://reviews.freebsd.org/D23793
So, replicate the ATI vendor snoop configuration for the AMD vendor.
I think that this should fix a number of cases where users currently
have to resort to polling or disabling MSI.
MFC after: 1 week
Right now (well, since I did this in 2011/2012) the rate control code
makes some super bad choices for 11n aggregates/rates, and it tracks
statistics even more questionably.
It's been long enough and I'm now trying to use it again daily, so let's
start by:
* telling the rate control code if it's an aggregate or not;
* being clearer about the TID - yes it can be extracted from the
ath_buf but this way it can be overridden by the caller without
changing the TID itself.
(This is for doing experiments with voice/video QoS at some point..)
* Return an optional field to limit how long the aggregate is in
microseconds. Right now the rate control code supplies a rate table
and the ath aggr form code will look at the rate table and limit
the aggregate size to 4ms at the slowest rate. Yeah, this is pretty
terrible.
* Add some more TODO comments around handling txpower, rate and
handling filtered frames status so if I continue to have spoons for
this I can go poke at it.
Yes, people shouldn't use bitfields in C for structure parsing.
If someone ever wants a cleanup task then it'd be great to remove them
from this vendor code and other places in the ar9285/ar9287 HALs.
Alas, here we are.
AH_BYTE_ORDER wasn't defined and neither were the two values it could be.
So when compiling ath_ee_print_9300 it'd default to the big endian struct
layout and get a WHOLE lot of stuff wrong.
So:
* move AH_BYTE_ORDER into ath_hal/ah.h where it can be used by everyone.
* ensure that AH_BYTE_ORDER is actually defined before using it!
This should work on both big and little endian platforms.
There are no in-kernel consumers.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24775
It no longer has any in-kernel consumers via OCF. smbfs still uses
single DES directly, so sys/crypto/des remains for that use case.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24773
There are no longer any in-kernel consumers. The software
implementation was also a non-functional stub.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24771
Although a few drivers supported this algorithm, there were never any
in-kernel consumers. cryptosoft and cryptodev never supported it,
and there was not a software xform auth_hash for it.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24767
Pursuant to r360398, implement driver-specific versions of the
ifdi_needs_restart iflib device method.
Some (if not most?) Intel network cards don't need reinitializing when a
VLAN is added or removed from the device hardware, so these implement
ifdi_needs_restart in a way that tell iflib not to bring the interface
up or down when a VLAN is added or removed, regardless of whether the
VLAN_HWFILTER interface capability flag is set or not.
This could potentially solve several PRs relating to link flaps that
occur when VLANs are added/removed to devices.
Signed-off-by: Eric Joyner <erj@freebsd.org>
PR: 240818, 241785
Reviewed by: gallatin@, olivier@
MFC after: 3 days
MFC with: r360398
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D24659
r360870 added linux/slab.h into liunx/bitmap.h and this include linux/types.h
The qlnx driver is redefining some of those types so remove them and add an
explicit linux/types.h include.
Pointy hat: manu
Reported by: Austin Shafer <ashafer@badland.io>
Some ethernet switches have very large register windows; for example
the AR8316 switch MIB starts at 0x20000.
Submitted by: Mori Hiroki <yamori813@yahoo.co.jp>
The attach method uses GPIO_GET_BUS() to get a "newbus" device
that provides a pin. But on hints-based systems a GPIO controller
driver might not be fully initialized yet and it does not know gpiobus
hanging off it. Thus, GPIO_GET_BUS() cannot be called yet.
The reason is that controller drivers typically create a child gpiobus
using gpiobus_attach_bus() and that leads to the following call chain:
gpiobus_attach_bus() -> gpiobus_attach() ->
bus_generic_attach(gpiobus) -> gpioiic_attach().
So, gpioiic_attach() is called before gpiobus_attach_bus() returns.
I observed this bug with nctgpio driver on amd64.
I think that the problem was introduced in r355276.
The fix is to avoid calling GPIO_GET_BUS() from the attach method.
Instead, we know that on hints-based systems only the parent gpiobus can
provide the pins.
Nothing is changed for FDT-based systems.
MFC after: 1 week
Sometimes, especially when there is not much memory in the system left,
allocating mbuf jumbo clusters (like 9KB or 16KB) can take a lot of time
and it is not guaranteed that it'll succeed. In that situation, the
fallback will work, but if the refill needs to take a place for a lot of
descriptors at once, the time spent in m_getjcl looking for memory can
cause system unresponsiveness due to high priority of the Rx task. This
can also lead to driver reset, because Tx cleanup routine is being
blocked and timer service could detect that Tx packets aren't cleaned
up. The reset routine can further create another unresponsiveness - Rx
rings are being refilled there, so m_getjcl will again burn the CPU.
This was causing NVMe driver timeouts and resets, because network driver
is having higher priority.
Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are
enough - ENA MTU is being set to 9K anyway, so it's very unlikely that
more space than 9KB will be needed.
However, 9KB jumbo clusters can still cause issues, so by default the
page size mbuf cluster will be used for the Rx descriptors. This can have a
small (~2%) impact on the throughput of the device, so to restore
original behavior, one must change sysctl "hw.ena.enable_9k_mbufs" to
"1" in "/boot/loader.conf" file.
As a part of this patch (important fix), the version of the driver
was updated to v2.1.2.
Submitted by: cperciva
Reviewed by: Michal Krawczyk <mk@semihalf.com>
Reviewed by: Ido Segev <idose@amazon.com>
Reviewed by: Guy Tzalik <gtzalik@amazon.com>
MFC after: 3 days
PR: 225791, 234838, 235856, 236989, 243531
Differential Revision: https://reviews.freebsd.org/D24546
The bus is independent of the device, so all devices can be attached to
either a PCI bus or an MMIO bus. For example, QEMU's virtio-rng-device
gives the MMIO variant of virtio-rng-pci, and is now detected.
Reviewed by: andrew, br, brooks (mentor)
Approved by: andrew, br, brooks (mentor)
Differential Revision: https://reviews.freebsd.org/D24730
The non-legacy virtio MMIO specification drops the use of PFNs and
replaces them with physical addresses. Whilst many implementations are
so-called transitional devices, also implementing the legacy
specification, TinyEMU[1] does not. Device-specific configuration
registers have also changed to being little-endian, and must be accessed
using a single aligned access for registers up to 32 bits, and two
32-bit aligned accesses for 64-bit registers.
[1] https://bellard.org/tinyemu/
Reviewed by: br, brooks (mentor)
Approved by: br, brooks (mentor)
Differential Revision: https://reviews.freebsd.org/D24681
With the removal of in-tree consumers of DES, Triple DES, and
MD5-HMAC, the only algorithm this driver still supports is SHA1-HMAC.
This is not very useful as a standalone algorithm (IPsec AH-only with
SHA1 would be the only user).
This driver has also not been kept up to date with the original driver
in OpenBSD which supports a few more cards and AES-CBC on newer cards.
The newest card currently supported by this driver was released in
2005.
Reviewed by: cem
MFC after: 1 week
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24691
Only _BCL and _BCM methods seem to be essential to the driver's
operation. If _BQC is missing then we can assume that the current
brightness is whatever we set by the last _BCM invocation. If _DCS or
_DGS is missing the we can make assumptions as well.
The change is based on a patch suggested by Anthony Jenkins
<Scoobi_doo@yahoo.com> in PR 207086.
PR: 207086
Submitted by: Anthony Jenkins <Scoobi_doo@yahoo.com (earlier version)
Reviewed by: manu
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D24653
"F lock" is a switch between two sets of scancodes for function keys F1-F12
found on some Logitech and Microsoft PS/2 keyboards [1]. When "F lock" is
pressed, then F1-F12 act as function keys and produce usual keyscans for
these keys. When "F lock" is depressed, F1-F12 produced the same keyscans
but prefixed with E0.
Some laptops use [2] E0-prefixed F1-F12 scancodes for non-standard keys.
[1] https://www.win.tue.nl/~aeb/linux/kbd/scancodes-6.html
[2] https://reviews.freebsd.org/D21565
MFC after: 2 weeks
o Shrink sglist(9) functions to work with multipage mbufs down from
four functions to two.
o Don't use 'struct mbuf_ext_pgs *' as argument, use struct mbuf.
o Rename to something matching _epg.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598
The following series of patches addresses three things:
Now that array of pages is embedded into mbuf, we no longer need
separate structure to pass around, so struct mbuf_ext_pgs is an
artifact of the first implementation. And struct mbuf_ext_pgs_data
is a crutch to accomodate the main idea r359919 with minimal churn.
Also, M_EXT of type EXT_PGS are just a synonym of M_NOMAP.
The namespace for the newfeature is somewhat inconsistent and
sometimes has a lengthy prefixes. In these patches we will
gradually bring the namespace to "m_epg" prefix for all mbuf
fields and most functions.
Step 1 of 4:
o Anonymize mbuf_ext_pgs_data, embed in m_ext
o Embed mbuf_ext_pgs
o Start documenting all this entanglement
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598
All callers are currently filtering bad nsid to this function,
however, we'll have undefined behavior if that's not true. Add the
KASSERT to prevent that.
Add the nvmeX device to the XPT_PATH_INQ nvme specific
information. while one could figure this out by looking up the
domain🚌slot:function, it's a lot easier to have the SIM set it
directly since the sim knows this.
This largely reuses the TLS TOE support added in r330884. However,
this uses the KTLS framework in upstream OpenSSL rather than requiring
Chelsio-specific patches to OpenSSL. As with the existing TLS TOE
support, use of RX offload requires setting the tls_rx_ports sysctl.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24453
- Add a new TCP_RXTLS_ENABLE socket option to set the encryption and
authentication algorithms and keys as well as the initial sequence
number.
- When reading from a socket using KTLS receive, applications must use
recvmsg(). Each successful call to recvmsg() will return a single
TLS record. A new TCP control message, TLS_GET_RECORD, will contain
the TLS record header of the decrypted record. The regular message
buffer passed to recvmsg() will receive the decrypted payload. This
is similar to the interface used by Linux's KTLS RX except that
Linux does not return the full TLS header in the control message.
- Add plumbing to the TOE KTLS interface to request either transmit
or receive KTLS sessions.
- When a socket is using receive KTLS, redirect reads from
soreceive_stream() into soreceive_generic().
- Note that this interface is currently only defined for TLS 1.1 and
1.2, though I believe we will be able to reuse the same interface
and structures for 1.3.
Instead, copy the strings into a temporary buffer on the stack and
run strcmp on the copies.
Reviewed by: brooks, kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24567
Some models of laptops e.g. "X1 Carbon 3rd Gen Thinkpad" have LRM buttons
wired as so called "Synaptic touchpads extended buttons" rather thah real
trackpoint buttons. Handle this case with merging of events from both
sources.
PR: 245877
Reported by: Raichoo <raichoo@googlemail.com>
MFC after: 1 week
Kernel prints the device announcement before the attach method is
called, so if the correct description is not set by the probe method,
then the announcement would have an incorrect one.
MFC after: 1 week
Use DRIVER_MODULE_ORDERED(SI_ORDER_ANY) so that ig4's ACPI attachment
happens after iicbus and acpi_iicbus drivers are registered.
I have seen a problem where iicbus attached under ig4 instead of
acpi_iicbus when ig4.ko was loaded with kldload. I believe that that
happened because ig4 driver was a first driver to register, it attached
and created an iicbus child. Then iicbus driver was registered and,
since it was the only driver that could attach to the iicbus child
device, it did exactly that. After that acpi_iicbus driver was
registered. It would be able to attach to the iicbus device, but it was
already attached, so nothing happened.
MFC after: 2 weeks
In r360131, acpi_ec probe was changed to not clobber an error status prior to
several error cases that did not explicitly set the error variable before
goto'ing the exit path. However, I did not notice that the error variable was
not set to success in the success path. That caused all successful probes to
fail, which is obviously undesirable.
PR: 245778
Reported by: Neel Chauhan <neel AT neelc.org>, Evilham <contact AT evilham.com>
Tested by: Evilham
X-MFC-With: r360131
as the dma_device during RDMA registration.
cxgbe's struct device cannot be used as-is because it's a native FreeBSD
driver and ibcore is LinuxKPI based.
MFC after: 1 week
MFC after: r360196
The sole in-tree user of this flag has been retired, so remove this
complexity from all drivers. While here, add a helper routine drivers
can use to read the current request's IV into a local buffer. Use
this routine to replace duplicated code in nearly all drivers.
Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24450
In r360126, I meant to have a different mask only on powerpc, not powerpc64.
Update the check to check that we're not compiling for powerpc64.
Reported by: jhibbits
Approved by: wulf (implicit)
MFC after: 2 weeks
X-MFC-Note: 12 only
X-MFC-With: r360126
Differential Revision: D24370 (followup)
All of the 'goto out;' cases in this probe routine without explicit
initialization of 'ret' indicate error cases and were clearly intended
to use the initial definition of 'ret' with ENXIO. However, 'ret' was
accidentally squashed by reuse for a subroutine call near the beginning
of probe.
Use a different variable for the subroutine status to preserve ENXIO ret
for the 'goto out's as a minimal solution to the panic reported at attach
for now.
PR: 245757
Change kern.evdev.rcpt_mask from 3 to 12 by default. This makes us much
more evdev-friendly, and will prevent everyone using xorg and wayland with
evdev devices (the default) from needing to change this locally.
powerpc32 still uses the old value for the keyboard part, becaues the adb
keyboard driver used there is not evdev compatible.
Reviewed by: wulf
Approved by: wulf
MFC after: 2 weeks
X-MFC-Note: 12 only
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D24370
elements in the middle.
This fixes a panic when detaching USB mouse.
PR: 245732
Reviewed by: wulf
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D24500
don't implement link power management, LPM.
This fixes error code XHCI_TRB_ERROR_BANDWIDTH for isochronous USB 3.0
transactions.
Submitted by: Horse Ma <Shichun.Ma@dell.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
snd_hda was rewritten in r230130; one function retained a comment
referencing the previous name.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
A later change, currently being iterated on in D24459, will in-fact change
the lock type to an sx so that TTY drivers can sleep on it if they need to.
Committing this ahead of time to make the review in question a little more
palatable.
tty_lock_assert() is unfortunately still needed for now in two places to
make sure that the tty lock has not been recursed upon, for those scenarios
where it's supplied by the TTY driver and possibly a mutex that is allowed
to recurse.
Suggested by: markj
On my Dell Latitude 7390 laptop, the brightness hotkeys
(Fn+<up/down arrow>) send ACPI notifications which acpi_video
handles by adjusting its brightness setting; but ACPI does not
actually do anything with the backlight.
Announcing brightness changes via devd makes it possible to close
the loop by triggering the intel_backlight utility to perform the
required backlight adjustment.
Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D24424
Name each ehci driver uniquely.
This remove the warning printed at each arm boot :
module_register: cannot register simplebus/ehci from kernel; already loaded from kernel
A similar fix was done in r333074 but imx_ehci wasn't renamed and generic_ehci wasn't
present at that time.
KTLS uses the flowid to distribute software encryption tasks among its
pool of worker threads. Without this change, all software KTLS
requests for TOE sockets ended up on the first worker thread.
Note that the flowid for TOE sockets created via connect() is not a
hash of the 4-tuple, but is instead the id of the TOE pcb (tid). The
flowid of TOE sockets created from TOE listen sockets do use the
4-tuple RSS hash as the flowid since the firmware provides the hash in
the message containing the original SYN.
Reviewed by: np (earlier version)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24348
This fixes a panic when unloading and reloading t4_tom.ko since the
old pointer is still stored when t4_tom_load tries to set it.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24358
This pattern is used in callbacks with void * data arguments and seems
both relatively uncommon and relatively harmless. Silence the warning
by casting through uintptr_t.
This warning is on by default in Clang 11.
Reviewed by: arichardson
Obtained from: CheriBSD (partial)
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24425
If the fdt node doesn't have a cd-gpios properties or if the node is set
as non-removable we do not init the card detection timeout task as it is
useless so don't schedule it too.
MFC after: 1 month
X-MFC-With: r359924
This fixes a panic that would occur when the timer tried to close a
stale socket.
Submitted by: Krishnamraju Eraparaju @ Chelsio
MFC after: 1 week
Sponsored by: Chelsio Communications
Copy the CP, PTRIN, etc macros from freebsd32.h into a sys/abi_compat.h
and replace existing definitation with includes where required. This
eliminates duplicate code and allows Linux and FreeBSD compatability
headers to be included in the same files.
Input from: cem, jhb
Obtained from: CheriBSD
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24275
the dts to find the supported speeds and the regulators.
Not all DTS have every settings properly defined so host controller
will still have to add some caps themselves.
It also add a mmc_fdt_gpio_setup function which will read the cd-gpios
property and register it as the CD pin.
If the pin support interrupts one will be registered and the cd_helper
function will be called.
If the pin doesn't support interrupts the internal taskqueue will poll
for change and call the same cd_helper function.
mmc_fdt_gpio_setup will also parse the wp-gpio property and MMC drivers
can know the write-protect pin value by calling the
mmc_fdt_gpio_get_readonly function.
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D23267
While the original implementation of unmapped mbufs was a large
step forward in terms of reducing cache misses by enabling mbufs
to carry more than a single page for sendfile, they are rather
cache unfriendly when accessing the ext_pgs metadata and
data. This is because the ext_pgs part of the mbuf is allocated
separately, and almost guaranteed to be cold in cache.
This change takes advantage of the fact that unmapped mbufs
are never used at the same time as pkthdr mbufs. Given this
fact, we can overlap the ext_pgs metadata with the mbuf
pkthdr, and carry the ext_pgs meta directly in the mbuf itself.
Similarly, we can carry the ext_pgs data (TLS hdr/trailer/array
of pages) directly after the existing m_ext.
In order to be able to carry 5 pages (which is the minimum
required for a 16K TLS record which is not perfectly aligned) on
LP64, I've had to steal ext_arg2. The only user of this in the
xmit path is sendfile, and I've adjusted it to use arg1 when
using unmapped mbufs.
This change is almost entirely mechanical, except that we
change mb_alloc_ext_pgs() to no longer allow allocating
pkthdrs, the change to avoid ext_arg2 as mentioned above,
and the removal of the ext_pgs zone,
This change saves roughly 2% "raw" CPU (~59% -> 57%), or over
3% "scaled" CPU on a Netflix 100% software kTLS workload at
90+ Gb/s on Broadwell Xeons.
In a follow-on commit, I plan to remove some hacks to avoid
access ext_pgs fields of mbufs, since they will now be in
cache.
Many thanks to glebius for helping to make this better in
the Netflix tree.
Reviewed by: hselasky, jhb, rrs, glebius (early version)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24213
uart(4) backends currently detect RX FIFO overrun errors and report
them to the uart(4) core layer. They are then reported to the generic
TTY layer which promptly ignores them. As a result, there is
currently no good way to determine if a uart is experiencing RX FIFO
overruns. One could add a generic per-tty counter, but there did not
appear to be a good way to export those. Instead, add a sysctl under
the uart(4) sysctl tree to export the count of overruns.
Reviewed by: brooks
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24368
Shifting right by 1 is not the same as dividing by 2 for signed
values. In particular, dividing a signed value by 2 gives the integer
ceiling of the (e.g. -5 / 2 == -2) whereas shifting right by 1 always
gives the floor (-5 >> 1 == -3).
An embedded board with a 25 Mhz base clock results in an error of
-30.5% when used with a baud rate of 115200. Using division, this
truncates to -30% and is permitted. Using the shift, this fails and
is rejected causing TIOCSETA requests to fail with EINVAL and breaking
getty(8).
Using division gives the same error range for both over and under baud
rates and also makes the code match the behavior documented in the
existing comment about supporting boards with 25 Mhz clocks.
Reported by: imp
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24367
A T6 adapter contains two crypto engines on separate channels. This
commit distributes sessions between the two engines. Previously, only
the first engine was used.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24347
There are several reports of "hdac0: Command timeout on address 2"
messages emitted during playback on a variety of contemporary machines.
Show the command that timed out in case it might provide a clue in
finding the cause.
PR: 229190
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
The comment previously stated the delay must be at least 250us but that
was insufficient and so should be doubled, but the delay was actually
1000. The HDA spec actually says the delay must be 521 us (25 frames)
so update the comment to match.
Recognize new micro-architecture in hwpmc_intel driver. Based on Intel
document 325462-071US. Tested with tools/test/hwpmc/pmctest.py
on Atom E3930 SoC.
Submitted by: Dawid Gorecki <dgr@semihalf.com>
Reviewed by: kib
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D24310
When using SACK it can happen there are multiple option headers.
Don't drop these packets, but instead limit the amount of inlining
to the maximum supported.
MFC after: 1 week
Sponsored by: Mellanox Technologies
This includes 14 bytes of ethernet header and 2 bytes of VLAN header.
This allows for making assumptions about the inline size limit
in the fast transmit path later on.
Use a signed integer variable to catch underflow.
MFC after: 1 week
Sponsored by: Mellanox Technologies
The reporter is developing a frame buffer driver for hardware using
3 bytes per pixel, but a stride that's a multiple of 256. Previously
this resulted in writing beyond the end of each stride. On the last
row this attempted to write past the end of the frame buffer, triggering
the assertion in vt_fb_mem_wr1().
PR: 243533
MFC after: 2 weeks
Submitted by: Thomas Skibo
This is mostly indentation whitespace, and reflowing a few multiline
comments. This gets a bunch of minor stuff out of the way so that the diffs
for style don't clutter up the diffs for some upcoming functional changes.
Submitted by: Thomas Skibo
Differential Revision: https://reviews.freebsd.org/D24226
This driver hasn't been relevant in almost 15 years. It was for a product on the
shelves for about 6 months in 2003/2004. I've not updated the driver since then,
and have had nobody talk to me about it since maybe 2006 or 2007. It doesn't
implement a standard interface, and can be better done with libusb. All the
action has moved to webcamd for newer, more fully featured hardware. It makes no
appearances in the nycbug dmesg archive.
Relnotes: yes
MFC After: 3 days
This allows clean handoff from BIOS implementing some asynchronous I/O to
the OS AHCI driver. During attach driver declares OS ownership request
and waits from 25ms to 2s for BIOS to complete operation and release the
hardware.
MFC after: 2 weeks
JMB582 has 2 6Gbps SATA ports and PCIe 3.0 x1.
JMB585 has 5 6Gbps SATA ports and PCIe 3.0 x2.
Both chips support AHCI v1.31, Port Multiplier with FBS and 8 MSI vectors.
MFC after: 2 weeks
device. This requires some structural refactoring inside the driver, mostly
about converting existing audio channel structures into arrays.
The main audio mixer is provided by the first PCM instance.
The non-first audio instances may only have a software mixer for PCM playback.
Tested by: Horse Ma <Shichun.Ma@dell.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
key is pressed at the same time as a regular key, that means key with
modifier is output. Some automated USB keyboards like Yubikeys need this.
This fixes a regression issue after r357861.
Reported by: Adam McDougall <mcdouga9@egr.msu.edu>
PR: 224592
PR: 233884
MFC after: 3 days
Sponsored by: Mellanox Technologies
Incompatibility between i386 and amd64 evdev ABIs was caused by presence of
'struct timeval' in evdev protocol. Replace it with 'struct timeval32' for
32 bit binaries.
Big-endian platforms may require additional work due to bitstr_t (array of
unsigned longs) usage in ioctl interface.
MFC after: 2 weeks
If hdaa is used in polling mode, it logs each change to the poll
interval under bootverbose, which makes it unusable (slow). These
messages are arguably useless or are a debugging leftovers at best.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
- The linked list of cryptoini structures used in session
initialization is replaced with a new flat structure: struct
crypto_session_params. This session includes a new mode to define
how the other fields should be interpreted. Available modes
include:
- COMPRESS (for compression/decompression)
- CIPHER (for simply encryption/decryption)
- DIGEST (computing and verifying digests)
- AEAD (combined auth and encryption such as AES-GCM and AES-CCM)
- ETA (combined auth and encryption using encrypt-then-authenticate)
Additional modes could be added in the future (e.g. if we wanted to
support TLS MtE for AES-CBC in the kernel we could add a new mode
for that. TLS modes might also affect how AAD is interpreted, etc.)
The flat structure also includes the key lengths and algorithms as
before. However, code doesn't have to walk the linked list and
switch on the algorithm to determine which key is the auth key vs
encryption key. The 'csp_auth_*' fields are always used for auth
keys and settings and 'csp_cipher_*' for cipher. (Compression
algorithms are stored in csp_cipher_alg.)
- Drivers no longer register a list of supported algorithms. This
doesn't quite work when you factor in modes (e.g. a driver might
support both AES-CBC and SHA2-256-HMAC separately but not combined
for ETA). Instead, a new 'crypto_probesession' method has been
added to the kobj interface for symmteric crypto drivers. This
method returns a negative value on success (similar to how
device_probe works) and the crypto framework uses this value to pick
the "best" driver. There are three constants for hardware
(e.g. ccr), accelerated software (e.g. aesni), and plain software
(cryptosoft) that give preference in that order. One effect of this
is that if you request only hardware when creating a new session,
you will no longer get a session using accelerated software.
Another effect is that the default setting to disallow software
crypto via /dev/crypto now disables accelerated software.
Once a driver is chosen, 'crypto_newsession' is invoked as before.
- Crypto operations are now solely described by the flat 'cryptop'
structure. The linked list of descriptors has been removed.
A separate enum has been added to describe the type of data buffer
in use instead of using CRYPTO_F_* flags to make it easier to add
more types in the future if needed (e.g. wired userspace buffers for
zero-copy). It will also make it easier to re-introduce separate
input and output buffers (in-kernel TLS would benefit from this).
Try to make the flags related to IV handling less insane:
- CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv'
member of the operation structure. If this flag is not set, the
IV is stored in the data buffer at the 'crp_iv_start' offset.
- CRYPTO_F_IV_GENERATE means that a random IV should be generated
and stored into the data buffer. This cannot be used with
CRYPTO_F_IV_SEPARATE.
If a consumer wants to deal with explicit vs implicit IVs, etc. it
can always generate the IV however it needs and store partial IVs in
the buffer and the full IV/nonce in crp_iv and set
CRYPTO_F_IV_SEPARATE.
The layout of the buffer is now described via fields in cryptop.
crp_aad_start and crp_aad_length define the boundaries of any AAD.
Previously with GCM and CCM you defined an auth crd with this range,
but for ETA your auth crd had to span both the AAD and plaintext
(and they had to be adjacent).
crp_payload_start and crp_payload_length define the boundaries of
the plaintext/ciphertext. Modes that only do a single operation
(COMPRESS, CIPHER, DIGEST) should only use this region and leave the
AAD region empty.
If a digest is present (or should be generated), it's starting
location is marked by crp_digest_start.
Instead of using the CRD_F_ENCRYPT flag to determine the direction
of the operation, cryptop now includes an 'op' field defining the
operation to perform. For digests I've added a new VERIFY digest
mode which assumes a digest is present in the input and fails the
request with EBADMSG if it doesn't match the internally-computed
digest. GCM and CCM already assumed this, and the new AEAD mode
requires this for decryption. The new ETA mode now also requires
this for decryption, so IPsec and GELI no longer do their own
authentication verification. Simple DIGEST operations can also do
this, though there are no in-tree consumers.
To eventually support some refcounting to close races, the session
cookie is now passed to crypto_getop() and clients should no longer
set crp_sesssion directly.
- Assymteric crypto operation structures should be allocated via
crypto_getkreq() and freed via crypto_freekreq(). This permits the
crypto layer to track open asym requests and close races with a
driver trying to unregister while asym requests are in flight.
- crypto_copyback, crypto_copydata, crypto_apply, and
crypto_contiguous_subsegment now accept the 'crp' object as the
first parameter instead of individual members. This makes it easier
to deal with different buffer types in the future as well as
separate input and output buffers. It's also simpler for driver
writers to use.
- bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer.
This understands the various types of buffers so that drivers that
use DMA do not have to be aware of different buffer types.
- Helper routines now exist to build an auth context for HMAC IPAD
and OPAD. This reduces some duplicated work among drivers.
- Key buffers are now treated as const throughout the framework and in
device drivers. However, session key buffers provided when a session
is created are expected to remain alive for the duration of the
session.
- GCM and CCM sessions now only specify a cipher algorithm and a cipher
key. The redundant auth information is not needed or used.
- For cryptosoft, split up the code a bit such that the 'process'
callback now invokes a function pointer in the session. This
function pointer is set based on the mode (in effect) though it
simplifies a few edge cases that would otherwise be in the switch in
'process'.
It does split up GCM vs CCM which I think is more readable even if there
is some duplication.
- I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC
as an auth algorithm and updated cryptocheck to work with it.
- Combined cipher and auth sessions via /dev/crypto now always use ETA
mode. The COP_F_CIPHER_FIRST flag is now a no-op that is ignored.
This was actually documented as being true in crypto(4) before, but
the code had not implemented this before I added the CIPHER_FIRST
flag.
- I have not yet updated /dev/crypto to be aware of explicit modes for
sessions. I will probably do that at some point in the future as well
as teach it about IV/nonce and tag lengths for AEAD so we can support
all of the NIST KAT tests for GCM and CCM.
- I've split up the exising crypto.9 manpage into several pages
of which many are written from scratch.
- I have converted all drivers and consumers in the tree and verified
that they compile, but I have not tested all of them. I have tested
the following drivers:
- cryptosoft
- aesni (AES only)
- blake2
- ccr
and the following consumers:
- cryptodev
- IPsec
- ktls_ocf
- GELI (lightly)
I have not tested the following:
- ccp
- aesni with sha
- hifn
- kgssapi_krb5
- ubsec
- padlock
- safe
- armv8_crypto (aarch64)
- glxsb (i386)
- sec (ppc)
- cesa (armv7)
- cryptocteon (mips64)
- nlmsec (mips64)
Discussed with: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23677
- make sure volume controls are correctly mapped to "pcm" and "rec" depending
on how they deliver audio to the USB host.
- make sure there are no duplicate record selections.
- remove internal only mixer class type.
- don't add software volume controls for recording only.
- some minor mixer code cleanup.
Tested by: Horse Ma <Shichun.Ma@dell.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
This change fixes a couple of issues with OPAL IPMI driver and
implements a mechanism to detect timeouts and discard old messages left
in receive queue, to avoid old messages from being confused with the
reply of new ones.
Reviewed by: jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D24185
Reverts r293369. The macro was orginally correct, since our SMBus
framework, unlike i2c, already requires addresses to be 8-bit, LSB-cleared.
MFC after: 3 days
Sponsored by: Juniper Networks, Inc
Previously they included sys/dev/cx/machdep.h, but the cx driver was
retired in r359178. These drivers haven't had real development for
a decade or more so there's no real benefit in sharing this file; just
copy it to the ce and cp subdirs.
The devices supported by these drivers are obsolete ISA cards, and the
sync serial protocols they supported are essentially obsolete too.
Sponsored by: The FreeBSD Foundation
This reduces the lines bouncing around between the driver rx ithread and
the netmap rxsync thread. There is no net change in the size of the
struct (it continues to waste a lot of space).
This kind of split was originally proposed in D17869 by Marc De La
Gueronniere @ Verisign, Inc.
MFC after: 1 week
Sponsored by: Chelsio Communications
The warning used to be displayed for valid VPDs about 512B or above in
size. Fix the size check and add a break while here so that the routine
stops if if detects any problem.
Tested with "pciconf -lV"
Reviewed by: kib@, jhb@
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23679
When vmx(4) was converted to an iflib driver in r343291, the
power-of-2 queue count constraint was removed as it appeared that
current implementations of the VMXNET3 virtual device no longer
required that constraint. It turns out that some of the
implementations still do, and on such systems, the device will fail to
initialize when configured with a non-power-of-2 RX or TX queue count.
PR: 237321
Reported by: ncrogers@gmail.com
MFC after: 1 week
Skip device mode switch step on Fountain-based devices as they don't
support RAW_SENSOR_MODE command, so failing to attach.
This was reproduced on PowerBook G4 (model PowerBook5,6) equipped with
product ID 0x020e
Reviewed by: hselasky
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D24005
These adjustments improve performance with jumbo frames and/or LRO
enabled (i.e., when there may be multiple descriptors per packet) by
increasing the default size of the receive queues and by always using
page-sized buffers for the body type receive ring.
This patch also adjust the initialization of the max frame size to
remove cases where certain configuration sequences would result in 2K
receive buffers being used instead of 4K ones when jumbo frames were
enabled.
Reviewed by: gallatin
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23950
This fixes a bug where the checksum offload status of received packets
was being taken from the first descriptor instead of the last, which
affected LRO packets.
The driver has been hardened against the device skipping receive
descriptors, although it is not believed that this can occur given the
way this implementation configures the receive rings.
Additionally, for packets received with the error indicator set, the
driver now forces the length of all fragments in that packet to zero
prior to passing it to iflib. Such packets should wind up being
discarded at some point in the stack anyway, but this removes any
questions by killing them in the driver.
Counters have been added (and exposed via sysctls) for skipped receive
descriptors, zero-length packets received, and packets received with
the error indicator set so that these conditions can be easily
observed in the field.
PR: 243126, 243392, 240628
Reported by: avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer
Reviewed by: gallatin
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23949
192 bytes are not enough to print long commands, such as ATA COMMAND PASS
THROUGH(16), that makes debug output difficult to read.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
it's not in use by TOE or KTLS.
Reviewed by: jhb@
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24046
maximal input report size instead of wMaxPacketSize.
If the USB frame length is set to 1024 bytes, WMT_BSIZE, the EETI controller
will pack multiple touch events in the packet and the current code will only
process the first touch event.
As a result some important events are lost like releasing the finger from the
touchscreen.
Use the maximal input report size as buffer size instead.
PR: 244718
Tested by: Oskar Holmlund <oskar.holmlund@ohdata.se>, wulf
MFC after: 3 days
Discussed with: hselasky
Limiting frame size to maximum packet size breaks devices which have input
report size larger than wMaxPacketSize. Maximal input report size should be
used instead.
Revert the commit as it have not been MFC-ed yet.
Discussed with: hselasky
will pack multiple touch events in the packet and the current code will only
process the first touch event.
As a result some important events are lost like releasing the finger from the
touchscreen.
Use the maximum maximum packet size as buffer size instead.
Submitted by: Oskar Holmlund <oskar.holmlund@ohdata.se>
PR: 244718
MFC after: 3 days
Sponsored by: Mellanox Technologies
already allocating from the safe zone and the allocation fails.
This bug was introduced in r357481.
MFC after: 3 days
Sponsored by: Chelsio Communications
Touch Digitizer V04 report descriptor declares 'Contact Count Maximum' usage
as constant. That was not supported by descriptor parser.
PR: 232040
Reported by: Sergei Akhmatdinov <sakhmatd@darkn.space>
MFC after: 1 week
When iicbus is attached as child of Designware I2C controller it scans all
ACPI nodes for "I2C Serial Bus Connection Resource Descriptor" described
in section 19.6.57 of ACPI specs.
If such a descriptor is found, I2C child is added to iicbus, it's I2C
address, IRQ resource and ACPI handle are added to ivars. Existing
ACPI bus-hosted child is deleted afterwards.
The driver also installs so called "I2C address space handler" which is
disabled by default as nontested.
Set hw.iicbus.enable_acpi_space_handler loader tunable to 1 to enable it.
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D22901
Newbus device reference attached to ACPI handle is not cleared when newbus
device is deleted with devctl(8) delete command. Fix that with calling of
AcpiDetachData() from "child_deleted" bus method like acpi_pci driver does.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D22902
Without this change, if an AIF interrupt comes at the same time a SYNC
command is finished, the SYNC interrupt will be lost. This happens because
all interrupt bits (bellbits) are cleared, but only one of them is handled.
Debugging shows that, (at least) when !sc->msi_enabled and (sc->flags &
AAC_FLAGS_SYNC_MODE) is true (sync mode), both bits may be set at the same
time.
PR: 237463
Reviewed by: scottl
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D23859
Before skipping the current cpu when trying to find the ones that
have the same opp, record that this one have this opp.
Reported by: mmel
MFC after: 2 weeks
X-MFC-With: r358555
because it clobbers the super speed link status when a device is in super
speed mode. Currently the power bit is not needed for anything in the USB
hub driver.
This fixes USB warm reset for super speed devices.
Tested by: Shichun.Ma@dell.com
MFC after: 3 days
Sponsored by: Mellanox Technologies
Replace hardcoded sizes by nitems and sizeof
Replace CTLFLAG_NEEDGIANT with CTLFLAG_MPSAFE, I run this driver since a few
years with CTLFLAG_MPSAFE w/o issues.
Add a HACK to handle a special case for a sensor location.
This fixes some errors on PPC64, during attach and when trying to assign an IP
to an interface. With this change, basic operation of X710 NICs is now
possible.
This also fixes builds with IXL_DEBUG enabled
Reviewed by: erj
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D23975
As written, the condition of (cdb[0] != 0x28 || cdb[0] != 0x2A) will always
be true, since if it's one, it's obviously not the other. Reading the code,
the intent appears to be that it should only perform the operation if it's
neither, otherwise the conditional can be elided.
Found by clang 10.
Port aacraid driver to big-endian (BE) hosts.
The immediate goal of this change is to make it possible to use the
aacraid driver on PowerPC64 machines that have Adaptec Series 8 SAS
controllers.
Adapters supported by this driver expect FIB contents in little-endian
(LE) byte order. All FIBs have a fixed header part as well as a data
part that depends on the command being issued to the controller.
In this way, on BE hosts, the FIB header and all FIB data structures
used in aacraid.c and aacraid_cam.c need to be converted to LE before
being sent to the adapter and converted to BE when coming from it.
The functions to convert each struct are on aacraid_endian.c.
For little-endian (LE) targets, they are macros that expand
to nothing.
In some cases, when only a few fields of a large structure are used,
the fields are converted inline, by the code using them.
PR: 237463
Reviewed by: jhibbits
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D23887
Each segment can be up to 4096 bytes in chain structure according to the
RK3399 TRM Part 2.
Set the buffers in full ring where the last one point to the first one.
Correctly reports the MMC_IVAR_MAX_DATA.
Use CACHE_LINE_SIZE for bus_dma alignment.
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D23894
Check copyin's error code (differ adding copyout checks at this time).
Don't directly access user memory in the switch statement.
Since bnxt_ioctl_data isn't all that big, use a stack allocation.
Reviewed by: jhb
MFC after: 3 days
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23933
appropriate actions when we are trying to detach an audio device,
but cannot because someone is using it.
This avoids applications having to wait for the DSP read data
timeout before they receive any error indication.
Tested with virtual_oss(8).
Remove some unused definitions while at it.
PR: 194727
MFC after: 1 week
Sponsored by: Mellanox Technologies
This issue was observed on a PowerPC64 machine with an Adaptec RAID Controller
with PCI device ID 0x028d. After several read/write operations, the kernel was
panic'ing in bus_dmamap_sync(). This was due to a missing aac_unmap_command()
in the SYNC path.
PR: 237463
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D23668
This fixes a regression issue after r357861.
Reported by: James Wright <james.wright@jigsawdezign.com>
PR: 224592
PR: 233884
MFC after: 3 days
Sponsored by: Mellanox Technologies
When looking for cpu with the same OPP starts from the root /cpus node
so each instance of cpufreq_dt will now each cpu with the same operating
point.
Also test that the node we are testing have the property "device_type" set
to be equal to "cpu".
While here add more debug printfs (off by defaults).
MFC after: 2 weeks
These support outdated or obsolete ISA WAN (T1/E1) sync serial cards,
and these drivers haven't really been touched (other than in tree-wide
sweeps to keep them building) for 15+ years.
Related PCI devices ce and cp are still in the tree, with deprecation
proposed in D23928.
MFC after: 1 week
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
This issue was observed on a PowerPC64 machine with an Adaptec RAID
Controller with PCI device ID 0x028d, where sense data was causing a
buffer overflow because of wrong max sense length logic.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D23667
This allows libinput to disable touchpads when the lid is closed and
various desktop environments can show power-off dialogs when the power
button is pressed. While the latter is doable with devd a
cross-platform solution is nicer.
Submitted by: Greg V <greg@unrelenting.technology>
Differential Revision: https://reviews.freebsd.org/D23863
MFC after: 1 week
Sponsored by: Mellanox Technologies
Also, inline/remove now empty or trivial macros. CAM has evolved enough this
code couldn't work there anyway, and the API sweeep commits made since then were
made unconditional.
Eliminate code for old versions, inline pci_find_cap instead of relying on
compat ifdef.
This commit should have been combined with r358488 before pushing it in.
these look to be cut and pasted from other drivers since this driver was
committed to FreeBSD 7-current and MFC'd to FreeBSD 6. The ones for FreeBSD 4
and 5 likely never were working...
The IVAR_MAX_DATA is supposed to have the number of descriptor X the mmc
block size and desc_count contain all this information + 1.
Reported by: phk
MFC after: 1 week
This more clearly differentiates TLS records encrypted and decrypted
in TOE connections from those encrypted via NIC TLS.
MFC after: 1 week
Sponsored by: Chelsio Communications
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Reviewed by: cem
Approved by: csprng, kib (mentor, blanket)
Differential Revision: https://reviews.freebsd.org/D23841
Remove a number of workarounds for older versions of FreeBSD. FreeBSD stable/10
was branched over 6 years ago. All of these changes date from about that time or
earlier. These workarounds are extensive and get in the way of understanding
the current flow in the driver.
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
Remove an old workaround that is no longer necessary since rS343824.
There used to be a problem with FMan interrupts firing on multiple CPUS
at the same time.
This ended up being due to multicast interrupts being unsupported in the
Freescale PIC (so instead of using a selection algorithm, it would do some
unspecified action, such as interrupting multiple cpus at random.)
Reviewed by: jhibbits
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D23829
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE.
Reviewed by: royger
Approved by: kib (mentor, blanket)
Differential Revision: https://reviews.freebsd.org/D23638
When turning IBRS mitigation using sysctl, as opposed to loader tunable,
send IPI to tweak MSR on all cores. Right now code only performed MSR write
onr the CPU where sysctl was run.
Properly report hw.ibrs_active for IBRS_ALL. Split hw_ibrs_ibpb_active out
from ibrs_active, to keep the current semantic of guiding kernel entry and
exit handlers.
Reported and tested by: mav
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
key-codes from the USB keyboard. Negative key-codes are currently skipped.
While at it use the bit size value provided by the HID location structure
instead of assuming a value of 8.
This fixes a regression issue after r357861.
Reported by: Minoru TANABE <kotanabe3@gmail.com>
PR: 224592
PR: 233884
MFC after: 3 days
Sponsored by: Mellanox Technologies
commands have completed.
It's not OK to force complete any pending commands before we send the
REMOVE_DEVICE. Instead, make sure that all pending commands are complete before
sending that. By trying to second guess the firmware here, we run the risk of
completing commands twice, which leads to corruption.
This removes the forced completion of commands introduced in r218811. So it's a
partial backout of that commit, but replaces it with a more rebust
mechanism. Either these commands will complete due to the TARGET RESET, or they
will timeout and be aborted, but they will all complete.
Add assert that all commands are complete to REMOVE_DEVICE completion
routine. We attempt to assure this programatically, so we shouldn't have any
commands in the queue because we've waited for them all. Any commands that make
it into our action routine after we mark the target in removal will complete
immediately with an error.
When we're removing a target that's not a volume, advertise up the stack that
it's actually gone, as opposed to having a transient selection error we should
retry. Do this both in the action routine, and when we get a notification of an
aborted command. We don't do this for volumes because the driver tries hard not
to advertise to the OS a volume has disappeared.
Apply these changes to both mpr and mps since they are based on quite similar
designs.
Discussed with: scottl@
Differential Revision: https://reviews.freebsd.org/D23768
After the network epoch was added, we lost the ability to migrate the
ithread in the middle of dispatch, as being in the network epoch will pin
the current thread (for safety reasons.)
Luckily, we don't actually have to do this workaround in the first place,
as we can just bind it to the correct cpu when we preallocate it.
Pass dev through to XX_PreallocAndBindIntr() and actually bind it to the
cpu like it was supposed to in the first place, instad of leaving it
floating and moving it to the correct cpu the first time it fires.
This fixes panics while bringing up dtsec on my X5000.
Reviewed by: jhibbits
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D23826
switch over to opt-in instead of opt-out for epoch.
Instead of IFF_NEEDSEPOCH, provide IFF_KNOWSEPOCH. If driver marks
itself with IFF_KNOWSEPOCH, then ether_input() would not enter epoch
when processing its packets.
Now this will create recursive entrance in epoch in >90% network
drivers, but will guarantee safeness of the transition.
Mark several tested drivers as IFF_KNOWSEPOCH.
Reviewed by: hselasky, jeff, bz, gallatin
Differential Revision: https://reviews.freebsd.org/D23674
if_capabilities indicates capabilities supported by the hardware;
if_capenable which are enabled. Note that rx checksum is still disabled
in the driver at compile time.
Submitted by: Johannes <iz-rpi04@hs-karlsruhe.de>
MFC after: 2 weeks
Driver working in LLQ mode in some cases can send only few last segments
of the mbuf using DMA engine, and the rest of them are sent to the
device using direct PCI transaction. To map the only necessary data, two DMA
maps were used. That solution was very rough and was causing a bug - if
both maps were used (head_map and seg_map), there was a race in between
two flows on two queues and the device was receiving corrupted
data which could be further received on the other host if the Tx cksum
offload was enabled.
As it's ok to map whole mbuf and then send to the device only needed
segments, the design was simplified to use only single DMA map.
The driver version was updated to v2.1.1 as it's important bug fix.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Approved by: kib (mentor, blanket)
Differential Revision: https://reviews.freebsd.org/D23636
If a failure happens reading the lid state, assume the lid is opened.
Suggested by: cem @
Differential Revision: https://reviews.freebsd.org/D23724
PR: 240881
MFC after: 1 week
Sponsored by: Mellanox Technologies
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all low hanging fruits as MPSAFE.
Reviewed by: markj
Approved by: kib (mentor, blanket)
Differential Revision: https://reviews.freebsd.org/D23626
The index should be computed as distance from arg[0] and not
the beginning of struct mlx5_ib_congestion .
While at it fix a use of zero length array to avoid depending
on undefined compiler behaviour.
MFC after: 1 week
Sponsored by: Mellanox Technologies
While at it update the sysctl(9) description for the lid state.
Differential Revision: https://reviews.freebsd.org/D23724
PR: 240881
Submitted by: Yuri Pankov <yuripv@yuripv.me>
MFC after: 1 week
Sponsored by: Mellanox Technologies
When we register an interrupt handler we need to pass the intr_type along in
bus_setup_intr().
The interrupt type matters because it is used to decide if we need to enter
NET_EPOCH. That meant that vtmmio-based if_vtnet did not, which led to panics
with INVARIANTS set.
Sponsored by: Axiado
The epoch stuff with taskqueues works fine if the driver never calls
the receive path in other contexts, but this driver does. If there was
a chip reset during active receive then part of the reset will call
the receive path to flush out any active packets before reinitialising
the receive queue and that needs to be done with the epoch held.
So:
* make the receive task a normal task again
* explicitly call epoch enter/exit around the legacy and newer DMA
receive paths
* add a couple of epoch asserts to ensure that the receive packet
path itself is called with epoch held.
This fixes it on my Atom eeepc laptop (circa 2010!) that I did
all of my initial 802.11n work in this driver and net80211.
Tested:
* AR9285, STA mode
TODO:
* Test on EDMA chipset (AR9380)
* Test in AP/adhoc modes, just to be sure (eg for beacon
receive processing in particular.)
ACPI Control Method Batteries have a _BIF and/or _BIX object which
provide static properties of the battery. FreeBSD acpi_cmbat module
supported _BIF object only, which was deprecated as of ACPI 4.0.
_BIX is an extended version of _BIF defined in ACPI 4.0 or later.
As of writing, _BIX has two revisions. One is in ACPI 4.0 (rev.0) and
another is in ACPI 6.0 (rev.1). It seems that hardware vendors still
stick to _BIF only or _BIX rev.0 + _BIF for the maximum compatibility.
Microsoft requires _BIX rev.0 for Windows machines, so there are some
laptop machines with _BIX rev.0 only. In this case, FreeBSD does not
recognize the battery information.
After this change, the acpi_cmbat module gets battery information from
_BIX or _BIF object and internally uses _BIX rev.1 data structure as
the primary information store in the kernel. ACPIIO_BATT_GET_BI[FX]
returns an acpi_bi[fx] structure built by using information obtained
from a _BIF or a _BIX object found on the system. The revision number
field can be used to check which field is available. The acpiconf(8)
utility will show additional information if _BIX is available.
Although ABIs of ACPIIO_BATT_* were changed, the existing APIs for
userland utilities are not changed and the backward-compatible ABIs
are provided. This means that older versions of acpiconf(8) can also
work with the new kernel. The (union acpi_battery_ioctl_arg) was
padded to 256 byte long to avoid another ABI change in the future.
A _BIX object with its revision number >1 will be treated as
compatible with the rev.1 _BIX format.
Reviewed by: takawata
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23728
buffer group.
This fixes a bug where congestion drops on port 1 of a T6 card would
incorrectly be counted as drops on port 0.
MFC after: 1 week
Sponsored by: Chelsio Communications
We need this to use EARLY_DRIVER_MODULE in child drivers on arm64. This
should be a no-op on x86 as it has DRIVER_MODULE in the nexus driver making
all later drivers attach in the last pass.
Reviewed by: imp
MFC after: 1 month
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D23717
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked). Use it in
preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Reviewed by: imp, kib
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D23633
Fix the following -Werror warning from clang 10.0.0 in hptmv(4):
sys/dev/hptmv/ioctl.c:240:4: error: misleading indentation; statement is not part of the previous 'if' [-Werror,-Wmisleading-indentation]
_vbus_p=pArray->pVBus;
^
sys/dev/hptmv/ioctl.c:237:10: note: previous statement is here
if(!mIsArray(pArray))
^
This is because the return statement after the if statement was not
indented. (Note that this file has been idented assuming 4-space tabs.)
MFC after: 3 days
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked). Use it in
preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Reviewed by: hselasky, kib
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D23632
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked). Use it in
preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Reviewed by: kib, trasz
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D23640
sys/dev/hptmv/ioctl.c:240:4: error: misleading indentation; statement is not part of the previous 'if' [-Werror,-Wmisleading-indentation]
_vbus_p=pArray->pVBus;
^
sys/dev/hptmv/ioctl.c:237:10: note: previous statement is here
if(!mIsArray(pArray))
^
This is because the return statement after the if statement was not
indented. (Note that this file has been idented assuming 4-space tabs.)
MFC after: 3 days
The upstream OpenSSL changes only set the cipher for GCM since the
authentication is redundant, and changes to OCF will soon remove the
GCM authentication algorithm constants entirely for the same reason.
In addition, ktls_create_session() already validates these fields and
wouldn't pass down an invalid auth_algorithm value to any drivers or
ktls backends.
Reviewed by: hselasky
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23671
It duplicated the kern_tls_records stat and was not conditional on NIC
TLS being enabled.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23670
Initialize the FCH SMBus controller for Hygon Dhyana CPU.
Set the vendor of the FCH description via the exact CPU vendor.
Submitted by: Pu Wen <puwen@hygon.cn>
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23558
Add support for decoding pressed keys as a bitmap. The keys in the
bitmap are described in the interface specific HID descriptor. Some
keyboards even have multiple input interfaces, only using the bitmap
method when the event array is full. That typically means when more
than seven keys are pressed simultaneously.
The internals of the USB keyboard driver have been slightly reworked
to keep track of all keys in a single bitmap having 256 bits. This
bitmap is then divided into blocks of 64-bits as an optimisation.
Simplify automatic key repeat logic, because only the last key pressed
can be repeated.
PR: 224592
PR: 233884
Tested by: Alex V. Petrov <alexvpetrov@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
It was saved from the initial purge of drivers in fcp-101 due to being
the onboard Ethernet device on a number of sparc64 machines. Now that
sparc64 is gone, it serves little purpose (PCI cards exist, but are rare
and are unlikely to have been deployed outside Sun systems).
MFC after: 3 days
Platform (N1SDP).
Neoverse N1 is a high-performance ARM microarchitecture designed
by the ARM Holdings for the server market.
The PCI part on N1SDP was shipped untested and suffers from some
integration issues.
For instance accessing to not existing BDFs causes System Error
(SError) exception. To mitigate this, the firmware scans the bus,
catches SErrors and creates a table with valid BDFs. That allows
us to filter-out accesses to invalid BDFs in this driver.
Also the root complex config space (BDF == 0) has an unusual
location in memory map, so remapping accesses to it is required.
Finally, the config space is restricted to 32-bit accesses only.
This was tested on the ARM boxes kindly provided by the ARM Ltd
to the DARPA CHERI Project.
In collaboration with: andrew
Reviewed by: andrew
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D23349
This patch introduces processing of the frames
up to 9kB by the mvneta driver. Some versions of
this NIC limit TX checksum offloading, depending
on the frame size, so add appropriate handling
of this feature.
Submitted by: Kornel Duleba
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D23225
To make the PMC tool pmcstat working properly on Hygon platform, add
support for Hygon Dhyana family 18h by using the PMC initialization
code path of AMD family 17h.
Submitted by: Pu Wen <puwen@hygon.cn>
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23562
riscv cores.
GFE cores come with standard DTS file that lacks standard 'dmas ='
property, which means xae(4) could not find a DMA controller to use.
The 'dmas' property could not be added to the DTS file because the
ethernet controller and DMA engine parts in Linux are implemented
in a single driver.
Instead of 'dmas' property the standard Xilinx 'axistream-connected'
property is provided, so fallback to use it instead.
Suggested by: James Clarke <jrtc27@jrtc27.com>
Reviewed by: James Clarke <jrtc27@jrtc27.com>
Sponsored by: DARPA, AFRL
in the sysctl block for the driver. mpsutil/mprutil needs this so it can
know how big of a buffer to allocate when requesting the IOCFacts from the
controller. This eliminates the kernel console messages about wrong
allocation sizes.
Reported by: imp
BIO_READ and BIO_WRITE, we've handled this expanded syntax poorly in
drivers when the driver doesn't support a particular command. Do a
sweep and fix that.
Reported by: imp
Currently the Xen console is always attached with priority CN_REMOTE
(highest), which means that when booting with a single console the Xen
console will take preference over the VGA for example, and that's not
intended unless the user has also selected to use a serial console.
Fix this by lowering the priority of the Xen console to NORMAL unless
the user has selected to use a serial console. This keeps the usual
FreeBSD behavior of outputting to the internal consoles (ie: VGA) when
booted as a Xen dom0.
MFC after: 3 days
Sponsored by: Citrix Systems R&D
This means that extra virtual interfaces (VIs) created with
hw.cxgbe.num_vis are no longer required to use netmap. Use this
tunable to enable native netmap support on the main interface:
hw.cxgbe.native_netmap="3"
There is no change in default behavior.
Suggested by: jch@
MFC after: 2 weeks
Sponsored by: Chelsio Communications
In legacy VirtIO drivers, the header must be PCI endianness (little) and the
device-specific region is encoded in the native endian of the guest.
This patch makes the access (read/write) to VirtIO header using the little
endian order. Other read and write access are native endianness. This also
sets the device's IO region as big endian if on big endian machine.
PR: 205178
Submitted by: Andre Silva <afscoelho@gmail.com>
Reported by: Kenneth Salerno <kennethsalerno@yahoo.com>
Reviewed by: bryanv, bdragon, luporl, alfredo
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D23401
The cong_drop setting will apply to queues created after the setting is
changed and not to existing queues.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
This most importantly reduces duplication, but it also removes any potential
race with usage of dev->si_drv1 since it's now set prior to the device being
constructed enough to be accessible.
This simplifies the driver's rx fast path as well as the bookkeeping
code that tracks various rx buffer sizes and layouts.
MFC after: 1 week
Sponsored by: Chelsio Communications
ext_arg2 is the only item in the third cacheline in an mbuf and could be
cold by the time rxb_free runs. Put the information needed by rxb_free
in the same line as the refcount, which is very likely to be hot given
that rxb_free runs when the refcount is decremented and reaches 0.
MFC after: 1 week
Sponsored by: Chelsio Communications
requested.
This is a tradeoff between PCIe efficiency during large packet rx and
packing efficiency during small packet rx.
MFC after: 1 week
Sponsored by: Chelsio Communications
The Parallel Port SCSI adapter was interesting for 100MB ZIP drives, but is no
longer used or maintained. Remove it from the tree.
The Parallel Port microsequencer (microseq.9) is now mostly unused in the tree,
but remains. PPI still refrences it, but doesn't use its full functionality.
Relnotes: Yes
Reviewed by: rgrimes@, Ihor Antonov
Discussed on: arch@
Differential Revision: https://reviews.freebsd.org/D23389
This driver has seen no real changes for almost 20 years. It's for
hardware that's 25 years old. It has no reports of active use, nor
has it been seen in the NYCBug dmesg database at all. Schedule
its removal for 13.0.
Reviewed by: rgrimes@ (earlier version)
Relnote: Yes
MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D23403
As Ian Lepore noted, writing ~1 to a register might have a completely
different effect than doing a regular read-modify-write operation.
Follow the TCG_PC_Client_Platform_TPM_Profile_PTP_2.0_r1.03_v22
datasheet instead, and use the actual values mentioned there:
(uint32_t)1 to cancel the command, (uint32_t)0 to clear the field.
MFC after: 3 days
tpmtis_go_ready() read the value of the TPM_STS register, ORed
TPM_STS_CMD_READY with it, and wrote it back. However, the TPM Profile
(PTP) specification states that only one bit in the write request value may
be set to 1, or else the entire write request is ignored.
Fix by just writing TPM_STS_CMD_READY.
Similarly, remove the call which clears the TPM_STS_CMD_READY flag in the
same function. It was being ignored for the same reason.
Submitted by: Darrick Lew <darrick.freebsd AT gmail.com>
Reviewed by: vangyzen, myself
MFC after: if you care about stable, you might want to do so
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23081
operations to a boolean in tpm(4):
sys/dev/tpm/tpm_crb.c:301:32: error: converting the result of '<<' to a boolean; did you mean '(1 << (0)) != 0'? [-Werror,-Wint-in-bool-context]
WR4(sc, TPM_CRB_CTRL_CANCEL, !TPM_CRB_CTRL_CANCEL_CMD);
^
sys/dev/tpm/tpm_crb.c:73:34: note: expanded from macro 'TPM_CRB_CTRL_CANCEL_CMD'
#define TPM_CRB_CTRL_CANCEL_CMD BIT(0)
^
sys/dev/tpm/tpm20.h:60:19: note: expanded from macro 'BIT'
#define BIT(x) (1 << (x))
^
In this case, the intent was to clear the zeroth bit, and leave the rest
unaffected. Therefore, the ~ operator should be used instead.
Noticed by: cem
MFC after: 3 days
Make sure all receive completion callbacks are covered by the network
EPOCH(9), because this is required when calling if_input() and
ether_input() after r357012.
Convert some spaces to tabs while at it.
Sponsored by: Mellanox Technologies
ahd_inb() returns type uint8_t. The shift left by untyped 24 implicitly
promotes the result to type (signed) int. Then the binary OR with uint64_t
values sign-extends the integer. If bit 31 of the read value happened to be
set, the 64-bit result would have all upper 32 bits set to 1 due to OR. This
is clearly not intended.
Reported by: Coverity
CID: 980473 (old one!)
Make completion event path mostly lockless using EPOCH(9).
Implement a mechanism using EPOCH(9) which allows us to make
the callback path for completion events mostly lockless.
Simplify draining callback events using epoch_wait().
While at it make sure all receive completion callbacks are
covered by the network EPOCH(9), because this is required
when calling if_input() and ether_input() after r357012.
Sponsored by: Mellanox Technologies
ThinkPad PrivacyGuard is a built-in toggleable privacy filter that
restricts viewing angles when on. It is an available on some new
ThinkPad models such as the X1 Carbon 7th gen (as an optional HW
upgrade).
The privacy filter can be enabled/disabled via an ACPI call. This commit
adds a sysctl under dev.acpi_ibm that allows for getting and setting the
PrivacyGuard state.
Submitted by: Kamila Součková <kamila@ksp.sk>
Reviewed By: cem, philip
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D23370
Make sure all occurrences of ieee80211_input_xxx() in sys/dev are
covered by a network epoch section. Do not depend on the interrupt
handler nor any taskqueues being in a network epoch section.
This patch should unbreak the PCI WLAN drivers after r357004.
Pointy hat: glebius@
Sponsored by: Mellanox Technologies
While here, clean up magic numbers.
The memset(,0,) (and M_ZERO!) can just be removed; the bit_alloc() API already
zeros the allocation.
No functional change.
Reported by: Coverity
CID: 1378286
All of these uses of sizeof() were on the wrong type in relation to the pointer
passed to SYSCTL_ADD_PROC as arg1. Fortunately, none of the handlers actually
use arg2. So just don't pass a (non-zero) arg2.
Reported by: Coverity
CID: 1007701
Probe Family 17h CPUs for up to 4 (Zen, Zen+) or 8 (Zen2) CCD temperature
sensors. These were discovered by Ondrej Čerman
(https://github.com/ocerman) and collaborators experimentally, and are not
currently documented in any datasheet I have access to.
One more instance of if_input being called outside of
interrupt, by means of msk_handle_events.
Differential Revision: https://reviews.freebsd.org/D23379
These should not be any functional change. While the change in
emul10kx-pcm.c looks like a real bug fix (as opposed to inconsistent
whitespace), the extra statements were not harmful.
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23363
o Make the pci_host_generic_acpi_attach() globally visible.
o Declare a new driver class.
These will be used by a new PCI root complex driver.
Sponsored by: DARPA, AFRL
We observe at least one problem: if a UDP socket is connect(2)-ed, then a
received packet that matches the connection cannot be matched to the
corresponding PCB because of an incorrect flow ID. That was oberved for DNS
requests from the libc resolver. We got this problem because FreeBSD
r343291 enabled code that can set rsstype of received packets to values
other than M_HASHTYPE_OPAQUE_HASH. Earlier that code was under 'ifdef
notyet'.
The essence of this change is to use the system-wide RSS key instead of
some historic hardcoded key when the software RSS is enabled and it is
configured to use Toeplitz algorithm (the default).
In all other cases, the driver reports the opaque hash type for received
packets while still using Toeplitz algorithm with the internal key.
PR: 242890
Reviewed by: pkelsey
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D23147
This bus does not really have a concept of the initiator ID, so use
a guaranteed dummy one that won't conflict with any real target.
This change fixes a problem with virtio_scsi on GCE where disks get
sequential target IDs starting from one. If there are seven or more
disks, then a disk with the target ID of seven would not be discovered
by FreeBSD as that ID was reserved as the initiator ID -- see
scsi_scan_bus().
Discussed with: bryanv
MFC after: 2 weeks
Sponsored by: Panzura
supposedly may call into ether_input() without network epoch.
They all need to be reviewed before 13.0-RELEASE. Some may need
be fixed. The flag is not planned to be used in the kernel for
a long time.
The vnode pager does not want the object lock held. Moving this out allows
further object lock scope reduction in callers. While here add some missing
paging in progress calls and an assert. The object handle is now protected
explicitly with pip.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D23033
On a MINIMAL kernel, mps.ko wouldn't load because it uses the xpt_hold_boot
symbol from CAM, but didn't have a dependency on cam(4).
(CEM: Some context: when linking loaded modules, the kernel dynamic linker
only looks for definitions in explictly marked dependency modules. Also,
the identical mpr(4) driver uses the same CAM function, but already had the
correct MODULE_DEPEND(), so no similar change is needed there.)
Submitted by: Greg V <greg AT unrelenting.technology>
Reviewed by: imp, myself
Differential Revision: https://reviews.freebsd.org/D23272
This is a qspi driver for the Xilinx Zynq-7000 chip.
It could be useful for anyone wanting to boot a system from flash memory
instead of SD cards.
Submitted by: Thomas Skibo (thomasskibo@yahoo.com)
Differential Revision: https://reviews.freebsd.org/D14698
Most of the gpio controller cannot configure or get the configuration
of the pin muxing as it's usually handled in the pinctrl driver.
But they can know what is the pinmuxing driver either because they are
child of it or via the gpio-range property.
Add some new methods to fdt_pinctrl that a pin controller can implement.
Some methods are :
fdt_pinctrl_is_gpio: Use to know if the pin in the gpio mode
fdt_pinctrl_set_flags: Set the flags of the pin (pullup/pulldown etc ...)
fdt_pinctrl_get_flags: Get the flags of the pin (pullup/pulldown etc ...)
The defaults method returns EOPNOTSUPP.
Reviewed by: ian, bcr (manpages)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D23093
Some consumer cannot know the voltage of the regulator without it.
While here, refuse to attach is min_voltage != max_voltage, it
shouldn't happens anyway.
Reviewed by: mmel
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D23003
This change is based on Linux commit 40630f462824ee. csio.resid should
account for transfer_len only for success and SRB_STATUS_DATA_OVERRUN
condition.
I am not sure how exactly this change works, but I have a report from a
user that they see lots of checksum errors when running a pool scrub
concurrently with iozone -l 1 -s 100G. After applying this patch the
problem cannot be reproduced.
Reviewed by: nobody
Sponsored by: CyberSecure
Differential Revision: https://reviews.freebsd.org/D22312
than 128MB, which is the maximum supported by the hardware in RDMA mode.
Obtained from: Chelsio Communications
MFC after: 3 days
Sponsored by: Chelsio Communications
The netmap passthrough subsystem requires proper support in the
hypervisor. In particular, two PCI device ids (from the Red Hat
PCI vendor id 0x1b36) need to be assigned to the two netmap
virtual devices. We then disable these devices until the ids have
not been assigned, in order to avoid conflicts with other
virtual devices emulated by upstream QEMU.
PR: 241774
MFC after: 3 days
Fix a mistake introduced by r343291, which ported the vmx(4)
driver to iflib.
In case of TSO, the hlen field of the (first) tx descriptor must
be initialized to the cumulative length of Ethernet, IP and TCP
headers. The length of the TCP header was missing.
PR: 236999
Reported by: pkelsey
Reviewed by: avg
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22967
Highlights:
- Exit early if we're not disabling unused regulators; there's no need to
take the regulator topology lock and re-evaluate this every iteration, as
it's not going to change.
- Don't emit a notice that we're shutting down a regulator if it's not
enabled, to reduce noise.
- Mention the outcome of the shutdown, to aide debugging and easily let
developer/user collect list of regulators we actually shutdown to
determine problematic one.
Reviewed by: manu
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D22213
The patch could be simplier, using only the second chunk to
vtnet_rxq_eof(), that passes full mbufs to pfil(9). Packet
filter would m_free() them in case of returning PFIL_DROPPED.
However, we pretend to be a hardware driver, so we first try
to pass a memory buffer via PFIL_MEMPTR feature. This is mostly
done for debugging purposes, so that one can experiment in bhyve
with packet filters utilizing same features as a true driver.
We use to handle each message separately in i2c_transfer but that cannot
work with message with NOSTOP as it confuses the controller that we disable
the interrupts and start a new message.
Handle every message in the interrupt handler and fire a new start condition
if the previous message have NOSTOP, the controller understand this as a
repeated start.
This fixes booting on Allwinner A10/A20 platform where before the i2c controller
used to write 0 to the PMIC register that control the regulators as it though that
this was the continuation of the write message.
Tested on: A20 BananaPi, Cubieboard 1 (kevans)
Reported by: kevans
MFC after: 1 month
This avoids getting the XHCI_TRB_ERROR_CONTEXT_STATE error code from the XHCI
controller when the endpoint is disabled or already stopped.
Suggested by: Shichun.Ma@dell.com
MFC after: 1 week
Sponsored by: Mellanox Technologies
The newer versions of RPi FDT flipped the order of the interrupts
specification and added an 'interrupt-names' property for driver aide in
finding the correct interrupt, rather than assuming the positions. Use it if
it's available, or fallback to the old method if there is no interrupt-names
property with a usb value.
This has been tested with both old RPi3B FDT and new RPi3B FDT, USB again
works on the latter.
Reported by: Tom Yan <tom.ty89 gmail com>
MFC after: 3 days
Do not configure any endpoint twice, but instead keep track of which
endpoints are configured on a per device basis, and use an evaluate
endpoint context command instead. When changing the configuration make
sure all endpoints get deconfigured and the configured endpoint mask
is reset.
This fixes an issue where an endpoint might stop working if there is
an error and the endpoint needs to be reconfigured as a part of the
error recovery mechanism in the FreeBSD USB stack.
Tested by: Shichun.Ma@dell.com
MFC after: 1 week
Sponsored by: Mellanox Technologies
Virtualise tcp_always_keepalive, TCP and UDP log_in_vain. All three are
set in the netoptions startup script, which we would love to run for VNETs
as well [1].
While virtualising the log_in_vain sysctls seems pointles at first for as
long as the kernel message buffer is not virtualised, it at least allows
an administrator to debug the base system or an individual jail if needed
without turning the logging on for all jails running on a system.
PR: 243193 [1]
MFC after: 2 weeks
Move the decision to take an early exit from that function after adding
children based on FDT data into the #ifdef FDT block, so that it doesn't
offend coverity's notion of how the code should be written. (What's the
point of compilers optimizing away dead code if static analyzers won't
let you use the feature in conjuction with an #ifdef block?)
Reported by: coverity via vangyzen@
has a non-NULL child bus stored in it, so the "none" value can't be zero
since that's a valid array index. Also, when adding all possible buses
because there is no specific per-bus config, there's no need to reset
sc->maxbus on each loop iteration, it can be set once after the loop.
This method is supposed to write the voltage into uvolt
and return an errno compatible value.
Reviewed by: mmel
Differential Revision: https://reviews.freebsd.org/D23006
Don't wait until the vtnet_debugnet_init() call happens, because at that
point we might already have allocated something from
vtnet_tx_header_zone.
Some systems showed this panic:
vtnet0: link state changed to UP
panic: keg vtnet_tx_hdr initialization after use.
cpuid = 5
time = 1578427700
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004db427f0
vpanic() at vpanic+0x17e/frame 0xfffffe004db42850
panic() at panic+0x43/frame 0xfffffe004db428b0
uma_zone_reserve() at uma_zone_reserve+0xf6/frame 0xfffffe004db428f0
vtnet_debugnet_init() at vtnet_debugnet_init+0x77/frame 0xfffffe004db42930
debugnet_any_ifnet_update() at debugnet_any_ifnet_update+0x42/frame 0xfffffe004db42980
do_link_state_change() at do_link_state_change+0x1b3/frame 0xfffffe004db429d0
taskqueue_run_locked() at taskqueue_run_locked+0x178/frame 0xfffffe004db42a30
taskqueue_run() at taskqueue_run+0x4d/frame 0xfffffe004db42a50
ithread_loop() at ithread_loop+0x1d6/frame 0xfffffe004db42ab0
fork_exit() at fork_exit+0x80/frame 0xfffffe004db42af0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004db42af0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 12 tid 100011 ]
Stopped at kdb_enter+0x37: movq $0,0x1084eb6(%rip)
db>
Reviewed by: cem, markj
Differential Revision: https://reviews.freebsd.org/D23073
SSD capacity in laptops is growing faster then RAM size, so my original
guess seems too low on second thought. Hopefully nobody will build large
array of those crappy SSDs.
MFC after: 2 weeks
X-MFC-with: 356474
This allows cheapest DRAM-less NVMe SSDs to use some of host RAM (about
1MB per 1GB on the devices I have) for its metadata cache, significantly
improving random I/O performance. Device reports minimal and preferable
size of the buffer. The code limits it to 1% of physical RAM by default.
If the buffer can not be allocated or below minimal size, the device will
just have to work without it.
MFC after: 2 weeks
Relnotes: yes
Sponsored by: iXsystems, Inc.
In PR 243056 a user reports some spam from smartpqi(4). In particular,
the driver warns about an unrecognized PQI_CONF_TABLE_SECTION_SOFT_RESET
section (not yet defined in the driver, but handled in Linux), but this
doesn't cause any problems. The Linux driver also does not warn about
unrecognized sections.
Also do not log a warning when a device is added, since this is routine.
Lower severity to DISC, to match pqisrc_remove_device().
PR: 243056
Reviewed by: sbruno
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23023
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427
r356147 removed a vm_page_activate() call, but this is required to
ensure that pages end up in the page queues in the first place.
Restore the pre-r356157 logic. Now, without the page lock, the
vm_page_active() check is racy, but this race is harmless.
Reviewed by: alc, kib
Reported and tested by: pho
Differential Revision: https://reviews.freebsd.org/D23024
Add a privilege check to the ixl_handle_nvmupd_cmd function, ensuring
that only privileged users are allowed to access the NVM update
interface.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reported by: markj@
Reviewed by: markj@, erj@, jeffrey.e.pieper@intel.com
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D22870
This will soon be a dependency for machine/atomic.h on mips with the
introduction of 64-bit atomics; the scope here is pretty narrow, so throw it
here in the header just before systm.h, which includes machine/atomic.h
An i2c bus can be divided into segments which can be selectively connected
and disconnected from the main bus. This is usually done to enable using
multiple slave devices having the same address, by isolating the devices
onto separate bus segments, only one of which is connected to the main bus
at once.
There are several types of i2c bus muxes, which break down into two general
categories...
- Muxes which are themselves i2c slaves. These devices respond to i2c
commands on their upstream bus, and based on those commands, connect
various downstream buses to the upstream. In newbus terms, they are both
a child of an iicbus and the parent of one or more iicbus instances.
- Muxes which are not i2c devices themselves. Such devices are part of the
i2c bus electrically, but in newbus terms their parent is some other
bus. The association with the upstream bus must be established by
separate metadata (such as FDT data).
In both cases, the mux driver has one or more iicbus child instances
representing the downstream buses. The mux driver implements the iicbus_if
interface, as if it were an iichb host bridge/i2c controller driver. It
services the IO requests sent to it by forwarding them to the iicbus
instance representing the upstream bus, after electrically connecting the
upstream bus to the downstream bus that hosts the i2c slave device which
made the IO request.
The net effect is automatic mux switching which is transparent to slaves on
the downstream buses. They just do i2c IO they way they normally do, and the
bus is electrically connected for the duration of the IO and then idled when
it is complete.
The existing iicbus_if callback() method is enhanced so that the parameter
passed to it can be a struct which contains a device_t for the requesting
bus and slave devices. This change is done by adding a flag that indicates
the extra values are present, and making the flags field the first field of
a new args struct. If the flag is set, the iichb or mux driver can recast
the pointer-to-flags into a pointer-to-struct and access the extra
fields. Thus abi compatibility with older drivers is retained (but a mux
cannot exist on the bus with the older iicbus driver in use.)
A new set of core support routines exists in iicbus.c. This code will help
implement mux drivers for any type of mux hardware by supplying all the
boilerplate code that forwards IO requests upstream. It also has code for
parsing metadata and instantiating the child iicbus instances based on it.
Two new hardware mux drivers are added. The ltc430x driver supports the
LTC4305/4306 mux chips which are controlled via i2c commands. The
iic_gpiomux driver supports any mux hardware which is controlled by
manipulating the state of one or more gpio pins. Test Plan
Tested locally using a variety of mux'd bus configurations involving both
ltc4305 and a homebrew gpio-controlled mux. Tested configurations included
cascaded muxes (unlikely in the real world, but useful to prove that 'it all
just works' in terms of the automatic switching and upstream forwarding of
IO requests).
The number is public and has no "entropy," but should be integrated quickly
on VM rewind events to avoid duplicate sequences.
Approved by: csprng(markm)
Differential Revision: https://reviews.freebsd.org/D22946
If somebody else holds that lock, it will likely do the work for us.
If it won't, then we return here later and retry.
Under heavy load it allows to avoid lock congestion between interrupt and
polling threads.
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Combined with earlier nstart/nend removal it allows to remove several locks
from request path of GEOM and few other places. It would be cool if we had
more SMP-friendly statistics, but this helps too.
Sponsored by: iXsystems, Inc.
Allow loadable modules that provide random entropy source(s) to safely
unload. Prior to this change, no driver could ensure that their
random_source structure was not being used by random_harvestq.c for any
period of time after invoking random_source_deregister().
This change converts the source_list LIST to a ConcurrencyKit CK_LIST and
uses an epoch(9) to protect typical read accesses of the list. The existing
HARVEST_LOCK spin mutex is used to safely add and remove list entries.
random_source_deregister() uses epoch_wait() to ensure no concurrent
source_list readers are accessing a random_source before freeing the list
item and returning to the caller.
Callers can safely unload immediately after random_source_deregister()
returns.
Reviewed by: markj
Approved by: csprng(markm)
Discussed with: jhb
Differential Revision: https://reviews.freebsd.org/D22489
With the previous reviews, the page lock is no longer required in order
to perform queue operations on a page. It is also no longer needed in
the page queue scans. This change effectively eliminates remaining uses
of the page lock and also the false sharing caused by multiple pages
sharing a page lock.
Reviewed by: jeff
Tested by: pho
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D22885
Chase the removal of dev from gpioths_dht_readbytes() in r355540.
Reviewed by: ian
Approved by: will (mentor)
Differential Revision: https://reviews.freebsd.org/D22926
srandom(9) is meaningless on SMP systems or any system with, say,
interrupts. One could never rely on random(9) to produce a reproducible
sequence of outputs on the basis of a specific srandom() seed because the
global state was shared by all kernel contexts. As such, removing it is
literally indistinguishable to random(9) consumers (as compared with
retaining it).
Mark random(9) as deprecated and slated for quick removal. This is not to
say we intend to remove all fast, non-cryptographic PRNG(s) in the kernel.
It/they just won't be random(9), as it exists today, in either name or
implementation.
Before random(9) is removed, a replacement will be provided and in-tree
consumers will be converted.
Note that despite the name, the random(9) interface does not bear any
resemblance to random(3). Instead, it is the same crummy 1988 Park-Miller
LCG used in libc rand(3).
Simplify RANDOM_LOADABLE by removing the ability to unload a LOADABLE
random(4) implementation. This allows one-time random module selection
at boot, by loader(8). Swapping modules on the fly doesn't seem
especially useful.
This removes the need to hold a lock over the sleepable module calls
read_random and read_random_uio.
init/deinit have been pulled out of random_algorithm entirely. Algorithms
can run their own sysinits to initialize; deinit is removed entirely, as
algorithms can not be unloaded. Algorithms should initialize at
SI_SUB_RANDOM:SI_ORDER_SECOND. In LOADABLE systems, algorithms install
a pointer to their local random_algorithm context in p_random_alg_context at
that time.
Go ahead and const'ify random_algorithm objects; there is no need to mutate
them at runtime.
LOADABLE kernel NULL checks are removed from random_harvestq by ordering
random_harvestq initialization at SI_SUB_RANDOM:SI_ORDER_THIRD, after
algorithm init. Prior to random_harvestq init, hc_harvest_mask is zero and
no events are forwarded to algorithms; after random_harvestq init, the
relevant pointers will already have been installed.
Remove the bulk of random_infra shim wrappers and instead expose the bare
function pointers in sys/random.h. In LOADABLE systems, read_random(9) et
al are just thin shim macros around invoking the associated function
pointer. We do not provide a registration system but instead expect
LOADABLE modules to register themselves at SI_SUB_RANDOM:SI_ORDER_SECOND.
An example is provided in randomdev.c, as used in the random_fortuna.ko
module.
Approved by: csprng(markm)
Discussed with: gordon
Differential Revision: https://reviews.freebsd.org/D22512
In the event of a MOD_LOAD failure, MOD_UNLOAD will be invoked to unwind
module load. Most of the reversion in MOD_LOAD can just be deferred to
normal MOD_UNLOAD cleanup, rather than duplicating the effort.
A NULL return of kbd_get_switch in the MOD_UNLOAD handler has been
downgraded from a panic to a successful return, as that certainly just means
that kbd_add_driver failed (not possible at the moment) and we have no work
to do.
r356087 made it rather innocuous to double-register built-in keyboard
drivers; we now set a flag to indicate that it's been registered and only
act once on a registration anyways. There is no misleading here, as the
follow-up kbd_delete_driver will actually remove the driver as needed now
that the linker set isn't also consulted after kbdinit.
This leads to the revert of r355806; this reduces duplication in keyboard
registration and driver switch lookup and leaves us with one authoritative
source for currently registered drivers. The reduced duplication later is
nice as we have more procedure involved in keyboard setup.
keyboard_driver->flags is used to more quickly detect bogus adds/removes.
From KPI consumers' perspective, nothing changes- kbd_add_driver of an
already-registered driver will succeed, and a single kbd_delete_driver will
later remove it as expected. In contrast to historical behavior,
kbd_delete_driver on a driver registered via linker set will now actually
de-register the driver so that it may not be used -- e.g. if kbdmux's
MOD_LOAD handler fails somewhere.
Detection for already-registered drivers in kbd_add_driver has improved, as
the previous SLIST_NEXT(driver) != NULL check would not have caught a driver
that's at the tail end.
kbdinit is now called from cninit() rather than via SYSINIT so that keyboard
drivers are available as early as console drivers. This is particularly
important as cnprobe will, in both syscons and vt, attempt to do any early
configuration of keyboard drivers built-in (see: kbd_configure).
Reviewed by: imp (earlier version, pre-cninit change)
Differential Revision: https://reviews.freebsd.org/D22835
Proper locking for atkbdc will likely replace the kbdc_lock mechanism
entirely with a mutex in atkbdc_softc, so that other consumers can also
properly ensure locking protocol is followed (e.g. psm.c:doinitialize).
The first step to doing this neatly is making KBDC less opaque so that
others don't have to jump through weird casting hoops to address the mutex.
No functional change intended; this diff effectively just removes a bunch of
casting. A future change may remove the KBDC typedef entirely and just opt
for using `atkbdc_softc_c *` directly, but this was decidedly a good
intermediate step to make these changes simple to audit.
in clang HEAD.
There was an invisible space in the middle of the tabs, and that apprently
was enough to throw off clang's column counting.
Even if clang is "incorrect" here, it's still a style(9) violation.
A missing check meant that unprivileged users could send passthrough
commands to the device firmware.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
The rest were removed in r355936, which speculated that the cause of this
phenomenon was due to an inability to have an empty linker set. The comment
included with this one shows that this was, in fact, not the reason.
Regardless, syscons no longer seems to have an issue with not having any
keyboard drivers and in-fact ignores the keyboard probe anyways.
X-MFC-With: r355936
Analysis seems to reveal that sc->keyboard >= 0 implies sc->kbd != NULL and
there's no such scenario where sc->kbd is set (and theoretically used to
rebuild sc->keyboard) with the keyboard unavailable.
Drop the index softc. The index is only explicitly needed in few places, in
which case we can just as easily grab it from sc->kbd. There's no need for
keeping sc->kbd and sc->keyboard in sync when it can be readily accomplished
with just the former.
removed from objects including calls to free. Pages must not be xbusy
when freed and not on an object. Strengthen assertions to match these
expectations. In practice very little code had to change busy handling
to meet these rules but we can now make stronger guarantees to busy
holders and avoid conditionally dropping busy in free.
Refine vm_page_remove() and vm_page_replace() semantics now that we have
stronger guarantees about busy state. This removes redundant and
potentially problematic code that has proliferated.
Discussed with: markj
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D22822
The VM generation counter is a 128-bit value exposed by the BIOS via ACPI.
The value changes to another unique identifier whenever a VM is duplicated.
Additionally, ACPI provides notification events when such events occur.
The driver decodes the pointer to the UUID, exports the value to userspace
via OPAQUE sysctl blob, and forwards the ACPI notifications in the form of
an EVENTHANDLER invocation as well as userspace devctl events.
See design paper: https://go.microsoft.com/fwlink/p/?LinkID=260709
The implementation was landed in r344913 and has had some bake time (at
least on my personal systems). There is some discussion of the motivation
for defaulting to this cipher as a PRF in the commit log for r344913.
As documented in that commit, administrators can retain the prior (AES-ICM)
mode of operation by setting the 'kern.random.use_chacha20_cipher' tunable
to 0 in loader.conf(5).
Approved by: csprng(delphij, markm)
Differential Revision: https://reviews.freebsd.org/D22878
This effectively reverts r355935, but is functionally equivalent. We gain no
benefit from storing the index and repeatedly fetching the keyboard with
`kbd_get_keyboard` when we need it. We'll be notified when it's going away
so we can clean up the pointer.
All existing references were trivially converted. Only once instance
actually needed the index.
With absolutely no keyboards attached and no kbdmux in kernel, we descend
down this error path. 0 is a valid keyboard index, so leaving
vd->vd_keyboard at 0 when there's no keyboard found is objectively wrong as
later attachment of a keyboard will fail -- it gets index 0, and vt thinks
it's already using that keyboard.
This is decidedly the corniest of corner cases, but it's easy enough to get
correct that we should do so.
Tested in a kernel without atkbdc, atkbd, psm, kbdmux, ukbd, hyperv then
loading ukbd post-boot and attaching a usb keyboard.
Flip the knob added in r349154 to "enabled." The commit message from that
revision and associated code comment describe the rationale, implementation,
and motivation for the new default in detail. I have dog-fooded this
configuration on my own systems for six months, for what that's worth.
For end-users: the result is just as secure. The benefit is a faster, more
responsive system when processes produce significant demand on random(4).
As mentioned in the earlier commit, the prior behavior may be restored by
setting the kern.random.fortuna.concurrent_read="0" knob in loader.conf(5).
This scales the random generation side of random(4) somewhat, although there
is still a global mutex being shared by all cores and rand_harvestq; the
situation is generally much better than it was before on small CPU systems,
but do not expect miracles on 256-core systems running 256-thread full-rate
random(4) read. Work is ongoing to address both the generation-side (in
more depth) and the harvest-side scaling problems.
Approved by: csprng(delphij, markm)
Tested by: markm
Differential Revision: https://reviews.freebsd.org/D22879
The OpenCores I2C IP core can be found on any bus. Split out the PCI
bus specifics into their own file, only compiled on systems with PCI.
Reviewed by: kp
Sponsored by: Axiado
First reported against ESXi 5.0, PCI passthrough was not working due to
MSI-X issues. However, this issue was fixed via patch releases against
ESXi 5.5 and 6.0 in 2016. Given ESXi 5.5 and earlier have been EOL, this
patch removes the VMware MSI-X blacklist entries in the quirk table.
PR: 203874
Reviewed by: imp, jhb
MFC after: 1 month
Sponsored by: VMware
Differential Revision: https://reviews.freebsd.org/D22819
The phy name may apparently be followed by a number in some systems.
Allow that.
PR: 242654
Reported and tested by: Marcel <marcel@brickporch.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Missing validation meant that it was possible to read 8 bytes beyond
the end of sfp_vpd_dump_buffer.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by: delphij, ram
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22859
While a given ACPI device may have 0-N compatibility IDs, in practice most
seem to have 0 or 1. If one is present, emit it as part of the PNP info
string associated with a device. This could enable MODULE_PNP_INFO-based
automatic kldload for ACPI drivers associated with a given _CID (but without
a good _HID or _UID identifier).
Reviewed by: imp, jhb
Differential Revision: https://reviews.freebsd.org/D22846
SIOCGAIRONET allows userspace to query an(4) for various device
properties and configuration, which appears to potentially include
sensitive information such as WEP keys (an(4) seems to predate WPA).
Also avoid races by copying in the request structure to a temporary
buffer before locking and modifying the device softc.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
It used to be required that a device be a child of gpiobus(4) to manipulate
gpio pins. That requirement didn't work well for FDT-based systems with many
cross-hierarchy users of gpio, so a more modern framework was created that
removed the old hierarchy requirement.
These changes adapt the owc_gpiobus driver to use the newer gpio_pin_*
functions to acquire, release, and manipulate gpio pins. This allows a
single driver to work for both hinted-attachment and fdt-based systems, and
removes the requirement that any one-wire fdt nodes must appear at the root
of the devicetree.
Differential Revision: https://reviews.freebsd.org/D22710
Nothing modifies these things, but const'ify out of an abundance of caution.
If we could const'ify the definition in each keyboard driver, I likely
would- improper mutations here can lead to misbehavior or slightly more
annoying to debug state.
This looks like a bit of debugging code that sliped into the initial
import of the new ATA framework. This changes the behavior to omit a
line of output that appears to have been intended for omission.
Reviewed by: mav
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22845
- PCI_MASK_CONFIG(sc->dev, CBBR_BRIDGECTRL,
- & ~CBBM_BRIDGECTRL_INTR_IREQ_ISA_EN, 2);
was accidentally dropped from r355822 in the refactor. Restore it since 16-bit
cards may fail without it (some bridges autodetect this properly, so my laptop
worked when I tested it).
Noticed by: rpokala@
array with a singleton.
Also, pccbb isa attachment is never going to happen, do disconnect it from the
build (will delete this in future commit). It would need to be updated as well,
but since this code is effectively dead code, remove it from the build instead.
Keyboard drivers are generally registered via linker set. In these cases,
they're also available as kmods which use KPI for registering/unregistering
keyboard drivers outside of the linker set.
For built-in modules, we still fire off MOD_LOAD and maybe even MOD_UNLOAD
if an error occurs, leading to registration via linker set and at MOD_LOAD
time.
This is a minor optimization at best, but it keeps the internal kbd driver
tidy as a future change will merge the linker set driver list into its
internal keyboard_drivers list via SYSINIT and simplify driver lookup by
removing the need to consult the linker set.
This is needed after r355796. Some double-registration of kbd drivers needs
to be sorted out, then this sysinit will simply add these drivers into the
normal list and kill off any other bits in the driver that are aware of the
linker set, for simplicity.
The previous implementation relied on a kbdsw array that mirrored the global
keyboards array. This is fine, but also requires extra locking consideration
when accessing to ensure that it's not being resized as new keyboards are
added.
The extra pointer costs little in a struct that there are relatively few of
on any given system, and simplifies locking requirements ever-so-slightly as
we only need to consider the locking requirements of whichever method is
being invoked.
__FreeBSD_version is bumped as any kbd modules will need rebuilt following
this change.
Most keyboard drivers are using the genkbd implementations as it is;
formally use them for any that aren't set and make
genkbd_get_fkeystr/genkbd_diag private.
A future change will provide default implementations for some of these where
it makes sense and most of them are already using the genkbd
implementation (e.g. get_fkeystr, diag).
These invocations were directly calling enkbd_diag(), rather than
indirection back through kbdd_diag/kbdsw. While they're functionally
equivent, invoking kbdd_diag where feasible (i.e. not in a diag
implementation) makes it easier to visually identify locking needs in these
other drivers.
Don't hold the scheduler lock while doing context switches. Instead we
unlock after selecting the new thread and switch within a spinlock
section leaving interrupts and preemption disabled to prevent local
concurrency. This means that mi_switch() is entered with the thread
locked but returns without. This dramatically simplifies scheduler
locking because we will not hold the schedlock while spinning on
blocked lock in switch.
This change has not been made to 4BSD but in principle it would be
more straightforward.
Discussed with: markj
Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22778
Eliminate recursion from most thread_lock consumers. Return from
sched_add() without the thread_lock held. This eliminates unnecessary
atomics and lock word loads as well as reducing the hold time for
scheduler locks. This will eventually allow for lockless remote adds.
Discussed with: kib
Reviewed by: jhb
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22626
Within command completion processing the callback function may access
DMAed data buffer. Synchronize it before use, not after.
This allows to use NVMe disk on non-DMA coherent arm64 system.
MFC after: 3 weeks
This #ifdef is misleading as there are actually no user-serviceable parts
inside and, as far as I can tell, there is no pollution leading from
userland to this header. Furthermore, it becomes a slight nuisance when
attempting to move things around in this header.
an exclusive object lock.
Previously swap space was freed on a best effort basis when a page that
had valid swap was dirtied, thus invalidating the swap copy. This may be
done inconsistently and requires the object lock which is not always
convenient.
Instead, track when swap space is present. The first dirty is responsible
for deleting space or setting PGA_SWAP_FREE which will trigger background
scans to free the swap space.
Simplify the locking in vm_fault_dirty() now that we can reliably identify
the first dirty.
Discussed with: alc, kib, markj
Differential Revision: https://reviews.freebsd.org/D22654
Parse out the VSEC. If the user invokes a second -c command line option,
do a hex dump of the vendor data.
Reviewed by: imp
MFC after: 3 days
Sponsored by: Intel
Differential Revision: http://reviews.freebsd.org/D22808
CPL_TX_PKT_XT disables the internal parser on the chip and instead
relies on the driver to provide the exact length of the L2 and L3
headers. This allows hw checksumming and TSO to be used with L2 and
L3 encapsulations that the chip doesn't understand directly.
Note that netmap tx still uses the old CPL as it never uses the hw
to generate the checksum on tx.
Reviewed by: jhb@
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D22788
Delay the attachment of children, when requested, until after interrutps are
running. This is often needed to allow children to run transactions on i2c or
spi busses. It's a common enough idiom that it will be useful to have its own
wrapper.
Reviewed by: ian
Differential Revision: https://reviews.freebsd.org/D21465
While there are subtle semantic differences between bool and boolean_t, none of
them matter in these cases. Prefer true/false when dealing with bool
type. Preserve a couple of TRUEs since they are passed into int args into CAM.
Preserve a couple of FALSEs when used for status.done, an int.
Differential Revision: https://reviews.freebsd.org/D20999
detach work and return the error. Especially don't call iicbus_reset()
since the most likely cause of failing to detach children is that one
of them has IO in progress.
This trims the boot time a bit more for AWS and other platforms that have nvme
drives. There's no reason too do this inline. This has been in my tree a while,
but IIRC I talked to Jim Harris about this at one of our face to face meetings.
MFC After: 2 weeks
Instead of first detaching the children(s) and then delete them,
use the device_delete_children function that does all of that.
MFC after: 1 month
Suggested by: ian
The driver used to always add the mmc device as it's child even
it no card was detected. Add a function that will detect if the
card is present or not and that will attach/detach the mmc device.
The function is either call on attach (as we won't have the interrupt
fired) or from two taskqueues. The first taskqueue will directly call
the function when the sdcard was present and is now removed and the other
one will delay a bit the attach when we didn't had a card and now have one.
This is mostly based on comments from the sdhci driver where it describe
a situation when the CD pin is detected before the others pins are connected.
MFC after: 1 month
This method will disable the regulators, clocks and assert the reset of
the module. It will also detach it's children (the mmc device) and release
it's resources.
While here enable the regulators on attach as we need them to power up
the sdcard or emmc.
MFC after: 1 month
The children of the bus need to do IO on the bus to probe for hardware
presence. Doing IO means timing the bus states using sbinuptime(), and
that requires working timecounters, which are not initialized until after
device attachment has completed.
PR: 242526
bus_get/set_resource methods are implemented in child device (iicbus).
As their implementation with bus_generic_rl_get/set calls do not
recurse up the tree, the versions in ig4 are never called.
Suggested by: jhb
This is a 32-bit structure embedded in each vm_page, consisting mostly
of page queue state. The use of a structure makes it easy to store a
snapshot of a page's queue state in a stack variable and use cmpset
loops to update that state without requiring the page lock.
This change merely adds the structure and updates references to atomic
state fields. No functional change intended.
Reviewed by: alc, jeff, kib
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D22650
TX_PKTS2 is more efficient within the firmware and this improves netmap
Tx by a few Mpps in some common scenarios.
MFC after: 1 week
Sponsored by: Chelsio Communications
These were obtained from the Chelsio Unified Wire v3.12.0.1 beta
release.
Note that the firmwares are not uuencoded any more.
MFH: 1 month
Sponsored by: Chelsio Communications
The datasheets for these chips claim the maximum is 921,600, but testing
shows these two higher rates also work (but no rates above 921,600 other
than these two work; these represent dividing the base buad clock by 3 and 2
respectively).
As we do for many other laptops, put the headphone jack and speakers in
the same association by default so that the generic sound device
automatically switches between them.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
of the sensor hardware. Part of the polling process involves signalling
the chip then waiting 20 milliseconds. This was being done with DELAY(),
which is a pretty rude thing to do in a callout. Now a taskqueue_thread
task is scheduled to do the polling, and because sleeping is allowed in
the task context, pause_sbt() replaces DELAY() for the 20ms wait.
This change enables the use of OpenFirmware Console (ofwcons), even when VGA is
available, allowing early kernel messages to be seen, that is important in case
of crashes before VGA console initialization.
This is specially useful in virtualized environments, where the user/developer
doesn't have full control of the virtualization engine (e.g. OpenStack).
The old behavior is preserved by default and, in order to use ofwcons, a few
tunables that have been introduced need to be set:
- hw.ofwfb.disable=1 - disable OFW FrameBuffer device
- machdep.ofw.mtx_spin=1 - change PPC OFW mutex to SPIN type, to match kernel
console's mutex type
- debug.quiesce_ofw=0 - don't call OFW quiesce, needed to keep ofwcons I/O
working
More details can be found at differential revision D20640.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D20640