The Flexible File layout case wasn't handled by LayoutRecall callbacks
because it just checked for File layout and returned NFSERR_NOMATCHLAYOUT
otherwise. This patch adds the Flexible File layout handling.
Found during testing of the pNFS server.
MFC after: 2 weeks
This file has only generated a warning for the last 18 months. Its
existence at this point only serves to confuse software looking for
POSIX.1e capabilities and produce actionless warnings.
breakpoint instruction, however this would lose information that may be
useful for debugging.
These are now handled in a similar way to other exceptions, however it
won't exit out of the exception handler until it is known if we can
handle these exceptions in a useful way.
Sponsored by: DARPA, AFRL
and also on apic in common and i386 files (except for xen it is optional
only on xenhvm), but it was not ifdefed except on apic in common and i386
files.
This is all that is left from an attempt to build a (sub-)minimal kernel
without any devices. The isa "option" is still used without ifdefs in many
standard files even on amd64. ISAPNP is not optional on at least i386.
ATPIC is not optional on i386 (it is used mainly for Xspuriousint). But
pci is now supposed to be optional on x86.
A call to npxsave() in the exception trampolines was not relocated.
This call to a garbage address usually panicked when made, but it is only
made when the thread has used an FPU recently, and this is not the usual
case.
PR: 228755
Reviewed by: kib
These ioctls are not documented and only stubbed in a few drivers: mse(4),
psm(4) and syscons' sysmouse(4). The only exception is MOUSE_GETVARS,
implemented in psm(4).
Given the fact that they were introduced 20 years ago and implementation
has never been completed, remove any related code.
PR: 228718 (exp-run)
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D15726
These macros were added because they were used by the pNFS server last
year. However, they are no longer used by the pNFS server code and
might as well be deleted.
This is a partial reversion of r326735.
NFSDEV_MIRRORSTR was defined for the pNFS server, but has not been used,
so this patch deletes it. It also cleans up the comment and hopefully
makes it more readable.
when parsing the phy type, however this is included in the length returned
by OF_getprop. To fix this, stop ignoring the terminator.
PR: 228828
Reported by: sbruno
Sponsored by: DARPA, AFRL
If a locally generated packet is routed (with route-to/reply-to/dup-to) out of
a different interface it's passed through the firewall again. This meant we
lost the inp pointer and if we required the pointer (e.g. for user ID matching)
we'd deadlock trying to acquire an inp lock we've already got.
Pass the inp pointer along with pf_route()/pf_route6().
PR: 228782
MFC after: 1 week
Since we set the IFF_UP flag on SIOCSIFADDR, it is possible that the
link state information is still not initialized properly afterwards.
This leads to problems with routing, since the interface now has the
IFCAP_LINKSTATE capability and a route is considered working only
when the interface's link state is LINK_STATE_UP (see the RT_LINK_IS_UP()
macro).
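
The check described above can be modelled in a few lines. This is only a
sketch: struct ifnet_model and the IFCAP_LINKSTATE value are stand-ins
invented for the example, and rt_link_is_up() is a function form of what
RT_LINK_IS_UP() is understood to test, not a copy of the kernel header.

```c
#include <stdbool.h>

/* Stand-in definitions; the real ones live in the kernel headers. */
#define	IFCAP_LINKSTATE	0x1	/* placeholder value for this sketch only */

enum link_state { LINK_STATE_UNKNOWN, LINK_STATE_DOWN, LINK_STATE_UP };

struct ifnet_model {
	int		if_capabilities;
	enum link_state	if_link_state;
};

/*
 * An interface that does not report link state is always treated as up.
 * One that does report it must actually be in LINK_STATE_UP.
 */
static bool
rt_link_is_up(const struct ifnet_model *ifp)
{
	if ((ifp->if_capabilities & IFCAP_LINKSTATE) == 0)
		return (true);
	return (ifp->if_link_state == LINK_STATE_UP);
}
```

With the capability set and the state still LINK_STATE_UNKNOWN, the
function returns false, which is the situation that made routes unusable.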
Reported by: Marek Zarychta
MFC after: 3 days
This caused issues with PASTE. Just remove the reschedule since the DELAY()
should be enough for use cases such as pkt-gen which were failing before the
change.
Reported by: Michio Honda
Sponsored by: Limelight Networks
Per-cpu zone allocations are very rarely done compared to regular zones.
The intent is to avoid pessimizing the latter case with per-cpu specific
code.
In particular contrary to the claim in r334824, M_ZERO is sometimes being
used for such zones. But the zeroing method is completely different and
branching on it in the fast path for regular zones is a waste of time.
callbacks to perform additional cleanup actions at the time a socket is
closed.
Michio Honda presented a use for this at BSDCan 2018.
(See https://www.bsdcan.org/2018/schedule/events/965.en.html .)
Submitted by: Michio Honda <micchie at sfc.wide.ad.jp> (previous version)
Reviewed by: lstewart (previous version)
Differential Revision: https://reviews.freebsd.org/D15706
With the introduction of pmap_switch(), the DSB instruction on the
address map switch is not necessarily executed, which is fixed by
changing the unlock store to release. Also remove the comment which
documented pre-pmap_switch() code.
Reviewed by: andrew
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
If we fail noise floor calibration then we may end up with a deaf NIC
which we can't recover without a full chip reset.
Earlier chips seem to get less stuck in this condition versus AR9280/later
and AR9300/later, but whilst here just fix up the AR5212 era chips to also
return NF calibration failures.
This HAL routine would only return failure if the channel was not configured.
This is a no-op until the driver side code for doing resets and the HAL
code for being told about the reset type (and then handling it!) is
implemented.
Tested:
* AR9280, STA mode
* AR2425, STA mode
* AR9380, STA mode
d4a72f2386
During scans (scrubs or resilvers), it sorts the blocks in each transaction
group by block offset; the result can be a significant improvement. (On my
test system just now, which I put some effort to introduce fragmentation into
the pool since I set it up yesterday, a scrub went from 1h2m to 33.5m with the
changes.) I've seen similar ratios on production systems.
Approved by: Alexander Motin
Obtained from: ZFS On Linux
Relnotes: Yes (improved scrub performance, with tunables)
Differential Revision: https://reviews.freebsd.org/D15562
Turns out there is code which ends up passing M_ZERO to counters.
Since counters zero unconditionally on their own, just drop the
flag in that place.
pmc_process_interrupt takes 5 arguments when only 3 are needed.
cpu is always available in curcpu and inuserspace can always be
derived from the passed trapframe.
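
As an illustration of deriving inuserspace from the trapframe, the sketch
below checks the privilege bits of the saved code segment selector, in
the manner of the amd64 TRAPF_USERMODE() macro. struct trapframe_model
and trap_from_user() are hypothetical names used only here, and the
selector constants are redefined locally so the example stands alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for an amd64 trap frame; only the field we need. */
struct trapframe_model {
	uint64_t tf_cs;		/* code segment selector at trap time */
};

#define	SEL_RPL_MASK	3	/* low two selector bits: privilege level */
#define	SEL_UPL		3	/* user privilege level */

/* True if the trap was taken while running in user mode. */
static bool
trap_from_user(const struct trapframe_model *tf)
{
	return ((tf->tf_cs & SEL_RPL_MASK) == SEL_UPL);
}
```

A callee that has the trapframe can compute this itself, so the caller
does not need to pass it as a separate argument.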
While facially a reasonable cleanup, this change was motivated
by the need to work around a compiler bug.
core2_intr(cpu, tf) ->
pmc_process_interrupt(cpu, ring, pmc, tf, inuserspace) ->
pmc_add_sample(cpu, ring, pm, tf, inuserspace)
In the process of optimizing the tail call the tf pointer was getting
clobbered:
(kgdb) up
at /storage/mmacy/devel/freebsd/sys/dev/hwpmc/hwpmc_mod.c:4709
4709 pmc_save_kernel_callchain(ps->ps_pc,
(kgdb) up
1205 error = pmc_process_interrupt(cpu, PMC_HR, pm, tf,
resulting in a crash in pmc_save_kernel_callchain.
Nothing in the tree uses it and pcpu zones have a fundamentally different use
case than the regular zones - they are not supposed to be allocated and freed
all the time.
This reduces pollution in the allocation fast path.
memset fills the target buffer from a byte-sized value passed in as the
second argument.
The fully-sized (8 bytes) register containing it is named %rsi. Lower 4 bytes
can be referred to as %esi and finally the lowest byte is %sil.
The vast majority of callers just zero the target buffer and set it up by
doing xor %esi,%esi which has a side-effect of zeroing the upper parts of
the register as well. Some others do a word-sized move to %esi which has the
same result.
However, there are callers which only fill %sil. This does *not* clear
the rest of the register.
The value of %rsi is multiplied by $0x0101010101010101 to create an 8-byte sized
pattern for 8-byte stores.
Prior to the patch, the func just blindly took %rsi assuming the unwanted bytes
are zeroed out. Since this is not the case for the callers which only play with
%sil (the rest of the register can have absolutely anything), the resulting
pattern can be garbage.
This has potential for funny bugs. One side effect (which was not amusing)
after enabling it instead of bzero was that the kernel was hanging on boot
as a xen domU.
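
The arithmetic can be reproduced in C. This is a sketch of the idea, not
the assembly from the patch: fill_pattern() and fill_pattern_unmasked()
are hypothetical helpers, and masking with 0xff is one way to discard
the unwanted upper bytes before the multiplication.

```c
#include <stdint.h>

/*
 * Build the 8-byte fill pattern from memset's fill argument.
 * Only the low byte is meaningful, so mask it before multiplying.
 */
static uint64_t
fill_pattern(uint64_t c)
{
	return ((c & 0xff) * UINT64_C(0x0101010101010101));
}

/* The same computation when the upper bytes are trusted to be zero. */
static uint64_t
fill_pattern_unmasked(uint64_t c)
{
	return (c * UINT64_C(0x0101010101010101));
}
```

For a register holding 0x1ab, the masked version yields
0xabababababababab while the unmasked one yields 0xacacacacacacacab,
which is the kind of garbage pattern described above.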
Reported by: Trond Endrestøl <Trond.Endrestol fagskolen.gjovik.no>
Pointy hat: me
trashing freed memory and checking that allocated memory is properly
trashed, and also of keeping a bitset of freed items. Trashing/checking
creates a lot of CPU cache poisoning, while keeping debugging bitsets
consistent creates a lot of contention on UMA zone lock(s). The performance
difference between INVARIANTS kernel and normal one is mostly attributed
to UMA debugging, rather than to all KASSERT checks in the kernel.
Add loader tunable vm.debug.divisor that allows either turning off UMA
debugging completely, or turning it on only for a fraction of allocations,
while still running all KASSERTs in the kernel. That allows running INVARIANTS
kernels in production environments without reducing load by orders of
magnitude, while still doing useful extra checks.
The default value is 1, meaning debug every allocation. A value of 0
disables UMA debugging completely. Values above 1 enable debugging only
for every N-th item. It isn't possible to follow the number strictly,
but the amount of debugging is still reduced by a fraction of roughly
(N-1)/N.
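
One way such a divisor could be applied is to sample items by their
index within the zone. The sketch below is an assumption made for
illustration, not the committed code: debug_this_item() and its
parameters are hypothetical, and item_size is assumed to be nonzero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether an item gets the expensive debugging treatment.
 * divisor == 0 disables debugging, 1 debugs every item, and N > 1
 * debugs roughly one item in N.
 */
static bool
debug_this_item(uintptr_t item_addr, size_t item_size, unsigned int divisor)
{
	if (divisor == 0)
		return (false);
	if (divisor == 1)
		return (true);
	/* Key on the item index so alloc and free pick the same items. */
	return ((item_addr / item_size) % divisor == 0);
}
```

Keying the decision on the address rather than on a running counter
means the trash-on-free and check-on-alloc paths agree about which items
are tracked, at the cost of only approximating the requested ratio.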
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D15199
Changed excise_initrd_region to support both 32- and 64-bit
values for linux,initrd-start and linux,initrd-end.
This fixes the boot problem on some machines after rS334485.
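
Accepting both widths amounts to decoding a big-endian property of
either one or two 32-bit cells. The following is a sketch under that
assumption; decode_cells() is a hypothetical helper, not the function
changed by this commit.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a big-endian device tree property that is either 4 or 8 bytes
 * long. Returns 0 on success, or -1 if the length is anything else.
 */
static int
decode_cells(const uint8_t *buf, size_t len, uint64_t *out)
{
	uint64_t v = 0;
	size_t i;

	if (len != 4 && len != 8)
		return (-1);
	/* Accumulate most significant byte first. */
	for (i = 0; i < len; i++)
		v = (v << 8) | buf[i];
	*out = v;
	return (0);
}
```

The caller passes the property length it got back from the firmware
interface and rejects the property if the decoder returns -1.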
Submitted by: Luis Pires <lffpires@ruabrasil.org>
Reviewed by: jhibbits, leitao
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15667
myriad ways that the various compilers treat this. The
only safe prefetch appears to be for AMD. The other
compilers either are not volatile or are not const :(
Reported by: Michael Tuexen
When we're at our vnode limit, getnewvnode will call into the vnode LRU
cache to free up vnodes. If the vnode we try to recycle is a ZFS vnode we
end up, eventually, in zfs_rmnode. If the ZFS vnode we're recycling
represents something with extended attributes, zfs_rmnode will call
zfs_zget which will attempt to allocate another vnode. If the next vnode we
try to recycle is also a ZFS vnode representing something with extended
attributes we can recurse further. This ends up being unbounded and can end
up overflowing the stack.
In order to avoid this, restructure zfs_rmnode to simply add the extended
attribute directory's object ID to the unlinked set, thus not requiring the
allocation of a vnode. We then schedule a task that calls zfs_unlinked_drain
which will do the work of properly marking the vnodes for unlinking.
zfs_unlinked_drain is also called on mount so these will be cleaned up
there.
Reviewed by: avg, mav
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D15342
Rack includes the following features:
- A different SACK processing scheme (the old sack structures are not used).
- RACK (Recent acknowledgment), where counting dup-acks is no longer done;
instead time is used to know when to retransmit. (see the I-D)
- TLP (Tail Loss Probe), where we will probe for tail losses to try
to avoid taking a retransmit time-out. (see the I-D)
- Burst mitigation using TCPHPTS
- PRR (Proportional Rate Reduction); see the RFC.
Once built into your kernel, you can select this stack either by using
the socket option with the stack name "rack", or by setting
the global sysctl so the default is rack.
Note that any connection that does not support SACK will be kicked
back to the "default" base FreeBSD stack (currently known as "default").
To build this into your kernel you will need to enable the following in
your kernel configuration:
makeoptions WITH_EXTRA_TCP_STACKS=1
options TCPHPTS
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D15525
pagetables.
physmap[] can be inconsistent with the physical memory limit due to
a buggy BIOS, or to the hw.physmem tunable. Since bootstrap pagetables
are initialized by accesses through the DMAP, we must ensure that the DMAP
really covers the selected pages. This is only relevant when the machine
has less than 4G RAM and a buggy BIOS, which is the combination on the Acer
Chromebook 720.
The call to mp_bootaddress() is moved later to have Maxmem initialized.
An alternative could be to always cover 4G for DMAP, but this change
seems to be simpler.
Reported and tested by: grembo
Reviewed by: royger
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D15675
Fix the behavior of ofw_fdt_getprop() and ofw_fdt_getprop() functions to match
the documentation, as the non-FDT code does.
Submitted by: Luis Pires <lffpires@ruabrasil.org>
Reviewed by: manu, jhibbits
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D15680