It should be sufficient to hold the lock just for removing the session
from the session list. Everything else should be covered by the session
specific lock.
On top of that, at present we can get a deadlock caused by waiting on
the CAM SIM reference count while holding the global lock. A specific
scenario involving ZFS is this:
- concurrent termination of two sessions, S1 and S2
- session S1 completed all I/Os and sleeps in CAM waiting for device
close by ZFS;
- session S2 is also dead now, but can not forcefully complete
outstanding requests by calling iscsi_session_cleanup() from
iscsi_maintenance_thread_terminate(), since it can't get the same
global sc_lock;
- as soon as there are unfinished requests, ZFS can not do
spa_config_enter() as writer, and so can not close the device for
session S1;
- deadlock.
Reported by: Ben RUBSON <ben.rubson@gmail.com>
Tested by: Ben RUBSON <ben.rubson@gmail.com>
Reviewed by: mav, trasz
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D12652
This is a NFC patch to move around the Search:memory implementation so
that it doesn't exceed the standard column width and doesn't take so
much vertical space in gdb_trap.
Submitted by: Daniel O'Connor <darius@dons.net.au>
Reviewed by: cem, jhb
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12684
When the NFSv4.1 pNFS client is using a Flexible File Layout specifying
mirrored Data Servers, it must do the writes and commits to all mirrors.
This patch modifies the client to use a taskqueue to perform these writes
and commits concurrently.
The number of threads can't be changed for taskqueue(9), so it is set
to 4 * mp_ncpus by default, but this can be overridden by setting the
sysctl vfs.nfs.pnfsiothreads.
Differential Revision: https://reviews.freebsd.org/D12632
- Fix it by replacing m_cat() with m_prev->m_next = m_new
(m_cat() will try to append data - as a result, there will be no
fragmentation).
- Move some constants out of the loop.
Was previously tested with D4077.
Differential Revision: https://reviews.freebsd.org/D4090
Allocate smallest unit number from pool via ifc_alloc_unit_next()
and exact unit number (if available) via ifc_alloc_unit_specific().
While here, address possible deadlock (mentioned in PR).
PR: 217401
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D12551
The stop drops process lock, which allows the signal mask to be
changed and our selected signal might become blocked, i.e. should be
returned to the process queue instead of delivery.
Also, for the existing check of the process no longer having an
attached debugger, we should not loose the signal, but requeue it.
Reported and tested by: bdrewery
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Split two conditions into separate asserts. Print additional details,
like the signal number and action value.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
In r324542 I neglected to reset the first and last fields of struct
unrhdr. This causes a tmpfs to fail the unr(9) consistency checks with
DIAGNOSTIC on. Fix this by resetting the fields by calling init_unrhdr.
While here, change a loop to use TAILQ_FOREACH_SAFE since it is more
readable and equally fast.
Reported by: David Wolfskill <david@catwhisker.org>
Approved by: rstone (mentor)
Sponsored by: Dell EMC Isilon
For processing, reclaim_pv_chunk() removes the pv_chunk from the lru
list, which makes pc_lru linkage invalid. Then the pmap lock is
released, which allows for other thread to free the last pv entry
allocated from the chunk and call free_pv_chunk(), which tries to
modify the invalid linkage.
Similarly, the chunk is inserted into the private tailq new_tail
temporary. Again, free_pv_chunk() might be run and corrupt the
linkage for the new_tail after the pmap lock is dropped.
This is a consequence of r299788 elimination of pvh_global_lock, which
allowed for reclaim to run in parallel with other pmap calls which
free pv chunks.
As a fix, do not remove the chunk from pc_lru queue, use a marker to
remember the position in the queue iteration. We can safely operate
on the chunks after the chunk's pmap is locked, we fetched the chunk
after the marker, and we checked that chunk pmap is same as we have
locked, because chunk removal from pc_lru requires both pv_chunk_mutex
and the pmap mutex owned.
Note that the fix lost an optimization which was present in the
previous algorithm. Namely, new_tail requeueing rotated the pv chunks
list so that reclaim didn't scan the same pv chunks that couldn't be
freed (because they contained a wired and/or superpage mapping) on
every invocation. An additional change is planned which would improve
this.
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Makefile.inc has a specific meaning in the tree, and
common/Makefile.inc doesn't quite fit into that. Rename it to
loader.mk and it will be a place to collect common things to all
/boot/loader programs there.
Sponsored by: Netflix
Refactor boot1 to use the same I/O code as /boot/loader uses. Refactor
to use the common efi_main.c.
Submitted by: Eric McCorkle
Differential Revision: https://reviews.freebsd.org/D10447
allocations (for req and ccb, which ultimately contain the
nvme_cmd). As such, we can micro-optimize these routines. Add a
comment to this effect, and bzero the ccb used to make the requests
for the nda dump rotuine so it more closely matches a ccb allocated
with xpt_get_ccb().
Sponsored by: Netflix
The client IP address was not being reported for some NFSv4 mounts by
nfsdumpstate. Upon investigation, two problems were found for mounts
using IPv4. One was that the code (originally written and tested on i386)
assumed that a "u_long" was a "uint32_t" and would exactly store an
IPv4 host address. Not correct for 64bit arches.
Also, for NFSv4.1 mounts, the field was not being filled in. This was
basically correct, because NFSv4.1 does not use a callback address.
However, it meant that nfsdumpstate could not report the client IP addr.
This patch should fix both of these issues.
For IPv6, the address will still not be reported. The original NFSv4 RFC
only specified IPv4 callback addresses. I think this has changed and, if so,
a future commit to fix reporting of IPv6 addresses will be needed.
Reported by: manu
PR: 223036
MFC after: 2 weeks
Ensure that the current behaviour is consistent: stop processing
of the chunk, but finish the processing of the previous chunks.
This behaviour might be changed in a later commit to ABORT the
assoication due to a protocol violation, but changing this
is a separate issue.
MFC after: 3 days
Create a config file for PCI devices that exposes their configuration
space. Only fields needed by libdrm are filled in (vendor, device,
revision, subvendor and subdevice).
Link /sys/class/drm/card%d/device to the PCI device directory.
/compat/linux/path before /path. Stop following symbolic links when
looking up /compat/linux/path so dead symbolic links aren't ignored.
This allows syscalls like readlink(2) and lstat(2) to work on such links.
And open(2) will return an error now instead of trying /path.
more general and doesn't try to access registers that may be undefined
when the card is in MSIX mode.
This change, along with r324630, r324631, r324632, makes nda crash
dumps work again. Previously, they only worked on CPU 0 when the stack
garbage was just so.
Sponsored by: Netflix
Suggested by: scottl@ (who provided earlier version of the patch)
acidentally sending bogus values in these fields, which some drives
may reject with an error or worse (undefined behavior).
This is especially needed for the ndadump routine which allocates the
cmd from stack garbage....
Sponsored by: Netflix
There should be no functional changes.
Reviewed by: hselasky
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12670
Follow up from r324201 to fix compilation with gcc, which complains
about non-ICE case expressions.
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D12675
The variable is modified with the highly contended page free queue lock.
It unnecessarily shares a cacheline with purely read-only fields and is
re-read after the lock is dropped in the page allocation code making the
hold time longer.
Pad the variable just like the others and store the value as found with
the lock held instead of re-reading.
Provides a modest 1%-ish speed up in concurrent page faults.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D12665
The value is spread all over the kernel and zeroing a register is
cheaper/shorter than setting it up to an arbitrary value.
Reduces amd64 GENERIC-NODEBUG .text size by 0.4%.
MFC after: 1 week
After some in-progress work is committed, this would otherwise be the only
instance of #if(n)def NO_SWAPPING in the tree. Moreover, the requisite
opt_vm.h include was missing, so the PHOLD/PRELE calls were always being
compiled in anyway.
MFC after: 1 week
"optimization". First, sendfile(..., SF_NOCACHE) frees pages without
checking whether those pages are mapped. This can leave the system
with mappings to free or repurposed pages. Second, a page can be
busied between the time of the current busy test and acquiring the
object lock. Essentially, the test performed before the object lock
is acquired can only be regarded as an optimization to short-circuit
further work on the page. It cannot, however, be relied upon to prove
that it is safe to free the page. Third, when sendfile(..., SF_NOCACHE)
was originally implemented, vm_page_deactivate_noreuse() did not yet
exist. Use vm_page_deactivate_noreuse() instead of vm_page_deactivate(),
because it comes closer to freeing the page.
In collaboration with: glebius
Discussed with: gallatin, kib, markj
X-MFC after: r324448
then td->td_sel is NULL and this will result in a segfault inside selrecord().
This happens when only using kqueue() to poll for read and write events.
If select() and kqueue() is mixed there won't be a segfault.
Reported by: Johannes Lundberg
MFC after: 1 week
Sponsored by: Mellanox Technologies