Commit graph

120149 commits

Author SHA1 Message Date
Navdeep Parhar
218c5b2115 cxgbe(4): Add a knob to enable/disable PCIe relaxed ordering. Disable it by
default when running on Intel CPUs.

This is a crude fix for the performance issues alluded to in these Linux commits:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=87e09cdec4dae08acdb4aa49beb793c19d73e73e
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a99b646afa8a02571ea298bedca6592d818229cd

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-01-03 19:24:57 +00:00
Ed Maste
22760c6d0a ath: revert accidental change committed with r327526
It will be recommitted with the correct commit message.
2018-01-03 19:24:21 +00:00
Ed Maste
47b5c71929 embed_mfs: correctly test grep return value
Reported by:	br
MFC with:	r326992
Sponsored by:	The FreeBSD Foundation
2018-01-03 19:22:10 +00:00
Mark Johnston
8a7a42bb17 Add missing newlines to a couple of error messages.
Keep error messages on a single line so that they're easier to grep for.

Reported by:	pho
MFC after:	1 week
2018-01-03 18:19:47 +00:00
Konstantin Belousov
af317aa4e5 Use the new SDM-approved way to serialize x2APIC MSR writes.
SDM editions 64 and below stated that it is enough to use MFENCe or
LFENCE to serialize x2APIC register writes.  New edition 65 requires
either full serialization instruction or MFENCE;LFENCE sequence.  Use
the later, FreeBSD needs serialization to ensure that writes done
before IPI request are visible to the target IPI CPU.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-03 11:23:47 +00:00
Poul-Henning Kamp
1517c9d893 Eliminate a paranthesis which is both unneeded and causing trouble. 2018-01-03 09:33:59 +00:00
Mike Karels
d626b50b9d make SW_WATCHDOG dynamic
Enable the hardclock-based watchdog previously conditional on the
SW_WATCHDOG option whenever hardware watchdogs are not found, and
watchdogd attempts to enable the watchdog. The SW_WATCHDOG option
still causes the sofware watchdog to be enabled even if there is a
hardware watchdog. This does not change the other software-based
watchdog enabled by the --softtimeout option to watchdogd.

Note that the code to reprime the watchdog during kernel core dumps is
no longer conditional on SW_WATCHDOG. I think this was previously a bug.

Reviewed by:	imp alfred bjk
MFC after:	1 week
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13713
2018-01-03 00:56:30 +00:00
Oleksandr Tymoshenko
9d1208bbd2 nctgpio: add new device id for the GPIO chip in PCEngines APU3
PR:		224512
Submitted by:	mike@sentex.net
MFC after:	2 weeks
2018-01-02 20:58:05 +00:00
Ed Maste
c24972459f ath: fix possible memory disclosures in ioctl handlers
Apply the fix from r327499 to additional ioctl handlers.

Reported by:	Ilja van Sprundel <ivansprundel@ioactive.com>
MFC after:	1 week
MFC with:	r327499
Sponsored by:	The FreeBSD Foundation
2018-01-02 19:34:23 +00:00
Ed Maste
85f385b9aa ath: fix memory disclosure from ath_btcoex_ioctl
The ath_btcoex_ioctl handler allocated a buffer without M_ZERO and
returned it to userland without writing to it.

The device has permissions only for root so this is not urgent, and the
fix can be MFCd and considered for a future EN.

Reported by:	Ilja van Sprundel <ivansprundel@ioactive.com>
Submitted by:	Domagoj Stolfa <domagoj.stolfa@gmail.com>
Reviewed by:	adrian
MFC after:	1 week
2018-01-02 19:29:30 +00:00
Ed Maste
5d8501f487 hpt{nr,rr}: plug info leak in hpt_ioctl
The hpt{nr,rr} ioctl handler allocates a buffer without M_ZERO and calls
hpt_do_ioctl(), which might not overwrite the entire buffer.

Also zero bytesReturned in case it is not written by hpt_do_ioctl().

The hpt27{nr,rr} device has permissions only for root so this is not urgent,
and the fix can be MFCd and considered for a future EN.

The same issue was reported in the hpt27xx driver by Ilja Van Sprundel.

Reviewed by:	jhb, kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2018-01-02 18:31:32 +00:00
Ed Maste
51cbc81510 hpt27xx: plug info leak in hpt_ioctl
The hpt27xx ioctl handler allocates a buffer without M_ZERO and calls
hpt_do_ioctl(), which might not overwrite the entire buffer.

Also zero bytesReturned in case it is not written by hpt_do_ioctl().

The hpt27xx device has permissions only for root so this is not urgent,
and the fix can be MFCd and considered for a future EN.

Reported by:	Ilja van Sprundel <ivansprundel@ioactive.com>
Submitted by:	Domagoj Stolfa <domagoj.stolfa@gmail.com> (M_ZERO)
Reviewed by:	jhb, kib
MFC after:	3 days
Security:	info leak in root-only ioctl
Sponsored by:	The FreeBSD Foundation
2018-01-02 18:29:44 +00:00
Mark Johnston
1787c3feb4 Fix some I/O ordering issues in gmirror.
- BIO_FLUSH requests were dispatched to the disks directly from
  g_mirror_start() rather than going through the mirror's I/O request
  queue, so they could have been reordered with preceding writes.
  Address this by processing such requests from the queue, avoiding
  direct dispatch.
- Handling for collisions with synchronization requests was too
  fine-grained and could cause reordering of writes. In particular,
  BIO_ORDERED was not being honoured. Address this by effectively
  freezing the request queue any time a collision with a synchronization
  request occurs. The queue is unfrozen once the collision with the
  first frozen request is over.
- The above-mentioned collision handling allowed reads to jump ahead
  of writes to the same offset. Address this by freezing all request
  types when a collision occurs, not just BIO_WRITEs and BIO_DELETEs.

Also add some more fail points for use in testing error handling.

Reviewed by:	imp
MFC after:	3 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13559
2018-01-02 18:11:54 +00:00
Jeff Roberson
ad5b0f5b51 Fix arc after r326347 broke various memory limit queries. Use UMA features
rather than kmem arena size to determine available memory.

Initialize the UMA limit to LONG_MAX to avoid spurious wakeups on boot before
the real limit is set.

PR:		224330 (partial), 224080
Reviewed by:	markj, avg
Sponsored by:	Netflix / Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13494
2018-01-02 04:35:56 +00:00
Nathan Whitehorn
67530f82dd Fix reversed endianness that crept in at some point. Blue is now blue
instead of pink.

MFC after:	3 days
2018-01-02 03:59:46 +00:00
Adrian Chadd
9fbe631a1a [net80211] convert all of the WME use over to a temporary copy of WME info.
This removes the direct WME info access in the ieee80211com struct and instead
provides a method of fetching the data.  Right now it's a no-op but eventually
it'll turn into a per-VAP method for drivers that support it (eg iwn, iwm,
upcoming ath10k work) as things like p2p support require this kind of behaviour.

Tested:

* ath(4), STA and AP mode

TODO:

* yes, this is slightly stack size-y, but it is an important first step
  to get drivers migrated over to a sensible WME API.  A lot of per-phy things
  need to be converted to per-VAP before P2P, 11ac firmware, etc stuff shows up.
2018-01-02 00:07:28 +00:00
Antoine Brodin
1b25176cbc sysctl_kern_proc_args: do not take the fast path if p_args is NULL
In this case it falls back to reading ps_strings
2018-01-01 21:25:01 +00:00
Konstantin Belousov
84874cc151 Avoid re-check of usermode condition.
It does not change anything in the behavior of trap_pfault(), while
eliminating obfuscation of jumping to the code which checks for the
condition reversed of the goto cause.  Also avoid force initialize the
rv variable, since it is now only accessed after storing vm_fault()
return value.

Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13725
2018-01-01 20:47:03 +00:00
Konstantin Belousov
da457ed9d6 Add CR4.SMAP control bit.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2018-01-01 19:34:19 +00:00
Konstantin Belousov
9997c0481c Do not let vm_daemon run unbounded.
On a load where single anonymous object consumes almost all memory on
the large system, swapout code executes the iteration over the
corresponding object page queue for long time, owning the map and
object locks.  This blocks pagedaemon which tries to lock the object,
and blocks other threads in the process in vm_fault() waiting for the
map lock.

Handle the issue by terminating the deactivation loop if we executed
too long and by yielding at the top level in vm_daemon.

Reported by:	peterj, pho
Reviewed by:	alc
Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13671
2018-01-01 19:27:33 +00:00
Warner Losh
7167e16b49 Remove sys/mips/rmi. It's been unmaintained since 2011. This hardware
is now unobtanium. It's only had API changes in the last 7 years, and
is responsible for a very large number of them. In addition, there's a
lot of code that reimplements base FreeBSD functionality, diminishing
the chances it still works. Without hardware to teset it on, or
prospects of obtaining such hardware and without vendor support, it's
time to move on.

Suggested by: kan@ in mips@ retirement discussion
2018-01-01 05:13:03 +00:00
Warner Losh
c3dbef68d5 Remove support for IDT. Only the RouterBoard RB533 used this chip, and
it's at least 5 years out of production. I couldn't find a used one on
ebay and other secondary markets just now, nor when I tried 4 years
ago. It dates from the initial project/mips2 merge 8 years ago, and
hasn't been updated since.

Discussed on: mips@ (with some dissent)
2018-01-01 04:10:36 +00:00
Warner Losh
ada611f6ad Retire old ADM 5120 port. It never grew much beyond the original port.
It came into the tree with the project/mips merge 8 years ago. At the
time, it was hard to find a board with enough RAM to run. Now FreeBSD
requires at least 2x the RAM it did then. No changes have happened to
this port apart from API churn and license tagging since then. It ran
OK at the time it was committed, but no sightings in the wild have
happened since shortly after it was committed.

https://www.linux-mips.org/wiki/Adm5120_devices lists a bunch of
boards that were available 5 years ago (but are no longer
available). The beefiest one had only 64MB of RAM which is too
small. The Mirktik RB1xx never had more than 32MB.

Also remove confusing QEMU config file that never ever worked in QEMU
for mips. MALTA is used for that. Another of my past mistakes, false
starts that never amounted to anything.

Discussed on: mips@ (with some dissent)
2018-01-01 04:10:31 +00:00
Warner Losh
5daa7f27c5 Remove sys/mips/alchemy. It was still-born when I committed it and it
never got better. It never worked on real hardware and is still mostly
stubs after 8 years when I added it. It has had no real update in that
time apart from API churn. It was added just so it didn't get lost in
the project/mips merge, but maybe it should have been lost as nothing
has come of it. It is time to give up the ghost on this one.

Approved by: me, shooting my own dog
Discussed on: mips@
2018-01-01 04:10:25 +00:00
Warner Losh
039f26fab0 Remove sys/mips/rt305x. It's been replaced by sys/mips/mediatek.
OK'd by: Stanislav Galabov (who did both)
Discussed on: mips@
2018-01-01 04:06:24 +00:00
Oleksandr Tymoshenko
de87887b78 Fix GCC build broken by r32744
Indicate in function declaration that vt_palette_init does not take any arguments
2017-12-31 23:40:06 +00:00
Ian Lepore
b6f4732cb3 Add a validbcd() routine that uses the bcd2bin_data[] array and returns a
bool indicating whether the input value represents a valid BCD byte.

The existing bcd2bin() routine will KASSERT if asked to convert a bad value,
but sometimes the kernel has to handle BCD data from untrusted sources, so
this will provide a mechanism to validate data before attempting conversion.

This would be have easier/cleaner if the bcd2bin_data[] array contained an
out-of-range value (such as 0xff) in the infill locations that aren't valid,
but it's a global symbol that might be referenced by out-of-tree code
relying on the current scheme, so I'm leaving that alone.
2017-12-31 22:43:24 +00:00
Kyle Evans
07bd38576d aw_sid: Add support for a64
Newer Allwinner SoCs have nearly identical SID controllers with efuse space
starting at 0x200 into their register space and thermal data available at
0x234, making all of these fairly trivial additions.

The h3 will be added at a later time after some testing, due to a silicon
bug that causes the rootkey (at least) to be read incorrectly unless first
read via the control register.
2017-12-31 22:35:32 +00:00
Alan Cox
7000c58871 The variable "minslptime" is pointless and always has been, ever since its
introduction in r83366.  (At that time, this code appeared in vm/vm_glue.c,
because vm/vm_swapout.c did not exist.)  When the FOREACH_THREAD loop
completes, we know that the sleep time for every thread is above whichever
threshold is being applied.

Reviewed by:	kib
X-MFC with:	r327354
2017-12-31 21:36:42 +00:00
Oleksandr Tymoshenko
c127b9f9d1 Unbreak build broken by r327444
During review iterations function signature has changed in definition
but not in actual call. Fix call to match the definition.

Reported by:	Herbert J. Skuhra
Pointyhat to: gonzo
MFC after:	2 weeks
2017-12-31 21:29:20 +00:00
Colin Percival
0337438d5d Wrap includes in sys/tslog.h with #ifdef TSLOG.
This is necessary because some non-kernel code #defines _KERNEL and then
includes kernel headers; as a result, it was getting conflicting versions
of curthread and curproc.  Non-kernel code should probably refrain from
defining _KERNEL, but for now hiding these indirect inclusions fixes the
build.

Reported by:	Michael Butler, Herbert J. Skuhra
2017-12-31 21:00:21 +00:00
Ian Lepore
a9bcc66178 Chase r327432... sparc64 clock.c now needs to include sys/tslog.h
Discussed with:	 cperciva
2017-12-31 20:30:51 +00:00
Nathan Whitehorn
3972f4c1d4 Remove PIR from PCPU data. It has an implementation-defined meaning that
is of limited utility outside of platform-specific code and can vary
at runtime when running as a hypervisor guest, so does not even have the
virtue of being a static identifier.

Reviewed by:	jhibbits
2017-12-31 20:23:39 +00:00
Oleksandr Tymoshenko
29f61a2f42 vt(4): add support for configurable console palette
Introduce new set of loader tunables kern.vt.color.N.rgb, where N is a
number from 0 to 15. The value is either comma-separated list decimal
numbers ranging from 0 to 255 that represent values of red, green, and
blue components respectively (i.e. "128,128,128") or 6-digit hex triplet
commonly used to represent colors in HTML or xterm settings (i.e. #808080)

Each tunable overrides one of the 16 hardcoded palette codes and can be set
in loader.conf(5)

Reviewed by:	bcr(docs), jilles, manu, ray
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D13645
2017-12-31 20:21:05 +00:00
Nathan Whitehorn
4e05ac247c Fix 32-bit build. 2017-12-31 20:20:55 +00:00
Nathan Whitehorn
f81dfc7f6b Make newer binutils happy by using a bl-type branch instead of b, which
displeases it for some reason. LR is not relevant in this code, so just
do what it wants.
2017-12-31 20:10:08 +00:00
Nathan Whitehorn
ec75f647cc Provide relative, as well as absolute, addresses in trap panic panics. This
makes it easier to cross-correlate them with instruction listings without
worrying about where the kernel was relocated to.

MFC after:	1 week
2017-12-31 20:08:16 +00:00
Konstantin Belousov
1865d6b851 Remove MP SAFE marks and stray register name in comments.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-12-31 17:07:59 +00:00
Bjoern A. Zeeb
c300109360 Happy New Year 2018 my friends! 2017-12-31 16:48:04 +00:00
Kristof Provost
9d671fee3a pf: Allow the module to be unloaded
pf can now be safely unloaded. Most of this code is exercised on vnet
jail shutdown.

Don't block unloading.
2017-12-31 16:18:13 +00:00
Kristof Provost
5d0020d6d7 pf: Clean all fragments on shutdown
When pf is unloaded, or a vnet jail using pf is stopped we need to
ensure we clean up all fragments, not just the expired ones.
2017-12-31 10:01:31 +00:00
Colin Percival
d5d7606c0c Use the TSLOG framework to record entry/exit timestamps for DELAY and
_vprintf; these functions are called in many places and can contribute
meaningfully to the total time spent booting.
2017-12-31 09:24:41 +00:00
Colin Percival
49a4e3b4b4 Instrument thread creations for the the benefit of the TSLOG framework.
This assists in tracking time spent while the boot is being "held" waiting
for something to happen.
2017-12-31 09:24:11 +00:00
Colin Percival
8b8a7c43a9 Instrument "boot holds" for the benefit of the TSLOG framework. These
are places where the "main thread" of the booting kernel (either the
thread which later becomes swapper or the thread which later becomes
init) has to stop and wait for action to take place in another thread
before continuing.

There are currently three such holds:
1. The intr_config_hooks SYSINIT waits for hooks registered via the
config_intrhook_establish function; this allows (typically) devices
which need interrupts enabled to complete their initialization to do
so before root is mounted.

2. The g_waitidle function waits for the GEOM event queue to be empty;
this ensures that all of the disks which have been attached have been
tasted before we attempt to mount root.

3. The vfs_mountroot_wait function (in addition to calling g_waitidle)
waits for holds registered via root_mount_hold; among other things, this
is used by the USB subsystem to ensure that we don't fail to mount root
if it's located on a USB disk which takes a while to probe.
2017-12-31 09:23:52 +00:00
Colin Percival
82614df42c Use the TSLOG framework to record entry/exit timestamps for VFS_MOUNT calls. 2017-12-31 09:23:35 +00:00
Colin Percival
a21a2da599 Teach makeobjops.awk to accept PROLOG and EPILOG blocks before
METHOD and STATICMETHOD declarations; that code will be inserted
into the dispatch function before and after the method call.

Use this functionality and the TSLOG framework to record DEVICE_ATTACH
and DEVICE_PROBE entry/exit timestamps.
2017-12-31 09:23:19 +00:00
Colin Percival
95e678f96e Use the TSLOG framework to record SYSINIT entry/exit timestamps. 2017-12-31 09:23:02 +00:00
Colin Percival
6032e08810 Use the TSLOG framework to record entry/exit timestamps for machine
independent functions with important roles in the early boot process:
mi_startup (with the "exit" recorded when it becomes swapper),
start_init (with the "exit" recorded when the thread is about to
"return" into the newly created init process), vfs_mountroot, and
vfs_mountroot_wait.
2017-12-31 09:22:31 +00:00
Colin Percival
31a55efdc5 Use the TSLOG framework to record entry/exit timestamps for hammer_time.
The entry must be logged "manually" using TSRAW rather than TSENTER
since PCPU data structures have not yet been initialized and thus
curthread cannot be accessed; &thread0 is what will become curthread
later in hammer_time.

Other MD initialization code should be similarly instrumented in order
to gain visibility into the time spent before entering mi_startup; this
will require some care and testing from people with access to such
hardware.
2017-12-31 09:22:07 +00:00
Colin Percival
ae3d6bfa20 Connect kern_tslog.c to the build and add TSLOG / TSLOGSIZE kernel options.
These are intended for debugging purposes and should not be added to
"generic" kernel configurations since they result in a nontrivial amount
of memory being set aside for this purpose, can break if kernel modules are
unloaded, and can potentially leak a dangerous amount of information about
timestamps used as a source of kernel entropy.
2017-12-31 09:21:34 +00:00
Colin Percival
e31e71991a Code for recording timestamps of events, especially function entries/exits.
This is a very primitive system, intended for use in measuring performance
during the early system boot, before more sophisticated tools like DTrace
or infrastructure like kernel memory allocation and mutexes are available.

Because this code records pointers to strings rather than copying strings
(in order to keep the memory usage more manageable), if a kernel module is
unloaded after logging an event, Bad Things can happen.  Users are advised
to not do that.

Since cycle counts from the early kernel boot are used as an initial entropy
source, publishing this information to userland could result in inadequate
entropy being kept private to the kernel RNG.  Users are advised to not
enable this on systems with untrusted users.

Discussed on:	freebsd-current
2017-12-31 09:21:01 +00:00
Kyle Evans
1ccce047f5 aw_sid: rewrite compat-string configuration to be more flexible
This will allow easiser support in the future for boards that have thermal
data and different offsets for root key/efuse data.
2017-12-31 06:44:15 +00:00
Nathan Whitehorn
5261ac0eda Use data from the boot loader to pick the appropriate output graphics mode
instead of hard-coding a default. This information is passed implicitly by
the PS3 firmware and can be relied upon. Also adjust the default mode, if
somehow firmware doesn't pass one, to 1920x1080 from 720x480 since it is
2017.

MFC after:	2 weeks
2017-12-31 06:10:07 +00:00
Nathan Whitehorn
a891d21aac Make sure the first instruction of the low-memory spinloop is in the
cacheline being invalidated.

MFC after:	1 month
2017-12-31 05:38:19 +00:00
Alan Cox
4abca9bb05 Previously, swap_pager_copy() freed swap blocks one at at time, via
swp_pager_meta_ctl(), with no opportunity to recognize freeing of
consecutive blocks and free fewer block ranges.  To open that opportunity,
this change removes the SWM_FREE option from swp_pager_meta_ctl(), and
compels the caller to do the freeing when a valid block address is returned.
In swap_pager_copy(), these frees are aggregated, so that a sequence of them
can be done at one time.

The only other caller to swp_pager_meta_ctl() that passed SWM_FREE,
swp_pager_unswapped(), is also modified to handle its single free
explicitly.

Submitted by:	Doug Moore <dougm@rice.edu>
Reviewed by:	kib (an earlier version)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D13290
2017-12-31 04:01:47 +00:00
Pedro F. Giffuni
0879ca728a sysv_{ipc|shm}: update the NetBSD VCS tags to match nearer our files.
Both files originated in NetBSD:

sysv_ipc.c CVS 1.9:
Most of their changes don't apply to us as we already have similar
changes. This is a better reference for future merges.

sysv_shm.c CVS 1.39:
Most of their changes don't apply to our code but interestingly this
revision merged our changes and is a better point for reference.

Move the VCS tags to the position recommended in our committers guide
(section 8),

No functional change.
2017-12-31 03:34:00 +00:00
Mateusz Guzik
efa9f177f5 locks: adjust loop limit check when waiting for readers
The check was for the exact value, but since the counter started being
incremented by the number of readers it could have jumped over.
2017-12-31 02:31:01 +00:00
Mateusz Guzik
cde25ed4cd sx: fix up non-smp compilation after r327397 2017-12-31 01:59:56 +00:00
Mateusz Guzik
28f1a9e3ff locks: re-check the reason to go to sleep after locking sleepq/turnstile
In both rw and sx locks we always go to sleep if the lock owner is not
running.

We do spin for some time if the lock is read-locked.

However, if we decide to go to sleep due to the lock owner being off cpu
and after sleepq/turnstile gets acquired the lock is read-locked, we should
fallback to the aforementioned wait.
2017-12-31 00:47:04 +00:00
Mateusz Guzik
fb10612355 sx: read the SX_NOADAPTIVE flag and Giant ownership only once
These used to be read multiple times when waiting for the lock the become
free, which had the potential to issue completely avoidable traffic.
2017-12-31 00:37:50 +00:00
Mateusz Guzik
15140a8ade mtx: deduplicate indefinite wait check in spinlocks and thread lock 2017-12-31 00:34:29 +00:00
Mateusz Guzik
1f4d28c7ea mtx: pre-read the lock value in thread_lock_flags_
Since this function is effectively slow path, if we get here the lock is most
likely already taken in which case it is cheaper to not blindly attempt the
atomic op.

While here move hwpmc probe out of the loop to match other primitives.
2017-12-31 00:33:28 +00:00
Mateusz Guzik
80c39f6c37 rwlock: tidy up __rw_runlock_hard similarly to r325921 2017-12-31 00:31:14 +00:00
Emmanuel Vadot
74e0613ead dwmmc: Fully subclass driver
Fully subclass the dwmmc driver and split every driver into multiple files.

There is still a few quirks in the dwmmc driver that will need some work.

Tested On: Pine64 Rock64

Differential Revision:	https://reviews.freebsd.org/D13615
2017-12-30 22:01:17 +00:00
Nathan Whitehorn
01d1c9d889 Avoid use of the fdt_get_property_*() API, which is intrinsically
incompatible with FDT versions < 16. This also simplifies the code a bit.

MFC after:	1 month
2017-12-30 20:28:29 +00:00
Nathan Whitehorn
3fca788024 Remove logic for early console with loader.ps3 now that loader.ps3 is dead. 2017-12-30 20:25:33 +00:00
Nathan Whitehorn
ba06dbb874 Change the way SMP startup works to match the new multi-AP features in
locore64.S introduced in r327358.

MFC after:	3 weeks
2017-12-30 20:24:33 +00:00
Nathan Whitehorn
223d42a4de Check more aggressively for whether the desired properties actually exist.
If they don't, the code would look up some random part of the device tree
and seize the console inappropriately.

MFC after:	2 weeks
2017-12-30 20:23:14 +00:00
Bryan Venteicher
ac2b436d20 Add macro for vxlan list mutex lock and unlock
This will simplify some later VNET support.

Submitted by:	hrs
MFC after:	2 weeks
2017-12-30 19:49:40 +00:00
Bryan Venteicher
6d7bc5838b Advertise IFCAP_LINKSTAT after r326480 added link status support
MFC after:	2 weeks
2017-12-30 19:35:12 +00:00
Cy Schubert
57229a86bf Fix r327383:
.../sys/dev/ep/elink.c:31:1: error: '/*' within block comment
[-Werror,-Wcomment]
/* $NetBSD: elink.c,v 1.6 1995/01/07 21:37:54 mycroft Exp $
^
2017-12-30 19:27:22 +00:00
Pedro F. Giffuni
8a2141f0e3 elink.[ch]: Move historic VCS tags after the license.
This matches better our common practices and style.
2017-12-30 15:13:33 +00:00
Konstantin Belousov
abbfe9e5d1 Move i386/isa/elink.[hc] to dev/ep.
The ep(4) driver is the only consumer of the two functions from
elink.c.  I removed the standalone module as well, and most likely,
the module metadata is not needed anywhere, but this is for later
cleanup.

Discussed with:	imp, jhb
Sponsored by:	The FreeBSD Foundation
2017-12-30 11:42:49 +00:00
Konstantin Belousov
0cc997d237 Move i386/isa/npx.c to i386i386/npx.c.
The i386 FPU (AKA npx) code does not depend on ISA devices at all,
after the support for IRQ13 FPU exceptions was removed.  Put the file
into the expected place in the kernel source tree.

Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
2017-12-30 11:33:04 +00:00
Warner Losh
9e1472ca00 On further testing on actual machines with this hardware, we should
only warn for devices that are attached. Add missing \n.
2017-12-30 08:16:31 +00:00
Warner Losh
d14e21a831 Remove two stray references to wl driver, removed some time ago. 2017-12-30 07:59:32 +00:00
Bryan Venteicher
33e0d8f057 Add support for IPv6 scoped addresses to vxlan
MFC after:	2 weeks
2017-12-30 04:03:53 +00:00
John Baldwin
8bbeea2b1d Remove a redunant check. 2017-12-30 03:08:49 +00:00
Pedro F. Giffuni
2afb21f309 geom_ccd.c: Fix the licenses properly
The license merging in r109471 didn't take into account that licensing
could change. Just removing the 3rd clause obviates the copyright
assignment to the NetBSD Foundation.

We do have plenty of files that have two or more licensing as in this
case, so fix this properly by splitting back the licenses as they are
upstream.

Obtained from:	NetBSD
2017-12-30 02:07:18 +00:00
Pedro F. Giffuni
68689f580b geom_ccd.c: Update the license with changes from upstream.
Part of this file originated in NetBSD, with the original file
carrying two versions of 4-clause BSD licenses. r109471 attempted to
simplify the situation by putting both licenses together.

Meanwhile, NetBSD dropped Clauses 3 and 4 from their own license, and
eventually NetBSD got permission from the University of Utah to drop the
3rd clause.

Keep the license "simple" by dropping the third clause since both TNF,
Utah/Berkeley and phk agree in principle that it can be dropped.

Obtained from:	NetBSD (ccd.c CVS 1.128, 1.138)
2017-12-30 01:37:08 +00:00
Andriy Voskoboinyk
bcabc90835 net80211: sanitize input for ieee80211_output()
- Add some basic checks for i_fc* bits (ToDS, FromDS, MoreFrag, Protected);
those are used / checked across various places in Tx path.
- Mark injected 802.11 frame as encapsulated (just as it should be).
- Classify 802.11 frame in a proper way (extract ether_type from LLC header
for Data frames, use AC_BE queue for others (NoData / Management / Control).
- Subtract header length from tx_bytes statistics (so it will correspond
to the comment).

Was checked with RTL8188EU (AP) + Intel 6205 (STA).

Reviewed by:	adrian
Differential Revision:	https://reviews.freebsd.org/D13161
2017-12-30 00:40:34 +00:00
Andriy Voskoboinyk
f6b986459f net80211: handle VHT nodes in ieee80211_node_setuptxparms()
Select proper mode when node can do VHT.

Currently there are no drivers with VHT support in the tree,
so this should be noop.

Reviewed by:	adrian
Differential Revision:	https://reviews.freebsd.org/D9806
2017-12-30 00:24:53 +00:00
Ian Lepore
2d09b07279 Make kernel option KERNVIRTADDR optional, remove it from std.<platform>
files that can use the default value.

It used to be required that the low-order bits of KERNVIRTADDR matched
the low-order bits of the physical load address for all arm platforms.
That hasn't been a requirement for armv6 platforms since FreeBSD 10.
There is no longer any relationship between load addr and KERNVIRTADDR
except that both must be aligned to a 2 MiB boundary.

This change makes the default KERNVIRTADDR value 0xc0000000, and removes the
options from all the platforms that can use the default value.  The default
is now defined in vmparam.h, and that file is now included in a few new
places that reference KERNVIRTADDR, since it may not come in via the
forced-include of opt_global.h on the compile command line.
2017-12-30 00:20:49 +00:00
Conrad Meyer
66ad253880 Add AHCI/XHCI device IDs found on AMD 1950X+X399 system
A follow-up to r327094.

Sponsored by:	Dell EMC Isilon
2017-12-29 22:24:41 +00:00
Nathan Whitehorn
f9d6e0a5d0 Enhance the CHRP/pSeries platform layer:
- Densely number CPUs to avoid systems with CPUs with very high ID numbers
- Always have the BSP be CPU 0 to avoid remnant brokenness with non-0 BSPs
  in other parts of the kernel.
- Improve parsing of the device tree CPU listings on SMT systems.
- Allow reboot via RTAS as well as OF for pSeries systems booted by FDT
  without functioning Open Firmware.

Obtained from:	projects/powernv
MFC after:	3 weeks
2017-12-29 21:09:17 +00:00
Konstantin Belousov
b6eabc36ba Do not lock vm map in swapout_procs().
Neither swapout_procs() nor swapout() access the map.  Since the
process' vmspace is referenced only to obtain the pointer to the
vm_map, the reference is not needed as well.

Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13681
2017-12-29 20:33:56 +00:00
Nathan Whitehorn
70f654991a Add support for 64-bit PowerPC kernels to be directly loaded by kexec, which
is used as the bootloader on a number of PPC64 platforms. This involves the
following pieces:
- Making the first instruction a valid kernel entry point, since kexec
  ignores the ELF entry value. This requires a separate section and linker
  magic to prevent the linker from filling the beginning of the section
  with stubs.
- Adding an entry point at 0x60 past the first instruction for systems
  lacking firmware CPU shutdown support (notably PS3).
- Linker script changes to support the above.

MFC after:	1 month
2017-12-29 20:30:10 +00:00
Nathan Whitehorn
8469e0fe35 Maintain alignment of in-code 64-bit quantities by design rather than luck.
If these are not aligned, the linker has to emit a different type of
relocation that the early boot self-relocation code cannot handle, even
in principle, resulting in them being set to zero and the kernel crashing.

MFC after:	1 week
2017-12-29 20:25:15 +00:00
Marius Strobl
6c06949f41 - Don't allow userland to switch partitions; it's next to impossible
to recover from that, especially when something goes wrong.
- When userland changes EXT_CSD, update the kernel copy before using
  relevant EXT_CSD bits in mmcsd_switch_part().
2017-12-29 19:07:50 +00:00
Konstantin Belousov
e258b4a0dc Style.
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13678
2017-12-29 19:05:07 +00:00
Alan Cox
5c515efc88 After r327168, the variable "vm_pageout_wanted" can be static.
MFC after:	2 weeks
2017-12-29 17:02:22 +00:00
Dimitry Andric
4fc74049d2 Merge ^/head r327169 through r327340. 2017-12-29 12:51:26 +00:00
Dimitry Andric
2055323b2c Work around a clang 6.0.0 issue with rep prefixes followed by .byte
directives (as reported in https://bugs.llvm.org/show_bug.cgi?id=35749),
by defining the rep prefix with yet another .byte directive.

This is a temporary fix, to be reverted before merging back to head,
until upstream has a proper fix for this.
2017-12-29 12:49:24 +00:00
Marius Strobl
cc22204bbc - There is no need to keep the tuning error and re-tuning interrupts
enabled (though, no interrupt generation enabled for them) all the
  time as soon as (re-)tuning is supported; only enable them and let
  them generate interrupts when actually using (re-)tuning.
- Also disable all interrupts except SDHCI_INT_DATA_AVAIL ones while
  executing tuning and not just their signaling.
2017-12-29 12:48:19 +00:00
Dimitry Andric
42e6db5330 Work around clang 6.0.0 warnings about arithmetic on NULL pointers in
acpica, by redefining the ACPI_ADD_PTR and ACPI_SUB_PTR macros so they
use integer arithmetic instead.

This is a temporary fix, to be reverted before merging back to head,
unless jkim reads my mail and agrees with this. :)
2017-12-29 12:47:23 +00:00
Cy Schubert
f813e520c2 Correct the comment describing badrs which is bad router solicitiation,
not bad router advertisement.

MFC after:	3 days
2017-12-29 07:23:18 +00:00
Navdeep Parhar
348694dada cxgbe(4): Reduce duplication by consolidating minor variations of the
same code into a single routine.

Sponsored by:	Chelsio Communications
2017-12-29 02:30:21 +00:00
Pedro F. Giffuni
8e1f85caa3 if_txpreg.h: drop clauses 3 and 4.
Obtained from:	OpenBSD (CVS 1.35)
2017-12-29 00:59:56 +00:00
Pedro F. Giffuni
11c08a3f62 dev/txp: Update if_txpreg.h to match OpenBSD's version.
Resolve a minor mismatch: TXP_CMD_READ_VERSION instead of
TXP_CMD_VERSIONS_READ.
Curiously the later is defined but not used in OpenBSD.

Obtained from:	OpenBSD (CVS 1.31-1.34)
MFC after:	1 week
2017-12-29 00:27:12 +00:00
Konstantin Belousov
89c0e67db5 Clean up the comment.
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13671
2017-12-28 23:50:21 +00:00
Nathan Whitehorn
2ad331874e Remove ELF note for Open Firmware. It is marked optional in a single 1996
draft of a never-finalized standard (CHRP) and is irrelevant in practice
on FreeBSD since we load the kernel with loader(8) on Open Firmware
platforms anyway. Moreover, loader(8), which is directly loaded by Open
Firmware, has never had an equivalent note.

MFC after:	2 weeks
2017-12-28 23:49:53 +00:00
Konstantin Belousov
0080a8fa95 In vm_swapout_map_deactivate_pages(), it is enough to lock the map for read.
Reviewed by:	alc, markj (as part of the larger patch)
Tested by:	pho (again, as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13671
2017-12-28 22:56:30 +00:00
Marius Strobl
54ef33c2a8 Probe Intel Denverton eMMC 5.0 controllers. 2017-12-28 22:03:08 +00:00
Marius Strobl
15f0034553 With the advent of interrupt remapping, Intel has repurposed bit 11
(now: Interrupt_Index[15]) and assigned the previously reserved bits
55:48 (Interrupt_Index[14:0] goes into 63:49 while Destination Field
used 63:56 and bit 48 now is Interrupt_Format) in the IO redirection
tables (see the VT-d specification, "5.1.5.1 I/OxAPIC Programming").
Thus, when not using interrupt remapping, ensure that all previously
reserved bits in the high part of the RTEs are zero instead of doing
a read-modify-write for their Destination Field bits only.
Otherwise, on machines based on Apollo Lake and its derivatives such
as Denverton, typically some of the previously preserved bits remain
set after boot when not employing interrupt remapping. The result is
that INTx interrupts are not getting delivered.
Note: With an AMD IOMMU, interrupt remapping apparently bypasses the
IO APIC altogether.

Submitted by:	loos (modulo comment)
Reviewed by:	jhb (modulo comment)
2017-12-28 21:46:09 +00:00
John Baldwin
9a1bf495b1 Remove a few more dangling references to ie(4). 2017-12-28 21:35:53 +00:00
Sean Bruno
6fe4c0a063 e1000: Add support for Ice Lake and Cannon Lake
Ths add initial support for Ice Lake and Cannon Lake ethernet devices.

This also addressed errata 1.5.4.4 for Sky Lake and Kabby Lake devices:
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/i218-i219-ethernet-connection-spec-update.pdf?asset=9561

Submitted by:	Kevin Bowling <kevin.bowling@kev009.com>
Relnotes:	Yes
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13660
2017-12-28 21:26:40 +00:00
Pedro F. Giffuni
3760a9ac78 Fix some typos.
Obtained from:	OpenBSD (CVS v1.5)
2017-12-28 20:40:56 +00:00
Pedro F. Giffuni
a8e6714356 netinet6/ip6_id.c: niels kindly dropped clause 3/4 from the license.
This bring back r327293 from OpenBSD, with the important difference that
we are now getting it from their ip6_id.c file.

Obtained from:	OpenBSD (CVS v1.3)
2017-12-28 20:35:21 +00:00
Pedro F. Giffuni
b3c64c30fa Start syncing changes from OpenBSD's ip6_id.c instead of ip_id.c.
correct non-repetitive ID code, based on comments from niels provos.
- seed2 is necessary, but use it as "seed2 + x" not "seed2 ^ x".
- skipping number is not needed, so disable it for 16bit generator (makes
  the repetition period to 30000)

Obtained from:	OpenBSD (CVS rev. 1.2)
MFC after:	1 week
2017-12-28 20:26:51 +00:00
Pedro F. Giffuni
d82751000f Revert r327293
netinet6/ip6_id.c: niels kindly dropped clause 3/4 from the license.

I was looking at the wrong file. There is an important merge that must be
done before I can bring this change.
2017-12-28 20:10:10 +00:00
Pedro F. Giffuni
e9738d25c1 netinet6/ip6_id.c: niels kindly dropped clause 3/4 from the license.
This file is supposed to be based on the OpenBSD CVS v1.6 but checking
the OpenBSD repository the license had already dropped the 2&3 clasues by
then. Catch up with the licensing.

Obtained from:	OpenBSD (CVS 1.2)
2017-12-28 19:42:53 +00:00
John Baldwin
a4f66e5f21 Minor formatting and style tweaks to some comments. 2017-12-28 18:51:35 +00:00
John Baldwin
c2ab6ce5e2 Remove a stale reference to ie(4).
This was missed in r304513.

Submitted by:	kib
2017-12-28 17:55:58 +00:00
Pedro F. Giffuni
d7629e6644 sys/sys/chio.h: add NetBSD RCS ID.
Make it easier to identify the point where we started diverging from
NetBSD. Newer versions of this file has been updated to a new license so
this information may become useful some day.

Obtained from:	NetBSD
2017-12-28 14:26:33 +00:00
Pedro F. Giffuni
aa0e6abb41 sys/i386/isa/elink*: sync with (older) NetBSD.
Make it easier to identify the point where we started diverging from
NetBSD. Newer versions of these files have been updated to a new license
so this information may become useful some day.

Obtained from:	NetBSD
2017-12-28 14:14:35 +00:00
Konstantin Belousov
cfb03f67d9 Reuse kern_proc_vmmap_resident() for procfs_map resident count.
The existing algorithm in procfs_map() to calculate count of resident
pages in an entry is too primitive, resulting in too long run time for
large sparse mapping entries.  Re-use the kern_proc_vmmap_resident()
from kern_proc.c which only looks at the existing pages in the
iterations.

Also, this makes procfs to honor kern.proc_vmmap_skip_resident_count,
if user does not need this information.

Reported by:	Glenn Weinberg <glenn.weinberg@intel.com>
PR:	224532
No objections from:	des (procfs maintainer)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13595
2017-12-28 13:23:13 +00:00
Konstantin Belousov
baaa79699a Make kern_proc_vmmap_resident() externally accesible, and move the
vmmap_skip_res_cnt control check inside it.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13595
2017-12-28 13:16:32 +00:00
Emmanuel Vadot
ee070097f2 Revert r327250 as it broke the build for some armv6 kernel and all armv4/5
Reported by:	ian
2017-12-28 07:31:14 +00:00
Pedro F. Giffuni
8a97717048 SPDX: fix wrong license ID tag in dev/cesa. 2017-12-28 03:10:57 +00:00
Pedro F. Giffuni
bd9ba70127 SPDX: fix wrong license ID tag in dev/spibus. 2017-12-28 03:04:36 +00:00
Pedro F. Giffuni
2cc7a55540 SPDX: fix wrong license ID tag in libkern. 2017-12-28 01:20:30 +00:00
Pedro F. Giffuni
ebd63680e4 arm/ixp425: Drop 3rd and 4th clauses from Ichiro FUKUHARA's license.
This syncs us with NetBSD as much of our changes have been upstreamed.

Obtained from:	NetBSD
2017-12-28 01:12:28 +00:00
Pedro F. Giffuni
8f1927357d SPDX: fix license ID tags for arm/xscale.
Use parenthesis for grouping as suggested by the spec.
2017-12-27 22:47:56 +00:00
Navdeep Parhar
adbb3637ec cxgbe/iw_cxgbe: Fix iWARP over VLANs (catch up with r326169).
Submitted by:	KrishnamRaju ErapaRaju @ Chelsio
Sponsored by:	Chelsio Communications
2017-12-27 22:44:50 +00:00
Pedro F. Giffuni
e3e70e1e67 sparc64: Update idprom.h.
We only take a small part of the NetBSD file in sys/dev/sun/idprom.h.
Bring some comments and update the license.

Obtained from:	NetBSD (CVS rev 1.3)
2017-12-27 22:01:30 +00:00
Emmanuel Vadot
9c4bfa00d4 arm: hdmi_if.m is already in files.arm
Do not require it in files.vendor
2017-12-27 21:58:19 +00:00
Emmanuel Vadot
d06955f9bd arm: Add kern/kern_clocksource.c to files.arm
Instead of adding it to every files.vendor, add it to the common arch file.
2017-12-27 21:39:57 +00:00
Stephen Hurd
9c58cafaa3 Don't pass rids to taskqgroup_attach()
As everywhere else, we want to pass rman_get_start(irq->ii_res).  This
caused set affinity errors when not using MSI-X vectors (legacy and MSI
interrupts).

Reported by:	sbruno
Sponsored by:	Limelight Networks
2017-12-27 20:42:30 +00:00
Stephen Hurd
ca03863c85 Remove assertion that's not true for !EARLY_AP_STARTUP
gtask->gt_taskqueue is NULL when EARLY_AP_STARTUP is not enabled.
Remove assertion to allow this config to work.

Reported by:	oleg
Sponsored by:	Limelight Networks
2017-12-27 19:14:15 +00:00
Pedro F. Giffuni
51268c3852 SPDX: Complete License ID tags for UFS. 2017-12-27 19:13:50 +00:00
Stephen Hurd
de13095409 Fix indentation.
Sponsored by:	Limelight Networks
2017-12-27 19:12:32 +00:00
Kyle Evans
858f246615 if_awg: Respect rgmii-*id PHY configurations
phy-mode can be one of: rgmii, rgmii-id, rgmii-txid, rgmii-rxid; as this was
written, any of these alternate -id configurations would break as we fail to
configure syscon for rgmii. Instead, simply check that phy-mode is
configured for rgmii and we'll let the PHY driver handle any internal delay
configuration.

The pine64 should eventually specify phy-mode = "rgmii-txid" to address
gigabit issues when rx delay is configured, motivating this change.
2017-12-27 18:22:02 +00:00
Emmanuel Vadot
8e3f1dcf54 ctl: Correct comment in ctl_worker_thread
The incoming queue is handled before the RtR one.
No functional change.

MFC after:	3 days
2017-12-27 15:39:31 +00:00
Eitan Adler
caa7e52f3f kernel: Fix several typos and minor errors
- duplicate words
- typos
- references to old versions of FreeBSD

Reviewed by:	imp, benno
2017-12-27 03:23:21 +00:00
Michael Zhilin
ebff39fb9b [mips] fix compilation of TP-WN1043ND kernel configuration
This compilation issue has been found thanks to freebsd-wifi-build:
 - gpioiic requires "gpio_if.h", so "device gpio" is mandatory
 - rtl8366rb works over MDIO interface, so "device mdio" is mandatory

Compilation is checked on FreeBSD 12-CURRENT machine.
2017-12-26 19:50:23 +00:00
Ian Lepore
513581749b Add a new ARM kernel option, LOCORE_MAP_MB, to control the size of the
kernel VA mapping in the temporary page tables set up by locore-v6.S.

The number used to be hard-coded to 64MB, which is still the default if
the kernel option is not specified.  However, 64MB is insufficient for
using a large mdroot filesystem.  The hard-coded number can't be safely
increased because too large a number may run into memory-mapped IO space
on some SoCs that must not be mapped as ordinary memory.
2017-12-26 19:02:56 +00:00
Alan Cox
fec296887f Refactor vm_map_find(), creating a separate function, vm_map_alignspace(),
for finding aligned free space in the given map.  With this change, we
always return KERN_NO_SPACE when we fail to find free space.  Whereas,
previously, we might return KERN_INVALID_ADDRESS.  Also, with this change,
we explicitly check for address wrap, rather than relying upon the map's
min and max addresses to establish sentinel-like regions.

This refactoring was inspired by the problem that we addressed in r326098.

Reviewed by:	kib
Tested by:	pho
Discussed with:	markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D13346
2017-12-26 17:59:37 +00:00
Kyle Evans
198ca831a1 extres/syscon: Commit missing bits from r327106
r327106 introduced kobj to syscon so it can be subclassed and fit in with
the rest of the syscon framework. The diff for syscon.c was misapplied in a
clean tree prior to commit, so bring it back to what was included in the
review and tested. The entire file has basically been rewritten from what
was present prior to the kobj work.

Pointy hat to:	me
2017-12-26 16:38:04 +00:00
Michael Tuexen
fa5867cbd6 White cleanups. 2017-12-26 16:33:55 +00:00
Mark Johnston
65ef323137 Ensure that pass > 0 when starting a scan with vm_pages_needed == 1.
Otherwise the page daemon will not reclaim pages and thus will not
wake threads sleeping in VM_WAIT.

Reported and tested by:	pho
Reviewed by:	alc, kib
X-MFC with:	r327168
Differential Revision:	https://reviews.freebsd.org/D13640
2017-12-26 16:29:39 +00:00
Michael Tuexen
f34a628e7e Clearify CID 1008197.
MFC after:	3 days
2017-12-26 16:12:04 +00:00
Michael Tuexen
0460135495 Clearify issue reported in CID 1008198.
MFC after:	3 days
2017-12-26 16:06:11 +00:00
Emmanuel Vadot
99de54f18a arm64: a10_gpio.c and a10_mmc.c were renamed aw_gpio.c and aw_mmc.c 2017-12-26 15:35:19 +00:00
Michael Tuexen
f6ea123171 Fix CID 1008428.
MFC after:	1 week
2017-12-26 15:29:11 +00:00
Michael Tuexen
4830aee72f Fix CID 1008936. 2017-12-26 15:24:42 +00:00
Michael Tuexen
c9256941d0 Allow the first (and second) argument of sn_calloc() be a sum.
This fixes a bug reported in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224103
PR:		224103
2017-12-26 14:37:47 +00:00
Emmanuel Vadot
4f96b2503d arm: a10_gpio.c was renamed aw_gpio.c
While here order files in files.allwinner
2017-12-26 14:34:38 +00:00
Michael Tuexen
cd90150413 When adding support for sending SCTP packets containing an ABORT chunk
to ipfw in https://svnweb.freebsd.org/changeset/base/326233,
a dependency on the SCTP stack was added to ipfw by accident.

This was noted by Kevel Bowling in https://reviews.freebsd.org/D13594
where also a solution was suggested. This patch is based on Kevin's
suggestion, but implements the required SCTP checksum computation
without any dependency on other SCTP sources.

While there, do some cleanups and improve comments.

Thanks to Kevin Kevin Browling for reporting the issue and suggesting
a fix.
2017-12-26 12:35:02 +00:00
Emmanuel Vadot
81b5c8ff8d Allwinner: gpio: Rename driver to aw_gpio and add man page for it
Reviewed by:	bcr (manpages)
Differential Revision:	https://reviews.freebsd.org/D13617
2017-12-26 12:11:04 +00:00
Emmanuel Vadot
b5be541f1d Allwinner: mmc: Rename driver to aw_mmc and add a man page for it
Reviewed by:	bcr (manpages)
Differential Revision:	https://reviews.freebsd.org/D13616
2017-12-26 12:06:56 +00:00
Emmanuel Vadot
b6d40d9394 Change the remaining files using my personnal email address to my freebsd one 2017-12-25 22:09:25 +00:00
Poul-Henning Kamp
8ba749fbe3 Introduce an architecture-agnostic <sys/_stdarg.h> to reduce
platform divergence.

Only architectures which pass arguments in registers (mips)
and platforms which use really weird compilers (any?) would
need to augment the contents of <sys/_stdarg.h>

Convert x86, arm and arm64 architectures to use <sys/_stdarg.h>
2017-12-25 20:54:00 +00:00
Alan Cox
115423761e Make the vm object bypass and collapse counters per CPU.
Requested by:	mjg
Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D13611
2017-12-25 19:36:04 +00:00
Emmanuel Vadot
b7bb2f26b3 allwinner: aw_usbphy is also needed for ohci 2017-12-25 16:40:09 +00:00
Emmanuel Vadot
d93f448238 Allwinner: Remove unused aw_console driver. 2017-12-25 16:27:36 +00:00
Alexander Kabaev
151ba7933a Do pass removing some write-only variables from the kernel.
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
2017-12-25 04:48:39 +00:00
Mark Johnston
280d15cd0a Fix two problems with the page daemon control loop.
Both issues caused the page daemon to erroneously go to sleep when
applications are consuming free pages at a high rate, leaving the
application threads blocked in VM_WAIT.

1) After completing an inactive queue scan, concurrent allocations may
   have prevented the page daemon from meeting the v_free_min threshold.
   In this case, the page daemon was going to sleep even when the
   inactive queue contained plenty of clean pages.
2) pagedaemon_wakeup() may be called without the free queues lock held.
   This can lead to a lost wakeup if a call occurs after the page daemon
   clears vm_pageout_wanted but before going to sleep.

Fix 1) by ensuring that we start a new inactive queue scan immediately
if v_free_count < v_free_min after a prior scan.

Fix 2) by adding a new subroutine, pagedaemon_wait(), called from
vm_wait() and vm_waitpfault(). It wakes up the page daemon if either
vm_pages_needed or vm_pageout_wanted is false, and atomically sleeps
on v_free_count.

Reported by:	jeff
Reviewed by:	alc
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D13424
2017-12-24 19:45:16 +00:00
Dimitry Andric
e42df78a1a Remove obsolete register keyword from opensolaris's sysmacros.h. When
compiling zfsd with recent clang, it leads to a warning about the
register storage class being incompatible with C++17.

MFC after:	3 days
2017-12-24 19:17:15 +00:00
Warner Losh
ed98ce5cad Further investigation shows this shouldn't have been added at all.
Remove it.
2017-12-24 17:59:48 +00:00
Warner Losh
d76103580a Comment this out until I have time to get to the bottom of why it's
failing for some people.
2017-12-24 16:36:50 +00:00
Warner Losh
7dcb3b1295 Warn when nonPNP ISA devices are attached in GENERIC that they are
being removed from GENERIC in 12. Always print PNP info for ISA when
it exists: it doesn't depend on ISAPNP. Add PNP ID to orm and vga to
prevent us from warning about them since those devices aren't being
removed from GENERIC. PNP devices will be removed from GENERIC too,
but they will be automatically loaded, so need no warning. We don't
warn for non-GENERIC kernels because people running them are presumed
to know what they are doing.

MFC After: 2 weeks
2017-12-23 22:57:14 +00:00
Konstantin Belousov
6332b14887 Add missed AVX512VL (128 and 256 bit vector length) extension
identification bit.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-12-23 21:32:50 +00:00
Alexander Kabaev
6d41588b6b Reverse the check to allocate the buffer if cached pointer is NULL.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D13596
2017-12-23 17:55:19 +00:00
Alexander Kabaev
4daa09f343 Remove dead store to local variable. 2017-12-23 16:49:57 +00:00
Alexander Kabaev
bf51c9665d Silence clang analyzer false positive.
clang does not know that two lookup calls will return the same
pointer, so it assumes correctly that using the old pointer
after dropping the reference to it is a bit risky.
2017-12-23 16:45:26 +00:00
Alexander Kabaev
92f19df431 Do not pass NULL pointer to copyout in if_clone_list.
Sometimes caller is only interested in how many clones
are there and NULL pointer is passed for the destination
buffer. Do not pass it to copyout then.
2017-12-23 16:45:24 +00:00
Alexander Kabaev
5f943cca65 Remove dead initialization of the inode pointer.
The pointer gets initialized again later in the code. This also
improves code style(9).
2017-12-23 16:24:02 +00:00
Alexander Kabaev
ce4ab99d82 Remove some trailing whitespace.
Reviewed by: glebius, ae
Differential Revision: https://reviews.freebsd.org/D10386
2017-12-23 16:24:00 +00:00
Alexander Kabaev
3395dd6eb8 Do not double free the memory in if_clone.
if_clone_attach function will drop the reference on failure  which will
free the if_clone structure. No need to do it second time.

Reviewed by: glebius, ae
Differential Revision: https://reviews.freebsd.org/D10386
2017-12-23 16:23:58 +00:00
Kyle Evans
cd04523f0e Move syscon into extres framework
This should help reduce confusion between syscon/syscons a little bit.
syscon is a resource generally modeled by FDT platforms, and not to be
confused with syscons.
2017-12-23 14:30:44 +00:00
Kyle Evans
4d68f3daa0 syscon: Introduce kobj and split out fdt bits
Allow more flexibility by kobj'ifying syscon and splitting out fdt specific
bits in preparation of a move to the extres framework.

The generic fdt driver has been moved to syscon_generic.c and the fdt
requirement has been removed from the syscon interface, as is common to the
extres framework.

Reviewed by:	strejda
Differential Revision:	https://reviews.freebsd.org/D13521
2017-12-23 14:27:42 +00:00
Warner Losh
04291531bc Fix cut-and-paste error s/pccard/isa/ 2017-12-23 07:02:45 +00:00
Warner Losh
5fe4723c70 Create a new ISA_PNP_INFO macro. Use this macro every where we have
ISA PNP card support (replace by hand version in if_ed). Move module
declarations to the end of some files. Fix PCCARD_PNP_INFO to use
nitems(). Remove some stale comments about pc98, turns out the comment
was simply wrong.
2017-12-23 06:49:27 +00:00
Warner Losh
75d0374765 Expand cryptic comment with inforation I've learned in the mean time
about CIS3/CIS4, including studies I've done on my large collection of
PC Cards bought off e-bay over the years since the original entry as
well as conversations I've had at conferences.
2017-12-23 06:11:19 +00:00
Warner Losh
a914e889e3 These drivers have a sentinel at the end of the device list. Exclude
it.
2017-12-23 05:32:20 +00:00
Warner Losh
79554b4049 The device tables end with a sentinel in iflib. Don't include the
sentinel in the output.
2017-12-23 04:50:52 +00:00
Konstantin Belousov
7aea69e54a Remove mips MD atomic_load_64 and atomic_store_64.
The only users of the functions were db_read_bytes() and
db_write_bytes() ddb(4) interfaces.  Replace the calls with direct
reads and writes, which are automatically atomic on 64bits and n32.

Note that removed assembler implementation for mips32 is not atomic
anyway.

Reviewed by:	jhb
Discussed with:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13586
2017-12-22 23:27:03 +00:00
Warner Losh
6efa5583c7 Fix typos from last commit, these should have been #. 2017-12-22 20:48:49 +00:00
Alexander Motin
d171d2f281 Add AHCI/XHCI device IDs found on AMD Ryzen+B350 system.
MFC after:	2 weeks
2017-12-22 20:44:21 +00:00
Navdeep Parhar
f549e3521d cxgbe(4): Do not forward interrupts to queues with freelists. This
leaves the firmware event queue (fwq) as the only queue that can take
interrupts for others.

This simplifies cfg_itype_and_nqueues and queue allocation in the driver
at the cost of a little (never?) used configuration.  It also allows
service_iq to be split into two specialized variants in the future.

MFC after:	2 months
Sponsored by:	Chelsio Communications
2017-12-22 19:10:19 +00:00
Warner Losh
d2064cf030 Use '#' rather than some made up name for fields we want to ignore. 2017-12-22 17:53:27 +00:00
Pedro F. Giffuni
fdac1d623b SPDX: Reverse License ID tags from the lmc driver.
While the BSD-2-Clause license is there, the GPLv2 is also present.
I am unsure of the implications of having both licenses as they are here.

I'll just leave it untagged and open for interpretation.
2017-12-22 17:15:02 +00:00
Warner Losh
a5ec991c27 Need to NULL terminate this list. It worked before by accidental data
in the module following it that terminated the search.
2017-12-22 17:13:54 +00:00
Warner Losh
56f3600c8b PC Card PNP tables are terminated by a NULL sentinel. This shouldn't
be recorded in the linker hints, so subtract one to omit it.
2017-12-22 16:59:50 +00:00
Konstantin Belousov
37f48d5aba Fix mips build after introduction of MD definitions of atomic_load_64
and atomic_store_64.

The MD definitions are provided for LP64 only, while mips also uses
them for 32bit and n32.  Only define mips variants for 32bit and n32
and change the syntax to match common definitions.

Note that this commit does not fix 32bit asm implementation to follow
new KBI, this will be fixed later.  The functions are only used for 8
byte ddb accesses so the known bug does not prevent normal kernel
operations.

Sponsored by:	The FreeBSD Foundation
2017-12-21 23:39:00 +00:00
Konstantin Belousov
95c8838c2a Fix build for LP64 arches with gcc.
gcc complaints that the comparision is always false due to the value
range, and the cast does not prevent the analysis.  Split the LP64
vs. ILP32 clamping as a workaround.

Sponsored by:	The FreeBSD Foundation
2017-12-21 23:08:10 +00:00
Konstantin Belousov
97755e83f5 Fix build for kernels with SCHED_4BSD.
Sponsored by:	The FreeBSD Foundation
2017-12-21 23:05:13 +00:00
Tycho Nightingale
9e33a61693 Recognize a pending virtual interrupt while emulating the halt instruction.
Reviewed by:	grehan, rgrimes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13573
2017-12-21 18:30:11 +00:00
Navdeep Parhar
426b6bd53c cxgbe(4): Read the MFG diags version from the VPD and make it available
in the sysctl MIB.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-12-21 15:19:43 +00:00
Bruce Evans
da9fba5447 Use resume_cpus() instead of restart_cpus() to resume from ACPI suspension.
restart_cpus() worked well enough by accident.  Before this set of fixes,
resume_cpus() used the same cpuset (started_cpus, meaning CPUs directed to
restart) as restart_cpus().  resume_cpus() waited for the wrong cpuset
(stopped_cpus) to become empty, but since mixtures of stopped and suspended
CPUs are not close to working, stopped_cpus must be empty when resuming so
the wait is null -- restart_cpus just allows the other CPUs to restart and
returns without waiting.

Fix resume_cpus() to wait on a non-wrong cpuset for the ACPI case, and
add further kludges to try to keep it working for the XEN case.  It
was only used for XEN.  It waited on suspended_cpus.  This works for
XEN.  However, for ACPI, resuming is a 2-step process.  ACPI has already
woken up the other CPUs and removed them from suspended_cpus.  This
fix records the move by putting them in a new cpuset resuming_cpus.
Waiting on suspended_cpus would give the same null wait as waiting on
stopped_cpus.  Wait on resuming_cpus instead.

Add a cpuset toresume_cpus to map the CPUs being told to resume to keep
this separate from the cpuset started_cpus for mapping the CPUs being told
to restart.  Mixtures of stopped and suspended/resuming CPUs are still far
from working.  Describe new and some old cpusets in comments.

Add further kludges to cpususpend_handler() to try to avoid breaking it
for XEN.  XEN doesn't use resumectx(), so it doesn't use the second
return path for savectx(), and it goes from the suspended state directly
to the restarted state, while ACPI resume goes through the resuming state.
Enter the resuming state early for all cases so that resume_cpus can test
for being in this state and not have to worry about the intermediate
!suspended state for ACPI only.

Reviewed by:	kib
2017-12-21 09:17:48 +00:00
Marius Strobl
5eff7c4116 Remove MD atomic_load_{32,64,int,long,ptr}(9) obsolete since the addition
of (conflicting) MI ones in r326971.
2017-12-21 01:27:32 +00:00
Stephen Hurd
25ac1dd5c7 Don't call tcp_lro_rx() unless hardware verified TCP/UDP csum
It seems that tcp_lro_rx() doesn't verify TCP checksums, so
if there are bad checksums in the packets caused by invalid data, the
invalid data will pass through without errors.

This was noticed with the igb driver and a specific internet host:
fetch http://www.mpfr.org/mpfr-current/mpfr-3.1.6.tar.xz -o test.bin && sha256 test.bin
Would result in a different value sometimes.

This ends up making LRO require RXCSUM to be enabled, and RXCSUM to
support TCP and UDP checksums.

PR:		224346
Reported by:	gjb
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13561
2017-12-21 01:22:36 +00:00
Eric Joyner
788a123054 ixgbe(4): Fix build error on i386.
Reported by:	markj
2017-12-21 00:35:14 +00:00
Ian Lepore
68359587f6 If a temporary mapping is made to support EARLY_PRINTF, undo that mapping
after cninit() runs, otherwise we leave a bogus device-memory mapping in
userspace VA in the kernel pmap forever.

Pointed out by:	cognet
2017-12-20 22:19:11 +00:00
Ian Lepore
e3842da22f Allow pmap_kremove() to remove 1MB section mappings as well as 4K pages.
This will allow it to undo temporary device mappings such as those made
with pmap_preboot_map_attr().

Reviewed by:	cognet
2017-12-20 22:17:27 +00:00
Ian Lepore
df1e0a51ec Restore the ability to use EARLY_PRINTF support during most of initarm().
The real kernel page tables are set up much earlier in initarm() now than
they were when early printf support was first added, and they end up undoing
the mapping made in locore.S for early printf support.  This re-adds the
mapping after switching to the new/real kernel page tables, making early
printf work again right after switching to them.
2017-12-20 20:46:12 +00:00
Ian Lepore
5d83601f1f Remove arm-specific implementations of atomic_load/store_xxx() now that
they are provided by sys/atomic_common.h.
2017-12-20 20:41:51 +00:00
Pedro F. Giffuni
33d72c30f1 Revert r327005 - SPDX tags for license similar to BSD-2-Clause.
After consultation with SPDX experts and their matching guidelines[1],
the licensing doesn't exactly match the BSD-2-Clause. It yet remains to be
determined if they are equivalent or if there is a recognized license that
matches but it is safer to just revert the tags.

Let this also be a reminder that on FreeBSD, SPDX tags are only advisory
and have no legal value (but IANAL).

Pointyhat to:	pfg
Thanks to:	Rodney Grimes, Gary O'Neall

[1] https://spdx.org/spdx-license-list/matching-guidelines
2017-12-20 20:25:28 +00:00
Warner Losh
4571c92f4b Simplify the code a bit.
Replace clumsy for(;;) { if (foo) break; ...} with simpler
while (!foo) { ... }.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13546
2017-12-20 19:14:16 +00:00
Warner Losh
bb0107830d Add device location wiring to the pci bus.
This allows one to specify, for example, that if there's an igb card
in bus 12, slot 0, function 0, it should be assigned igb5. If there
isn't, or there's one in a different slot, normal numbering rules
apply (hinted units are skipped). Adding 'hint.igb.5.at="pci12:0:0"'
or 'hint.igb.5.at="pci0:12:0:0"' to /boot/device.hints will accomplish
this. The double quotes are important.

The kernel only accepts the strings (in shell notation):
	pci$d:$b:$s:$f
and	pci$b:$s:$f
where $d is the pci domain, $b is the pci bus number, $s is the slot
number and $f is the function number. A string compare is done with
the current device to avoid another string parser in the kernel. All
numbers are unsigned decimal without leading zeros.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13546
2017-12-20 19:14:05 +00:00
Warner Losh
4484c8f5d2 Return domain, bus, slot, and function for the transport settings in
PATH_INQ requests for nvme.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13546
2017-12-20 19:13:55 +00:00
Ian Lepore
5cf10fb96a Add a new kernel config option, MD_ROOT_READONLY, which forces on the
MD_READONLY flag for the md device automatically instantiated during
kernel init for an mdroot filesystem.

Note that there is specifically and by design no tunable or sysctl
control over this feature.  Without this option, you already have control
over whether the mdroot fs is writeable using vfs.root.mountfrom.options
from loader(8), the root_rw_mount rcvar, and by using "mount -u[rw] /"
or equivelent on the fly.  This option is being added to provide a way
to make the mdroot fs truly immutable before userland code begins running.

Differential Revision:	https://reviews.freebsd.org/D13411
2017-12-20 18:23:22 +00:00
Eric Joyner
c19c7afee3 ixgbe(4): Convert driver to use iflib
Initial update to the ixgbe PF and VF drivers to support the iflib interface.

The PF driver version is bumped to 4.0.0, and the VF driver version is bumped to 2.0.0.

Special thanks to sbruno@ for the support in helping make this conversion happen.

Submitted by:	Jeb Cramer <cramerj@intel.com>, Krzysztof Galazka (Chris) <krzysztof.galazka@intel.com>, Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by:	sbruno@, shurd@, #IntelNetworking
Tested by:	Jeffrey Pieper <jeffrey.e.pieper@intel.com>, Sergey Kozlov <kozlov.sergey.404@gmail.com>
Sponsored by:	Limelight Networks, Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D11727
2017-12-20 18:15:06 +00:00
Justin Hibbits
87879ba805 Increase default MAXDSIZ to 32G on powerpc64
Linking LLVM now seems to require more than 1GB data size, so increase the
default to 32G, which matches amd64.

Reviewed by:	nwhitehorn
2017-12-20 16:49:45 +00:00
Li-Wen Hsu
40cf51c438 Add missing ;
Approved by:	kevlo
2017-12-20 06:08:16 +00:00
Stephen Hurd
b103855e18 Support attaching tx queues to cpus
This will attempt to use a different thread/core on the same L2
cache when possible, or use the same cpu as the rx thread when not.
If SMP isn't enabled, don't go looking for cores to use. This is mostly
useful when using shared TX/RX queues.

Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12446
2017-12-20 01:03:34 +00:00
John Baldwin
f27d3a8a72 Don't return early for non-failure for one of the EMLINK checks.
r326987 enabled two #if 0'd-out EMLINK checks in zfs_link_create() for
link overflow.  However, one of the checks (when the vnode adding a link
is a directory such as for mkdir) always returned even if the link did not
overflow.  Change this to only return early if it needs to report an
EMLINK error.

Reported by:	db, shurd
Sponsored by:	Chelsio Communications
2017-12-19 23:54:44 +00:00
John Baldwin
5538424353 Replace one more LINK_MAX with NFS_LINK_MAX missed in r326991.
Sponsored by:	Chelsio Communications
2017-12-19 22:43:39 +00:00
John Baldwin
f83f3d7986 Update link count handling in fuse for post-ino64.
Set FUSE_LINK_MAX to UINT32_MAX instead of LINK_MAX to match the maximum
link count possible in the 'nlink' field of 'struct fuse_attr'.

Sponsored by:	Chelsio Communications
2017-12-19 22:40:54 +00:00
Pedro F. Giffuni
d17aef79bb SPDX: These are fundamentally BSD-2-Clause.
They just omit the introductory line and numbering.
2017-12-19 22:40:16 +00:00
John Baldwin
b501cc5da6 Rework pathconf handling for FIFOs.
On the one hand, FIFOs should respect other variables not supported by
the fifofs vnode operation (such as _PC_NAME_MAX, _PC_LINK_MAX, etc.).
These values are fs-specific and must come from a fs-specific method.
On the other hand, filesystems that support FIFOs are required to
support _PC_PIPE_BUF on directory vnodes that can contain FIFOs.
Given this latter requirement, once the fs-specific VOP_PATHCONF
method supports _PC_PIPE_BUF for directories, it is also suitable for
FIFOs permitting a single VOP_PATHCONF method to be used for both
FIFOs and non-FIFOs.

To that end, retire all of the FIFO-specific pathconf methods from
filesystems and change FIFO-specific vnode operation switches to use
the existing fs-specific VOP_PATHCONF method.  For fifofs, set it's
VOP_PATHCONF to VOP_PANIC since it should no longer be used.

While here, move _PC_PIPE_BUF handling out of vop_stdpathconf() so that
only filesystems supporting FIFOs will report a value.  In addition,
only report a valid _PC_PIPE_BUF for directories and FIFOs.

Discussed with:	bde
Reviewed by:	kib (part of a larger patch)
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D12572
2017-12-19 22:39:05 +00:00
Stephen Hurd
ff46fd16e5 Add log messages for unknown and unhandled phy types
Previously, it silently only supported auto, instead, log a message
indicating why only auto is supported.

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Broadcom Limited
Differential Revision:	https://reviews.freebsd.org/D13358
2017-12-19 22:15:46 +00:00
Stephen Hurd
a0b660301a On Link up & down, update media types
It's possible to change the SFP module when link is down, which would
change the available media types.  This is part of D13358.

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Broadcom Limited
2017-12-19 22:06:25 +00:00
Stephen Hurd
980da9f2f0 Support short HWRM commands
New Stratus bnxt devices require support for short HWRM commands for VFs
to function.  Enable their use, but only use them if it's both supported
and required... prefer the long HWRM commands when possible.

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Broadcom Limited
Differential Revision:	https://reviews.freebsd.org/D13269?id=36180
2017-12-19 21:07:30 +00:00
Stephen Hurd
6a0dc418a6 Don't populate NVRAM sysctls for VFs
Only the PF allows NVRAM interaction on bnxt devices.

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Broadcom Limited
2017-12-19 20:32:45 +00:00
John Baldwin
35b1a3abd3 Update tmpfs link count handling for ino64.
Add a new TMPFS_LINK_MAX to use in place of LINK_MAX for link overflow
checks and pathconf() reporting.  Rather than storing a full 64-bit
link count, just use a plain int and use INT_MAX as TMPFS_LINK_MAX.

Discussed with:	bde
Reviewed by:	kib (part of a larger patch)
Sponsored by:	Chelsio Communications
2017-12-19 20:19:07 +00:00
John Baldwin
c24008cc39 Honor NANDFS_LINK_MAX for post-ino64.
This uses NANDFS_LINK_MAX instead of LINK_MAX for link overflow checks
and the value reported by pathconf() / fpathconf().

Sponsored by:	Chelsio Communications
2017-12-19 20:17:07 +00:00
John Baldwin
f6e25ec77c Report INT_MAX for LINK_MAX for devfs' VOP_PATHCONF().
devfs uses int's for link counts internally and already reports the the
full link count via stat() post ino64.

Sponsored by:	Chelsio Communications
2017-12-19 20:07:57 +00:00
John Baldwin
a74da9fb83 Use FUSE_LINK_MAX for LINK_MAX in fuse' VOP_PATHCONF().
Should have included this in r326993.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-12-19 19:57:55 +00:00
John Baldwin
418e621276 Handle _PC_FILESIZEBITS and _PC_SYMLINK_MAX for devfs' VOP_PATHCONF().
MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-12-19 19:53:34 +00:00
John Baldwin
599afe53a8 Move NAME_MAX, LINK_MAX, and CHOWN_RESTRICTED out of vop_stdpathconf().
Having all filesystems fall through to default values isn't always correct
and these values can vary for different filesystem implementations.  Most
of these changes just use the existing default values with a few exceptions:
- Don't report CHOWN_RESTRICTED for ZFS since it doesn't do the exact
  permissions check this claims for chown().
- Use NANDFS_NAME_LEN for NAME_MAX for nandfs.
- Don't report a LINK_MAX of 0 on smbfs.  Now fail with EINVAL to
  indicate hard links aren't supported.

Requested by:	bde (though perhaps not this exact implementation)
Reviewed by:	kib (earlier version)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-12-19 19:51:36 +00:00
Ed Maste
2dd51e16ca embed_mfs: support embedding mfs into loader
The script originally supported embedding an mfs into ELF files or any
other type of file, because it searched for magic strings to mark the
beginning and end of the embeddable section. It was later modified to
read the section offset and length via readelf, which made it work for
ELF only. Restore the ability to update arbitrary file types by using
the readelf technique for ELF, and the magic string technique for all
others (including PE/COFF files like loader.efi).

Submitted by:	Zakary Nafziger <worldofzak@gmail.com>
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12746
2017-12-19 19:44:06 +00:00
John Baldwin
a0a073b16d Update NFS to handle larger link counts post ino64.
- Define a NFS_LINK_MAX as UINT32_MAX to match the wire protocol.
- Use NFS_LINK_MAX instead of LINK_MAX as the fallback value reported
  for a PATHCONF RPC by the NFS server.
- Use NFS_LINK_MAX instead of LINK_MAX as the default value reported
  by the NFS client pathconf() if not overridden by the NFS server.
- When reading the link count out of an RPC reply, read the full 32
  bits instead of the lower 16 bits.

Reviewed by:	rmacklem (earlier version)
Sponsored by:	Chelsio Communications
2017-12-19 19:18:48 +00:00
John Baldwin
4a62795260 Handle _PC_FILESIZEBITS and _PC_NO_TRUNC for smbfs' VOP_PATHCONF().
MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-12-19 19:14:01 +00:00
John Baldwin
853b3a8ae8 Support _PC_FILESIZEBITS in msdosfs' VOP_PATHCONF().
MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-12-19 19:10:00 +00:00
John Baldwin
746c92e04e Add a custom VOP_PATHCONF method for fuse.
This method handles _PC_FILESIZEBITS, _PC_SYMLINK_MAX, and _PC_NO_TRUNC.
For other values it defers to vop_stdpathconf().

MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-12-19 19:09:06 +00:00
John Baldwin
697a86b6bf Adjust ZFS' link count handling for ino64.
- Define a ZFS_LINK_MAX as the ZFS version of LINK_MAX which is set to
  UINT64_MAX to match the on-disk format.
- Enable the currently #if 0'd code to check for link overflows and
  return EMLINK.
- Don't clamp the link count reported in stat() to LINK_MAX as that is
  still the 16-bit limit, but report the full link counts.  Also,
  avoid possibly overflowing the reported link count to 0 when adjusting
  the link count to account for ".snapshot".
- Update the LINK_MAX reported by pathconf() to report ZFS_LINK_MAX
  rather than LINK_MAX (but clamped to LONG_MAX for 32-bit systems).

Reviewed by:	avg (earlier version)
Sponsored by:	Chelsio Communications
2017-12-19 19:07:24 +00:00
John Baldwin
dd688800e1 Add a custom VOP_PATHCONF method for fdescfs.
The method handles NAME_MAX and LINK_MAX explicitly.  For all other
pathconf variables, the method passes the request down to the underlying
file descriptor.  This requires splitting a kern_fpathconf() syscallsubr
routine out of sys_fpathconf().  Also, to avoid lock order reversals with
vnode locks, the fdescfs vnode is unlocked around the call to
kern_fpathconf(), but with the usecount of the vnode bumped.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2017-12-19 18:20:38 +00:00
Stephen Hurd
78fcf2de93 Add byte swapping in bnxt_cfg_async_cr() request
The firmware is always in little endian, use htole*() for all request fields
larger than one byte.

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Broadcom Limited
2017-12-19 18:12:18 +00:00
Stephen Hurd
96fc97c81f Update Matthew Macy contact info
Email address has changed, uses consistent name (Matthew, not Matt)

Reported by:	Matthew Macy <mmacy@mattmacy.io>
Differential Revision:	https://reviews.freebsd.org/D13537
2017-12-19 17:59:00 +00:00
Mark Johnston
9abe2e7e98 Avoid using bioq_* in gmirror.
gmirror does not perform any sorting of I/O requests, so the bioq API
doesn't provide any advantages over plain TAILQs. The API also does not
provide operations needed by an upcoming change.

No functional change intended. The diff shrinks the geom_mirror.ko
text and the gmirror softc slightly.

Tested by:	pho (part of a larger patch)
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-12-19 17:13:04 +00:00
Nathan Whitehorn
d6716aa2af The highest-order bit of the bootloader cookie is 1, with the result that
the 32-bit cookie can be sign-extended on its way out of the loader and
through Open Firmware. If sign-extended, the in-kernel check of its value
would fail on 64-bit systems, resulting in a mountroot prompt. Solve this
by telling the kernel to ignore the high-order bits.

PR:		kern/224437
Submitted by:	Gustavo Romero
2017-12-19 16:45:40 +00:00
Nathan Whitehorn
7cc0ad62e3 Make __startkernel line up with KERNBASE, so that the math to compute the
applied relocation offset in link_elf.c works as intended. We may want to
revisit how that works in future, for example by having elf_reloc_self()
actually store the numbers it is using rather than computing them later,
but this fixes symbol lookup after r326203.

Reported by:	andreast@
Pointy hat to:	me
2017-12-19 15:50:46 +00:00
Konstantin Belousov
e44f4f3547 mlx5en: Avoid SFENCe on x86
The IA32 memory model guarantees that all writes are seen in the program
order.  Also, any access to the uncacheable memory flushes the store
buffers.  As the consequence, SFENCE instruction is (almost) never needed,
in particular, it is not needed to ensure the correct order of updates as
seen by a PCIe device.

Use atomic_thread_fence_rel() instead of wb() to only emit compiler barriers
on x86 there.  Other architectures get the right barrier instruction as
well.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-12-19 14:11:41 +00:00
Konstantin Belousov
200f8117ba Perform all accesses to uma_reclaim_needed using atomic(9) KPI.
Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13534
2017-12-19 10:06:55 +00:00
Konstantin Belousov
6f697994fd Use atomic_load(9) to read ppsinfo sequence numbers.
In this case volatile qualifiers enusre that a compiler does not
optimize the accesses out.

Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13534
2017-12-19 10:05:45 +00:00
Konstantin Belousov
30d4f9e888 Add atomic_load(9) and atomic_store(9) operations.
They provide relaxed-ordered atomic access semantic.  Due to the
FreeBSD memory model, the operations are syntaxical wrappers around
the volatile accesses.  The volatile qualifier is used to ensure that
the access not optimized out and in turn depends on the volatile
semantic as implemented by supported compilers.

The motivation for adding the operation is to help people coming from
other systems or knowing the C11/C++ standards where atomics have
special type and require use of the special access operations.  It is
still the case that FreeBSD requires plain load and stores of aligned
integer types to be atomic.

Suggested by:	jhb
Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13534
2017-12-19 09:59:20 +00:00
Warner Losh
5cf3cd108f When doing a dump, the scheduler is normally not running, so this
changed worked to capture dumps for me. However, the test for
SCHEDULER_STOPPED() isn't right. We can also call the dump routine
from ddb, in which case the scheduler is still running. This leads to
an assertion panic that we're sleeping when we shouldn't. Instead, use
the proper test for dumping or not. This brings us in line with other
places that do special things while we're doing polled I/O like this.

Noticed by: pho@
Differential Revision: https://reviews.freebsd.org/D13531
2017-12-19 04:13:22 +00:00
Conrad Meyer
e054cac74a Implement ACPI CPU support when Processor object is not present
By the ACPI standard (ACPI 5 chapter 8.4 Declaring Processors) Processors
can be implemented in 2 distinct ways:

* Through a Processor object type (which provides P_BLK)
* Through a Device object type

Prior to this change, the FreeBSD driver only supported the former.  AMD
Epyc / Poweredge systems we are testing both implement the latter only.  Add
the missing support.

Because P_BLK is not defined in the device object case, C-states entering
must be completely controlled via _CST methods rather than P_LVL2/3.

John Baldwin points out that ACPI 6.0 formally deprecates the Processor
keyword, so eventually processors will only be enumerated as Device objects.

Submitted by:	attilio
Reviewed by:	jhb, markj, Anton Rang <rang AT acm.org>
Relnotes:	maybe
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13457
2017-12-19 02:49:11 +00:00
Warner Losh
989c7f0b7c Although we only have one quirk at the moment, guard against the day
we have more than one by checking the actual quirk bit before delaying
the reset.

Noticed by: rpokala@
2017-12-18 20:11:21 +00:00
Warner Losh
ce1ec9c178 When we're disabling the nvme device, some drives have a controller
bug that requires 'hands off' for a period of time (2.3s) before we
check the RDY bit. Sicne this is a very odd quirk for a very limited
selection of drives, do this as a quirk. This prevented a successful
reset of the card when the card wedged.

Also, make sure that we comply with the advice from section 3.1.5 of
the 1.3 spec says that transitioning CC.EN from 0 to 1 when CSTS.RDY
is 1 or transitioning CC.EN from 1 to 0 when CSTS.RDY is 0 "has
undefined results". Short circuit when EN == RDY == desired state.

Finally, fail the reset if the disable fails. This will lead to a
failed device, which is what we want. (note: nda device needs
work for coping with a failed device).

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13389
2017-12-18 18:38:00 +00:00
Mark Johnston
6c7828a280 Avoid CPU migration in dtrace_gethrtime() on x86.
dtrace_gethrtime() may be called outside of probe context, and in
particular, from the DTRACEIOC_BUFSNAP handler.

Disable interrupts rather than using sched_pin() to help ensure that
we don't call any external functions when in probe context.

PR:		218452
MFC after:	1 week
2017-12-18 17:26:24 +00:00
Bruce Evans
d5d5600725 Also forgotten in the previous that removed the permanent double mapping
of low physical memory:

Update the comment about leaving the permanent mapping in place.  This
also improves the wording of the comment.  PTD 0 is still left alone
because it is fairly important that it was unmapped earlier, and the
comment now describes the unmapping of the other low PTDs that the code
actually does.

Reviewed by:	kib
2017-12-18 14:29:48 +00:00
Bruce Evans
2ba6fe0009 Remove the permanent double mapping of low physical memory and replace
it by a transient double mapping for the one instruction in ACPI wakeup
where it is needed (and for many surrounding instructions in ACPI resume).
Invalidate the TLB as soon as convenient after undoing the transient
mapping.  ACPI resume already has the strict ordering needed for this.

This fixes the non-trapping of null pointers and other garbage pointers
below NBPDR (except transiently).  NBPDR is quite large (4MB, or 2MB for
PAE).

This fixes spurious traps at the first instruction in VM86 bioscalls.
The traps are for transiently missing read permission in the first
VM86 page (physical page 0) which was just written to at KERNBASE in
the kernel.  The mechanism is unknown (it is not simply PG_G).

locore uses a similar but larger transient double mapping and needs
it for 2 instructions instead of 1.  Unmap the first PDE in it after
the 2 instructions to detect most garbage pointers while bootstrapping.
pmap_bootstrap() finishes the unmapping.

Remove the avoidance of the double mapping for a recently fixed special
case.  ACPI resume could use this avoidance (made non-special) to avoid
any problems with the transient double mapping, but no such problems
are known.

Update comments in locore.  Many were for old versions of FreeBSD which
tried to map low memory r/o except for special cases, or might have
allowed access to low memory via physical offsets.  Now all kernel
maps are r/w, and removal of of the double map disallows use of physical
offsets again.
2017-12-18 13:53:22 +00:00
Bruce Evans
4a5eb9ac99 Fix the undersupported option KERNLOAD, part 2: fix crashes in locore
when KERNLOAD is smaller than NBPDR (not the default) and PG_G is
enabled (the default if the CPU supports it).  This case has relatively
minor problems with coherency of the permanent double mapping, but the
fix in r167869 to improve coherency creates page tables with 3 different
errors so never worked.

The permanent double mapping is fundamentally broken and will be removed
soon.  It fundamentally breaks trapping for null pointers and requires
complications to avoid cache coherency bugs.  It is currently used for
only a single instruction in ACPI resume,

Many fixes VM86 and/or ACPI and/or the double map were attempted near
r1200000.  r167869 attempted to fix cache coherency bugs in an unusual
case, but the bugs were unreachable because older errors in page tables
caused a crash first.

This commit just makes r167869 work as intended.  Part 1 of these fixes
fixed the other errors, but also stopped mapping the PDE for KERNBASE
as a large page, so double mapping of this PDE only causes the same
problems as when KERNLOAD is the default.  Except for the problem of
trapping null pointers, r167869 could be used to fix these problems,
but it is inactive in usual cases.  The only known other problem is
that incoherent permissions for page 0 cause spurious traps in VM86
BIOS calls.

Reviewed by:	kib
2017-12-18 11:57:05 +00:00
Bruce Evans
3f21dc2985 Fix the undersupported option KERNLOAD, part 1: fix crashes in locore
when KERNLOAD is not a multiple of NBPDR (not the default) and PSE is
enabled (the default if the CPU supports it).  Addresses in PDEs must
be a multiple of NBPDR in the PSE case, but were not so in the crashing
case.

KERNLOAD defaults to NBPDR.  NBPDR is 4 MB for !PAE and 2 MB for PAE.
The default can be changed by editing i386/include/vmparam.h or using
makeoptions.  It can be changed to less than NBPDR to save real and
virtual memory at a small cost in time, or to more than NBPDR to waste
real and virtual memory.  It must be larger than 1 MB and a multiple of
PAGE_SIZE.  When it is less than NBPDR, it is necessarily not a multiple
of NBPDR.  This case has much larger bugs which will be fixed in part 2.

The fix is to only use PSE for physical addresses above <KERNLOAD
rounded _up_ to an NBPDR boundary>.  When the rounding is non-null,
this leaves part of the kernel not using large pages.  Rounding down
would avoid this pessimization, but would break setting of PAT bits
on i/o pages if it goes below 1MB.  Since rounding down always goes
below 1MB when KERNLOAD < NBPDR and the KERNLOAD > NBPDR case is not
useful, never round down.

Fix related style bugs (e.g., wrong literal values for NBPDR in comments).

Reviewed by:	kib
2017-12-18 09:32:56 +00:00
Ian Lepore
b5496277c7 Do not attempt to refill the TX fifo if there is no data left to transfer.
A comment in bcm_bsc_fill_tx_fifo() even lists sc_totlen > 0 as a
precondition for calling the routine.   I apparently forgot to make the
code do what my comment said.
2017-12-18 02:34:37 +00:00
Ian Lepore
ae99239461 Fix debugging output, fallout from something like s/read/readctl/g
while renaming variables in a previous change.
2017-12-18 00:15:53 +00:00
Mark Johnston
7a177c2d5e Unregister the ARC lowmem event handler earlier in arc_fini().
Otherwise a poorly timed lowmem event may attempt to acquire a destroyed
lock. Unregister the handler before destroying the ARC reclaim thread.

Reported by:	gjb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D13480
2017-12-17 18:21:40 +00:00
Andrey V. Elsukov
0e253fd12c Fix possible memory leak.
vxlan_ftable entries are sorted in ascending order, due to wrong arguments
order it is possible to stop search before existing element will be found.
Then new element will be allocated in vxlan_ftable_update_locked() and can
be inserted in the list second time or trigger MPASS() assertion with
enabled INVARIANTS.

PR:		224371
MFC after:	1 week
2017-12-16 14:36:21 +00:00