- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.
- Add support for hardware-driven, Receive Side Scaling (RSS) aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API supports rates in
the range from 1 to 4 Gbytes/s, which is suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.
- Add a rate limit function callback API to "struct ifnet", which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().
- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which indicates whether a network driver supports rate limiting.
- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.
- How rate limiting works:
1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate, which is then stored in the
socket structure in the kernel (see the sketch after this list). Later,
when packets are transmitted, the transmit path checks for rate changes.
A rate change triggers a non-blocking ifp->if_snd_tag_alloc() call to the
destination network interface, which then sets up a custom sendqueue with
the given rate limit. A "struct m_snd_tag" pointer is returned, which
serves as the "snd_tag" hint in the m_pkthdr of subsequently transmitted
mbufs.
2) When the network driver sees that "m->m_pkthdr.snd_tag" is non-NULL,
it moves the packets into the designated rate limited sendqueue given by
the snd_tag pointer. How the traffic in that sendqueue is shaped is left
to the individual drivers.
3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag does not match that
of the network interface. The network adapter frees the mbuf and returns
EAGAIN, which causes ip_output() to release and clear the send tag. On the
next ip_output() call, allocation of a new "snd_tag" is attempted.
4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.
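A minimal userspace sketch of step 1) above (the rate is assumed here to be
passed as a 32-bit value in bytes per second; see the updated setsockopt(2)
manual page for the authoritative type):

    #include <sys/socket.h>
    #include <stdint.h>
    #include <err.h>

    /* Ask for roughly 100 Mbytes/s pacing on an already connected socket. */
    static void
    set_pacing(int fd)
    {
        uint32_t rate = 100 * 1024 * 1024;    /* bytes per second */

        if (setsockopt(fd, SOL_SOCKET, SO_MAX_PACING_RATE,
            &rate, sizeof(rate)) == -1)
            warn("setsockopt(SO_MAX_PACING_RATE)");
    }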
Reviewed by: wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision: https://reviews.freebsd.org/D3687
Sponsored by: Mellanox Technologies
MFC after: 3 months
sources to return timestamps when SO_TIMESTAMP is enabled. Two additional
clock sources are:
o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME);
o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC).
In addition to this, this option provides a unified interface to get bintime
(equivalent of using SO_BINTIME), except that it is also supported with IPv6,
where SO_BINTIME has never been supported. The long term plan is to deprecate
SO_BINTIME and move everything to SO_TS_CLOCK.
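A sketch of the receive side under the new option (the value name
SO_TS_MONOTONIC, the control-message type SCM_MONOTONIC and the struct
timespec payload are assumptions used here for illustration):

    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <string.h>
    #include <time.h>

    /* Receive one datagram and fetch its kernel arrival timestamp. */
    static ssize_t
    recv_with_ts(int s, void *buf, size_t len, struct timespec *ts)
    {
        char cbuf[CMSG_SPACE(sizeof(struct timespec))];
        struct iovec iov = { .iov_base = buf, .iov_len = len };
        struct msghdr msg;
        struct cmsghdr *cm;
        ssize_t n;
        int on = 1, clk = SO_TS_MONOTONIC;

        setsockopt(s, SOL_SOCKET, SO_TIMESTAMP, &on, sizeof(on));
        setsockopt(s, SOL_SOCKET, SO_TS_CLOCK, &clk, sizeof(clk));

        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = cbuf;
        msg.msg_controllen = sizeof(cbuf);
        if ((n = recvmsg(s, &msg, 0)) == -1)
            return (-1);
        for (cm = CMSG_FIRSTHDR(&msg); cm != NULL; cm = CMSG_NXTHDR(&msg, cm))
            if (cm->cmsg_level == SOL_SOCKET && cm->cmsg_type == SCM_MONOTONIC)
                memcpy(ts, CMSG_DATA(cm), sizeof(*ts));
        return (n);
    }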
The idea for this enhancement was briefly discussed in the Net session
during the dev summit in Ottawa last June, and the general input was positive.
This change is believed to benefit network benchmarks/profiling as well
as other scenarios where precise time of arrival measurement is necessary.
There are two regression test cases as part of this commit: one extends the
unix domain socket test code (unix_cmsg) to cover the new SCM_XXX types, and
the other implements a completely new test case which exchanges UDP packets
between two processes using both the conventional method (i.e. calling
clock_gettime(2) before recv(2) and after send(2)) and setsockopt()+recv()
in the receive path. The resulting delays are checked for sanity for all
supported clock types.
Reviewed by: adrian, gnn
Differential Revision: https://reviews.freebsd.org/D9171
gtaskqueue bits at SI_SUB_INIT_IF instead of waiting until SI_SUB_SMP,
which is far too late.
Add an assertion in taskqgroup_attach() to catch startup initialization
failures in the future.
Reported by: kib bde
Replace archaic "busses" with modern form "buses."
Intentionally excluded:
* Old/random drivers I didn't recognize
* Old hardware in general
* Use of "busses" in code as identifiers
No functional change.
http://grammarist.com/spelling/buses-busses/
PR: 216099
Reported by: bltsrc at mail.ru
Sponsored by: Dell EMC Isilon
dropped then reacquired due to using M_WAITOK, which opens a window in
which the tty device can disappear. Check for this and return ENXIO
back up the call chain so that callers can cope.
This closes a race where TF_GONE would get set while buffers were being
allocated as part of ttydev_open(), causing a subsequent call to
ttydevsw_modem() later in ttydev_open() to assert.
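A hedged sketch of the pattern (the buffer and malloc type names below are
illustrative, not the committed code):

    tty_unlock(tp);
    buf = malloc(bufsize, M_TTYBUF, M_WAITOK);    /* may sleep */
    tty_lock(tp);
    if (tty_gone(tp)) {
        /* The device was revoked while we slept; unwind. */
        free(buf, M_TTYBUF);
        return (ENXIO);
    }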
Reported by: pho
Reviewed by: kib
drain timeout handling to historical FreeBSD behavior.
The primary reason for these changes is the need to have tty_drain() call
ttydevsw_busy() at some reasonable sub-second rate, to poll hardware that
doesn't signal an interrupt when the transmit shift register becomes empty
(which includes virtually all USB serial hardware). Such hardware hangs
in a ttyout wait, because it never gets an opportunity to trigger a wakeup
from the sleep in tty_drain() by calling ttydisc_getc() again, after
handing the last of the buffered data to the hardware.
While researching the history of changes to tty_drain() I stumbled across
some email describing the historical BSD behavior of tcdrain() and close()
on serial ports, and the ability of comcontrol(1) to control timeout
behavior. Using that and some advice from Bruce Evans as a guide, I've
put together these changes to implement the hardware polling and restore
the historical timeout behaviors...
- tty_drain() now calls ttydevsw_busy() in a loop at 10 Hz to accommodate
hardware that requires polling for busy state.
- The "new historical" behavior for draining during close(2) is retained:
the drain timeout is "1 second without making any progress". When the
1-second timeout expires, if the count of bytes remaining in the tty
layer buffer is smaller than last time, the timeout is extended for
another second. Unfortunately, the same logic cannot be extended all
the way down to the hardware, because the interface to that layer is a
simple busy/not-busy indication.
- Due to the previous point, an application that needs a guarantee that
all data has been transmitted must use TIOCDRAIN/tcdrain(3) before
calling close(2).
- The historical behavior of honoring the drainwait setting for TIOCDRAIN
(used by tcdrain(3)) is restored.
- The historical kern.drainwait sysctl to control the global default
drainwait time is restored, but is now named kern.tty_drainwait.
- The historical default drainwait timeout of 300 seconds is restored.
- Handling of TIOCGDRAINWAIT and TIOCSDRAINWAIT ioctls is restored
(this also makes the comcontrol(1) drainwait verb work again).
- Manpages are updated to document these behaviors.
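A usage sketch tying the points above together (TIOCSDRAINWAIT is assumed
here to take the timeout as an int, in seconds):

    #include <sys/ioctl.h>
    #include <termios.h>
    #include <unistd.h>

    /* Allow this port up to 30 seconds to drain, then wait for buffered
     * output to actually leave the hardware before closing. */
    static void
    drain_and_close(int fd)
    {
        int drainwait = 30;

        (void)ioctl(fd, TIOCSDRAINWAIT, &drainwait);
        (void)tcdrain(fd);    /* blocks until drained or drainwait expires */
        (void)close(fd);
    }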
Reviewed by: bde (prior version)
This was meant to be used by a future FORTIFY_SOURCE implementation.
This is probably for the better: FORTIFY_SOURCE and this particular GCCism were
never well supported by clang or other compilers. Furthermore, the technology
has long since been superseded by static checkers, sanitizers, or
even just the strong stack protector that was enabled by default.
Drop __gnu_inline to avoid cluttering the headers.
MFC after: 5 days
sched_*(2) syscalls might not be available at runtime. Defining this
constant as zero directs POSIX-compliant code to call sysconf(3) to
detect the feature at runtime, and forces libc sysconf(3) to ask the
kernel.
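For example (the affected constant is not named above; _POSIX_PRIORITY_SCHEDULING
is used here purely as an illustration), compliant code would do something like:

    #include <unistd.h>

    /* A compile-time value of 0 means "maybe"; ask the kernel at runtime. */
    static int
    have_sched_syscalls(void)
    {
    #if defined(_POSIX_PRIORITY_SCHEDULING) && _POSIX_PRIORITY_SCHEDULING > 0
        return (1);
    #else
        return (sysconf(_SC_PRIORITY_SCHEDULING) > 0);
    #endif
    }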
Noted by: ngie
Reviewed by: jilles, ngie
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D9055
Add a MSG_MORETOCOME message flag. When this flag is set, the sosend*
routines set PRUS_MORETOCOME when invoking the protocol send method. The
aio worker tasks for sending on a socket set this flag when there are
additional write jobs waiting on the socket buffer.
Reviewed by: adrian
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D8955
Right now the size of the structure is 472 bytes on amd64, which is
already large, and stack allocations of it are undesirable. With the ino64
work, MNAMELEN is increased to 1024, which will make it impossible to have
struct statfs on the stack.
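In-kernel consumers are therefore expected to move such allocations to the
heap; a minimal sketch (M_TEMP is used here only as a placeholder malloc type):

    struct statfs *sfp;
    int error;

    sfp = malloc(sizeof(struct statfs), M_TEMP, M_WAITOK);
    error = VFS_STATFS(mp, sfp);
    /* ... use *sfp ... */
    free(sfp, M_TEMP);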
Extracted from: ino64 work by gleb
Discussed with: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Upon each execve, we allocate a KVA range for use in copying data to the
new image. Pages must be faulted into the range, and when the range is
freed, the backing pages are freed and their mappings are destroyed. This
is a lot of needless overhead, and the exec_map management becomes a
bottleneck when many CPUs are executing execve concurrently. Moreover, the
number of available ranges is fixed at 16, which is insufficient on large
systems and potentially excessive on 32-bit systems.
The new allocator reduces overhead by making exec_map allocations
persistent. When a range is freed, pages backing the range are marked clean
and made easy to reclaim. With this change, the exec_map is sized based on
the number of CPUs.
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D8921
Instead of spuriously re-reading the lock value, read it once.
This change also has the side effect of fixing a performance bug:
on a failed _mtx_obtain_lock, it was possible that the re-read would find
the lock unowned, yet the primitive would still make a trip
through the turnstile code.
This is a diff reduction against a variant which uses atomic_fcmpset.
Discussed with: jhb (previous version)
Tested by: pho (previous version)
On systems without a configured swap device, an attempt to launder pages
from a swap object will always fail and result in the page being
reactivated. This means that the page daemon will continuously scan pages
that can never be evicted. With this change, anonymous pages are instead
moved to PQ_UNSWAPPABLE after a failed laundering attempt when no swap
devices are configured. PQ_UNSWAPPABLE is not scanned unless a swap device
is configured, so unreferenced unswappable pages are excluded from the page
daemon's workload.
Reviewed by: alc
Add two new qualifiers for use by the static checkers:
_Nonnull
The _Nonnull nullability qualifier indicates that null is not a meaningful
value for a value of the _Nonnull pointer type.
_Nullable
The _Nullable nullability qualifier indicates that a value of the
_Nullable pointer type can be null.
These were introduced in Clang 3.7. For more information, see:
http://clang.llvm.org/docs/AttributeReference.html#nonnull
We add these now without using them so that the GCC ports have time to
pick up the header change.
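An illustrative (hypothetical) prototype showing where the qualifiers are placed:

    /* The path may not be NULL; the return value may legitimately be NULL. */
    char * _Nullable find_suffix(const char * _Nonnull path);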
Hinted by: Android Bionic libc [1]
Also seen in: Apple's Libc-1158.20.4
[1] baa2a973bd
This argument is not a bitmask of flags, but only accepts a single value.
Fail with EINVAL if an invalid value is passed in this argument. Rename
the 'flags' argument of getmntinfo(3) to 'mode' as well, to match.
This is a followup to r308088.
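For example, a conforming call passes exactly one of MNT_WAIT or MNT_NOWAIT;
anything else is now rejected with EINVAL:

    #include <sys/param.h>
    #include <sys/ucred.h>
    #include <sys/mount.h>
    #include <err.h>
    #include <stdio.h>

    struct statfs *mntbuf;
    int i, n;

    n = getmntinfo(&mntbuf, MNT_NOWAIT);    /* not MNT_WAIT | MNT_NOWAIT */
    if (n == 0)
        err(1, "getmntinfo");
    for (i = 0; i < n; i++)
        printf("%s on %s\n", mntbuf[i].f_mntfromname, mntbuf[i].f_mntonname);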
Reviewed by: kib
MFC after: 1 month
closed by r310302 for knote().
If the KN_INFLUX | KN_SCAN flags are set for the knote passed to knote() or
knote_fork(), i.e. the knote is being scanned, we might erroneously clear
KN_INFLUX when finishing the notification. For normal knote() it was fixed
in r310302 simply by remembering the fact that we do not own
KN_INFLUX, since there we own the knlist lock and the scan thread cannot clear
KN_INFLUX until we drop the lock. For knote_fork() the situation is
more complicated: we must drop the knlist lock, a.k.a. the process lock, since
we need to register new knotes.
Change KN_INFLUX into a counter and allow shared ownership of the
in-flux state between scan and knote_fork() or knote(). Both in-flux
setters need to ensure that the knote is not dropped in parallel. An added
assertion that kn_influx == 1 in knote_drop() verifies that the in-flux
state is not shared when the knote is destroyed.
Since the KBI of struct knote is changed by the addition of the int
kn_influx field, reorder kn_hook and kn_hookid to fill the pad on LP64
arches [1]. This keeps sizeof(struct knote) at the same 128 bytes it
was before the addition of kn_influx, on amd64.
Reviewed by: markj
Suggested by: markj [1]
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D8898
There is no need to do two allocations per kqueue timer. Gather all
data needed by the timer callout into a single structure and allocate it
at once.
Use the structure to preserve the result of timer2sbintime(), so that
repeated 64-bit calculations are not performed in the callout.
Remove tautological casts.
Remove now unused p_nexttime [1].
Noted by: markj [1]
Reviewed by: markj (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
X-MFC note: do not remove p_nexttime
Differential revision: https://reviews.freebsd.org/D8901
This improves gettimeofday() performance roughly six-fold, as measured by
tools/tools/syscall_timing.
Reviewed by: kib
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8789
Without this change, every individual FEATURE() declaration would have
an individual metric in Prometheus. Though this wouldn't be harmful, it
would look very cluttered.
By letting it use a single metric with the name of the feature attached
as a label, it also becomes easier to search, as you can apply regex
matching, etc.
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D8775
I'm currently working on writing a metrics exporter for the Prometheus
monitoring system to provide access to sysctl metrics. Prometheus and
sysctl have some structural differences:
- sysctl is a tree of string component names.
- Prometheus uses a flat namespace for its metrics, but allows you to
attach labels with values to them, so that you can do aggregation.
An initial version of my exporter simply translated
hw.acpi.thermal.tz1.temperature
to
sysctl_hw_acpi_thermal_tz1_temperature_celcius
while we should ideally have
sysctl_hw_acpi_thermal_temperature_celcius{thermal_zone="tz1"}
allowing you to graph all thermal zones on a system in one go.
The change presented in this commit adds support for accomplishing this,
by providing the ability to attach labels to nodes. In the example I
gave above, the label "thermal_zone" would be attached to "tz1". As this
is a feature that will only be used very rarely, I decided to not change
the KPI too aggressively.
Discussed on: hackers@
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D8775
This allows blind increment of relevant counters, which under contention
is cheaper than inc-not-zero loops, at least on amd64.
Use it in some of the places which are guaranteed to see already active
vnodes.
Reviewed by: kib (previous version)
Changes include modifications in kernel crash dump routines, dumpon(8) and
savecore(8). A new tool called decryptcore(8) was added.
A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump
configuration in the diocskerneldump_arg structure to the kernel.
The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for
backward ABI compatibility.
dumpon(8) generates a one-time random symmetric key and encrypts it using
an RSA public key in capability mode. Currently only AES-256-CBC is supported,
but EKCD was designed so that support for other algorithms can be added in the
future. The public key is chosen using the -k flag. The dumpon rc(8) script can
do this automatically during startup using the dumppubkey rc.conf(5) variable.
Once the keys are calculated, dumpon sends them to the kernel via the
DIOCSKERNELDUMP I/O control.
When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random
IV and sets up the key schedule for the specified algorithm. Each time the
kernel tries to write a crash dump to the dump device, the IV is replaced by
a SHA-256 hash of its previous value. This is intended to make possible
differential cryptanalysis harder, since multiple crash dumps can be written
without a reboot by repeating the following commands:
# sysctl debug.kdb.enter=1
db> call doadump(0)
db> continue
# savecore
A kernel dump key consists of an algorithm identifier, an IV and an encrypted
symmetric key. The kernel dump key size is included in a kernel dump header.
The size is an unsigned 32-bit integer and it is aligned to the block size.
The header structure is 512 bytes to match the block size, so the panic string
had to be made 4 bytes shorter in order to add the new field to the header
structure.
If the kernel dump key size in the header is nonzero it is assumed that the
kernel dump key is placed after the first header on the dump device and the core
dump is encrypted.
Separate functions were implemented to write the kernel dump header and the
kernel dump key as they need to be unencrypted. The dump_write function encrypts
data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps
are not supported due to the way they are constructed, which makes it impossible
to use CBC mode for encryption. It should also be noted that textdumps don't
contain sensitive data by design, as the user decides what information should be
dumped.
savecore(8) writes the kernel dump key to a key.# file if its size in the header
is nonzero. # is the number of the current core dump.
decryptcore(8) decrypts the core dump using a private RSA key and the kernel
dump key. This is performed by a child process in capability mode.
If the decryption was not successful the parent process removes a partially
decrypted core dump.
A description of how to encrypt crash dumps was added to the decryptcore(8),
dumpon(8), rc.conf(5) and savecore(8) manual pages.
EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU.
The feature still has to be tested on arm and arm64, as it wasn't possible to run
FreeBSD there due to problems with QEMU emulation and a lack of hardware.
Designed by: def, pjd
Reviewed by: cem, oshogbo, pjd
Partial review: delphij, emaste, jhb, kib
Approved by: pjd (mentor)
Differential Revision: https://reviews.freebsd.org/D4712
kinfo_proc::ki_tdname is three characters shorter than
thread::td_name. Add a ki_moretdname field for these three
extra characters. Add the new field to kinfo_proc32, as well.
Update all in-tree consumers to read the new field and assemble
the full name, except for lldb's HostThreadFreeBSD.cpp, which
I will handle separately. Bump __FreeBSD_version.
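Consumers reassemble the full thread name roughly like this (kp is a
filled-in struct kinfo_proc pointer; the buffer size is just ample for
illustration):

    char tdname[64];

    snprintf(tdname, sizeof(tdname), "%s%s",
        kp->ki_tdname, kp->ki_moretdname);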
Reviewed by: kib
MFC after: 1 week
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D8722
buf_ring contains an assert that checks whether an item being
enqueued already exists on the ring. There is a subtle bug in
this assert. An item can be returned by a peek() function and
freed, and then the consumer thread can be preempted before
calling advance(). If this happens the item appears to still be
on the queue, but another thread may allocate the item from the
free pool and wind up trying to enqueue it again, causing the
assert to trigger incorrectly.
Fix this by skipping the head of the consumer's portion of the
ring, as this index is what will be returned by peek().
Sponsored by: Dell EMC Isilon
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D8685
Reviewed by: hselasky
the vnode is inactivated. This conflicts with the nullfs caching,
which keeps the upper vnode around and, as a consequence, keeps the use
reference on the lower vnode.
Add a filesystem flag to request nullfs to not cache when mounted over
that filesystem, and set the flag for nfs v4 mounts.
Reported by: asomers
Reviewed by: rmacklem
Tested by: asomers, rmacklem
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The swap pager enqueues laundered pages near the head of the inactive queue
to avoid another trip through LRU before reclamation. This change adds
support for this behaviour to the vnode pager and makes use of it in UFS and
ext2fs. Some ioflag handling is consolidated into a common subroutine so
that this support can be easily extended to other filesystems which make use
of the buffer cache. No changes are needed for ZFS since its putpages
routine always undirties the pages before returning, and the laundry
thread requeues the pages appropriately in this case.
Reviewed by: alc, kib
Differential Revision: https://reviews.freebsd.org/D8589
longer used. More precisely, they are always zero because the code that
decremented and incremented them no longer exists.
Bump __FreeBSD_version to mark this change.
Reviewed by: kib, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8583
do any speculation about readahead, and use exactly the amount of readahead
specified by the user. E.g. setting SF_FLAGS(0, SF_USER_READAHEAD) guarantees
that no readahead at all will be performed.
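For example, sending a whole file with no readahead at all (the file and
socket descriptors are assumed to be set up elsewhere):

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <err.h>

    off_t sbytes;

    if (sendfile(filefd, sockfd, 0, 0, NULL, &sbytes,
        SF_FLAGS(0, SF_USER_READAHEAD)) == -1)
        err(1, "sendfile");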