The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing
code with fsync(2). For all filesystems, this commit provides the
implementation which delegates the work of VOP_FDATASYNC() to
VOP_FSYNC(). This is functionally correct but not efficient.
This is not yet POSIX-compliant implementation, because it does not
ensure that queued AIO requests are completed before returning.
Reviewed by: mckusick
Discussed with: avg (ZFS), jhb (AIO part)
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D7471
CloudABI executables already provide support for passing in vDSOs. This
functionality is used by the emulator for OS X to inject system call
handlers. On FreeBSD, we could use it to optimize calls to
gettimeofday(), etc.
Though I don't have any plans to optimize any system calls right now,
let's go ahead and already pass in a vDSO. This will allow us to
simplify the executables, as the traditional "syscall" shims can be
removed entirely. It also means that we gain more flexibility with
regards to adding and removing system calls.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D7438
Our mprotect() function seems to take a "const void *" address to the
pages whose permissions need to be adjusted. POSIX uses "void *". Simply
stick to the POSIX one to prevent us from writing unportable code.
PR: 211423 (exp-run)
Tested by: antoine@ (Thanks!)
Any sensible workflow will include a revision control system from which
to restore the old files if required. In normal usage, developers just
have to clean up the mess.
Reviewed by: jhb
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D7353
and getboottimebin(9) KPI. Change consumers of boottime to use the
KPI. The variables were renamed to avoid shadowing issues with local
variables of the same name.
Issue is that boottime* should be adjusted from tc_windup(), which
requires them to be members of the timehands structure. As a
preparation, this commit only introduces the interface.
Some uses of boottime were found doubtful, e.g. NLM uses boottime to
identify the system boot instance. Arguably the identity should not
change on the leap second adjustment, but the commit is about the
timekeeping code and the consumers were kept bug-to-bug compatible.
Tested by: pho (as part of the bigger patch)
Reviewed by: jhb (same)
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
X-Differential revision: https://reviews.freebsd.org/D7302
It looks like our "struct shmid_ds::shm_nattch" deviates from the
standard in the sense that it is a signed integer, whereas POSIX
requires that it is unsigned, having a special type shmatt_t.
Patch up our native and 32-bit copies to use a new shmatt_t that is an
unsigned integer. As it's unsigned, we can relax the comparisons that
are performed on it. Leave the Linux, iBCS2, etc. copies of the
structure alone.
Reviewed by: ngie
Differential Revision: https://reviews.freebsd.org/D6655
FreeBSD support NX bit on X86_64 processors out of the box, for i386 emulation
use READ_IMPLIES_EXEC flag, introduced in r302515.
While here move common part of mmap() and mprotect() code to the files in compat/linux
to reduce code dupcliation between Linuxulator's.
Reported by: Johannes Jost Meixner, Shawn Webb
MFC after: 1 week
XMFC with: r302515, r302516
In Linux if this flag is set, PROT_READ implies PROT_EXEC for mmap().
Linux/i386 set this flag automatically if the binary requires executable stack.
READ_IMPLIES_EXEC flag will be used in the next Linux mmap() commit.
[1] Remove unneeded sockaddr conversion before kern_recvit() call as the from
argument is used to record result (the source address of the received message) only.
[2] In Linux the type of msg_namelen member of struct msghdr is signed but native
msg_namelen has a unsigned type (socklen_t). So use the proper storage to fetch fromlen
from userspace and than check the user supplied value and return EINVAL if it is less
than 0 as a Linux do.
Reported by: Thomas Mueller <tmueller at sysgo dot com> [1]
Reviewed by: kib@
Approved by: re (gjb, kib)
MFC after: 3 days
Mark the pipe() system call as COMPAT10.
As of r302092 libc uses pipe2() with a zero flags value instead of pipe().
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D6816
As of r302092 libc uses pipe2() with a zero flags value instead of pipe().
Commit with regenerated files and implementation to follow.
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D6816
threads, to make it less confusing and using modern kernel terms.
Rename the functions to reflect current use of the functions, instead
of the historic KSE conventions:
cpu_set_fork_handler -> cpu_fork_kthread_handler (for kthreads)
cpu_set_upcall -> cpu_copy_thread (for forks)
cpu_set_upcall_kse -> cpu_set_upcall (for new threads creation)
Reviewed by: jhb (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (hrs)
Differential revision: https://reviews.freebsd.org/D6731
the loop in linprocfs_doswaps() is useless.
List of the registered filesystems is protected by vfsconf_sx,
not by the Giant. Adjust linprocfs_dofilesystems() correspondingly.
Approved by: re (delphij), des (linprocfs maintainer)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Linux module parameters have a permissions value. If any write bits
are set we are allowed to modify the module parameter runtime. Reflect
this when creating the static SYSCTL nodes.
Sponsored by: Mellanox Technologies
MFC after: 1 week
Bool module parameters are no longer supported, because there is no
equivalent in FreeBSD.
There are two macros available which control the behaviour of the
LinuxKPI module parameters:
- LINUXKPI_PARAM_PARENT allows the consumer to set the SYSCTL parent
where the modules parameters will be created.
- LINUXKPI_PARAM_PREFIX defines a parameter name prefix, which is
added to all created module parameters.
Sponsored by: Mellanox Technologies
MFC after: 1 week
run after a panic(). This for example allows a LinuxKPI based graphics
stack to receive prints during a panic.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Revert the change for seed(0) in r300384. I misunderstood the standard
and while our random() implementation in libkern may be improved, it
handles the seed(0) case fine.
Pointed out by: bde, ache
- Revert r300377: The implementation claims to return a value
within the range. [1]
- Adjust the value for the case of a zero seed, whihc according
to standards should be equivalent to a seed of value 1.
Pointed out by: cem
This is a long standing problem: our random() function returns an
unsigned integer but the rand provided by ndis(4) returns an int.
Scale it down.
MFC after: 2 weeks
In ndis(4) we expose a rand() function that was constantly reseeding
with a time depending function every time it was called. This
essentially broke the reasoning behind seeding, and rendered srand()
a no-op.
Keep it simple, just use random() and srandom() as it's meant to work.
It would have been tempting to just go for arc4random() but we
want to mimic Microsoft, and we don't need crypto-grade randomness
here.
PR: 209616
MFC after: 2 weeks
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
at it use NULL for some pointer checks.
Bump the FreeBSD version to force recompilation of all kernel modules
due to a structure size change.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
streamline the rest of the xxx_to_jiffies() functions to have a
constant 64-bit argument and use identical range checks for the
result.
Specifically preserve msecs_to_jiffies(0) returning 0. See r282743 for
further details.
MFC after: 1 week
Sponsored by: Mellanox Technologies
error code checks might fail. ERESTART is in the BSD world defined as
-1. While at it add more Linux error codes.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Linux requires that all IOCTL data resides in userspace. FreeBSD
always moves the main IOCTL structure into a kernel buffer before
invoking the IOCTL handler and then copies it back into userspace,
before returning. Hide this difference in the "linux_copyin()" and
"linux_copyout()" functions by remapping userspace addresses in the
range from 0x10000 to 0x20000, to the kernel IOCTL data buffer.
It is assumed that the userspace code, data and stack segments starts
no lower than memory address 0x400000, which is also stated by "man 1
ld", which means any valid userspace pointer can be passed to regular
LinuxKPI handled IOCTLs.
Bump the FreeBSD version to force recompilation of all kernel modules.
Discussed with: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
"current" inside all LinuxKPI file operation callbacks. The "current"
is frequently used for various debug prints, printing the thread name
and thread ID for example.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Ensure the actual poll result is returned by the "linux_file_poll()"
function instead of zero which means no data is available.
MFC after: 3 days
Sponsored by: Mellanox Technologies
The "len" parameter is uint32_t, indexing it with an int may
end up in a signed integer overflow.
strlen(3) returns an integer of size_t so the corresponding index should
have that size.
MFC after: 1 week
This is a minor follow-up to r297422, prompted by a Coverity warning. (It's
not a real defect, just a code smell.) OSD slot array reservations are an
array of pointers (void **) but were cast to void* and back unnecessarily.
Keep the correct type from reservation to use.
osd.9 is updated to match, along with a few trivial igor fixes.
Reported by: Coverity
CID: 1353811
Sponsored by: EMC / Isilon Storage Division
undefined symbol svr4_delete_socket which was moved from streams to the svr4 module
in r160558 that created a two-way dependency between them.
PR: 208464
Submitted by: Kristoffer Eriksson
Reported by: Kristoffer Eriksson
MFC after: 2 week