Implement Linux-variant of MSG_TRUNC input flag used in recv(), recvfrom() and recvmsg().
Posix defines MSG_TRUNC as an output flag, indicating packet/datagram truncation.
Linux extended it a while (~15+ years) ago to act as input flag,
resulting in returning the full packet size regarless of the input
buffer size.
It's a (relatively) popular pattern to do recvmsg( MSG_PEEK | MSG_TRUNC) to get the
packet size, allocate the buffer and issue another call to fetch the packet.
In particular, it's popular in userland netlink code, which is the primary driving factor of this change.
This commit implements the MSG_TRUNC support for SOCK_DGRAM sockets (udp, unix and all soreceive_generic() users).
PR: kern/176322
Reviewed by: pauamma(doc)
Differential Revision: https://reviews.freebsd.org/D35909
MFC after: 1 month
As per the updated FreeBSD copyright template. These were unambiguous
cases where the Foundation was the only listed copyright holder.
Sponsored by: The FreeBSD Foundation
When pselect is passed a null pointer for the signal mask, the standard
says it shall behave like select (except for the different timeout
arg). Make a note of that here.
Sponsored by: Netflix
Instead of returning EMSGSIZE pass the error code from fdallocn() directly
to userland. That would be EMFILE, which makes much more sense. This
error code is not listed in the specification[1], but the specification
doesn't cover such edge case at all. Meanwhile the specification lists
EMSGSIZE as the error code for invalid value of msg_iovlen, and FreeBSD
follows that, see sys_recmsg(). Differentiating these two cases will make
a developer/admin life much easier when debugging.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recvmsg.html
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35640
named(8) hasn't been in base for some time. Remove all references to it in
manual pages.
Approved by: manpages (Pau Amma)
Differential Revision: https://reviews.freebsd.org/D35586
This additional socket was created in 2e89951b6f and 240d5a9b1c
to try workaround problems with classic PF_UNIX/SOCK_DGRAM sockets.
With recent changes in kernel this trick is no longer needed, so the
trick can be reverted.
In syslogd(8) we would still create the socket for the next several
major releases for compatibility.
Differential revision: https://reviews.freebsd.org/D35305
The "/dev/log" socket existed in pre-FreeBSD times. Later it was
substituted to a compatibility symlink. The symlink creation was
deprecated in FreeBSD 10.2 and 9-STABLE.
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35304
Add documentation for gethostbyname_r, gethostbyname2_r and gethostbyaddr_r
Create proper MLINKs for the new functions.
PR: 249154
Reported by: asomers@
Approved by: manpages (0mp@), Pau Amma
Differential Revision: https://reviews.freebsd.org/D30469
Enhanced REP MOVSB feature of CPUs starting from Ivy Bridge makes
REP MOVSB the fastest way to copy memory in most of cases. However
Intel Optimization Reference Manual says: "setting the DF to force
REP MOVSB to copy bytes from high towards low addresses will expe-
rience significant performance degradation". Measurements on Intel
Cascade Lake and Alder Lake, same as on AMD Zen3 show that it can
drop throughput to as low as 2.5-3.5GB/s, comparing to ~10-30GB/s
of REP MOVSQ or hand-rolled loop, used for non-ERMS CPUs.
This patch keeps ERMS use for forward ordered memory copies, but
removes it for backward overlapped moves where it does not work.
This is just a cosmetic sync with kernel, since libc does not use
ERMS at this time.
Reviewed by: mjg
MFC after: 2 weeks
Commit e0e0323354 removed compat stubs for
kernels that did not have futimens() and utimensat() system calls, but
removed the documentation for them in the manual page only partially.
Remove the rest of the documentation of the compatibility code.
MFC after: 1 week
POSIX deprecated getpagesize(3). The portable way to obtain the page
size is `sysconf(_SC_PAGESIZE)`.
Reviewed by: cperciva (earlier), imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35352
On success gnu libc sched_getaffinity() should return 0, unlike underlying
Linux syscall which returns the size of CPU mask copied to user.
PR: 263939
MFC after: 2 weeks
Linux has more tolerant checks of the user supplied cpuset_t's.
Minimum cpuset_t size that the Linux kernel permits in case of
getaffinity() is the maximum CPU id, present in the system / NBBY,
the maximum size is not limited.
For setaffinity(), Linux does not limit the size of the user-provided
cpuset_t, internally using only the meaningful part of the set, where
the upper bound is the maximum CPU id, present in the system, no larger
than the size of the kernel cpuset_t.
Unlike FreeBSD, Linux ignores high bits if set in the setaffinity(),
so clear it in the sched_setaffinity() and Linuxulator itself.
Reviewed by: Pau Amma (man pages)
In collaboration with: jhb
Differential revision: https://reviews.freebsd.org/D34849
MFC after: 2 weeks
There are some sections which could be improved
and work to do so is on going. The work will be
covered via 'X-MFC-WITH' commits.
Obtained from: OpenBSD
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D34759
The time() system call first appeared in Version 1 AT&T UNIX. Through
the Version 3 AT&T UNIX, it returned 60 Hz ticks since an epoch that
changed occasionally, because it was a 32-bit value that overflowed in a
little over 2 years.
In Version 4 AT&T UNIX the granularity of the return value was reduced to
whole seconds, delaying the aforementioned overflow until 2038.
Version 7 AT&T UNIX introduced the ftime() system call, which returned
time at a millisecond level, though retained the gtime() system call
(exposed as time() in userland). time() could have been implemented as a
wrapper around ftime(), but that wasn't done.
4.1cBSD implemented a higher-precision time function gettimeofday() to
replace ftime() and reimplemented time() in terms of that.
Since FreeBSD 9 the implementation of time() uses
clock_gettime(CLOCK_SECOND) instead of gettimeofday() for performance
reasons.
With most valuable input from Warner (imp@).
Reviewed by: 0mp, jilles, imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34751
Problem is that open(O_PATH) on nullfs -o nocache is broken then,
because there is no reference on the vnode after the open syscall exits.
Reported and tested by: ambrisko
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The error control was not properly implemented. "changelist" is const, hence
event.flags is never changed by the syscall.
PR: 196844
Reported by: eugen@
Reviewed by: PauAmma <pauamma@gundo.com>
Approved by: eugen@
Fixes: 8c231786f0
To be more compatible to IEEE Std 1003.1-2008 (“POSIX.1”).
Reviewed by: mjg, Pau Amma (doc)
Differential revision: https://reviews.freebsd.org/D34680
MFC after: 2 weeks
RFC 2533 refers to 'A Syntax for Describing Media Feature Sets',
which is wrong since the correct reference should be
RFC 2553 'Basic Socket Interface Extensions for IPv6'.
Obtained from: OpenBSD
MFC after: 1 week
Previously, such errors were not distinguished from the end-of-directory
condition.
With improvements from Mahmoud Abumandour <ma.mandourr@gmail.com>.
Reviewed by: markj
PR: 262038
MFC after: 2 weeks
Turns out clang converts "memcmp(foo, bar, len) == 0" and similar to
bcmp calls.
Reviewed by: emaste (previous version), jhb (previous version)
Differential Revision: https://reviews.freebsd.org/D34673
To support cc -pg on arm64 we need to implement .mcount. As clang and
gcc think it is function like it just needs to load the arguments
to _mcount and call it.
On gcc the first argument is passed in x0, however this is missing on
clang so we need to load it from the stack. As it's the caller return
address this will be at a known location.
PR: 262709
Reviewed by: emaste (earlier version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34634
Preferably bcmp would just alias memcmp but there is build magic which
makes this problematic.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D28846
__sfvwrite() advances the pointer before calling fflush. If fflush()
fails, it is not enough to roll back inside it, because we cannot know
how much was advanced by the caller.
Reported by: Peter <pmc@citylink.dinoex.sub.org>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Fixes: 86a16ada1e
time() is now implemented using clock_gettime(2) instead of
gettimeofday(2).
Reviewed by: debdrup
Fixes: 358ed16f75 Use clock_gettime(CLOCK_SECOND)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34438
During distributeworld we call distribute on subdirectories, which in
turn calls installconfig. However, this recursive installconfig call
appends the distribution name (in these cases, "base") to DESTDIR. For
install(1) this works fine as its -D argument comes from the top-level
Makefile.inc1, which passes the original DESTDIR, thereby resulting in
the METALOG entry having the distribution name as a prefix representing
its true installed path relative to the root, but for the hand-rolled
entries they do not use install(1) and thus do not have access to what
the original DESTDIR was, resulting in the METALOG missing this prefix.
Thus, pass down the name of the distribution via a new variable DISTBASE
(chosen as Makefile.inc1 already uses that to convey this exact same
information to etc's distrib-dirs during distributeworld) and prepend
this to the handful of manually-generated METALOG entries. For the
installworld case this variable will be empty and so this behaves as
before.
Note that we need to be careful to avoid double slashes in the METALOG;
distributeworld uses find | awk to split the single METALOG up into
multiple dist.meta files, and this relies on the paths in the METALOG
having the exact prefix ./dist (or ./dist/usr/lib/debug).
Reviewed by: brooks, emaste
Differential Revision: https://reviews.freebsd.org/D33997
Require the newly opened file descriptor to be good, instead of
re-requiring the one that was required three lines earlier.
Thankfully, opening /dev/null is really unlikely to fail.
Reported by: Coverity
MFC after: 1 week
Sponsored by: Dell EMC Isilon