The divert(4) is not a protocol of IPv4. It is a socket to
intercept packets from ipfw(4) to userland and re-inject them
back. It can divert and re-inject IPv4 and IPv6 packets today,
but potentially it is not limited to these two protocols. The
IPPROTO_DIVERT does not belong to known IP protocols, it
doesn't even fit into u_char. I guess, the implementation of
divert(4) was done the way it is done basically because it was
easier to do it this way, back when protocols for sockets were
intertwined with IP protocols and domains were statically
compiled in.
Moving divert(4) out of inetsw accomplished two important things:
1) IPDIVERT is getting much closer to be not dependent on INET.
This will be finalized in following changes.
2) Now divert socket no longer aliases with raw IPv4 socket.
Domain/proto selection code won't need a hack for SOCK_RAW and
multiple entries in inetsw implementing different flavors of
raw socket can merge into one without requirement of raw IPv4
being the last member of dom_protosw.
Differential revision: https://reviews.freebsd.org/D36379
o Undocument sockets that are no longer supported, or never were.
o Add AF_HYPERV. Note: PF_HYPERV isn't defined, no typo here.
o Point at ip(4) and ip6(4) instead of unwelcoming "not described here".
Reviewed by: gbe, markj
Differential revision: https://reviews.freebsd.org/D36284
Add a strverscmp(3) function to libc, a GNU extension I implemented by
reading its glibc manual page. It orders strings following a much more
natural ordering (e.g. "ent1 < ent2 < ent10" as opposed to
"ent1 < ent10 < ent2" with strcmp(3)'s lexicographic ordering).
Also add versionsort(3) for use as scandir(3)'s compar argument.
Update manual page for scandir(3) and add one for strverscmp(3).
Reviewed by: pstef, gbe, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D35807
The new helper scandir_dirp() takes DIR *, i.e. a pre-opened directory,
instead of the directory name.
Reviewed by: emaste, imp, kevans, markj, Aymeric Wibo <obiwac@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36301
A future commit will actually implement //IGNORE so that applications
using base iconv can, e.g., sanitize UTF-8 strings. To do this, the
iconv_std module needs to be able to determine the minimum width for any
given encoding so that it can skip that many bytes in the input buffer.
This is mainly an issue for UTF-16 and UTF-32.
This commit bumps shlib versions to 5 for libiconv modules to reflect
the ABI change. It also fixes OptionalObsoleteFiles to remove the
libiconv modules if WITHOUT_ICONV is in use.
re: _ENCODING_MB_CUR_MIN, note that this file (citrus_stdenc_template.h)
is included at the bottom of an encoding *implementation*, so the
implementation is free to #define it prior. UTF1632 is a good example,
as it redefines the minimum to be a property on the encodinginfo, and
the minimum is set to 2 or 4 bytes for UTF-16 and UTF-32 respectively.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34344
Make it vaguely aware of options in the sense that it now knows that it
can zap any trailing //. It now copies the entire string in realsrc and
realdst, then terminates them at the options.
__bsd___iconv_open can now stop trying to allocate memory just for this
purpose, and the new version is technically more correct. GNU libiconv
will ignore options on the `in` codeset and still do the right thing.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D34343
The main change was v1.57 by djm@:
Randomise the rekey interval a little. Previously, the chacha20
instance would be rekeyed every 1.6MB. This makes it happen at a
random point somewhere in the 1-2MB range.
Reviewed by: csprng (markm, cem)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36088
Recover application ability to supply fabricated PID
embedded into ident that was lost when libc switched
to generation of RFC 5424 log messages, for example:
logger -t "ident[$$]" -p user.notice "test"
It is essential for long running scripts.
Also, this change unbreaks matching resulted entries
by ident in syslog.conf:
!ident
*.* /var/log/ident.log
Without the fix, the log (and matching) was broken:
Aug 1 07:36:58 hostname ident[123][86483]: test
Now it works as expected and worked before breakage:
Aug 1 07:39:40 hostname ident[123]: test
Differential: https://reviews.freebsd.org/D36005
MFC after: 2 weeks
These all have my copyright so can be removed. Some also have FreeBSD
Foundation copyright so drop from there as has been done for previous
files.
Sponsored by: The FreeBSD Foundation
This has already been done for most files that have the Foundation as
the only listed copyright holder. Do it now for files that list
multiple copyright holders, but have the Foundation copyright in its own
section.
Sponsored by: The FreeBSD Foundation
Implement Linux-variant of MSG_TRUNC input flag used in recv(), recvfrom() and recvmsg().
Posix defines MSG_TRUNC as an output flag, indicating packet/datagram truncation.
Linux extended it a while (~15+ years) ago to act as input flag,
resulting in returning the full packet size regarless of the input
buffer size.
It's a (relatively) popular pattern to do recvmsg( MSG_PEEK | MSG_TRUNC) to get the
packet size, allocate the buffer and issue another call to fetch the packet.
In particular, it's popular in userland netlink code, which is the primary driving factor of this change.
This commit implements the MSG_TRUNC support for SOCK_DGRAM sockets (udp, unix and all soreceive_generic() users).
PR: kern/176322
Reviewed by: pauamma(doc)
Differential Revision: https://reviews.freebsd.org/D35909
MFC after: 1 month
As per the updated FreeBSD copyright template. These were unambiguous
cases where the Foundation was the only listed copyright holder.
Sponsored by: The FreeBSD Foundation
When pselect is passed a null pointer for the signal mask, the standard
says it shall behave like select (except for the different timeout
arg). Make a note of that here.
Sponsored by: Netflix
Instead of returning EMSGSIZE pass the error code from fdallocn() directly
to userland. That would be EMFILE, which makes much more sense. This
error code is not listed in the specification[1], but the specification
doesn't cover such edge case at all. Meanwhile the specification lists
EMSGSIZE as the error code for invalid value of msg_iovlen, and FreeBSD
follows that, see sys_recmsg(). Differentiating these two cases will make
a developer/admin life much easier when debugging.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recvmsg.html
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35640
named(8) hasn't been in base for some time. Remove all references to it in
manual pages.
Approved by: manpages (Pau Amma)
Differential Revision: https://reviews.freebsd.org/D35586
This additional socket was created in 2e89951b6f and 240d5a9b1c
to try workaround problems with classic PF_UNIX/SOCK_DGRAM sockets.
With recent changes in kernel this trick is no longer needed, so the
trick can be reverted.
In syslogd(8) we would still create the socket for the next several
major releases for compatibility.
Differential revision: https://reviews.freebsd.org/D35305
The "/dev/log" socket existed in pre-FreeBSD times. Later it was
substituted to a compatibility symlink. The symlink creation was
deprecated in FreeBSD 10.2 and 9-STABLE.
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35304
Add documentation for gethostbyname_r, gethostbyname2_r and gethostbyaddr_r
Create proper MLINKs for the new functions.
PR: 249154
Reported by: asomers@
Approved by: manpages (0mp@), Pau Amma
Differential Revision: https://reviews.freebsd.org/D30469
Enhanced REP MOVSB feature of CPUs starting from Ivy Bridge makes
REP MOVSB the fastest way to copy memory in most of cases. However
Intel Optimization Reference Manual says: "setting the DF to force
REP MOVSB to copy bytes from high towards low addresses will expe-
rience significant performance degradation". Measurements on Intel
Cascade Lake and Alder Lake, same as on AMD Zen3 show that it can
drop throughput to as low as 2.5-3.5GB/s, comparing to ~10-30GB/s
of REP MOVSQ or hand-rolled loop, used for non-ERMS CPUs.
This patch keeps ERMS use for forward ordered memory copies, but
removes it for backward overlapped moves where it does not work.
This is just a cosmetic sync with kernel, since libc does not use
ERMS at this time.
Reviewed by: mjg
MFC after: 2 weeks
Commit e0e0323354 removed compat stubs for
kernels that did not have futimens() and utimensat() system calls, but
removed the documentation for them in the manual page only partially.
Remove the rest of the documentation of the compatibility code.
MFC after: 1 week
POSIX deprecated getpagesize(3). The portable way to obtain the page
size is `sysconf(_SC_PAGESIZE)`.
Reviewed by: cperciva (earlier), imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35352
On success gnu libc sched_getaffinity() should return 0, unlike underlying
Linux syscall which returns the size of CPU mask copied to user.
PR: 263939
MFC after: 2 weeks
Linux has more tolerant checks of the user supplied cpuset_t's.
Minimum cpuset_t size that the Linux kernel permits in case of
getaffinity() is the maximum CPU id, present in the system / NBBY,
the maximum size is not limited.
For setaffinity(), Linux does not limit the size of the user-provided
cpuset_t, internally using only the meaningful part of the set, where
the upper bound is the maximum CPU id, present in the system, no larger
than the size of the kernel cpuset_t.
Unlike FreeBSD, Linux ignores high bits if set in the setaffinity(),
so clear it in the sched_setaffinity() and Linuxulator itself.
Reviewed by: Pau Amma (man pages)
In collaboration with: jhb
Differential revision: https://reviews.freebsd.org/D34849
MFC after: 2 weeks
There are some sections which could be improved
and work to do so is on going. The work will be
covered via 'X-MFC-WITH' commits.
Obtained from: OpenBSD
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D34759
The time() system call first appeared in Version 1 AT&T UNIX. Through
the Version 3 AT&T UNIX, it returned 60 Hz ticks since an epoch that
changed occasionally, because it was a 32-bit value that overflowed in a
little over 2 years.
In Version 4 AT&T UNIX the granularity of the return value was reduced to
whole seconds, delaying the aforementioned overflow until 2038.
Version 7 AT&T UNIX introduced the ftime() system call, which returned
time at a millisecond level, though retained the gtime() system call
(exposed as time() in userland). time() could have been implemented as a
wrapper around ftime(), but that wasn't done.
4.1cBSD implemented a higher-precision time function gettimeofday() to
replace ftime() and reimplemented time() in terms of that.
Since FreeBSD 9 the implementation of time() uses
clock_gettime(CLOCK_SECOND) instead of gettimeofday() for performance
reasons.
With most valuable input from Warner (imp@).
Reviewed by: 0mp, jilles, imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D34751
Problem is that open(O_PATH) on nullfs -o nocache is broken then,
because there is no reference on the vnode after the open syscall exits.
Reported and tested by: ambrisko
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The error control was not properly implemented. "changelist" is const, hence
event.flags is never changed by the syscall.
PR: 196844
Reported by: eugen@
Reviewed by: PauAmma <pauamma@gundo.com>
Approved by: eugen@
Fixes: 8c231786f0
To be more compatible to IEEE Std 1003.1-2008 (“POSIX.1”).
Reviewed by: mjg, Pau Amma (doc)
Differential revision: https://reviews.freebsd.org/D34680
MFC after: 2 weeks