mirror of
https://github.com/opnsense/src.git
synced 2026-06-20 22:19:13 -04:00
unix: new implementation of unix/stream & unix/seqpacket
[This is an updated version of d80a97def9, which had been reverted.]
Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX
SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension
of SOCK_STREAM. The change meets three goals: get rid of unix(4) specific
stuff in the generic socket code, provide faster and more robust unix/stream
sockets, and bring unix/seqpacket much closer to the specification.
Highlights follow:
- The send buffer is now truly bypassed. Previously it was always empty,
but send(2) still needed to acquire its lock and do a variety of
tricks to be woken up at the right time while sleeping on it. Now the
only two things we care about in the send buffer are the I/O sx(9) lock
that serializes operations and the value of so_snd.sb_hiwat, which we can
read without obtaining a lock. A send(2) sleeps on the mutex of
the receive buffer of the peer. A bulk send/recv of data with large
socket buffers makes both syscalls just bounce between owning the
receive buffer lock and copyin(9)/copyout(9); no other locks are
involved. Since event notification mechanisms such as select(2), poll(2)
and kevent(2) use the state of the send buffer to monitor writability, the
new implementation provides protocol specific pr_sopoll and pr_kqfilter.
sendfile(2) over unix/stream is preserved by providing protocol specific
pr_send and pr_sendfile_wait methods.
- The implementation uses the new mchain structure to manipulate mbuf
chains. Note that this required converting to mchain two functions that are
shared with unix/dgram, unp_internalize() and unp_addsockcred(), as well as
adding a new shared one, uipc_process_kernel_mbuf(). This induces some
non-functional changes in the unix/dgram code as well. There is room for
improvement here, as right now it is a mix of mchain and manually managed
mbuf chains.
- unix/seqpacket, previously marked as PR_ADDR & PR_ATOMIC and thus treated
as a datagram socket by the generic socket code, now becomes a true stream
socket with record markers.
- Note on aio(4). The first problem with socket aio(4) is that it uses
socket buffer locks for queueing and, piggybacking on this locking, calls
soreadable() and sowriteable() directly. Ideally it should use the
pr_sopoll() method. The second problem is that, unlike a syscall, aio(4)
wants a consistent uio structure upon return. This is incompatible with our
speculative read optimization, so in case of aio(4) write we need to
restore consistency of uio. At this point we work around those problems
on the side of unix(4), but ideally the workarounds should be the problem
of socket aio(4), which is not a first class citizen, rather than of
unix(4), which is definitely a primary facility.
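
The record-marker behaviour described above for unix/seqpacket can be
observed from userland. The following is a minimal sketch, assuming a
POSIX system that supports SOCK_SEQPACKET on AF_UNIX; the helper name
seqpacket_record_sizes() is hypothetical and not part of this commit. It
sends two records and reports what each recv(2) returns, which should be
one whole record per call rather than a merged byte stream.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send two records over a PF_UNIX/SOCK_SEQPACKET
 * pair and store the size returned by each of two recv(2) calls.
 * Returns 0 on success, -1 if the pair could not be set up or a send
 * came up short.
 */
static int
seqpacket_record_sizes(ssize_t sizes[2])
{
	int sv[2];
	char buf[64];
	int error = -1;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) != 0)
		return (-1);

	/* Each send(2) is expected to form one record. */
	if (send(sv[0], "abc", 3, 0) != 3 ||
	    send(sv[0], "defgh", 5, 0) != 5)
		goto out;

	/* The buffer is larger than both records combined on purpose. */
	sizes[0] = recv(sv[1], buf, sizeof(buf), 0);
	sizes[1] = recv(sv[1], buf, sizeof(buf), 0);
	error = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return (error);
}
```

With SOCK_STREAM in place of SOCK_SEQPACKET, the first recv(2) would be
free to return all eight bytes at once, which is the difference the
record markers preserve.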
parent fbd7087b0b
commit d157927807
2 changed files with 1332 additions and 380 deletions
File diff suppressed because it is too large
@@ -132,6 +132,18 @@ struct sockbuf {
 			/* TLS state, locked by sockbuf and sock I/O mutexes. */
 			struct ktls_session *sb_tls_info;
 		};
+		/*
+		 * PF_UNIX/SOCK_STREAM and PF_UNIX/SOCK_SEQPACKET
+		 * A simple stream buffer with not ready data pointer.
+		 */
+		struct {
+			STAILQ_HEAD(, mbuf)	uxst_mbq;
+			struct mbuf		*uxst_fnrdy;
+			struct socket		*uxst_peer;
+			u_int			uxst_flags;
+#define	UXST_PEER_AIO	0x1
+#define	UXST_PEER_SEL	0x2
+		};
 		/*
 		 * PF_UNIX/SOCK_DGRAM
 		 *