Commit graph

57 commits

Author SHA1 Message Date
Michael Tuexen
2176c9ab71 dtrace: improve siftr probe
Improve consistency of the field names with tcpsinfo_t:
* Use mss instead of max_seg_size.
* Use lport and rport instead of tcp_localport and tcp_foreignport.

Use t_flags instead of flags to improve consistency with t_flags2.

Add laddr and raddr, since the addresses were missing when compared
to the output of siftr.

Reviewed by:		cc
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D40834
2023-07-02 03:08:51 +02:00
Michael Tuexen
cd9da8d072 siftr: unbreak dtrace support
This patch adds back some fields needed by the siftr probe, which were
removed in
https://cgit.freebsd.org/src/commit/?id=aa61cff4249c92689d7a1f15db49d65d082184cb

With this fix, you can run dtrace scripts again when the siftr
module is loaded. And the siftr probe works again.

Reviewed by:		rscheff
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D40826
2023-07-01 09:50:54 +02:00
Michael Tuexen
dc2d26df43 siftr: provide dtrace with the correct pointer to data
This fixes a bug which was introduced in the commit
https://svnweb.freebsd.org/changeset/base/282276

Reviewed by:		cc, rscheff
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D40806
2023-06-30 22:03:04 +02:00
Cheng Cui
7a52b570e7
siftr: bring back the siftr_pkts_per_log feature
Summary: this missing feature is introduced by commit aa61cff424

Test Plan: verified in Emulab.net

Reviewers: rscheff, tuexen
Approved by: tuexen (mentor)
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D40336
2023-05-30 08:23:36 -04:00
Cheng Cui
b71f278465
siftr: convert this tval.tv_sec to type intmax_t to print across platforms
Reviewers: rscheff, tuexen
Approved by: tuexen (mentor)
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D40323
2023-05-30 02:27:33 -04:00
Cheng Cui
78914cd641
siftr: sync-up man page with recent code changes, and cleanup code
Reviewers: rscheff, tuexen
Approved by: tuexen (mentor)
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D40322
2023-05-29 12:16:09 -04:00
Cheng Cui
d29a9a615c
siftr: fix a build error for powerpc and arm platforms
Summary: This is introduced by commit aa61cff424.

Reviewers: rscheff, tuexen
Approved by: tuexen (mentor)
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D40320
2023-05-29 12:10:46 -04:00
Cheng Cui
aa61cff424
siftr: three changes that improve performance
Summary:
(1) use inp_flowid or a new packet hash for a flow identification
(2) cache constant connection info into struct flow_hash_node
(3) use compressed notation for IPv6 address representation

Reviewers: rscheff, tuexen
Approved by: tuexen (mentor)
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D40302
2023-05-29 05:11:18 -04:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Cheng Cui
60167184ab
siftr: remove barely used hash generation per record
Reviewers: rscheff, tuexen
Approved by: rscheff, tuexen
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D39835
2023-04-26 15:57:42 -04:00
Cheng Cui
d090464ecd
Change the unit of srtt and rto to usec, inspired by these in struct "tcp_info". Therefore, no need hz and tcp_rtt_scale in the headline of the log. Update the man page as well.
Summary: Simplify srtt and rto values in siftr log.

Test Plan:
Tested in Emulab testbed:
cc@s1:~ % sudo sysctl net.inet.siftr
net.inet.siftr.port_filter: 0
net.inet.siftr.genhashes: 0
net.inet.siftr.ppl: 1
net.inet.siftr.logfile: /var/log/siftr.log
net.inet.siftr.enabled: 0
cc@s1:~ % sudo sysctl net.inet.siftr.port_filter=5001
net.inet.siftr.port_filter: 0 -> 5001
cc@s1:~ % sudo sysctl net.inet.siftr.enabled=1
net.inet.siftr.enabled: 0 -> 1
cc@s1:~ %
cc@s1:~ % iperf -c r1 -n 1M
------------------------------------------------------------
Client connecting to r1, TCP port 5001
TCP window size: 32.0 KByte (default)
------------------------------------------------------------
[  1] local 10.1.1.2 port 33817 connected with 10.1.1.3 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-0.91 sec  1.00 MBytes  9.22 Mbits/sec
cc@s1:~ % sudo sysctl net.inet.siftr.enabled=0
net.inet.siftr.enabled: 1 -> 0

cc@s1:~ % ll /var/log/siftr.log
-rw-r--r--  1 root  wheel    91K Apr 25 09:38 /var/log/siftr.log
cc@s1:~ % cat /var/log/siftr.log
enable_time_secs=1682437111	enable_time_usecs=121115	siftrver=1.3.0	sysname=FreeBSD	sysver=1400088	ipmode=4
o,0x00000000,1682437125.907343,10.1.1.2,33817,10.1.1.3,5001,1073725440,1073725440,2,0,0,0,0,2,536,0,1,672,1000000,32768,0,65536,0,0,0,0,0
i,0x00000000,1682437126.106759,10.1.1.2,33817,10.1.1.3,5001,1073725440,1073725440,2,0,0,0,0,2,536,0,1,672,1000000,32768,0,65536,0,1,0,0,0
o,0x00000000,1682437126.106767,10.1.1.2,33817,10.1.1.3,5001,1073725440,14480,2,65535,65700,9,9,4,1460,201000,1,16778209,803000,33580,0,65700,0,0,0,0,0
o,0x00000000,1682437126.107141,10.1.1.2,33817,10.1.1.3,5001,1073725440,14480,2,65535,65700,9,9,4,1460,201000,1,16778208,803000,33580,60,65700,0,0,0,0,0
...
i,0x00000000,1682437127.016754,10.1.1.2,33817,10.1.1.3,5001,1073725440,606109,1030,748544,66048,9,9,9,1460,100812,1,1008,303000,475948,0,65700,0,0,0,0,0
o,0x00000000,1682437127.016759,10.1.1.2,33817,10.1.1.3,5001,1073725440,606109,1030,748544,66048,9,9,10,1460,100812,1,1011,303000,475948,0,65700,0,0,0,0,0
disable_time_secs=1682437131	disable_time_usecs=767582	num_inbound_tcp_pkts=371	num_outbound_tcp_pkts=186	total_tcp_pkts=557	num_inbound_skipped_pkts_malloc=0	num_outbound_skipped_pkts_malloc=0	num_inbound_skipped_pkts_tcpcb=0	num_outbound_skipped_pkts_tcpcb=0	num_inbound_skipped_pkts_inpcb=0	num_outbound_skipped_pkts_inpcb=0	total_skipped_tcp_pkts=0	flow_list=10.1.1.2;33817-10.1.1.3;5001,

Reviewers: rscheff, tuexen
Approved by: rscheff, tuexen
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D39803
2023-04-25 13:26:39 -04:00
Cheng Cui
1f782fcc0c
Remove unused fields in siftr_stats. Thus, update the man page as well.
Summary: Remove unused fields in siftr_stats. Thus, update the man page as well.

Test Plan: Tested in Emulab testbed.

Reviewers: rscheff, tuexen
Approved by: rscheff, tuexen
Subscribers: imp, melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D39776
2023-04-24 15:31:15 -04:00
Gleb Smirnoff
caf32b260a pfil: add pfil_mem_{in,out}() and retire pfil_run_hooks()
The 0b70e3e78b changed the original design of a single entry point
into pfil(9) chains providing separate functions for the filtering
points that always provide mbufs and know the direction of a flow.
The motivation was to reduce branching.  The logical continuation
would be to do the same for the filtering points that always provide
a memory pointer and retire the single entry point.

o Hooks now provide two functions: one for mbufs and optional for
  memory pointers.
o pfil_hook_args() has a new member and pfil_add_hook() has a
  requirement to zero out uninitialized data. Bump PFIL_VERSION.
o As it was before, a hook function for a memory pointer may realloc
  into an mbuf.  Such mbuf would be returned via a pointer that must
  be provided in argument.
o The only hook that supports memory pointers is ipfw:default-link.
  It is rewritten to provide two functions.
o All remaining uses of pfil_run_hooks() are converted to
  pfil_mem_in().
o Transparent union of pfil_packet_t and tricks to fix pointer
  alignment are retired. Internal pfil_realloc() reduces down to
  m_devget() and thus is retired, too.

Reviewed by:		mjg, ocochard
Differential revision:	https://reviews.freebsd.org/D37977
2023-02-14 10:02:49 -08:00
Richard Scheffenegger
9c65583835 siftr: apply filter early on
Quickly check TCP port filter, before investing into
expensive operations.

No functional change.

Obtained from:  	guest-ccui
Reviewed By:    	#transport, tuexen, guest-ccui
Sponsored by:   	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D36842
2022-10-07 01:39:41 +02:00
Gleb Smirnoff
53af690381 tcp: remove INP_TIMEWAIT flag
Mechanically cleanup INP_TIMEWAIT from the kernel sources.  After
0d7445193a, this commit shall not cause any functional changes.

Note: this flag was very often checked together with INP_DROPPED.
If we modify in_pcblookup*() not to return INP_DROPPED pcbs, we
will be able to remove most of this checks and turn them to
assertions.  Some of them can be turned into assertions right now,
but that should be carefully done on a case by case basis.

Differential revision:	https://reviews.freebsd.org/D36400
2022-10-06 19:24:37 -07:00
Dag-Erling Smørgrav
c198adf394 siftr: spell PFIL_PASS correctly.
Sponsored by:	NetApp
Sponsored by:	Klara Inc.
Differential Revision: https://reviews.freebsd.org/D36539
2022-09-12 19:20:10 +02:00
Tom Jones
1241e8e7ae siftr: expose t_flags2 in siftr output
Replace the old snd_bwnd field which was kept for compatibility with the
t_flags2 field from the tcpcb. This exposes in siftr logs interesting
things such as ECN, PLPMTUD, Accurate ECN and if first bytes are
complete.

Reviewed by:	rscheff (transport), chengc_netapp.com,  debdrup (manpages)
Sponsored by:   NetApp, Inc.
Sponsored by:   Klara, Inc.
X-NetApp-PR:    #73
Differential Revision:	https://reviews.freebsd.org/D34672
2022-04-07 10:17:09 +01:00
Allan Jude
34d8fffff3 SIFTR: Fix compilation with -DSIFTR_IPV6
A few pieces of the SIFTR code that are behind #ifdef SIFTR_IPV6 have
not been updated as APIs have changed, etc.

Reported by:	Alexander Sideropoulos <Alexander.Sideropoulos@netapp.com>
Reviewed by:	rscheff, lstewart
Sponsored by:	NetApp
Sponsored by:	Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D32698
2021-11-04 00:32:17 +00:00
Mateusz Guzik
662c13053f net: clean up empty lines in .c and .h files 2020-09-01 21:19:14 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Randall Stewart
481be5de9d White space cleanup -- remove trailing tab's or spaces
from any line.

Sponsored by:	Netflix Inc.
2020-02-12 13:31:36 +00:00
Michael Tuexen
746c7ae563 In r343587 a simple port filter as sysctl tunable was added to siftr.
The new sysctl was not added to the siftr.4 man page at the time.
This updates the man page, and removes one left over trailing whitespace.

Submitted by:		Richard Scheffenegger
Reviewed by:		bcr@
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D21619
2019-10-07 20:35:04 +00:00
Gleb Smirnoff
547392731f Repair siftr(4): PFIL_IN and PFIL_OUT are defines of some value, relying
on them having particular values can break things.
2019-02-01 08:10:26 +00:00
Gleb Smirnoff
b252313f0b New pfil(9) KPI together with newborn pfil API and control utility.
The KPI have been reviewed and cleansed of features that were planned
back 20 years ago and never implemented.  The pfil(9) internals have
been made opaque to protocols with only returned types and function
declarations exposed. The KPI is made more strict, but at the same time
more extensible, as kernel uses same command structures that userland
ioctl uses.

In nutshell [KA]PI is about declaring filtering points, declaring
filters and linking and unlinking them together.

New [KA]PI makes it possible to reconfigure pfil(9) configuration:
change order of hooks, rehook filter from one filtering point to a
different one, disconnect a hook on output leaving it on input only,
prepend/append a filter to existing list of filters.

Now it possible for a single packet filter to provide multiple rulesets
that may be linked to different points. Think of per-interface ACLs in
Cisco or Juniper. None of existing packet filters yet support that,
however limited usage is already possible, e.g. default ruleset can
be moved to single interface, as soon as interface would pride their
filtering points.

Another future feature is possiblity to create pfil heads, that provide
not an mbuf pointer but just a memory pointer with length. That would
allow filtering at very early stages of a packet lifecycle, e.g. when
packet has just been received by a NIC and no mbuf was yet allocated.

Differential Revision:	https://reviews.freebsd.org/D18951
2019-01-31 23:01:03 +00:00
Brooks Davis
435a8c1560 Add a simple port filter to SIFTR.
SIFTR does not allow any kind of filtering, but captures every packet
processed by the TCP stack.
Often, only a specific session or service is of interest, and doing the
filtering in post-processing of the log adds to the overhead of SIFTR.

This adds a new sysctl net.inet.siftr.port_filter. When set to zero, all
packets get captured as previously. If set to any other value, only
packets where either the source or the destination ports match, are
captured in the log file.

Submitted by:	Richard Scheffenegger
Reviewed by:	Cheng Cui
Differential Revision:	https://reviews.freebsd.org/D18897
2019-01-30 17:44:30 +00:00
Brooks Davis
c53d6b90ba Make SIFTR work again after r342125 (D18443).
Correct a logic error.

Only disable when already enabled or enable when disabled.

Submitted by:	Richard Scheffenegger
Reviewed by:	Cheng Cui
Obtained from:	Cheng Cui
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D18885
2019-01-18 21:46:38 +00:00
Brooks Davis
855acb84ca Fix bugs in plugable CC algorithm and siftr sysctls.
Use the sysctl_handle_int() handler to write out the old value and read
the new value into a temporary variable. Use the temporary variable
for any checks of values rather than using the CAST_PTR_INT() macro on
req->newptr. The prior usage read directly from userspace memory if the
sysctl() was called correctly. This is unsafe and doesn't work at all on
some architectures (at least i386.)

In some cases, the code could also be tricked into reading from kernel
memory and leaking limited information about the contents or crashing
the system. This was true for CDG, newreno, and siftr on all platforms
and true for i386 in all cases. The impact of this bug is largest in
VIMAGE jails which have been configured to allow writing to these
sysctls.

Per discussion with the security officer, we will not be issuing an
advisory for this issue as root access and a non-default config are
required to be impacted.

Reviewed by:	markj, bz
Discussed with:	gordon (security officer)
MFC after:	3 days
Security:	kernel information leak, local DoS (both require root)
Differential Revision:	https://reviews.freebsd.org/D18443
2018-12-15 15:06:22 +00:00
Andrey V. Elsukov
384a5c3c28 Add INP_INFO_WUNLOCK_ASSERT() macro and use it instead of
INP_INFO_UNLOCK_ASSERT() in TCP-related code. For encapsulated traffic
it is possible, that the code is running in net_epoch_preempt section,
and INP_INFO_UNLOCK_ASSERT() is very strict assertion for such case.

PR:		231428
Reviewed by:	mmacy, tuexen
Approved by:	re (kib)
Differential Revision:	https://reviews.freebsd.org/D17335
2018-10-01 10:46:00 +00:00
Andrew Turner
2bf9501287 Create a new macro for static DPCPU data.
On arm64 (and possible other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.

In preparation to fix this create two macros depending on if the data is
global or static.

Reviewed by:	bz, emaste, markj
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D16140
2018-07-05 17:13:37 +00:00
Matt Macy
f6960e207e netinet silence warnings 2018-05-19 05:56:21 +00:00
Pedro F. Giffuni
fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
John Baldwin
47cedcbd72 Use SI_SUB_LAST instead of SI_SUB_SMP as the "catch-all" subsystem.
Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D5515
2016-03-11 23:18:06 +00:00
George V. Neville-Neil
bee68e9400 Move the SIFTR DTrace probe out of the writing thread context
and directly into the place where the data is collected.
2015-04-30 17:43:40 +00:00
George V. Neville-Neil
981ad3ecf9 Brief demo script showing the various values that can be read via
the new SIFTR statically defined tracepoint (SDT).

Differential Revision:	https://reviews.freebsd.org/D2387
Reviewed by:	bz, markj
2015-04-29 17:19:55 +00:00
Lawrence Stewart
efca16682d The addition of flowid and flowtype in r280233 and r280237 respectively forgot
to extend the IPv6 packet node format string, which causes a build failure when
SIFTR is compiled with IPv6 support.

Reported by:	Lars Eggert
2015-03-24 15:08:43 +00:00
Hiren Panchasara
d0a8b2a5ae Add connection flow type to siftr(4).
Suggested by:	adrian
Sponsored by:	Limelight Networks
2015-03-19 00:23:16 +00:00
Hiren Panchasara
a025fd1487 Add connection flowid to siftr(4).
Reviewed by:	lstewart
MFC after:	1 week
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D2089
2015-03-18 23:24:25 +00:00
Gleb Smirnoff
cfa6009e36 In preparation of merging projects/sendfile, transform bare access to
sb_cc member of struct sockbuf to a couple of inline functions:

sbavail() and sbused()

Right now they are equal, but once notion of "not ready socket buffer data",
will be checked in, they are going to be different.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-12 09:57:15 +00:00
Hans Petter Selasky
0e1152fcc2 The SYSCTL data pointers can come from userspace and must not be
directly accessed. Although this will work on some platforms, it can
throw an exception if the pointer is invalid and then panic the kernel.

Add a missing SYSCTL_IN() of "SCTP_BASE_STATS" structure.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-28 12:00:39 +00:00
Gleb Smirnoff
c3322cb91c Include necessary headers that now are available due to pollution
via if_var.h.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-28 07:29:16 +00:00
Lawrence Stewart
1e0e83d760 The hashmask returned by hashinit() is a valid index in the returned hash array.
Fix a siftr(4) potential memory leak and INVARIANTS triggered kernel panic in
hashdestroy() by ensuring the last array index in the flow counter hash table is
flushed of entries.

MFC after:	3 days
2013-03-07 04:42:20 +00:00
Gleb Smirnoff
8f134647ca Switch the entire IPv4 stack to keep the IP packet header
in network byte order. Any host byte order processing is
done in local variables and host byte order values are
never[1] written to a packet.

  After this change a packet processed by the stack isn't
modified at all[2] except for TTL.

  After this change a network stack hacker doesn't need to
scratch his head trying to figure out what is the byte order
at the given place in the stack.

[1] One exception still remains. The raw sockets convert host
byte order before pass a packet to an application. Probably
this would remain for ages for compatibility.

[2] The ip_input() still subtructs header len from ip->ip_len,
but this is planned to be fixed soon.

Reviewed by:	luigi, Maxim Dounin <mdounin mdounin.ru>
Tested by:	ray, Olivier Cochard-Labbe <olivier cochard.me>
2012-10-22 21:09:03 +00:00
Robert Watson
fa046d8774 Decompose the current single inpcbinfo lock into two locks:
- The existing ipi_lock continues to protect the global inpcb list and
  inpcb counter.  This lock is now relegated to a small number of
  allocation and free operations, and occasional operations that walk
  all connections (including, awkwardly, certain UDP multicast receive
  operations -- something to revisit).

- A new ipi_hash_lock protects the two inpcbinfo hash tables for
  looking up connections and bound sockets, manipulated using new
  INP_HASH_*() macros.  This lock, combined with inpcb locks, protects
  the 4-tuple address space.

Unlike the current ipi_lock, ipi_hash_lock follows the individual inpcb
connection locks, so may be acquired while manipulating a connection on
which a lock is already held, avoiding the need to acquire the inpcbinfo
lock preemptively when a binding change might later be required.  As a
result, however, lookup operations necessarily go through a reference
acquire while holding the lookup lock, later acquiring an inpcb lock --
if required.

A new function in_pcblookup() looks up connections, and accepts flags
indicating how to return the inpcb.  Due to lock order changes, callers
no longer need acquire locks before performing a lookup: the lookup
routine will acquire the ipi_hash_lock as needed.  In the future, it will
also be able to use alternative lookup and locking strategies
transparently to callers, such as pcbgroup lookup.  New lookup flags are,
supplementing the existing INPLOOKUP_WILDCARD flag:

  INPLOOKUP_RLOCKPCB - Acquire a read lock on the returned inpcb
  INPLOOKUP_WLOCKPCB - Acquire a write lock on the returned inpcb

Callers must pass exactly one of these flags (for the time being).

Some notes:

- All protocols are updated to work within the new regime; especially,
  TCP, UDPv4, and UDPv6.  pcbinfo ipi_lock acquisitions are largely
  eliminated, and global hash lock hold times are dramatically reduced
  compared to previous locking.
- The TCP syncache still relies on the pcbinfo lock, something that we
  may want to revisit.
- Support for reverting to the FreeBSD 7.x locking strategy in TCP input
  is no longer available -- hash lookup locks are now held only very
  briefly during inpcb lookup, rather than for potentially extended
  periods.  However, the pcbinfo ipi_lock will still be acquired if a
  connection state might change such that a connection is added or
  removed.
- Raw IP sockets continue to use the pcbinfo ipi_lock for protection,
  due to maintaining their own hash tables.
- The interface in6_pcblookup_hash_locked() is maintained, which allows
  callers to acquire hash locks and perform one or more lookups atomically
  with 4-tuple allocation: this is required only for TCPv6, as there is no
  in6_pcbconnect_setup(), which there should be.
- UDPv6 locking remains significantly more conservative than UDPv4
  locking, which relates to source address selection.  This needs
  attention, as it likely significantly reduces parallelism in this code
  for multithreaded socket use (such as in BIND).
- In the UDPv4 and UDPv6 multicast cases, we need to revisit locking
  somewhat, as they relied on ipi_lock to stablise 4-tuple matches, which
  is no longer sufficient.  A second check once the inpcb lock is held
  should do the trick, keeping the general case from requiring the inpcb
  lock for every inpcb visited.
- This work reminds us that we need to revisit locking of the v4/v6 flags,
  which may be accessed lock-free both before and after this change.
- Right now, a single lock name is used for the pcbhash lock -- this is
  undesirable, and probably another argument is required to take care of
  this (or a char array name field in the pcbinfo?).

This is not an MFC candidate for 8.x due to its impact on lookup and
locking semantics.  It's possible some of these issues could be worked
around with compatibility wrappers, if necessary.

Reviewed by:    bz
Sponsored by:   Juniper Networks, Inc.
2011-05-30 09:43:55 +00:00
Sergey Kandaurov
6bed196c35 Staticize malloc types.
Approved by:	lstewart
MFC after:	1 week
2011-04-13 11:28:46 +00:00
Lawrence Stewart
891b8ed467 Use the full and proper company name for Swinburne University of Technology
throughout the source tree.

Requested by:	Grenville Armitage, Director of CAIA at Swinburne University of
			Technology
MFC after:	3 days
2011-04-12 08:13:18 +00:00
Dimitry Andric
3e288e6238 After some off-list discussion, revert a number of changes to the
DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various
people working on the affected files.  A better long-term solution is
still being considered.  This reversal may give some modules empty
set_pcpu or set_vnet sections, but these are harmless.

Changes reverted:

------------------------------------------------------------------------
r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines

Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.

------------------------------------------------------------------------
r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.

------------------------------------------------------------------------
r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines

Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.
2010-11-22 19:32:54 +00:00
Lawrence Stewart
92ea5581dd Fix a minor code redundancy nit.
MFC after:	3 days
2010-11-20 08:40:37 +00:00
Lawrence Stewart
052aec123c When enabling or disabling SIFTR with a VIMAGE kernel, ensure we add or remove
the SIFTR pfil(9) hook functions to or from all network stacks. This patch
allows packets inbound or outbound from a vnet to be "seen" by SIFTR.

Additional work is required to allow SIFTR to actually generate log messages for
all vnet related packets because the siftr_findinpcb() function does not yet
search for inpcbs across all vnets. This issue will be fixed separately.

Reported and tested by:	David Hayes <dahayes at swin edu au>
MFC after:	3 days
2010-11-20 07:36:43 +00:00
Dimitry Andric
31c6a0037e Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.
2010-11-14 20:38:11 +00:00
Lawrence Stewart
619ad9eb3e Standardise all Swinburne related copyright/licence statements throughout the
tree in preparation for another large code import. Swinburne University is the
legal entity that owns copyright and the 2-clause BSD licence is acceptable.
2010-11-12 00:44:18 +00:00