We fail mapping for any udf_bmap_internal error and there can be
different reasons for it, so no need to (over-)emphasize files with
data in fentry.
Submitted by: bde
Approved by: jhb
are used by glibc. This silents the message "2.4+ kernel w/o ELF notes?"
from some programs at start, among them are top and pkill.
Do the assignment of the vector entries in elf_linux_fixup()
as it is done in glibc.
Fix some minor style issues.
Submitted by: Marcin Cieslak <saper at SYSTEM PL>
Approved by: kib (mentor)
MFC after: 1 week
and do not attempt to perform a group lookup.
This is a socket layer lock, and the bottom half of IP
really has no business taking it.
Use the value of the in_mcast_loop sysctl to determine
if we should loop back by default, in the absence of
any multicast socket options. Because the check on
group membership is now deferred to the input path,
an m_copym() is now required.
This should increase multicast send performance where the
source has not requested loopback, although this has not been
benchmarked or measured.
It is also a necessary change for IN_MULTI_LOCK to become
non-recursive, which is required in order to implement IGMPv3
in a thread-safe way.
IPv4 multicast sends are looped back to senders by default
on a stack-wide basis, rather than relying on the socket option.
Note that the sysctl only applies to newly created multicast sockets.
- Added missing firmware for 5709 A1 controllers.
- Changed some debug statistic variable names to be more consistent.
Submitted by: davidch
MFC after: Two weeks
while developing and compiling with kernel options that change the
size of at least one structure. The current kernel build framework
does not allow us to pass -Dxxx to module builds so we would possibly
need a kernel option to disable the checks and that might not work
for people just building modules alone.
For now they helped to identify possibly API problems and bring
those back into minds of developers seeking for better solutions.
Problems reported by: kib, warner
Reviewed by: warner
Clang disallows structs with variable length arrays to be nested inside
other structs, because this is in violation with ISO C99. Even though we
can keep bugging the LLVM folks about this issue, we'd better just fix
our code to not do this. This code seems to be the only code in the
entire source tree that does this.
I haven't tested this patch by using the kernel modules in question, but
Diane Bruce and I have compared disassembled versions of these kernel
modules. We would have expected them to be exactly the same, but due to
randomness in the register allocator and reordering of instructions,
there were some minor differences.
Approved by: julian
wrapper macros that allow trace points and arguments to be declared
using a single macro rather than several. This means a lot less
repetition and vertical space for each trace point.
Use these macros when defining privilege and MAC Framework trace points.
Reviewed by: jb
MFC after: 1 week
A while back, Warner changed the PCI bus code to reserve resources when
enumerating devices and simply give devices the previously allocated
resources when they call bus_alloc_resource(). This ensures that address
ranges being decoded by a BAR are always allocated in the nexus0 device
(or whatever device the PCI bus gets its address space from) even if a
device driver is not attached to the device. This patch extends this
behavior further:
- To let the PCI bus distinguish between a resource being allocated by
a device driver vs. merely being allocated by the bus, use
rman_set_device() to assign the device to the bus when it is owned
by the bus and to the child device when it is allocated by the child
device's driver. We can now prevent a device driver from allocating
the same device twice. Doing so could result in odd things like
allocating duplicate virtual memory to map the resource on some
archs and leaking the original mapping.
- When a PCI device driver releases a resource, don't pass the request
all the way up the tree and release it in the nexus (or similar device)
since the BAR is still active and decoding. Otherwise, another device
could later allocate the same range even though it is still in use.
Instead, deactivate the resource and assign it back to the PCI bus
using rman_set_device().
- pci_delete_resource() will actually completely free a BAR including
attemping to disable it.
- Disable BAR decoding via the command register when sizing a BAR in
pci_alloc_map() which is used to allocate resources for a BAR when
the BIOS/firmware did not assign a usable resource range during boot.
This mirrors an earlier fix to pci_add_map() which is used when to
size BARs during boot.
- Move the activation of I/O decoding in the PCI command register into
pci_activate_resource() instead of doing it in pci_alloc_resource().
Previously we could actually enable decoding before a BAR was
initialized via pci_alloc_map().
Glanced at by: bsdimp
RH0 was deprecated by RFC 5095.
While most of the code had been disabled by #if 0 already, leave a
bit of infrastructure for possible RH2 code and a log message under
BURN_BRIDGES in case a user still tries to send RH0 packets.
Reviewed by: gnn (a bit back, earlier version)
Previosly readdir missed some directory entries because there was
no space for them in current uio but directory stream offset
was advanced nevertheless.
jhb has discoved the issue and provided a test-case.
Reviewed by: bde
Approved by: jhb (mentor)
vfsopt and the vfs_buildopts function public, and add some new fields
to struct vfsopt (pos and seen), and new functions vfs_getopt_pos and
vfs_opterror.
Further extend the interface to allow reading options from the kernel
in addition to sending them to the kernel, with vfs_setopt and related
functions.
While this allows the "name=value" option interface to be used for more
than just FS mounts (planned use is for jails), it retains the current
"vfsopt" name and <sys/mount.h> requirement.
Approved by: bz (mentor)
as a handler.
The variable was exported only for debugging, but there is little reason
to do it now that the timekeeping is supported by various other variables.
For the time being just comment out the sysctl, but I think this
should go away.
of sysctl_variables.
I would also remove it from the VNET record but I am unsure if
there is any ABI issue -- so for the time being just mark it as
unused in ip_fw.h, and then we will collect the garbage at some
appropriate time in the future.
MFC after: 3 days
operation is known and to retry or fail accordingly to that
outcome. This fixes the problem with namespace traversing
programs failing with random ENOENT errors if someone just
happened to try to unmount that same filesystem at the same
time.
Reported by: dhw
Reviewed by: kib, attilio
Sponsored by: Juniper Networks, Inc.
This addresses interrupt storms that were noticed after enabling MSI
in drm. I think this is due to a loose interpretation of the PCI 2.3
spec, which states that a function using MSI is prohibitted from using
INTx. It appears that some vendors interpretted that to mean that they
should handle it in hardware, while others felt it was the drivers
responsibility.
This fix will also likely resolve interrupt storm related issues with
devices other than drm.
Reviewed by: jhb@
MFC after: 3 days
memory from int to size_t. Implement a workaround for current ABI not
allowing to properly save size for and report more then 2Gb sized segment
of shared memory.
This makes it possible to use > 2 Gb shared memory segments on 64bit
architectures. Please note the new BUGS section in shmctl(2) and
UPDATING note for limitations of this temporal solution.
Reviewed by: csjp
Tested by: Nikolay Dzham <i levsha org ua>
MFC after: 2 weeks
name i2c-address instead of reg. Change the OFW I2C probe to check both
locations for the address.
Submitted by: Marco Trillo
Reported by: Justin Hibbits
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge).
OpenBSM history for imported revision below for reference.
MFC after: 1 month
Sponsored by: Apple, Inc.
Obtained from: TrustedBSD Project
OpenBSM 1.1 beta 1
- The filesz parameter in audit_control(5) now accepts suffixes: 'B' for
Bytes, 'K' for Kilobytes, 'M' for Megabytes, and 'G' for Gigabytes.
For legacy support no suffix defaults to bytes.
- Audit trail log expiration support added. It is configured in
audit_control(5) with the expire-after parameter. If there is no
expire-after parameter in audit_control(5), the default, then the audit
trail files are not expired and removed. See audit_control(5) for
more information.
- Change defaults in audit_control: warn at 5% rather than 20% free for audit
partitions, rotate automatically at 2mb, and set the default policy to
cnt,argv rather than cnt so that execve(2) arguments are captured if
AUE_EXECVE events are audited. These may provide more usable defaults for
many users.
- Use au_domain_to_bsm(3) and au_socket_type_to_bsm(3) to convert
au_to_socket_ex(3) arguments to BSM format.
- Fix error encoding AUT_IPC_PERM tokens.
changes since the last imported OpenBSM release:
OpenBSM 1.1 beta 1
- The filesz parameter in audit_control(5) now accepts suffixes: 'B' for
Bytes, 'K' for Kilobytes, 'M' for Megabytes, and 'G' for Gigabytes.
For legacy support no suffix defaults to bytes.
- Audit trail log expiration support added. It is configured in
audit_control(5) with the expire-after parameter. If there is no
expire-after parameter in audit_control(5), the default, then the audit
trail files are not expired and removed. See audit_control(5) for
more information.
- Change defaults in audit_control: warn at 5% rather than 20% free for audit
partitions, rotate automatically at 2mb, and set the default policy to
cnt,argv rather than cnt so that execve(2) arguments are captured if
AUE_EXECVE events are audited. These may provide more usable defaults for
many users.
- Use au_domain_to_bsm(3) and au_socket_type_to_bsm(3) to convert
au_to_socket_ex(3) arguments to BSM format.
- Fix error encoding AUT_IPC_PERM tokens.
Obtained from: TrustedBSD Project
Sponsored by: Apple Inc.
ready status. Most of controllers managed to issue coommand and set BUSY
bit almost simultaneously, before we will read it, but at least JMicron JMB363
don't. Ignore timeout errors to keep old behavior when error there was
impossible.
For me this fixes timeout errors on the first command after channel attach
or reinit. Boot in my case is not affected, as there is much time passing
between reset and next command giving reset time to complete.
Unlike GCC, LLVM defines __STDC_VERSION__ to 199901L by default. This
means `restrict' keywords in files end up being given to lint, which
results in errors during compilation of usr.bin/xlint.
Other keywords are also expanded to nothing when using lint, so do the
same with restrict.
done in other places. Until we have no support for command queueing we have
no any benefit from FBS, while enabling it only here somehow leads to
"port not ready" errors on Intel 63XXESB2 controller.
Tested by: Larry Rosenman <ler AT lerctr.org>
pointers together, move padding to the bottom of the structure, and add
two new integer spares due to attrition over time. Remove unused spare
"flags" field, we can use one of the spare ints if we need it later.
This change requires a rebuild of device driver modules that depend on
the layout of ifnet for binary compatibility reasons.
Discussed with: kmacy
which are not in a module of their own like gif.
Single kernel compiles and universe will fail if the size of the struct
changes. Th expected values are given in sys/vimage.h.
See the comments where how to handle this.
Requested by: peter
architecture to implement size-guards on the vimage vnet_* structures.
As CTASSERT_EQUAL() needs special compile time options we back it
by CTASSERT() in the default case. Unfortunately CTASSERT() triggers
first, thus add an option to allow compilation with CTASSERT_EQUAL() only.
See the comments how to get new values if you trigger the assert
and what to do in that case.
Reviewed by: rwatson, zec (earlier versions)
It's better to just use internal language constructs, because it is
likely the compiler has a better opinion on whether to perform inlining,
which is very likely to happen to struct winsize.
Submitted by: Christoph Mallon <christoph mallon gmx de>
It takes a positive integer constant (the expected value) and
another positive integer, usually compile-time evaluated,
e.g. CTASSERT_EQUAL(FOO_EXPECTED_SIZE, sizeof (struct foo));
While the classic CTASSERT() gives:
error: size of array '__assert60' is negative
this gives you:
In function '__ctassert_equal_at_line_60':
warning: '__expected_42_but_got[464ul]' is used uninitialized in this function
and you can directly see the difference in the expected and the
real value.
CTASSERT_EQUAL() needs special compile time options to trigger
thus keep it locally to this header. If it proves to be of general
interest it can be moved to systm.h.
Submitted by: jmallett
Reviewed by: sam, warner, rwatson, jmallett (earlier versions)
* Add RB_FOREACH_FROM() which continues traversal *at*
the y-node provided. There is no pre-increment.
* Nuke RB_FOREACH_SAFE as it was buggy; it would omit the final node.
* Replace RB_FOREACH_SAFE() with a working implementation
derived from RB_FOREACH_FROM().
The key observation is that we now only check the loop-control
variable, but still cache the next member pointer.
* Add RB_FOREACH_REVERSE_FROM() which continues backwards
traversal *at* the y-node provided. There is no pre-increment.
Typically this is used to back out of allocations made
whilst walking an RB-tree.
* Add RB_FOREACH_REVERSE_SAFE() which performs insertion and
deletion safe backwards traversal.
and partially r188903. Revert breaks new drives detection on reinit to the
state as it was before me, but fixes series of new bugs reported by some
people.
Unconditional queueing of ata_completed() calls can lead to deadlock if
due to timeout ata_reinit() was called at the same thread by previous
ata_completed(). Calling of ata_identify() on ata_reinit() in current
implementation opens numerous races and deadlocks.
Problems I was touching here are still exist and should be addresed, but
probably in different way.
When copying big structures, LLVM generates calls to memmove(), because
it may not be able to figure out whether structures overlap. This caused
linker errors to occur. memmove() is now implemented using bcopy().
Ideally it would be the other way around, but that can be solved in the
future. On ARM we don't do add anything, because it already has
memmove().
Discussed on: arch@
Reviewed by: rdivacky
drivers' probe routines. It allows not to sleep and so not drop Giant inside
ata_identify() critical section and so avoid crash if it reentered on
request timeout. Reentering of probe call checked inside of it.
Give device own knowledge about it's type (ata/atapi/atapicam). It is not
a good idea to ask channel status for device type inside ata_getparam().
Add softc memory deallocation on device destruction.
On FreeBSD, this is the default behaviour. According to the spec, we may
give this flag a value of zero, but I'd rather not do this. If we define
it to a non-zero value, we can always change default behaviour without
changing the ABI. This is very unlikely to happen, though.
wcscasecmp(), and wcsncasecmp().
- Make some previously non-standard extensions visible
if POSIX_VISIBLE >= 200809.
- Use restrict qualifiers in stpcpy().
- Declare off_t and size_t in stdio.h.
- Bump __FreeBSD_version in case the new symbols (particularly
getline()) cause issues with ports.
Reviewed by: standards@
at irq install/uninstall time, but when we vt switch, we uninstall the
irq handler. When the irq handler is reinstalled, the modeset ioctl
happens first. The modeset ioctl is supposed to tell us that we can
disable vblank interrupts if there are no active consumers. This will
fail after a vt switch until another modeset ioctl is called via dpms
or xrandr. Leading to cases where either interrupts are on and can't
be disabled, or worse, no interrupts at all.
MFC after: 2 weeks
usb stack rather than with the rest of the processor support code.
Not sure that's a good idea, as we were moving away from it, but this
fixes the build in the mean time so we can have that discussion.
- Fix the copy, we can't do a blind copy but must transfer
the data from the old to the new.
- Fix the ACK processing so we properly stop retransmitting
the thing.
- Fix it so if we get a retran we will properly reply with
the saved response without doing anything.
MFC after: 1 month
the devfs clone handler to open the (invisible) devices on the fly.
The /dev entries are layed out as follows,
/dev/usbctl = master device
/dev/usb/0.1.0.5 = usb device, (<bus>.<dev>.<iface>.<endpoint>)
/dev/ugen0.1 -> usb/0.1.0.0 = ugen link to ctrl endpoint
This also removes the custom permissions model from USB. Bump
__FreeBSD_version to 800066.
Submitted by: rink (earlier version)
net/route.h.
Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.
We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.
This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.
printf() and vprintf() are exactly the same, except the way arguments
are passed. Just like we see in other pieces of code (i.e. libc's
printf()), implement printf() using vprintf().
Submitted by: Christoph Mallon <christoph mallon gmx de>
As mentioned by bz and bde, the change I made wasn't the proper way to
fix. Inspired by bde's patch, perform some small cleanups to uprintf().
Reviewed by: bz
Previously, DBCR0 flags were set "globally", but this leads to problems
because Book-E fine grained debug settings work only in conjuction with the
debug master enable bit in MSR: in scenarios when the DBCR0 was set with
intention to debug one process, but another one with MSR[DE] set got
scheduled, the latter would immediately cause debug exceptions to occur upon
execution of its own code instructions (and not the one intended for
debugging).
To avoid such problems and properly handle debugging context, DBCR0 state
should be managed individually per process.
Submitted by: Grzegorz Bernacki gjb ! semihalf dot com
Reviewed by: marcel
The function pow() in libmp(3) clashes with pow(3) in libm. We could
rename this single function, but we can just take the same approach as
the Solaris folks did, which is to prefix all function names with mp_.
libmp(3) isn't really popular nowadays. I suspect not a single
application in ports depends on it. There's still a chance, so I've
increased the SHLIB_MAJOR and __FreeBSD_version.
Reviewed by: deischen, rdivacky
kernel dumping case.
ata_completed() may initiate ata_reinit() on error, that may lead to drives
attach or detach. Attach and detach are sending requests to drives and sleep
waiting for results. But ata_finish() can be called directly from
interrupt handler where sleeping is prohibited, so we must break this chain
somewhere. This place seems to fit best.
go before the standard -I$S to have the search happen before /sys/.
Make this conditional on 'makeoptions WITH_LEGACY=1' in the kernel config.
Prodded by: sam
Disable MSI for nVidia MCP51 controller. Enabling MSI there leads to
unexpected errors and timeouts, that should not happen even if interrupts
are not working completely.
Currently bread()-ing through device vnode with
(1) VMIO enabled,
(2) bo_bsize != DEV_BSIZE
(3) more than 1 block
results in data being incorrectly cached.
So instead a more common approach of using a vnode belonging to fs is now
employed.
Also, prevent attempt to bread more than MAXBSIZE bytes because of
adjustments made to account for offset that doesn't start on block
boundary.
Add expanded comments to explain the calculations.
Also drop unused inline function while here.
PR: kern/120967
PR: kern/129084
Reviewed by: scottl, kib
Approved by: jhb (mentor)
Inside do_execve(), we have a pointer `ndp', which always points to
`&nd'. I can imagine a primitive (non-optimizing) compiler to really
reserve space for such a pointer, so just remove the variable and use
`&nd' directly.
kern_time.c:
- Unused variable `p'.
kern_thr.c:
- Variable `error' is always caught immediately, so no reason to
initialize it. There is no way that error != 0 at the end of
create_thread().
kern_sig.c:
- Unused variable `code'.
kern_synch.c:
- `rval' is always assigned in all different cases.
kern_rwlock.c:
- `v' is always overwritten with RW_UNLOCKED further on.
kern_malloc.c:
- `size' is always initialized with the proper value before being used.
kern_exit.c:
- `error' is always caught and returned immediately. abort2() never
returns a non-zero value.
kern_exec.c:
- `len' is always assigned inside the if-statement right below it.
tty_info.c:
- `td' is always overwritten by FOREACH_THREAD_IN_PROC().
Found by: LLVM's scan-build
... as opposed to files with data in extents.
Some UDF authoring tools produce this type of file for sufficiently small
data files.
Approved by: jhb (mentor)
Use bit-shift instead of division/multiplication.
Act on error as soon as it is detected.
Report attempt to read data embedded in file entry via regular way.
While there, fix lblktosize macro and make use of it.
No functionality should change as a result.
Approved by: jhb (mentor)
`p' is already initialized with `td->td_proc'. Because td is always
curthread, it is safe to initialize it without any locks.
Found by: LLVM's scan-build
priv:kernel:priv_check:priv_ok fires for granted privileges
priv:kernel:priv_check:priv_errr fires for denied privileges
The first argument is the requested privilege number. The naming
convention is a little different from the OpenSolaris equivilent
because we can't have '-' in probefunc names, and our privilege
namespace is different.
MFC after: 1 week
fault. In r188331 this update was relocated because of synchronization
changes to a place where it would occur on both hard and soft faults. This
change again restricts the update to hard faults.
required to make 3CR990 familiy controllers run on NV flash
firmware version 03.001.008.
The latest firmware added HMAC digest information so teach txp(4)
to pass them to sleep image before downloading is started.
While I'm here restore previous IMR/IER register if firmware
downloading have failed.
PR: kern/89876, kern/132047
The old BTX passed the general purpose registers from the 32-bit client to
the routines called via virtual 86 mode. The new BTX did the same thing.
However, it turns out that some instructions behave differently in virtual 86
mode and real mode (even though this is under-documented). For example, the
LEAVE instruction will cause an exception in real mode if any of the upper
16-bits of %ebp are non-zero after it executes. In virtual 8086 mode the
upper 16-bits are simply ignored. This could cause faults in hardware
interrupt handlers that inherited an %ebp larger than 0xffff from the 32-bit
client (loader, boot2, etc.) while running in real mode.
To fix, when executing hardware interrupt handlers provide an explicit clean
state where all the general purpose and segment registers are zero upon
entry to the interrupt handler. While here, I attempted to simplify the
control flow in the 'intusr' code that sets up the various stack frames
and exits protected mode to invoke the requested routine via real mode.
A huge thanks to Tor Egge (tegge@) for debugging this issue.
Submitted by: tegge
Reviewed by: tegge
Tested by: bz
MFC after: 1 week
function, done in r188334. Instead, collect the entries that shall be
freed, in the deferred_freelist member of the map. Automatically purge
the deferred freelist when map is unlocked.
Tested by: pho
Reviewed by: alc
an NFS file. Now the priming is conditional on a new
vfs.nfs.prime_access_cache sysctl. For now I've left the default setting
to disabling the priming.
Requested by: scottl
checks for the tcpcb, previously used to detect complete disconnection,
with INP_DROPPED checks. Correct that, preventing shutdown() from
improperly generating a TCP segment with destination IP and port of
0.0.0.0:0.
PR: kern/132050
Reported by: david gueluy <david.gueluy at netasq.com>
MFC after: 3 weeks
- We don't need to exit the Giant mutex when sleeping. This is done
automatically. Replace Giant by NULL mutex for all control requests in the
enumeration path.
- Optimise away duplicate alternate interface selection requests in USB Host
mode.
Submitted by: Hans Petter Selasky
Improvements to "usb2_transfer_setup()" and "usb2_transfer_unsetup()". Set
"ppxfer[n]" when the transfer setup is complete to prevent races. Remove
redundant NULL-checks from "usb2_transfer_unsetup()".
Submitted by: Hans Petter Selasky
- The software computed HID size is not always correct, because the algoritm
does not handle unsorted HID descriptors.
- Change the way we obtain the report ID.
- Use the X/Y/Z+button locations instead for report ID source for ums.
- Add more range checks.
- Remove Microsoft Mouse quirks. If the positions are moduloed the report
length multiplied by 8, the values seem correct.
- Some minor style changes.
Submitted by: Hans Petter Selasky
o add ah_configPCIE and ah_disablePCIE for drivers to configure PCIE
power save operation (modeled after ath9k, may need changes)
o add private state flag to indicate if device is PCIE (replaces private
hack in 5212 code)
o add serdes programming ini bits for 5416 and later parts and setup
for each part (5416 and 9160 logic hand-crafted from existing routines);
5212 remains open-coded but is now hooked in via ah_configPCIE
o add PCIE workaround gunk
o add ar5416AttachPCIE for iodomatic code used by 5416 and later parts
o add output mux support
o gpio pin count is chip-dependent
o 9280 and 9285 do input handling different
o hookup gpio interrupts
o no need to save/restore soft led state around reset
1 will trigger a pass through the VM's low-memory handlers, such as
protocol and UMA drain routines. This makes it easier to exercise
these otherwise rarely-invoked code paths.
MFC after: 3 days
to in_rtrequest(); the radix head lock is already acquired before
rnh_walktree is called in in_rtqtimo_one(). This avoids a recursive
acquisition that is no longer permitted in 8.x due to use of an rwlock
for the radix head lock.
Reported by: dikshie <dikshie at gmail.com>
MFC after: 3 days
msdosfs_unmount() and ffs_unmount() exit early after getting ENXIO.
However, dounmount() treats ENXIO as a success and proceeds with
unmounting. In effect, the filesystem gets unmounted without closing
GEOM provider etc.
Reviewed by: kib
Approved by: rwatson (mentor)
Tested by: dho
Sponsored by: FreeBSD Foundation
register instead of AAC_RX_FWSTATUS, as that is the way it's done in
Adaptec's vendor driver and in the Linux drivers. (The same applies
to aac_rkt_get_fwstatus as well.)
However, a concern has been raised about the compatibility of this
change and old hardware / firmware versions. In the absense of
specific information, revert to the original behaviour if the firmware
does not support the "New comm." interface. Users of old cards or
firmware haven't reported the problems that are potentially solved by
switching to OMR0.
different cpu is still assigned to that vector by never clearing idt
entries. This was only provided as a debugging feature and the bugs
are caught by other means.
- Drop the sched lock when rebinding to reassign an interrupt vector
to a new cpu so that pending interrupts have a chance to be delivered
before removing the old vector.
Discussed with: tegge, jhb
- protect againtst recursions,
- add new devices detection using ata_identify().
Improve ata_identify():
- do not add duplicate device if device already exist.
Rework SATA hot-plug events handling. Instead of unsafe duplicate
implementation use common ata_reinit() to handle all state changes.
All together this gives quite stable and robust cold- and hot-plug operation,
invariant to false, lost and duplicate events.
properly. Otherwise the minimum of 1 is used and you can
only insert a single partition/slice and only at sector
0 (index 1).
o When adding a partition/slice, recalculate the index after
the start and size of the partition/slice are adjusted to
make them a multiple of the track size. Since the precheck
method sets the index based on the start of the partition
as provided by the user, we know that we're off by at most
1 and adjusting the index is safe.
[1] Add the support for the NARK controller which seems a variant of
the i960Rx.
[2] Split up memory regions and other resources in 2 different parts
as long as NARK uses them separately (it is not clear to me
why though as long as there are no more informations available
on this controller). Please note that in all the other cases,
the regions overlaps leaving the default behaviour for all the
other controllers.
[3] Implement a clock daemon responsible for maintain updated the
wall clock time of the controller (run any 30 minutes)*.
Submitted by: Adaptec (driver build 15317 [1, 2] and 15727 [3])
Reviewed by: emaste
Tested by: emaste
Sponsored by: Sandvine Incorporated
* Please note that originally, in the Adaptec driver, the clock daemon
is not implemented with callouts as in our in-tree driver.
predefined set of methods, which are set in osd_register() and called
via osd_call(). Currently, no methods are defined, though prison
objects will have some in the future.
Expand the locking from a single per-type mutex to three different kinds
of locks (four if you include the requirement that the container
(e.g. prison) be locked when getting/setting data). This clears up one
existing issue, as well as others added by the method support.
Approved by: bz (mentor)
ATA specification declares minimal reset time of 5us. SATA keeps it, but
requires devices to handle commands transmitted even one by one without
any gap.
The existing code calls kern_open() to resolve the vnode of a pathname
right after a stat(). This is not correct, because it causes random
character devices to be opened in /dev. This means ls'ing a tape
streamer will cause it to rewind, for example. Changes I have made:
- Add kern_statat_vnhook() to allow binary emulators to `post-process'
struct stat, using the proper vnode.
- Remove unneeded printf's from stat() and statfs().
- Make the Linuxolator use kern_statat_vnhook(), replacing
translate_path_major_minor_at().
- Let translate_fd_major_minor() use vp->v_rdev instead of
vp->v_un.vu_cdev.
Result:
crw-rw-rw- 1 root root 0, 14 Feb 20 13:54 /dev/ptmx
crw--w---- 1 root adm 136, 0 Feb 20 14:03 /dev/pts/0
crw--w---- 1 root adm 136, 1 Feb 20 14:02 /dev/pts/1
crw--w---- 1 ed tty 136, 2 Feb 20 14:03 /dev/pts/2
Before this commit, ptmx also had a major number of 136, because it
silently allocated and deallocated a pseudo-terminal. Device nodes that
cannot be opened now have proper major/minor-numbers.
Reviewed by: kib, netchild, rdivacky (thanks!)
as ATA RAID, but generic ATAPCI driver unable to detect drives there. AHCI
driver reported to handle them fine. Linux does the same.
Submitted by: Andrey V. Elsukov on stable@
This fixes the low "max device openings" count that has lead to poor
performance in FreeBSD 7.0 and 7.1.
Extra thanks goes to Mike Tancsa at Sentex for providing a debug system for
this.
1. Extend geom_dev by having it create the symlink (i.e. call
make_dev_alias) based on the DIOCGPROVIDERALIAS ioctl.
In this way the functionaility is generic and thus usable
by any geom/provider.
2. Have g_part handle said ioctl through the devalias method,
so that it's under control of the scheme itself. By design
the alias will not be created for newly added partitions.
stale entries, we save a copy of the directory's modification time when
the first negative cache entry was added in the directory's NFS node.
When a negative cache entry is hit during a pathname lookup, the parent
directory's modification time is checked. If it has changed, all of the
negative cache entries for that parent are purged and the lookup falls
back to using the RPC. This required adding a new cache_purge_negative()
method to the name cache to purge only negative cache entries for a given
directory.
Submitted by: mohans, Rick Macklem, Ricardo Labiaga @ NetApp
Reviewed by: mohans
opportunistic ACCESS RPC to populate both the access and attribute caches
of the file and instead always use a GETATTR RPC. On many modern NFS
servers, an ACCESS RPC is much more expensive to service than a GETATTR
RPC.
Submitted by: mohans
open() of the same file will load fresh attributes, so they do not need to
be explicitly flushed in close() to guarantee close to open consistency.
However, other file desciptors may still reference this file and clearing
the attributes in close() forces those other file descriptors to fetch
fresh attributes the next time they need them.
Reviewed by: mohans
MFC after: 1 week
- Don't return a negative errno when using an unknown ioctl() on a
pseudo-terminal master device. Be sure to convert ENOIOCTL to ENOTTY,
just like the TTY layer does.
- Even though we should return st_rdev of the master device node when
emulating pty(4) devices, FIODGNAME should still return the name of
the slave device. Otherwise ptsname(3) and ttyname(3) return an
invalid device name.
Not enough space in user-land buffer is not an error, userland
will read further until eof is reached. So instead of propagating
-1 to caller we convert it to zero/success.
cd9660 code works exactly the same way.
PR: kern/78987
Reviewed by: jhb (mentor)
Approved by: jhb (mentor)
This is triggered only if BIOS configures ACPI_BITREG_BUS_MASTER_RLD
aka BRLD_EN_BM to 1.
Rationale:
1. we do not support C3 on PIIX4E
2. bus master activity need not break out of C2 state
3. because of CPU_QUIRK_NO_BM_CTRL quirk we may reset bus master
status which would result in immediate break out from C2
So if you have seen
cpu0: too many short sleeps, backing off to C1
with this chipset before you may want to try cx_lowest of C2 again.
Reviewed by: rpaulo (mentor), njl
Approved by: rpaulo (mentor)
ata_detach() to implement IOCATAATTACH/IOCATADETACH ioctls.
This will permit channel drivers to properly shutdown port hardware on channel
detach and init it on attach.
and xmit parameters. This makes it possible to use tdma on fractional
channels.
o add IEEE80211_MODE_HALF and IEEE80211_MODE_QUARTER; note these are
band-agnostic (may need revisiting)
o setup all default rates in ic_sup_rates instead of doing it only
for active modes; we need these to calculate the default tx parameters
which are not recalculated after a regulatory update (can't just
recalculate after installing a new channel list because we might
clobber user settings)
o remove special case code in ieee80211_get_suprates; this is now
a candidate for an inline or removal
o add various entries for new modes (roaming+tx params, wme, rate
mapping, scan set setup, country ie construction, tdma, basic rates)
Note these modes are intentionally not visible through if_media.
parameters for IEEE80211_IOC_ROAM and IEEE80211_IOC_TXPARAMS; this
lets us add more modes and still have old apps work
o consolidate loops to remote assumptions about mode ordering
o mark phy type to indicate 1/2 or 1/4-rate operation
o use phy type instead of channel attributes to identify 1/2 and 1/4-rate
operation
o general cleanup of code including move phy constants to ah_internal.h
Eventually this code should go away and we should use the net0211 equivalents.
the old TTY implementation; however, take a cut at stripping its
optional Giant-protected code paths enabled using debug.cx.mpsafenet,
which will no longer work once IFF_NEEDSGIANT is removed.
join allocate() and dmainit() atapci subdriver's channel initialization
methods into single ch_attach() method.
As opposite to ch_attach() add new ch_detach() method to deallocate/disable
channel.
members for a kinfo entry on a process-wide system.
- Use the newly introduced function in order to fix cases like
KERN_PROC_PROC where aggregating stats are broken because they just
consider the first thread in the pool for each process.
(Note, additively, that KERN_PROC_PROC is rather inaccurate on
thread-wide informations like the 'state' of the process. Such
informations should maybe be invalidated and being forceably discarded
by the consumers?).
- Simplify the logic of sysctl_out_proc() and adjust the
fill_kinfo_thread() accordingly.
- Remove checks on the FIRST_THREAD_IN_PROC() being NULL but add
assertives.
This patch should fix aggregate statistics for KERN_PROC_PROC.
This is one of the reasons why top doesn't use this option and now it
can be use it safely.
ps, when launched in order to display just processes, now should report
correct cpu utilization percentages and times (as opposed by the old
code).
Reviewed by: jhb, emaste
Sponsored by: Sandvine Incorporated
it instead of overloading sbp_show_sdev_info().
replace calls to printf with calls to device_printf and cleanup debug
messages
Remove a bit of dead, commented out code.
Reviewed by: scottl(mentor)
MFC after: 2 weeks
- Update to firmware 1.4.39 for dual-chip NIC (10G-PCIE2-xxx)
support, and SFP+ i2c support
- Identify newer "B" NICs (10G-PCIEx-8B-x) correctly, rather than
mis-identifying them as "A" NICs (cosmetic only)
- Identify the IFM_10G_LRM ifmedia type, where applicable.
- Identify ifmedia types for SFP+ based NICs
- Update copyright
Sponsored by: Myricom
MFC after: 1 week
Deprecate unused phy_delay Self ID field as it was removed
by 1394a-2000.
Attempt to parse extended Self ID PHY packets if they are detected
Reviewed by: scottl (mentor)
MFC after: 2 weeks
interesting conditional printf's into single device_printf's
Change a couple of variable names so that I don't have to trace what they
acutally do anymore.
Enable the display of the Self ID PHY packet if firewire_debug > 0
Reviewed by: scottl(mentor)
MFC after: 2 weeks
Prior to this fix, IEVENT register was always cleared before calling
tsec_error_intr_locked(), which prevented error recovery actions from
happening with polling enabled (and could lead to serious problems, including
controller hang).
Submitted by: Marcin Ligenza marcinl ! pacomp dot com dot pl
- interrupt coalescing
- polling
- jumbo frames
- multicast
- VLAN tagging
The enhanced version of the chip (eTSEC) can also take advantage of:
- TCP/IP checksum calculation h/w offloading
Obtained from: Freescale, Semihalf
If speed of link between two devices is slower than the reported max
speed of both endpoints, the current driver will fail and be unable to
negotiate.
Summary:
Test negotiated speed by reading the CSRROM into a dummy variable.
If that read fails, decrement our speed and retry. If all else fails,
go to lowest speed possible(0).
Report speed to the user.
Add display of the Bus Info Block when debug.firewire_debug > 1
Support the Bus Info Block(1394a-2000) method of speed detection.
I also should note that I am moving "hold_count" to 0 for future
releases.
This variable determines how many bus resets to "hold" a removed
firewire device before deletion. I don't feel this is useful and will
probably drop support for this sysctl in the future.
Reviewed by: scottl(mentor)
MFC after: 2 weeks
almost once. After we've configured the devices that were present the
first time through, then we know that we're done. If the device has
other devices that are deferred, then it must do a similar dance.
This catches both PC Cards and CardBus cards.
to not allocate them after the recent ata channels enumeration changes.
It allows to save some resources, not bother user with unexisting hardware
and not check unimplemented ports status on every interrupt.
method allows schemes to reject the ctl request, pre-check
the parameters and/or modify/set parameters. There are 2
use cases that triggered the addition:
1. When implementing a R/O scheme, deletes will still
happen to the in-memory representation. The scheme is
not involved in that operation. The pre-check method
can be used to fail the delete up-front. Without this
the write to disk will typically fail, but at that
time the delete already happened.
2. The EBR scheme uses a linked list to record slices.
There's no index. The EBR scheme defines the index
as a function of the start LBA of the partition. The
add verb picks an index for the range and then invokes
the add method of the scheme to fill in the blanks. It
is too late for the add method to change the index.
The pre-check is used to set the index up-front. This
also (silently) overrides/nullifies any (pointless)
user-specified index value.
Works fine with AHCI and theoretically other MSI capable devices.
At this moment support disabled by default. To enable it, set
"hint.atapci.X.msi=1" device hint.
Add two new functions to the libusb20 API and required kernel ioctls.
- libusb20_dev_get_iface_desc
- libusb20_dev_get_info
New command to usbconfig, "show_ifdrv", which will print out the kernel driver
attached to the given USB device aswell.
See "man libusb20" for a detailed description.
Some minor style corrections long-line wrapping.
Submitted by: Hans Petter Selasky
- specification claims that 1 second is just a maximum controller reset time;
implement controller reset properly to save almost 1 second of boot, and
about half second of resume time;
- enable channel interrupts only after channel status reset to fix duplicate
device creation on resume due to unwanted device connection event;
- as described in specification, wait for disk ready status after channel
power-up; it is not so important when disk already touched by BIOS, but
solves device not ready problems on resume and probably some other cases.
- uncomment channel stop/start on soft-reset as it is declared mandatory by
specification; it was commented due to some random drive detection problems
on VIA and JMicron controllers, but I hope it is fixed by previous point.
If the file system backing a process' cwd is removed, and procstat -f PID
is called, then these messages would have been printed. The extra verbosity is
not required in this situation.
Requested by: kib
Approved by: kib
Move channel softc initialization from ata_XXX_probe() to ata_XXX_attach().
Instead of calculating ata channel number as position in child device list,
pass it's real number directly from controller probe routine using ivars.
It is simpler and IMHO more correct.
revealed that a process' current working directory can be VBAD if the
directory is removed. This can trigger a panic when procstat -f PID is
run.
Tested by: pho
Discovered by: phobot
Reviewed by: kib
Approved by: kib
Giving a charactere device execute permissions doesn't have any use.
Right now there isn't a single device node in /dev that has it, except
the USB2 device node, so remove it.
Approved by: hps, thompsa
longer do we require SCTP to be in the kernel for the
lib to be able to handle SCTP. We do this by moving
the CRC32c checksum into libkern/crc32.c and then adjusting
all routines to use the common methods. Note that this
will improve the performance of iSCSI since they were
using the old single 256 bit table lookup versus the
slicing 8 algorithm (which gives a 4x speed up in
CRC32c calculation :-D)
Reviewed by:rwatson, gnn, scottl, paolo
MFC after: 4 week? (assuming we MFC the alias_sctp changes)
fails to attach (possibly due to disable hints) then we get called back for
unload. Correctly handle the case where the keyboard isnt found rather than
calling panic.
- Make usb2_transfer_pending() part of the USB core header file.
- Make usb2_transfer_pending() NULL safe.
- Make sure that USB process functions return if the process has been drained.
- Remove two unused functions.
Submitted by: Hans Petter Selasky
name) (not sure whether this works correctly, but should be close).
Fix the stub attach phase for some Novatel cards. They expect the CSW
(repsonse to CBW, SCSI eject command) to be fetched before switching to
modem mode.
MFC after: 2 weeks
after the LLADDR is reclaimed which causes a null pointer deref with
inherit_mac enabled. Record the ifnet pointer of the interface and then compare
that to find when to re-assign the bridge address.
Submitted by: sam
Add a note next to fields in network format.
The n_* types are not enough for compiler checks on endianness, and their
use often requires an otherwise unnecessary #include <netinet/in_systm.h>
The typedef in in_systm.h are still there.
the correct behaviour (sorting by distance from the current head position
in the scan direction) and bioq_insert_head() and bioq_insert_tail()
have a well defined (and useful) behaviour, especially when intermixed
with calls to bioq_disksort().
In particular:
- fix a bug in the existing bioq_disksort() that did not use the
current head position correctly;
- redefine semantics of bioq_insert_head() and bioq_insert_tail().
bioq_insert_tail() can now be used as a barrier
between previous and subsequent calls to bioq_disksort().
The code is heavily documented in the source code so please refer
to that for the details.
Much of this code comes from Fabio Checconi. Also thanks to Kirk
for feedback on the (re)definition of bioq_insert_tail().
NOTE: in the current tree there is only a handful of files which
intermix calls to bioq_disksort() with bioq_insert_head() and
bioq_insert_tail(). The ordering of the queue in these situation
was not specified (nor easy to figure out) before, so I doubt any
of that code could be affected by the specification of the API.
Also note that the current implementation is significantly simpler
than the previous one (also used in ata_sort_queue()).
It would be useful to reimplement ata_sort_queue() using
the same code used in bioq_disksort().
MFC after: 1 week
a serial number, fall through to the next case so that initial negotiation
still happens. Without this, devices were showing up with only 1 available
tag opening, leading to observations of very poor I/O performance.
This should fix problems reported with VMWare Fusion and ESX. Early
generation MPT-SAS controllers with SATA disks might also be affected.
HP CISS controllers are also likely affected, as are many other
pseudo-scsi disk subsystems.
references to iv_bss and the sta table; this is equivalent and causes
direct reclaim of the old bss node when any references in packets inflight
are reclaimed (previously the old node would sit in the bss table until
the inactivity processing reclaimed it)
o remove ieee80211_node_reclaim now that it's only use is gone
Reviewed by: avatar, cbzimmer
parent interface tasks to complete. This had been added to the ioctl path but
it is also need elsewhere like detach so its safe to teardown.
Reported by: Hans Petter Selasky
Submitted by: sam
checked whether this applies to builds in /sys/*/compile/* as well):
- Create empty opt_*.h files were missing
- Hook up svr4 to the build. It compiles fine here, so no reason to
disconnect it in the Makefile. were missing
- Hook up svr4 to the build. It compiles fine here, so no reason to
disconnect it in the Makefile.
kernel. Rather than just kick off the page daemon, we actively retire
more mappings. The inner loop now looks a lot like the inner loop of
pmap_remove_all.
Also, get_pv_entry can't return NULL now, so remove panic if it did.
Reviewed by: alc@
- Always read the character device pointer while the associated devfs vnode
is locked. Also, use dev_ref() to obtain a new reference on the vnode for
the mountpoint. This reference is released on unmount. This mirrors the
earlier fix to FFS.
Reviewed by: kib
cleanup. Before the GEOM consumer would not have been closed.
- Bump the reference on the character device being mounted while the
associated devfs vnode is locked.
Reviewed by: kib
guaranteed to initialize its two last arguments. Therefore, there is a
warning in the subsequent caller ar5416FillVpdTable(), which doesn't
initialize those arguments.
Change getLowerUpperIndex() to assign values to indexL and indexR even
in the case of assertion failure.
Submitted by: Pavel Roskin <proski@gnu.org>
Just like the old TTY layer, the current MPSAFE TTY layer does not make
any attempt to serialize calls of write(). Data is copied into the
kernel in 256 (TTY_STACKBUF) byte chunks. If a write() call occurs at
the same time, the data may interleave. This is especially likely when
the TTY starts blocking, because the output queue reaches the high
watermark.
I've implemented this by adding a new flag, TTY_BUSY_OUT, which is used
to mark a TTY as having a thread stuck in write(). Because I don't want
non-blocking processes to be possibly blocked by a sleeping thread, I'm
still allowing it to bypass the protection. According to this message,
the Linux kernel returns EAGAIN in such cases, but I think that's a
little too restrictive:
http://kerneltrap.org/index.php?q=mailarchive/linux-kernel/2007/5/2/85418/thread
PR: kern/118287
from the parent to the child process if they have an operation vector
of &badfileops. This narrows a set of races involving system calls that
allocate a new file descriptor, potentially block for some extended
period, and then return the file descriptor, when invoked by a threaded
program that concurrently invokes fork(2). Similar approches are used
in both Solaris and Linux, and the wideness of this race was introduced
in FreeBSD when we moved to a more optimistic implementation of
accept(2) in order to simplify locking.
A small race necessarily remains because the fork(2) might occur after
the finit() in accept(2) but before the system call has returned, but
that appears unavoidable using current APIs. However, this race is
vastly narrower.
The fix can be validated using the newfileops_on_fork regression test.
PR: kern/130348
Reported by: Ivan Shcheklein <shcheklein at gmail dot com>
Reviewed by: jhb, kib
MFC after: 1 week
in 'sc_state'. This allows the lpt_release_ppbus() calls in those two
routines to actually release the ppbus and thus fixes the hangs noticed
with the lpt(4) driver since the recent ppbus changes. The old lpt(4)
driver didn't actually check the HAVEBUS flag in lpt_release_ppbus() which
is why these bugs weren't noticed before.
use patches so far:
+ Envy24:
- fix: broken init data for M Audio Delta DiO 2496
- add: support for M Audio Delta 44
- add: support for M Audio Delta 1010LT
Tested by: Dominique Goncalves, dominique.goncalves at gmail.com
- add: support for Terratec EWX 2496
Tested by: Stefan Sperling, stsp at stsp.name
- add: support for M Audio Delta 66
Tested by: Richard Bown, richard.bown at blueyonder.co.uk
- add: support for M Audio Delta 1010
Tested by: Andrew Reilly, areilly at bigpond.net.au
+ Envy24HT:
- add: support for Terrasoniq TS22PCI
- fix: M-Audio Revolution 5.1 sound volume is very low
Reported by: Oliver Hartmann, ohartman at zedat.fu-berlin.de
Andrey Slusar, anrays at gmail.com
Tested by: Andrey Slusar, anrays at gmail.com
Rusu Silviu, arol.the at gmail.com
- fix: M-Audio Revolution 7.1 sound is distorted and very quiet
Reported by: Olev Hannula, hannula at gmail.com
Tested by: Olev Hannula, hannula at gmail.com
Stanislav Belansky, stanislav at icmail.ru
- fix: Terratec PHASE 22 codec is power-off due to wrong init data
Reported by: Philipp Ost, pj at smo.de
Tested by: Philipp Ost, pj at smo.de
+ SpicDS:
- fix: AK4381 produce hiss sound on 192kHz sample rate
- fix: stupid bug with volume control for AK4396
Submitted by: Konstantin Dimitrov <kosio.dimitrov@gmail.com>
kernel one as the non-faulting flush address in the loader so
we can can change KERNBASE and VM_MIN_KERNEL_ADDRESS if we
ever want to without needing to worry about using a compatible
loader.
- Correctly check for LOADER_DEBUG.
- Add a missing const for page_sizes[].
This file was missed in r188455.
o Fix a minor indentation problem.
o Put in the extra-strict KOBJMETHOD define, but commented out since
the tree isn't yet ready.
Reviewed by: (1) was posted to arch@ without objection (and 1 go for it)
Attach call without devclass set crashes the system.
On resume AHCI driver sometimes tries to create duplicate adX device.
It is surely his own problem, but IMHO it is not a reason to crash here.
Other reasons are also possible.
kernel one as the non-faulting flush address in the loader so
we can can change KERNBASE and VM_MIN_KERNEL_ADDRESS if we
ever want to without needing to worry about using a compatible
loader.
- Correctly check for LOADER_DEBUG.
- Add a missing const for page_sizes[].
result in errors for a format loading but subsequent correct recognizing
for another format.
File format loading functions should avoid printing any additional
informations but just returning appropriate (and different between each
other) error condition, characterizing different informations.
Additively, the linker should handle appropriately different format
loading errors.
While a general mechanism is desired, fix a simple and common case on
amd64: file type is not recognized for link elf and confuses the linker.
Printout an error if all the registered linker classes can't recognize
and load the module.
Reviewed by: jhb
Sponsored by: Sandvine Incorporated
MFp4 //depot/projects/usb; 157412
Sync from svn.freebsd.org/base/user/thompsa/usb which is a minimal changeset
from oldUSB (no config_td).
This excludes the taskqueue changes (for the moment) as requested.
Sync from svn.freebsd.org/base/user/thompsa/usb which is a minimal changeset
from oldUSB (no config_td).
This excludes the taskqueue changes (for the moment) as requested.
- Change "usb2_pause_mtx" so that it takes the timeout value in ticks
- Make sure that attach waits for the generic probe to leave room for firware
loader drivers and device specific USB Audio drivers like USB phone adapters
Submitted by: Hans Petter Selasky
- USB serial drivers cleanup, factor out code
- Simplify line state programming
- Integrate uslcom from old USB stack
Submitted by: Hans Petter Selasky
1. Move most of the ifnet logic into the usb2_ethernet module, this includes,
- make all usb ethernet interfaces named ue%d
- handle all threading in usb2_ethernet
- provide default ioctl handler
- handle mbuf rx
- provide locked callbacks for init,start,stop,etc
2. Cleanup CDC-Ethernet driver.
Submitted by: Hans Petter Selasky
Obtained from: svn.freebsd.org/base/user/thompsa/usb [1]
- Change "usb2_pause_mtx" so that it takes the timeout value in ticks
- Factor out USB ethernet and USB serial driver specific control request.
- USB process naming cleanup.
Submitted by: Hans Petter Selasky
- Change "usb2_pause_mtx" so that it takes the timeout value in ticks
- USB controller: EHCI High Speed Interrupt endpoint fix.
- Fix OHCI and EHCI counting bug when multiple TD's are involved in
a short USB transfer and a short packet happens on the non-last TD in the
USB transfer frame.
- USB process naming cleanup.
Submitted by: Hans Petter Selasky
lookups:
- Honor the caller's locking flags in udf_root() and udf_vget().
- Set VV_ROOT for the root vnode in udf_vget() instead of only doing it in
udf_root().
- Honor the requested locking flags during pathname lookups in udf_lookup().
- Release the buffer holding the directory data before looking up the vnode
for a given file to avoid a LOR between the "udf" vnode locks and
"bufwait".
- Use vn_vget_ino() to handle ".." lookups.
- Special case "." lookups instead of calling udf_vget(). We have to do
extra checking for the vnode lock for "." lookups.
the same dma tag. However, it can happen multiple dma tags share the same
bounce zone too, so add a per-bounce zone map counter, and check it instead of
the dma tag map counter, to know if we have to alloc more pages.
Reported by: miwi
Reviewed by: scottl
controllers that lose Tx completion interrupts under certain
conditions. With this change it's safe to use MSI on PCIe
controllers so enable MSI on these controllers.
found inside extended partitions and used to create logical partitions.
At this time write/modify support is not (yet) present.
The EBR and MBR schemes both check the parent scheme. The MBR will
back-off when nested under another MBR, whereas the EBR only nests
under a MBR.
offset. This is needed for the ehci hardware buffer rings that assume
this behavior.
This is an interim solution, and a more general one is being worked
on. This solution doesn't break anything that doesn't ask for it
directly. The mbuf and uio variants with this flag likely don't work
and haven't been tested.
Universe builds with these changes. I don't have a huge-memory
machine to test these changes with, but will be happy to work with
folks that do and hps if this changes turns out not to be sufficient.
Submitted by: alfred@ from Hans Peter Selasky's original
hold the map lock there, and might need the vnode lock for OBJT_VNODE
objects. Postpone object deallocation until caller of vm_map_delete()
drops the map lock. Link the map entries to be freed into the freelist,
that is released by the new helper function vm_map_entry_free_freelist().
Reviewed by: tegge, alc
Tested by: pho
Reference object, drop the map lock, and then call vm_object_sync().
The object sync might require vnode lock for OBJT_VNODE type objects.
Reviewed by: tegge
Tested by: pho
acquire vnode lock for OBJT_VNODE object after map lock is dropped.
Because we have the busy page(s) in the object, sleeping there would
result in deadlock with vnode resize. Try to get lock without sleeping,
and, if the attempt failed, drop the state, lock the vnode, and restart
the fault handler from the start with already locked vnode.
Because the vnode_pager_lock() function is inlined in vm_fault(),
axe it.
Based on suggestion by: alc
Reviewed by: tegge, alc
Tested by: pho
underlying partitioning scheme.
o Put the start and end of the partition in the XML configuration.
The start and end are the LBAs of the first and last sector
(resp.) of the partition. They are currently identical to the
offset and size attributes, which describe the partition as an
offset and size in bytes, but may not in the future. The start
and end will be used for the logical partition boundaries and
may include metadata. The offset and size will always represent
the useful storage space within the partition. Typically these
two notions are the same, but for logical partitions in an
extended partition, the EBR is more naturally treated as being
part of the partition.
describing why several calls to vm_deallocate_object() with locked map
do not result in the acquisition of the vnode lock after map lock.
Suggested and reviewed by: tegge
be accessible outside vmspace_fork() yet, but locking it would satisfy
the protocol of the vm_map_entry_link() and other functions called
from vmspace_fork().
Use trylock that is supposedly cannot fail, to silence WITNESS warning
of the nested acquisition of the sx lock with the same name.
Suggested and reviewed by: tegge
both node pointer and name component. This does the right thing for
hardlinks to the same node in the same directory.
Submitted by: Yoshihiro Ota <ota j email ne jp>
PR: kern/131356
MFC after: 2 weeks
(duplicate) code in sys/netipsec/ipsec.c and fold it into
common, INET/6 independent functions.
The file local functions ipsec4_setspidx_inpcb() and
ipsec6_setspidx_inpcb() were 1:1 identical after the change
in r186528. Rename to ipsec_setspidx_inpcb() and remove the
duplicate.
Public functions ipsec[46]_get_policy() were 1:1 identical.
Remove one copy and merge in the factored out code from
ipsec_get_policy() into the other. The public function left
is now called ipsec_get_policy() and callers were adapted.
Public functions ipsec[46]_set_policy() were 1:1 identical.
Rename file local ipsec_set_policy() function to
ipsec_set_policy_internal().
Remove one copy of the public functions, rename the other
to ipsec_set_policy() and adapt callers.
Public functions ipsec[46]_hdrsiz() were logically identical
(ignoring one questionable assert in the v6 version).
Rename the file local ipsec_hdrsiz() to ipsec_hdrsiz_internal(),
the public function to ipsec_hdrsiz(), remove the duplicate
copy and adapt the callers.
The v6 version had been unused anyway. Cleanup comments.
Public functions ipsec[46]_in_reject() were logically identical
apart from statistics. Move the common code into a file local
ipsec46_in_reject() leaving vimage+statistics in small AF specific
wrapper functions. Note: unfortunately we already have a public
ipsec_in_reject().
Reviewed by: sam
Discussed with: rwatson (renaming to *_internal)
MFC after: 26 days
X-MFC: keep wrapper functions for public symbols?
solving a possible panic when snd_ai2s is loaded at boot time. Also change
the device setup to indicate to the pcm layer that the device is MPSAFE.
Submitted by: Marco Trillo
Suggestions by: Ariff Abdullah
- ath(4) is the last listed device, so make it's comment last as well
- since we have hints for le(4), bring it back by inserting commented
out line until I check, if it can be safely enabled.
- bring snc(4) explanation
- put pmtimer comment together with other drivers' comments in a block
- bring comments for canbus, canbepm, pmc
olpt comment has been left blank, since I don't know how does this
driver differ from other printer drivers.
- leave pmtimer comment that is common to other architectures.
- bring pbio explanation to the block comment relating to other
drivers in the same block.
- remove misleading nve/nfe comments, which make it hard to
distinguish those two at a first glance
- bring pbio documentation to the block comment together with
other drivers
I also brought commented out line responsible for si(4), since it
seems to compile and already has respective comment in this file.
sequence of pathname components. We walk the list building a string in
the caller's passed in buffer. Currently this only handles path names
in CS8 (character set 8) as that is what mkisofs generates for UDF images.
MFC after: 1 month
global NOTES file.
cx(4) driver isn't present in this file, though it could be. However, cx(4)
seems to be more or less dead -- it hasn't been linked to the modules build,
and after TTY-ng transformations it doesn't compile.
Remove it until cx(4) is broken.
need to explicitly list it here once again. This removes:
WARNING: duplicate option `DEV_URAL' encountered.
WARNING: duplicate device `ural' encountered.
Warnings when compiling LINT on amd64.
- correct format strings
- fill opt_agp.h if AGP_DEBUG is defined
- bring AGP_DEBUG to LINT by mentioning it in NOTES
This should hopefully fix a warning that was...
Found by: Coverity Prevent(tm)
CID: 3676
Tested on: amd64, i386
- Add a separate set of vnode operations that inherits from the fifo ops
and use it for fifo nodes.
- Add a VOP_SETATTR() method that allows setting the size (by silently
ignoring the requests) of fifos. This is to allow O_TRUNC opens of
fifo devices (e.g. I/O redirection in shells using ">").
- Add a VOP_PRINT() handler while I'm here.
- Align the fifo output in fifo_print() with other vn_printf() output.
- Remove the leading space from lockmgr_printinfo() so its output lines up
in vn_printf().
- lockmgr_printinfo() now ends with a newline, so remove an extra newline
from vn_printf().
of devvp becomes VBAD, which UFS incorrectly interprets as snapshot
vnode, which in turns causes panic. Fix it by replacing '!= VCHR'
with '== VREG'.
With this fix in place, you should no longer be able to panic the system
by removing a device with an UFS filesystem mounted from it - assuming
you don't use softupdates.
Reviewed by: kib
Tested by: pho
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
Back in 1.1 of kern_sysctl.c the sysctl() routine wired the "old" userland
buffer for most sysctls (everything except kern.vnode.*). I think to prevent
issues with wiring too much memory it used a 'memlock' to serialize all
sysctl(2) invocations, meaning that only one user buffer could be wired at
a time. In 5.0 the 'memlock' was converted to an sx lock and renamed to
'sysctl lock'. However, it still only served the purpose of serializing
sysctls to avoid wiring too much memory and didn't actually protect the
sysctl tree as its name suggested. These changes expand the lock to actually
protect the tree.
Later on in 5.0, sysctl was changed to not wire buffers for requests by
default (sysctl_handle_opaque() will still wire buffers larger than a single
page, however). As a result, user buffers are no longer wired as often.
However, many sysctl handlers still wire user buffers, so it is still
desirable to serialize userland sysctl requests. Kernel sysctl requests
are allowed to run in parallel, however.
- Expose sysctl_lock()/sysctl_unlock() routines to exclusively lock the
sysctl tree for a few places outside of kern_sysctl.c that manipulate
the sysctl tree directly including the kernel linker and vfs_register().
- sysctl_register() and sysctl_unregister() require the caller to lock
the sysctl lock using sysctl_lock() and sysctl_unlock(). The rest of
the public sysctl API manage the locking internally.
- Add a locked variant of sysctl_remove_oid() for internal use so that
external uses of the API do not need to be aware of locking requirements.
- The kernel linker no longer needs Giant when manipulating the sysctl
tree.
- Add a missing break to the loop in vfs_register() so that we stop looking
at the sysctl MIB once we have changed it.
MFC after: 1 month
sysctls during a linker file unload. We drop the lock when doing similar
operations during a linker file load. To close races, clear the LINKED
flag before dropping the lock so that the linker file is no longer visible
to userland.
MFC after: 1 week
o change tdma packet drop msg when ack required to ATH_DEBUG_TDMA
(ATH_DEBUG_XMIT is too noisy)
o add a debug msg for raw packet drop due to interface down/invalid
o add stats for these two cases
o explain how another drop case is handled
IEEE80211_KEY_DEVKEY
o fix channel power printing (they are signed values)
o add show statab to dump a node table and automatically dump the sta
table of a com structure with /s
o add CFI_SUPPORT_STRATAFLASH compile option to enable support
o add new ioctls to get/set the factory and user/oem segments of the PR
and to get/set Protection Lock Register that fuses the user segment
o add #defines for bits in the status register
o update cfi_wait_ready to take an offset so it can be used to wait for
PR write completion and replace constants w/ symbolic names
Note: writing the user segment isn't correct; committing now to get review.
Sponsored by: Carlson Wireless
Reviewed by: imp, Chris Anderson
in udp6_connect (td is already dereferenced elsewhere without such a
check). This makes the conversion from a sockaddr to a sockaddr_in6
always happen, so convert once at the beginning of the function rather
than twice in the middle.
Approved by: bz (mentor)
prison_check_ip4 and prison_check_ip6. As prison_if includes a jailed()
check, remove that check before calling rtm_get_jailed.
Approved by: bz (mentor)
When we leave the console TTY constantly open, we never reset the
termios attributes. This causes output processing, echoing, etc. not to
be reset to the proper values when going into single user mode after the
system has booted. It also causes nl-to-crnl-conversion not to take
place during shutdown, which causes a `staircase effect'.
This patch adds a new TTY flag, TF_OPENED_CONS, which is set when the
TTY is opened through /dev/console. Because the flags are only used by
the kernel and the pstat(8) utility, I've decided to renumber the TTY
flags. This shouldn't be an issue, because the TTY layer is not yet part
of a stable release.
Reported by: Mark Atkinson <atkin901 yahoo com>
Tested by: sepotvin
jail doesn't support. This involves a new function prison_check_af,
like prison_check_ip[46] but that checks only the family.
With this change, most of the errors generated by jailed sockets
shouldn't ever occur, at least until jails are changeable.
Approved by: bz (mentor)
return zero on success and an error code otherwise. The possible errors
are EADDRNOTAVAIL if an address being checked for doesn't match the
prison, and EAFNOSUPPORT if the prison doesn't have any addresses in
that address family. For most callers of these functions, use the
returned error code instead of e.g. a hard-coded EADDRNOTAVAIL or
EINVAL.
Always include a jailed() check in these functions, where a non-jailed
cred always returns success (and makes no changes). Remove the explicit
jailed() checks that preceded many of the function calls.
Approved by: bz (mentor)
called without calling vfs_busy() first. This made umount(8) hang waiting
for mnt_lockref to become zero, which would never happen.
Reviewed by: kib
Approved by: rwatson (mentor)
Reported by: pho
Found with: stress2
Sponsored by: FreeBSD Foundation
were used only for assertions, and rather than ifdef'ing them
INVARIANTS and using local variables, just directly access so_pcb.
Submitted by: Christoph Mallon <christoph dot mallon at gmx dot de>
MFC after: 1 week
while holding a spin mutex. Instead, it now shoves the machine check
records onto a queue that is later drained to add sysctl nodes for each
record. While a routine to drain the queue is present, it is not currently
called.
Reviewed by: marcel
Right now we only have a very small amount of drivers that use clists,
but we still allocate 50 cblocks as slush space, which allows drivers to
temporarily overcommit their storage. Most of the drivers don't allow
this anyway.
I've performed the following changes:
- We don't allocate any cblocks on startup.
- I've removed the DDB command, because it has nothing useful to print
now. You can obtain the amount of allocated blocks by running `vmstat
-m | grep clist'.
- I've removed cfreecount, which is now unused.
- The old code first tries to allocate using M_NOWAIT, followed by
M_WAITOK. This doesn't make any sense, so just remove this logic. It
seems the drivers allow us to sleep anyway.
We can even remove ccmax from clist_alloc_cblocks and c_cbmax from
struct clist, but this breaks binary compatibility.
This reduces the amount of allocated cblocks on my system from 54 to 4.
defrouter_select(), NULL the cached llentry after unlocking
as we are no longer interested in it and with the second
iteration would try to unlock it again resulting in
panic: Lock (rw) lle not locked @ ...
Reported by: Mark Atkinson <m.atkinson@f5.com>
Tested by: Mark Atkinson <m.atkinson@f5.com>
PR: kern/128247 (in follow-up, unrelated to original report)
turbo option in addition to the mode bits; otherwise if the current
channel is a turbo mode channel we'll form an invalid media setting
and the ifmedia_set operation in vap_attach will panic.
While here use C99-style initialization for an array indexed by mode;
this makes it consistent w/ other usage and avoids breakage if we
should ever change the set of modes.
is messing with the UDP tunnel. This means
that if two users actually tried to change the
tunnel port at the same time interesting things COULD
result, but its probably very unlikely to happen :-)
The TTY buffers used the standard <sys/queue.h> lists. Unfortunately
they have a big shortcoming. If you want to have a double linked list,
but no tail pointer, it's still not possible to obtain the previous
element in the list. Inside the buffers we don't need them. This is why
I switched to custom linked list macros. The macros will also keep track
of the amount of items in the list. Because it doesn't use a sentinel,
we can just initialize the queues with zero.
In its simplest form (the output queue), we will only keep two
references to blocks in the queue, namely the head of the list and the
last block in use. All free blocks are stored behind the last block in
use.
I noticed there was a very subtle bug in the previous code: in a very
uncommon corner case, it would uma_zfree() a block in the queue before
calling memcpy() to extract the data from the block.
o add bus shim for cfi driver
o add static mapping for CS0 (we map all 16M as the cfi driver doesn't
support demand mapping)
Note this needs some tweaking to work for 2358 boards which is why the
CAMBRIA config is not touched.
for slave addressing by using left-adjusted slave addresses (i.e.
xxxxxxx0b).
- Require the low bit of the slave address to always be zero in smb(4) to
help catch broken applications.
- Adjust some code in the IPMI driver to not convert the slave address for
SSIF to a right-adjusted address. I (or possibly ambrisko@) added this in
the past to (unknowingly) work around the bug in ichsmb(4).
Submitted by: Andriy Gapon <avg of icyb.net.ua> (1,2)
MFC after: 1 month
- Prepare for CRC offloading, add MIB counters (RS/MT).
- Bugfix: Disable CRC computation for IPv6 addresses with local scope (MT).
- Bugfix: Handle close() with SO_LINGER correctly when notifications
are generated during the close() call(MT).
- Bugfix: Generate DRY event when sender is dry during subscription.
Only for 1-to-1 style sockets (RS/MT)
- Bugfix: Put vtags for the correct amount of time into time-wait (MT).
- Bugfix: Clear vtag entries correctly on expiration (MT).
- Bugfix: shutdown() indicates ENOTCONN when called for unconnected
1-to-1 style sockets (MT).
- Bugfix: In sctp Auth code (PL).
- Add support for devices that support SCTP csum offload (igb).
- Add missing sctp_associd to mib sysctl xsctp_tcb structure (RS)
Obtained from: With help from Peter Lei and Michael Tuexen
we, like TCP and UDP, move the checksum calculation
into the IP routines when there is no hardware support
we call into the normal SCTP checksum routine.
The next round of SCTP updates will use
this functionality. Of course the IGB driver needs
a few updates to support the new intel controller set
that actually does SCTP csum offload too.
Reviewed by: gnn, rwatson, kmacy
mode.
- Make the NMI handler run on its own stack (TSS_IST2).
- Store the GSBASE value for each CPU just before the start of
each NMI stack, permitting efficient retrieval using %rsp-relative
addressing.
- For NMIs taken from kernel mode, program MSR_GSBASE explicitly
since one or both of MSR_GSBASE and MSR_KGSBASE can be potentially
invalid. The current contents of MSR_GSBASE are saved and restored
at exit.
- For NMIs handled from user mode, continue to use 'swapgs' to
load the per-CPU GSBASE.
Reviewed by: jeff
Debugging help: jeff
Tested by: gnn, Artem Belevich <artemb at gmail dot com>
device. The details include the current value of the BAR (including all
the flag bits and the current base address), its length, and whether or not
it is enabled. Since this operation is not invasive, non-root users are
allowed to use it (unlike manual config register access which requires
root). The intention is that userland apps (such as Xorg) will use this
interface rather than dangerously frobbing the BARs from userland to
obtain this information.
- Add a new sub-mode to the 'list' mode of pciconf. The -b flag when used
with -l will now list all the active BARs for each device.
(Missed in previous commit)
MFC after: 1 week
device. The details include the current value of the BAR (including all
the flag bits and the current base address), its length, and whether or not
it is enabled. Since this operation is not invasive, non-root users are
allowed to use it (unlike manual config register access which requires
root). The intention is that userland apps (such as Xorg) will use this
interface rather than dangerously frobbing the BARs from userland to
obtain this information.
- Add a new sub-mode to the 'list' mode of pciconf. The -b flag when used
with -l will now list all the active BARs for each device.
MFC after: 1 month
src/usr.bin/usbhidctl/usbhid.c
src/sys/dev/usb2/include/usb2_hid.h
src/sys/dev/usb2/input/uhid2.c
src/lib/libusbhid/Makefile
src/lib/libusbhid/descr.c
src/lib/libusbhid/descr_compat.c
src/lib/libusbhid/usbhid.3
src/lib/libusbhid/usbhid.h
src/lib/libusbhid/usbvar.h
Patches to make libusbhid and HID userland utilities compatible with
the new USB stack. All HID ioctls should go through the libusbhid
library to ensure compatibility. I have found at least one piece of
software in /usr/ports which needs to get updated before USB HID
devices will work. This is the X joystick input driver.
Reported and tested by:
Daichi GOTO and Masanori OZAWA.
src/sys/dev/usb2/core/usb2_process.c
Correct USB process names.
Reported by:
Andre Guibert de Bruet
src/sys/dev/usb2/serial/uftdi2.c
Integrate changes from old USB stack.
Submitted by: hps
Move the interupt handler to a driver_intr_t type function as it was trying
to do way to much for a lightweight filter interrupt function.
Introduce much more locking around fc->mtx. Tested this for lock reversals
and other such lockups. Locking seems to be working better, but there
is much more to do with regard to locking. The most significant lock is
in the BUS RESET handler. It was possible, before this checkin, to set
a bus reset via "fwcontrol -r" and have the BUS RESET handler fire before
the code responsible for asserting BUS RESET was complete. This locking
fixes that issue.
Move some of the memory allocations in the fc struct to the attach function
in firewire.c
Rework the businfo.generation indicator to be merely a on/off bit now.
It's purpose according to spec is to notify the bus that the config ROM
has changed. That's it.
Catch and squash a possible panic in SBP where in the SBP_LOCK was held
during a possible error case. The error handling code would definitely
panic as it would try to acquire the SBP_LOCK on entrance.
Catch and squash a camcontrol/device lockup when firewire drives go away.
When a firewire device was powered off or disconnected from the firewire
bus, a "camcontrol rescan all" would hang trying to poll removed devices
as they were not properly detached. Don't do that.
Approved by: scottl
MFC after: 2 weeks
channel, only accept a real 11g channel; this fixes a problem where we were
wrongly promoting 11b to a Dynamic Turbo G channel which broke scanning on
channel 6
from the inet6 stack along with statistics and make sure we
properly free the rt in all cases.
While the current situation is not better performance wise it
prevents panics seen more often these days.
After more inet6 and ipsec cleanup we should be able to improve
the situation again passing the rt to ip6_forward directly.
Leave the ip6_forward_rt entry in struct vinet6 but mark it
for removal.
PR: kern/128247, kern/131038
MFC after: 25 days
Committed from: Bugathon #6
Tested by: Denis Ahrens <denis@h3q.com> (different initial version)
that the annotated function returns a pointer that doesn't alias any
extant pointer. This results in a 50%+ speedup in microbenchmarks such
as the following:
char *cp = malloc(1), *buf = malloc(BUF);
for (i = 0; i < BUF; i++) buf[i] = *cp;
In real programs, your mileage will vary. Note that gcc already
performs this optimization automatically for any function called
`malloc', `calloc', `strdup', or `strndup' unless -fno-builtins is
used.
So we are no longer interested in the error returned from
the *fs_doio() functions. With that we can remove the
error variable as its value is unused now.
Submitted by: Christoph Mallon christoph.mallon@gmx.de
a locked route. Thus we have to use RTFREE_LOCKED(9) to get it unlocked
and rtfree(9)d rather than just rtfree(9)d.
Since the PR was filed, new places with the same problem were added
with new code. Also check that the rt is valid before freeing it
either way there.
PR: kern/129793
Submitted by: Dheeraj Reddy <dheeraj@ece.gatech.edu>
MFC after: 2 weeks
Committed from: Bugathon #6
Leave then in struct vinet6 to not break the ABI with kernel modules
but mark them for removal so we can do it in one batch when the time
is right.
MFC after: 1 month
work ok in C++. Note that we enable this only for gcc 4.x for any value
of x. The __null was introduced in gcc 4.1 (in fact it was commited
12 days after release of gcc 4.0) and as we have never released any version
of FreeBSD with gcc 4.0 nor ports support gcc 4.0.x this is a safe check.
Using __GNUC_PREREQ__ would require us to include cdefs.h in params.h so
we just check __GNUC__.
Approved by: kib (mentor)
Tested by: exp build of ports (done by pav)
Tested by: make universe (done by me)
more irqs as we have more cpus. This is principally useful on systems
with msi devices which may want many irqs per-cpu.
Discussed with: jhb
Sponsored by: Nokia
32-bit processes. The value matches the initial setting used by
FreeBSD/i386. Otherwise, 32-bit binaries using floating point would use
a slightly different initial state when run on FreeBSD/amd64.
MFC after: 1 week
After running a `make buildkernel', I noticed most of the Giant locks in
sysctl are only caused by a very small amount of sysctl's:
- sysctl.name2oid. This one is locked by SYSCTL_LOCK, just like
sysctl.oidfmt.
- kern.ident, kern.osrelease, kern.version, etc. These are just constant
strings.
- kern.arandom, used by the stack protector. It is already protected by
arc4_mtx.
I also saw the following sysctl's show up. Not as often as the ones
above, but still quite often:
- security.jail.jailed. Also mark security.jail.list as MPSAFE. They
don't need locking or already use allprison_lock.
- kern.devname, used by devname(3), ttyname(3), etc.
This seems to reduce Giant locking inside sysctl by ~75% in my primitive
test setup.
mutex to a reader/writer lock. Lookup operations first grab a read lock and
perform the lookup. If the operation results in a need to modify the cache,
then it tries to do an upgrade. If that fails, it drops the read lock,
obtains a write lock, and redoes the lookup.
pathname lookups.
- Remove 'i_offset' and 'i_ino' from the ISO node structure and replace
them with local variables in the lookup routine instead.
- Cache a copy of 'i_diroff' for use during a lookup in a local variable.
- Save a copy of the found directory entry in a malloc'd buffer after a
successfull lookup before getting the vnode. This allows us to release
the buffer holding the directory block before calling vget() which
otherwise resulted in a LOR between "bufwait" and the vnode lock.
- Use an inlined version of vn_vget_ino() to handle races with ..
lookups. I had to inline the code here since cd9660 uses an internal
vget routine to save a disk I/O that would otherwise re-read the
directory block.
- Honor the requested locking flags during lookups to allow for shared
locking.
- Honor the requested locking flags passed to VFS_ROOT() and VFS_VGET()
similar to UFS.
- Don't make every ISO 9660 vnode hold a reference on the vnode of the
underlying device vnode of the mountpoint. The mountpoint already
holds a suitable reference.
o remove HAL_CHANNEL; convert the hal to use net80211 channels; this
mostly involves mechanical changes to variable names and channel
attribute macros
o gut HAL_CHANNEL_PRIVATE as most of the contents are now redundant
with the net80211 channel available
o change api for ath_hal_init_channels: no more reglass id's, no more outdoor
indication (was a noop), anM contents
o add ath_hal_getchannels to have the hal construct a channel list without
altering runtime state; this is used to retrieve the calibration list for
the device in ath_getradiocaps
o add ath_hal_set_channels to take a channel list and regulatory data from
above and construct internal state to match (maps frequencies for 900MHz
cards, setup for CTL lookups, etc)
o compact the private channel table: we keep one private channel
per frequency instead of one per HAL_CHANNEL; this gives a big
space savings and potentially improves ani and calibration by
sharing state (to be seen; didn't see anything in testing); a new config
option AH_MAXCHAN controls the table size (default to 96 which
was chosen to be ~3x the largest expected size)
o shrink ani state and change to mirror private channel table (one entry per
frequency indexed by ic_devdata)
o move ani state flags to private channel state
o remove country codes; use net80211 definitions instead
o remove GSM regulatory support; it's no longer needed now that we
pass in channel lists from above
o consolidate ADHOC_NO_11A attribute with DISALLOW_ADHOC_11A
o simplify initial channel list construction based on the EEPROM contents;
we preserve country code support for now but may want to just fallback
to a WWR sku and dispatch the discovered country code up to user space
so the channel list can be constructed using the master regdomain tables
o defer to net80211 for max antenna gain
o eliminate sorting of internal channel table; now that we use ic_devdata
as an index, table lookups are O(1)
o remove internal copy of the country code; the public one is sufficient
o remove AH_SUPPORT_11D conditional compilation; we always support 11d
o remove ath_hal_ispublicsafetysku; not needed any more
o remove ath_hal_isgsmsku; no more GSM stuff
o move Conformance Test Limit (CTL) state from private channel to a lookup
using per-band pointers cached in the private state block
o remove regulatory class id support; was unused and belongs in net80211
o fix channel list construction to set IEEE80211_CHAN_NOADHOC,
IEEE80211_CHAN_NOHOSTAP, and IEEE80211_CHAN_4MSXMIT
o remove private channel flags CHANNEL_DFS and CHANNEL_4MS_LIMIT; these are
now set in the constructed net80211 channel
o store CHANNEL_NFCREQUIRED (Noise Floor Required) channel attribute in one
of the driver-private flag bits of the net80211 channel
o move 900MHz frequency mapping into the hal; the mapped frequency is stored
in the private channel and used throughout the hal (no more mapping in the
driver and/or net80211)
o remove ath_hal_mhz2ieee; it's no longer needed as net80211 does the
calculation and available in the net80211 channel
o change noise floor calibration logic to work with compacted private channel
table setup; this may require revisiting as we no longer can distinguish
channel attributes (e.g. 11b vs 11g vs turbo) but since the data is used
only to calculate status data we can live with it for now
o change ah_getChipPowerLimits internal method to operate on a single channel
instead of all channels in the private channel table
o add ath_hal_gethwchannel to map a net80211 channel to a h/w frequency
(always the same except for 900MHz channels)
o add HAL_EEBADREG and HAL_EEBADCC status codes to better identify regulatory
problems
o remove CTRY_DEBUG and CTRY_DEFAULT enum's; these come from net80211 now
o change ath_hal_getwirelessmodes to really return wireless modes supported
by the hardware (was previously applying regulatory constraints)
o return channel interference status with IEEE80211_CHANSTATE_CWINT (should
change to a callback so hal api's can take const pointers)
o remove some #define's no longer needed with the inclusion of
<net80211/_ieee80211.h>
Sponsored by: Carlson Wireless
Inside the kernel, the minor() function was responsible for obtaining
the device minor number of a character device. Because we made device
numbers dynamically allocated and independent of the unit number passed
to make_dev() a long time ago, it was actually a misnomer. If you really
want to obtain the device number, you should use dev2udev().
We already converted all the drivers to use dev2unit() to obtain the
device unit number, which is still used by a lot of drivers. I've
noticed not a single driver passes NULL to dev2unit(). Even if they
would, its behaviour would make little sense. This is why I've removed
the NULL check.
Ths commit removes minor(), minor2unit() and unit2minor() from the
kernel. Because there was a naming collision with uminor(), we can
rename umajor() and uminor() back to major() and minor(). This means
that the makedev(3) manual page also applies to kernel space code now.
I suspect umajor() and uminor() isn't used that often in external code,
but to make it easier for other parties to port their code, I've
increased __FreeBSD_version to 800062.
can cope with a result buffer of NULL in the "Final" function, we cannot.
Thus pass in a temporary buffer long enough for either md5 or sha1 results
so that we do not panic.
PR: bin/126468
MFC after: 1 week
section included a lot of stuff that did not belong there.
So split the block in multiple components each around the relevant stuff.
This said, I wonder if building a kernel where SYSCTL_NODE is not
defined is supported at all.
Submitted by: Marta Carbone
o max antenna gain
o driver private opaque data
Note this grows the size of a channel to 16 bytes; which makes the
default channel table 4Kbytes (up from 3Kbytes).
o change ioctl's that pass channel lists in/out to handle variable-size
arrays instead of a fixed (compile-time) value; we do this in a way
that maintains binary compatibility
o change ifconfig so all channel list data structures are now allocated
to hold MAXCHAN entries (1536); this, for example, allows the kernel
to return > IEEE80211_CHAN_MAX entries for calls like IEEE80211_IOC_DEVCAPS
the ISO tables, mark them accordingly
o add sku's for handling 900MHz cards
o add opaque struct defs and change []'s to *'s so this file can be
included w/o requiring all of net80211 to be pulled in
o make CTRY_DEBUG and CTRY_DEFAULT public
extended attributes since FreeBSD 5, make the following semantic
changes:
- Don't update the inode modification time (mtime) when extended
attributes (and hence also ACLs) are added, modified, or removed.
- Don't update the inode access tie (atime) when extended attributes
(and hence also ACLs) are queried.
This means that rsync (and related tools) won't improperly think
that the data in the file has changed when only the ACL has changed.
Note that ffs_reallocblks() has not been changed to not update on an
IO_EXT transaction, but currently EAs don't use the cluster write
routines so this shouldn't be a problem. If EAs grow support for
clustering, then VOP_REALLOCBLKS() will need to grow a flag argument
to carry down IO_EXT to UFS.
MFC after: 1 week
PR: ports/125739
Reported by: Alexander Zagrebin <alexz@visp.ru>
Tested by: pluknet <pluknet@gmail.com>,
Greg Byshenk <freebsd@byshenk.net>
Discussed with: kib, kientzle, timur, Alexander Bokovoy <ab@samba.org>
exceeded the maximum size of 1 page for OHCI controllers. Other serial
drivers use the same size, so I assume this should be enough (1MB/s
throughput?).
Make detach() completely synchronous. Properly handle stalled USB
transfers (use internal mechanism instead of submitting own control
transfers). Rename/remove a couple of variables and update comments.
This work was done in close collaboration with HPS.
Reviewed by: HPS
reimplementation of the same. Note that this changes -std=c99
to -std=iso9899:1999 but those two are synonyms.
Approved by: kib (mentor)
Reviewed by: ru
without corresponding number of fifo_open(). This causes assertion
failure in fifo_close() due to vp->v_fifoinfo being NULL for kernel
with INVARIANTS, or NULL pointer dereference otherwise. In fact, we may
ignore excess calls to fifo_close() without bad consequences.
Turn KASSERT() into the return, and print warning for now.
Tested by: pho
Reviewed by: rwatson
MFC after: 2 weeks
the PIC before the interrupt handler was set. If the interrupt triggered in
that window, then the interrupt vector would be disabled.
Reported by: Marco Trillo
Even though the code seems to be FreeBSD kernel code, it isn't compiled
on FreeBSD. I could have known this, because I was a little amazed that
I couldn't find a prototype of pfopen()/pfclose() somewhere else,
because it isn't marked as static.
Apart from that, removing these functions wouldn't have been harmful
anyway, because there are some other strange things about them (the
implementation isn't consistent with the prototype at the top). Still,
it's better to leave it, because it makes merging code back to older
branches a little harder.
Requested by: mlaier
It turns out I was patching functions that weren't used by pf(4) anyway.
They still seem to use `struct proc *' instead of `struct thread *'.
They weren't listed in pf_cdevsw.
Because it is not possible to access the pf(4) character device through
any other device node as the one in devfs, there is no need to check for
unknown device minor numbers.
Approved by: mlaier
an interpreter definition in its program header), set the auxiliary
ELF argument AT_BASE to 0 rather than to the address that we would
have mapped the interpreter at if there had been one.
The ELF ABI specifications appear to be ambiguous as to the desired
behavior in this situation, as they define AT_BASE as the base address
of the interpreter, but do not mention what to do if there is none.
On Solaris, AT_BASE will be set to the base address of the static
binary if there is no interpreter, and on Linux, AT_BASE is set to 0.
We go with the Linux semantics as they are of more immediate utility
and allow the early runtime environment to know that the kernel has
not mapped an interpreter, but because AT_PHDR points at the ELF
header for the running binary, it is still possible to retrieve all
required mapping information when the process starts should it be
required. Either approach would be preferable to our current behavior
of passing a pointer to an unmapped region of user memory as AT_BASE.
MFC after: 3 weeks
backend kegs so it may source compatible memory from multiple backends.
This is useful for cases such as NUMA or different layouts for the same
memory type.
- Provide a new api for adding new backend kegs to secondary zones.
- Provide a new flag for adjusting the layout of zones to stagger
allocations better across cache lines.
Sponsored by: Nokia
sizeof("MAXCPU") being used to calculate a string length rather than
something more reasonable such as sizeof("32"). This shouldn't have
caused any ill effect until we run on machines with 1000000 or more
cpus.
the vap ioctl. This means that the parent interface should hopefully be up
before we return to userland, it does not depend on the parent init succeeding,
just that it was run.
This fixes wpa_supplicant with ndis and USB where the parent interfaces can be
slow to init.
- Restructure selscan() and selrescan() to avoid producing extra selfps
when we have a fd in multiple sets. As described below multiple selfps
may still exist for other reasons.
- Make selrescan() tolerate multiple selfds for a given descriptor
set since sockets use two selinfos per fd. If an event on each selinfo
fires selrescan() will see the descriptor twice. This could result in
select() returning 2x the number of fds actually existing in fd sets.
Reported by: mgleason@ncftp.com
Now that make_dev() doesn't require unit numbers to be unique, there is
no need to use an unrhdr here to generate the numbers. Remove the entire
init-routine, because it is optional.
pointers to the callout handler just before and just after the callout
it invoked. I attempted to do this in a manner congruent to tracing in
Solaris's callout mechanism, but couldn't quite use the same names due
to convention and syntax differences.
Example DTrace script to generate a distribution graph of callout
execution times:
callout_execute:::callout_start
{
self->cstart = timestamp;
}
callout_execute:::callout_end
{
@length = quantize(timestamp - self->cstart);
}
Reviewed by: jb
MFC after: 3 days
inside the SYSCTL() macros and thus does not need to be done for
all of the nodes scattered across the source tree.
- Mark the name-cache related sysctl's (including debug.hashstat.*) MPSAFE.
- Mark vm.loadavg MPSAFE.
- Remove GIANT_REQUIRED from vmtotal() (everything in this routine already
has sufficient locking) and mark vm.vmtotal MPSAFE.
- Mark the vm.stats.(sys|vm).* sysctls MPSAFE.
around calls to vlrureclaim() on non-MPSAFE filesystems. Specifically,
vnlru no longer needs Giant for the common case of waking up and deciding
there is nothing for it to do.
MFC after: 2 weeks
rather than a fixed 512... This fixes the mount root problem on at91.
Prior to the SD card reorg, all data transfers were 512 bytes, so we
didn't notice.
the serial port class when we set the devclass since it is now
no-longer a compile time constant. Eliminate the pci include, as it
isn't relevant or necessary.
time constant. This allows us to potentially change it at runtime or
autodetect it early in the boot (the latter being much more likely to
have a good outcome).
- To avoid having a bunch of locks that end up always getting acquired as
a group, give each ppc(4) device a mutex which it shares with all the
child devices including ppbus(4), lpt(4), plip(4), etc. This mutex
is then used for all the locking.
- Rework the interrupt handling stuff yet again. Now ppbus drivers setup
their interrupt handler during attach and tear it down during detach
like most other drivers. ppbus(4) only invokes the interrupt handler
of the device that currently owns the bus (if any) when an interrupt
occurs, however. Also, interrupt handlers in general now accept their
softc pointers as their argument rather than the device_t. Another
feature of the ppbus interrupt handlers is that they are called with
the parent ppc device's lock already held. This minimizes the number
of lock operations during an interrupt.
- Mark plip(4), lpt(4), pcfclock(4), ppi(4), vpo(4) MPSAFE.
- lpbb(4) uses the ppc lock instead of Giant.
- Other plip(4) changes:
- Add a mutex to protect the global tables in plip(4) and free them on
module unload.
- Add a detach routine.
- Split out the init/stop code from the ioctl routine into separate
functions.
- Other lpt(4) changes:
- Use device_printf().
- Use a dedicated callout for the lptout timer.
- Allocate the I/O buffers at attach and detach rather than during
open and close as this simplifies the locking at the cost of
1024+32 bytes when the driver is attached.
- Other ppi(4) changes:
- Use an sx lock to serialize open and close.
- Remove unused HADBUS flag.
- Add a detach routine.
- Use a malloc'd buffer for each read and write to avoid races with
concurrent read/write.
- Other pps(4) changes:
- Use a callout rather than a callout handle with timeout().
- Conform to the new ppbus requirements (regular mutex, non-filter
interrupt handler). pps(4) is probably going to have to become a
standalone driver that doesn't use ppbus(4) to satisfy it's
requirements for low latency as a result.
- Use an sx lock to serialize open and close.
- Other vpo(4) changes:
- Use the parent ppc device's lock to create the CAM sim instead of
Giant.
- Other ppc(4) changes:
- Fix ppc_isa's detach method to detach instead of calling attach.
Tested by: no one :-(
Some time ago I tried adding Unicode rendering to the teken demo
application, but I didn't get it working. It seems I forgot to call
setlocale(). Polish this code and make sure it doesn't get lost.
Also a small fix for my previous commit: all Unicode characters in
teken_boxdrawing are below 0x10000, so store them as 16-bit values.
o Only set 4-bit caps on those boards that have 4-bit caps (this means that
because we don't set wire4 yet, this forces us to always use 1-bit bus).
o Don't test wire4 when setting up the bus width, since bad things will
happen if we do.
# This likely won't fix the busted at91 sd card support, but these are
# needful changes for correctness.
the helper function. It is supposed to be useful for any filesystem
that has to unlock dvp to walk to the ".." entry in lookup routine.
Requested by: jhb
Tested by: pho
MFC after: 1 month
VOP_MARKATIME() since unlike the rest of VOP_SETATTR(), VA_MARKATIME
can be performed while holding a shared vnode lock (the same functionality
is done internally by VOP_READ which can run with a shared vnode lock).
Add missing locking of the vnode interlock to the ufs implementation and
remove a special note and test from the NFS client about not supporting the
feature.
Inspired by: ups
Tested by: pho
section of code, this uses WITNESS_NORELEASE() and WITNESS_RELEASEOK() to mark
the boundaries. Both functions require the lock to be held when calling.
This is intended for scenarios like a bus asserting that the bus lock is not
dropped during a driver call. There doesn't appear to be a man page to
document this in.
Reviewed by: jhb
indirect block pages are not removed by the mentioned invocation of
the vnode_pager_setsize().
Put a common code into the helper function ffs_pages_remove().
Reported and tested by: dchagin
Reviewed by: ups
MFC after: 3 weeks
- Always program RX configuration register from scratch instead of
doing read/modify/write.
- Rename re_setmulti() to re_set_rxmode() to be reflect reality.
- Simplify hash filter logic a little while I am here.
Reviewed by: yongari (early version)
the fsbase value. The switch loads the fs segment register, that
invalidates the value in fsbase msr, thus value in %r9 can not be
considered the current value for fsbase anymore.
Unconditionally reload fsbase when switching to 32bit binary.
PR: 130526
MFC after: 3 weeks
Even though VT100-like devices can display non-ASCII characters, they do
not use an 8-bit character set. Special escape sequences allow the VT100
to switch character maps. The special graphics character set stores the
box drawing characters, starting at 0x60, ending at 0x7e. This means
we now pass the character map tests in vttest, even the save/restore
cursor test, combined with character maps. dialog(1) also works a lot
better now.
This commit also includes some other minor fixes:
- Default to 24 lines in teken_demo when using xterm emulation.
- Make white foreground and background work in teken_demo.
address space where to put vnode pages, and then call UFS_BALLOC(),
to actually allocate new block and map it. When UFS_BALLOC() returns
error, sometimes we forget to revert the vm object size increase,
allowing for the pages that are not backed by the logical disk blocks.
Revert vnode_pager_setsize() back when UFS_BALLOC() failed, for
ffs_truncate() and ffs_write().
PR: 129956
Reviewed by: ups
MFC after: 3 weeks
vnode, from -1 down. When vinvalbuf(vp, V_ALT) is done for the vnode, it
incorrectly does vm_object_page_remove(0, 0), removing all pages from
the underlying vm object, not only the pages that back the extended
attributes data.
Change vinvalbuf() to not remove any pages from the object when
V_NORMAL or V_ALT are specified. Instead, the only in-tree caller
in ffs_inode.c:ffs_truncate() that specifies V_ALT explicitely
removes the corresponding page range. The V_NORMAL caller
does vnode_pager_setsize(vp, 0) immediately after the call to
vinvalbuf(V_NORMAL) already.
Reported by: csjp
Reviewed by: ups
MFC after: 3 weeks
In normal operation, the number of cache entries is roughly equal to the
number of active vnodes. However, when most of the recently accessed
vnodes have many hard links, the number of cache entries can be 32000
times as large, exhausting kernel memory and provoking a panic in
kmem_malloc().
MFC after: 2 weeks
very well maintained and point user to sysutils/fusefs-ntfs, which
at the time of this writing seems to be a better alternative.
Suggested by: luigi
MFC after: 2 weeks
of superblock rather than using hardcoded values. This fixes ext2fs on
filesystems with inode sized other than 128.
Submitted by: Alex Lyashkov <Alexey.Lyashkov@Sun.COM> (based on)
MFC after: 2 weeks
Cons25 doesn't seem to use a straight 1:1 mapping to the ANSI colors,
but uses the same color numbers as at least used by syscons on i386. I
suspect if you change the definitions on a different architecture,
things may break? Not sure.
Add a small array to convert syscons-style color codes to ANSI
equivalents, which are used by libteken internally. I didn't notice this
bug, because I only tested my code with black, white and green, all of
them shared the same numbers.
It turns out I forgot to implement two escape sequences that allows the
user to change the default foreground and background colors. I thought
they were implemented by syscons itself, but vidcontrol just generates
some escape sequences, which get interpreted by the terminal emulator.
Reported by: mgp (forums)
The teken library already supports UTF-8 handling and xterm emulation,
but we have reasons to disable this right now. Because we should make it
easy and interesting for people to experiment with these features, allow
them to be set in kernel configuration files.
Before this commit we had a flag called `TEKEN_CONS25' to enable
cons25-style emulation. I'm calling it the opposite now, `TEKEN_XTERM',
because we want to enable it in kernel configuration files explicitly.
Requested by: kib
with src/tools/sched/schedgraph.py. This allows developers to quickly
create a graphical view of ktr data for any resource in the system.
- Add sched_tdname() and the pcpu field 'name' for quickly and uniformly
identifying records associated with a thread or cpu.
- Reimplement the KTR_SCHED traces using the new generic facility.
Obtained from: attilio
Discussed with: jhb
Sponsored by: Nokia
by writing all 1's to it to determine its length. This fixes issues with
MCFG on at least some machines where a trashed BAR claimed subsequent
attempts at PCI config transactions because the addresses in the MCFG
window fell in the decoding range of the BAR.
In general it is a bad idea to leave the BARs enabled while we are
frobbing with them in this manner.
Sleuthing by: tegge
MFC after: 1 week
The digi(4) driver directory contains some files that cannot be checked
out on Windows filesystems. This isn't a big deal, but the files aren't
used anyway.
There are still some other places where checkouts on Windows don't work,
such as VFS_MOUNT.9/vfs_mount.9. This should already be a small
improvement.
MFC after: 1 month
check on the sysctl argument value being RTF_LLINFO is conditioned on
the COMPAT_ROUTE_FLAGS kernel option. This mismatch caused the L2
table retrieval failure, and the arp/ndp -an command displays empty L2
tables.
Reviewed by: pjd
work when the bus attaches its own children. Instead of hardcoding a unit
number and returning BUS_PROBE_NOWILDCARD, which will break multiple iicbus
systems, check in the probe routine whether the device address is 0. Real
I2C devices will never have this address, but devices added with
BUS_ADD_CHILD() will.
Requested by: jhb
Reviewed by: jhb
- Add debug output
- Fix pmap_zero_page and related places: use uncached segments and invalidate
cache after zeroing memory.
- Do not test for modified bit if it's not neccessary
(merged from mips-juniper p4 branch)
- Some #includes reorganization
guarantee atomicity of the operation for other semaphore consumers.
In particular, this should guard against access to the semaphore with
not done or partially done MAC label assignment.
Reviewed by: rwatson
MFC after: 1 month
heavy loads or working. It looks this bug exists since r158869
so needs to revert a part of the previous.
Reviewed by: imp
Tested by: sam
MFC after: 3 weeks
So now, if you:
- specify "makeoptions DEBUG=-g" in your kernel config
- make buildkernel WITH_CTF=1,
then "-g" will be added to CTFFLAGS.
However, "-g" will still not be added to CTFFLAGS when building
kernel modules, if the above steps are performed. This needs to be fixed.
Noticed by: thompsa
The new behaviour is on by default, and can be disabled by setting the
net.inet.tcp.rfc3465 sysctl to 0 to obtain previous behaviour.
The patch changes struct tcpcb in sys/netinet/tcp_var.h which breaks
the ABI. Bump __FreeBSD_version to 800061 accordingly. User space tools
that rely on the size of struct tcpcb (e.g. sockstat) need to be recompiled.
Reviewed by: rpaulo, gnn
Approved by: gnn, kmacy (mentors)
Sponsored by: FreeBSD Foundation
power and thermal control, as well as GPIOs on Xserves and controlling
sound codecs for Apple built-in audio.
Submitted by: Marco Trillo
Obtained from: NetBSD
indicated I2C devices, and provides an ofw_bus interface for driver probing.
This should be MI, but is currently provided only on PowerPC due to lack of
sparc64 hardware with an I2C controller.
Discussed on: freebsd-arch
utilities, add the ${DEBUG} variable from the kernel config. Otherwise,
if we build a kernel with WITH_CTF=1 set, ctfmerge will not have
the -g flag set. In this case, the cc has -g specified, so the
.o files will have debug information generated, but since ctfmerge
does not have -g set, it will strip out the ELF sections containing
the DWARF debugging info, leading to a kernel without debugging symbols.
Reviewed by: jb
on SysV semaphores.
The squeeze of the semaphore array in the kern_semctl() modifies
sem_base for the semaphores with sem_base greater then sem_base of
the removed semaphore, as well as the values of the semaphores,
without locking their mutex. This can lead to (killable) hangs or
unexpected behaviour of the processes performing any sem operations
while other process does IPC_RMID.
The semexit_myhook() eventhandler unlocks SEMUNDO_LOCK() while
accessing *suptr. This allows for IPC_RMID for the sem id to be
performed in parallel with undo hook referenced by the current undo
structure. This leads to the panic("semexit - semid not allocated") [1].
The semaphore creation is protected by Giant, while IPC_RMID is done
while only semaphore mutex is held. This seems to result in invalid
values for semtot, causing random ENOSPC error returns [2].
Redo the locking of the semaphores lifetime cycle. Delegate the
sem_mtx to the sole purpose of protecting semget() and
semctl(IPC_RMID). Introduce new sem_undo_mtx to protect SEM_UNDO
handling. Remove the Giant remnants from the code.
Note that mac_sysvsem_check_semget() and mac_sysvsem_create() are
now called while sem_mtx is held, as well as mac_sysvsem_cleanup() [3].
When semaphore is removed, acquire semaphore locks for all semaphores
with sem_base that is going to be changed by squeeze of the sema
array. The lock order is not important there, because the region is
protected by sem_mtx.
Organize both used and free sem_undo structures into the lists,
protected by sem_undo_mtx. In semexit_myhook(), remove sem_undo
structure that is being processed, from used list, without putting it
onto the free to prevent modifications by other threads. This allows
for sem_undo_lock to be dropped to acquire individial semaphore locks
without violating lock order. Since IPC_RMID may no longer find this
sem_undo, do tolerate references to unallocated semaphores in undo
structure, and check sequential number to not undo unrelated semaphore
with the same id.
While there, convert functions definitions to ANSI C and fix small
style(9) glitches.
Reported by: Omer Faruk Sen <omerfsen gmail com> [1], pho [2]
Reviewed by: rwatson [3]
Tested by: pho
MFC after: 1 month
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge). Hook up bsm_domain.c and bsm_socket_type.c to the libbsm
build along with man pages, add audit_bsm_domain.c and
audit_bsm_socket_type.c to the kernel environment.
OpenBSM history for imported revisions below for reference.
MFC after: 1 month
Sponsored by: Apple Inc.
Obtained from: TrustedBSD Project
OpenBSM 1.1 alpha 5
- Stub libauditd(3) man page added.
- All BSM error number constants with BSM_ERRNO_.
- Interfaces to convert between local and BSM socket types and protocol
families have been added: au_bsm_to_domain(3), au_bsm_to_socket_type(3),
au_domain_to_bsm(3), and au_socket_type_to_bsm(3), along with definitions
of constants in audit_domain.h and audit_socket_type.h. This improves
interoperability by converting local constant spaces, which vary by OS, to
and from Solaris constants (where available) or OpenBSM constants for
protocol domains not present in Solaris (a fair number). These routines
should be used when generating and interpreting extended socket tokens.
- Fix build warnings with full gcc warnings enabled on most supported
platforms.
- Don't compile error strings into bsm_errno.c when building it in the kernel
environment.
- When started by launchd, use the label com.apple.auditd rather than
org.trustedbsd.auditd.
for jumbo frame.
o Nuke unneeded jlist lock which was used to protect jumbo buffer
management in local allocator.
o Added a new tunable hw.mskc.jumbo_disable to disable jumbo
frame support for the driver. The tunable could be set for
systems that do not need to use jumbo frames and it would
save (9K * number of Rx descriptors) bytes kernel memory.
o Jumbo buffer allocation failure is no longer critical error
for the operation of msk(4). If msk(4) encounter the allocation
failure it just disables jumbo frame support and continues to
work without your intervention.
Using local allocator had several drawbacks such as requirement of
large amount of continuous kernel memory and fixed (small) number
of available buffers. The need for large continuous memory resulted
in failure of loading driver with kldload on running systems.
Also small number of buffer used in local allocator showed poor
performance for some applications.
to be caused by a metadata corruption that occurs quite often after
unplugging a pendrive during write activity.
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
Add missing set frame data pointer call. The
function call was missed when zero copy was
introduced in UMASS.
Reported by: WATANABE Kazuhiro.
Submitted by: Hans Petter Selasky
Remove dependancy towards the USB config thread in
the USB serial core. Use USB process msignalling
instead. Saves a little memory and hopefully makes
the code more understandable.
Submitted by: Hans Petter Selasky
Remove "vbus_interrupt" method from bus methods and use
a global function instead for the various drivers using it.
The reason for the removal is to simplify the code.
Submitted by: Hans Petter Selasky
Reduce the number of callback processes to 4 per
USB controller. There are two rough categories:
1) Giant locked USB transfers.
2) Non-Giant locked USB transfers.
On a real system with many USB devices plugged in the
number of processes reported by "ps auxw | grep USBPROC"
was reduced from 40 to 18.
Submitted by: Hans Petter Selasky
This change is about removing three fields from "struct usb2_xfer"
which can be reached from "struct usb2_xfer_root" instead and cleaning
up the code after this change. The fields are "xfer->udev",
"xfer->xfer_mtx" and "xfer->usb2_sc". In this process the following
changes were also made:
Rename "usb2_root" to "xroot" which is short for "xfer root".
Rename "priv_mtx" to "xfer_mtx" in USB core.
The USB_XFER_LOCK and USB_XFER_UNLOCK macros should only be used in
the USB core due to dependency towards "xroot". Substitute macros
for the real lock in two USB device drivers.
Submitted by: Hans Petter Selasky
Factor out roothub process into the USB bus structure for
all USB controller drivers. Essentially I am trying to
save some processes on the root HUB and get away
from the config thread pradigm. There will be a follow up
commit where the root HUB control and interrupt callback
will be moved over to run from the roothub process.
Total win: 3 processes become 1 for every USB controller.
Submitted by: Hans Petter Selasky
Usability improvement. Make sure that setting
power mode ON resurrects the device if powered OFF.
Reported by: Alexander Best.
Submitted by: Hans Petter Selasky
Initial version of ATMEGA USB device controller
driver. Has not been tested on real hardware yet.
The driver is based upon the AT91DCI driver.
Submitted by: Hans Petter Selasky
o Eliminate tlb0[] (a s/w copy of TLB0)
- The table contents cannot be maintained reliably in multiple MMU
environments, where asynchronous events (invalidations from other cores)
can change our local TLB0 contents underneath.
- Simplify and optimize TLB flushing: system wide invalidations are
performed using tlbivax instruction (propagates to other cores), for
local MMU invalidations a new optimized routine (assembly) is introduced.
o Improve and simplify TID allocation and management.
- Let each core keep track of its TID allocations.
- Simplify TID recycling, eliminate dead code.
- Drop the now unused powerpc/booke/support.S file.
o Improve page tables management logic.
o Simplify TLB1 manipulation routines.
o Other improvements and polishing.
Obtained from: Freescale, Semihalf
- add a reference to the config(5) manpage;
- hopefully clarify the format of the 'env FILENAME' directive.
I am putting these notes in sys/${arch}/conf/GENERIC and not
in sys/conf/NOTES because:
1. i386/GENERIC already had reference to a similar option (hints..)
and to documentation (handbook)
2. GENERIC is what most users look at when they have to modify or
create a new kernel config, so having the suggestion there is
more effective.
I am only touching i386 and amd64 because the other GENERIC files
are already out of sync, and I am not sure what is the overall plan.
MFC after: 3 days
down will cause a fault. Check the phy power state before possibly
reading from the bb, this can happen as ar5212Reset intentionally
calls ar5212GetRfgain before bringing the bb out of reset (but we
do it here and not in the caller to guard against other possible uses).
expected in acd_fixate().
This should fix various problems folks are having with 'burncd' reporting
"burncd: ioctl(CDRIOCFIXATE): Input/output error" during the fixate phase
when "fixate" is issued together with the "data" command.
PR: 95979
Submitted by: Jaakko Heinonen <jh@saunalahti.fi>
by the new kernel option COMPAT_ROUTE_FLAGS for binary backward
compatibility. The RTF_LLDATA flag maps to the same value as RTF_LLINFO.
RTF_LLDATA is used by the arp and ndp utilities. The RTF_LLDATA flag is
always returned to the userland regardless whether the COMPAT_ROUTE_FLAGS
is defined.
changes since the last imported OpenBSM release:
OpenBSM 1.1 alpha 5
- Stub libauditd(3) man page added.
- All BSM error number constants with BSM_ERRNO_.
- Interfaces to convert between local and BSM socket types and protocol
families have been added: au_bsm_to_domain(3), au_bsm_to_socket_type(3),
au_domain_to_bsm(3), and au_socket_type_to_bsm(3), along with definitions
of constants in audit_domain.h and audit_socket_type.h. This improves
interoperability by converting local constant spaces, which vary by OS, to
and from Solaris constants (where available) or OpenBSM constants for
protocol domains not present in Solaris (a fair number). These routines
should be used when generating and interpreting extended socket tokens.
- Fix build warnings with full gcc warnings enabled on most supported
platforms.
- Don't compile error strings into bsm_errno.c when building it in the kernel
environment.
- When started by launchd, use the label com.apple.auditd rather than
org.trustedbsd.auditd.
Obtained from: TrustedBSD Project
Sponsored by: Apple Inc.
in the loopback and synthetic loopback code so that packets are
access control checked and relabeled. Previously, the MAC
Framework enforced that packets sent over the loopback weren't
relabeled, but this will allow policies to make explicit choices
about how and whether to relabel packets on the loopback. Also,
for SIMPLEX devices, this produces more consistent behavior for
looped back packets to the local MAC address by labeling those
packets as coming from the interface.
Discussed with: csjp
Obtained from: TrustedBSD Project
things around so the periph destructors look alike. Based on a patch
by Jaakko Heinonen.
Submitted by: Jaakko Heinonen
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
they label, derive that information implicitly from the set of label
initializers in their policy operations set. This avoids a possible
class of programmer errors, while retaining the structure that
allows us to avoid allocating labels for objects that don't need
them. As before, we regenerate a global mask of labeled objects
each time a policy is loaded or unloaded, stored in mac_labeled.
Discussed with: csjp
Suggested by: Jacques Vidrine <nectar at apple.com>
Obtained from: TrustedBSD Project
Sponsored by: Apple, Inc.
read with libkvm) to the addresses of a prison, when inside a
jail. [1]
As the patch from the PR was pre-'new-arp', add checks to the
llt_dump handlers as well.
While touching RTM_GET in route_output(), consistently use
curthread credentials rather than the creds from the socket
there. [2]
PR: kern/68189
Submitted by: Mark Delany <sxcg2-fuwxj@qmda.emu.st> [1]
Discussed with: rwatson [2]
Reviewed by: rwatson
MFC after: 4 weeks
applications to specify a non-local IP address when bind()'ing a socket
to a local endpoint.
This allows applications to spoof the client IP address of connections
if (obviously!) they somehow are able to receive the traffic normally
destined to said clients.
This patch doesn't include any changes to ipfw or the bridging code to
redirect the client traffic through the PCB checks so TCP gets a shot
at it. The normal behaviour is that packets with a non-local destination
IP address are not handled locally. This can be dealth with some IPFW hackery;
modifications to IPFW to make this less hacky will occur in subsequent
commmits.
Thanks to Julian Elischer and others at Ironport. This work was approved
and donated before Cisco acquired them.
Obtained from: Julian Elischer and others
MFC after: 2 weeks
jail-aware. Up to now we returned the first address of the interface
for SIOCGIFADDR w/o an ifr_addr in the query. This caused problems for
programs querying for an address but running inside a jail, as the
address returned usually did not belong to the jail.
Like for v6, if there was an ifr_addr given on v4, you could probe
for more addresses on the interfaces that you were not allowed to see
from inside a jail. Return an error (EADDRNOTAVAIL) in that case
now unless the address is on the given interface and valid for the
jail.
PR: kern/114325
Reviewed by: rwatson
MFC after: 4 weeks
- The contents of 'feroceon_cpufuncs' dispatch table was really dedicated for the
new Sheeva CPU (in 88F6xxx and MV-78xxx SOCs), and NOT Feroceon.
- Feroceon CPU (in 88F5xxx SOCs) appears as a regular ARM926EJ-S core and does
not require dedicated routines.
This will be accompanied by a file rename commit.
- Provide dedicated rmans for MEM and IO resources.
- Convert PCI IRQ routing info into a table (from callback approach), provide
config data for alternative DB- boards.
- Fix a wrong boundary check error in pcib_mbus_init_bar()
Obtained from: Semihalf
- Allow for setting per platform MPP/GPIO configuration in the kernel, so
that we can override all settings firmware might set.
- Set decode windows for the remaining on-chip peripherals: CESA, SATA and XOR.
- Improve handling of USB controllers so that all port are available on the
given SOC/platform (e.g. up to three on DB-78xxx), this includes rework of
USB decode windows set-up.
- Other minor fixes and cosmetics.
Obtained from: Semihalf
created by atapicam is being kept opened or mounted. This is probably just
a temporary solution until we invent something better.
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
Reported by: Jaakko Heinonen
o add net80211 support for a tdma vap that is built on top of the
existing adhoc-demo support
o add tdma scheduling of frame transmission to the ath driver; it's
conceivable other devices might be capable of this too in which case
they can make use of the 802.11 protocol additions etc.
o add minor bits to user tools that need to know: ifconfig to setup and
configure, new statistics in athstats, and new debug mask bits
While the architecture can support >2 slots in a TDMA BSS the current
design is intended (and tested) for only 2 slots.
Sponsored by: Intel
- Clean up TCLK handling so that it's dynamically recognized depending on
registers settings or chip version/revision. Update registers definitions.
- Teach SOC ident routine about A0 (initial silicon version for general
audience)
Obtained from: Marvell, Semihalf
locked. Lookup could attempt to recursively lock that vnode.
Do not call vn_start_write(V_WAIT) while vnode is locked, this may
result in a deadlock with suspension.
vfs_busy() the mountpoint before dropping vnode lock for vnode
that was used to look up the mountpoint, to prevent unmount in
between.
Reported and tested by: pho
Reviewed by: rwatson
MFC after: 3 weeks
This fixes problems with discovering some USB devices that are very slow to
respond during initialisation.
When a USB device is inserted, CAM performs the sequence:
1) INQUIRY
2) INQUIRY (second time with other parameters)
3) TEST UNIT READY
4) READ CAPACITY
Before this change CAM didn't check if TEST UNIT READY was successful and went
on blindly to the next state and sent READ CAPACITY. If the device was still
not ready by then, CAM ended with error message. This patch adds checking for the
status of TEST UNIT READY command and retrying up to 10 times with 0.5 sec
interval.
Submitted by: Grzegorz Bernacki gjb ! semihalf dot com
Reviewed by: scottl
1.Sync TD on close to ensure USB request in close callback issued.
2.Add sysctls to indicate device role.
3.Enable handsfree port support.
Now modem port and obex port works well.
Handsfree port works but not with good response.
When trying to read scratched or damaged CDs and DVDs, the default
mechanism is sub-optimal. Programs like ddrescue do much better if
you turn off retries entirely, since their algorithms are designed
scan big areas fast, then winnow the areas down. Turning off retries
speeds these programs up by as much as 20x, since the drive is able to
'stream past' many small errors...
The sysctl/tunable kern.cam.cd.retry_count controls this. That
defaults to '4' (for a total of 5 attempts). Setting to 0 turns off
all retry attempts.
Reviewed by: scottl@
specifically SPI controllers now also work in big-endian
machines and some conversions relevant for FC and SAS
controllers as well as support for ILP32 machines which all
were omitted in previous attempts are now also implemented.
The IOCTL-interface is intentionally left (and where needed
actually changed) to be completely little-endian as otherwise
we would have to add conversion code for every possible
configuration page to mpt(4), which didn't seem the right
thing to do, neither did converting only half of the user-
interface to the native byte order.
This change was tested on amd64 (SAS+SPI), i386 (SAS) and
sparc64 (SAS+SPI). Due to lack of the necessary hardware
the target mode code is still left to be made endian-clean.
Reviewed by: scottl
MFC after: 1 month
functions and stop attaching of dcons(4) and dcons_crom(4) if
they indicate failure. This fixes a panic seen on sparc64 machines
with no free physical memory in the requested 32-bit region but
still doesn't make dcons(4)/dcons_crom(4) these work. I think
the latter can be fixed by simply specifying ~0UL as the upper
limit for contigmalloc(9) and letting the bounce pages and the
IOMMU respectively handle limitations of the DMA engine. I didn't
want to change that without the consensus of simokawa@ though,
who unfortunately didn't reply so far.
MFC after: 1 week
installing the kernel allows one, like with modules, to override
the default user/group and install as non-su to a temporary
directory to test, create images or seed a tftp dir.
Reviewed by: Andrzej Tobola <ato@iem.pw.edu.pl>
MFC after: 4 weeks
module. These files cause manual interaction when building
ports/audio/aureal-kmod which provides a usable i386-only driver (it requires
linking against some linux object files distributed by vendor which bankrupted
back in 2000).
MFC after: 1 week
subclasses as are available with PCI. Changes I2C device drivers without
real probe logic to return BUS_PROBE_NOWILDWARD to avoid interference with
firmware bus enumeration, and reduces the probe priority of the iicbus
base driver to allow subclass attachment at higher priority.
Discussed on: freebsd-arch
lock in order to avoid the lock acquire hit if the pipe list is very
likely empty.
Obtained from: TrustedBSD Project
MFC after: 3 weeks
Sponsored by: Apple, Inc.
allowing the full 16-bit width of the corresponding fields in the
VTOC8 label to be used. The removed limits basically only held
true for providers labeled using the synthetic geometry provided
by cam_calc_geometry(9) but neither SCSI disks labeled with Solaris
nor sufficiently large ATA disks.
- Given that providers (originally) labeled with Solaris typically
use the native geometry as reported by the target while FreeBSD
typically uses a synthetic one put the message complaining about
mismatching geometries between what the label indicates and what
GEOM thinks the provider has, which we generally can't help,
under bootverbose in order to not unnecessarily scare users.
- For informational purposes add the non-matching values to the
message complaining about them, similar to what r186501 did for
g_part_bsd_read() except also indicating the origin of the
values.
- Make it clear that the messages emitted by this code refer to
the VTOC8 support rather than to another existing scheme or to
VTOC32.
record is active on the current thread--historically we may always
have wanted to enter the audit code if auditing was enabled, but now
we just commit the audit record so don't need to enter if there isn't
one.
Obtained from: TrustedBSD Project
Sponsored by: Apple, Inc.
the KASSERT and checks suggested.
Reviewed by: The udp tunneling was discussed on net@ under the
thread entitled "Heads up -- Thinking about UDP and tunneling"
- Remove unused typedefs to avoid confusion and ease in merging
ip6protosw with protosw.
- Correct a few comments.
- Remove most of a comment about usrreq. [1]
- Use tabs instead of spaces for consistency.
Submitted by: rwatson [1]
Reviewed by: rwatson
MFC after: 3 weeks
would fail to attach due to unsupported USB revision. It should have
no effect when running on a real hardware.
Reviewed by: imp
Approved by: rwatson (mentor)
belonging to a devices children, in analogy to the way we handle interrupts
for SCC serial devices. This is required to counteract overly deep nesting
on onboard audio devices.
Submitted by: Marco Trillo
- Implement NP (ASCII 12, Form Feed). When used with cons25, it should
clear the screen and place the cursor at the top of the screen. When
used with xterm, it should just simulate a newline.
- When we want to use xterm emulation, make teken_demo set TERM to
xterm.
Spotted by: Paul B. Mahol <onemda@gmail.com>
driver since it couldn't have worked with NEWCARD w/o these fixes.
This should allow selecting 16-bit memory width as well (which was
what was broken).
functions used in the bootloader. The goal is to make the code more
readable and smaller (especially because we have size issues
in the loader's environment).
High level description of the changes:
+ define some string manipulation functions to improve readability;
+ create functions to manipulate module descriptors, removing some
duplicated code;
+ rename the error codes to ESOMETHING;
+ consistently use set_environment_variable (which evaluates
$variables) when interpreting variable=value assignments;
I have tested the code, but there might be code paths that I have
not traversed so please let me know of any issues.
Details of this change:
--- loader.4th ---
+ add some module operators, to remove duplicated code while parsing
module-related commands:
set-module-flag
enable-module
disable-module
toggle-module
show-module
--- pnp.4th ---
+ move here the definition related to the pnp devices list, e.g.
STAILQ_* , pnpident, pnpinfo
--- support.4th ---
+ rename error codes to capital e.g. ENOMEM EFREE ... and do obvious
changes related to the renaming;
+ remove unused structures (those relevant to pnp are moved to pnp.4th)
+ various string functions
- strlen removed (it is an internal function)
- strchr, defined as the C function
- strtype -- type a string to output
- strref -- assign a reference to the string on the stack
- unquote -- remove quotes from a string
+ remove reset_line_buffer
+ move up the 'set_environment_variable' function (which now
uses the interpreter, so $variables are evaluated).
Use the function in various places
+ add a 'test_file function' for debugging purposes
MFC after: 4 weeks
to GENERIC configuration files. This brings what's in 8.x in sync
with what is in 7.x, but does not change any current defaults.
Possibly they should now be enabled in head by default?
Because we now have cons25-style linewrapping, we must also use cons25-
style reverse linewrapping. This means that a ^H on column 0 will move
the cursor one line up.
Also fix a small regression: if the user invokes a RIS (Reset to Initial
State), we must show the cursor again.
Spotted by: Paul B. Mahol <onemda gmail com>
did not use C99-style sparse structure initialization, so remove
NULL assignments for now-removed pr_usrreq function pointers.
Reported by: Chris Ruiz <yr.retarded at gmail.com>
During boot, the domain list is locked with Giant. It is not possible to
register any protocols after the system has booted, so the lock is only
used to protect insertion of entries.
There is already a mutex in uipc_domain.c called dom_mtx. Use this mutex
to lock the list, instead of using Giant. It won't matter anything with
respect to performance, but we'll never get rid of Giant if we don't
remove from places where we don't need it.
Approved by: rwatson
MFC after: 3 weeks
o Don't check the dummy fields.
o The entry is unused if either dp_mid is 0 or dp_sid is 0.
o The start or end cylinder cannot be 0.
o The start CHS cannot be equal to the end CHS.
Submitted by: nyan
With cons25, there are printable characters below 0x1B. This is not the
case with ASCII, UTF-8, etc. but in this case we just have to.
Also don't set LC_CTYPE to UTF-8 when libteken is compiled without UTF-8
in the demo-application.
src/lib/libusb20/libusb20_desc.c
Make "libusb20_desc_foreach()" more readable.
src/sys/dev/usb2/controller/*.[ch]
src/sys/dev/usb2/core/*.[ch]
Implement support for USB power save for all HC's.
Implement support for Big-endian EHCI.
Move Huawei quirks back into "u3g" driver.
Improve device enumeration.
src/sys/dev/usb2/ethernet/*[ch]
Patches for supporting new AXE Gigabit chipset.
src/sys/dev/usb2/serial/*[ch]
Fix IOCTL return code.
src/sys/dev/usb2/wlan/*[ch]
Sync with old USB stack.
Submitted by: hps
It turns out I was looking too much at mimicing xterm, that I didn't
take the differences of cons25 into account. There are some differences
between xterm and cons25 that are important. Create a new #define called
TEKEN_CONS25 that can be toggled to switch between cons25 and xterm
mode.
- Don't forget to redraw the cursor after processing a forward/backward
tabulation.
- Implement cons25-style (WYSE?) autowrapping. This form of autowrapping
isn't that nice. It wraps the cursor when printing something on column
80. xterm wraps when printing the first character that doesn't fit.
- In cons25, a \t shouldn't overwrite previous contents, while xterm
does.
Reported by: Garrett Cooper <yanefbsd gmail com>
cells in the map, instead of using a value passed to it and then panicing if it
disagrees. This fixes interrupt map parsing for PCI bridges on some Apple
Uninorth PCI controllers.
Reported by: marcel
Tested on: G4 iBook, Sun Ultra 5
of the counter, that may happen when too many sendfile(2) calls are
being executed with this vnode [1].
To keep the size of the struct vm_page and offsets of the fields
accessed by out-of-tree modules, swap the types and locations
of the wire_count and cow fields. Add safety checks to detect cow
overflow and force fallback to the normal copy code for zero-copy
sockets. [2]
Reported by: Anton Yuzhaninov <citrin citrin ru> [1]
Suggested by: alc [2]
Reviewed by: alc
MFC after: 2 weeks
disabled entirely, which is its default state before set to a
non-zero value.
PR: 128790
Submitted by: Nick Hilliard <nick at foobar dot org>
MFC after: 3 weeks
to ip_output(). The destionation is represented in a sockaddr{} object
that may contain other pieces of information, e.g., port number. This
same destination sockaddr{} object may be passed into L2 code, which
could be used to create a L2 entry. Since there exists a L2 table per
address family, the L2 lookup function can make address family specific
comparison instead of the generic bcmp() operation over the entire
sockaddr{} structure.
Note in the IPv6 case the sin6_scope_id is not compared because the
address is currently stored in the embedded form inside the kernel.
The in6_lltable_lookup() has to account for the scope-id if this
storage format were to change in the future.
During startup some of the syscons TTY's are used to set attributes like
the screensaver and mouse options. These actions cause /dev/console to
be rendered unusable.
Fix the issue by leaving the TTY opened when it is used as the console
device.
Reported by: imp
The cursor is only inside the scrolling region when we are in origin
mode. In that case, it should use originreg instead of scrollreg. It is
completely valid to place the cursor outside the scrolling region.
the field in the mbuf constructors, since otherwise we have no way to
tell if they are valid. In the future, Kip has plans to add a flag
specifically to indicate validity, which is the preferred model.
whole KVA space using one locked 4MB dTLB entry per GB of physical
memory. On Cheetah-class machines only the dt16 can hold locked
entries though, which would be completely consumed for the kernel
TSB on machines with >= 16GB. Therefore limit the KVA space to use
no more than half of the lockable dTLB slots, given that we need
them also for other things.
- Add sanity checks which ensure that we don't exhaust the (lockable)
TLB slots.
Some time ago I started working on a library called libteken, which is
terminal emulator. It does not buffer any screen contents, but only
keeps terminal state, such as cursor position, attributes, etc. It
should implement all escape sequences that are implemented by the
cons25 terminal emulator, but also a fair amount of sequences that are
present in VT100 and xterm.
A lot of random notes, which could be of interest to users/developers:
- Even though I'm leaving the terminal type set to `cons25', users can
do experiments with placing `xterm-color' in /etc/ttys. Because we
only implement a subset of features of xterm, this may cause
artifacts. We should consider extending libteken, because in my
opinion xterm is the way to go. Some missing features:
- Keypad application mode (DECKPAM)
- Character sets (SCS)
- libteken is filled with a fair amount of assertions, but unfortunately
we cannot go into the debugger anymore if we fail them. I've done
development of this library almost entirely in userspace. In
sys/dev/syscons/teken there are two applications that can be helpful
when debugging the code:
- teken_demo: a terminal emulator that can be started from a regular
xterm that emulates a terminal using libteken. This application can
be very useful to debug any rendering issues.
- teken_stress: a stress testing application that emulates random
terminal output. libteken has literally survived multiple terabytes
of random input.
- libteken also includes support for UTF-8, but unfortunately our input
layer and font renderer don't support this. If users want to
experiment with UTF-8 support, they can enable `TEKEN_UTF8' in
teken.h. If you recompile your kernel or the teken_demo application,
you can hold some nice experiments.
- I've left PC98 the way it is right now. The PC98 platform has a custom
syscons renderer, which supports some form of localised input. Maybe
we should port PC98 to libteken by the time syscons supports UTF-8?
- I've removed the `dumb' terminal emulator. It has been broken for
years. It hasn't survived the `struct proc' -> `struct thread'
conversion.
- To prevent confusion among people that want to hack on libteken:
unlike syscons, the state machines that parse the escape sequences are
machine generated. This means that if you want to add new escape
sequences, you have to add an entry to the `sequences' file. This will
cause new entries to be added to `teken_state.h'.
- Any rendering artifacts that didn't occur prior to this commit are by
accident. They should be reported to me, so I can fix them.
Discussed on: current@, hackers@
Discussed with: philip (at 25C3)
When sysctl() is being called with a buffer that is too small, it will
return ENOMEM. Unfortunately the changes I made the other day sets the
error number to 0, because it just returns the error number of the
copyout(). Revert this part of the change.
Add support to uscanner.c for known-working devices
(the same should be done for uscanner2.c).
Waiting for 7.1 to be released before the merge.
MFC after: 3 weeks
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge). Add libauditd build parts and add to auditd's linkage;
force libbsm to build before libauditd.
OpenBSM history for imported revisions below for reference.
MFC after: 1 month
Sponsored by: Apple Inc.
Obtained from: TrustedBSD Project
OpenBSM 1.1 alpha 4
- With the addition of BSM error number mapping, we also need to map the
local error number passed to audit_submit(3) to a BSM error number,
rather than have the caller perform that conversion.
- Reallocate user audit events to avoid collisions with Solaris; adopt a
more formal allocation scheme, and add some events allocated in Solaris
that will be of immediate use on other platforms.
- Add an event for Calife.
- Add au_strerror(3), which allows generating strings for BSM errors
directly, rather than requiring applications to map to the local error
space, which might not be able to entirely represent the BSM error
number space.
- Major auditd rewrite for launchd(8) support. Add libauditd library
that is shared between launchd and auditd.
- Add AUDIT_TRIGGER_INITIALIZE trigger (sent via 'audit -i') for
(re)starting auditing under launchd(8) on Mac OS X.
- Add 'current' symlink to active audit trail.
- Add crash recovery of previous audit trail file when detected on audit
startup that it has not been properly terminated.
- Add the event AUE_audit_recovery to indicated when an audit trail file
has been recovered from not being properly terminated. This event is
stored in the new audit trail file and includes the path of recovered
audit trail file.
- Mac OS X and FreeBSD dependent code in auditd.c is separated into
auditd_darwin.c and auditd_fbsd.c files.
- Add an event for the posix_spawn(2) and fsgetpath(2) Mac OS X system
calls.
- For Mac OS X, we use ASL(3) instead of syslog(3) for logging.
- Add support for NOTICE level logging.
OpenBSM 1.1 alpha 3
- Add two new functions, au_bsm_to_errno() and au_errno_to_bsm(), to map
between BSM error numbers (largely the Solaris definitions) and local
errno(2) values for 32-bit and 64-bit return tokens. This is required
as operating systems don't agree on some of the values of more recent
error numbers.
- Fix a bug how au_to_exec_args(3) and au_to_exec_env(3) calculates the
total size for the token. This buge.
- Deprecated Darwin constants, such as TRAILER_PAD_MAGIC, removed.
vm_map_lookup{,_locked}() to vm_map_lookup_entry(). Having the fast path
in vm_map_lookup{,_locked}() limits its benefits to page faults. Moving
it to vm_map_lookup_entry() extends its benefits to other operations on
the vm map.
did not compared nc_dvp with supplied parent directory vnode pointer.
Add the check and note that now branches for vp != NULL and vp == NULL
are the same, thus can be merged.
Reported and reviewed by: kan
Tested by: pho
MFC after: 2 weeks
and re-enable it as default.
In particular:
+ re-enable the 'update' flag in the Makefile (of course!);
+ commit Warner's patch "orb $NOUPDATE,_FLAGS(%bp)"
to avoid writing to disk in case of a timeout/default choice;
+ fix an off-by-one count in the partition scan code that would
print the wrong name for unknown partitions;
+ unconditionally change the boot prompt to 'Boot:' instead of 'Default:'
to make room for the extra code/checks/messages. Some of the changes
listed below are also made to save space;
+ rearrange and fix comments for known partition types. Right now we
explicitly recognise *BSD, Linux, FAT16 (type 6, used on many USB keys),
NTFS (type 7), FAT32 (type 11).
Depending on other options we also recognise Extended (type 5),
FAT12 (type 1) and FAT16 < 32MB (type 4).
+ Add an entry "F6 PXE" when the code is built with -DPXE (which is
a default now). Technically, F6 boots through INT18, so the prompt 'PXE'
is a bit misleading. Unfortunately the name INT18
is too long and does not fit in - we could use ROM perhaps.
The reason I picked 'PXE' is that on many (I believe) new systems
INT18 calls PXE.
Apart from the choice of the name for PXE/ROM/INT18, this should close
pending issues on the 1-sector boot0 code and we should be able to
move the code to RELENG_7 when it reopens.
No boot0cfg changes are necessary.
MFC after: 3 weeks
It seems I forgot to remove `int error' from a single piece of code. I'm
also moving ogetkerninfo() to kern_xxx.c, because it belongs to the
class of compat system information system calls, not the generic sysctl
code.
pfs_vncache_free() that removes pvd from the list, while it is not yet
put on the list.
Prevent the invalid removal from the list by clearing pvd_next and
pvd_prev for the newly allocated pvd, and only move pfs_vncache list
head when the pvd was at the head.
Suggested and approved by: des
MFC after: 2 weeks
In the existing code we didn't really enforce that callers hold Giant
before calling userland_sysctl(), even though there is no guarantee it
is safe. Fix this by just placing Giant locks around the call to the oid
handler. This also means we only pick up Giant for a very short period
of time. Maybe we should add MPSAFE flags to sysctl or phase it out all
together.
I've also added SYSCTL_LOCK_ASSERT(). We have to make sure sysctl_root()
and name2oid() are called with the sysctl lock held.
Reviewed by: Jille Timmermans <jille quis cx>
compare map->timestamp with saved timestamp after map read lock is
reacquired, not with saved timestamp + 1. The only consequence of the +1
was unconditional lookup of the next map entry, though.
Tested by: pho
Approved by: des
MFC after: 2 weeks
may need to lock arbitrary vnodes, causing either lock order reversal or
recursive vnode lock acquisition.
Tested by: pho
Approved by: des
MFC after: 2 weeks
do pfs_vncache_alloc() for the same pfs_node and pid. In this case, we
could end up with two vnodes for the pair. Recheck the cache under the
locked pfs_vncache_mutex after all sleeping operations are done [1].
This case mostly cannot happen now because pseudofs uses exclusive vnode
locking for lookup. But it does drop the vnode lock for dotdot lookups,
and Marcus' pseudofs_vptocnp implementation is vulnerable too.
Do not call free() on the struct pfs_vdata after insmntque() failure,
because vp->v_data points to the structure, and pseudofs_reclaim()
frees it by the call to pfs_vncache_free().
Tested by: pho [1]
Approved by: des
MFC after: 2 weeks
Log:
- merge in latest xenbus from dfr's xenhvm
- fix race condition in xs_read_reply by converting tsleep to mtx_sleep
Log:
unmask evtchn in bind_{virq, ipi}_to_irq
Log:
- remove code for handling case of not being able to sleep
- eliminate tsleep - make sleeps atomic
changes since the last imported OpenBSM release:
OpenBSM 1.1 alpha 4
- With the addition of BSM error number mapping, we also need to map the
local error number passed to audit_submit(3) to a BSM error number,
rather than have the caller perform that conversion.
- Reallocate user audit events to avoid collisions with Solaris; adopt a
more formal allocation scheme, and add some events allocated in Solaris
that will be of immediate use on other platforms.
- Add an event for Calife.
- Add au_strerror(3), which allows generating strings for BSM errors
directly, rather than requiring applications to map to the local error
space, which might not be able to entirely represent the BSM error
number space.
- Major auditd rewrite for launchd(8) support. Add libauditd library
that is shared between launchd and auditd.
- Add AUDIT_TRIGGER_INITIALIZE trigger (sent via 'audit -i') for
(re)starting auditing under launchd(8) on Mac OS X.
- Add 'current' symlink to active audit trail.
- Add crash recovery of previous audit trail file when detected on audit
startup that it has not been properly terminated.
- Add the event AUE_audit_recovery to indicated when an audit trail file
has been recovered from not being properly terminated. This event is
stored in the new audit trail file and includes the path of recovered
audit trail file.
- Mac OS X and FreeBSD dependent code in auditd.c is separated into
auditd_darwin.c and auditd_fbsd.c files.
- Add an event for the posix_spawn(2) and fsgetpath(2) Mac OS X system
calls.
- For Mac OS X, we use ASL(3) instead of syslog(3) for logging.
- Add support for NOTICE level logging.
OpenBSM 1.1 alpha 3
- Add two new functions, au_bsm_to_errno() and au_errno_to_bsm(), to map
between BSM error numbers (largely the Solaris definitions) and local
errno(2) values for 32-bit and 64-bit return tokens. This is required
as operating systems don't agree on some of the values of more recent
error numbers.
- Fix a bug how au_to_exec_args(3) and au_to_exec_env(3) calculates the
total size for the token. This bug resulted in "unknown" tokens being
printed after the exec args/env tokens.
- Support for AUT_SOCKET_EX extended socket tokens, which describe a
socket using a pair of IPv4/IPv6 and port tuples.
- OpenBSM BSM file header version bumped for 1.1 release.
- Deprecated Darwin constants, such as TRAILER_PAD_MAGIC, removed.
Obtained from: TrustedBSD Project
Sponsored by: Apple Inc.
options. Using the already included std.avila is not considered
to be entirely right (and the options slightly differ) but the best
match we currently have. Upcoming work should fit better.
Reorder another variable to match the layout of other configs.
Reviewed by: sam, warner (earlier version with options removed)
variable name for the inpcb.
For consistency with the other *_hdrsiz functions use 'size'
instead of 'siz' as variable name.
No functional change.
MFC after: 4 weeks
- Always use round brackets with return ().
- Add empty line to beginning of functions without local variables.
- Comments start with a capital letter and end in a '.'.
While there adapt a few comments.
Reviewed by: rwatson
MFC after: 4 weeks
arrays under #ifndef XEN to make XEN config compile again.
In case of Xen vm_guest is hard coded.
Move the list for the vm_guest sysctl out of the restictive
bounds as the sysctl is there in either case.
Note, that the patch provided with this card for the Linux states that
the card uses DEFAULT_RCLK * 2, while in fact it is '* 10'. So probably
we should also use the subdevice/subvendord here. For now just ignore
that fact.
PR: kern/129665
Submitted by: bsam
Obtained from: united efforst with bsam
plex. If the plex is a raid5 plex, and is being written to, parity data might
have to be read from the underlying disks, requiring them to be opened for
reading as well as writing.
MFC after: 1 week
Now the NDISulator supports NDIS USB drivers that it've tested with
devices as follows:
- Anygate XM-142 (Conexant)
- Netgear WG111v2 (Realtek)
- U-Khan UW-2054u (Marvell)
- Shuttle XPC Accessory PN20 (Realtek)
- ipTIME G054U2 (Ralink)
- UNiCORN WL-54G (ZyDAS)
- ZyXEL G-200v2 (ZyDAS)
All of them succeeded to attach and worked though there are still some
problems that it's expected to be solved.
To use NDIS USB support, you should rebuild and install ndiscvt(8) and
if you encounter a problem to attach please set `hw.ndisusb.halt' to
0 then retry.
I expect no changes of the NDIS code for PCI, PCMCIA devices.
Obtained from: //depot/projects/ndisusb/...
Disable some unneeded pathes in overcomplicated input mixer to help parser
to handle the rest better. This gives mic input boost control in some
configurations and just more predictable operation in others.
1. The "route" command allows route insertion through the interface-direct
option "-iface". During if_attach(), an sockaddr_dl{} entry is created
for the interface and is part of the interface address list. This
sockaddr_dl{} entry describes the interface in detail. The "route"
command selects this entry as the "gateway" object when the "-iface"
option is present. The "arp" and "ndp" commands also interact with the
kernel through the routing socket when adding and removing static L2
entries. The static L2 information is also provided through the
"gateway" object with an AF_LINK family type, similar to what is
provided by the "route" command. In order to differentiate between
these two types of operations, a RTF_LLDATA flag is introduced. This
flag is set by the "arp" and "ndp" commands when issuing the add and
delete commands. This flag is also set in each L2 entry returned by the
kernel. The "arp" and "ndp" command follows a convention where a RTM_GET
is issued first followed by a RTM_ADD/DELETE. This RTM_GET request fills
in the fields for a "rtm" object, which is reinjected into the kernel by
a subsequent RTM_ADD/DELETE command. The entry returend from RTM_GET
is a prefix route, so the RTF_LLDATA flag must be specified when issuing
the RTM_ADD/DELETE messages.
2. Enforce the convention that NET_RT_FLAGS with a 0 w_arg is the
specification for retrieving L2 information. Also optimized the
code logic.
Reviewed by: julian
the following operations, e.g.:
1) ifconfig tun0 create
2) ifconfig tun0 10.1.1.1 10.1.1.2
3) route add -net 192.103.54.0/24 -iface tun0
4) ifconfig tun0 destroy
If cv wait on the TUN_CLOSED flag, then the last operation (4) will
block forever.
Revert the previous changes and fix the mtx_unlock() leak.
invariants and approach for protocol switch methods in protsw_init(),
and also some KASSERT's for non-domain init entries in protocol
switch tables: pru_abort and pru_send must both be implemented.
For now, leave those assertions #if 0'd, since there are a few
protocols that violate them in non-harmful ways. Whether or not we
should enforce pru_abort being implemented for non-stream protocols
is an interesting question: currently abort is only invoked on stream
sockets in situations where un-accepted sockets must be abruptly
closed (i.e., close() on a listen socket with pending connections),
but in principle it is useful for datagram sockets and most datagram
socket types implement it.
MFC after: 3 weeks
by adding a separate TUN_CLOSED flag that is set after tunclose is done referencing it.
- drop the tun_mtx after the flag check to avoid holding it across if_detach which can recurse in to
if_tun.c
FPA floating-point format is identical to the VFP format,
but is always stored in big-endian.
Introduce _IEEE_WORD_ORDER to describe the byte-order of
the FP representation.
Obtained from: Juniper Networks, Inc
looked up would have v_dd set to a non-NULL value. This fixes a panic
seen when running installworld on a diskless system with a separate /usr
file system.
Submitted by: cracauer
Approved by: kib
Note that you need at least xf86-video-intel 2.4.3 for this to work.
The G4X doesn't put the GATT into the same area of stolen memory
as all the other chips and older versions of the driver didn't
handle that properly.
Tested by: ganbold
Approved by: kib
MFC after: 2 weeks
Note that there is no working backend (or at least
that is mentioned in the PR ticket) but the device
is now supported on our end.
PR: 117205
Submitted by: Artem Naluzhnyy <tut at nhamon dot com dot ua>
MFC after: 1 week
o check feature bits when probing NPE ethernet support
o move firmware loading logic from if_npe to core npe support
o allow multiple refs to core NPE driver
o while here fix hw.npe.debug tunable path
o add definitions for more bits, for masking out IXP465-specific bits,
and %b format string
o add ixp4xx_read_feature_bits to retrieve the mask of valid features
(aka fuse bits)
o add cpu_is_ixp42x() macro
o print feature bits at boot
o add EHCI_SCFLG_BIGEMMIO flag to force big-endian byte-select to be
set in USBMODE
o split reset work into new public routine ehci_reset so bus shim drivers
can force big-endian byte-select before ehci_init
long commands into multiple requests. [08:12]
Avoid calling uninitialized function pointers in protocol switch
code. [08:13]
Merry Christmas everybody...
Approved by: so (cperciva)
Approved by: re (kensmith)
Security: FreeBSD-SA-08:12.ftpd, FreeBSD-SA-08:13.protosw
mac_proc_vm_revoke_recurse() requests a read lock on the vm map at the start
but does not handle failure by vm_map_lock_upgrade() when it seeks to modify
the vm map. At present, this works because all lock request on a vm map are
implemented as exclusive locks. Thus, vm_map_lock_upgrade() is a no-op that
always reports success. However, that is about to change, and
proc_vm_revoke_recurse() will require substantial modifications to handle
vm_map_lock_upgrade() failures. For the time being, I am changing
mac_proc_vm_revoke_recurse() to request a write lock on the vm map at the
start.
Approved by: rwatson
MFC after: 3 months
firmware versions which wedge when using the OFW test service,
so given that we don't really depend on SUNW,stop-self just nuke
it altogether instead of risking problems.
- At least Fire V880 have a small hardware glitch which causes the
reception of IDR_NACKs for CPUs we actually haven't tried to send
an IPI to, even not as part of the initial try. According to tests
this apparently can be safely ignored though, so just return if
checking for the individual IDR_NACKs indicates no outstanding
dispatch. Serializing the sending of IPIs between MD and MI code
by the combined usage of smp_ipi_mtx makes no difference to this
phenomenon. [1]
- Provide relevant debugging bits already with the initial panic
in case of problems with the IPI dispatch, which would have
allowed to diagnose the above problem without a specially built
kernel.
- In case of cheetah_ipi_selected() base the delay we wait for
other CPUs which also might want to dispatch IPIs on the total
amount of CPUs instead of just the number of CPUs we let this
CPU send IPIs to because in the worst case all CPUs also want
to IPI us at the same time.
Reported and access for extensive tests provided by: Beat Gaetzi [1]
destroy operation until the referenced clone device has
been closed by the process properly. The behavior is now
consistently with the previous release.
Reviewed by: Kip Macy
event from mii(4) may not be delivered if valid link was already
established. To address the issue, check current link state after
driving MII_TICK. This should fix a regression introduced in
r184245.
PR: kern/129647
event from mii(4) may not be delivered if valid link was already
established. To address the issue, check current link state after
driving MII_TICK. This should fix a regression introduced in
r185753 on fast ethernet controllers.
Reported by: csjp, Bruce Cran < bruce <> cran DOT org DOT uk >
Tested by: csjp, Bruce Cran (initial version)
In r185891 I removed the newlines from messages written to /dev/console,
because it made startup messages from rc-scripts harder to read. This,
unfortunately, causes the kernel message that is printed after a
non-terminated log message to be concatenated.
This could be fixed, but on short term it's better to just revert the
change.
Reported by: Jaakko Heinonen <jh saunalahti fi>
Inside ptsdrv_{in,out}wakeup() we call KNOTE_LOCKED() to wake up any
kevent(2) users. Because the kqueue handlers are executed synchronously,
we must set PTS_FINISHED before calling ptsdrv_{in,out}wakeup().
Discovered by: nork
Right now the wchan strings "ttyinp" and "ttybgw" only differ one
character from the strings we used prior to MPSAFE TTY. Just rename them
back to their pre-MPSAFE TTY counterparts.
Also rename "ttylck" to "ttymtx", which should make it more clear that a
process is blocked on the TTY mutex, not some other form of locking.
o add support for IXP435 cpu's (e.g. 64 irq's)
o add support for Cambria-specific devices: npe, led's (front panel and
octal latch), ehci, mcu, ide cf
o redo memory mapping for xscale/ixp4xx boards: previously memory
was assumed aliased to 0x10000000 but this appears to be true only
for ixp425 systems and breaks operation on others; rework so memory
is assumed to start at 0
o rework NPE configuration support to use NPE id's instead of port #'s;
these changes also rename the associated MAC's to follow the NPE's
they are attached to
o update npe firmware to latest rev (same license) and update default fw
imageid's to match; in particular this adds NPE-A and crypto support
o re-style NPE fw handling code and add a console msg identifying the
attributes of the loaded fw
o fix numerous problems with handling failures during npe setup
o fix npe rx q setup; need to spin waiting for mailbox responses during
early boot stages as qmgr interrupts are not delivered; this fixes
the problem where all 8 traffic classifications were not tied to the
rx q (and eliminates the console msg "remember to fix rx q setup")
o add DELAY to npe MII wait logic for IXP435
o strip down builtin phys->virt address translation table in resource
handling to just those resources that require it and add a console msg
to alert people when this (kludge) table needs to be extended
o purge a bunch of dead netbsd-ism's
o cleanup avila led driver
o add Cambria support to boot2 and rework code for better multi-board support
Notes:
1. NPE-A doesn't work and causes NPE-C to stop working; it is disabled
in the hints
2. USB isn't working yet; controller communicates ok but device
discovery fails
3. Cambria support must be configured separately from IXP425 boards;
multi-board support is TBD
Sponsored by: Hobnob, Gateworks (board donation)
Reviewed by: imp
o add support to byte swap EHCI descriptor contents; the IXP435
has dual-EHCI controllers integral but descriptor contents are
in big-endian format; this support is configured with the
USB_EHCI_BIG_ENDIAN_DESC option and enabled with EHCI_SCFLG_BIGEDESC
o clean up EHCI USBMODE register setup during init; add #defines for
bit values
o split debug support out into a new file and enable use through ddb
o while here remove a bunch of lingering netbsd-isms
Reviewed by: imp
of OFW access semantics, in order to allow future support for real-mode
OF access and flattened device frees. OF client interface modules are
implemented using KOBJ, in a similar way to the PPC PMAP modules.
Because we need Open Firmware to be available before mutexes can be used on
sparc64, changes are also included to allow KOBJ to be used very early in
the boot process by only using the mutex once we know it has been initialized.
Reviewed by: marius, grehan
This commit is slightly different from the original patch in the PR:
1. EM_ALPHA keeps the old value for compatibility reason.
2. Non-standard SHT_NUM is not added.
3. Style.
PR: kern/118540
Submitted by: "Pedro F. Giffuni" <giffunip[at]tutopia.com>
Intel 855 chips present the same pci id for both heads. This prevents
us from attaching to the dummy second head. All other chips that I
am aware of either only present a single pci id, or different ids
for each head so that we only match on the correct head.
Approved by: kib@
MFC after: 2 weeks
storage class. This check was lost. It is not important for the most cases,
but as it was reported on current@, it does important for sis driver and
surely inportant for AHCI driver. So restore it there.
Submitted by: Toshikazu ICHINOSEKI, Andrey V. Elsukov
Discussed on: current@
ati pci gart to use bus_dma to handle the allocations. This fixes
a garbled screen issue on at least some radeons (X1400 tested). It is
also likely that this is the correct fix for PR# 119324, though that
is not confirmed yet.
Reviewed by: jhb@ (mentor, prior version)
Approved by: kib@
MFC after: 2 weeks
be fatal so just inform about this instead of panicing.
- Ensure we use the right softc in case the interrupt of a child is
is routed to the companion PBM instead. This hasn't been seen in the
wild so far but given that it's the case for the Schizo interrupts,
handling this situation also for child interrupts as a precaution
seemed a good idea.
- Deal with broken firmware versions which miss child entries in the
ino-bitmap as seen on V880 by belatedly registering as interrupt
controller in schizo_setup_intr(). [1]
- Add missing '\n' when printing the warning regarding Schizo Errata
I-13.
Reported and tested by: Beat Gaetzi [1]
- Make LBC resources management self-contained: introduce explicit LBC
resources definition (much like the OCP), provide dedicated rman for LB mem
space.
- Full configuration of an LB chip select device: program LAW and BR/OR, map
into KVA, handle all LB attributes (bus width, machine select, ecc,
write protect etc).
- Factor out LAW manipulation routines into shared code, adjust OCP area
accordingly.
- Other LBC fixes and clean-ups.
Obtained from: Semihalf
NULL pointer to struct mount if the looked up vnode is reclaimed. Also,
these syscalls only mnt_ref() the mp, still allowing it to be unmounted;
only struct mount memory is kept from being reused.
Lock the vnode when doing name lookup, then reference its mount point,
unlock the vnode and vfs_busy the mountpoint. This sequence shall take
care of both races.
Reported and tested by: pho
Discussed with: attilio
MFC after: 1 month
payload length in TSO case. Leaving unused TBD also seem to cause
SCB timeouts under certain conditions when TSO/non-TSO traffics
are active at the same time.
Add code to the Chelsio driver so that it can recognize different
module types which may be plugged into it, including SR, LR lasers
and TWINAX copper cables.
Obtained from: Chelsio Inc.
MFC after: 1 week
it running under a virtual environment. This also introduces a globally
accessible variable vm_guest that can be used where appropriate in the
kernel to inspect this environment.
To make it easier for the long run, an enum VM_GUEST is also introduced,
which could possibly be factored out in a header somewhere (but the
question is where - vm/vm_param.h? sys/param.h?) so it eventually becomes
a part of the standard KPI. In any case, it's a start.
The purpose of all this isn't to absolutely detect that the OS is running
under a virtual environment (cf. "redpill") but to allow the parts of the
kernel and the userland that care about this particular aspect and can do
something useful depending on it to have a standardised interface. Reducing
kern.hz is one example but there are other things that could be done like
avoiding context switches, not using CPU instructions that are known to be
slow in emulation, possibly different strategies in VM (memory) allocation,
CPU scheduling, etc.
It isn't clear if the JAILS/VIMAGE functionality should also be exposed
by this particular mechanism (probably not since they're not "full"
virtual hardware environments). Sometime in the future another sysctl and
a variable could be introduced to reflect if the kernel supports any kind
of virtual hosting (e.g. VMWare VMI, Xen dom0).
Reviewed by: silence from src-commiters@, virtualization@, kmacy@
Approved by: gnn (mentor)
Security: Obscurity doesn't help.
- split bootstrap code into more modular routines, which will also be used for
the non-booting cores
- clean up registers usage
- improve comments to better reflect reality
- eliminate dead or redundant code
- other minor fixes
This refactoring is a preliminary step before importing dual-core (MPC8572)
support.
Obtained from: Freescale, Semihalf
had been the only flag with random usage patterns.
Switch inc_flags to be used as a real bit field by using
INC_ISIPV6 with bitops to check for the 'isipv6' condition.
While here fix a place or two where in case of v4 inc_flags
were not properly initialized before.[1]
Found by: rwatson during review [1]
Discussed with: rwatson
Reviewed by: rwatson
MFC after: 4 weeks
command whenever Tx completion interrupt is raised. The Tx poll
bit is cleared when all packets waiting to be transferred have been
processed. This means the second Tx poll command can be silently
ignored as the Tx poll bit could be still active while processing
of previous Tx poll command is in progress.
To address the issue re(4) used to invoke the Tx poll command in Tx
completion handler whenever it detects there are pending packets in
TxQ. However that still does not seem to completely eliminate
watchdog timeouts seen on RealTek PCIe controllers. To fix the
issue kick Tx poll command only after Tx completion interrupt is
raised as this would indicate Tx is now idle state such that it can
accept new Tx poll command again. While here apply this workaround
for PCIe based controllers as other controllers does not seem to
have this limitation.
Tested by: Victor Balada Diaz < victor <> bsdes DOT net >
out of sleep mode prior to accessing to PHY. This should fix device
attach failure seen on these controllers. Also enable the sleep
mode when device is put into sleep state.
PR: kern/123123, kern/123053
* Remove trailing whitespace (added in r186162)
* Reduce indentation by rephrasing test
Submitted by: Christopher Mallon (christoph dot mallon at gmx dot de)
- threadA runs vfs_rel(mp1)
- threadB does unmount the mp1 fs, sets MNTK_UNMOUNT and drop MNT_ILOCK()
- threadA runs vfs_busy(mp1) and, as long as, MNTK_UNMOUNT is set, sleeps
waiting for threadB to complete the unmount
- threadB, in vfs_mount_destroy(), finds mnt_lock > 0 and sleeps waiting
for the refcount to expire.
Fix the deadlock by adding a flag called MNTK_REFEXPIRE which signals the
unmounter is waiting for mnt_ref to expire.
The vfs_busy contenders got awake, fails, and if they retry the
MNTK_REFEXPIRE won't allow them to sleep again.
2) Simplify significantly the code of vfs_mount_destroy() trimming
unnecessary codes:
- as long as any reference exited, it is no-more possible to have
write-op (primarty and secondary) in progress.
- it is no needed to drop and reacquire the mount lock.
- filling the structures with dummy values is unuseful as long as
it is going to be freed.
Tested by: pho, Andrea Barberio <insomniac at slackware dot it>
Discussed with: kib
anything other than 0. Make it so. This fixes
"panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648",
encountered when writing to an orphaned filesystem. Reason
for the panic was the following assert:
KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
at vfs_bio:bufstrategy().
Reviewed by: scottl, phk
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
result in panic:
mdconfig -af blah.img -o force
mount /dev/md0 /mnt
mdconfig -du 0
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
o Add support for compiling elf64 for this file (the rest of the changes are
coming later)
o Fill in some misssing relocation types. We need to support these in
elf_machdep.c's relocation routines eventually, but that's future work
too.
the device, which means refcount on periph drivers never drops,
which means cam_sim_free() never returns, which results in umass
sleeping there ad infinitum.
Submitted by: pjd
Reviewed by: scottl, pjd
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
unregistration, and execution:
- Add some brackets for clarity and trim a bit of vertical whitespace.
- Remove comments that may not contribute to clarity, such as "Lock"
before acquiring a lock and "Get memory" before allocating memory.
- During hook registration, don't drop pfil_list_lock between checking
for a duplicate and registering the hook, as this leaves a race
condition by failing to enforce the "no duplicate hooks" invariant.
- Don't lock the hook during registration, since it's not yet in use.
- Document assumption that hooks will be quiesced before being
unregistered.
- Don't write-lock hooks during removal because they are assumed
quiesced.
- Rename "done" label to "locked_error" to be clear that it's an error
path on the way out of hook execution.
MFC after: pretty soon
does - in DragonFly, it's cam_sim_release() what actually frees the
SIM; cam_sim_free does nothing more than calling cam_sim_release().
Here, we drain in cam_sim_free, waiting for refcount to drop to zero.
We cannot do the same think DragonFly does, because after cam_sim_free
returns, client would destroy the sim->mtx, and CAM would trip over
an initialized mutex.
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
to actually use it would panic on mtx operation, as dead_sim doesn't
have a proper mutex. Even if it had a properly initialized mutex,
it wouldn't have properly locked and owned one.
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
configuration registers (which are not going to change) on every interrupt
looks expensive, especially when interrupt is shared. Profiling shows me 3%
of time spent by atapci0 on pure network load due to IRQ sharing with em0.
separate index variable.
It gives more then double rc4_init() performance increase on tested i386 P4.
It also gives about 15% speedup to PPTP VPN with stateless MPPE encryption
(by ng_mppc) which calls rc4_init() for every packet.
- Initialize variables before use.
- Remove a KASSERT() that could falsely trigger if there are other sources
of NMIs in the system.
Efficiency tweak:
- When checking PMCs that overflowed, ignore PMCs that were not configured for
sampling.
a real packet error but simply indicate that an unexpected unicast or multicast
error was received by the NIC, which was not counted in the past as well.
Reported by: many (on -stable@)
Reviewed by: davidch
MFC after: 3 days
On HyperThreading CPUs logical cores have same frequency, so setting it
on any core will change the other's one. In most cases first request
to the second core will be the "set" request, done after setting frequency
of the first core. In such case second CPU will obtain throttled frequency
of the first core as it's max_mhz making cpufreq broken due to different
frequency sets.
(e.g. ath): we must check the key index and not whether the key
points at a cipher other than "undef". This looks like it's been
broken for a while. Might be worth adding an explicit clear cipher
at some point though this would require changes to the usage of
IEEE80211_KEY_UNDEFINED.
PR: 125906
same LOM hardware with goofed-up EEPROM programming also needed reading the
Ethernet address from the chips registers as the EEPROM did not have a
sensible address programmed.
Patch developed by: pyun@
Funky hardware on loan: www.id-it.nl
MFC after: 2 weeks
the inpcb names rather than the following IPv6 compat macros:
in6pcb,in6p_sp, in6p_ip6_nxt,in6p_flowinfo,in6p_vflag,
in6p_flags,in6p_socket,in6p_lport,in6p_fport,in6p_ppcb and
sotoin6pcb().
Apart from removing duplicate code in netipsec, this is a pure
whitespace, not a functional change.
Discussed with: rwatson
Reviewed by: rwatson (version before review requested changes)
MFC after: 4 weeks (set the timer and see then)
controllers. Reading this register, for which there are indications
that it doesn't really exist, returns 0 on at least some 12160
and doing so on Sun Fire V880 causes a data access error exception.
Reported and tested by: Beat Gaetzi
Approved by: mjacob
Obtained from: OpenBSD (modulo setting isp_lvdmode)
the code for parsing interrupt maps) to PowerPC and reflect their new MI
status by moving them to the shared dev/ofw directory.
This commit also modifies the OFW PCI enumeration procedure on PowerPC to
allow the bus to find non-firmware-enumerated devices that Apple likes to add,
and adds some useful Open Firmware properties (compat and name) to the pnpinfo
string of children on OFW SBus, EBus, PCI, and MacIO links. Because of the
change to PCI enumeration on PowerPC, X has started working again on PPC
machines with Grackle hostbridges.
Reviewed by: marius
Obtained from: sparc64
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,
The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.
Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:
- Kip Macy revised the locking code completely, thus completing
the last piece of the puzzle, Kip has also been conducting
active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
me maintaining that branch before the svn conversion
really was meant to be 256. Adjust usage accordingly and replace
bogus usage of this value in checking IEEE channel #'s.
NB: this causes an ABI change; ifconfig must be recompiled
indicates if an association id is required before outbound traffic
is permitted. This cleans up the previous change that broke mcast
traffic "to the stack" in ap mode as a side effect.
Reviewed by: sephe, thompsa, weongyo
src into the tree. The old split was balanced on module dependencies
and symbol exposure that no longer exists. Users that want a module
setup with rate control algorithm other than sample must override
ATH_RATE in the ath module Makefile.
Reviewed by: imp
to keep for 7-STABLE when MFCing in_pcbladdr() to not change the
behaviour there.
With this a destination route via a loopback interface is treated as
a valid and reachable thing for IPv4 source address selection, even
though nothing of that network is ever directly reachable, but it is
more like a blackhole route.
With this the source address will be selected and IPsec can grab the
packets before we would discard them at a later point, encapsulate them
and send them out from a different tunnel endpoint IP.
Discussed on: net
Reported by: Frank Behrens <frank@harz.behrens.de>
Tested by: Frank Behrens <frank@harz.behrens.de>
MFC after: 4 weeks (just so that I get the mail)
Remove ng_rmnode_flags() function.
ng_rmnode_self() was made to be called only while having node locked.
When node is properly locked, any function call sent to it will always be
queued. So turning ng_rmnode_self() into the ng_rmnode_flags() is not just
meaningless, but incorrent, as it violates node locking when called outside.
No objections: julian, thompsa
variable in this copy of the code[1].
While here prefix the variables with 'pf_' to avoid file static global
variables with colliding names that are or will be virtualized.
Discussed with: rwatson, silby [1]
by redoing the Open Firmware card initialization calls in ofwfb_set_mode(). This
uses the same trick (setting V_ADP_MODECHANGE) to arrange this as machfb(4) and
creatorfb(4).
but formerly missed under VIMAGE_GLOBAL.
Put the extern declarations of the virtualized globals
under VIMAGE_GLOBAL as the globals themsevles are already.
This will help by the time when we are going to remove the globals
entirely.
Sponsored by: The FreeBSD Foundation
time it is marked for user space callchain capture in the NMI
handler and the time the callchain capture callback runs.
- Improve code and control flow clarity by invoking hwpmc(4)'s user
space callchain capture callback directly from low-level code.
Reviewed by: jhb (kern/subr_trap.c)
Testing (various patch revisions): gnn,
Fabien Thomas <fabien dot thomas at netasq dot com>,
Artem Belevich <artemb at gmail dot com>
if (batt_sleep_ms)
AcpiOsSleep(1);
where the rest are all:
if (batt_sleep_ms)
AcpiOsSleep(batt_sleep_ms);
I can't recall why that one was different, so change it
to match the rest.
Pointed out by: Christoph Mallon
MFC after: 2 weeks
All ioctl()'s that aren't implemented by pts(4) are forwarded to the TTY
itself. Unfortunately this is not correct for FIONREAD, because it will
give the wrong amount of bytes that are available to read.
Tested by: keramida
Reminded by: keramida
On some laptops with smart batteries, enabling battery monitoring
software causes keystrokes from atkbd to be lost. This has also been
reported on Linux, and is apparently due to the keyboard and I2C line
for the battery being routed through the same chip. Whether that's
accurate or not, adding extra sleeps to the status checking code
causes the problem to go away.
I've been running this for nearly six months now on my laptop,
it works like a charm.
Reviewed by: Nate Lawson (in a previous revision)
MFC after: 2 weeks
o recognize ixp435 cpu
o change memory layout for for ixp4xx to not assume memory is aliases
to 0x10000000 (Cambria/ixp435 memory starts at zero)
o handle 64 irqs for ixp435
o dual EHCI USB 2.0 controller integral to ixp435
o overhaul NPE code for ixp435 and better MAC+MII naming
o updated NPE firmware (including NPE-A image for ixp435/ixp465)
o Gateworks Cambria board support:
- IDE compact flash
- MCU
- front panel LED on i2c bus
- Octal LED latch
Sanity-tested with NFS-root on Avila and Cambria boards. Requires
pending boot2 mods for CF-boot on Cambria.
error is not EAGAIN. Several sysctls that inspect another process use
p_candebug() for checking access right for the curproc. p_candebug()
returns EAGAIN for some reasons, in particular, for the process doing
exec() now. If execing process tries to lock Giant, we get a livelock,
because sysctl handlers are covered by Giant, and often do not sleep.
Break the livelock by dropping Giant and allowing other threads to
execute in the EAGAIN loop.
Also, do not return EAGAIN from p_candebug() when process is executing,
use more appropriate EBUSY error [1].
Reported and tested by: pho
Suggested by: rwatson [1]
Reviewed by: rwatson, des
MFC after: 1 week
state changes. This change modifies tunopen and tunclose to call the
if_link_state_change() function. Among other things, this will result in
devd(8) receiving events from devctl(4) for linkup/link down. This allows
us to do several useful things, including initializing tunnel parameters
and adding routes.
Discussed on: freebsd-net@
MFC after: 2 weeks
on a best-effort basis. Teach vn_fullpath to use this new VOP if a
regular VFS cache lookup fails. This VOP is designed to supplement the
VFS cache to provide a better chance that a vnode-to-name lookup will
succeed.
Currently, an implementation for devfs is being committed. The default
implementation is to return ENOENT.
A big thanks to kib for the mentorship on this, and to pho for running it
through his stress test suite.
Reviewed by: arch
Approved by: kib
One thing I didn't expect many applications to use, was kqueue() on
pseudo-terminal master devices. There are applications that use kqueue()
on the TTY itself (rtorrent, etc). That doesn't mean we shouldn't
implement this. Libraries like libevent use kqueue() by default, which
means they wouldn't be able to use kqueue().
The old TTY layer implements a very broken version of kqueue() by
performing the actual polling on the TTY device.
Discussed with: peter
When I added RLIMIT_NPTS, I forgot to add it to rlimit_ident[]. Make
sure the rlimit_ident[] array is always RLIM_NLIMITS elements big. So if
we ever forget to add new rlimits to this list again. it will contain a
null pointer, instead of random data.
Spotted by: rwatson
missed under VIMAGE_GLOBAL.
Start putting the extern declarations of the virtualized globals
under VIMAGE_GLOBAL as the globals themsevles are already.
This will help by the time when we are going to remove the globals
entirely.
While there garbage collect a few dead externs from ip6_var.h.
Sponsored by: The FreeBSD Foundation
o Try to be smarter about reading the ExCA CSC register. Now, we only
do it for 16-bit cards. Add some experimental code to treat it like
a power interrupt, but I'm not 100% sure that I like it. It may be
removed upon further testing. It seemed to help in one test case, but
the evidence may be inconclusive. This may be beneficial for cleaning up
exca_reset and exca_wait_ready.
o Check for CSTS events on the socket event register. We ask for it when
we're powering up a card, but I don't think we're otherwise using
it. Just ACK the interrupt for now. In theory, we can use it
instead of the busy wait we do in cbb_cardbus_reset. More research
is necessary to see if we can optimize things there when we're
waiting for the DEVVENDOR register to become valid.
o Rework the comments a bit. Minor tidying up. Etc.
outside the prison_states array.
When checking if there is a name configured for the prison, check the
first character to not be '\0' instead of checking if the char array
is present, which it always is. Note, that this is different for the
*jailname in the syscall.
Found with: Coverity Prevent(tm)
CID: 4156, 4155
MFC after: 4 weeks (just that I get the mail)
jhb probably forgot to commit this file with r185878 and will want to
review this. It unbreaks the build here.
Obtained from: p4 //depot/user/jhb/lock/compat/freebsd32/freebsd32_signal.h#2
container structures, depending on VIMAGE_GLOBALS compile time option.
Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.
Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively
#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif
Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.
Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.
Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.
De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.
Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.
Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation
by running the tunable_mbinit() SYSINIT at SI_ORDER_MIDDLE
before the init_maxsockets() SYSINT at SI_ORDER_ANY.
Reviewed by: rwatson, zec
Sponsored by: The FreeBSD Foundation
MFC after: 4 weeks
The /dev/console device node logs all strings that are written to it.
When the string does not contain a trailing newline, it appends one. I
can imagine this was useful a long time ago, but with our current
rc-scripts, it generates a whole bunch of messages that look like:
| Configuring syscons:
| blanktime
| .
By not appending the newlines, the output of `dmesg -a' is now (almost?)
exactly the same as what the user will see on the console device
(syscons, uart).
aio code and are registered via the recently added SYSCALL32_*() helpers.
- Since the aio code likes to invoke fuword and suword a lot down in the
"bowels" of system calls, add a structure holding a set of operations for
things like storing errors, copying in the aiocb structure, storing
status, etc. The 32-bit system calls use a separate operations vector to
handle fuword32 vs fuword, etc. Also, the oldsigevent handling is now
done by having seperate operation vectors with different aiocb copyin
routines.
- Split out kern_foo() functions for the various AIO system calls so the
32-bit front ends can manage things like copying in and converting
timespec structures, etc.
- For both the native and 32-bit aio_suspend() and lio_listio() calls,
just use copyin() to read the array of aiocb pointers instead of using
a for loop that iterated over fuword/fuword32. The error handling in
the old case was incomplete (lio_listio() just ignored any aiocb's that
it got an EFAULT trying to read rather than reporting an error), and
possibly slower.
MFC after: 1 month
causes data corruption in combination with certain bridges.
Information about this problem was kindly provided by davidch. [1]
- As BGE_FLAG_PCIX is meant to indicate that the controller is in
PCI-X mode, revert to the pre __FreeBSD_version 602101 method of
reading the bus mode register rather than checking the mere
existence of a PCI-X capability, which is also there when the
NIC f.e. is put into a 32-bit slot causing it not to be in PCI-X
mode. Setting BGE_FLAG_PCIX inappropriately could cause the NIC
to be tuned incorrectly.
PR: 128833 [1]
Reviewed by: jhb
MFC after: 3 days
When VLAN tagged frame is received the hardware sets 'LONG' bit of
Rx status word. It is always set when the size of received frame
exceeded 1518 bytes, including CRC. This VLAN tagged frame clears
'OK' bit of Rx status word such that driver should not rely on 'OK'
bit of Rx status word to pass the VLAN tagged frame to upper stack.
To fix the bug, don't use SIS_CMDSTS_PKT_OK for Rx error check and
introduce SIS_RXSTAT_ERROR macro that checks Rx errors. If we are
configured to accept VLAN tagged frames and the received frame size
is less than or equal to maximum allowed length of VLAN tagged
frame, clear 'LONG' bit of Rx status word before checking Rx
errors.
Reported by: Vladimir Ermako < samflanker <> gmail DOT com >
Tested by: Vladimir Ermako < samflanker <> gmail DOT com >
to read-locking in the TCP input path, allowing greater TCP
input parallelism where multiple ithreads or ithread and netisr
are able to run in parallel. Previously, most TCP input paths
held a write lock on the global tcbinfo lock, effectively
serializing TCP input.
Before looking up the connection, acquire a write lock if a
potentially state-changing flag is set on the TCP segment header
(FIN, RST, SYN), and otherwise a read lock. We may later have
to upgrade to a write lock in certain cases (ACKs received by the
syncache or during TIMEWAIT) in order to support global state
transitions, but this is never required for steady-state packets.
Upgrading from a write lock to a read lock must be done as a
trylock operation to avoid deadlocks, and actually violates the
lock order as the tcbinfo lock preceeds the inpcb lock held at
the time of upgrade. If the trylock fails, we bump the refcount
on the inpcb, drop both locks, and re-acquire in-order. If
another thread has freed the connection while the locks are
dropped, we free the inpcb and repeat the lookup (this should
hardly ever or never happen in practice).
For now, maintain a number of new counters measuring how many
times various cases execute, and in particular whether various
optimistic assumptions about when read locks can be used, whether
upgrades are done using the fast path, and whether connections
close in practice in the above-described race, actually occur.
MFC after: 6 weeks
Discussed with: kmacy
Reviewed by: bz, gnn, kmacy
Tested by: kmacy
incremented using in_pcbref(), and decremented using in_pcbfree()
or inpcbrele(). Protocols using only current in_pcballoc() and
in_pcbfree() calls will see the same semantics, but it is now
possible for TCP to call in_pcbref() and in_pcbrele() to prevent
an inpcb from being freed when both tcbinfo and per-inpcb locks
are released. This makes it possible to safely transition from
holding only the inpcb lock to both tcbinfo and inpcb lock
without re-looking up a connection in the input path, timer
path, etc.
Notice that in_pcbrele() does not unlock the connection after
decrementing the refcount, if the connection remains, so that
the caller can continue to use it; in_pcbrele() returns a flag
indicating whether or not the inpcb pointer is still valid, and
in_pcbfee() is now a simple wrapper around in_pcbrele().
MFC after: 1 month
Discussed with: bz, kmacy
Reviewed by: bz, gnn, kmacy
Tested by: kmacy
message for r185765.
Noted by: rdivacky
Requested by: des
Commit message for r185765 should be:
In procfs map handler, and in linprocfs maps handler, do not call
vn_fullpath() while having vm map locked. This is done in anticipation
of the vop_vptocnp commit, that would make vn_fullpath sometime
acquire vnode lock.
Also, in linprocfs, maps handler already acquires vnode lock.
No objections from: des
MFC after: 2 week
sbuf instead of doing uiomove. This allows for reads from non-zero
offsets to work.
Patch is forward-ported des@' one, and was adopted to current code
by dchagin@ and me.
Reviewed by: des (linprocfs part)
PR: kern/101453
MFC after: 1 week
does final drop of the the dq reference to put it onto the free list.
There is a possibility that the dq would be found by another thread
after sync and before the dqh lock is acquired. If that other thread
drops the dq before we have taken the dqh lock, the dirty dq is put on
the free list.
Recheck the DQ_MOD after the dqh lock is relocked. Repeat dqsync() if
the dq is dirty. This ensures that up to date dq is written in the quota
file and fixes assertion in dqget().
Reported and tested by: Frode Nordahl <frode nordahl net>
MFC after: 3 days
Waiting for 1ms for each GMII register access looks overkill and it
may also decrease overall performance of driver because re(4)
invokes mii_tick for every hz.
Tested by: rpaulo
laptops. This includes battery presence detection, charging status, current
and voltage readouts, and charge level indication. The sysctl interface
is somewhat ACPI-like.
established a valid link or not. In miibus_statchg handler add a
check for established link is valid one for the controller(e.g.
1000baseT is not a valid link for fastethernet controllers.)
o Added a flag RE_FLAG_FASTETHER to mark fastethernet controllers.
o Added additional check to know whether we've really encountered
watchdog timeouts or missed Tx completion interrupts. This change
may help to track down the cause of watchdog timeouts.
o In interrupt handler, removed a check for link state change
interrupt. Not all controllers have the bit and re(4) did not
rely on the event for a long time. In addition, re(4) didn't
request the interrupt in RL_IMR register.
Tested by: rpaulo
drivers, there should be a 1us delay after every write when
bit-banging the MII. Also insert barriers in order to ensure
the intended ordering. These changes hopefully will solve the
bus wedging occasionally experienced with DM9102A since r182461.
- Deobfuscate dc_mii_readreg() a bit.
loader_conf_files="foo bar baz"
should cause loading the files listed, and then resume with the
remaining config files (from previous values of the variable).
Unfortunately, sometimes the line was ignored -- actually even
modifying the line in /boot/default/loader.conf sometimes doesn't work.
ANALYSIS: After much investigation, turned out to be a bug in the logic.
The existing code detected a new assignment by looking at the address
of the the variable containing the string. This only worked by pure
chance, i.e. if the new string is longer than the previous value
then the memory allocator may return a different address
to store the string hence triggering the detection.
SOLUTION: This commit contains a minimal change to fix the problem,
without altering too much the existing structure of the code.
However, as a step towards improving the quality and reliability of
this code, I have introduced a handful of one-line functions
(strget, strset, strfree, string= ) that could be used in dozens
of places in the existing code.
HOWEVER:
There is a much bigger problem here. Even though I am no Forth
expert (as most fellow src committers) I can tell that much of the
forth code (in support.4th at least) is in severe need of a
review/refactoring:
+ pieces of code are replicated multiple times instead of writing
functions (see e.g. set_module_*);
+ a lot of stale code (e.g. "structure" definitions for
preloaded_files, kernel_module, pnp stuff) which is not used
or at least belongs elsewhere.
The code bload is extremely bad as the loader runs with very small
memory constraints, and we already hit the limit once (see
http://svn.freebsd.org/viewvc/base?view=revision&revision=185132
Reducing the footprint of the forth files is critical.
+ two different styles of coding, one using pure stack functions
(maybe beautiful but surely highly unreadable), one using
high level mechanisms to give names to arguments and local
variables (which leads to readable code).
Note that this code is used by default by all FreeBSD installations,
so the fragility and the code bloat are extremely damaging.
I will try to work fixing the three items above, but if others have
time, please have a look at these issues.
MFC after: 4 weeks
multiple algorithms and potentially collect multiple samples.
Instead of a single calibration interval we now have short and long
intervals; the long interval roughly corresponds to the previous
single interval. The short interval is used to speedup collection
of samples and happens much quicker. We make calls using the short
interval until we're told the calibration work is complete at which
point we fallback to the long interval. In addition there is a
much longer reset interval used to flush all calibration state and
cause everthing to start anew.
With these changes you can also disable calibration entirely by
setting the long interval to zero.
at. I don't think this will make a huge difference, but I have
received a report of a interrupt storm on one 16-bit card that this
might fix (chances are it won't, since I think that we may need to
check both the CBB registers for the 16-bit card as well as the PCIC
registers for power state change).
Submitted by: jhb@
the power interrupt and init code waiting for the interrupt are
running on different CPUs. I haven't seen this make any real
difference, but I've also had some reports of odd behavior I can't
otherwise explain. It is an infrequent operation, and certainly
wouldn't hurt.
my right mouse button and keyboard LEDs from working due to mangled
configuration packets. Fixed several other races and associated problems in the
main ADB stack that were exposed while fixing this.
Now it is possible to suspend/resume with inserted and active card.
To reinitialize card on resume and to detect card change while suspended,
implement bus rescan routines. It can also be used by controllers without
card presence detection signals or with multiple cards per slot support.
While there, cleanup msleep() usage. We have no any rights to exit without
"request done" signal from driver as it could lead to modify after free.
RTFREE_LOCKED() here. This macro makes sure the reference count
on the route is being managed properly. This elimates another
case which results in the following message being printed to the
console:
rtfree: 0xc841ee88 has 1 refs
Reviewed by: bz
MFC after: 2 weeks
bit of debugging afterwards):
- Fix protection code for notification generation.
- Decouple associd from vtag
- Allow vtags to have less strigent requirements in non-uniqueness.
o don't pre-hash them when you issue one in a cookie.
o Allow duplicates and use addresses and ports to
discriminate amongst the duplicates during lookup.
- Add support for the NAT draft draft-ietf-behave-sctpnat-00, this
is still experimental and needs more extensive testing with the
Jason Butt ipfw changes.
- Support for the SENDER_DRY event to get DTLS in OpenSSL working
with a set of patches from Michael Tuexen (hopefully heading to OpenSSL soon).
- Update the support of SCTP-AUTH by Peter Lei.
- Use macros for refcounting.
- Fix MTU for UDP encapsulation.
- Fix reporting back of unsent data.
- Update assoc send counter handling to be consistent with endpoint sent counter.
- Fix a bug in PR-SCTP.
- Fix so we only send another FWD-TSN when a SACK arrives IF and only
if the adv-peer-ack point progressed. However we still make sure
a timer is running if we do have an adv_peer_ack point.
- Fix PR-SCTP bug where chunks were retransmitted if they are sent
unreliable but not abandoned yet.
With the help of: Michael Teuxen and Peter Lei :-)
MFC after: 4 weeks
an unclean shutdown would make it impossible to mount rootfs at boot.
PR: kern/128529
Reviewed by: pjd
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
do not need any locking. Opening and closing translators is serialized
using an sx lock.
Note: This depends on the earlier fix to kern_module.c to properly order
MOD_UNLOAD events.
MFC after: 2 months
memory barriers on i386. It works as a serialization instruction on
all IA32 CPUs.
Alternative solution of using {s,l,}fence requires run-time checking
of the presense of the corresponding SSE or SSE2 extensions, and
possible boot-time patching of the kernel text.
Suggested by: many
parent threads sleep on the parent' struct proc until corresponding
child releases the vmspace. Each sleep is interlocked with proc mutex of
the child, that triggers assertion in the sleepq_add(). The assertion
requires that at any time, all simultaneous sleepers for the channel use
the same interlock.
Silent the assertion by using conditional variable allocated in the
child. Broadcast the variable event on exec() and exit().
Since struct proc * sleep wait channel is overloaded for several
unrelated events, I was unable to remove wakeups from the places where
cv_broadcast() is added, except exec().
Reported and tested by: ganbold
Suggested and reviewed by: jhb
MFC after: 2 week
move that module to the head of the associated linker file's list of modules.
The end result is that once all the modules are loaded, they are sorted in
the reverse of their load order. This causes the kernel linker to invoke
the MOD_QUIESCE and MOD_UNLOAD events in the reverse of the order that
MOD_LOAD was invoked. This means that the ordering of MOD_LOAD events that
is set by the SI_* paramters to DECLARE_MODULE() are now honored in the same
order they would be for SYSUNINIT() for the MOD_QUIESCE and MOD_UNLOAD
events.
MFC after: 1 month
unloading any modules. As a result, if any module veto's an unload
request via MOD_QUIESCE, the entire set of modules for that linker
file will remain loaded and active now rather than leaving the kld
in a weird state where some modules are loaded and some are unloaded.
- This also moves the logic for handling the "forced" unload flag out of
kern_module.c and into kern_linker.c which is a bit cleaner.
- Add a module_name() routine that returns the name of a module and use that
instead of printing pointer values in debug messages when a module fails
MOD_QUIESCE or MOD_UNLOAD.
MFC after: 1 month
interrupt code to be more robust. I've been running these changes for
over a year... With these changes, I don't see the ath card going
into reset like the code in the tree.
packet loss, of between 10-30%. The fix is to put the PHY into
and take it out of local loopback mode when resetting the interface.
Obtained from: Chelsio Inc.
MFC after: 3 days
o Chip full mask revision 2 or later controllers have to
set correct Tx MAC and Tx offload clock depending on negotiated
link speed.
o JMC260 chip full mask revision 2 has a silicon bug that can't
handle 64bit DMA addressing. Add workaround to the bug by
limiting DMA address space to be within 32bit.
o Valid FIFO space of receive control and status register was
changed on chip full mask revision 2 or later controllers. For
these controllers, use default 16QW as it's supposed to be the
safest value for maximum PCIe compatibility. JMicron confirmed
performance will not be reduced even if the FIFO space is set
to 16QW.
o When interface is put into suspend/shutdown state, remove Tx MAC
and Tx offload clock to save more power. We don't need Tx clock
at all in this state.
o Added new register definition for chip full mask revision 2 or
later controllers.
Thanks to JMicron for their continuous support of FreeBSD.
change. As a side effect, this makes the excessive interrupts to
disappear which has been observed as a regression in recent stable/7.
Reported by: many (on -stable@)
Reviewed by: davidch
validation code on ZFS.
Problem: when opening file with O_CREAT|O_EXCL NFS has to jump through
extra hoops to ensure O_EXCL semantics. Namely, client supplies of 8
bytes (NFSX_V3CREATEVERF) bytes of verification data to uniquely
identify this create request. Server then creates a new file with access
mode 0, copies received 8 bytes into va_atime member of struct vattr and
attempt to set the atime on file using VOP_SETATTR. If that succeeds, it
fetches file attributes with VOP_GETATTR and verifies that atime
timestamps match. If timestamps do not match, NFS server concludes it
has probbaly lost the race to another process creating the file with the
same name and bails with EEXIST.
This scheme works OK when exported FS is FFS, but if underlying
filesystem is ZFS _and_ server is running 64bit kernel, it breaks down
due to sanity checking in zfs_setattr function, which refuses to accept
any timestamps which have tv_sec that cannot be represented as 32bit
int. Since struct timespec fields are 64 bit integers on 64bit platforms
and server just copies NFSX_V3CREATEVERF bytes info va_atime, all eight
bytes supplied by client end up in va_atime.tv_sec, forcing it out of
valid 32bit range.
The solution this change implements is simple: it treats
NFSX_V3CREATEVERF as two 32bit integers and unpacks them separately into
va_atime.tv_sec and va_atime.tv_nsec respectively, thus guaranteeing
that tv_sec remains in 32 bit range and ZFS remains happy.
Reviewed by: kib
Close subtle but relatively unlikely race conditions when
propagating the vnode write error to other active sessions
tracing to the same vnode, without holding a reference on
the vnode anymore. [2]
PR: kern/126368 [1]
Submitted by: rwatson [2]
Reviewed by: kib, rwatson
MFC after: 4 weeks
boot0.S changes:
+ import a patch from Christoph Mallon to rearrange the various
print functions and save another couple of bytes;
+ implement the suggestion in PR 70531 to enable booting from
any valid partition because even the extended partitions that
were previously in our kill list may contain a valid boot loader.
This simplifies the code and saves some bytes;
+ followwing up PR 127764, implement conditional code to preserve
the 'Volume ID' which might be used by other OS (NT, XP, Vista)
and is located at offset 0x1b8. This requires a relocation of the
parameter block within the boot sector -- there is no other
possible workaround.
To address this, boot0cfg has been updated to handle both
versions of the boot code;
+ slightly rearrange the strings printed in the menus to make
the code buildable with all options. Given the tight memory
budget, this means that with certain options we need to
shrink or remove certain labels.
and especially:
make -DVOLUME_LABEL -DPXE the default options.
This means that the newly built boot0 block will preserve the
Volume ID, and has the (hidden) option F6 to boot from INT18/PXE.
I think the extra functionality is well worth the change.
The most visible difference here is that the 'Default: ' string
now becomes 'Boot: ' (it can be reverted to the old value
but then we need to nuke 1/2 partition name or entries to
make up for the extra room).
boot0cfg changes:
+ modify the code to recognise the new boot0 structure (with the
relocated options block to make room for the Volume id).
+ add two options, '-i xxxx-xxxx' to set the volume ID, -e c
to modify the character printed in case of bad input
PR: 127764 70531
Submitted by: Christoph Mallon (portions)
MFC after: 4 weeks
contrib/openbsm (svn merge) and sys/{bsm,security/audit} (manual merge).
- Add OpenBSM contrib tree to include paths for audit(8) and auditd(8).
- Merge support for new tokens, fixes to existing token generation to
audit_bsm_token.c.
- Synchronize bsm includes and definitions.
OpenBSM history for imported revisions below for reference.
MFC after: 1 month
Sponsored by: Apple Inc.
Obtained from: TrustedBSD Project
--
OpenBSM 1.1 alpha 2
- Include files in OpenBSM are now broken out into two parts: library builds
required solely for user space, and system includes, which may also be
required for use in the kernels of systems integrating OpenBSM. Submitted
by Stacey Son.
- Configure option --with-native-includes allows forcing the use of native
include for system includes, rather than the versions bundled with OpenBSM.
This is intended specifically for platforms that ship OpenBSM, have adapted
versions of the system includes in a kernel source tree, and will use the
OpenBSM build infrastructure with an unmodified OpenBSM distribution,
allowing the customized system includes to be used with the OpenBSM build.
Submitted by Stacey Son.
- Various strcpy()'s/strcat()'s have been changed to strlcpy()'s/strlcat()'s
or asprintf(). Added compat/strlcpy.h for Linux.
- Remove compatibility defines for old Darwin token constant names; now only
BSM token names are provided and used.
- Add support for extended header tokens, which contain space for information
on the host generating the record.
- Add support for setting extended host information in the kernel, which is
used for setting host information in extended header tokens. The
audit_control file now supports a "host" parameter which can be used by
auditd to set the information; if not present, the kernel parameters won't
be set and auditd uses unextended headers for records that it generates.
OpenBSM 1.1 alpha 1
- Add option to auditreduce(1) which allows users to invert sense of
matching, such that BSM records that do not match, are selected.
- Fix bug in audit_write() where we commit an incomplete record in the
event there is an error writing the subject token. This was submitted
by Diego Giagio.
- Build support for Mac OS X 10.5.1 submitted by Eric Hall.
- Fix a bug which resulted in host XML attributes not being arguments so
that const strings can be passed as arguments to tokens. This patch was
submitted by Xin LI.
- Modify the -m option so users can select more then one audit event.
- For Mac OS X, added Mach IPC support for audit trigger messages.
- Fixed a bug in getacna() which resulted in a locking problem on Mac OS X.
- Added LOG_PERROR flag to openlog when -d option is used with auditd.
- AUE events added for Mac OS X Leopard system calls.
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.
For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.
Reviewed by: brooks, gnn, des, zec, imp
Sponsored by: The FreeBSD Foundation
Sgtty is a programming interface that has been replaced by termios over
the years. In June we already removed <sgtty.h>, which exposes the
ioctl()'s that are implemented by this interface. The importance of this
flag is overrated right now.
when it sees only received packets. In some cases where a device only
recieves data it mistakenly thinks that its transmitting side is broken
and resets the device.
Obtained from: Chelsio Inc.
MFC after: 3 days
of the boot0.S code, with a number of compile-time selectable options,
the most interesting one being the ability to select PXE booting.
The code is completely compatible with the previous one, and with
the boot0cfg program. Even the actual code is largely unmodified,
with only minor rearrangements or fixes to make room for the new
features.
The behaviour of the standard build differs from the previous
version in the following, minor things:
+ 'noupdate' is the default, which means the code does not
write back the selection to disk. You can enable the feature
at runtime with boot0cfg, or changing the flags in the Makefile.
+ a drive number of 0x00 (floppy, or USB in floppy emulation) is
now accepted as valid. Previously, it was overridden with 0x80,
meaning that the partition table coming from the media was
used to access sectors on a possibly different media.
You can revert to the previous mode building with -DCHECK_DRIVE,
and you can always use the 'setdrv' option in boot0cfg
+ certain FAT or NTFS partitions are listed as WIN instead of DOS.
+ the 'bel' character on a bad selection is replaced by a '#' to
make it clear that the system is not hang even if the machine
does not have a speaker. This can be reverted back at compile
time, or at runtime with an upcoming boot0cfg option.
Additional features are available as compile time options,
and may be become the default if deemed useful. In particular:
+ INT18/PXE boot (make -DPXE)
This option enables booting through INT 18h (which on certain
BIOSes can be hooked to PXE) by pressing F6. There is unfortunately
no room to print the additional menu option.
Also, to make room for the code, the 'Default: ' string is
changed to 'Boot: '
+ print current drive number (make -DTEST)
Prints a line indicating the current drive number.
This is useful to figure out what is going on for machines/bioses
which remap drives in sometimes surprising ways.
+ disable numeric keys in console mode (make -DONLY_F_KEYS)
Not really a significant option, but it is needed to make
room for the -DTEST mode.
+ disable floppy support (make -DCHECK_DRIVE)
Revert to the old behaviour of only accepting 0x80 and above
as valid drive numbers.
MFC after: 6 weeks
entries for one name. Then, creating inode with that name would remove
one entry, leaving others dormant. Reclaiming the vnode would uncover
negative entries, causing false return of ENOENT from the calls like
stat, that do not create inode.
Prevent creation of the duplicated negative entries.
Reported and debugged with: pho
Reviewed by: jhb
X-MFC: after shared lookup changes
hardware for PMCs that have been configured for sampling.
- Bug fix: acknowledge PMC hardware overflows irrespective of the
the (software) PMC's state.
- break complex conditionals in to multiple lines to avoid wrapping
- remove copious unused debug statements
- be more aggressive about cleaning in the calling thread
- eliminate usage of ENOSPC
- increase number of iterations that cxgbsp can do
- eliminate "initerr" usage to simplify ENOBUFS handling
- when coalescing pass all packets to BPF
- always set overrun if hardware queue is full
This changes struct kinfo_filedesc and kinfo_vmentry such that they are
same on both 32 and 64 bit platforms like i386/amd64 and won't require
sysctl wrapping.
Two new OIDs are assigned. The old ones are available under
COMPAT_FREEBSD7 - but it isn't that simple. The superceded interface
was never actually released on 7.x.
The other main change is to pack the data passed to userland via the
sysctl. kf_structsize and kve_structsize are reduced for the copyout.
If you have a process with 100,000+ sockets open, the unpacked records
require a 132MB+ copyout. With packing, it is "only" ~35MB. (Still
seriously unpleasant, but not quite as devastating). A similar problem
exists for the vmentry structure - have lots and lots of shared libraries
and small mmaps and its copyout gets expensive too.
My immediate problem is valgrind. It traditionally achieves this
functionality by parsing procfs output, in a packed format. Secondly, when
tracing 32 bit binaries on amd64 under valgrind, it uses a cross compiled
32 bit binary which ran directly into the differing data structures in 32
vs 64 bit mode. (valgrind uses this to track file descriptor operations
and this therefore affected every single 32 bit binary)
I've added two utility functions to libutil to unpack the structures into
a fixed record length and to make it a little more convenient to use.
offload for VLAN frames are also supported. The VLAN hardware
assistance is available only on 82550/82551 based controllers.
While I'm here change the confusing name of bit1 in byte 22 of
configuration block to vlan_drop_en. The bit controls whether
hardware strips VLAN tagged frame or not. Special thanks to wpaul
who sent valuable VLAN related information to me.
Tested on: i386, sparc64
events. Just reading PMDR register was not enough to have fxp(4)
immuninize against received magic packets during system boot.
Tested by: Alexey Shuvaev < shuvaev <> physik DOT uni-wuerzburg DOT de >
exact multiple of system page size should still be allowed to be mapped
in their entirety to match the regular vnode backed file behavior.
Reported by: ed
Reviewed by: jhb
module; the ath module now brings in the hal support. Kernel
config files are almost backwards compatible; supplying
device ath_hal
gives you the same chip support that the binary hal did but you
must also include
options AH_SUPPORT_AR5416
to enable the extended format descriptors used by 11n parts.
It is now possible to control the chip support included in a
build by specifying exactly which chips are to be supported
in the config file; consult ath_hal(4) for information.
contents.
- It is possible to override the dynamic configuration by using
AT91C_MAIN_CLOCK option in kernel config.
PR: arm/128961 (based on)
Submitted by: Bjorn Konig <bkoenig@alpha-tierchen.de>
Reviewed by: imp
Approved by: kib (mentor, implicit)
by getaudit(2). Some applications such has su, id will interpret E2BIG as
requiring the use of getaudit_addr(2) to pull extended audit state (ip6)
from the kernel.
This change un-breaks the ABI when auditing has been activated on a system
and the users are logged in via ip6.
This is a RELENG_7_1 candidate.
MFC after: 1 day
Discussed with: rwatson
o eliminate private state indexed by 802.11 rate codes; use the hal's
rate tables directly to get the same info
o calculate a mask of operational rates to optimize lookups and checks
(instead of using for loops and similar)
o optimize size bin operations
o ignore rates marked as "do not use" in the hal phy tables
o fix bug that caused upshifting to break in 11g once the rate dropped
below 11Mb/s
o add more intelligent multi-rate tx schedules
o add support for 1/2 and 1/4 width channels
o add dev.ath.X.sample_stats sysctl to dump runtime statistics to the console
(needs to go up to a user app)
o export more tuning knobs via sysctls (still a couple of magic constants)
necessary workarounds, add code to detect these hangs and distinguish
them from other events; note this code is only invoked for anomalous
conditions and (at the moment) is a noop because the hang detection
code is in a new hal that's coming shortly
Volume 3B: System Programming Guide, Part 2", CPUs with family 0x6 and model
above or 0xE and CPUs with family 0xF and model above or 0x3 have invariant
TSC.
Volume 3B: System Programming Guide, Part 2", CPUs with family 0x6 and model
above or 0xE and CPUs with family 0xF and model above or 0x3 have invariant
TSC.
Change types used in the linux' struct msghdr and struct cmsghdr
definitions to the properly-sized architecture-specific types.
Move ancillary data handler from linux_sendit() to linux_sendmsg().
Submitted by: dchagin
Add a custom version of copyiniov() to deal with the 32-bit iovec
pointers from userland (to be used later).
Adjust prototypes for linux_readv() and linux_writev() to use new
l_iovec32 definition and to match actual linux code. In particular,
use ulong for fd (why ?).
Submitted by: dchagin
Bring in updated jail support from bz_jail branch.
This enhances the current jail implementation to permit multiple
addresses per jail. In addtion to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking,..
SCTP support was updated and supports IPv6 in jails as well.
Cpuset support permits jails to be bound to specific processor
sets after creation.
Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.
DDB 'show jails' command was added to aid debugging.
Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.
Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around the last years.
Bump __FreeBSD_version for the afore mentioned and in kernel changes.
Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
on cluster machines as well as all the testers and people
who provided feedback the last months on freebsd-jail and
other channels.
- My employer, CK Software GmbH, for the support so I could work on this.
Reviewed by: (see above)
MFC after: 3 months (this is just so that I get the mail)
X-MFC Before: 7.2-RELEASE if possible
to the fs, but before a vnode on the fs is locked, unmount may free fs
structures, causing access to destroyed data and freed memory.
Introduce a vfs_busymp() function that looks up and busies found
fs while mountlist_mtx is held. Use it in nfsrv_fhtovp() and in the
implementation of the handle syscalls.
Two other uses of the vfs_getvfs() in the vfs_subr.c, namely in
sysctl_vfs_ctl and vfs_getnewfsid seems to be ok. In particular,
sysctl_vfs_ctl is protected by Giant by being a non-sleeping sysctl
handler, that prevents Giant-locked unmount code to interfere with it.
Noted by: tegge
Reviewed by: dfr
Tested by: pho
MFC after: 1 month
- Print flags in hex.
- Note that flags can be fine and panic can be due unexpected error condition.
- Remove redundant new line character.
Eventhough panic message excess 80 characters keep it in one line so it is
easier to grep.
In file included from /src/sys/modules/powermac_nvram/../../dev/powermac_nvram/powermac_nvram.c:38:
@/dev/ofw/ofw_bus.h:36:24: error: ofw_bus_if.h: No such file or directory
I am not sure for how long this had not worked and if it was just the
latest vimage commit that had revealed this or if nobody had built
universe successfully in a while. Btw, the tinderbox did not complain
either so that is probably the reason noone had noticed.
underneath #ifdef VIMAGE blocks.
This change introduces some churn in #include ordering and nesting
throughout the network stack and drivers but is not expected to cause
any additional issues.
In the next step this will allow us to instantiate the virtualization
container structures and switch from using global variables to their
"containerized" counterparts.
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation
The mqfs_search() routine uses strncmp() to match message queue objects
by name. This is because it can be called from environments where the
file name is not null terminated (the VFS for example).
Unfortunately it doesn't compare the lengths of the message queue names,
which means if a system has "Queue12345", the name "Queue" will also
match.
I noticed this when a student of mine handed in an exercise using
message queues with names "Queue2" and "Queue".
Reviewed by: rink
IPv6 socket by comparing a constant inp vflag.
This is expected to help to reduce extra locking.
Suggested by: rwatson
Reviewed by: rwatson
MFC after: 6 weeks
IPsec change in r185366 only differed in two additonal IPv6 lines.
Rather than splattering conditional code everywhere add the v6
check centrally at this single place.
Reviewed by: rwatson (as part of a larger changset)
MFC after: 6 weeks (*)
(*) possibly need to leave a stub wrapper in 7 to keep the symbol.
Ignoring different names because of macros (in6pcb, in6p_sp) and
inp vs. in6p variable name both functions were entirely identical.
Reviewed by: rwatson (as part of a larger changeset)
MFC after: 6 weeks (*)
(*) possibly need to leave a stub wrappers in 7 to keep the symbols.
and Core Duo), models 0xF (Core2), model 0x17 (Core2Extreme) and
model 0x1C (Atom).
In these CPUs, the actual numbers, kinds and widths of PMCs present
need to queried at run time. Support for specific "architectural"
events also needs to be queried at run time.
Model 0xE CPUs support programmable PMCs, subsequent CPUs
additionally support "fixed-function" counters.
- Use event names that are close to vendor documentation, taking in
account that:
- events with identical semantics on two or more CPUs in this family
can have differing names in vendor documentation,
- identical vendor event names may map to differing events across
CPUs,
- each type of CPU supports a different subset of measurable
events.
Fixed-function and programmable counters both use the same vendor
names for events. The use of a class name prefix ("iaf-" or
"iap-" respectively) permits these to be distinguished.
- In libpmc, refactor pmc_name_of_event() into a public interface
and an internal helper function, for use by log handling code.
- Minor code tweaks: staticize a global, freshen a few comments.
Tested by: gnn
controllers. ICH based controllers are treated as 82559. 82557,
earlier revision of 82558 and 82559ER have no WOL capability.
o WOL support requires help of a firmware so add check whether
hardware is capable of handling magic frames by reading EEPROM.
o Enable accepting WOL frames only when hardware is about to
suspend or shutdown. Previously fxp(4) used to allow receipt of
magic frame under normal operation mode which could cause
hardware hang if magic frame is received by hardware. Datasheet
clearly states driver should not allow WOL frames under normal
operation mode.
o Disable WOL frame reception in device attach so have fxp(4)
immunize against system hang which can be triggered by magic
packets when the hardware is not in fully initialized state.
o Don't reset all hardware configuration data in fxp_stop()
otherwise important configuration data is lost and this would
reset WOL configuration to default state which in turn cause
hardware hang on receipt of magic frames. To fix the issue,
preserve hardware configuration data by issuing a selective
reset.
o Explicitly disable interrupts after issuing selective reset as
reset may unmask interrupts.
Tested by: Alexey Shuvaev < shuvaev <> physik DOT uni-wuerzburg DOT de >
will sometimes fail to initialize problem due to a lock
contention with management hardware. However, in order to
deliver that fix it was necessary to take a shared code
update as a whole, and this required scattered changes in
the core code to be compatible.
The em driver now has VLAN HW support added as the igb
driver had previously.
MFC after: ASAP - in time for 7.1 RELEASE
-This version has header split, and as a result a number of
aspects of the code have been improved/simplified.
- Interrupt handling refined for performance
- Many small bugs fixed along the way
MFC after: ASAP - in time for 7.1
whitespace) macros from p4/vimage branch.
Do a better job at enclosing all instantiations of globals
scheduled for virtualization in #ifdef VIMAGE_GLOBALS blocks.
De-virtualize and mark as const saorder_state_alive and
saorder_state_any arrays from ipsec code, given that they are never
updated at runtime, so virtualizing them would be pointless.
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation
instead of "puts" which prints whatever is at %si, followed by a CRLF.
It was not noticed during tests because at that point %si points
to a partition entry whose first byte is 0x80, which is both a
terminator for the string and a non printable character.
Submitted by: Christoph Mallon
as in_pcbdetach() and we don't need the code twice.
Reviewed by: rwatson
MFC after: 6 weeks (*)
(*) possibly need to leave a stub wrapper in 7 to keep the symbol.
boot code. The bug was introduced in rev.1.13, and went unnoticed
because FreeBSD's boot1 does not use it, but other systems might.
(I have been struggling for almost a full day trying to figure out
why a syslinux'ed partition would not boot when started with the
FreeBSD /boot/boot0, only to realize that the bug was ours!)
The space for the two extra bytes (push %si and pop %si) is reclaimed
by removing an extra CRLF that is printed before booting.
The bug is not a major one but if there is time it might be a good
thing to merge it into the upcoming releases.
- Bugfix: Don't excede static number of ports allowed when iterating
over endpoints within an interface.
- u3g_speeds contains speeds in baud, not bytes per second, so divide
the buffer size by 10.
throttle. My SMP kernel hangs when one of those is selected by
powerd. Errata AA21 here:
ftp://download.intel.com/design/PentiumXE/specupdt/31030717.pdf
MFC after: 2 weeks
o Configure controller to use dynamic TBD as TSO requires that
operation mode.
o Add a dummy TBD to tx_cb_u as TSO can access one more TBD in TSO
operation.
o Increase a DMA segment size to 4096 to hold a full IP segment
with link layer header.
o Unlike other TSO capable controllers, 82550/82551 does not
modify the first IP packet in TSO operation so driver should
create an IP packet with proper header. Subsequent IP packets
are generated from the header information in the first IP packet
header. Likewise pseudo checksum also should be computed by
driver for the first packet.
o TSO requires one more TBD to hold total TCP payload. To make
code simple for TSO/non-TSO case, increase the index of the
first available TBD array.
o Remove KASSERT that checks the size of a DMA segment should be
less than or equal to MCLBYTES as it's no longer valid in TSO.
o Tx threshold and number of TBDs field is used to store MSS in
TSO. So don't set the Tx threshold in TSO case.
82559 or later controllers added simple checksum calculation logic
in RU. For backward compatibility the computed checksum is appended
at the end of the data posted to Rx buffer. This type of simple
checksum calculation support had been used on several vendors such
as Sun HME/GEM, SysKonnect GENESIS and Marvell Yukon controllers.
Because this type of checksum offload support requires parsing of
received frame and pseudo checksum calculation with software
routine it still consumes more CPU cycles than that of full-fledged
checksum offload controller. But it's still better than software
checksum calculation.
Rx buffer and loads DMA map. Also add a function
fxp_discard_rfabuf that handles reusing Rx buffer/DMA map. With
this change fxp_add_rfabuf just handles appending a new RFA to
existing chain.
o Initialize mbuf length in fxp_new_rfabuf.
o Don't reset rnr and have fxp(4) handle received frames even if
it couldn't allocate new Rx buffer. This will make fxp(4) reload
updated RFA under rnr case. The rnr would still be reset to 0 if
polling is active and fxp(4) processed number of allowed Rx
events.
o Update if_iqdrops if fxp(4) couldn't allocate Rx buffer.
Previously fxp(4) used to try to reuse Rx buffer when new buffer
allocation is failed. But fxp(4) didn't take into account loaded
DMA map such that the same DMA map was loaded again without
unloading the map. There is no reason to unload the loaded map and
reload the same map again, just reusing the map is enough. I
believe the spare DMA map in softc was introduced to implement this
behaviour. Also fxp(4) used to stop Rx processing if once Rx buffer
allocation or DMA map load fails which in turn resulted in losing
incoming frames under heavy network load. With this change fxp(4)
should survive from resource shortage condition.
practice, but it is a good programming practice and allows the kernel to not
depend on userland correctness.
- While there, make sizeof usage match the rest of the code.
Found with: Coverity Prevent(tm)
CID: 660, 662
practice, but it is a good programming practice nontheless and it allows the
kernel to not depend on userland correctness.
Found with: Coverity Prevent(tm)
CID: 655-659, 664-667
o Copy kb920x_machdep.c to at91_machdep.c
o Move board_init to new board_kb920x.c
o rename ramsize to at91_ramsize and make it accessible to board_* files.
o Delete files.kb920x. We can do this selection with the new boards.
o Add a stub for the tsc4370 board init, which will be added in
a future commit.
o Add new 'devices' at91_board_kb920x and at91_board_tsc4370. More are
needed and will be added in future commits.
Reviewed by: stass, cognet
advance of teaching vn_fullpath1() how to query file systems for
vnode-to-name mappings when cache lookups fail.
Thanks to kib for guidance and patience on this process.
Reviewed by: kib
Approved by: kib
Fix some issues about re-scanning of the devices.
src/lib/libusb20/libusb20_ugen20.c
Fix issue about libusb20 having to release the
USB transfers before doing a SET_CONFIG, else
the kernel will kill the file handle.
src/sys/dev/usb2/core/usb2_device.
src/sys/dev/usb2/core/usb2_generic.c
src/sys/dev/usb2/core/usb2_generic.h
Add support for U3G devices.
Improve and cleanup FIFO free handling.
Improve device re-enumeration.
src/sys/dev/usb2/core/usb2_msctest.c
src/sys/dev/usb2/core/usb2_msctest.h
Fix some problems in the USB Mass Storage Test.
Add Huawei vendor specific quirks.
src/sys/dev/usb2/core/usb2_request.c
Improve device re-enumeration.
src/sys/dev/usb2/ethernet/if_aue2.c
src/sys/dev/usb2/include/usb2_devid.h
src/sys/dev/usb2/include/usb2_devtable.h
src/sys/dev/usb2/quirk/usb2_quirk.c
Integrate changes from the old USB driver.
src/sys/dev/usb2/include/usb2_standard.h
Add definition of USB3.0 structures from USB.org.
src/sys/dev/usb2/serial/u3g2.c
src/sys/dev/usb2/serial/ugensa2.c
src/sys/modules/usb2/Makefile
src/sys/modules/usb2/serial_3g/Makefile
Import U3G driver.
Submitted by: Hans Petter Selasky (usb4bsd)
many bugs fixes, many more performance improvements.
Submitted by: Danny Braniss
M sbin/iscontrol/iscsi.conf.5
M sbin/iscontrol/iscontrol.8
M sbin/iscontrol/iscontrol.h
M sbin/iscontrol/config.c
M sbin/iscontrol/fsm.c
M sbin/iscontrol/login.c
M sbin/iscontrol/pdu.c
M sbin/iscontrol/misc.c
M sbin/iscontrol/auth_subr.c
M sbin/iscontrol/iscontrol.c
M sys/dev/iscsi/initiator/isc_cam.c
M sys/dev/iscsi/initiator/iscsi.h
M sys/dev/iscsi/initiator/isc_soc.c
M sys/dev/iscsi/initiator/iscsi_subr.c
M sys/dev/iscsi/initiator/iscsivar.h
M sys/dev/iscsi/initiator/isc_subr.c
M sys/dev/iscsi/initiator/iscsi.c
M sys/dev/iscsi/initiator/isc_sm.c
IFF_DRV_OACTIVE to note resource shortage to upper stack.
- Don't count number of mbuf chains. Default 32 DMA segments for a
frame is enough for most cases. If bus_dmamap_mbuf_sg fails use
m_collapse(9) to collapse the mbuf chain instead of relying on
expensive m_defrag(9).
- Move bpf handling to fxp_start_body() which is supposed to be
more appropriate place.
- Always arm watchdog timer whenever a new Tx request is made.
Previously fxp(4) used to arm watchdog timer only when
FXP_CXINT_THRESH-th Tx request is made. Because fxp(4) does not
rely on Tx interrupt to reclaim transmitted mbufs it's better to
arm watchdog timer to detect potential lockups.
- Add more aggresive Tx buffer reclaiming in fxp_start_body to make
room for new Tx requests. Since fxp(4) does not request Tx
completion interrupt for every frames it's necessary to clean
TXCBs in advance to saturate link.
- Make fxp(4) try to start more packets transmitting regardless of
interrupt type in fxp_intr_body.
patch the RX/TX performance becomes about 17~18 Mbps comparing with
the previous whose values were RX 7~8Mbps and TX 13~14Mbps.
- improve AL2230 RF handling in zd1211b
- support AL2230S RF that PV2000 is renamed to AL2230S
- use register ZYD_CR244, ZYD_CR243, ZYD_CR242 when the driver writes
values on RF. This routine is more faster than the original one
- use private TX lock to avoid LOR at zyd_raw_xmit()
- increase TX slots from 1 to 5
- needs to set the channel at IEEE80211_S_AUTH not IEEE80211_S_RUN
- detailed error handling. In previous the next command was sent to the
device even if there was errors
- setting ZYD_MAC_RX_THRESHOLD value should be different between 1211
and 1211b
- only try to stop the device at zyd_init_locked() if the device is
UPed
- do not use MTX_RECURSE
- do not try to grap Giant lock when the channel is changing
- move the device initialization routines from zyd_attach to zyd_init to
give a device full-reset chance to the driver.
- code cleanup at zyd_raw_xmit()
- simplify zyd_attach() routines
- resort functions and clean up variables
- DPRINTF style change.
- style(9)
Reviewed by: sam
check to fxp_txeof(). While I'm here unarm watchdog timer only if
there are no pending queued Tx requests.
Previously the watchdog timer was unarmed whenever Tx interrupt is
raised. This could be resulted in hiding root cause of watchdog
timeouts.
checksum offload configuration. Now checksum offload can be
controlled by ifconfig(8).
While I'm here add an additional check for interface capabilities
before applying user's request.