I changed. That is never a good sign.
1) only map 1 page at address zero, not 4096 pages
2) page 1 starts at address 4096 (PAGE_SIZE) not 4095 (PAGE_MASK). I
don't even want to think what the pte's looked like.
3) subtract the r/o page group start address from the end before
converting it to a count. Otherwise an extra page is mapped.
If you were affected by this, the symptom was a hang at boot
after the spinner. Sorry folks. :-(
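The three fixes above are all page arithmetic. A minimal user-space sketch of the intended calculations, assuming 4096-byte pages; the SKETCH_ constants and both helper names are hypothetical stand-ins, not the kernel's own definitions:

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for PAGE_SIZE and PAGE_MASK; 4096-byte pages are assumed. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)   /* 4095: a mask, not an address */

/* Hypothetical helper: start address of page number n. */
static uintptr_t
page_to_addr(size_t n)
{
	return ((uintptr_t)n * SKETCH_PAGE_SIZE);
}

/*
 * Hypothetical helper: number of pages covering [start, end).
 * Both addresses are assumed to be page aligned. The start is
 * subtracted before dividing; dividing the end address alone
 * would produce a count that is too large.
 */
static size_t
range_to_pages(uintptr_t start, uintptr_t end)
{
	return ((end - start) / SKETCH_PAGE_SIZE);
}
```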
"You broke my laptop!" by: sam
accesses softc after it is freed. Use a different malloc type for
softc than the rest of the bus code to make it more clear when these
things happen that it is the driver that's at fault, not the bus code.
Suggested by: sam and/or phk (I think)
timeout would continue to happen: boom! Fix this[*] by timing out earlier.
[*] almost fixes the race on unload: wi_inquire could be running when
untimeout is called, and there's no way to know when it has actually
returned. This race is very rare and hard to lose.
Submitted by: scottl
seeded with arc4random rather than calling arc4random for each
packet. Note this is the same algorithm used to select the IV when
doing WEP on the host.
o don't grab the mutex at the top of ath_detach; it does nothing
useful
o deal with entry to ath_ioctl during detach to disable promiscuous
mode as a result of calling bpfdetach2: cannot call ath_init when
the device is marked invalid as the code isn't prepared to deal
with it (in particular by that time the hal reference may have
been yanked)
change ath_rate_ctl_reset to handle transition from station
mode to adhoc mode; was not resetting the initial xmit rate
causing outbound frames to be discarded
use because a kernel thread is borrowing it. The borrowed page table
can change spontaneously, making any dependence on its continued use
subject to a race condition.
- _pmap_unwire_pte_hold() cannot use pmap_is_current(): If a change is
made to a page table page mapping for a borrowed page table, the TLB
must be updated.
In collaboration with: tegge
you on the current queue. In the future, it would be nice if priority
propagation could deterministically pluck a thread off of the next queue
and put it on the current queue. Until then this hack stops us from
holding up our entire current queue, including interrupt handlers, while
a thread on the next queue is blocked while holding Giant.
- Inherit our pctcpu information from our parent.
- correct signedness mixups.
- log fix.
- preparation for 64bit sequence number.
introduce SA id (unique ID for SA - SPI is useless as duplicated
SPI is allowed)
- no need to malloc/free cksum buffer.
Obtained from: KAME
kqueue write events on a socket and you regularly create tons of pipes
which overwrites the structure causing a panic when removing the knote
from the list. If the peer has gone away (and it's a write knote), then
don't bother trying to remove the knote from the list.
Submitted by: Brian Buchanan and myself
Obtained from: nCircle
- Return NULL instead of returning memory outside of the stackgap
in stackgap_alloc() (FreeBSD-SA-00:42.linux)
- Check for stackgap_alloc() returning NULL in ibcs2_emul_find();
other calls to stackgap_alloc() have not been changed since they
are small fixed-size allocations.
- Replace use of strcpy() with strlcpy() in exec_coff_imgact()
to avoid buffer overflow
- Use strlcat() instead of strcat() to avoid a one byte buffer
overflow in ibcs2_setipdomainname()
- Use copyinstr() instead of copyin() in ibcs2_setipdomainname()
to ensure that the string is null-terminated
- Avoid integer overflow in ibcs2_getgroups() and ibcs2_setgroups()
by checking that gidsetsize argument is non-negative and
no larger than NGROUPS_MAX.
- Range-check signal numbers in ibcs2_wait(), ibcs2_sigaction(),
ibcs2_sigsys() and ibcs2_kill() to avoid accessing array past
the end (or before the start)
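The checks in the list above follow a common pattern: refuse the request instead of handing back memory or an index outside the valid range. A minimal user-space sketch of that pattern; the scratch-area type, the limits and all function names are hypothetical, and the real stackgap and signal tables look different:

```c
#include <stddef.h>

#define SKETCH_NGROUPS_MAX 16   /* assumed limit, stands in for NGROUPS_MAX */
#define SKETCH_NSIG        32   /* assumed size of a signal translation table */

/* Hypothetical fixed-size scratch area, standing in for the stackgap. */
struct gap {
	char   buf[64];
	size_t used;
};

/* Return NULL instead of a pointer past the end of the area. */
static void *
gap_alloc(struct gap *g, size_t len)
{
	void *p;

	if (len > sizeof(g->buf) - g->used)
		return (NULL);
	p = g->buf + g->used;
	g->used += len;
	return (p);
}

/* Reject negative and oversized group counts before sizing a buffer. */
static int
gidsetsize_ok(int gidsetsize)
{
	return (gidsetsize >= 0 && gidsetsize <= SKETCH_NGROUPS_MAX);
}

/* Reject signal numbers that would index before or past the table. */
static int
signum_ok(int sig)
{
	return (sig >= 0 && sig < SKETCH_NSIG);
}
```

Callers are expected to test the result of gap_alloc() before use, as ibcs2_emul_find() now does for stackgap_alloc().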
parameter in the read and write case dereferenced an uninitialized
pointer and can't possibly ever have caught an actual invalid
argument.
This was apparently true for the read/write and getconf cases. The
latter does not even receive the parameter that is to be verified.
I'm surprised that this did not cause kernel panics, but it seems
that the uninitialized local variable happens to contain data that
may be used as a pointer to memory that satisfies the test condition.
Make the code work as intended by moving the test inside the switch
case where the pointer has been properly initialized.
Since the read and write case shared just about all code (except
for the single call to PCIB_READ_CONFIG resp. PCIB_WRITE_CONFIG) I
have merged both cases.
Noticed by: trhodes@FreeBSD.org (Tom Rhodes)
- Allocate storage for uap->msg always because it is copyin()'ed in
native sendmsg().
- Convert sockopt level from Linux to FreeBSD after native recvmsg() calling.
- Some cleanups.
Tested with: Oracle 9i shared server connection mode.
MFC after: 1 week
o correct recursive locking when polling and in em_82547_move_tail
o destroy mutex on detach
o add EM_LOCK_ASSERT and similar macros for creating+deleting the mtx
Submitted by: Daniel Eischen <eischen@vigrid.com>
beasts which are reported to exist in both Atmel and Prism2 flavours. In
particular, Itronix branded laptops have the Atmel part with an Intersil
radio.
Obtained from: NetBSD
from UWX_REG_MUMBLE to UWX_REG_AR_MUMBLE. Compatibility defines are
present in libuwx. Change the names here so that we don't depend on
compatibility defines.
Note that there's now an UWX_REG_PFS and an UWX_REG_AR_PFS and the
former is not a compatibility define for the latter AFAICT. Change
to UWX_REG_AR_PFS as that seems to be the one we need to handle.
all the fixes locally applied and submitted to the author. Not
included in BETA 5, but part of this import are:
o FreeBSD specific ifdefs to make this compile within a kernel.
These are limited to include directives and defines.
o Removal of unused variables, proper casts and initializations
to allow building with -Werror. This happens in code so has a
higher chance of causing future import conflicts but not enough
to worry about it.
I'm especially thankful that the author accepted the change to
replace DISABLE_TRACE with UWX_TRACE_ENABLE so that we can use it
in kernel config files without nasty mappings or indirections as
that would make the integration less perfect. Thanks Cary!
an uninitialized sysctl_ctx, using flag DA_FLAG_SCTX_INIT. This
prevents a panic encountered with some umass units that probe correctly
but fail to attach. Same problem, and same fix, as scsi_cd.c rev. 1.86.
Reviewed by: njl, ken
pmap_copy_page() et al. to accept a vm_page_t rather than a physical
address. Also, this change will facilitate locking access to the vm page's
valid field.
has been initialized.
(cdsysctlinit): Set flag CD_FLAG_SCTX_INIT after sysctl_ctx has been
initialized.
This resolves a panic encountered when a cd drive is successfully probed
but fails to attach.
Reviewed by: ken
o minor optimization of cardbus_cis processing. Remove a bunch of generic
entries that are handled by generic.
o no longer need the card_get_type stuff.
This MIB specifies how many bus resets should be observed before the
lost device entry is removed. The default value is 3.
You can set this value to 0 if you want a SBP device to be detached from CAM
layer as soon as the device is physically detached like USB.
routine of its own, and allows us to move the indentation back two
layers making the code more readable.
delete a prototype that should have been killed years ago in pccardvar.h.
# adding quirks here is way harder than it needs to be. :-(
In the unordered execution case, we cannot detect the link-chain end only
by prev == NULL if the latest ORB is executed earlier than the former
ORBs. Use ORB_LINK_DEAD flag for this case.
- Don't reset agent for management ORB.
- Improve debug messages.
Spotted by: sbp target mode
a long-time bug: vm_pager_get_pages() assumes that m[reqpage] contains a
valid page upon return from pgo_getpages(). In the case of the device
pager this page has been freed and replaced by a fake page. The fake page
is properly inserted into the vm object but m[reqpage] is left pointing
to a freed page. For now, update m[reqpage] to point to the fake page.
Submitted by: tegge
caused snapshot related problems.
- The vp can not be NULL here or we would panic in vfs_bio_awrite(). Stop
confusing the logic by checking for it in several places.
Submitted by: kirk and then rototilled by me to remove vp == NULL checks.
for 21143 based cards which use SIA mode.
This fixes 10mbit mode for ZNYX ZX346Q cards and other
21143 based cards.
PR: 32118
Submitted by: Rene de Vries <rene@tunix.nl>
Geert Jan de Groot <GeertJan.deGroot@tunix.nl>
Obtained from: BSDI
MFC after: 2 weeks
VOP_INACTIVE routines need not worry about their vnode getting
recycled if they block. Remove the code from nfs_inactive() that
used vget() to get an extra vnode reference that was held during
the nfs_vinvalbuf() call.
so make the code slightly more uniform. The vnode lock is acquired in
all cases and now the only difference between VCHR and other is we
call UFS_UPDATE instead of VOP_FSYNC().
Use pre-emption detection to avoid the need for wiring a userland buffer
when copying opaque data structures.
sysctl_wire_old_buffer() is now a no-op. Other consumers of this
API should use pre-emption detection to notice update collisions.
vslock() and vsunlock() should no longer be called by any code
and should be retired in subsequent commits.
Discussed with: pete, phk
MFC after: 1 week
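The idea of noticing an update collision and retrying the copy can be illustrated in user space with a generation counter. This is a swapped-in analog, not the kernel's pre-emption detection; the structure, the function names and the 'interfere' test hook are all hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opaque structure whose updates bump a generation counter. */
struct stats {
	unsigned gen;
	int      values[4];
};

/* Writer side: change the data and advance the generation. */
static void
stats_update(struct stats *s, int v)
{
	for (int i = 0; i < 4; i++)
		s->values[i] = v;
	s->gen++;
}

/* Test hook used to simulate an update arriving in the middle of a copy. */
static void
bump(struct stats *s)
{
	stats_update(s, 7);
}

/*
 * Reader side: copy, then check whether an update collided with the copy
 * and retry if so. Returns the number of attempts that were needed.
 */
static int
stats_snapshot(struct stats *s, int out[4], void (*interfere)(struct stats *))
{
	unsigned before;
	int tries = 0;

	do {
		before = s->gen;
		memcpy(out, s->values, sizeof(s->values));
		if (interfere != NULL && tries == 0)
			interfere(s);   /* simulated collision, first pass only */
		tries++;
	} while (before != s->gen);
	return (tries);
}
```

The sketch is single threaded; a real implementation would also need the appropriate memory ordering or locking around the counter.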
go away in due course. Involuntary pre-emption means that we can't count
on wiring of pages alone for consistency when performing a SYSCTL_OUT()
bigger than PAGE_SIZE.
Discussed with: pete, phk
- Slightly rewrite the fsync loop to be more lock friendly. We must
acquire the vnode interlock before dropping the mnt lock. We must
also check XLOCK to prevent vclean() races.
- Use LK_INTERLOCK in the vget() in ffs_sync to further prevent vclean()
races.
- Use a local variable to store the results of the nvp == TAILQ_NEXT
test so that we do not access the vp after we've vrele()d it.
- Add an XXX comment about UFS_UPDATE() not being protected by any lock
here. I suspect that it should need the VOP lock.
LK_RETRY either, we don't want this vnode if it turns into another.
- Remove the code that checks the mount point after acquiring the lock;
we are guaranteed to either fail or get the vnode that we wanted.
- In vtryrecycle() try to vgonel the vnode if all of the previous checks
passed. We won't vgonel if someone has either acquired a hold or usecount
or started the vgone process elsewhere. This is because we may have been
removed from the free list while we were inspecting the vnode for
recycling.
- The VI_TRYLOCK stops two threads from entering getnewvnode() and recycling
the same vnode. To further reduce the likelihood of this event, requeue
the vnode on the tail of the list prior to calling vtryrecycle(). We can
not actually remove the vnode from the list until we know that it's
going to be recycled because other interlock holders may see the VI_FREE
flag and try to remove it from the free list.
- Kill a bogus XXX comment. If XLOCK is set we shouldn't wait for it
regardless of MNT_WAIT because the vnode does not actually belong to
this filesystem.
purge, the purge in vclean, and the filesystems purge, we had 3 purges
per vnode.
- Move the insmntque(vp, 0) to vclean() so that we may remove it from the
two vgone() functions and reduce the number of lock operations required.
whether or not the sync failed. This could potentially get set between
the time that we VOP_UNLOCK and VI_LOCK() but the race would harmlessly
lead to the sync being delayed by an extra 30 seconds. If we do not move
the vnode it could cause an endless loop if it continues to fail to sync.
- Use vhold and vdrop to stop the vnode from changing identities while we
have it unlocked. Other internal vfs lists are likely to follow this
scheme.
- Create a new function, vgonechrl(), which performs vgone for an in-use
character device. Move the code from vflush() that did this into
vgonechrl().
- Hold the xlock across the entirety of vgonel() and vgonechrl() so that
at no point will an invalid vnode exist on any list without XLOCK set.
- Move the xlock code out of vclean() now that it is in the vgone*()
functions.
work in, but we had it mapped read-only. While this has always been the
case, the PG_PS enable hack hid it and the apm bios code ended up taking
advantage of it.
This is so that we may grab the interlock while still holding the
sync_mtx. We have to VI_TRYLOCK() because in all other cases the lock
order runs the other way.
- If we don't meet any of the preconditions, reinsert the vp into the
list for the next second.
- We don't need to panic if we fail to sync here because each FSYNC
function handles this case. Removing this redundant code also
simplifies locking.
will not actually be set even though we're calling sosetopt. sosetopt
calls down to a single ctloutput function if the name or level is
implemented by a specific protocol.
Submitted by: pete@isilon.com
fail. Remove the panic from that case and document why it might fail.
- Document the reason for calling cache_purge() on a newly created vnode.
- In insmntque() order the operations so that we can call mtx_unlock()
one fewer times. This makes the code somewhat clearer as well.
- Add XXX comments in sched_sync() and vflush().
- In vget(), do not sleep while waiting for XLOCK to clear if LK_NOWAIT is
set.
- In vclean() we don't need to acquire a lock around a single TAILQ_FIRST
call. It's ok if we race here, the vinvalbuf will just do nothing.
- Increase the scope of the lock in vgonel() to reduce the number of lock
operations that are performed.
we release the mntvnode_mtx.
- Call vgonel() directly instead of going through vrecycle() since we own
the interlock now.
- Remove a few cases where we locked the interlock just so that we could
call VOP_UNLOCK with interlock held.
mntvnode_mtx.
- Use a local variable to store the results of the test to see if the
next vnode on the mount list has changed. This is so that we no longer
access the vnode after we vput() it.
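The local-variable pattern can be sketched in user space as follows. The reference-counted node and both functions are hypothetical; the point is only that the comparison is made and stored before the reference is dropped:

```c
#include <stdlib.h>

/* Hypothetical reference-counted node on a singly linked list. */
struct node {
	struct node *next;
	int          refs;
};

/* Drop a reference; the node is freed when the last one goes away. */
static void
node_put(struct node *n)
{
	if (--n->refs == 0)
		free(n);
}

/*
 * Walk the list, releasing each node. 'nn' is the successor sampled
 * before the work was done. Whether the list changed is decided and
 * stored in a local while the reference is still held, so nothing
 * reads 'n' after node_put().
 * Returns the number of nodes whose successor was unchanged.
 */
static int
walk_and_put(struct node *head)
{
	struct node *n, *nn;
	int unchanged, count = 0;

	for (n = head; n != NULL; n = nn) {
		nn = n->next;
		/* ... work that may sleep would go here ... */
		unchanged = (nn == n->next);   /* test before release */
		node_put(n);                   /* n may be freed now */
		if (unchanged)
			count++;
	}
	return (count);
}
```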
stack trace supplied by phk, I now understand what's going on here. The
check for VI_XLOCK stops us from calling vinvalbuf once the vnode has been
partially torn down in vclean(). It is not clear that this would cause
a problem. Document this in nfs_bio.c, which is where the other two
filesystems copied this code from.
I do not yet understand why, but apm *depended* on the fact that the old
PSE code caused the first 1MB of ram to be mapped read/write because it
was in the same 4MB page as the kernel text+data+bss blob.
If anybody ever tried DISABLE_PSE before, apm would not work.
If your cpu did not have PSE, apm would not work there either (eg: 486).
This bug has been around for a Very Long Time.
The Pentium-4-fix commits did not emulate this unintended side effect of
the PSE post-early-boot fixup, and thus apm blew up. I've added a hack to
emulate the bug until either apm is fixed or we set fire to our bridges.
This is bad though because it gives kernel mode code the opportunity
to accidentally write to the first few megs of the general page pool
which is remapped at KERNBASE. It needs to be fixed properly.
that covers updates to the contents. Note this is separate from holding
a reference and/or locking the routing table itself.
Other/related changes:
o rtredirect loses the final parameter by which an rtentry reference
may be returned; this was never used and added unwarranted complexity
for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes,
we assume the parent will remain as long as the clone; doing this avoids
a circularity in locking during delete
o convert some timeouts to MPSAFE callouts
Notes:
1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level
applications cannot/do-not know about mutexes. Doing this requires
that the mutex be the last element in the structure. A better solution
is to introduce an externalized version of struct rtentry but this is
a major task because of the intertwining of rtentry and other data
structures that are visible to user applications.
2. There are known LOR's that are expected to go away with forthcoming
work to eliminate many held references. If not these will be resolved
prior to release.
3. ATM changes are untested.
Sponsored by: FreeBSD Foundation
Obtained from: BSD/OS (partly)
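Note 1 depends on a layout property: a member that only the kernel sees can be appended without moving the fields user applications read, provided it comes last. A minimal sketch with two explicit structures standing in for the two views of one #ifdef'ed definition; every type and field name here is hypothetical and much simpler than the real struct rtentry:

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel mutex; not the real struct mtx. */
struct sketch_mtx {
	long owner;
	int  flags;
};

/* Layout as user-level applications would see it. */
struct entry_user {
	int   flags;
	long  use;
	void *ifp;
};

/* Layout as the kernel would see it: same fields, lock appended. */
struct entry_kernel {
	int   flags;
	long  use;
	void *ifp;
	struct sketch_mtx mtx;   /* must stay the last member */
};
```

Placing the lock anywhere but the end would shift the offsets of the members after it and break applications built against the user view.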
A small helper function pmap_is_prefaultable() is added. This function
encapsulates the few lines of pmap_prefault() that actually vary from
machine to machine. Note: pmap_is_prefaultable() and pmap_mincore() have
much in common. Going forward, it's worth considering their merger.
been widely deployed and that's causing us a lot of pain. Back out the
last commit for a few weeks so that we can lessen the support load in
current@ asking why they can't build kernels anymore. Instructions in
UPDATING have been updated, but this should be more effective.
Revert the reverting: November 1st, 2003
quantities on every other architecture.) This change is required in order
to move pmap_prefault() out of the pmap and into the machine-independent
layer.
any queued packets for the isr, process those packets before the newly
submitted packet, maintaining ordering of all packets being delivered
to the netisr. Remove the bypass counter since we don't bypass anymore.
Leave the comment about possible problems and options since later
performance optimization may change the strategy for addressing ordering
problems here.
Specifically, this maintains the strong isr ordering guarantee; additional
parallelism and lower latency may be possible by moving to weaker
guarantees (per-interface, for example). We will probably at some point
also want to remove the one instance netisr dispatch limit currently
enforced by a mutex, but it's not clear that's 100% safe yet, even in
the netperf branch.
Reviewed by: sam, others
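The ordering rule described above (drain whatever is already queued, then handle the newly submitted packet) can be sketched in user space like this. The ring buffer, the processing log and all names are hypothetical, and the locking that the real netisr code needs is left out:

```c
#include <stddef.h>

#define QLEN 8

/* Hypothetical per-protocol queue plus a log of the processing order. */
struct isr {
	int    q[QLEN];
	size_t head, count;      /* ring buffer state */
	int    log[2 * QLEN];
	size_t nlog;
};

/* Stand-in for the protocol handler: just record what was processed. */
static void
isr_process(struct isr *isr, int pkt)
{
	if (isr->nlog < 2 * QLEN)
		isr->log[isr->nlog++] = pkt;
}

/* Queue a packet for later delivery; returns -1 if the queue is full. */
static int
isr_queue(struct isr *isr, int pkt)
{
	if (isr->count == QLEN)
		return (-1);
	isr->q[(isr->head + isr->count) % QLEN] = pkt;
	isr->count++;
	return (0);
}

/* Direct dispatch: process queued packets first, then the new one. */
static void
isr_dispatch(struct isr *isr, int pkt)
{
	while (isr->count > 0) {
		isr_process(isr, isr->q[isr->head]);
		isr->head = (isr->head + 1) % QLEN;
		isr->count--;
	}
	isr_process(isr, pkt);
}
```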
o move route_cb to be private to rtsock.c
o replace global static route_proto by locals
o eliminate global #define shorthands for info references
o remove some register decls
o ansi-fy function decls
o move items to be close in scope to their usage
o add rt_dispatch function for dispatching the actual message
o cleanup tangled logic for doing all-but-me msg send
Supported by: FreeBSD Foundation
RTF_STATIC routes. Do not check for RTF_HOST so as to avoid being DoSed
when an RTF_GENMASK route exists in the table.
Add a more verbose comment about exactly what this code does.
Submitted by: ru
frame marker) and the syscall stub frame info in the trap frame.
Previously we stored the stub frame info in (rp,pfs) and the
caller frame info in (iip,cfm). This ends up being suboptimal
for the following reasons:
1. When we create a new context, such as for an execve(2), we had
to set the (rp,pfs) pair for the entry point when using the
syscall path out of the kernel but we need to set the (iip,cfm)
pair when we take the interrupt way out. This is mostly just
an inconsistency from the kernel's point of view, but an ugly
irregularity from gdb(1)'s point of view.
2. The getcontext(2) and setcontext(2) syscalls had to swap the
(rp,pfs) and (iip,cfm) pairs to make the context compatible
with one created purely in userland.
Swapping the (rp,pfs) and (iip,cfm) pairs is visible to signal
handlers that actually peek at the mcontext_t and to gdb(1).
Since this change is made for gdb(1) and we don't care about
signal handlers that peek at the mcontext_t because we're still
a tier 2 platform, this ABI breakage is academic at this moment
in time.
Note that there was no real reason to save the caller frame info
in (iip,cfm) and the stub frame info in (rp,pfs).
validating the offset within a given memory buffer before handing the
real work off to uiomove(9).
Use uiomove_frombuf in procfs to correct several issues with
integer arithmetic that could result in underflows/overflows. As a
side-effect, the code is significantly simplified.
Add additional sanity checks when computing a memory allocation size
in pfs_read.
Submitted by: rwatson (original uiomove_frombuf -- bugs are mine :-)
Reported by: Joost Pol <joost@pine.nl> (integer underflows/overflows)
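A user-space analog of the offset validation may make the arithmetic concerns concrete. This is a sketch only: copy_frombuf() is a hypothetical name with a simplified signature, it does not take a struct uio, and its return convention is an assumption:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical analog: copy up to 'want' bytes starting at 'offset'
 * within buf[0..buflen) into dst. Returns the number of bytes copied,
 * 0 when the offset is at or past the end, and -1 for invalid arguments.
 */
static long
copy_frombuf(const void *buf, long buflen, long offset, void *dst, long want)
{
	long n;

	if (buf == NULL || dst == NULL || buflen < 0 || offset < 0 || want < 0)
		return (-1);
	if (offset >= buflen)
		return (0);
	n = buflen - offset;        /* cannot underflow: offset < buflen */
	if (n > want)
		n = want;
	memcpy(dst, (const char *)buf + offset, (size_t)n);
	return (n);
}
```

Doing every comparison before the subtraction is what keeps a negative or oversized offset from turning into a huge copy length.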
And many changes.
* all
- Major change of struct fw_xfer.
o {send,recv}.buf is split into hdr and payload.
o Remove unnecessary fields.
o spd is moved under send and recv.
- Remove unnecessary 'volatile' keyword.
- Add definition of rtcode and extcode.
* firewire.c
- Ignore FWDEVINVAL devices in fw_noderesolve_nodeid().
- Check the existence of the bind before calling STAILQ_REMOVE().
- Fix bug in fw_bindadd().
- Change element of struct fw_bind for simplicity.
- Check rtcode of response packet.
- Reduce split transaction timeout to 200 msec.
(100msec is the default value in the spec.)
- Set watchdog timer cycle to 10 Hz.
- Set xfer->tv just before calling fw_get_tlabel().
* fwohci.c
- Simplify fwohci_get_plen().
* sbp.c
- Fix byte order of multibyte scsi_status information.
- Split sbp.c and sbp.h.
- Unit number is not necessary for FIFO address.
- Reduce LOGIN_DELAY and SCAN_DELAY to 1 sec.
- Add some constants defined in SBP-2 spec.
* fwmem.c
- Introduce fwmem_strategy() and reduce memory copy.
fd_cmask field in the file descriptor structure for the first process
indirectly from CMASK, and when an fd structure is initialized before
being filled in, and instead just use CMASK. This appears to be an
artifact left over from the initial integration of quotas into BSD.
Suggested by: peter
avoid problems with some Pentium 4 cpus and some older PPro/Pentium2
cpus. There are several problems, some documented in Intel errata.
This patch:
1) moves the kernel to the second page in the PSE case. There is an
errata that says that you Must Not point a 4MB page at physical
address zero on older cpus. We avoided bugs here due to sheer luck.
2) sets up PSE page tables right from the start in locore, rather than
trying to switch from 4K to 4M (or 2M) pages part way through the boot
sequence at the same time that we're messing with PG_G.
For some reason, the pmap work over the last 18 months seems to tickle
the problems, and the PAE infrastructure changes disturb the cpu
bugs even more.
A couple of people have reported a problem with APM bios calls during
boot. I'll work with people to get this resolved.
Obtained from: bmilekic
(direct dispatch) in interrupt threads when the netisr in question
isn't already active. If a netisr is already active, or direct
dispatch is already in progress, we queue the packet for later
delivery. Previously, this option was disabled by default. I have
measured 20%+ performance improvements in IP packet forwarding with
this enabled.
Please report any problems ASAP, especially relating to stack depth or
out-of-order packet processing.
Discussed with: jlemon, peter
Sponsored by: DARPA, Network Associates Laboratories
was that accessing the status reg could occur too fast, confusing
the logic in the flash part. Could not have been located without:
HW donated by: Jonas Bülow <jonas@servicefactory.se>
prior to invalidating the TLB to be certain that the processor doesn't
keep a cached copy.
Discussed with: pete
Paniced: tegge
Pointy Hat: The usual spot
evaluating them at compile time rather than at run time. As for x86
and amd64, this requires GCC and it's enabled only if __OPTIMIZE__ is
defined (ie, if at least -O is used).
Reviewed by: jake
Their purpose is to give explicit hints to the compiler to judge
the likelihood of a test to succeed or fail. Not all architectures
have support for such optimizations, but for those that do, it can
give a nice performance improvement in hot loops.
Obviously, this should be used very rarely in very specific code.
Reviewed by: peter
Obtained from: OpenBSD
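A minimal sketch of how such hints are commonly written on top of the GCC/Clang builtin __builtin_expect(). The macro names are hypothetical, chosen so as not to claim they are the ones this commit adds, and sum_valid() only exists to show a use on a rarely taken branch:

```c
/* Hypothetical names; GCC and Clang provide __builtin_expect(). */
#if defined(__GNUC__)
#define sketch_predict_true(exp)  __builtin_expect(!!(exp), 1)
#define sketch_predict_false(exp) __builtin_expect(!!(exp), 0)
#else
/* Other compilers: evaluate the test without any hint. */
#define sketch_predict_true(exp)  (!!(exp))
#define sketch_predict_false(exp) (!!(exp))
#endif

/* Sum the non-negative entries; negative values are assumed to be rare. */
static long
sum_valid(const int *v, int n)
{
	long sum = 0;

	for (int i = 0; i < n; i++) {
		if (sketch_predict_false(v[i] < 0))
			continue;   /* unlikely path */
		sum += v[i];
	}
	return (sum);
}
```

The hint changes only how the compiler lays out the code; the result of the test is the same either way.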
AcpiEnterSleepState() calling a long AcpiOsStall() with interrupts
disabled. This fix will instead be added to ACPI-CA.
the TLB and ~1600 if it is not. Therefore, it is more efficient to
invalidate the TLB after operations that use CMAP rather than before.
- So that the tlb is invalidated prior to switching off of a processor, we
must change the switchin functions to switchout functions.
- Remove td_switchout from the thread and move it to the x86 pcb.
- Move the code that calls switchout into swtch.s. These changes make this
optimization truly x86 specific.
then the mbuf has been consumed by a hook; otherwise beware of a null
mbuf return (gack). In particular the bridge was doing the wrong thing.
While in the ipv6 code make its handling of pfil_run_hooks identical
to netbsd.
Pointed out by: Pyun YongHyeon <yongari@kt-is.co.kr>
change 38496
o add ipsec_osdep.h that holds os-specific definitions for portability
o s/KASSERT/IPSEC_ASSERT/ for portability
o s/SPLASSERT/IPSEC_SPLASSERT/ for portability
o remove function names from ASSERT strings since line#+file pinpoints
the location
o use __func__ uniformly to reduce string storage
o convert some random #ifdef DIAGNOSTIC code to assertions
o remove some debugging assertions no longer needed
change 38498
o replace numerous bogus panic's with equally bogus assertions
that at least go away on a production system
change 38502 + 38530
o change explicit mtx operations to #defines to simplify
future changes to a different lock type
change 38531
o hookup ipv4 ctlinput paths to a noop routine; we should be
handling path mtu changes at least
o correct potential null pointer deref in ipsec4_common_input_cb
change 38685
o fix locking for bundled SA's and for when key exchange is required
change 38770
o eliminate recursion on the SAHTREE lock
change 38804
o cleanup some types: long -> time_t
o remove reference to dead #define
change 38805
o correct some types: long -> time_t
o add scan generation # to secpolicy to deal with locking issues
change 38806
o use LIST_FOREACH_SAFE instead of handrolled code
o change key_flush_spd to drop the sptree lock before purging
an entry to avoid lock recursion and to avoid holding the lock
over a long-running operation
o misc cleanups of tangled and twisty code
There is still much to do here but for now things look to be
working again.
Supported by: FreeBSD Foundation
file for vnode mappings. Note that this uses vn_fullpath() and may
be somewhat unreliable, although not too unreliable for shared
libraries. For non-vnode mappings, just print "-" for the field.
Obtained from: TrustedBSD Projects
Sponsored by: DARPA, AFRL, Network Associates Laboratories
make sure we return any allocated space to the drive. This should get
rid of a number of inconsistencies (hopefully all) that have been seen
after configuration errors.
even could call VOP_REVOKE() on vnodes associated with its dev_t's
has originated, but it stops right here.
If there are things people believe destroy_dev() needs to learn how to
do, please tell me about it, preferably with a reproducible test case.
Include <sys/uio.h> in bluetooth code rather than rely on <sys/vnode.h>
to do so.
The fact that some of the USB code needs to include <sys/vnode.h>
still disturbs me greatly, but I do not have time to chase that.
from fiddling with CS_TTGO since fiddling with CS_TTGO was removed in
rev.1.218 of the i386/isa version (which was merged with loss of history
in rev.1.223 of this version).
some symbols in X_db_search_symbol(). Reject the same symbols that
rev.1.13 did (all except STT_OBJECT and STT_FUNC), except don't reject
typeless symbols. This keeps the typeless symbols in non-verbosely
written assembler code visible, but makes file symbols invisible. ELF
file symbols have type STT_FILE and value 0, so this stops small values
and offsets sometimes being displayed in terms of the first file symbol
in the kernel (usually device_if.c). I think it rejects some other
unwanted symbols (small absolute symbols for things like struct offsets).
It may reject some wanted symbols (large absolute symbols for addresses
like PTmap).
about because we're still tier 2 and our current compiler, as well
as future compilers will not support varargs. This is mostly a
no-op in practice, because <sys/varargs.h> should already cause
compile failures.
use the ability on ia64 to map the register stack. The orientation of
the stack (i.e. its grow direction) is passed to vm_map_stack() in the
overloaded cow argument. Since the grow direction is represented by
bits, it is possible and allowed to create bi-directional stacks.
This is not an advertised feature, more of a side-effect.
Fix a bug in vm_map_growstack() that's specific to rstacks and which
we could only find by having the ability to create rstacks: when
the mapped stack ends at the faulting address, we have not actually
mapped the faulting address. We need to include or cover the faulting
address.
Note that at this time mmap(2) has not been extended to allow the
creation of rstacks by processes. If such a need arises, this can
be done.
Tested on: alpha, i386, ia64, sparc64
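The growstack fix is a boundary condition on a half-open range. A minimal sketch of the arithmetic, assuming 4096-byte pages and a stack mapped at [start, end) that grows upward; the helper names are hypothetical and this is not the vm_map_growstack() code:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

/* Round an address up to the next page boundary. */
static uintptr_t
sketch_round_page(uintptr_t a)
{
	return ((a + SKETCH_PAGE_SIZE - 1) & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1));
}

/*
 * Hypothetical helper for an upward-growing stack whose mapping ends
 * at 'end' (exclusive). A fault at addr == end is not yet mapped, so
 * the new end must lie strictly above the faulting address.
 */
static uintptr_t
grow_up_end(uintptr_t end, uintptr_t addr)
{
	if (addr < end)
		return (end);                    /* already covered */
	return (sketch_round_page(addr + 1));
}
```

Rounding up the faulting address itself would return the old end when the fault lands exactly on the boundary, leaving the faulting page unmapped.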
do exactly the same as vop_nopoll() for consistency and put a
comment in the two pointing at each other.
Retire seltrue() in favour of no_poll().
Create private default functions in kern_conf.c instead of public
ones.
Change default strategy to return the bio with ENODEV instead of
doing nothing which would leave the bio stranded.
Retire public nullopen() and nullclose() as well as the entire band
of public no{read,write,ioctl,mmap,kqfilter,strategy,poll,dump}
functions, they are the default actions now.
Move the final two trivial functions from subr_xxx.c to kern_conf.c
and retire the now empty subr_xxx.c
This is just a cleanup here (modulo rev.1.108 of kern/tty.c), since the
input speed can be different from the output speed and extra code to
handle both speeds naturally handled all cases.
provide no methods does not make any sense, and is not used by any
driver.
It is pretty hard to come up with even a theoretical concept of
a device driver which would always fail open and close with ENODEV.
Change the defaults to be nullopen() and nullclose() which simply
do nothing.
Remove explicit initializations to these from the drivers which
already used them.
- Removed conversion of a zero input speed to the output speed. This
has been done better in ttioctl() since rev.1.108 of kern/tty.c
almost 5 years ago. comparam() did the conversion incompletely for
the case where the output speed is also zero. It had complications
to avoid using zero speeds, but would still have used a zero input
speed for setting watermarks if kern/tty.c had passed one.
- Never permit the input speed to be different from the output speed.
There was no validity check on the input speed for the case of a zero
output speed. Then we didn't change the physical speeds, but we used
the unvalidated input speed for setting watermarks and didn't return
an error, so ttioctl() stored the unvalidated input speed in the tty
struct where it could cause problems later.
- Removed complications that were to avoid using a divisor of 0. The
divisor is now always valid if the speed is accepted.
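The resulting rule set can be sketched as one validation function. This is a simplified illustration: it assumes the caller has already replaced a zero input speed with the output speed, it uses an assumed 115200 base rate, and it accepts only speeds that divide that rate exactly. The function name and constant are hypothetical:

```c
#include <errno.h>

#define SKETCH_BASE_RATE 115200L   /* assumed divisor base (clock / 16) */

/*
 * Hypothetical check: returns 0 and, for a non-zero speed, stores a
 * divisor; returns EINVAL otherwise. An output speed of zero (hangup)
 * is accepted and leaves the divisor untouched.
 */
static int
check_speeds(long ispeed, long ospeed, long *divisor)
{
	if (ispeed != ospeed)
		return (EINVAL);   /* split speeds are never permitted */
	if (ospeed == 0)
		return (0);
	if (ospeed < 0 || ospeed > SKETCH_BASE_RATE ||
	    SKETCH_BASE_RATE % ospeed != 0)
		return (EINVAL);
	*divisor = SKETCH_BASE_RATE / ospeed;   /* never zero once accepted */
	return (0);
}
```

Because every rejected speed returns before the division, no later code has to guard against a zero divisor.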
cd_setreg() were still using !(read_eflags() & PSL_I) as the condition
for the lock hidden by COM_LOCK() (if any) being held. This worked
when spin mutexes and/or critical_enter() used hard interrupt disablement,
but it has caused recursion on the non-recursive mutex com_mtx since
all relevant interrupt disablement became soft. The recursion is
harmless unless there are other bugs, but it breaks an invariant so
it is fatal if spinlocks are witnessed.
struct msdosfsmount so that this file has the same prerequisites as
it used to. The new prerequisite was a meta-style bug. It required
many style bugs (unsorted includes ...) elsewhere.
Formatted prototypes in KNF. Resisted urge to sort all the prototypes,
to minimise differences with NetBSD. (NetBSD has reformatted the
prototypes but has not sorted them and still uses __P(()).)
node lock while sending a management frame as this will potentially
result in a LOR with a driver lock. This doesn't happen for the
Atheros driver but does for the wi driver. Use a generation number
to help process each node once when scanning the node table and
drop the node lock if we need to timeout a node and send a frame.
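
A rough sketch of the generation-number scan, assuming a singly linked
node table and a flag standing in for the node lock. All demo_* names
are hypothetical and this is not the net80211 code; it only shows how
stamping each node with the current scan generation lets the walk
restart after the lock has been dropped without processing a node twice.

```c
#include <stddef.h>

struct demo_node {
	struct demo_node *next;
	unsigned int	 scangen;	/* generation that last visited us */
	int		 inact;		/* remaining inactivity ticks */
};

struct demo_table {
	struct demo_node *head;
	unsigned int	 scangen;	/* current scan generation */
	int		 locked;	/* stand-in for the real node lock */
	int		 frames_sent;
};

static void demo_lock(struct demo_table *t)   { t->locked = 1; }
static void demo_unlock(struct demo_table *t) { t->locked = 0; }

/* Stand-in for sending a management frame; only counts lock-free calls. */
static void
demo_send_frame(struct demo_table *t, struct demo_node *n)
{
	(void)n;
	if (!t->locked)
		t->frames_sent++;
}

static void
demo_timeout_nodes(struct demo_table *t)
{
	struct demo_node *n;
	unsigned int gen;

	demo_lock(t);
	gen = ++t->scangen;
restart:
	for (n = t->head; n != NULL; n = n->next) {
		if (n->scangen == gen)
			continue;		/* already handled this scan */
		n->scangen = gen;
		if (--n->inact <= 0) {
			/* Drop the lock before calling into the driver. */
			demo_unlock(t);
			demo_send_frame(t, n);
			demo_lock(t);
			goto restart;		/* list may have changed */
		}
	}
	demo_unlock(t);
}
```

The restart costs a rescan of the list, but nodes already stamped are
skipped, so each node is aged exactly once per scan.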
AP has basic rates that we do not support then ignore them instead
of marking the rate set in error.
This fixes an 11b station associating with an 11g/b AP.
are allowed by Windows (ref: MS KB article 120138).
XXX From my reading of the CIFS specification, it's not clear that
clients need to validate filenames at all.
PR: 57123
Submitted by: Paul Coucher
MFC after: 1 month
in the loopback test in the probe. The delay was too short for consoles
at speeds lower than about 3200 bps. This shouldn't have caused many
problems, since such low speeds are rare and the probe is forced to
succeed for consoles.
consdev structure.
If the consdev name is not set and we have a cn_dev, set the name
from there. Try to issue a printf about this, even though it may
not have a place to go.
Modify the sysctl related code to pick up the name from the consdev
instead.
Most of the actual use of the cn_dev field is merely to get the name,
and most of the actual initializations are bogusly using makedev()
because the probe/attach has not been completed.
Instead we will migrate console drivers to fill in the name and, if
the driver needs it, the unit number, thereby avoiding the bogus
calls to makedev().
and the Z8530 drivers used the I/O address as a quick and dirty way to
determine which channel they operated on, but formalizing this by
introducing iobase is not a solution. How for example would a driver
know which channel it controls for a multi-channel UART that only has a
single I/O range?
Instead, add an explicit field, called chan, to struct uart_bas that
holds the channel within a device, or 0 otherwise. The chan field is
initialized either by the system device probing (i.e. a system console)
or by the bus front-ends, which pass it down to uart_bus_probe().
As such, it impacts all platforms and bus drivers, which makes this a
rather large commit.
Remove the use of iobase in uart_cpu_eqres() for pc98. It is expected
that platforms have the capability to compare tag and handle pairs for
equality, so as to determine whether two pairs access the same device
or not. The use of iobase for pc98 makes it impossible to formalize this
and turn it into a real newbus function later. This commit reverts
uart_cpu_eqres() for pc98 to an unimplemented function. It has to be
reimplemented using only the tag and handle fields in struct uart_bas.
Rewrite the SAB82532 and Z8530 drivers to use the chan field in struct
uart_bas. Remove the IS_CHANNEL_A and IS_CHANNEL_B macros. We don't
need to abstract anything anymore.
Discussed with: nyan
Tested on: i386, ia64, sparc64
aic7xxx_pci.c:
When performing our register test, be careful
to avoid resetting the chip when pausing the
controller. The test reads the HCNTRL register
and then writes it back with the PAUSE bit
explicitly set. If the last write to the controller
before our probe is to reset it, the CHIPRST
bit will still be set, so we must mask it off
before the PAUSE operation. On some chip versions,
we cannot access registers for a few 100us after
a reset, so this inadvertent reset was causing PCI
errors to occur on the read to check for paused
status.
Submitted by: gibbs
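
A small sketch of the masking step, assuming CHIPRST is bit 0 and PAUSE
is bit 2 of HCNTRL. The bit values are given from memory and the demo_*
names are hypothetical; check the register definitions before relying on
them.

```c
#include <stdint.h>

/* Assumed HCNTRL bit positions; illustrative only. */
#define DEMO_CHIPRST	0x01
#define DEMO_PAUSE	0x04

/*
 * Compute the value to write back to HCNTRL in order to pause the
 * controller: keep whatever was read, clear a stale CHIPRST bit so the
 * write does not reset the chip again, and set PAUSE.
 */
static uint8_t
demo_pause_value(uint8_t hcntrl)
{
	return ((uint8_t)((hcntrl & ~DEMO_CHIPRST) | DEMO_PAUSE));
}
```

Writing back the unmasked value would have requested a second reset
whenever the register still read back with CHIPRST set.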
Reimplement pmap_release() such that it uses the page table rather than
the pte object to locate the page table directory pages. (Temporarily,
retain an assertion on the emptiness of the pte object.)
systems where the data/stack/etc limits are too big for a 32 bit process.
Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.
Supply an ia32_fixlimits function. Export the clip/default values to
sysctl under the compat.ia32 hierarchy.
Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable. This allows mmap to
place mappings at sensible locations when limits have been reduced.
Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.
Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'. And that causes exec etc to fail since it can no
longer find space to mmap things.
Instead, use EXCA_MEMREG_WIN_SHIFT which is the amount we shift the
bus address by to write into upper memory (eg above 24MB). Use the
latter in this case.
be gone in FreeBSD 6, so put BURN_BRIDGES around it. The TRB also
felt that if something better comes along sooner, it can be used to
replace this code.
Delayed by: BSDcon and subsequent disk crash.
o revamp IPv4+IPv6+bridge usage to match API changes
o remove pfil_head instances from protosw entries (no longer used)
o add locking
o bump FreeBSD version for 3rd party modules
Heavy lifting by: "Max Laier" <max@love2party.net>
Supported by: FreeBSD Foundation
Obtained from: NetBSD (bits of pfil.h and pfil.c)
attached network could exhaust kernel memory, and cause a system
panic, by sending a flood of spoofed ARP requests.
Approved by: jake (mentor)
Reported by: Apple Product Security <product-security@apple.com>
Skinny is the protocol used by Cisco IP phones to talk to Cisco Call
Managers. With this code, one can use a Cisco IP phone behind a FreeBSD
NAT gateway.
Currently, having the Call Manager behind the NAT gateway is not supported.
More information on enabling Skinny support in libalias, natd, and ppp
can be found in those applications' manpages.
PR: 55843
Reviewed by: ru
Approved by: ru
MFC after: 30 days
order to use "unmanaged" pages in the kmem object, vm_map_delete() must
unconditionally perform pmap_remove(). Otherwise, sparc64 has problems.
Tested by: jake
32 bit binary stuff. 32 bit binaries do not like it much when the kernel
tries hard to put things above the 8GB mark.
I have a work-in-progress to fix this properly, but I didn't want to burn
anybody with this yet.
initialize a TSC timecounter until we know if it is broken or not.
XXX I think there is a bug in the i386 code here. init_TSC_tc() comes
after:
if (statclock_disable)
return;
ie: if you turn off the statclock interrupt, you don't get the TSC either.
for breakpoint and trace traps from usermode. Although all the setidt
entries are interrupt gates on amd64, all but the trace and bpt trap
entry handlers reenable interrupts after the swapgs instruction in order
to simulate the trap/interrupt gate distinction. In other words, the
amd64 code behaves the same way that i386 does here.
for ddb input in some atkbd-based console drivers. ddb must not use any
normal locks but DELAY() normally calls getit() which needs clock_lock.
This also removes the need for recursion on clock_lock.
known constants at compile time rather than at run time. We have a number
of nasty hacks around the place to cache ntohl() of constants (eg: nfs).
This change allows the compiler to compile-time evaluate ntohl(1) as
0x01000000 rather than having to emit assembler code to do it. This
has other smaller flow-on effects because the compiler can see that
ntohl(constant) itself has a constant value now and can propagate the
compile time evaluation.
Obtained from: Ideas from NetBSD and Linux, and some code from NetBSD
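
A sketch of the general technique, assuming a little-endian host and a
compiler that provides __builtin_constant_p (GCC and Clang do). The
DEMO_/demo_ names are hypothetical and the "variable" path below is
plain C standing in for the inline assembler.

```c
#include <stdint.h>

/* Pure-expression byte swap the compiler can fold for constants. */
#define DEMO_BSWAP32_CONST(x)					\
	((((uint32_t)(x) & 0xff000000u) >> 24) |		\
	 (((uint32_t)(x) & 0x00ff0000u) >> 8) |			\
	 (((uint32_t)(x) & 0x0000ff00u) << 8) |			\
	 (((uint32_t)(x) & 0x000000ffu) << 24))

/* Stand-in for the run-time (assembler) implementation. */
static inline uint32_t
demo_bswap32_var(uint32_t x)
{
	return (DEMO_BSWAP32_CONST(x));
}

/*
 * Little-endian ntohl(): constants take the foldable expression,
 * everything else takes the run-time routine.  Note the macro may
 * evaluate its argument more than once on the constant path, which is
 * harmless because that path is only taken for constants.
 */
#define demo_ntohl_le(x)					\
	(__builtin_constant_p(x) ? DEMO_BSWAP32_CONST(x) :	\
	    demo_bswap32_var(x))

/* The constant form is usable where a constant expression is required. */
_Static_assert(DEMO_BSWAP32_CONST(1) == 0x01000000u,
    "ntohl(1) folds at compile time");
```

Callers that used to cache the swapped value of a constant by hand can
then simply write the conversion inline.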
1.186: onoe; Sony's PEGA-WL110 CF WLAN (which strangely has fujitsu's
vendor id)
1.185: ichiro; Quatech Inc, PCMCIA Enhanced Parallel Port Card
Also:
o update $NetBSD$
o minor tweaks to FUJITSU. We've tried to keep the CIS only entries separate
from vendor id/product id.
freed belong to the kernel object.)
- Increase the granularity of the vm object locking in vm_hold_load_pages()
in order to reduce the number of times that we acquire and release the
same lock.
Fix to the messages output under CAM_DEBUG_CCB: the summary sense
information (error bits and sense key) is in the error field, not
in the result field, of struct ata_request. No other functional change.
completion of recovery is indicated by positioning the CAM_AUTOSNS_VALID
bit in the status field of the CCB, not in the flags field.
This fixes an endless loop of sense recovery actions.
Reviewed by: ken
function, startup_alloc(), that is used for single page allocations prior
to the VM starting up. If it is used after the VM starts up, it
replaces the zone's allocf pointer with either page_alloc() or
uma_small_alloc() where appropriate.
Pointy hat to: me
Tested by: phk/amd64, me/x86
Temporarily disable the UMA_MD_SMALL_ALLOC stuff since recent commits
break sparc64, amd64, ia64 and alpha. It appears only i386 and maybe
powerpc were not broken.
device to access 64-bit addresses from a 32-bit PCI bus. While the
RealTek manual says you can set this bit and the chip will perform
DAC only if you give it a DMA address with any of the upper 32
bits set, this appears not to be the case. If I turn on the DAC
bit, the chip sets the 'system error' bit in the status register
when I try to do a DMA on my Athlon test box with 32-bit PCI bus (VIA
chipset) even though I only have 128MB of physical memory, and thus
can never give the chip a 64-bit address.
Obviously, I can't just set it and forget it, so until I figure
out the right rule for when it's safe/necessary to enable it, keep
it turned off.
not guaranteed that the RSE writes the NaT collection immediately,
sort of atomically, to the backing store when it writes the register
immediately prior to the NaT collection point. This means that we
cannot assume that the low 9 bits of the backingstore pointer do not
point to the NaT collection. This is rather a surprise and I don't
know at this time if it's a bug in the Merced or that it's actually
a valid condition of the architecture. A quick scan over the sources
does not indicate that we depend on the false assumption elsewhere,
but it's something to keep in mind.
The fix is to write the saved contents of the ar.rnat register to
the backingstore prior to entering the loop that copies the dirty
registers from the kernel stack to the user stack.
functions reference UMA internals from <vm/uma_int.h>, which makes
them highly unwanted in non-UMA specific files.
While here, prune the includes in pmap.c and use __FBSDID(). Move
the includes above the descriptive comment.
The copyright of uma_machdep.c is assigned to the project and can
be reassigned to the foundation if and when such is preferable.
Tested at 100Mbit only, using Asus P4P800 onboard 3C940.
I have had the -stable version of this patch in use for ~2 weeks now, and
it works just fine for me.
Based on: Nathan L. Binkert's patch for OpenBSD
Patch submitted by and thanks to: Jung-uk Kim <jkim@niksun.com>
MFC after: 2 weeks
Doing so creates a race where the buf is on neither list.
- Only vfree() in an error case in vclean() if VSHOULDFREE() thinks we
should.
- Convert the error case in vclean() to INVARIANTS from DIAGNOSTIC as this
really should not happen and is fast to check.
sufficient to guarantee that this race is not hit. The XLOCK will likely
have to be redesigned due to the way reference counting and mutexes work
in FreeBSD. We currently can not be guaranteed that xlock was not set
and cleared while we were blocked on the interlock while waiting to check
for XLOCK. This would lead us to reference a vnode which was not the
vnode we requested.
- Add a backtrace() call inside of INVARIANTS in the hopes of finding out if
this condition is ever hit. It should not, since we should be retaining
a reference to the vnode in these cases. The reference would be sufficient
to block recycling.
working set cache. This has several advantages. Firstly, we never touch
the per cpu queues now in the timeout handler. This removes one more
reason for having per cpu locks. Secondly, it reduces the size of the zone
by 8 bytes, bringing it under 200 bytes for a single proc x86 box. This
tidies up other logic as well.
- The 'destroy' flag no longer needs to be passed to zone_drain() since it
always frees everything in the zone's slabs.
- cache_drain() is now only called from zone_dtor() and so it destroys by
default. It also does not need the destroy parameter now.
broken consumers of the malloc interface who assume that the allocated
address will be an even multiple of the size.
- Remove disabled time delay code on uma_reclaim(). The comment there said
it all. It was not an effective strategy and it should not be left in
#if 0'd for all eternity.
restart instruction bits in the PSR. As such, we were returning
from interrupt to the instruction in the bundle that caused us
to enter the kernel, only now we're returning to a completely
different bundle.
While close here: add two KASSERTs to make sure that we restore
sync contexts only when we entered the kernel through a syscall and
restore an async context only when we entered the kernel through an
interrupt, trap or fault.
While not exactly here, but close enough: use suword64() when we
copy the dirty registers from the kernel stack to the user stack.
The code was intended to be replaced shortly after being added,
but that was a couple of weeks ago. I might as well keep it from
being a source of panics until it's replaced.
can get (or not) and what we do with them. This fixes the behaviour
for NaT consumption and speculation faults in that we now don't panic
for user faults.
Remove the dopanic label and move the code to a function. This makes
it easier in the simulator to set a breakpoint.
While here, remove the special handling of the old break-based syscall
path and move it to where we handle the break vector. While here,
reserve a new break immediate for KSE. We currently use the old break-
based syscall to deal with restoring async contexts. However, it has
the side-effect of also setting the signal mask and calling ast() on
the way out. The new break immediate simply restores the context and
returns without calling ast().
of "dumb" PCI-based serial/parallel boards get a hint how to enable
them.
I wasn't sure about the ia64, pc98, powerpc, and sparc64 archs whether
they'd support puc(4) or not.
extended irq lists. If the resource has a trailing byte but not the full
resource string, do not attempt to parse the resource string. This fixes
panics on transition to battery and shutdown for Larry. Patch has been
submitted to vendor and they will incorporate in next release.
Tested by: Larry Rosenman <ler@lerctr.org>
PR: kern/56254
page_alloc() function from the slab_zalloc() function. This allows us
to unconditionally call uz_allocf().
- In page_alloc() clean up the boot_pages logic some. Previously memory from
this cache that was not used by the time the system started was left in
the cache and never used. Typically this wasn't more than a few pages,
but now we will use this cache so long as memory is available.
by accepting the user supplied flags directly. Previously this was not
done so that flags for the same field would not be defined in two
different files. Add comments in each header instructing future
developers on how not to shoot their feet.
- Fix a test for !OFFPAGE which should have been a test for HASH. This would
have caused a panic if we had ever destructed a malloc zone. This also
opens up the possibility that other zones could use the vsetobj() method
rather than a hash.
but for CPL != 0. For some reason yet unknown it is possible for the
CPL to be 2. This would previously be counted as kernel mode, which
resulted in nasty panics. By changing the test it is now treated as
user mode, which is more correct. We still need to figure out how it
is possible that the privilege level can be 2 (or 1 for that matter),
because it's not used by us. We only use 3 (user mode) and 0 (kernel
mode).
don't cache as many items.
- Introduce the bucket_alloc(), bucket_free() functions to wrap bucket
allocation. These functions select the appropriate bucket zone to
allocate from or free to.
- Rename ub_ptr to ub_cnt to reflect a change in its use. ub_cnt now reflects
the count of free items in the bucket. This gets rid of many unnatural
subtractions by 1 throughout the code.
- Add ub_entries which reflects the number of entries possibly held in a
bucket.
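
A sketch of how a count-based bucket reads, assuming a small fixed-size
array of item pointers. The demo_* names and the array size are
hypothetical; only ub_cnt and ub_entries come from the description
above.

```c
#include <stddef.h>

#define DEMO_BUCKET_MAX	4

struct demo_bucket {
	int	 ub_cnt;		/* number of free items held */
	int	 ub_entries;		/* capacity of this bucket */
	void	*ub_bucket[DEMO_BUCKET_MAX];
};

/* Returns 1 if the item was cached, 0 if the bucket is full. */
static int
demo_bucket_put(struct demo_bucket *b, void *item)
{
	if (b->ub_cnt >= b->ub_entries)
		return (0);
	b->ub_bucket[b->ub_cnt++] = item;
	return (1);
}

/* Returns a cached item, or NULL if the bucket is empty. */
static void *
demo_bucket_get(struct demo_bucket *b)
{
	if (b->ub_cnt == 0)
		return (NULL);
	return (b->ub_bucket[--b->ub_cnt]);
}
```

Because ub_cnt is a count rather than an index of the last item, the
empty and full tests compare against 0 and ub_entries directly with no
off-by-one adjustments.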
IF_HANDOFF() does it for us behind the scenes. Remove the extra call
to re_start() otherwise we try to transmit twice.
In re_encap(), fix the code that guards against consuming too many
descriptors in the TX ring so that it actually works. With the
new 8169S chip, I was able to hit a corner case that drained the
free descriptor count all the way to 0. This is not supposed to
be possible.
reserved bits in the port that must be zero are 24:30, not 20:30. Bits
16:23 are used to set the bus number. This meant that when we tested for
config mechanism #1, if the previous PCI configuration transaction sent
used a bus number greater than 15, one of the bits in 20:23 would be
non-zero and we would fail to use config mechanism #1 and thus fail to see
that PCI existed on the machine at all.
Obtained from: Shanley's PCI System Architecture book
Tested by: des
Proxied through: njl
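
A sketch of why the wider mask misfired, assuming the usual config
mechanism #1 address layout (enable in bit 31, bus in bits 16:23, slot
in 11:15, function in 8:10, register in 2:7). The DEMO_/demo_ names are
hypothetical and this is not the actual probe code.

```c
#include <stdint.h>

#define DEMO_CONF1_ENABLE	0x80000000u
#define DEMO_CONF1_RSVD_OLD	0x7ff00000u	/* bits 20:30 (wrong) */
#define DEMO_CONF1_RSVD		0x7f000000u	/* bits 24:30 (right) */

/* Build the address value a previous config transaction would leave. */
static uint32_t
demo_conf1_addr(unsigned bus, unsigned slot, unsigned func, unsigned reg)
{
	return (DEMO_CONF1_ENABLE |
	    ((uint32_t)(bus & 0xff) << 16) |
	    ((uint32_t)(slot & 0x1f) << 11) |
	    ((uint32_t)(func & 0x07) << 8) |
	    (uint32_t)(reg & 0xfc));
}

/* Non-zero if all "must be zero" bits under the given mask are clear. */
static int
demo_rsvd_clear(uint32_t addr, uint32_t mask)
{
	return ((addr & mask) == 0);
}
```

For example, an address left over from bus 16 has bit 20 set, so it
fails the old 20:30 check but passes the 24:30 one.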
user mode. This goes with rev.1.468 of machdep.c which changed the gates
for these traps to interrupt gates. Having the interrupts disabled for
these traps from user mode is just an unwanted side effect.
This fixes at least 1 case of "panic: absolutely cannot call
smp_ipi_shootdown with interrupts already disabled". Too much code was
run with interrupts disabled, and it sometimes hit a sanity check.
Fix verified by: deischen