Commit graph

12239 commits

Author SHA1 Message Date
Attilio Rao
ef607a6aa3 MFC 2011-05-12 14:01:40 +00:00
Stanislav Sedov
ff6f41a472 - Do no try to drop a NULL filedesc pointer. 2011-05-12 10:56:33 +00:00
Stanislav Sedov
0daf62d9f5 - Commit work from libprocstat project. These patches add support for runtime
file and processes information retrieval from the running kernel via sysctl
  in the form of new library, libprocstat.  The library also supports KVM backend
  for analyzing memory crash dumps.  Both procstat(1) and fstat(1) utilities have
  been modified to take advantage of the library (as the bonus point the fstat(1)
  utility no longer need superuser privileges to operate), and the procstat(1)
  utility is now able to display information from memory dumps as well.

  The newly introduced fuser(1) utility also uses this library and able to operate
  via sysctl and kvm backends.

  The library is by no means complete (e.g. KVM backend is missing vnode name
  resolution routines, and there're no manpages for the library itself) so I
  plan to improve it further.  I'm commiting it so it will get wider exposure
  and review.

  We won't be able to MFC this work as it relies on changes in HEAD, which
  was introduced some time ago, that break kernel ABI.  OTOH we may be able
  to merge the library with KVM backend if we really need it there.

Discussed with:	rwatson
2011-05-12 10:11:39 +00:00
Attilio Rao
b9f714be9f MFC 2011-05-07 23:34:14 +00:00
Jaakko Heinonen
852bee75b7 To avoid duplicated warning, move WITNESS_WARN() added in r221597 to the
branch which doesn't call malloc(9).

Suggested by:	kib
2011-05-07 17:59:07 +00:00
Jaakko Heinonen
816c203937 Add WITNESS_WARN() to getenv() to explicitly note that the function may
sleep. This helps to expose bugs when the requested environment variable
doesn't exist.
2011-05-07 11:10:58 +00:00
Attilio Rao
71a19bdc64 Commit the support for removing cpumask_t and replacing it directly with
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).

Right now, cpumask_t is an int, 32 bits on all our supported architecture.
cpumask_t on the other side is implemented as an array of longs, and
easilly extendible by definition.

The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN

while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.

Some technical notes:
- This commit may be considered an ABI nop for all the architectures
  different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, needs to be
  accessed avoiding migration, because the size of cpuset_t should be
  considered unknown
- size of cpuset_t objects is different from kernel and userland (this is
  primirally done in order to leave some more space in userland to cope
  with KBI extensions). If you need to access kernel cpuset_t from the
  userland please refer to example in this patch on how to do that
  correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now

The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to came soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.

Tested by:	pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by:	jeff, jhb, sbruno
2011-05-05 14:39:14 +00:00
Attilio Rao
7505ef3a41 MFC 2011-05-04 15:45:23 +00:00
Attilio Rao
94ebcddde3 MFC 2011-05-03 18:57:46 +00:00
Andrey V. Elsukov
b50a7799b8 Add make_dev_alias_p() function. It is similar to make_dev_alias(),
but it may return an error like make_dev_p() does.

Reviewed by:	kib (previous version)
MFC after:	2 weeks
2011-05-03 18:54:18 +00:00
Edward Tomasz Napierala
a7ad07bff3 Change the way rctl interfaces with jails by introducing prison_racct
structure, which acts as a proxy between them.  This makes jail rules
persistent, i.e. they can be added before jail gets created, and they
don't disappear when the jail gets destroyed.
2011-05-03 07:32:58 +00:00
Attilio Rao
f0283a735c - Remove the following sysctl:
kern.sched.ipiwakeup.onecpu
  kern.sched.ipiwakeup.htt2

  Because they are absolutely obsolete. Probabilly the whole wakeup
  forward mechanism should be revisited for a better fitting in modern
  hw.
- As map2 variable is no longer used rename map3 to map2
- Fix a string by making more informative the msg and removing the
  arguments passing

Approved by:	julian
2011-04-30 23:28:07 +00:00
Attilio Rao
3121f5347e idle_cpus_mask is just used in the SMP case and within sched_4BSD.
Declare appropriately.
2011-04-30 22:30:18 +00:00
John Baldwin
85ee63c923 Add a new bus method, BUS_ADJUST_RESOURCE() that is intended to be a
wrapper around rman_adjust_resource().  Include a generic implementation,
bus_generic_adjust_resource() which passes the request up to the parent
bus.  There is currently no default implementation.  A
bus_adjust_resource() wrapper is provided for use in drivers.
2011-04-29 21:36:45 +00:00
John Baldwin
bb82622c3e Extend the rman(9) API to support altering an existing resource.
Specifically, these changes allow a resource to back a relocatable and
resizable resource such as the I/O window decoders in PCI-PCI bridges.
- rman_adjust_resource() can adjust the start and end address of an
  existing resource.  It only succeeds if the newly requested address
  space is already free.  It also supports shrinking a resource in
  which case the freed space will be marked unallocated in the rman.
- rman_first_free_region() and rman_last_free_region() return the
  start and end addresses for the first or last unallocated region in
  an rman, respectively.  This can be used to determine by how much
  the resource backing an rman must be adjusted to accomodate an
  allocation request that does not fit into the existing rman.

While here, document the rm_start and rm_end fields in struct rman,
rman_is_region_manager(), the bound argument to
rman_reserve_resource_bound(), and rman_init_from_resource().
2011-04-29 20:05:19 +00:00
John Baldwin
b67d11bbcc Change rman_manage_region() to actually honor the rm_start and rm_end
constraints on the rman and reject attempts to manage a region that is out
of range.
- Fix various places that set rm_end incorrectly (to ~0 or ~0u instead of
  ~0ul).
- To preserve existing behavior, change rman_init() to set rm_start and
  rm_end to allow managing the full range (0 to ~0ul) if they are not set by
  the caller when rman_init() is called.
2011-04-29 18:41:21 +00:00
Attilio Rao
2be767e069 Add the watchdogs patting during the (shutdown time) disk syncing and
disk dumping.
With the option SW_WATCHDOG on, these operations are doomed to let
watchdog fire, fi they take too long.

I implemented the stubs this way because I really want wdog_kern_*
KPI to not be dependant by SW_WATCHDOG being on (and really, the option
only enables watchdog activation in hardclock) and also avoid to
call them when not necessary (avoiding not-volountary watchdog
activations).

Sponsored by:	Sandvine Incorporated
Discussed with:	emaste, des
MFC after:	2 weeks
2011-04-28 16:02:05 +00:00
Ryan Stone
60dd73b78b If the 4BSD scheduler tries to schedule a thread that has been pinned or
bound to an AP before SMP has started, the system will panic when we try
to touch per-CPU state for that AP because that state has not been
initialized yet.  Fix this in the same way as ULE: place all threads in
the global run queue before SMP has started.

Reviewed by:	jhb
MFC after:	1 month
2011-04-26 20:34:30 +00:00
Konstantin Belousov
b2ad91f26b Implement the delayed task execution extension to the taskqueue
mechanism. The caller may specify a timeout in ticks after which the
task will be scheduled.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	jeff, jhb
MFC after:	1 month
2011-04-26 11:39:56 +00:00
Jeff Roberson
5bd186a65a - Catch up to falloc() changes.
- PHOLD() before using a task structure on the stack.
 - Fix a LOR between the sleepq lock and thread lock in _intr_drain().
2011-04-26 07:30:52 +00:00
Rick Macklem
c65c068a5f Fix a LOR in vfs_busy() where, after msleeping, it would lock
the mutexes in the wrong order for the case where the
MBF_MNTLSTLOCK is set. I believe this did have the
potential for deadlock. For example, if multiple nfsd threads
called vfs_busyfs(), which calls vfs_busy() with MBF_MNTLSTLOCK.
Thanks go to pho for catching this during his testing.

Tested by:	pho
Submitted by:	kib
MFC after:	2 weeks
2011-04-23 11:22:48 +00:00
Jaakko Heinonen
1b0fe69dc9 Utilize vfs_sanitizeopts() in vfs_mergeopts() to merge options. Because
vfs_sanitizeopts() can handle "ro" and "rw" options properly, there is
no more need to add "noro" in vfs_donmount() to cancel "ro".

This also fixes a problem of canceling options beginning with "no".
For example, "noatime" didn't cancel "nonoatime". Thus it was possible
that both "noatime" and "nonoatime" were active at the same time.

Reviewed by:	bde
2011-04-22 07:26:09 +00:00
Matthew D Fleming
1ce4508f6d Allow VOP_ALLOCATE to be iterative, and have kern_posix_fallocate(9)
drive looping and potentially yielding.

Requested by:	kib
2011-04-19 16:36:24 +00:00
Matthew D Fleming
5d253e418f Fix a copy/paste whitespace error. 2011-04-18 16:40:47 +00:00
Matthew D Fleming
7323776b01 Regen. 2011-04-18 16:32:47 +00:00
Matthew D Fleming
d91f88f7f3 Add the posix_fallocate(2) syscall. The default implementation in
vop_stdallocate() is filesystem agnostic and will run as slow as a
read/write loop in userspace; however, it serves to correctly
implement the functionality for filesystems that do not implement a
VOP_ALLOCATE.

Note that __FreeBSD_version was already bumped today to 900036 for any
ports which would like to use this function.

Also reserve space in the syscall table for posix_fadvise(2).

Reviewed by:	-arch (previous version)
2011-04-18 16:32:22 +00:00
Jilles Tjoelker
6100955206 ktrace: Log the code for all signals (PSIG events).
The code provides information on how the signal was generated.

Formerly, the code was only logged for traps, much like only signal handlers
for traps received a meaningful si_code before FreeBSD 7.0.

In rare cases, no information is available and 0 is still logged.

MFC after:	1 week
2011-04-17 14:38:11 +00:00
Dmitry Chagin
fa2835d296 Remove malloc(9) return value checks when M_WAITOK is used.
MFC after:	2 Week
2011-04-16 16:20:51 +00:00
Gleb Smirnoff
443301e296 Revert r194662, since it breaks ng_ksocket(4) and may break
other socket consumers with alternate sb_upcall.

PR:		kern/154676
Submitted by:	Arnaud Lacombe <lacombar gmail.com>
MFC after:	7 days
2011-04-14 14:54:22 +00:00
Sergey Kandaurov
ced9253e4e Remove stale M_ZOMBIE malloc type.
This type is unused since embedding p_ru into struct proc.

MFC after:	1 week
2011-04-14 14:25:47 +00:00
Gavin Atkinson
0f4d3c921d Add a new DDB command, "show rmans", which will show the address and brief
details of each rman header, but not the contents of all rman structures
in the system.  This is especially useful on platforms where some rmans
have many thousands of entries in rmans, making scrolling through the
output of "show all rman" impractical.  Individual rmans can then be viewed
including their contents with "show rman 0xaddr" as usual.

Reviewed by:	jhb
2011-04-13 19:10:56 +00:00
Sergey Kandaurov
6bed196c35 Staticize malloc types.
Approved by:	lstewart
MFC after:	1 week
2011-04-13 11:28:46 +00:00
Lawrence Stewart
891b8ed467 Use the full and proper company name for Swinburne University of Technology
throughout the source tree.

Requested by:	Grenville Armitage, Director of CAIA at Swinburne University of
			Technology
MFC after:	3 days
2011-04-12 08:13:18 +00:00
Edward Tomasz Napierala
415896e3b1 Rename a misnamed structure field (hr_loginclass), and reorder priv(9)
constants to match the order and naming of syscalls.  No functional changes.
2011-04-10 18:35:43 +00:00
Konstantin Belousov
2cfd5d3c91 Some callers of proc_reparent() already have the parent process locked.
Detect the situation and avoid process lock recursion.

Reported by:	Fabian Keil <freebsd-listen fabiankeil de>
2011-04-10 17:07:02 +00:00
Attilio Rao
1283e9cd60 Reintroduce the fix already discussed in r216805 (please check its history
for a detailed explanation of the problems).

The only difference with the previous fix is in Solution2:
CPUBLOCK is no longer set when exiting from callout_reset_*() functions,
which avoid the deadlock (leading to r217161).
There is no need to CPUBLOCK there because the running-and-migrating
assumption is strong enough to avoid problems there.
Furthermore add a better !SMP compliancy (leading to shrinked code and
structures) and facility macros/functions.

Tested by:	gianni, pho, dim
MFC after:	3 weeks
2011-04-08 18:48:57 +00:00
Edward Tomasz Napierala
722581d9e6 Add RACCT_NOFILE accounting.
Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-04-06 19:13:04 +00:00
Edward Tomasz Napierala
b1fb5f9c8d Style fix.
Submitted by:	jhb@
2011-04-06 19:08:50 +00:00
Edward Tomasz Napierala
3bcf74459f Add accounting for SysV-related resources.
Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-04-06 18:11:24 +00:00
John Baldwin
e806d352d2 Fix several places to ignore processes that are not yet fully constructed.
MFC after:	1 week
2011-04-06 17:47:22 +00:00
Edward Tomasz Napierala
8caddd81e2 Add ucred pointer to the SysV-related memory structures. This is required
for racct.

Note that after this commit, ipcs(1) needs to be rebuilt.  Otherwise, it will
fail with "ipcs: sysctlbyname: kern.ipc.msqids: Cannot allocate memory".

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-04-06 16:59:54 +00:00
Edward Tomasz Napierala
1ba5ad4210 Add accounting for most of the memory-related resources.
Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-04-05 20:23:59 +00:00
Edward Tomasz Napierala
c98fe0a557 Add missing stubs. 2011-04-05 19:50:34 +00:00
Sergey Kandaurov
443db69597 Remove malloc type M_NETADDR unused since splitting into vfs_subr.c
and vfs_export.c.

MFC after:	1 week
2011-04-04 16:23:01 +00:00
Edward Tomasz Napierala
6fd8c2bd8a Add accounting for RACCT_NPTS. 2011-04-02 15:02:42 +00:00
Konstantin Belousov
1fe80828e7 After the r219999 is merged to stable/8, rename fallocf(9) to falloc(9)
and remove the falloc() version that lacks flag argument. This is done
to reduce the KPI bloat.

Requested by:	jhb
X-MFC-note:	do not
2011-04-01 13:28:34 +00:00
Konstantin Belousov
7332c129e0 Add support for executing the FreeBSD 1/i386 a.out binaries on amd64.
In particular:
- implement compat shims for old stat(2) variants and ogetdirentries(2);
- implement delivery of signals with ancient stack frame layout and
  corresponding sigreturn(2);
- implement old getpagesize(2);
- provide a user-mode trampoline and LDT call gate for lcall $7,$0;
- port a.out image activator and connect it to the build as a module
  on amd64.

The changes are hidden under COMPAT_43.

MFC after:   1 month
2011-04-01 11:16:29 +00:00
Edward Tomasz Napierala
58c77a9d53 Enable accounting for RACCT_NPROC and RACCT_NTHR.
Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-03-31 19:22:11 +00:00
Edward Tomasz Napierala
e4dcb7046a Notify racct when process credentials change.
Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-03-31 18:12:04 +00:00
Fabien Thomas
586cb6ec77 Clearing the flag when preempting will let the preempted thread run
too much time. This can finish in a scheduler deadlock with ping-pong
between two threads.

One sample of this is:
- device lapic (to have a preemption point on critical_exit())
- options DEVICE_POLLING with HZ>1499 (to have lapic freq = hardclock freq)
- running a cpu intensive task (that does not enter the kernel)
- only one CPU on SMP or no SMP.

As requested by jhb@ 4BSD have received the same type of fix instead of
propagating the flag to the new thread.

Reviewed by:	jhb, jeff
MFC after:	1 month
2011-03-31 13:59:47 +00:00
Edward Tomasz Napierala
66db16fc49 Regenerate. 2011-03-30 17:59:54 +00:00
Edward Tomasz Napierala
ec125fbbc5 Add rctl. It's used by racct to take user-configurable actions based
on the set of rules it maintains and the current resource usage.  It also
privides userland API to manage that ruleset.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-03-30 17:48:15 +00:00
Konstantin Belousov
8666550972 Provide compat32 shims for kldstat(2).
Requested and tested by:	jpaetzel
MFC after:	1 week
2011-03-30 14:46:12 +00:00
Edward Tomasz Napierala
d31b45e164 Remove pointless (always true) KASSERTs.
Submitted by:	pjd
2011-03-29 19:19:10 +00:00
Edward Tomasz Napierala
097055e26d Add racct. It's an API to keep per-process, per-jail, per-loginclass
and per-loginclass resource accounting information, to be used by the new
resource limits code.  It's connected to the build, but the code that
actually calls the new functions will come later.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-03-29 17:47:25 +00:00
Konstantin Belousov
cea8f30a54 Fix the check for vm_map_remove() error.
Pointed out by:	alc
MFC after:	2 weeks
2011-03-28 19:44:54 +00:00
Konstantin Belousov
cce6e354aa Trim white spaces, adjust style.
MFC after:	2 weeks
2011-03-28 13:28:23 +00:00
Konstantin Belousov
937060a843 Handle zero length in copyout_unmap().
Submitted by:	John Wehle <john feith com>
MFC after:	2 weeks
2011-03-28 13:21:26 +00:00
Konstantin Belousov
0f502d1c4e Promote ksyms_map() and ksyms_unmap() to general facility
copyout_map() and copyout_unmap() interfaces.

Submitted by:	John Wehle <john feith com>, nox
MFC after:	2 weeks
2011-03-28 12:48:33 +00:00
Jaakko Heinonen
9dc6abbd8a Fix some style issues in r219925.
Reported by:	bde
MFC after:	1 month
2011-03-26 17:17:24 +00:00
Konstantin Belousov
246d35ec91 Add O_CLOEXEC flag to open(2) and fhopen(2).
The new function fallocf(9), that is renamed falloc(9) with added
flag argument, is provided to facilitate the merge to stable branch.

Reviewed by:	jhb
MFC after:	1 week
2011-03-25 14:00:36 +00:00
John Baldwin
8e6fa660f2 Fix some locking nits with the p_state field of struct proc:
- Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL
  in fork to honor the locking requirements.  While here, expand the scope
  of the PROC_LOCK() on the new process (p2) to avoid some LORs.  Previously
  the code was locking the new child process (p2) after it had locked the
  parent process (p1).  However, when locking two processes, the safe order
  is to lock the child first, then the parent.
- Fix various places that were checking p_state against PRS_NEW without
  having the process locked to use PROC_LOCK().  Every place was already
  locking the process, just after the PRS_NEW check.
- Remove or reduce the use of PROC_SLOCK() for places that were checking
  p_state against PRS_NEW.  The PROC_LOCK() alone is sufficient for reading
  the current state.
- Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once.

MFC after:	1 week
2011-03-24 18:40:11 +00:00
Jaakko Heinonen
3fd8fe5b54 Recognize "ro", "rdonly", "norw", "rw" and "noro" as equal options in
vfs_equalopts(). This allows vfs_sanitizeopts() to filter redundant
occurrences of these options. It was possible that for example both "ro"
and "rw" options became active concurrently.

PR:		kern/133614
Discussed on:	freebsd-hackers
MFC after:	1 month
2011-03-23 17:56:38 +00:00
Alan Cox
e9a3f7852d Modestly increase the maximum allowed size of the kmem map on i386.
Also, express this new maximum as a fraction of the kernel's address
space size rather than a constant so that increasing KVA_PAGES will
automatically increase this maximum.  As a side-effect of this change,
kern.maxvnodes will automatically increase by a proportional amount.

While I'm here ensure that this change doesn't result in an unintended
increase in maxpipekva on i386.  Calculate maxpipekva based upon the
size of the kernel address space and the amount of physical memory
instead of the size of the kmem map.  The memory backing pipes is not
allocated from the kmem map.  It is allocated from its own submap of
the kernel map.  In short, it has no real connection to the kmem map.
(In fact, the commit messages for the maxpipekva auto-sizing talk
about using the kernel map size, cf. r117325 and r117391, even though
the implementation actually used the kmem map size.)  Although the
calculation is now done differently, the resulting value for
maxpipekva should remain almost the same on i386.  However, on amd64,
the value will be reduced by 2/3.  This is intentional.  The recent
change to VM_KMEM_SIZE_SCALE on amd64 for the benefit of ZFS also had
the unnecessary side-effect of increasing maxpipekva.  This change is
effectively restoring maxpipekva on amd64 to its prior value.

Eliminate init_param3() since it is no longer used.
2011-03-23 16:38:29 +00:00
John Baldwin
c3b127e022 Small style fix. 2011-03-23 13:44:32 +00:00
Edward Tomasz Napierala
999d680c92 Make UFS use PSARC/2010/029 NFSv4 ACL semantics by default, bringing
it in line with ZFSv28.

X-MFC-After:	ZFSv28.
2011-03-22 19:52:29 +00:00
Edward Tomasz Napierala
cdec385674 Move the code around so that libc behaviour does not depend on a variable
that was supposed to be kernel-only.  There should be no functional changes.
2011-03-22 17:44:07 +00:00
Jeff Roberson
e4cd31dd3c - Merge changes to the base system to support OFED. These include
a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND,
   and other miscellaneous small features.
2011-03-21 09:40:01 +00:00
Alan Cox
09a196a7de Update a comment. The sending process has not mapped the buffer pages
since before r127501.  Strictly speaking, the buffer pages are not
"wired".  They remain in the paging queues.  However, they are pinned in
memory using vm_page_hold().
2011-03-20 15:04:43 +00:00
Ivan Voras
630db7f99b The hardware has caught up; improvements are now observed even at 128,
but stay conservative and bump read_max to "only" 64 (it will probably be
a good idea to increase this to 128 after the next major release).
2011-03-16 16:22:59 +00:00
Andriy Gapon
56ede1074e add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls
This commits makes necessary changes in syscall/sysent generation
infrastructure.

PR:		kern/152822
Submitted by:	Artem Belevich <fbsdlist@src.cx>
Reviewed by:	jhb (ealier version)
MFC after:	3 weeks
2011-03-12 08:51:43 +00:00
Dmitry Chagin
e5d81ef1b5 Extend struct sysvec with new method sv_schedtail, which is used for an
explicit process at fork trampoline path instead of eventhadler(schedtail)
invocation for each child process.

Remove eventhandler(schedtail) code and change linux ABI to use newly added
sysvec method.

While here replace explicit comparing of module sysentvec structure with the
newly created process sysentvec to detect the linux ABI.

Discussed with:	kib

MFC after:	2 Week
2011-03-08 19:01:45 +00:00
John Baldwin
e84c2db137 When constructing a new cpuset, apply the parent cpuset's mask to the new
set's mask rather than the root mask.  This was causing the root mask to
be modified incorrectly.

Reviewed by:	jeff
MFC after:	1 week
2011-03-08 14:18:21 +00:00
Konstantin Belousov
fd7032e1b3 Do not assert buffer lock in VFS_STRATEGY() when kernel already paniced.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2011-03-08 11:50:59 +00:00
Konstantin Belousov
0ad4dd9a00 The execution of the shebang script requires putting interpreter path,
possible option and script path in the place of argv[0] supplied to
execve(2).  It is possible and valid for the substitution to be shorter
then the argv[0].

Avoid signed underflow in this case.

Submitted by:	Devon H. O'Dell <devon.odell gmail com>
PR:	kern/155321
MFC after:	1 week
2011-03-06 22:59:30 +00:00
Edward Tomasz Napierala
8a6f498522 Temporarily revert r219272; it breaks acl_is_trivial_np(3). 2011-03-06 20:12:09 +00:00
Dmitry Chagin
de60a5f38c Style(9) fix.
Fix indentation in comment, double ';' in variable declaration.

MFC after:	1 Week
2011-03-05 20:54:17 +00:00
Dmitry Chagin
22ec040605 Partially reworked r219042.
The reason for this is a bug at ktrops() where process dereferenced
without having a lock. This might cause a panic if ktrace was runned
with -p flag and the specified process exited between the dropping
a lock and writing sv_flags.

Since it is impossible to acquire sx lock while holding mtx switch
to use asynchronous enqueuerequest() instead of writerequest().

Rename ktr_getrequest_ne() to more understandable name [1].

Requested by:	jhb [1]

MFC after:	1 Week
2011-03-05 20:36:42 +00:00
Edward Tomasz Napierala
7123f4cd6f Export login class information via kinfo and make it possible to view
it using "ps -o class".
2011-03-05 14:41:49 +00:00
Edward Tomasz Napierala
e776709347 Regenerate. 2011-03-05 12:46:24 +00:00
Edward Tomasz Napierala
2bfc50bc4f Add two new system calls, setloginclass(2) and getloginclass(2). This makes
it possible for the kernel to track login class the process is assigned to,
which is required for RCTL.  This change also make setusercontext(3) call
setloginclass(2) and makes it possible to retrieve current login class using
id(1).

Reviewed by:	kib (as part of a larger patch)
2011-03-05 12:40:35 +00:00
Edward Tomasz Napierala
18ac6e83dc Make UFS use PSARC/2010/029 NFSv4 ACL semantics by default, just like
ZFSv28 does.

MFC after:	2 months
2011-03-04 19:53:07 +00:00
Alexander Leidinger
d783bbd2d2 - Add a FEATURE for capsicum (security_capabilities).
- Rename mac FEATURE to security_mac.

Discussed with:	rwatson
2011-03-04 09:03:54 +00:00
Edward Tomasz Napierala
953bb3b992 Make "struct pts_softc" point to ucred instead of uidinfo. This is no-op,
required for resource containers.

Reviewed by:	kib (as part of a larger patch), ed
2011-03-03 17:33:22 +00:00
John Baldwin
88690d6a73 Similar to 189574, properly handle subclasses of bus drivers when deleting
a driver during kldunload.  Specifically, recursively walk the tree of
subclasses of a given driver attachment's bus device class detaching all
instances of that driver for each class and its subclasses.

Reported by:	bschmidt
Reviewed by:	imp
MFC after:	1 week
2011-03-01 14:43:37 +00:00
Robert Watson
fc94e4476b Continue introducing Capsicum capability mode support:
If a system call wasn't listed in capabilities.conf, return ECAPMODE at
syscall entry.

Reviewed by:	anderson
Discussed with:	benl, kris, pjd
Sponsored by:	Google, Inc.
Obtained from:	Capsicum Project
MFC after:	3 months
2011-03-01 13:32:07 +00:00
Robert Watson
ddfe0c2ba4 Regenerate system call files following addition of cap_enter(2),
cap_getmode(2), and capabilities.conf.

Reviewed by:	anderson
Discussed with:	benl, kris, pjd
Obtained from:	Capsicum Project
Sponsored by:	Google, Inc.
MFC after:	3 months
2011-03-01 13:30:23 +00:00
Robert Watson
08e6d9fad8 Continue to introduce Capsicum Capability Mode support:
Add a new system call flag, SYF_CAPENABLED, which indicates that a
particular system call is available in capability mode.

Add a new configuration file, kern/capabilities.conf (similar files
may be introduced for other ABIs in the future), which enumerates
system calls that are available in capability mode.  When a new
system call is added to syscalls.master, it will also need to be
added here (if needed).  Teach sysent parts to use this file to set
values for SYF_CAPENABLED for the native ABI.

Reviewed by:	anderson
Discussed with:	benl, kris, pjd
Obtained from:	Capsicum Project
MFC after:	3 months
2011-03-01 13:28:27 +00:00
Robert Watson
96fcc75fdf Add initial support for Capsicum's Capability Mode to the FreeBSD kernel,
compiled conditionally on options CAPABILITIES:

Add a new credential flag, CRED_FLAG_CAPMODE, which indicates that a
subject (typically a process) is in capability mode.

Add two new system calls, cap_enter(2) and cap_getmode(2), which allow
setting and querying (but never clearing) the flag.

Export the capability mode flag via process information sysctls.

Sponsored by:	Google, Inc.
Reviewed by:	anderson
Discussed with:	benl, kris, pjd
Obtained from:	Capsicum Project
MFC after:	3 months
2011-03-01 13:23:37 +00:00
Dmitry Chagin
7705d4b24a Introduce preliminary support of the show description of the ABI of
traced process by adding two new events which records value of process
sv_flags to the trace file at process creation/execing/exiting time.

MFC after:	1 Month.
2011-02-25 22:05:33 +00:00
Dmitry Chagin
b4c20e5e37 ktrace_resize_pool() locking slightly reworked:
1) do not take a lock around the single atomic operation.
2) do not lose the invariant of lock by dropping/acquiring
   ktrace_mtx around free() or malloc().

MFC after:	1 Month.
2011-02-25 22:03:28 +00:00
Alexander Leidinger
9a253c101e Make the description of the feature consistent with another similar
description for another feature.

Noticed by:	trasz
2011-02-25 12:46:43 +00:00
Alexander Leidinger
de5b19526b Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/
PMC/SYSV/...).

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.

Sponsored by:   Google Summer of Code 2010
Submitted by:   kibab
Reviewed by:    arch@ (parts by rwatson, trasz, jhb)
X-MFC after:    to be determined in last commit with code from this project
2011-02-25 10:11:01 +00:00
Sergey Kandaurov
c0bc8d1008 Clean up the now unused #include statement.
Approved by:	kib (mentor)
MFC after:	1 week
X-MFC with:	r218972
2011-02-23 18:22:40 +00:00
Konstantin Belousov
25a9cfc9e8 Move the max_threads_per_proc and max_threads_hits variables to the
file where they are used. Declare the kern.threads sysctl node at the
same location. Since no external use for the variables exists, make them
static.

Discussed with:	dchagin
MFC after:	1 week
2011-02-23 13:50:24 +00:00
John Baldwin
a328d5359c Revert previous change, the existing check was correct.
Pointy hat to:	jhb
2011-02-23 13:25:42 +00:00
John Baldwin
3379ac59af Expose the umtx_key structure and API to the rest of the kernel.
MFC after:	3 days
2011-02-23 13:19:14 +00:00
John Baldwin
329b4acb91 Fix off-by-one error in check against max_threads_per_proc.
Submitted by:	arundel
MFC after:	1 week
2011-02-23 12:56:25 +00:00
Rebecca Cran
6bccea7c2b Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
Jaakko Heinonen
da2e368f70 Don't restore old mount options and flags if VFS_MOUNT(9) succeeds but
vfs_export() fails. Restoring old options and flags after successful
VFS_MOUNT(9) call may cause the file system internal state to become
inconsistent with mount options and flags. Specifically the FFS super
block fs_ronly field and the MNT_RDONLY flag may get out of sync.

PR:		kern/133614
Discussed on:	freebsd-hackers
2011-02-19 14:27:14 +00:00
Matthew D Fleming
3a5d36716f Modify kdb_trap() so that it re-calls the dbbe_trap function as long as
the debugger back-end has changed.  This means that switching from ddb
to gdb no longer requires a "step" which can be dangerous on an
already-crashed kernel.

Also add a capability to get from the gdb back-end back to ddb, by
typing ^C in the console window.

While here, simplify kdb_sysctl_available() by using
sbuf_new_for_sysctl(), and use strlcpy() instead of strncpy() since the
strlcpy semantic is desired.

MFC after:	1 month
2011-02-18 22:25:11 +00:00
Bjoern A. Zeeb
1fb51a12f2 Mfp4 CH=177274,177280,177284-177285,177297,177324-177325
VNET socket push back:
  try to minimize the number of places where we have to switch vnets
  and narrow down the time we stay switched.  Add assertions to the
  socket code to catch possibly unset vnets as seen in r204147.

  While this reduces the number of vnet recursion in some places like
  NFS, POSIX local sockets and some netgraph, .. recursions are
  impossible to fix.

  The current expectations are documented at the beginning of
  uipc_socket.c along with the other information there.

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by:  jhb
  Tested by:    zec

Tested by:	Mikolaj Golub (to.my.trociny gmail.com)
MFC after:	2 weeks
2011-02-16 21:29:13 +00:00
Bjoern A. Zeeb
bf9ce95bd2 Mfp4 CH=177256:
Catch a set vnet upon return to user space. This usually
  means return paths with CURVNET_RESTORE() missing.

  If VNET_DEBUG is turned on we can even tell the function
  that did the CURVNET_SET() which is really helpful; else
  we print "N/A".

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by:  jhb

MFC after:	11 days
2011-02-14 20:49:37 +00:00
Daniel Eischen
f7e6ce6d7a Allow the SO_SETFIB socket option to select the default (0)
routing table.

Reviewed by:	julian
2011-02-13 00:14:13 +00:00
Alan Cox
d7b20e4b45 Retire VFS_BIO_DEBUG. Convert those checks that were still valid into
KASSERT()s and eliminate the rest.

Replace excessive printf()s and a panic() in bufdone_finish() with a
KASSERT() in vm_page_io_finish().

Reviewed by:	kib
2011-02-12 01:00:00 +00:00
Juli Mallett
37142d9e87 With smp_topo_none, set cg_mask to all_cpus rather than setting the mp_ncpus
low bits.

Submitted by:	Bhanu Prakash
Reviewed by:	jeffr
2011-02-11 22:43:10 +00:00
Bjoern A. Zeeb
0028e52461 Mfp4 CH=177255:
Make VNET_ASSERT() available with either VNET_DEBUG or INVARIANTS.

  Change the syntax to match KASSERT() to allow more flexible panic
  messages rather than having a printf with hardcoded arguments
  before panic.

  Adjust the few assertions we have to the new format (and enhance
  the output).

  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Reviewed by:	jhb

MFC after:	2 weeks
2011-02-11 13:27:00 +00:00
Marcel Moolenaar
278e79707e Provide convenience function for obtaining MODINFO_ADDR and MODINFO_SIZE
attributes for preloaded modules/images. In particular, MODINFO_ADDR has
the added complexity of not always being relocated properly. Rather than
kluging this in the various components that are affected, we handle it
in a centralized place (preload_fetch_addr()). To that end, expose a new
variable, preload_addr_relocate, that MD initialization code can set and
that turns the address attribute into a valid kernel VA.

Architectures that need the relocation: arm & powerpc (at least).
Components that can utilize this: acpi(4), md(4), fb(4), pci(4), ZFS, geli.

Sponsored by: Juniper Networks
2011-02-09 19:08:21 +00:00
Matthew D Fleming
13434232a6 Remove the uio_yield prototype and symbol. This function has been
misnamed since it was introduced and should not be globally exposed
with this name.  The equivalent functionality is now available using
kern_yield(curthread->td_user_pri).  The function remains
undocumented.

Bump __FreeBSD_version.
2011-02-08 00:36:46 +00:00
Matthew D Fleming
e7ceb1e99b Based on discussions on the svn-src mailing list, rework r218195:
- entirely eliminate some calls to uio_yeild() as being unnecessary,
   such as in a sysctl handler.

 - move should_yield() and maybe_yield() to kern_synch.c and move the
   prototypes from sys/uio.h to sys/proc.h

 - add a slightly more generic kern_yield() that can replace the
   functionality of uio_yield().

 - replace source uses of uio_yield() with the functional equivalent,
   or in some cases do not change the thread priority when switching.

 - fix a logic inversion bug in vlrureclaim(), pointed out by bde@.

 - instead of using the per-cpu last switched ticks, use a per thread
   variable for should_yield().  With PREEMPTION, the only reasonable
   use of this is to determine if a lock has been held a long time and
   relinquish it.  Without PREEMPTION, this is essentially the same as
   the per-cpu variable.
2011-02-08 00:16:36 +00:00
Konstantin Belousov
6f9ec5aab0 Clear the padding when returning context to the usermode, for
MI ucontext_t and x86 MD parts.
Kernel allocates the structures on the stack, and not clearing
reserved fields and paddings causes leakage.

Noted and discussed with:	bde
MFC after:	2 weeks
2011-02-05 15:10:27 +00:00
John Baldwin
f7488600c0 Always assert that the turnstile chain lock is held in turnstile_wait()
and remove a duplicate hash lookup.

MFC after:	1 week
2011-02-04 14:16:41 +00:00
Alan Cox
8189ac85e9 Eliminate unnecessary page hold_count checks. These checks predate
r90944, which introduced a general mechanism for handling the freeing
of held pages.

Reviewed by:	kib@
2011-02-03 14:42:46 +00:00
Matthew D Fleming
08b163fa51 Put the general logic for being a CPU hog into a new function
should_yield().  Use this in various places.  Encapsulate the common
case of check-and-yield into a new function maybe_yield().

Change several checks for a magic number of iterations to use
should_yield() instead.

MFC after:	1 week
2011-02-02 16:35:10 +00:00
Konstantin Belousov
f7780c61e7 The unp_gc() function drops and reaquires lock between scan and
collect phases.  The unp_discard() function executes
unp_externalize_fp(), which might make the socket eligible for gc-ing,
and then, later, taskqueue will close the socket.  Since unp_gc()
dropped the list lock to do the malloc, close might happen after the
mark step but before the collection step, causing collection to not
find the socket and miss one array element.

I believe that the race was there before r216158, but the stated
revision made the window much wider by postponing the close to
taskqueue sometimes.

Only process as much array elements as we find the sockets during
second phase of gc [1].  Take linkage lock and recheck the eligibility
of the socket for gc, as well as call fhold() under the linkage lock.

Reported and tested by:	jmallett
Submitted by:   jmallett [1]
Reviewed by:	rwatson, jeff (possibly)
MFC after:	1 week
2011-02-01 13:33:49 +00:00
Konstantin Belousov
9ca9fc5380 If more than one thread allocated sf buffers for sendfile(2), and
each of the threads needs more while current pool of the buffers is
exhausted, then neither thread can make progress.

Switch to nowait allocations after we got first buffer already.

Reported by:	az
Reviewed by:	alc (previous version)
Tested by:	pho
MFC after:	1 week
2011-01-28 17:37:09 +00:00
Jilles Tjoelker
90750179ec Do not trip a KASSERT if /dev/null cannot be opened for a setuid program.
The fdcheckstd() function makes sure fds 0, 1 and 2 are open by opening
/dev/null. If this fails (e.g. missing devfs or wrong permissions),
fdcheckstd() will return failure and the process will exit as if it received
SIGABRT. The KASSERT is only to check that kern_open() returns the expected
fd, given that it succeeded.

Tripping the KASSERT is most likely if fd 0 is open but fd 1 or 2 are not.

MFC after:	2 weeks
2011-01-28 15:29:35 +00:00
Matthew D Fleming
00f0e671ff Explicitly wire the user buffer rather than doing it implicitly in
sbuf_new_for_sysctl(9).  This allows using an sbuf with a SYSCTL_OUT
drain for extremely large amounts of data where the caller knows that
appropriate references are held, and sleeping is not an issue.

Inspired by:	rwatson
2011-01-27 00:34:12 +00:00
Matthew D Fleming
73d6f8516d Remove the CTLFLAG_NOLOCK as it seems to be both unused and
unfunctional.  Wiring the user buffer has only been done explicitly
since r101422.

Mark the kern.disks sysctl as MPSAFE since it is and it seems to have
been mis-using the NOLOCK flag.

Partially break the KPI (but not the KBI) for the sysctl_req 'lock'
field since this member should be private and the "REQ_LOCKED" state
seems meaningless now.
2011-01-26 22:48:09 +00:00
Dmitry Chagin
a5c1afadeb Add macro to test the sv_flags of any process. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures.

MFC after:	1 month
2011-01-26 20:03:58 +00:00
Konstantin Belousov
dbccdf7684 When vtruncbuf() iterates over the vnode buffer list, lock buffer object
before checking the validity of the next buffer pointer. Otherwise, the
buffer might be reclaimed after the check, causing iteration to run into
wrong buffer.

Reported and tested by:	pho
MFC after:	1 week
2011-01-25 14:04:02 +00:00
Konstantin Belousov
6fa39a7327 Allow debugger to specify that children of the traced process should be
automatically traced. Extend the ptrace(PL_LWPINFO) to report that child
just forked.

Reviewed by:	davidxu, jhb
MFC after:	2 weeks
2011-01-25 10:59:21 +00:00
Jaakko Heinonen
1a4fbae871 Replace spaces with tabs. 2011-01-24 17:08:26 +00:00
Sergey Kandaurov
4053b05b91 Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize.
Submitted by:	perryh pluto.rain.com (previous version)
Reviewed by:	jhb
Approved by:	kib (mentor)
Tested by:	universe
2011-01-21 10:26:26 +00:00
Matthew D Fleming
cbc134ad03 Introduce signed and unsigned version of CTLTYPE_QUAD, renaming
existing uses.  Rename sysctl_handle_quad() to sysctl_handle_64().
2011-01-19 23:00:25 +00:00
Matthew D Fleming
2fee06f087 Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need
to rely on the format string.
2011-01-18 21:14:18 +00:00
John Baldwin
2dc29adb9f Rework realtime priority support:
- Move the realtime priority range up above kernel sleep priorities and
  just below interrupt thread priorities.
- Contract the interrupt and kernel sleep priority ranges a bit so that
  the timesharing priority band can be increased.  The new timeshare range
  is now slightly larger than the old realtime + timeshare ranges.
- Change the ULE scheduler to no longer use realtime priorities for
  interactive threads.  Instead, the larger timeshare range is now split
  into separate subranges for interactive and non-interactive ("batch")
  threads.  The end result is that interactive threads and non-interactive
  threads still use the same priority ranges as before, but realtime
  threads now have a separate, dedicated priority range.
- Do not modify the priority of non-timeshare threads in sched_sleep()
  or via cv_broadcastpri().  Realtime and idle priority threads will
  no longer have their priorities affected by sleeping in the kernel.

Reviewed by:	jeff
2011-01-14 17:06:54 +00:00
Matthew D Fleming
52c0b557cc One more sysctl(9) type-safety that I missed before. 2011-01-13 18:20:37 +00:00
Matthew D Fleming
240577c2a7 Fix up a few more sysctl(9) mis-typing found in various LINT builds. 2011-01-13 18:20:27 +00:00
John Baldwin
12d56c0f63 Introduce two new helper macros to define the priority ranges used for
interactive timeshare threads (PRI_*_INTERACTIVE) and non-interactive
timeshare threads (PRI_*_BATCH) and use these instead of PRI_*_REALTIME
and PRI_*_TIMESHARE.  No functional change.

Reviewed by:	jeff
2011-01-13 14:22:27 +00:00
Matthew D Fleming
fbbb13f962 sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the kernel changes.
2011-01-12 19:54:19 +00:00
John Baldwin
d330520523 - Retire some unused ithread priorities: PI_TTYHIGH, PI_TAPE, and
PI_DISKLOW.  While here, rename PI_TTYLOW to PI_TTY.
- Add a macro PI_SWI() that takes a SWI_* constant as an argument and
  returns the suitable thread priority.
2011-01-11 22:15:30 +00:00
John Baldwin
c9a8cba456 Always use PRI_BASE() when checking the base type of a thread's priority
class.

MFC after:	2 weeks
2011-01-11 22:13:19 +00:00
John Baldwin
58ccf5b41c Remove unneeded includes of <sys/linker_set.h>. Other headers that use
it internally contain nested includes.

Reviewed by:	bde
2011-01-11 13:59:06 +00:00
Lawrence Stewart
5a29e4d24c Fix hhook_head_is_virtualised() so that "ret" can't be used uninitialised.
Sponsored by:	FreeBSD Foundation
Submitted by:	pjd
MFC after:	9 weeks
X-MFC with:	r216615
2011-01-11 01:11:07 +00:00
Lawrence Stewart
188d9a4947 Fix some minor style/readability nits in hhook.
Sponsored by:	FreeBSD Foundation
Submitted by:	pjd
MFC after:	9 weeks
X-MFC with:	r216615
2011-01-11 00:29:17 +00:00
John Baldwin
789200082c Fix two harmless off-by-one errors.
Reviewed by:	jeff
MFC after:	2 weeks
2011-01-10 20:48:10 +00:00
Bjoern A. Zeeb
8d12fab9ae Improve style and wording of comments and sysctl descriptions [1].
Move machdep.ct_debug to debug.clocktime as there was no reason to
actually put it under machdep in r216340.

Submitted by:	bde [1]
MFC after:	3 days
2011-01-09 14:34:56 +00:00
Nathan Whitehorn
083cfea1ee Make RB_CDROM work. This should probably check for a disc in cd1 and acd1
as well.
2011-01-08 19:50:13 +00:00
Attilio Rao
08e4ac8ad6 Revert r216805.
That revision is introducing a bug which is more visible than problems
it is trying to fix.

As long as my time is very limited in this period I am going to
commit back this patch just once it is fully fixed.

Reported by:	dim, Nicholas Esborn
2011-01-08 18:51:15 +00:00
Konstantin Belousov
26d8f3e11d Use the same expression to report stack protection mode for AT_STACKEXEC
as the expression used by exec_new_vmspace().
2011-01-08 18:41:19 +00:00
Konstantin Belousov
291c06a127 In elf image activator, read and apply the stack protection mode from
PT_GNU_STACK program header, if present and enabled. Two new sysctls
are provided, kern.elf32.nxstack and kern.elf64.nxstack, that allow to
enable PT_GNU_STACK for ABIs of specified bitsize, if ABI decided to
support shared page.

Inform rtld about access mode of the stack initial mapping by
AT_STACKPROT aux vector.

At the moment, the default is disabled, waiting for the usermode
support bits.
2011-01-08 16:30:59 +00:00
Konstantin Belousov
6297a3d843 Create shared (readonly) page. Each ABI may specify the use of page by
setting SV_SHP flag and providing pointer to the vm object and mapping
address. Provide simple allocator to carve space in the page, tailored
to put the code with alignment restrictions.

Enable shared page use for amd64, both native and 32bit FreeBSD
binaries.  Page is private mapped at the top of the user address
space, moving a start of the stack one page down. Move signal
trampoline code from the top of the stack to the shared page.

Reviewed by:	 alc
2011-01-08 16:13:44 +00:00
Konstantin Belousov
ed167eaa80 Collect code to translate between vm_prot_t and p_flags into helper
functions.

MFC after:	1 week
2011-01-08 16:02:14 +00:00
John Baldwin
fd05807822 - Properly initialize the base priority (td_base_pri) of thread0 to PVM
to match the desired priority in td_priority.  Otherwise the first time
  thread0 used a borrowed priority it would drop down to PUSER instead of
  PVM.
- Explicitly initialize the starting priority of new kprocs to PVM to
  avoid inheriting some random priority from thread0.

MFC after:	2 weeks
2011-01-06 22:26:00 +00:00
John Baldwin
22d19207e9 - Move sched_fork() later in fork() after the various sections of the new
thread and proc have been copied and zeroed from the old thread and
  proc.  Otherwise attempts to modify thread or process data in sched_fork()
  could be undone.
- Don't copy td_{base,}_user_pri from the old thread to the new thread in
  sched_fork_thread() in ULE.  This is already done courtesy the bcopy()
  of the thread copy region.
- Always initialize the real priority (td_priority) of new threads to the
  new thread's base priority (td_base_pri) to avoid bogusly inheriting a
  borrowed priority from the parent thread.

MFC after:	2 weeks
2011-01-06 22:24:00 +00:00
John Baldwin
177499ebcc Only change the priority of timeshare threads to PRI_MAX_TIMESHARE
when yield() is called.  Specifically, leave the priority of real time
and idle threads unchanged.

MFC after:	2 weeks
2011-01-06 22:19:15 +00:00
John Baldwin
a8f4344f08 - Restore dropping the priority of syncer down to PPAUSE when it is idle.
This was lost when it was converted to using a condition variable instead
  of lbolt.
- Drop the priority of flowtable down to PPAUSE when it is idle as well
  since it is a similar background task.

MFC after:	2 weeks
2011-01-06 22:17:07 +00:00
John Baldwin
6226ec3ef8 Retire PCONFIG and leave the priority of thread0 alone when waiting for
interrupt config hooks to execute.
2011-01-06 22:09:37 +00:00
Edward Tomasz Napierala
7b956487e9 Fix page fault that occurred when trying to initialize preloaded kernel module,
the dependency of which was preloaded, but failed to initialize.  Previously,
kernel dereferenced NULL pointer returned by modlist_lookup2(); now, when this
happens, we unload the dependent module.  Since the depended_files list is
sorted in dependency order, this properly propagates, unloading modules that
depend on failed ones.

From the user point of view, this prevents the kernel from panicing when
trying to boot kernel compiled without KDTRACE_HOOKS with dtraceall_load="YES"
in /boot/loader.conf.

Reviewed by:	kib
2011-01-05 09:58:41 +00:00