Commit graph

65 commits

Author SHA1 Message Date
Konstantin Belousov
36bcb07ab5 Switch the defaults to not split the RLIMIT_STACK-sized initial thread
stack into the stacks of the created threads.  Add knob
LIBPTHREAD_SPLITSTACK_MAIN to restore the older behaviour.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2014-09-24 12:39:12 +00:00
Konstantin Belousov
6c8ce3bfce Add a knob LIBPTHREAD_BIGSTACK_MAIN, which instructs libthr to leave
the whole RLIMIT_STACK-sized region of the kernel-allocated stack as
the stack of main thread.

By default, the main thread stack is clamped at 2MB (4MB on 64bit
ABIs) and the rest is used for other threads stack allocation.  Since
there is no programmatic way to adjust the size of the main thread
stack, pthread_attr_setstacksize() is too late, the knob allows user
to manage the main stack size both for single-threaded and
multi-threaded processes with the rlimit.

Reported by:	"Ivan A. Kosarev" <ivan@ivan-labs.com>
Tested by:	dim
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2014-08-13 05:53:41 +00:00
Jilles Tjoelker
b18943f3b4 libthr: Always use the threaded rtld lock implementation.
The threaded rtld lock implementation is faster even in the single-threaded
case because it postpones signal handlers via THR_CRITICAL_ENTER and
THR_CRITICAL_LEAVE instead of calling sigprocmask(2).

As a result, exception handling becomes faster in single-threaded
applications linked with libthr.

Reviewed by:	kib
2013-01-18 23:08:40 +00:00
David Xu
a7b84c6512 In suspend_common(), don't wait for a thread which is in creation, because
pthread_suspend_all_np() may have already suspended its parent thread.
Add locking code in pthread_suspend_all_np() to only allow one thread
to suspend other threads, this eliminates a deadlock where two or more
threads try to suspend each others.
2012-08-27 03:09:39 +00:00
David Xu
84ac0fb8ca MFp4:
Enqueue thread in LIFO, this can cause starvation, but it gives better
performance. Use _thr_queuefifo to control the frequency of FIFO vs LIFO,
you can use environment string LIBPTHREAD_QUEUE_FIFO to configure the
variable.
2012-05-03 09:17:31 +00:00
Alexander Kabaev
a805bbe21a Do not set thread name to less than informative 'initial thread'. 2011-06-19 13:35:36 +00:00
David Xu
d1078b0b03 MFp4:
- Add flags CVWAIT_ABSTIME and CVWAIT_CLOCKID for umtx kernel based
  condition variable, this should eliminate an extra system call to get
  current time.

- Add sub-function UMTX_OP_NWAKE_PRIVATE to wake up N channels in single
  system call. Create userland sleep queue for condition variable, in most
  cases, thread will wait in the queue, the pthread_cond_signal will defer
  thread wakeup until the mutex is unlocked, it tries to avoid an extra
  system call and a extra context switch in time window of pthread_cond_signal
  and pthread_mutex_unlock.

The changes are part of process-shared mutex project.
2010-12-22 05:01:52 +00:00
David Xu
bbb64c2143 In current code, statically initialized and destroyed object have
same null value, the code can not distinguish between them, to
fix the problem, now a destroyed object is assigned to a non-null
value, and it will be rejected by some pthread functions.
PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP is changed to number 1, so that
adaptive mutex can be statically initialized correctly.
2010-09-28 04:57:56 +00:00
David Xu
f4213b9006 To support stack unwinding for cancellation points, add -fexceptions flag
for them, two functions _pthread_cancel_enter and _pthread_cancel_leave
are added to let thread enter and leave a cancellation point, it also
makes it possible that other functions can be cancellation points in
libraries without having to be rewritten in libthr.
2010-09-25 01:57:47 +00:00
David Xu
3832fd24f1 add code to support stack unwinding when thread exits. note that only
defer-mode cancellation works, asynchrnous mode does not work because
it lacks of libuwind's support. stack unwinding is not enabled unless
LIBTHR_UNWIND_STACK is defined in Makefile.
2010-09-15 02:56:32 +00:00
David Xu
a9b764e218 Convert thread list lock from mutex to rwlock. 2010-09-13 07:03:01 +00:00
David Xu
ada33a6e36 Change atfork lock from mutex to rwlock, also make mutexes used by malloc()
module private type, when private type mutex is locked/unlocked, thread
critical region is entered or leaved. These changes makes fork()
async-signal safe which required by POSIX. Note that user's atfork handler
still needs to be async-signal safe, but it is not problem of libthr, it
is user's responsiblity.
2010-09-01 03:11:21 +00:00
David Xu
02c3c85869 Add signal handler wrapper, the reason to add it becauses there are
some cases we want to improve:
  1) if a thread signal got a signal while in cancellation point,
     it is possible the TDP_WAKEUP may be eaten by signal handler
     if the handler called some interruptibly system calls.
  2) In signal handler, we want to disable cancellation.
  3) When thread holding some low level locks, it is better to
     disable signal, those code need not to worry reentrancy,
     sigprocmask system call is avoided because it is a bit expensive.
The signal handler wrapper works in this way:
  1) libthr installs its signal handler if user code invokes sigaction
     to install its handler, the user handler is recorded in internal
     array.
  2) when a signal is delivered, libthr's signal handler is invoke,
     libthr checks if thread holds some low level lock or is in critical
     region, if it is true, the signal is buffered, and all signals are
     masked, once the thread leaves critical region, correct signal
     mask is restored and buffered signal is processed.
  3) before user signal handler is invoked, cancellation is temporarily
     disabled, after user signal handler is returned, cancellation state
     is restored, and pending cancellation is rescheduled.
2010-09-01 02:18:33 +00:00
David Xu
9b0f1823b5 Use umtx to implement process sharable semaphore, to make this work,
now type sema_t is a structure which can be put in a shared memory area,
and multiple processes can operate it concurrently.
User can either use mmap(MAP_SHARED) + sem_init(pshared=1) or use sem_open()
to initialize a shared semaphore.
Named semaphore uses file system and is located in /tmp directory, and its
file name is prefixed with 'SEMD', so now it is chroot or jail friendly.
In simplist cases, both for named and un-named semaphore, userland code
does not have to enter kernel to reduce/increase semaphore's count.
The semaphore is designed to be crash-safe, it means even if an application
is crashed in the middle of operating semaphore, the semaphore state is
still safely recovered by later use, there is no waiter counter maintained
by userland code.
The main semaphore code is in libc and libthr only has some necessary stubs,
this makes it possible that a non-threaded application can use semaphore
without linking to thread library.
Old semaphore implementation is kept libc to maintain binary compatibility.
The kernel ksem API is no longer used in the new implemenation.

Discussed on: threads@
2010-01-05 02:37:59 +00:00
David Xu
850f4d66cb - Reduce function call overhead for uncontended case.
- Remove unused flags MUTEX_FLAGS_* and their code.
- Check validity of the timeout parameter in mutex_self_lock().
2008-05-29 07:57:33 +00:00
David Xu
6d9517bc9f _vfork is not in libthr, remove the reference. 2008-04-16 03:19:11 +00:00
David Xu
54dff16b26 Use cpuset defined in pthread_attr for newly created thread, for now,
we set scheduling parameters and cpu binding fully in userland, and
because default scheduling policy is SCHED_RR (time-sharing), we set
default sched_inherit to PTHREAD_SCHED_INHERIT, this saves a system
call.
2008-03-05 07:01:20 +00:00
David Xu
a759db946a implement pthread_attr_getaffinity_np and pthread_attr_setaffinity_np. 2008-03-04 03:03:24 +00:00
David Xu
7416cdabcd Add my recent work of adaptive spin mutex code. Use two environments variable
to tune pthread mutex performance:
1. LIBPTHREAD_SPINLOOPS
	If a pthread mutex is being locked by another thread, this environment
	variable sets total number of spin loops before the current thread
	sleeps in kernel, this saves a syscall overhead if the mutex will be
	unlocked very soon (well written application code).
2. LIBPTHREAD_YIELDLOOPS
	If a pthread mutex is being locked by other threads, this environment
	variable sets total number of sched_yield() loops before the currrent
	thread sleeps in kernel. if a pthread mutex is locked, the current thread
	gives up cpu, but will not sleep in kernel, this means, current thread
	does not set contention bit in mutex, but let lock owner to run again
	if the owner is on kernel's run queue, and when lock owner unlocks the
	mutex, it does not need to enter kernel and do lots of work to resume
	mutex waiters, in some cases, this saves lots of syscall overheads for
	mutex owner.

In my practice, sometimes LIBPTHREAD_YIELDLOOPS can massively improve performance
than LIBPTHREAD_SPINLOOPS, this depends on application. These two environments
are global to all pthread mutex, there is no interface to set them for each
pthread mutex, the default values are zero, this means spinning is turned off
by default.
2007-10-30 05:57:37 +00:00
David Xu
00784f8b10 backout experimental adaptive spinning mutex for product use. 2007-05-09 08:39:33 +00:00
David Xu
74c751131b get LIBPTHREAD_ADAPTIVE_SPIN early, so it can be used for some global
mutexes.
2006-12-20 05:05:44 +00:00
David Xu
842a092b74 Check environment variable PTHREAD_ADAPTIVE_SPIN, if it is set, use
it as a default spin cycle count.
2006-12-20 04:43:34 +00:00
David Xu
d99f6dac14 - Remove variable _thr_scope_system, all threads are system scope.
- Rename _thr_smp_cpus to boolean variable _thr_is_smp.
- Define CPU_SPINWAIT macro for each arch, only X86 supports it.
2006-12-15 11:52:01 +00:00
David Xu
f08e1bf682 Eliminate atomic operations in thread cancellation functions, it should
reduce overheads of cancellation points.
2006-11-24 09:57:38 +00:00
David Xu
e6747c7ce1 use rtprio_thread system call to get or set thread priority. 2006-09-21 04:21:30 +00:00
David Xu
bddd24cd9c Replace internal usage of struct umtx with umutex which can supports
real-time if we want, no functionality is changed.
2006-09-06 04:04:10 +00:00
David Xu
8ab9d78b9d Use umutex APIs to implement pthread_mutex, member pp_mutexq is added
into pthread structure to keep track of locked PTHREAD_PRIO_PROTECT mutex,
no real mutex code is changed, the mutex locking and unlocking code should
has same performance as before.
2006-08-28 04:52:50 +00:00
David Xu
6b73f08519 Get number of CPUs and ignore spin count on single processor machine. 2006-08-08 04:42:41 +00:00
David Xu
05c3a5eab4 1. Don't override underscore version of aio_suspend(), system(),
wait(), waitpid() and usleep(), they are internal versions and
   should not be cancellation points.
2. Make wait3() as a cancellation point.
3. Move raise() and pause() into file thr_sig.c.
4. Add functions _sigsuspend, _sigwait, _sigtimedwait and _sigwaitinfo,
   remove SIGCANCEL bit in wait-set for those functions, the signal is
   used internally to implement thread cancellation.
2006-07-25 12:50:05 +00:00
David Xu
e2dc286c1c Caching scheduling policy and priority in userland, a critical but baddly
written application is frequently changing thread priority for SCHED_OTHER
policy.
2006-07-13 22:45:19 +00:00
David Xu
7b4f8f037f Use kernel facilities to support real-time scheduling. 2006-07-12 06:13:18 +00:00
David Xu
245116cafc - Use same priority range returned by kernel's sched_get_priority_min()
and sched_get_priority_max() syscalls.
- Remove unused fields from structure pthread_attr.
2006-04-27 08:18:23 +00:00
David Xu
37a6356bbe WARNS level 4 cleanup. 2006-04-04 02:57:49 +00:00
David Xu
9ad4b64459 Remove priority mutex code because it does not work correctly,
to make it work, turnstile like mechanism to support priority
propagating and other realtime scheduling options in kernel
should be available to userland mutex, for the moment, I just
want to make libthr be simple and efficient thread library.

Discussed with: deischen, julian
2006-03-27 23:50:21 +00:00
David Xu
6f79c82641 Set default contention scope to system. 2006-03-20 03:14:14 +00:00
Daniel Eischen
f4cd2a5b5c Add some more pthread stubs so that librt can use them.
The thread jump table has been resorted, so you need to
keep libc, libpthread, and libthr in sync.

Submitted by:	xu
2006-03-05 18:10:28 +00:00
David Xu
6f716c2f82 Rework last change of pthread_once, create a function _thr_once_init to
reinitialize its internal locks.
2006-02-15 23:05:03 +00:00
David Xu
c5d2fb8df7 After fork(), reinitialize internal locks for pthread_once(). 2006-02-15 13:41:02 +00:00
David Xu
8956297a57 Now, thread name is stored in kernel, userland no longer has to keep it. 2006-02-05 03:04:54 +00:00
David Xu
9572a73405 Use macro STATIC_LIB_REQUIRE to declare a symbol should be linked into
static binary.
2006-01-10 04:53:03 +00:00
David Xu
cf905a1575 1. Retire macro SCLASS, instead simply use language keyword and
put variables in thr_init.c.
2. Hide all global symbols which won't be exported.
2005-12-21 03:14:06 +00:00
David Xu
53bbdf8646 Add code to handle timer_delete(). The timer wrapper code is completely
rewritten, now timers created with same sigev_notify_attributes will
run in same thread, this allows user to organize which timers can
run in same thread to save some thread resource.
2005-11-01 06:53:22 +00:00
David Xu
7a4cd8d366 Conditionally report initial thread event. 2005-04-12 03:13:49 +00:00
David Xu
d245d9e13f Add debugger event reporting support, current only TD_CREATE and TD_DEATH
events are reported.
2005-04-12 03:00:28 +00:00
David Xu
bc1eb018c1 Remove unique id field which is no longer used by debugger. 2005-04-06 13:57:31 +00:00
David Xu
a091d823ad Import my recent 1:1 threading working. some features improved includes:
1. fast simple type mutex.
 2. __thread tls works.
 3. asynchronous cancellation works ( using signal ).
 4. thread synchronization is fully based on umtx, mainly, condition
    variable and other synchronization objects were rewritten by using
    umtx directly. those objects can be shared between processes via
    shared memory, it has to change ABI which does not happen yet.
 5. default stack size is increased to 1M on 32 bits platform, 2M for
    64 bits platform.
As the result, some mysql super-smack benchmarks show performance is
improved massivly.

Okayed by: jeff, mtm, rwatson, scottl
2005-04-02 01:20:00 +00:00
Joe Marcus Clarke
c5a6625e3e Increase the default stacksizes:
32-bit		64-bit
main thread	2 MB		4 MB
other threads	1 MB		2 MB

Approved by:	mtm
Adapted from:	libpthread
2005-03-06 07:56:18 +00:00
David Schultz
6004362e66 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
Mike Makonnen
03d74100cf Implement pthread_atfork in libthr. This is mostly from deichen's
work in libpthread.

Submitted by: Dan Nelson <dnelson@allantgroup.com>
2004-06-27 10:01:35 +00:00
Mike Makonnen
356c2d4f58 In the case that the global thread list is being re-initialized after
a fork, make sure that the current thread isn't detached and freed. As
a consequence the thread should be inserted into the head of the
active list only once (in the beginning).
2004-06-27 09:53:06 +00:00