Commit graph

259 commits

Author SHA1 Message Date
Pedro F. Giffuni
83ef78be95 sys/i386: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:08:52 +00:00
Dmitry Chagin
c151945c86 Avoid using [LINUX_]SHAREDPAGE constant directly in the vdso code.
This is needed for https://reviews.freebsd.org/D11780.

Reported by:	kib@
2017-07-30 21:24:20 +00:00
Dmitry Chagin
a0c59c7afd Add support for musl consumers to the Linuxulator.
PR:		213809
Submitted by:	Yonas Yanfa
Reported by:	Yonas Yanfa
MFC after:	1 week
Relnotes:	yes
2017-07-03 10:24:49 +00:00
Konstantin Belousov
2d88da2f06 Move struct syscall_args syscall arguments parameters container into
struct thread.

For all architectures, the syscall trap handlers have to allocate the
structure on the stack.  The structure takes 88 bytes on 64bit arches
which is not negligible.  Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already.  The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.

The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.

This move will also allow several more uses shortly.

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 21:03:23 +00:00
Pedro F. Giffuni
b66bb393f2 Cleanup redundant parenthesis from existing howmany()/roundup() macro uses. 2016-04-22 16:57:42 +00:00
Pedro F. Giffuni
ea24b0561f X86: use our nitems() macro when it is avaliable through param.h.
No functional change, only trivial cases are done in this sweep,

Discussed in:	freebsd-current
2016-04-19 23:41:46 +00:00
Baptiste Daroussin
b6348be7b9 Add kern.features flags for linux and linux64 modules
kern.features.linux: 1 meaning linux 32 bits binaries are supported
kern.features.linux64: 1 meaning linux 64 bits binaries are supported

The goal here is to help 3rd party applications (including ports) to determine
if the host do support linux emulation

Reviewed by:	dchagin
MFC after:	1 week
Relnotes:	yes
Differential Revision:	D5830
2016-04-05 22:36:48 +00:00
John Baldwin
aa949be551 Convert ss_sp in stack_t and sigstack to void *.
POSIX requires these members to be of type void * rather than the
char * inherited from 4BSD.  NetBSD and OpenBSD both changed their
fields to void * back in 1998.  No new build failures were reported
via an exp-run.

PR:		206503 (exp-run)
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D5092
2016-01-27 17:55:01 +00:00
Xin LI
669414e4fb Implement AT_SECURE properly.
AT_SECURE auxv entry has been added to the Linux 2.5 kernel to pass a
boolean flag indicating whether secure mode should be enabled. 1 means
that the program has changes its credentials during the execution.
Being exported AT_SECURE used by glibc issetugid() call.

Submitted by:	imp, dchagin
Security:	FreeBSD-SA-16:10.linux
Security:	CVE-2016-1883
2016-01-27 07:20:55 +00:00
Dmitry Chagin
038c720553 Implement vsyscall hack. Prior to 2.13 glibc uses vsyscall
instead of vdso. An upcoming linux_base-c6 needs it.

Differential Revision:  https://reviews.freebsd.org/D1090

Reviewed by:	kib, trasz
MFC after:	1 week
2016-01-09 20:18:53 +00:00
Konstantin Belousov
724f4b62b0 Remove sv_prepsyscall, sv_sigsize and sv_sigtbl members of the struct
sysent.

sv_prepsyscall is unused.

sv_sigsize and sv_sigtbl translate signal number from the FreeBSD
namespace into the ABI domain.  It is only utilized on i386 for iBCS2
binaries.  The issue with this approach is that signals for iBCS2 were
delivered with the FreeBSD signal frame layout, which does not follow
iBCS2.  The same note is true for any other potential user if
sv_sigtbl.  In other words, if ABI needs signal number translation, it
really needs custom sv_sendsig method instead.

Sponsored by:	The FreeBSD Foundation
2015-11-28 08:49:07 +00:00
Conrad Meyer
1b8d039386 Fix missing semi-colon from r289055.
Obtained from:	mjg
Sponsored by:	EMC / Isilon Storage Division
2015-10-08 23:27:45 +00:00
Mateusz Guzik
3e15a670d2 linux: fix handling of out-of-bounds syscall attempts
Due to an off by one the code would read an entry past the table, as
opposed to the last entry which contains the nosys handler.

Reported by:	Pawel Biernacki <pawel.biernacki gmail.com>
2015-10-08 21:08:35 +00:00
Dmitry Chagin
4ab7403bbd Rework signal code to allow using it by other modules, like linprocfs:
1. Linux sigset always 64 bit on all platforms. In order to move Linux
sigset code to the linux_common module define it as 64 bit int. Move
Linux sigset manipulation routines to the MI path.

2. Move Linux signal number definitions to the MI path. In general, they
are the same on all platforms except for a few signals.

3. Map Linux RT signals to the FreeBSD RT signals and hide signal conversion
tables to avoid conversion errors.

4. Emulate Linux SIGPWR signal via FreeBSD SIGRTMIN signal which is outside
of allowed on Linux signal numbers.

PR:		197216
2015-05-24 17:47:20 +00:00
Dmitry Chagin
fcdffc03f8 Call nosys in case when the incorrect syscall number is specified.
Reported by:	trinity
2015-05-24 17:38:02 +00:00
Dmitry Chagin
26c68e1fe5 Use the BSD_TO_LINUX_SIGNAL() wherever there is no need
to check the ABI as it is known.

Differential Revision:	https://reviews.freebsd.org/D1086
2015-05-24 16:30:23 +00:00
Dmitry Chagin
4048f59cd0 Add AT_RANDOM and AT_EXECFN auxiliary vector entries which are used by
glibc. At list since glibc version 2.16 using AT_RANDOM is mandatory.

Differential Revision:	https://reviews.freebsd.org/D1080
2015-05-24 16:24:24 +00:00
Dmitry Chagin
67d3974849 Introduce a new module linux_common.ko which is intended for the
following primary purposes:

1. Remove the dependency of linsysfs and linprocfs modules from linux.ko,
which will be architecture specific on amd64.

2. Incorporate into linux_common.ko general code for platforms on which
we'll support two Linuxulator modules (for both instruction set - 32 & 64 bit).

3. Move malloc(9) declaration to linux_common.ko, to enable getting memory
usage statistics properly.

Currently linux_common.ko incorporates a code from linux_mib.c and linux_util.c
and linprocfs, linsysfs and linux kernel modules depend on linux_common.ko.

Temporarily remove dtrace garbage from linux_mib.c and linux_util.c

Differential Revision:	https://reviews.freebsd.org/D1072
In collaboration with:	Vassilis Laganakos.

Reviewed by:	trasz
2015-05-24 15:51:18 +00:00
Dmitry Chagin
26cf41d6ca Remove stale comment about a signal trampoline which
is moved to the shared page at r219609.

Differential Revision:	https://reviews.freebsd.org/D1063
Reviewed by:	trasz
2015-05-24 15:32:52 +00:00
Dmitry Chagin
0020bdf13a Put linux_platform into the vdso to avoid copying it onto the stack at
every exec.

Differential Revision:	https://reviews.freebsd.org/D1062
Reviewed by:	trasz
2015-05-24 15:30:52 +00:00
Dmitry Chagin
bdc379344a Implement vdso - virtual dynamic shared object. Through vdso Linux
exposes functions from kernel with proper DWARF CFI information so that
it becomes easier to unwind through them.
Using vdso is a mandatory for a thread cancelation && cleanup
on a modern glibc.

Differential Revision:	https://reviews.freebsd.org/D1060
2015-05-24 15:28:17 +00:00
Dmitry Chagin
af682d487b Some style(9) && whitespaces fixes. No functional changes.
Differential Revision:	https://reviews.freebsd.org/D1041
Reviewed by:	emaste
2015-05-24 14:55:12 +00:00
Dmitry Chagin
81338031c4 Switch linuxulator to use the native 1:1 threads.
The reasons:
1. Get rid of the stubs/quirks with process dethreading,
   process reparent when the process group leader exits and close
   to this problems on wait(), waitpid(), etc.
2. Reuse our kernel code instead of writing excessive thread
   managment routines in Linuxulator.

Implementation details:

1. The thread is created via kern_thr_new() in the clone() call with
   the CLONE_THREAD parameter. Thus, everything else is a process.
2. The test that the process has a threads is done via P_HADTHREADS
   bit p_flag of struct proc.
3. Per thread emulator state data structure is now located in the
   struct thread and freed in the thread_dtor() hook.
   Mandatory holdig of the p_mtx required when referencing emuldata
   from the other threads.
4. PID mangling has changed. Now Linux pid is the native tid
   and Linux tgid is the native pid, with the exception of the first
   thread in the process where tid and pid are one and the same.

Ugliness:

   In case when the Linux thread is the initial thread in the thread
   group thread id is equal to the process id. Glibc depends on this
   magic (assert in pthread_getattr_np.c). So for system calls that
   take thread id as a parameter we should use the special method
   to reference struct thread.

Differential Revision:	https://reviews.freebsd.org/D1039
2015-05-24 14:53:16 +00:00
John Baldwin
716718932f MFamd64: Move extern declaration of _ucodesel and _udatasel to
<machine/md_var.h>
2014-11-02 21:40:32 +00:00
Dmitry Chagin
2dedc1281a Revert r266925 as it can lead to instant panic at fexecve():
To allow to run the interpreter itself add a new ELF branding type.

Pointed out by:	kib, mjg
2014-06-17 05:29:18 +00:00
Dmitry Chagin
5f56da1891 To allow to run the interpreter itself add a new ELF branding type.
Allow Linux ABI to run ELF interpreter.

MFC after:	3 days
2014-05-31 15:01:51 +00:00
Ed Maste
3d271aaab0 x86: Allow users to change PSL_RF via ptrace(PT_SETREGS...)
Debuggers may need to change PSL_RF.  Note that tf_eflags is already stored
in the signal context during signal handling and PSL_RF previously could be
modified via sigreturn, so this change should not provide any new ability
to userspace.

For background see the thread at:
http://lists.freebsd.org/pipermail/freebsd-i386/2007-September/005910.html

Reviewed by:	jhb, kib
Sponsored by:	DARPA, AFRL
2013-11-14 15:37:20 +00:00
John Baldwin
d825ce0a5d Reduce duplication between i386/linux/linux.h and amd64/linux32/linux.h
by moving bits that are MI out into headers in compat/linux.

Reviewed by:	Chagin Dmitry  dmitry | gmail
MFC after:	2 weeks
2013-01-29 18:41:30 +00:00
Kevin Lo
9823d52705 Revert previous commit...
Pointyhat to:	kevlo (myself)
2012-10-10 08:36:38 +00:00
Kevin Lo
a10cee30c9 Prefer NULL over 0 for pointers 2012-10-09 08:27:40 +00:00
Konstantin Belousov
aa10345311 Do not write to the user address directly, use suword().
Reported by:	Bengt Ahlgren <bengta sics se>
MFC after:	1 week
2012-02-25 01:33:39 +00:00
Ulrich Spörlein
9a14aa017b Convert files to UTF-8 2012-01-15 13:23:18 +00:00
Dmitry Chagin
acface683e Export the correct AT_PLATFORM value.
Since signal trampolines are copied to the shared page do not need to
leave place on the stack for it. Forgotten in the previous commit.

MFC after:	1 Week
2011-03-26 09:25:35 +00:00
Dmitry Chagin
8f1e49a638 Enable shared page use for amd64/linux32 and i386/linux binaries.
Move signal trampoline code from the top of the stack to the shared page.

MFC after:	2 Weeks
2011-03-13 14:58:02 +00:00
Dmitry Chagin
e5d81ef1b5 Extend struct sysvec with new method sv_schedtail, which is used for an
explicit process at fork trampoline path instead of eventhadler(schedtail)
invocation for each child process.

Remove eventhandler(schedtail) code and change linux ABI to use newly added
sysvec method.

While here replace explicit comparing of module sysentvec structure with the
newly created process sysentvec to detect the linux ABI.

Discussed with:	kib

MFC after:	2 Week
2011-03-08 19:01:45 +00:00
Dmitry Chagin
fde6316272 Sort include files in the alphabetical order. 2011-02-13 19:07:48 +00:00
Konstantin Belousov
78ae4338a2 Add macro DECLARE_MODULE_TIED to denote a module as requiring the
kernel of exactly the same __FreeBSD_version as the headers module was
compiled against.

Mark our in-tree ABI emulators with DECLARE_MODULE_TIED. The modules
use kernel interfaces that the Release Engineering Team feel are not
stable enough to guarantee they will not change during the life cycle
of a STABLE branch. In particular, the layout of struct sysentvec is
declared to be not part of the STABLE KBI.

Discussed with:	bz, rwatson
Approved by:	re (bz, kensmith)
MFC after:	2 weeks
2010-10-12 09:18:17 +00:00
Alan Cox
a14a949872 The interpreter name should no longer be treated as a buffer that can be
overwritten.  (This change should have been included in r210545.)

Submitted by:	kib
2010-07-28 04:47:40 +00:00
Konstantin Belousov
afe1a68827 Reorganize syscall entry and leave handling.
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
  usermode into struct syscall_args. The structure is machine-depended
  (this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
  from the syscall. It is a generalization of
  cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
  return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively.  The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by:	jhb, marcel, marius, nwhitehorn, stas
Tested by:	marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
	stas (mips)
MFC after:	1 month
2010-05-23 18:32:02 +00:00
Nathan Whitehorn
a107d8aac9 Change the arguments of exec_setregs() so that it receives a pointer
to the image_params struct instead of several members of that struct
individually. This makes it easier to expand its arguments in the future
without touching all platforms.

Reviewed by:	jhb
2010-03-25 14:24:00 +00:00
Konstantin Belousov
d6e029adbe In r197963, a race with thread being selected for signal delivery
while in kernel mode, and later changing signal mask to block the
signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race
exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.

Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
reschedule newly blocked signals, closing the race.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:47:58 +00:00
Bjoern A. Zeeb
89ffc202d6 Fix handling of .note.ABI-tag section for GNU systems [1].
Handle GNU/Linux according to LSB Core Specification 4.0,
Chapter 11. Object Format, 11.8. ABI note tag.

Also check the first word of desc, not only name, according to
glibc abi-tags specification to distinguish between Linux and
kFreeBSD.

Add explicit handling for Debian GNU/kFreeBSD, which runs
on our kernels as well [2].

In {amd64,i386}/trap.c, when checking osrel of the current process,
also check the ABI to not change the signal behaviour for Linux
binary processes, now that we save an osrel version for all three
from the lists above in struct proc [2].

These changes make it possible to run FreeBSD, Debian GNU/kFreeBSD
and Linux binaries on the same machine again for at least i386 and
amd64, and no longer break kFreeBSD which was detected as GNU(/Linux).

PR:		kern/135468
Submitted by:	dchagin [1] (initial patch)
Suggested by:	kib [2]
Tested by:	Petr Salinger (Petr.Salinger seznam.cz) for kFreeBSD
Reviewed by:	kib
MFC after:	3 days
2009-08-24 16:19:47 +00:00
Dmitry Chagin
8d30f381ef Do not export AT_CLKTCK when emulating Linux kernel prior
to 2.4.0, as it has appeared in the 2.4.0-rc7 first time.
Being exported, AT_CLKTCK is returned by sysconf(_SC_CLK_TCK),
glibc falls back to the hard-coded CLK_TCK value when aux entry
is not present.

Glibc versions prior to 2.2.1 always use hard-coded CLK_TCK value.

For older applications/libc's which depends on hard-coded CLK_TCK
value user should set compat.linux.osrelease less than 2.4.0.

Approved by:	kib (mentor)
2009-05-10 18:43:43 +00:00
Dmitry Chagin
1ca16454b3 Rework r189362, r191883.
The frequency of the statistics clock is given by stathz.
Use stathz if it is available, otherwise use hz.

Pointed out by:	bde

Approved by:	kib (mentor)
2009-05-10 18:16:07 +00:00
Jamie Gritton
7ae27ff49f Move the per-prison Linux MIB from a private one-off pointer to the new
OSD-based jail extensions.  This allows the Linux MIB to accessed via
jail_set and jail_get, and serves as a demonstration of adding jail support
to a module.

Reviewed by:	dchagin, kib
Approved by:	bz (mentor)
2009-05-07 18:36:47 +00:00
Dmitry Chagin
d789bfd562 Move extern variable definitions to the header file.
Approved by:	kib (mentor)
MFC after:	1 month
2009-05-02 10:06:49 +00:00
Dmitry Chagin
79262bf1f0 Reimplement futexes.
Old implemention used Giant to protect the kernel data structures,
but at the same time called malloc(M_WAITOK), that could cause the
calling thread to sleep and lost Giant protection. User-visible
result was the missed wakeup.

New implementation uses one sx lock per futex. The sx protects
the futex structures and allows to sleep while copyin or copyout
are performed.

Unlike linux, we return EINVAL when FUTEX_CMP_REQUEUE operation
is requested and either caller specified futexes are equial or
second futex already exists. This is acceptable since the situation
can only occur from the application error, and glibc falls back to
old FUTEX_WAKE operation when FUTEX_CMP_REQUEUE returns an error.

Approved by:	kib (mentor)
MFC after:	1 month
2009-05-01 15:36:02 +00:00
Dmitry Chagin
cd899aad76 Fix KBI breakage by r190520 which affects older linux.ko binaries:
1) Move the new field (brand_note) to the end of the Brandinfo structure.
2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer
   is valid.
3) Use the brand_note field if the flag BI_BRAND_NOTE is set and as old
   modules won't have the flag set, so the new field brand_note would be
   ignored.

Suggested by:	jhb
Reviewed by:	jhb
Approved by:	kib (mentor)
MFC after:	6 days
2009-04-05 09:27:19 +00:00
Dmitry Chagin
32c01de21c Implement new way of branding ELF binaries by looking to a
".note.ABI-tag" section.

The search order of a brand is changed, now first of all the
".note.ABI-tag" is looked through.

Move code which fetch osreldate for ELF binary to check_note() handler.

PR:		118473
Approved by:	kib (mentor)
2009-03-13 16:40:51 +00:00
John Baldwin
2ee8325f42 A better fix for handling different FPU initial control words for different
ABIs:
- Store the FPU initial control word in the pcb for each thread.
- When first using the FPU, load the initial control word after restoring
  the clean state if it is not the standard control word.
- Provide a correct control word for Linux/i386 binaries under
  FreeBSD/amd64.
- Adjust the control word returned for fpugetregs()/npxgetregs() when a
  thread hasn't used the FPU yet to reflect the real initial control
  word for the current ABI.
- The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control
  word instead of trashing whatever the current state of the FPU is.

Reviewed by:	bde
2009-03-05 19:42:11 +00:00
Dmitry Chagin
4d7c2e8a48 Add AT_PLATFORM, AT_HWCAP and AT_CLKTCK auxiliary vector entries which
are used by glibc. This silents the message "2.4+ kernel w/o ELF notes?"
from some programs at start, among them are top and pkill.

Do the assignment of the vector entries in elf_linux_fixup()
as it is done in glibc.

Fix some minor style issues.

Submitted by:	Marcin Cieslak <saper at SYSTEM PL>
Approved by:	kib (mentor)
MFC after:	1 week
2009-03-04 12:14:33 +00:00
Warner Losh
0e7faf3934 Remove obsolete AT_DEBUG stuff. It never should have been committed
in the first place, let alone migrated to linux emulation.

Reviewed by:	peter, rdivacky
2008-12-17 06:11:42 +00:00
Konstantin Belousov
b4cf0e62f4 Add sv_flags field to struct sysentvec with intention to provide description
of the ABI of the currently executing image. Change some places to test
the flags instead of explicit comparing with address of known sysentvec
structures to determine ABI features.

Discussed with:	dchagin, imp, jhb, peter
2008-11-22 12:36:15 +00:00
Konstantin Belousov
aa8b201112 Correctly fill siginfo for the signals delivered by linux tkill/tgkill.
It is required for async cancellation to work.

Fix PROC_LOCK leak in linux_tgkill when signal delivery attempt is made
to not linux process.

Do not call em_find(p, ...) with p unlocked.

Move common code for linux_tkill() and linux_tgkill() into
linux_do_tkill().

Change linux siginfo_t definition to match actual linux one. Extend
uid fields to 4 bytes from 2. The extension does not change structure
layout and is binary compatible with previous definition, because i386
is little endian, and each uid field has 2 byte padding after it.

Reported by:	Nicolas Joly <njoly pasteur fr>
Submitted by:	dchangin
MFC after:	1 month
2008-10-19 10:02:26 +00:00
Konstantin Belousov
a8d403e102 Change the static struct sysentvec and struct Elf_Brandinfo initializers
to the C99 style. At least, it is easier to read sysent definitions
that way, and search for the actual instances of sigcode etc.

Explicitely initialize sysentvec.sv_maxssiz that was missed in most
sysvecs.

No objection from:	jhb
MFC after:	1 month
2008-09-24 10:14:37 +00:00
Konstantin Belousov
48b05c3f82 Implement the linux syscalls
openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat,
    renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat.

Submitted by:	rdivacky
Sponsored by:	Google Summer of Code 2007
Tested by:	pho
2008-04-08 09:45:49 +00:00
Konstantin Belousov
57b4252e45 Add the support for the AT_FDCWD and fd-relative name lookups to the
namei(9).

Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:01:21 +00:00
Konstantin Belousov
22eca0bf45 Since version 4.3, gcc changed its behaviour concerning the i386/amd64
ABI and the direction flag, that is it now assumes that the direction
flag is cleared at the entry of a function and it doesn't clear once
more if needed. This new behaviour conforms to the i386/amd64 ABI.

Modify the signal handler frame setup code to clear the DF {e,r}flags
bit on the amd64/i386 for the signal handlers.

jhb@ noted that it might break old apps if they assumed DF == 1 would be
preserved in the signal handlers, but that such apps should be rare and
that older versions of gcc would not generate such apps.

Submitted by:	Aurelien Jarno <aurelien aurel32 net>
PR:	121422
Reviewed by:	jhb
MFC after:	2 weeks
2008-03-13 10:54:38 +00:00
Jeff Roberson
6617724c5f Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
Konstantin Belousov
96a2b63525 Fill in cr2 in the signal context from ksi->ksi_addr.
Together with the sys/i386/i386/trap.c rev. 1.306 it fixes the PR.

Submitted by:	rdivacky
Suggested by:	jhb
Sponsored by:	Google Summer of Code 2007
PR:		kern/77710
Approved by:	re (kensmith)
2007-09-20 13:46:26 +00:00
Jung-uk Kim
357afa7113 MFP4: Turn emul_lock into a mutex.
Submitted by:	rdivacky
2007-04-02 18:38:13 +00:00
Alexander Leidinger
bb59e63f8f Change futex lock from mutex to sx. Make futex_get atomic (protected by the
futex lock).

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Suggested by:	jhb
2006-09-09 16:25:25 +00:00
Alexander Leidinger
94cb2ecf79 Move some stuff into headers where they belong.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-17 21:06:48 +00:00
Alexander Leidinger
9b44bfc556 Add the linux 2.6.x stuff (not used by default!):
- TLS - complete
 - pid/tid mangling - complete
 - thread area - complete
 - futexes - complete with issues
 - clone() extension - complete with some possible minor issues
 - mq*/timer*/clock* stuff - complete but untested and the mq* stuff is
   disabled when not build as part of the kernel with native FreeBSD mq*
   support (module support for this will come later)

Tested with:
 - linux-firefox - works, tested
 - linux-opera - works, tested
 - linux-realplay - doesnt work, issue with futexes
 - linux-skype - doesnt work, issue with futexes
 - linux-rt2-demo - works, tested
 - linux-acroread - doesnt work, unknown reason (coredump) and sometimes
   issue with futexes
 - various unix utilities in linux-base-gentoo3 and linux-base-fc4:
   everything tried worked

On amd64 not everything is supported like on i386, the catchup is planned for
later when the remaining bugs in the new functions are fixed.

To test this new stuff, you have to run
	sysctl compat.linux.osrelease=2.6.16
to switch back use
	sysctl compat.linux.osrelease=2.4.2

Don't switch while running a linux program, strange things may or may not
happen.

Sponsored by:			Google SoC 2006
Submitted by:			rdivacky
Some suggestions/help by:	jhb, kib, manu@NetBSD.org, netchild
2006-08-15 12:54:30 +00:00
Alexander Leidinger
50e422f056 Add some more errno mappings (bsd -> linux) and a comment about the status..
Submitted by:	"Intron" <mag@intron.ac>
2006-08-10 22:05:25 +00:00
Doug Ambrisko
060e488247 Enhance the Linux emulation layer to make MegaRAID SAS managements tool happy.
Add back in a scheme to emulate old type major/minor numbers via hooks into
stat, linprocfs to return major/minors that Linux app's expect.  Currently
only /dev/null is always registered.  Drivers can register via the Linux
type shim similar to the ioctl shim but by using
linux_device_register_handler/linux_device_unregister_handler functions.
The structure is:

    struct linux_device_handler {
        char    *bsd_driver_name;
        char    *linux_driver_name;
        char    *bsd_device_name;
        char    *linux_device_name;
        int     linux_major;
        int     linux_minor;
        int     linux_char_device;
    };

Linprocfs uses this to display the major number of the driver.  The
soon to be available linsysfs will use it to fill in the driver name.
Linux_stat uses it to translate the major/minor into Linux type values.

Note major numbers are dynamically assigned via passing in a -1 for
the major number so we don't need to keep track of them.

This is somewhat needed due to us switching to our devfs.  MegaCli
will not run until I add in the linsysfs and mfi Linux compat changes.

Sponsored by:	IronPort Systems
2006-05-05 16:10:45 +00:00
Alexander Leidinger
1f7642e058 regen after COMPAT_43 removal 2006-03-18 18:24:38 +00:00
Maxim Sobolev
900b28f9f6 Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure
with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually
allow executing elf dynamic binaries (aka shared libraries). When it is
requested to execute ET_DYN elf image check if this flag is on after we
know the elf brand allowing execution if so.

PR:		kern/87615
Submitted by:	Marcin Koziej <creep@desk.pl>
2005-12-26 21:23:57 +00:00
John Baldwin
410d857972 Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1)
which existed to cleanup the linux_osname mutex.  Now that MTX_SYSINIT()
has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here
was redundant and resulted in panics in debug kernels.

MFC after:	1 week
Reported by:	Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu
2005-12-15 16:30:41 +00:00
John Baldwin
728ef95410 The signal code is now an int rather than a long, so update debug printfs. 2005-10-14 20:22:57 +00:00
David Xu
9104847f21 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
   sendsig use discrete parameters, now they uses member fields of
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
   an extra siginfo list holds additional siginfo_t data for signals.
   kernel code uses psignal() still behavior as before, it won't be failed
   even under memory pressure, only exception is when deleting a signal,
   we should call sigqueue_delete to remove signal from sigqueue but
   not SIGDELSET. Current there is no kernel code will deliver a signal
   with additional data, so kernel should be as stable as before,
   a ksiginfo can carry more information, for example, allow signal to
   be delivered but throw away siginfo data if memory is not enough.
   SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
   not be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to target
   process, if resource is unavailable, EAGAIN will be returned as
   specification said.
   Just before thread exits, signal queue memory will be freed by
   sigqueue_flush.
   Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
John Baldwin
813a5e14ec Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c
so that they aren't duplicated 3 times and are also in the same file as
the code that depends on the SYSVIPC modules.
2005-07-29 19:40:39 +00:00
John Baldwin
0311233ecc Use linux_emul_convpath() rather than linux_emul_find() as
linux_emul_find() is going away.
2005-02-07 18:37:51 +00:00
David Schultz
2a51b9b0aa When running Linux binaries, set up the initial FPU state as Linux
would.

PR:	28966
2005-02-06 17:29:20 +00:00
Maxim Sobolev
610ecfe035 o Split out kernel part of execve(2) syscall into two parts: one that
copies arguments into the kernel space and one that operates
  completely in the kernel space;

o use kernel-only version of execve(2) to kill another stackgap in
  linuxlator/i386.

Obtained from:  DragonFlyBSD (partially)
MFC after:      2 weeks
2005-01-29 23:12:00 +00:00
David Schultz
d3adf76902 Axe the semblance of support for PECOFF and Linux a.out core dumps. 2004-11-27 06:46:45 +00:00
David Schultz
0ef5c36ff1 Maintain the broken state of backwards compatibilty for a.out (and
PECOFF!) core dumps.  None of the old versions of gdb I tried were
able to read a.out core dumps before or after this change.

Reviewed by:	arch@
2004-11-20 02:32:04 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Tim J. Robbins
f99619a0dc Change the types of vn_rdwr_inchunks()'s len and aresid arguments to
size_t and size_t *, respectively. Update callers for the new interface.
This is a better fix for overflows that occurred when dumping segments
larger than 2GB to core files.
2004-06-05 02:18:28 +00:00
David Xu
9b778a1611 Make sigaltstack as per-threaded, because per-process sigaltstack state
is useless for threaded programs, multiple threads can not share same
stack.
The alternative signal stack is private for thread, no lock is needed,
the orignal P_ALTSTACK is now moved into td_pflags and renamed to
TDP_ALTSTACK.
For single thread or Linux clone() based threaded program, there is no
semantic changed, because those programs only have one kernel thread
in every process.
2004-01-03 23:31:29 +00:00
David Xu
a30ec4b99c Make sigaltstack as per-threaded, because per-process sigaltstack state
is useless for threaded programs, multiple threads can not share same
stack.
The alternative signal stack is private for thread, no lock is needed,
the orignal P_ALTSTACK is now moved into td_pflags and renamed to
TDP_ALTSTACK.
For single thread or Linux clone() based threaded program, there is no
semantic changed, because those programs only have one kernel thread
in every process.

Reviewed by: deischen, dfr
2004-01-03 02:02:26 +00:00
Bruce Evans
ff22c670d9 Sorted includes. Removed duplicates exposed by this. 2003-12-29 06:51:10 +00:00
Peter Wemm
9b68618df0 Add an additional field to the elf brandinfo structure to support
quicker exec-time replacement of the elf interpreter on an emulation
environment where an entire /compat/* tree isn't really warranted.
2003-12-23 02:42:39 +00:00
Peter Wemm
c460ac3a00 Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit
systems where the data/stack/etc limits are too big for a 32 bit process.

Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

Supply an ia32_fixlimits function.  Export the clip/default values to
sysctl under the compat.ia32 heirarchy.

Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable.  This allows mmap to
place mappings at sensible locations when limits have been reduced.

Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.

Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'.  And that causes exec etc to fail since it can no
longer find space to mmap things.
2003-09-25 01:10:26 +00:00
David Xu
0e2a4d3aeb Rename P_THREADED to P_SA. P_SA means a process is using scheduler
activations.
2003-06-15 00:31:24 +00:00
David E. O'Brien
27e0099c4a Use __FBSDID(). 2003-06-02 16:56:40 +00:00
John Baldwin
90af4afacb - Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
  M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
  sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
  that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
  and thread_stopped() are now MP safe.

Reviewed by:	arch@
Approved by:	re (rwatson)
2003-05-13 20:36:02 +00:00
Matthew N. Dodd
598d45be84 Provide exec_linux_setregs() to override exec_setregs().
Linux initializes %gs to 0.  Mimic this behavior.

Submitted by:	 Christian Zander <zander@minion.de>
Reviewed by:	 jake
Approved by:	 re
2003-05-11 21:51:11 +00:00
John Baldwin
e5567180fb Don't drop the proc lock just to reacquire it after a few simple assignment
statements.  Just hold the lock the entire time.
2003-04-17 22:18:07 +00:00
Jeff Roberson
4093529dee - Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with
a follow on commit to kern_sig.c
 - signotify() now operates on a thread since unmasked pending signals are
   stored in the thread.
 - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
2003-03-31 22:49:17 +00:00
Jeff Roberson
1bf4700bff - Change trapsignal() to accept a thread and not a proc.
- Change all consumers to pass in a thread.

Right now this does not cause any functional changes but it will be important
later when signals can be delivered to specific threads.
2003-03-31 22:02:38 +00:00
John Baldwin
0f9d6538bb Add missing includes from previous commit.
Reported by:	des
2003-03-27 18:18:35 +00:00
John Baldwin
35eb8c5aa2 Add a cleanup function to destroy the osname_lock and call it on module
unload.

Submitted by:	gallatin
Reported by:	Martin Karlsson <mk-freebsd@bredband.net>
2003-03-26 18:29:44 +00:00
John Baldwin
43cf129c99 Sync up linux and svr compat elf fixup functions for exec(). These
functions are now all basically identical except that alpha linux uses
Elf64 arguments and svr4 and i386 linux use Elf32.  The fixups include
changing the first argument to be a register_t ** to match the prototype
for fixup functions, asserting that the process in the image_params struct
is always curproc and removing unnecessary locking to read credentials as a
result, and a few style fixes.
2003-03-21 19:49:34 +00:00
Dag-Erling Smørgrav
1d062e2be8 Clean up whitespace and remove register keyword. 2003-03-03 09:17:12 +00:00
Dag-Erling Smørgrav
4b7ef73d71 More caddr_t removal, in conjunction with copy{in,out}(9) this time.
Also clean up some egregious casts and incorrect use of sizeof.
2003-03-03 09:14:26 +00:00
Alexander Kabaev
ba873f4c18 Correctly map SIGSYS signal to/from Linux.
Submitted by:   "Georg-W. Koltermann" <g.w.k@web.de>
2003-02-24 16:16:45 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Marcel Moolenaar
99d45c5f9d bzero() the sigframe before we fill it. This was not done at all in
linux_rt_sendsig() and only done for the fpstate in linux_sendsig().
2002-11-02 07:41:04 +00:00