opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-04-23 07:07:24 -04:00

Author	SHA1	Message	Date
Attilio Rao	cb05b60a89	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Konstantin Belousov	d075105da0	After applying LCONVPATH() to the path, do use the converted path instead of original user-mode string in the linux_stat() and linux_lstat() syscalls. Tested by: Peter Holm MFC after: 3 days	2008-01-05 12:36:35 +00:00
Konstantin Belousov	93eba2d50d	Plug the leaks in the present (hopefully, soon to be replaced) implementation of the linux_openat() for the quick MFC. Reported and tested by: Peter Holm MFC after: 3 days	2007-12-29 14:28:01 +00:00
Konstantin Belousov	15b78ac5d1	Apply the LCONVPATH() to the (old) linux_stat() and linux_lstat() syscalls. Without it, code has two problems: - behaviour of the old and new [l]stat are different with regard of the /compat/linux - directly accessing the userspace data from the kernel asks for the panics. Reported and tested by: Peter Holm Reviewed by: rdivacky MFC after: 3 days	2007-12-29 14:25:29 +00:00
Konstantin Belousov	d60f0a3d6a	Implement LINUX_SIOCGIFCOUNT and LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX. LINUX_SIOCGIFCOUNT just returns 0 since it is not implemented in the Linux 2.6.16. LINUX_SIOCGIFINDEX/LINUX_SIOGIFINDEX are mapped to the FreeBSD native SIOCGIFINDEX. Tested by: Peter Kostouros <kpeter@melbpc.org.au> Reviewed by: brooks, rpaulo (on net@) Submitted by: rdivacky MFC after: 1 week	2007-11-07 16:42:52 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updates as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
David Malone	3ab8526963	The kernel version of Linux statfs64 is actually supposed to take 3 arguments, but we had forgotten the second argument. Also make the Linux statfs64 struct depend on the architecture because it has an extra 4 bytes padding on amd64 compared to i386. The three argument fix is from David Taylor, the struct statfs64 stuff is my fault. With this patch I can install i386 Linux matlab on an amd64 machine. Submitted by: David Taylor <davidt_at_yadt.co.uk> Approved by: re (kensmith)	2007-09-18 19:50:33 +00:00
Konstantin Belousov	b6e645c90f	Implement fake linux sched_getaffinity() syscall to enable java to work with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets native scheduler affinity syscalls. Submitted by: rdivacky Reviewed by: jkim Sponsored by: Google Summer of Code 2007 Approved by: re (kensmith)	2007-08-28 12:26:35 +00:00
Robert Watson	0bf686c125	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminated quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
Peter Wemm	79d5bdcca5	Don't add the 'pad' argument to the mmap/truncate/etc syscalls. Submitted by: kensmith Approved by: re (kensmith)	2007-07-04 23:06:43 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Matt Jacob	2ba956ed13	Ensure that newpath is always initialized, even for the error case.	2007-06-10 04:37:22 +00:00
Attilio Rao	a1fe14bc33	rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc which wrappers both calls to be performed in atomic way. Reviewed by: jeff Approved by: jeff (mentor)	2007-06-09 21:48:44 +00:00
Attilio Rao	2feb50bf7d	Revert VMCNT_* operations introduction. Probably, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Konstantin Belousov	1c182de9a9	Move futex support code from <arch>/support.s into linux compat directory. Implement all futex atomic operations in assembler to not depend on the fuword() that does not allow to distinguish between -1 and failure return. Correctly return 0 from atomic operations on success. In collaboration with: rdivacky Tested by: Scot Hetzel <swhetzel gmail com>, Milos Vyletel <mvyletel mzm cz> Sponsored by: Google SoC 2007	2007-05-23 08:33:06 +00:00
Jeff Roberson	222d01951f	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
Robert Watson	d72a615878	Some Linux applications (ping) pass a non-NULL msg_control argument to sendmsg() while using a 0-length msg_controllen. This isn't allowed in the FreeBSD system call ABI, so detect this case and set msg_control to NULL. This allows Linux ping to work. Submitted by: rdivacky	2007-04-14 10:35:09 +00:00
Scott Long	6eef46be3b	Whitespace fixes	2007-04-10 21:37:37 +00:00
Scott Long	1eba4c7948	Add the CAM 'SG' peripheral device. This device implements a subset of the Linux SCSI SG passthrough device API. The intention is to allow for both running of Linux apps that want to talk to /dev/sg* nodes, and to facilitate porting of apps from Linux to FreeBSD. As such, both native and linuxolator entry points and definitions are provided. Caveats: - This does not support the procfs and sysfs nodes that the Linux SG driver provides. Some Linux apps may rely on these for operation, others may only use them for informational purposes. - More ioctls need to be implemented. - Linux uses a naming scheme of "sg[a-z]" for devices, while FreeBSD uses a scheme of "sg[0-9]". Devfs aliases (symlinks) are automatically created to link the two together. However, tools like camcontrol only see the native names. - Some operations were originally designed to return byte counts or other data directly as the syscall return value. The linuxolator doesn't appear to support this well, so this driver just punts for these cases. Now that the driver is in place, others are welcome to add missing functionality. Thanks to Roman Divacky for pushing this work along.	2007-04-07 19:40:58 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are imported for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Jung-uk Kim	357afa7113	MFP4: Turn emul_lock into a mutex. Submitted by: rdivacky	2007-04-02 18:38:13 +00:00
Jung-uk Kim	a328699b34	MFP4: Linux futex support for amd64. Initial patch was submitted by kib and additional work was done by Divacky Roman. Tested by: emulation	2007-03-30 01:07:28 +00:00
Julian Elischer	6734f35eac	Implement the openat() linux syscall Submitted by: Roman Divacky (rdivacky@) MFC after: 2 weeks	2007-03-29 02:11:46 +00:00
Robert Watson	b77ad8fc3b	In translate_path_major_minor(), do not calculate otherwise unused 'fp' variable, avoiding an extra locking of the file descriptor array.	2007-03-06 07:39:12 +00:00
Jung-uk Kim	a4e3bad794	MFP4: 115220, 115222 - Fix style(9) and reduce diff between amd64 and i386. - Prefix Linuxulator macros with LINUX_ to prevent future collision.	2007-03-02 00:08:47 +00:00
Alexander Leidinger	8cf5ee2e2a	MFp4 (110541): Sync with rev 1.7 in NetBSD. Obtained from: NetBSD	2007-02-25 12:43:07 +00:00
Alexander Leidinger	f9dac96185	MFp4 (110523, parts which apply cleanly): semi-automatic style(9) The futex stuff already differs a lot (only a small part does not differ) from NetBSD, so we are already way off and can't apply changes from NetBSD automatically. As we need to merge everything by hand already, we can even make the files comply to our world order.	2007-02-25 12:40:35 +00:00
Alexander Leidinger	802e08a360	Partial MFp4 of 114977: Whitespace commit: Fix grammar, spelling and punctuation. Submitted by: "Scot Hetzel" <swhetzel@gmail.com>	2007-02-24 16:49:25 +00:00
Alexander Leidinger	1a26db0a3a	MFp4 (114193 (i386 part), 114194, 114195, 114200): - Dont "return" in linux_clone() after we forked the new process in a case of problems. - Move the copyout of p2->p_pid outside the emul_lock coverage in linux_clone(). - Cache the em->pdeath_signal in a local variable and move the copyout out of the emul_lock coverage. - Move the free() out of the emul_shared_lock coverage in a preparation to switch emul_lock to non-sleepable lock (mutex). Submitted by: rdivacky	2007-02-23 22:39:26 +00:00
Alexander Leidinger	e8b8b834b4	MFp4 (part of 114132): - Fix a LOR caused by holding emul_lock and proctree_lock at once. Submitted by: rdivacky	2007-02-23 22:29:24 +00:00
Konstantin Belousov	b4bb515484	Remove extern int hz; use proper include file instead.	2007-02-02 08:58:16 +00:00
Konstantin Belousov	d0b2365eec	Introduce some more SO_ option equivalents from Linux to FreeBSD. The msg variable in linux_recvmsg() was not initialized. Copy it from userspace. Submitted by: rdivacky	2007-02-01 13:36:19 +00:00
Konstantin Belousov	75ee4e5462	No need to lock emul_lock in exit_group() because em->shared cannot change (because its referenced by curthread). This fixes a LOR caused by acquiring emul_shared_lock while holding emul_lock. Fix typo in comment. Submitted by: rdivacky	2007-02-01 13:33:33 +00:00
Konstantin Belousov	25954d7430	No need to synchronize linux_schedtail with linux_proc_init. p->p_emuldata is properly initialized in the time when the child can run. Do not set p->p_emuldata to NULL when the process is exiting. It does not make any sense and only costs 2 mutex operations. Do not lock emul_data to unlock it on the very next line. Comment on possible race while there. Reparent all procs that are part of a threading group but not its leaders to init and SIGCHLD init to finish the zombies off. This fixes zombies left after opera's exit. [1] There is no need to lock p_em in the linux_proc_init CLONE_THREAD case because the process cannot change the address of the p_em->shared because its currently running this code path. Move assigning of em->shared outside emul_shared_lock. Noticed by: Scott Robbins <scottro@nyc.rr.com> [1] Submitted by: rdivacky	2007-02-01 13:29:27 +00:00
Alexander Leidinger	d071f5048c	MFp4 (113077, 113083, 113103, 113124, 113097): Don't expose em->shared to the outside world before it's properly initialized. Might not affect anything but it's at least a better coding style. Don't expose em via p->p_emuldata until it's properly initialized. This also enables us to get rid of some locking and simplify the code because we are working on a local copy. In linux_fork and linux_vfork create the process in stopped state to be sure that the new process runs with fully initialized emuldata structure [1]. Also fix the vfork (both in linux_clone and linux_vfork) race that could result in never woken up process [2]. Reported by: Scot Hetzel [1] Suggested by: jhb [2] Reviewed by: jhb (at least some important parts) Submitted by: rdivacky Tested by: Scot Hetzel (on amd64) Change 2 comments (in the new code) to comply to style(9). Suggested by: jhb	2007-01-20 14:58:59 +00:00
Konstantin Belousov	4349c6ba29	Add support for LINUX_O_DIRECT, LINUX_O_DIRECT and LINUX_O_NOFOLLOW flags to open() [1]. Improve locking for accessing session control structures [2]. Try to document (most likely harmless) races in the code [3]. Based on submission by: Intron (intron at intron ac) [1] Reviewed by: jhb [2] Discussed with: netchild, rwatson, jhb [3]	2007-01-18 09:32:08 +00:00
Alexander Leidinger	17011df1e1	MFp4 (112379): Implement SETALL/GETALL IPC primitives. This fixes some LTP testcases and LabView is able to proceed a little bit further. Submitted by: rdivacky	2007-01-14 16:34:43 +00:00
Alexander Leidinger	31becc7692	MFp4 (112705): Inherit setting of the default emulation version to the jails. Pointed out by: jhb Submitted by: rdivacky	2007-01-14 16:07:01 +00:00
Alexander Leidinger	a849401985	MFp4 (112646): Now (ok it's been a while...) that FreeBSD has RLIMIT_AS too, we can use it in the linuxolator instead of ignoring it. This fixes a LTP test. Submitted by: rdivacky	2007-01-07 19:30:19 +00:00
Alexander Leidinger	bb419e1b5b	MFp4 (112535): No need to lock prison in a case of linux_use26 because the int setting is atomic and process cannot leave jail. Submitted by: kib Reviewed by: jhb Requested by: rdivacky	2007-01-07 19:20:17 +00:00
Alexander Leidinger	0ed6f09c4e	MFp4 (112534): Dont lock em in a case of just using em->shared->group_pid because the group_pid never changes. Submitted by: rdivacky Reviewed by: kib Glanced at by: jhb	2007-01-07 19:14:06 +00:00
Alexander Leidinger	291081ce0a	MFp4 (112499): Protect em->shared with the lock in case of CLONE_THREAD. Submitted by: rdivacky	2007-01-07 19:09:20 +00:00
Alexander Leidinger	1c65504ca8	MFp4 (112498): Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion. Submitted by: rdivacky	2007-01-07 19:00:38 +00:00
Xin LI	59038483f5	Fix amd64 build. Submitted by: Divacky Roman <xdivac02 stud fit vutbr cz>	2007-01-01 14:47:45 +00:00
Alexander Leidinger	c9447c7551	MFp4 (111746, 108671, 108945, 112352): - add linux utimes syscall [1] - add linux rt_sigtimedwait syscall [2] Submitted by: "Scot Hetzel" <swhetzel@gmail.com> [1] Submitted by: Bruce Becker <hostmaster@whois.gts.net> [2] PR: 93199 [2]	2006-12-31 13:16:00 +00:00
Alexander Leidinger	a628609ee9	MFp4: - semi-automatic style fixes	2006-12-31 12:42:55 +00:00
Alexander Leidinger	9ce8f9bcdd	MFp4 (111746+): Redo the checking for 2.6 emulation. We now cache the value of use26 and replace calls to linux_get_osrelease() + parsing with a call to linux_use26(). Typical path is lockless now. Pointed out by: kib This allows to ship RELENG_7_0 with a default osrelease of 2.4.2 and the possibility to enable 2.6.x emulation without the possible performance impact of the previous version of the check. Submitted by: rdivacky	2006-12-31 12:39:10 +00:00
Alexander Leidinger	ef95cfeab9	MFp4: - semi-automatic style fixes - spelling fixes in comments - add some comments	2006-12-31 11:56:16 +00:00
Alexander Leidinger	de6bf3bfcd	MFP4 (110956): Add definition for LINUX_MSG_INFO. This fixes the tinderbox errors. Submitted by: rdivacky	2006-12-21 13:11:06 +00:00
Jung-uk Kim	77424f4177	MFP4: 109655 - Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to src/sys/compat/linux/linux_time.c. - Validate timespec ranges before use as Linux kernel does. - Fix l_timespec structure. - Clean up style(9) nits.	2006-12-20 20:17:35 +00:00
Jung-uk Kim	34ec45fe0d	MFP4: 110179 Add rudimentary IPC_INFO/MSG_INFO command support for linux_msgctl() to pacify Linux ipcs(1). While I am here, add more bound checks for linux_msgsnd() and linux_msgrcv().	2006-12-20 20:08:45 +00:00
Jung-uk Kim	f61480ecf5	MFP4: (part of) 110058 Use new kern_msgsnd()/kern_msgrcv() to fix linux32 emulation on amd64.	2006-12-20 19:30:52 +00:00
Jung-uk Kim	b34608fea5	MFP4: 109653 Linux mknod(2) can open any files, not just char/block or fifo files. This fixes Linux Test Project test cases mknod01, mknod07 and mknod09.	2006-12-04 22:46:09 +00:00
Jung-uk Kim	b256a1e10b	MFP4: 109652 Fixes for 'blocking in fifoor state' problem of LTP tests. linux_stat() functions were opening files with O_RDONLY to get major/minor pair for char/block special files. Unfortunately, when these functions are used against fifo, it is blocked forever because there is no writer. Instead, we only open char/block special files for major/minor conversion. We have to get rid of kern_open() entirely from translate_path_major_minor() but today is not the day. While I am here, add checks for errors before calling translate_path_major_minor().	2006-12-04 22:38:52 +00:00
Alexander Leidinger	f6018b1434	MFP4 (108673, 110519, 110874): - Currently LINUX_MAX_COMM_LEN is smaller than MAXCOMLEN, but in case this will change we have a buffer overflow. Apply some defensive programming to DTRT when this should happen. - Use copyinstr() instead of copyin where appropriate. * Fallback to copyin() in case of ENAMETOOLONG. [1] * Use the right source and destination (it was wrong before). - Use strlcpy instead of strcpy. - Properly lock the read case (PR_GET_NAME) like the write case. Reviewed by: rwatson (except [1]) Suggested by: rwatson [1]	2006-12-02 14:56:25 +00:00
Konstantin Belousov	bdaee9ef4e	Add missed ")". Fix the build. Pointy hat to: kib	2006-11-18 17:27:39 +00:00
Konstantin Belousov	cce1514679	Sync struct sysinfo with real one from linux. Submitted by: rdivacky	2006-11-18 14:37:54 +00:00
Konstantin Belousov	0c00520b93	Use standard debugging facilities in linux_getcwd(). Submitted by: rdivacky	2006-11-18 13:31:03 +00:00
Konstantin Belousov	d559d18183	Add debugging printfs to syscalls that do not contain them yet. In sethostname do not print the hostname because it would require to copyin the string. Sethostname is not very frequently used. Submitted by: rdivacky	2006-11-18 13:00:59 +00:00
Konstantin Belousov	f472c6e35a	Remove unnecessary locking of process in linux_getpid. Suggested by: jhb Submitted by: rdivacky	2006-11-18 10:12:43 +00:00
Konstantin Belousov	292a85f4a8	Group pid and parent are shared in a case of CLONE_THREAD not CLONE_VM. This fix lets clone02 LTP test pass with 2.6 emulation. In reality 99% of the cases are that CLONE_VM and CLONE_THREAD are both set so it seemed to work. Submitted by: rdivacky	2006-11-15 11:04:37 +00:00
Konstantin Belousov	0132096dfd	In rev 1.188 of linux_misc.c the added check for valid options omitted __WCLONE. This fixes it thus fixing skype/teamspeak to not keep zombies after exit. Submitted by: rdivacky Reported by: Bakul Shah (bakul at bitblocks com)	2006-11-15 10:01:06 +00:00
Tom Rhodes	6aeb05d7be	Merge posix4/* into normal kernel hierarchy. Reviewed by: glanced at by jhb Approved by: silence on -arch@ and -standards@	2006-11-11 16:26:58 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Alexander Leidinger	3680a41902	Backout the linux aio stuff. Several problems were identified and the dynamic nature (if no native aio code is available, the linux part returns ENOSYS because of missing requisites) should be solved differently than it is. All this will be done in P4. Not included in this commit is a backout of the changes to the native aio code (removing static in some places). Those changes (and some more) will also be needed when the reworked linux aio stuff will reenter the tree. Requested by: rwatson Discussed with: rwatson	2006-10-29 14:02:39 +00:00
Alexander Leidinger	c4ce314b40	Fix style(9). Noticed by: rwatson	2006-10-28 16:47:38 +00:00
Alexander Leidinger	955d762aca	MFP4: Implement prctl(). Submitted by: rdivacky Tested with: LTP	2006-10-28 10:59:59 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
Alexander Leidinger	6474221698	Fix compile (use the right variable name).	2006-10-15 14:34:03 +00:00
Alexander Leidinger	6a1162d4cd	MFP4 (with some minor changes): Implement the linux_io_* syscalls (AIO). They are only enabled if the native AIO code is available (either compiled in to the kernel or as a module) at the time the functions are used. If the AIO stuff is not available there will be a ENOSYS. From the submitter: ---snip--- DESIGN NOTES: 1. Linux permits a process to own multiple AIO queues (distinguished by "context"), but FreeBSD creates only one single AIO queue per process. My code maintains a request queue (STAILQ of queue(3)) per "context", and throws all AIO requests of all contexts owned by a process into the single FreeBSD per-process AIO queue. When the process calls io_destroy(2), io_getevents(2), io_submit(2) and io_cancel(2), my code can pick out requests owned by the specified context from the single FreeBSD per-process AIO queue according to the per-context request queues maintained by my code. 2. The request queue maintained by my code stores contrast information between Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks (struct aiocb). FreeBSD IO control block actually exists in userland memory space, required by FreeBSD native aio_XXXXXX(2). 3. It is quite troubling that the function io_getevents() of libaio-0.3.105 needs to use Linux-specific "struct aio_ring", which is a partial mirror of context in user space. I would rather take the address of context in kernel as the context ID, but the io_getevents() of libaio forces me to take the address of the "ring" in user space as the context ID. To my surprise, one comment line in the file "io_getevents.c" of libaio-0.3.105 reads: Ben will hate me for this REFERENCE: 1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/ (include/linux/aio_abi.h, fs/aio.c) 2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/ (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2)) 3. Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html The design notes: http://lse.sourceforge.net/io/aionotes.txt 4. The package libaio, both source and binary: http://rpmfind.net/linux/rpm2html/search.php?query=libaio Simple transparent interface to Linux AIO system calls. 5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/ POSIX AIO implementation based on Linux AIO system calls (depending on libaio). ---snip--- Submitted by: Li, Xiao <intron@intron.ac>	2006-10-15 14:22:14 +00:00
Alexander Leidinger	687c23be1d	MFP4 (107868 - 107870): Use a macro to test for a valid signal instead of doing it by hand everywhere. Submitted by: rdivacky	2006-10-15 12:51:43 +00:00
John Baldwin	8528552b0d	Don't pass unused bufsz to kern_shmctl().	2006-10-10 22:46:50 +00:00
John Baldwin	f3ea244ea9	Only try to copyin a msqid for the IPC_SET command to msgctl(). Other commands (such as IPC_RMID) were bogusly failing with EFAULT. Tested by: jkim	2006-10-10 22:46:22 +00:00
John Baldwin	7f4c1dd0d6	Remove unnecessary casts before PTRIN().	2006-10-10 22:44:59 +00:00
Alexander Leidinger	28638377ad	- change if (cond) panic() to KASSERT. - Dont forget to free em in a case of error. Suggested by: ssouhlal Submitted by: rdivacky Tested with: LTP	2006-10-08 17:10:34 +00:00
Alexander Leidinger	7660ace19c	- Replace homegrown check for FIFO with S_ISFIFO. [1] - Check the status of the options before messing with it. Inspired by: NetBSD [1] Submitted by: rdivacky Tested with: LTP	2006-10-08 17:08:27 +00:00
Alexander Leidinger	d4b7423fa1	MFp4: - Linux returns ENOPROTOOPT in a case of not supported opt to setsockopt. - Return EISDIR in pread() when arg is a directory. - Return EINVAL instead of EFAULT when namelen is not correct in accept(). - Return EINVAL instead of EACCESS if invalid access mode is entered in access(). - Return EINVAL instead of EADDRNOTAVAIL in a case of bad salen param to bind(). Submitted by: rdivacky Tested with: LTP (vfork01 fails now, but it seems to be a race and not caused by those changes) MFC after: 1 week	2006-09-23 19:06:54 +00:00
Alexander Leidinger	18f81b3dfa	- don't reboot() when feed with wrong parameters (and enough permissions) [1] - add support to power off the system [2] - check the linux magic values [3] Submitted by: Marcin Cieslak <saper@SYSTEM.PL> [1,2] Modelled after: linux man page of the reboot() syscall [3] Found by: LTP testcase "reboot02" [1] Tested with: LTP testcase "reboot02" [1,3] MFC after: 1 week	2006-09-16 14:12:04 +00:00
Alexander Leidinger	db0d964062	The Linux unlink syscall uses a different errno value when trying to unlink a directory. PR: 102897 [1] Noticed by: Knut Anders Hatlen <kahatlen@gmail.com>, testrun with LTP [1] Submitted by: Marcin Cieslak <saper@SYSTEM.PL> Tested by: netchild (LTP test run)	2006-09-10 13:47:56 +00:00
Alexander Leidinger	8618fd85a3	- Extend the coverage of PROC_LOCK to cover wakeup(&p->p_emuldata); - Lock the emuldata in a case when we just created it. Sponsored by: Google SoC 2006 Submitted by: rdivacky Suggested by: jhb	2006-09-09 16:55:55 +00:00
Alexander Leidinger	bb59e63f8f	Change futex lock from mutex to sx. Make futex_get atomic (protected by the futex lock). Sponsored by: Google SoC 2006 Submitted by: rdivacky Suggested by: jhb	2006-09-09 16:25:25 +00:00
Alexander Leidinger	c19ddeda07	- don't wake every sleeper, just the first one [1] - remove debugging printf [2] Submitted by: intron <mag@intron.ac> [1], rdivacky [2]	2006-09-09 13:04:28 +00:00
Suleiman Souhlal	c67e0cc9e7	FREE -> free Submitted by: rdivacky	2006-08-28 13:52:27 +00:00
Alexander Leidinger	835e506190	Add the linux statfs64 call. This allows Tivoli backup to proceed a little bit further on -current (still not successful, but a step in the right direction). Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: Paul Mather <paul@gromit.dlib.vt.edu>	2006-08-27 08:56:54 +00:00
Alexander Leidinger	84ed9f91d8	Correct the number of retries in a futex_wake() call. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-26 10:36:16 +00:00
Robert Watson	3e8df637c0	Don't call suser_cred() directly from linux_sethostname(), as it just wraps userland_sysctl(), which performs necessary privilege checks as part of its normal operation. MFC after: 1 week	2006-08-25 11:02:42 +00:00
Alexander Leidinger	1a28c0df09	Sync the MI parts for amd64 with i386 and remove the corresponding special handling for amd64 in the common code. The MD parts for amd64 are still outstanding, but at least this fixes some panics on amd64. Sponsored by: Google SoC 2006 Submitted by: rdivacky Tested by: bsam	2006-08-20 13:50:27 +00:00
Alexander Leidinger	29ddc19bbf	Get rid of some nested includes. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb	2006-08-19 15:13:01 +00:00
Suleiman Souhlal	5342db0872	MALLOC -> malloc and FREE -> free Submitted by: rdivacky Pointed out by: jhb	2006-08-19 11:54:19 +00:00
Suleiman Souhlal	b273d5aa72	ifdef DEBUG a printf Submitted by: rdivacky	2006-08-19 11:07:22 +00:00
Alexander Leidinger	590e3a06e8	- disable some more code when osrelease=2.4.2 - protect td->td_proc->p_pid with the proc lock in linux_getpid in the amd64 (= non i386) case [1] Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: netchild [1]	2006-08-17 21:21:30 +00:00
Alexander Leidinger	94cb2ecf79	Move some stuff into headers where they belong. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-17 21:06:48 +00:00
Alexander Leidinger	9b0bcbfbda	Fix the DEBUG build: - linux_emul.c [1] - linux_futex.c [2] Sponsored by: Google SoC 2006 [1] Submitted by: rdivacky [1] netchild [2]	2006-08-17 09:50:30 +00:00
Alexander Leidinger	0eef2f8a4e	Style fixes to comments. Sponsored by: Google SoC 2006 Submitted by: rdivacky Noticed by: jhb, ssouhlal	2006-08-16 18:54:51 +00:00
Alexander Leidinger	a43eeaabe4	Disable some parts of the code on amd64 for now to prevent a panic. A better fix will come later. Sponsored by: Google SoC 2006 Submitted by: rdivacky	2006-08-15 15:15:17 +00:00
Alexander Leidinger	9b44bfc556	Add the linux 2.6.x stuff (not used by default!): - TLS - complete - pid/tid mangling - complete - thread area - complete - futexes - complete with issues - clone() extension - complete with some possible minor issues - mq/timer/clock* stuff - complete but untested and the mq* stuff is disabled when not built as part of the kernel with native FreeBSD mq* support (module support for this will come later) Tested with: - linux-firefox - works, tested - linux-opera - works, tested - linux-realplay - doesn't work, issue with futexes - linux-skype - doesn't work, issue with futexes - linux-rt2-demo - works, tested - linux-acroread - doesn't work, unknown reason (coredump) and sometimes issue with futexes - various unix utilities in linux-base-gentoo3 and linux-base-fc4: everything tried worked On amd64 not everything is supported like on i386, the catchup is planned for later when the remaining bugs in the new functions are fixed. To test this new stuff, you have to run sysctl compat.linux.osrelease=2.6.16 to switch back use sysctl compat.linux.osrelease=2.4.2 Don't switch while running a linux program, strange things may or may not happen. Sponsored by: Google SoC 2006 Submitted by: rdivacky Some suggestions/help by: jhb, kib, manu@NetBSD.org, netchild	2006-08-15 12:54:30 +00:00
Alexander Leidinger	ad2056f2c4	Add some new files needed for linux 2.6.x compatibility. Please don't style(9) the NetBSD code, we want to stay in sync. Not imported on a vendor branch since we need local changes. Sponsored by: Google SoC 2006 Submitted by: rdivacky With help from: manu@NetBSD.org Obtained from: NetBSD (linux_{futex,time}.*)	2006-08-15 12:20:59 +00:00
John Baldwin	b4c63329d5	- Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional Giant VFS locking in that function. - Remove bogus code to handle the case where namei() returns success but a NULL vnode pointer. - Note that this code duplicates exec_check_permissions() and annotate where it differs. - Hold the vnode lock longer to protect the write to set VV_TEXT in v_vflag. - Mark linux_uselib() MPSAFE. Reviewed by: rwatson	2006-07-21 20:22:13 +00:00
John Baldwin	b33887ea31	Don't free the sockaddr in kern_bind() and kern_connect() as not all callers pass a sockaddr allocated via malloc() from M_SONAME anymore. Instead, free it in the callers when necessary.	2006-07-19 18:28:52 +00:00
John Baldwin	be5747d5b5	- Add conditional VFS Giant locking to getdents_common() (linux ABIs), ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() similar to that in getdirentries(). - Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(), linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and svr4_sys_getdents64() MPSAFE.	2006-07-11 20:52:08 +00:00
John Baldwin	c1cccebe8b	Add a kern_close() so that the ABIs can close a file descriptor w/o having to populate a close_args struct and change some of the places that do.	2006-07-08 20:03:39 +00:00
John Baldwin	b1ee5b654d	Rework kern_semctl a bit to always assume the UIO_SYSSPACE case. This mostly consists of pushing a few copyin's and copyout's up into __semctl() as all the other callers were already doing the UIO_SYSSPACE case. This also changes kern_semctl() to set the return value in a passed in pointer to a register_t rather than td->td_retval[0] directly so that callers can only set td->td_retval[0] if all the various copyout's succeed. As a result of these changes, kern_semctl() no longer does copyin/copyout (except for GETALL/SETALL) so simplify the locking to acquire the semakptr mutex before the MAC check and hold it all the way until the end of the big switch statement. The GETALL/SETALL cases have to temporarily drop it while they do copyin/malloc and copyout. Also, simplify the SETALL case to remove handling for a non-existent race condition.	2006-07-08 19:51:38 +00:00
John Baldwin	ad6d226d43	- Protect the list of linux ioctl handlers with an sx lock. - Hold Giant while calling linux ioctl handlers for now as they aren't all known to be MPSAFE yet. - Mark linux_ioctl() MPSAFE.	2006-07-06 21:42:36 +00:00
John Baldwin	4db580972e	Axe the stackgap macros as the Linux ABIs no longer use the stackgap.	2006-06-27 18:30:49 +00:00
John Baldwin	49d409a108	- Add a kern_semctl() helper function for __semctl(). It accepts a pointer to a copied-in copy of the 'union semun' and a uioseg to indicate which memory space the 'buf' pointer of the union points to. This is then used in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap. - Mark linux_ipc() and svr4_sys_semsys() MPSAFE.	2006-06-27 18:28:50 +00:00
Alexander Leidinger	555f86b8b6	The linux times syscall can be called with a NULL pointer, so keep cool and don't panic. This fix is different from the patch submitted as it not only prevents a NULL-pointer dereference, but also skips some work in this case. Noticed by: Dmitry Ganenko <dima@apk-inform.com> Reviewed by: rdivacky (the original version as in emulation@) MFC after: 1 week Security: This is a RELENG_x_y candidate (local DoS). Go ahead by: secteam (cperciva)	2006-06-23 18:49:38 +00:00
Doug Ambrisko	edb75eca27	Fix file leaking in translate_path_major_minor.	2006-05-16 17:57:00 +00:00
Alexander Leidinger	01e0ffbae8	Now that we don't have a linuxolator on alpha anymore: - unifdef __alpha__ - revert rev. 1.66 of linux_socket.c	2006-05-10 20:38:16 +00:00
Alexander Leidinger	17138b619c	Implement rt_sigpending in the linuxolator. PR: 92671 Submitted by: Markus Niemistö <markus.niemisto@gmx.net>	2006-05-10 18:17:29 +00:00
Doug Ambrisko	03487601c2	Fix the duplicate cut-n-paste in linux_fstat64 pointed out by Alexander Leidinger. I forgot to fix it in this version.	2006-05-05 16:17:59 +00:00
Doug Ambrisko	060e488247	Enhance the Linux emulation layer to make MegaRAID SAS managements tool happy. Add back in a scheme to emulate old type major/minor numbers via hooks into stat, linprocfs to return major/minors that Linux app's expect. Currently only /dev/null is always registered. Drivers can register via the Linux type shim similar to the ioctl shim but by using linux_device_register_handler/linux_device_unregister_handler functions. The structure is: struct linux_device_handler { char bsd_driver_name; char linux_driver_name; char bsd_device_name; char linux_device_name; int linux_major; int linux_minor; int linux_char_device; }; Linprocfs uses this to display the major number of the driver. The soon to be available linsysfs will use it to fill in the driver name. Linux_stat uses it to translate the major/minor into Linux type values. Note major numbers are dynamically assigned via passing in a -1 for the major number so we don't need to keep track of them. This is somewhat needed due to us switching to our devfs. MegaCli will not run until I add in the linsysfs and mfi Linux compat changes. Sponsored by: IronPort Systems	2006-05-05 16:10:45 +00:00
Robert Watson	f7f45ac8e2	Annotate uses of fgetsock() with indications that they should rely on their existing file descriptor references to sockets, rather than use fgetsock() to retrieve a direct socket reference. MFC after: 3 months	2006-04-01 15:25:01 +00:00
Tai-hwa Liang	d9d46ed258	Unbreaking build by removing a now unused variable.	2006-03-27 23:27:11 +00:00
John Baldwin	b77619bd7f	Use td_ucred rather than p_ucred to avoid panics and general unhappiness. Pointy hat to: netchild	2006-03-27 19:16:31 +00:00
Alexander Leidinger	1daa386fcf	Fix the LINT build on alpha: - rename some file local structure definitions, the names clash with autogenerated names - on !alpha add some compatibility defines for those renamed structures - make some functions globally visible on alpha	2006-03-21 21:56:04 +00:00
Alexander Leidinger	61da9d97fb	Fix tinderbox on alpha. Tested by: cross-compile	2006-03-20 19:46:56 +00:00
Ruslan Ermilov	aefce619cf	Unbreak COMPAT_LINUX32 option support on amd64. Broken by: netchild	2006-03-19 11:10:33 +00:00
Alexander Leidinger	d4a3f5ddb6	Fixup some problems in my previous commit (COMPAT_43). Pointyhat to: netchild	2006-03-18 20:47:36 +00:00
Alexander Leidinger	5c8919adf4	Get rid of the need of COMPAT_43 in the linuxolator. Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> Obtained from: DragonFly (some parts)	2006-03-18 18:20:17 +00:00
Jeff Roberson	c4be19469a	- Remove ifdef disabled code that doesn't have a chance of working anymore.	2006-02-06 10:10:42 +00:00
Jeff Roberson	d6791f7615	- vn_lock with LK_RETRY can not return an error. The code that handled this case was not necessary. Sponsored by: Isilon Systems, Inc.	2006-01-30 08:22:56 +00:00
Olivier Houchard	d425dbec89	Fix a typo : deivce => device Spotted by: rwatson	2006-01-26 21:48:50 +00:00
Olivier Houchard	e83d253beb	Linux compat bits needed to make linux programs use the new ptys : linux_ioctl.[ch] : Implement LINUX_TIOCGPTN, which returns the pty number linux_stats.c : - Return the magic number for devfs. - In various stats()-related functions, check that we're stating a file in /dev/pts, and if so, change the st_rdev field to match what linux expects to be there for a slave pty device. The glibc checks for this, and their openpty() fails if it is not correct.	2006-01-26 01:32:46 +00:00
Tom Rhodes	0e36e11d57	Cast tv_sec to intmax_t and print with %jd in some ifdef'ed code.	2005-12-28 07:08:54 +00:00
Gleb Smirnoff	3c6160327d	Add \n to log() message. Submitted by: Stanislaw Halik <weirdo tehran.lain.pl>	2005-12-27 00:17:11 +00:00
John Baldwin	410d857972	Remove linux_mib_destroy() (which I actually added in between 5.0 and 5.1) which existed to cleanup the linux_osname mutex. Now that MTX_SYSINIT() has grown a SYSUNINIT to destroy mutexes on unload, the extra destroy here was redundant and resulted in panics in debug kernels. MFC after: 1 week Reported by: Goran Gajic ggajic at afrodita dot rcub dot bg dot ac dot yu	2005-12-15 16:30:41 +00:00
Xin LI	1278dd6847	In Linux, kernel parameters passed to ioctl are by value, while in FreeBSD they are passed by reference. Handle the difference within the linux_ioctl_termio on the LINUX_TCFLSH path. Submitted by: Jaroslav Drzik <jaro_AT_coop-voz_dot_sk>	2005-12-13 15:32:52 +00:00
Gleb Smirnoff	7a14354549	Suppress logging about unimplemented syscalls to one time per process. This prevents hard flood of the system console. Reviewed by: bde	2005-12-08 13:33:57 +00:00
Ruslan Ermilov	f4e9888107	Fix -Wundef.	2005-12-04 02:12:43 +00:00
David Xu	9104847f21	1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most changes in MD code are trivial, before this change, trapsignal and sendsig use discrete parameters, now they uses member fields of ksiginfo_t structure. For sendsig, this change allows us to pass POSIX realtime signal value to user code. 2. Remove cpu_thread_siginfo, it is no longer needed because we now always generate ksiginfo_t data and feed it to libpthread. 3. Add p_sigqueue to proc structure to hold shared signals which were blocked by all threads in the proc. 4. Add td_sigqueue to thread structure to hold all signals delivered to thread. 5. i386 and amd64 now return POSIX standard si_code, other arches will be fixed. 6. In this sigqueue implementation, pending signal set is kept as before, an extra siginfo list holds additional siginfo_t data for signals. kernel code uses psignal() still behavior as before, it won't be failed even under memory pressure, only exception is when deleting a signal, we should call sigqueue_delete to remove signal from sigqueue but not SIGDELSET. Current there is no kernel code will deliver a signal with additional data, so kernel should be as stable as before, a ksiginfo can carry more information, for example, allow signal to be delivered but throw away siginfo data if memory is not enough. SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can not be caught or masked. The sigqueue() syscall allows user code to queue a signal to target process, if resource is unavailable, EAGAIN will be returned as specification said. Just before thread exits, signal queue memory will be freed by sigqueue_flush. Current, all signals are allowed to be queued, not only realtime signals. Earlier patch reviewed by: jhb, deischen Tested on: i386, amd64	2005-10-14 12:43:47 +00:00
Robert Watson	5f419982c2	Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57, osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60, svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81, svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55, svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10, ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58, unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133: Now that Giant is acquired in uprintf() and tprintf(), the caller no longer needs to acquire Giant unless it also holds another mutex that would generate a lock order reversal when calling into these functions. Specifically not backed out is the acquisition of Giant in nfs_socket.c and rpcclnt.c, where local mutexes are held and would otherwise violate the lock order with Giant. This aligns this code more with the eventual locking of ttys. Suggested by: bde	2005-09-28 07:03:03 +00:00
Robert Watson	84d2b7df26	Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(), as they both interact with the tty code (!MPSAFE) and may sleep if the tty buffer is full (per comment). Modify all consumers of uprintf() and tprintf() to hold Giant around calls into these functions. In most cases, this means adding an acquisition of Giant immediately around the function. In some cases (nfs_timer()), it means acquiring Giant higher up in the callout. With these changes, UFS no longer panics on SMP when either blocks are exhausted or inodes are exhausted under load due to races in the tty code when running without Giant. NB: Some reduction in calls to uprintf() in the svr4 code is probably desirable. NB: In the case of nfs_timer(), calling uprintf() while holding a mutex, or even in a callout at all, is a bad idea, and will generate warnings and potential upset. This needs to be fixed, but was a problem before this change. NB: uprintf()/tprintf() sleeping is generally a bad idea, as is having non-MPSAFE tty code. MFC after: 1 week	2005-09-19 16:51:43 +00:00
Xin LI	e68796868a	Fix kernel build. Reported by: tinderbox	2005-08-28 13:11:08 +00:00
Craig Rodrigues	8739cd44d0	Rewrite linux_ifconf() to be more like ifconf() in net/if.c so that we do not call uiomove() while IFNET_RLOCK() is held. This eliminates the witness warning: Calling uiomove() with the following non-sleepable locks held: exclusive sleep mutex ifnet r = 0 (0xc096dd60) locked @ /usr/src/sys/modules/linux/../../compat/linux/linux_ioctl.c:2170 MFC after: 2 days	2005-08-27 14:44:10 +00:00
Robert Watson	13f4c340ae	Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to ifnet.if_drv_flags. Device drivers are now responsible for synchronizing access to these flags, as they are in if_drv_flags. This helps prevent races between the network stack and device driver in maintaining the interface flags field. Many __FreeBSD__ and __FreeBSD_version checks maintained and continued; some less so. Reviewed by: pjd, bz MFC after: 7 days	2005-08-09 10:20:02 +00:00
John Baldwin	813a5e14ec	Move MODULE_DEPEND() statements for SYSVIPC dependencies to linux_ipc.c so that they aren't duplicated 3 times and are also in the same file as the code that depends on the SYSVIPC modules.	2005-07-29 19:40:39 +00:00
John Baldwin	02295eedc7	Add Giant around linux_getcwd_common() in linux_getcwd(). Approved by: re (scottl)	2005-07-09 12:34:49 +00:00
John Baldwin	4641373fde	Add missing locking to linux_connect() so that it can be marked MP safe: - Conditionally grab Giant around the EISCONN hack at the end based on debug.mpsafenet. - Protect access to so_emuldata via SOCK_LOCK. Reviewed by: rwatson Approved by: re (scottl)	2005-07-09 12:26:22 +00:00
John Baldwin	8d948cd1ec	Fix the computation of uptime for linux_sysinfo(). Before it was returning the uptime in seconds mod 60 which wasn't very useful. Approved by: re (scottl)	2005-07-07 19:17:55 +00:00
Pawel Jakub Dawidek	06a137780b	Actually only protect mount-point if security.jail.enforce_statfs is set to 2. If we don't return statistics about requested file systems, system tools may not work correctly or at all. Approved by: re (scottl)	2005-06-23 22:13:29 +00:00
Pawel Jakub Dawidek	820a0de9a9	Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs and extend its functionality: value policy 0 show all mount-points without any restrictions 1 show only mount-points below jail's chroot and show only part of the mount-point's path (if jail's chroot directory is /jails/foo and mount-point is /jails/foo/usr/home only /usr/home will be shown) 2 show only mount-point where jail's chroot directory is placed. Default value is 2. Discussed with: rwatson	2005-06-09 18:49:19 +00:00
Maxim Sobolev	bc165ab0fe	Properly convert FreeBSD priority values into Linux values in the getpriority(2) syscall. PR: kern/81951 Submitted by: Andriy Gapon <avg@icyb.net.ua>	2005-06-08 20:41:28 +00:00
Pawel Jakub Dawidek	d0cad55da8	Remove (now) unused argument 'td' from bsd_to_linux_statfs().	2005-05-27 19:25:39 +00:00
Pawel Jakub Dawidek	672d95c55d	The code is under '#ifdef not_that_way', but anyway: - Add missing prison_check_mount() check.	2005-05-22 22:30:31 +00:00
Pawel Jakub Dawidek	a0e96a49df	If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us, so do not duplicate the code in cvtstatfs(). Note, that we now need to clear fsid in freebsd4_getfsstat(). This moves all security related checks from functions like cvtstatfs() and will allow to add more security related stuff (like statfs(2), etc. protection for jails) a bit easier.	2005-05-22 21:52:30 +00:00
Jeff Roberson	7625cbf3cc	- Pass the ISOPEN flag to namei so filesystems will know we're about to open them or otherwise access the data.	2005-04-27 09:05:19 +00:00
Jeff Roberson	4585e3ac5a	- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. See vfs_lookup.c r1.79 for details. Sponsored by: Isilon Systems, Inc.	2005-04-13 10:59:09 +00:00
Matthew N. Dodd	f9763094f1	Implement SOUND_MIXER_INFO ioctl in compat layer.	2005-04-13 04:33:06 +00:00
Matthew N. Dodd	73c730a694	Add support for O_NOFOLLOW and O_DIRECT to Linux fcntl() F_GETFL/F_SETFL.	2005-04-13 04:31:43 +00:00
John Baldwin	98df9218da	- Change the vm_mmap() function to accept an objtype_t parameter specifying the type of object represented by the handle argument. - Allow vm_mmap() to map device memory via cdev objects in addition to vnodes and anonymous memory. Note that mmaping a cdev directly does not currently perform any MAC checks like mapping a vnode does. - Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the cdev the ioctl is acting on rather than trying to find a suitable vnode to map from. Reviewed by: alc, arch@	2005-04-01 20:00:11 +00:00
Jeff Roberson	9f3d9acd26	- Initial cn_lkflags to LK_EXCLUSIVE. Sponsored by: Isilon Systems, Inc.	2005-03-29 10:16:12 +00:00
Brooks Davis	044ba81b85	Use the CTASSERT() macro instead of rolling my own, non-portable one using #error. Suggested by: jhb	2005-03-24 19:26:50 +00:00
Brooks Davis	fe753c29f7	Compile errors are way more useful than panics later. Replace a KASSERT of LINUX_IFNAMSIZ == IFNAMSIZ with a preprocessor check and #error message. This will prevent nasty surprises if users change IFNAMSIZ without updating the linux code appropriately.	2005-03-24 17:51:15 +00:00
David Schultz	aa675b572f	Reject packets larger than IP_MAXPACKET in linux_sendto() for sockets with the IP_HDRINCL option set. Without this change, a Linux process with access to a raw socket could cause a kernel panic. Raw sockets must be created by root, and are generally not consigned to untrusted applications; hence, the security implications of this bug are minimal. I believe this only affects 6-CURRENT on or after 2005-01-30. Found by: Coverity Prevent analysis tool Security: Local DOS	2005-03-23 08:28:00 +00:00
Poul-Henning Kamp	bbbc2d967e	Neuter the duplicated disk-device magic code for now. Somebody with serious linux-clue is necessary to fix this properly.	2005-03-15 11:58:40 +00:00
Maxim Sobolev	8d6e40c3f1	Add kernel-only flag MSG_NOSIGNAL to be used in emulation layers to suppress SIGPIPE signal for the duration of the sendto-family syscalls. Use it to replace previously added hack in Linux layer based on temporarily setting SO_NOSIGPIPE flag. Suggested by: alfred	2005-03-08 16:11:41 +00:00
Maxim Sobolev	2302f0fea8	Handle MSG_NOSIGNAL flag in linux_send() by setting SO_NOSIGPIPE on socket for the duration of the send() call. Such approach may be less than ideal in threading environment, when several threads share the same socket and it might happen that several of them are calling linux_send() at the same time with and without SO_NOSIGPIPE set. However, such race condition is very unlikely in practice, therefore this change provides practical improvement compared to the previous behaviour. PR: kern/76426 Submitted by: Steven Hartland <killing@multiplay.co.uk> MFC after: 3 days	2005-03-07 07:26:42 +00:00
Maxim Sobolev	e3478fe000	Handle unimplemented syscall by instantly returning ENOSYS instead of sending signal first and only then returning ENOSYS to match what real linux does. PR: kern/74302 Submitted by: Travis Poppe <tlp@LiquidX.org>	2005-03-07 00:18:06 +00:00
John Baldwin	501ce30561	Remove linux_emul_find() and the CHECKALT*() macros as they are no longer used.	2005-03-01 17:57:45 +00:00
Poul-Henning Kamp	1e247cc2ce	Neuter linux_ustat() until somebody finds time to try to fix it. The fundamental problem is that we get only the lower 8 bits of the minor device number so there is no guarantee that we can actually find the disk device in question at all. This was probably a bigger issue pre-GEOM where the upper bits signaled which slice were in use. The secondary problem is how we get from (partial) dev_t to vnode. The correct implementation will involve traversing the mount list looking for a perfect match or a possible match (for truncated minor).	2005-02-22 13:39:46 +00:00
Nate Lawson	1e8d246eee	Unbreak the kernel build. Pointy hat to: sobomax.	2005-02-13 19:50:57 +00:00
Maxim Sobolev	1a88a252fd	Backout previous change (disabling of security checks for signals delivered in emulation layers), since it appears to be too broad. Requested by: rwatson	2005-02-13 17:37:20 +00:00
Maxim Sobolev	d8ff44b79f	Split out kill(2) syscall service routine into user-level and kernel part, the former is callable from user space and the latter from the kernel one. Make kernel version take additional argument which tells if the respective call should check for additional restrictions for sending signals to suid/sugid applications or not. Make all emulation layers use the non-checked version, since signal numbers in emulation layers can have a different meaning than in native mode and such protection can cause misbehaviour. As a result remove LIBTHR from the signals allowed to be delivered to a suid/sugid application. Requested (sorta) by: rwatson MFC after: 2 weeks	2005-02-13 16:42:08 +00:00
Maxim Sobolev	282fae35d6	Semctl with IPC_STAT command should return zero in case of success. PR: 73778 Submitted by: Andriy Gapon <avg@icyb.net.ua> MFC after: 2 weeks	2005-02-11 13:46:55 +00:00
John Baldwin	f7a2587298	- Use kern_{l,f,}stat() and kern_{f,}statfs() functions rather than duplicating the contents of the same functions inline. - Consolidate common code to convert a BSD statfs struct to a Linux struct into a static worker function.	2005-02-07 18:47:28 +00:00
John Baldwin	25771ec2a4	Make linux_emul_convpath() a simple wrapper for kern_alternate_path().	2005-02-07 18:46:05 +00:00
John Baldwin	76951d21d1	- Tweak kern_msgctl() to return a copy of the requested message queue id structure in the struct pointed to by the 3rd argument for IPC_STAT and get rid of the 4th argument. The old way returned a pointer into the kernel array that the calling function would then access afterwards without holding the appropriate locks and doing non-lock-safe things like copyout() with the data anyways. This change removes that unsafeness and resulting race conditions as well as simplifying the interface. - Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(), fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of code duplication for freebsd4 and netbsd compatibility system calls. - Add a new lookup function kern_alternate_path() that looks up a filename under an alternate prefix and determines which filename should be used. This is basically a more general version of linux_emul_convpath() that can be shared by all the ABIs thus allowing for further reduction of code duplication.	2005-02-07 18:44:55 +00:00
John Baldwin	12dd959a7d	Use kern_setitimer() to implement linux_alarm() instead of fondling the real interval timer directly.	2005-02-07 18:36:21 +00:00
Maxim Sobolev	4379219537	Boot away another stackgap (one of the last ones in linuxlator/i386) by providing special version of CDIOCREADSUBCHANNEL ioctl(), which assumes that result has to be placed into kernel space not user space. In the long run more generic solution has to be designed WRT emulating various ioctl()s that operate on userspace buffers, but right now only one such ioctl() is emulated, so that it makes little sense. MFC after: 2 weeks	2005-01-30 08:12:37 +00:00
Maxim Sobolev	a6886ef173	Extend kern_sendit() to take another enum uio_seg argument, which specifies where the buffer to send lies and use it to eliminate yet another stackgap in linuxlator. MFC after: 2 weeks	2005-01-30 07:20:36 +00:00
Maxim Sobolev	f4b6eb045f	Split out kernel side of msgctl(2) into two parts: the first that pops data from the userland and pushes results back and the second which does actual processing. Use the latter to eliminate stackgap in the linux wrapper of that syscall. MFC after: 2 weeks	2005-01-26 00:46:36 +00:00
Maxim Sobolev	cfa0efe7ab	Split out kernel side of {get,set}itimer(2) into two parts: the first that pops data from the userland and pushes results back and the second which does actual processing. Use the latter to eliminate stackgap in the linux wrappers of those syscalls. MFC after: 2 weeks	2005-01-25 21:28:28 +00:00
David E. O'Brien	1997c537be	Match the LINUX32's style with existing style Submitted by: Jung-uk Kim <jkim@niksun.com> Use positive, not negative logic.	2005-01-14 04:44:56 +00:00
David E. O'Brien	9c0552ce3e	Fix Linux compat 'uname -m' on AMD64. Submitted by: Jung-uk Kim <jkim@niksun.com> (patch reworked by me)	2005-01-14 03:45:26 +00:00
Warner Losh	898b0535b7	Start each of the license/copyright comments with /*-	2005-01-05 22:34:37 +00:00
Poul-Henning Kamp	c9b621fb98	Do not blindly pass linux filesystem specific mount data across.	2004-12-03 18:14:22 +00:00
Poul-Henning Kamp	f8524838b9	Ignore MNT_NODEV option, it is implicit in choice of filesystem.	2004-11-26 07:39:20 +00:00
David Malone	08de85f54a	Rename thread args to be called "td" rather than "p" to be consistent with other bits of this file. There should be no functional change. Submitted by: Andrea Campi (many moons ago) MFC after: 2 month	2004-10-10 18:34:30 +00:00
John Baldwin	78c85e8dfc	Rework how we store process times in the kernel such that we always store the raw values including for child process statistics and only compute the system and user timevals on demand. - Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result. - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage(). - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc. - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel. - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did. - calcru() now correctly handles threads executing on other CPUs. - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock. - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone. - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console. Submitted by: bde (mostly) MFC after: 1 month	2004-10-05 18:51:11 +00:00
Poul-Henning Kamp	f69f5fbd42	Hold thread reference while frobbing cdevsw.	2004-09-24 06:37:00 +00:00
John Baldwin	2ca25ab53e	Fix the ABI wrappers to use kern_fcntl() rather than calling fcntl() directly. This removes a few more users of the stackgap and also marks the syscalls using these wrappers MP safe where appropriate. Tested on: i386 with linux acroread5 Compiled on: i386, alpha LINT	2004-08-24 20:21:21 +00:00
Dag-Erling Smørgrav	72261b9f61	Don't try to translate the control message unless we're certain it's valid; otherwise a caller could trick us into changing any 32-bit word in kernel memory to LINUX_SOL_SOCKET (0x00000001) if its previous value is SOL_SOCKET (0x0000ffff). MFC after: 3 days	2004-08-23 12:41:29 +00:00
David E. O'Brien	b61c60d401	Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.	2004-08-16 12:15:07 +00:00
David E. O'Brien	4a16b489ca	Fix the 'DEBUG' argument code to unbreak the amd64 LINT build.	2004-08-16 11:12:57 +00:00
David E. O'Brien	3a2e3a4aa7	Fix the 'DEBUG' argument code to unbreak the LINT build.	2004-08-16 10:36:12 +00:00
Tim J. Robbins	4af2762336	Changes to MI Linux emulation code necessary to run 32-bit Linux binaries on AMD64, and the general case where the emulated platform has different size pointers than we use natively: - declare certain structure members as l_uintptr_t and use the new PTRIN and PTROUT macros to convert to and from native pointers. - declare some structures __packed on amd64 when the layout would differ from that used on i386. - include <machine/../linux32/linux.h> instead of <machine/../linux/linux.h> if compiling with COMPAT_LINUX32. This will need to be revisited before 32-bit and 64-bit Linux emulation support can coexist in the same kernel. - other small scattered changes. This should be a no-op on i386 and Alpha.	2004-08-16 07:28:16 +00:00
Tim J. Robbins	ae8e14a6ac	Replace linux_getitimer() and linux_setitimer() with implementations based on those in freebsd32_misc.c, removing the assumption that Linux uses the same layout for struct itimerval as we use natively.	2004-08-15 12:34:15 +00:00
Tim J. Robbins	d1d6dbf120	Avoid assuming that l_timeval is the same as the native struct timeval in linux_select().	2004-08-15 12:24:05 +00:00
Tim J. Robbins	6fa534bad8	Use sv_psstrings from the current process's sysentvec structure instead of PS_STRINGS. This is a no-op at present, but it will be needed when running 32-bit Linux binaries on amd64 to ensure PS_STRINGS is in addressable memory.	2004-08-15 11:52:45 +00:00
Poul-Henning Kamp	41befa53a4	Add XXX comment about findcdev() misuse.	2004-08-14 08:38:17 +00:00
Poul-Henning Kamp	ebb48ffd65	Use kernel_vmount() instead of vfs_nmount().	2004-07-27 21:38:42 +00:00
Colin Percival	56f21b9d74	Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb	2004-07-26 07:24:04 +00:00
David Malone	fb75797e40	I missed two pieces of the commit to this file. Robert has already added one, this adds the other.	2004-07-18 09:26:34 +00:00
Robert Watson	38da2381cd	Remove 'sg' argument to linux_sendto_hdrincl, which is what I think was intended. This fixes the build, but might require revision.	2004-07-18 04:09:40 +00:00
David Malone	e140eb430c	Add a kern_setsockopt and kern_getsockopt which can read the option values from either user land or from the kernel. Use them for [gs]etsockopt and to clean up some calls to [gs]etsockopt in the Linux emulation code that uses the stackgap.	2004-07-17 21:06:36 +00:00
Poul-Henning Kamp	552afd9c12	Clean up and wash struct iovec and struct uio handling. Add copyiniov() which copies a struct iovec array in from userland into a malloc'ed struct iovec. Caller frees. Change uiofromiov() to malloc the uio (caller frees) and name it copyinuio() which is more appropriate. Add cloneuio() which returns a malloc'ed copy. Caller frees. Use them throughout.	2004-07-10 15:42:16 +00:00
Poul-Henning Kamp	87d72a8f27	Use a couple of regular kernel entry points, rather than COMPAT_43 entry points.	2004-07-08 10:18:07 +00:00
Alexander Leidinger	a92c890fd1	Implement SNDCTL_DSP_SETDUPLEX. This may fix sound apps which want to use full duplex mode. Approved by: matk	2004-07-02 15:31:44 +00:00
Bruce Evans	d436410960	Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/vnode.h> for the definition of GIANT_REQUIRED. Sorted includes.	2004-06-23 06:35:43 +00:00
Robert Watson	537ca45a2e	Mark linux_emul_convpath() as GIANT_REQUIRED.	2004-06-22 04:22:34 +00:00
Bruce M Simpson	cc5f91ee35	Add stub for Linux SOUND_MIXER_READ_RECMASK, required by some Linux sound applications. PR: misc/27471 Submitted by: Gavin Atkinson (with cleanups)	2004-06-18 14:36:24 +00:00
Bruce M Simpson	bf4f8992cd	Add a stub for the Linux SOUND_MIXER_INFO ioctl (even though we don't actually implement it), as some applications, such as RealProducer, expect to be able to use it. PR: kern/65971 Submitted by: Matt Wright	2004-06-18 14:25:44 +00:00
Bruce M Simpson	3f77a2b479	Linux applications expect to be able to call SIOCGIFCONF with a NULL ifc.ifc_buf pointer, to determine the expected buffer size. The submitted fix only takes account of interfaces with an AF_INET address configured. This could no doubt be improved. PR: kern/45753 Submitted by: Jacques Garrigue (with cleanups)	2004-06-18 14:06:46 +00:00
Bruce M Simpson	36db02ff0b	Fix the VT_SETMODE/CDROMIOCTOCENTRY problem correctly. Reviewed by: tjr	2004-06-18 13:36:30 +00:00
Bruce M Simpson	e41fce295e	Fix two attempts to use an unchecked NULL pointer provided from the userland, for the CDIOREADTOCENTRY and VT_SETMODE cases respectively. Noticed by: tjr	2004-06-18 09:13:35 +00:00
Poul-Henning Kamp	f3732fd15b	Second half of the dev_t cleanup. The big lines are: NODEV -> NULL NOUDEV -> NODEV udev_t -> dev_t udev2dev() -> findcdev() Various minor adjustments including handling of userland access to kernel space struct cdev etc.	2004-06-17 17:16:53 +00:00
Poul-Henning Kamp	89c9c53da0	Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.	2004-06-16 09:47:26 +00:00
Poul-Henning Kamp	71e9d5f9c8	Add support for more linux ioctls. I've had this sitting in my tree for a long time and I can't seem to find who sent it to me in the first place, apologies to whoever is missing out on a Contributed by: line here. I believe it works as it should.	2004-06-14 07:26:23 +00:00
Poul-Henning Kamp	1930e303cf	Deorbit COMPAT_SUNOS. We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.	2004-06-11 11:16:26 +00:00
John Baldwin	b7e23e826c	- Replace wait1() with a kern_wait() function that accepts the pid, options, status pointer and rusage pointer as arguments. It is up to the caller to copyout the status and rusage to userland if needed. This lets us axe the 'compat' argument and hide all that functionality in owait(), by the way. This also cleans up some locking in kern_wait() since it no longer has to drop locks around copyout() since all the copyout()'s are deferred. - Convert owait(), wait4(), and the various ABI compat wait() syscalls to use kern_wait() rather than wait1() or wait4(). This removes a bit more stackgap usage. Tested on: i386 Compiled on: i386, alpha, amd64	2004-03-17 20:00:00 +00:00
Tim J. Robbins	7b0d017245	Use vfs_nmount() to mount linprocfs filesystems in linux_mount(); linprocfs doesn't support the old mount interface.	2004-03-16 09:05:56 +00:00
Tim J. Robbins	2ba9b76668	Correct size argument passed to copyinstr() in linux_mount(): mntfromname and mntonname are both MNAMELEN characters long, not MFSNAMELEN.	2004-03-16 08:37:19 +00:00
Poul-Henning Kamp	651b11eaf2	Remove unused second arg to vfinddev(). Don't call addaliasu() on VBLK nodes.	2004-03-11 16:33:11 +00:00
Poul-Henning Kamp	816d62bbb9	Device megapatch 5/6: Remove the unused second argument from udev2dev(). Convert all remaining users of makedev() to use udev2dev(). The semantic difference is that udev2dev() will only locate a pre-existing dev_t, it will not, like makedev(), create a new one. Apart from the tiny well controlled window in D_PSEUDO drivers, there should no longer be any "anonymous" dev_t's in the system now, only dev_t's created with make_dev() and make_dev_alias()	2004-02-21 21:32:15 +00:00
Bruce M Simpson	a1166f2439	Add BSD compatibility tty ioctls LINUX_TIOCSBRK and LINUX_TIOCCBRK. This addition appears to allow VMware 3 Workstation to operate with nmdm(4) as a virtual COM device. Tested by: Guido van Rooij	2004-02-19 12:38:12 +00:00
John Baldwin	91d5354a2c	Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64	2004-02-04 21:52:57 +00:00
Don Lewis	ff5f695e78	VOP_GETATTR() wants the vnode passed to it to be locked. Instead of adding the code to lock and unlock the vnodes and taking care to avoid deadlock, simplify linux_emul_convpath() by comparing the vnode pointers directly instead of comparing their va_fsid and va_fileid attributes. This allows the removal of the calls to VOP_GETATTR().	2004-01-14 22:38:03 +00:00
Alan Cox	277b62040d	Lock the traversal of the vm object list. Use TAILQ_FOREACH consistently.	2004-01-02 19:29:31 +00:00
Bruce Evans	3db2a84395	Quick fix for LINT breakage caused by interface changes in accept(2), etc. The log message for rev.1.160 of kern/uipc_syscalls.c and associated changes only claimed to add restrict qualifiers (which have no effect in the kernel so they probably shouldn't be added), but the following interface changes were also made: - caddr_t to `void *' and `struct sockaddr_t *' - `int *' to `socklen_t *'. These interface changes are not quite null, and this fix is quick (like the changes in uipc_syscalls 1.160) because it uses bogus casts instead of complete bounds-checked conversions. Things should be fixed better when the conversions can be done without using the stack gap. linux_check_hdrincl() already uses the stack gap and is fixed completely though the type mismatches in it were not fatal (there were only fatal type mismatches from unopaquing pointers to [o]sockaddr't's -- the difference between accept()'s args and oaccept()'s args is now non-opaque, but this is not reflected in their args structs).	2003-12-25 09:59:02 +00:00
Alexander Kabaev	501f5ff123	Do not call VOP_GETATTR in getdents function. It does not serve any purpose and the resulting vattr structure was ignored. In addition, the VOP_GETATTR call was made with no vnode lock held, resulting in vnode locking violation panic with debug kernels. Reported by: truckman Approved by: re@ (rwatson)	2003-11-19 04:12:32 +00:00
Robert Watson	0b92da272c	Add a MAC check for VOP_LOOKUP() in the Linux getcwd() implementation. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-17 18:57:20 +00:00
Maxim Sobolev	d09c47acd9	Pull latest changes from OpenBSD: - improve sysinfo(2) syscall; - add dummy fadvise64(2) syscall; - add dummy *xattr(2) family of syscalls; - add protos for the syscalls 222-225, 238-249 and 253-267; - add exit_group(2) syscall, which is currently just wired to exit(2). Obtained from: OpenBSD MFC after: 2 weeks	2003-11-16 15:07:10 +00:00
David Malone	5a8a13e0fc	Use kern_sendit rather than sendit for the Linux send* syscalls. This means we can avoid using the stack gap for most send* syscalls now (it is still used in the IP_HDRINCL case).	2003-11-09 17:04:04 +00:00
Eric Anholt	0b399cc8a6	Prevent leaking of fsid to non-root users in linux_statfs and linux_fstatfs. Matches native syscalls now. PR: kern/58793 Submitted by: David P. Reese Jr. <daver@gomerbud.com> MFC after: 1 week	2003-11-05 23:52:54 +00:00
Max Khon	2332251c6a	Back out the following revisions: 1.36 +73 -60 src/sys/compat/linux/linux_ipc.c 1.83 +102 -48 src/sys/kern/sysv_shm.c 1.8 +4 -0 src/sys/sys/syscallsubr.h That change was intended to support vmware3, but wantrem parameter is useless because vmware3 uses SYSV shared memory to talk with X server and X server is native application. The patch worked because check for wantrem was not valid (wantrem and SHMSEG_REMOVED was never checked for SHMSEG_ALLOCATED segments). Add kern.ipc.shm_allow_removed (integer, rw) sysctl (default 0) which when set to 1 allows to return removed segments in shm_find_segment_by_shmid() and shm_find_segment_by_shmidx(). MFC after: 1 week	2003-11-05 01:53:10 +00:00
Brooks Davis	9bf40ede4a	Replace the if_name and if_unit members of struct ifnet with new members if_xname, if_dname, and if_dunit. if_xname is the name of the interface and if_dname/unit are the driver name and instance. This change paves the way for interface renaming and enhanced pseudo device creation and configuration semantics. Approved By: re (in principle) Reviewed By: njl, imp Tested On: i386, amd64, sparc64 Obtained From: NetBSD (if_xname)	2003-10-31 18:32:15 +00:00
Tim J. Robbins	1d2d5501f9	Reject negative ngrp arguments in linux_setgroups() and linux_setgroups16(); stops users being able to cause setgroups to clobber the kernel stack by copying in data past the end of the linux_gidset array.	2003-10-21 11:00:33 +00:00
Sam Leffler	62a531a702	fix build: linux_to_bsd_msf_lba is no longer used because of previous commit	2003-10-20 17:56:10 +00:00
Søren Schmidt	a55140ce07	We don't support CDROMREADAUDIO anymore.	2003-10-20 09:51:00 +00:00
Mitsuru IWASAKI	84b11cd80a	Fix some problems in linux_sendmsg() and linux_recvmsg(). - Allocate storage for uap->msg always because it is copyin()'ed in native sendmsg(). - Convert sockopt level from Linux to FreeBSD after native recvmsg() calling. - Some cleanups. Tested with: Oracle 9i shared server connection mode. MFC after: 1 week	2003-10-11 15:08:32 +00:00
Bruce Evans	34eec0a169	Restored a non-egregious cast so that this file compiles on i386's with 64-bit longs again. This was fixed in rev.1.42 but the fix rotted non-fatally in rev.1.105 and fatally in rev.1.137. Many more non-egregious casts are strictly required for conversions from semi-opaque types to pointers, but we avoid most of them by using types that are almost certain to be compatible with uintptr_t for representing pointers (e.g., vm_offset_t). Here we don't really want the u_longs, but we have them because a.out.h and its support code doesn't use typedefs (it uses unsigned in V7 and unsigned long in FreeBSD) and is too obsolete to fix now.	2003-09-07 13:03:13 +00:00
Dag-Erling Smørgrav	7576b4b4c0	Try to make 'uname -a' look more like it does on Linux: - cut the version string at the newline, suppressing information about who built the kernel and in what directory. Most of this information was already lost to truncation. - on i386, return the precise CPU class (if known) rather than just "i386". Linux software which uses this information to select which binary to run often does not know what to make of "i386".	2003-07-29 10:03:15 +00:00
Poul-Henning Kamp	a8d43c90af	Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland. The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful to in particular fdescfs and /dev/fd/*. For now pass -1 all over the place.	2003-07-26 07:32:23 +00:00
Poul-Henning Kamp	567104a148	Add a new function swap_pager_status() which reports the total size of the paging space and how much of it is in use (in pages). Use this interface from the Linuxolator instead of groping around in the internals of the swap_pager.	2003-07-18 10:26:09 +00:00
Marcel Moolenaar	19acf030a2	Don't map LINUX_POSIX_VDISABLE to _POSIX_VDISABLE and vice versa for the VMIN and VTIME members of the c_cc array. These members are not special control characters. By not excluding these members we changed the noncanonical mode input processing when both members were 0 on entry (=LINUX_POSIX_VDISABLE) as we would remap them to 255 (=_POSIX_VDISABLE). See termios(4) case A for how that screws up your terminal I/O. PR: 23173 Originator: Bjarne Blichfeldt <bbl@dk.damgaard.com> Patch by: Boris Nikolaus <bn@dali.tellique.de> (original submission) Philipp Mergenthaler <philipp.mergenthaler@stud.uni-karlsruhe.de> Reminders by: Joseph Holland King <gte743n@cad.gatech.edu> MFC after: 5 days	2003-06-28 19:32:07 +00:00
Poul-Henning Kamp	3b6d965263	Add a f_vnode field to struct file. Several of the subtypes have an associated vnode which is used for stuff like the f*() functions. By giving the vnode a separate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use. At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.	2003-06-22 08:41:43 +00:00
David E. O'Brien	16dbc7f228	Use __FBSDID().	2003-06-10 21:29:12 +00:00
Martin Blapp	f130dcf22a	Change the semantics of sysv shm emulation to take an additional argument to the functions shm{at,ctl}1 and shm_find_segment_by_shmid{x}. The BSD semantics didn't allow the usage of shared segment after being marked for removal through IPC_RMID. The patch involves the following functions: - shmat - shmctl - shm_find_segment_by_shmid - shm_find_segment_by_shmidx - linux_shmat - linux_shmctl Submitted by: Orlando Bassotto <orlando.bassotto@ieo-research.it> Reviewed by: marcel	2003-05-05 09:22:58 +00:00
Martin Blapp	a966b13d67	Initialize tbuf in newstat_copyout() too. Reviewed by: phk	2003-04-29 17:03:22 +00:00
Alexander Kabaev	104a9b7e3e	Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-04-29 13:36:06 +00:00
Martin Blapp	616aa29a0e	Do the same thing for stat64_copyout() as we already do for newstat_copyout(). Lie about disk drives which are character devices in FreeBSD but block devices under Linux. PR: 37227 Submitted by: Vladimir B. Grebenschikov <vova@sw.ru> Reviewed by: phk MFC after: 2 weeks	2003-04-29 12:36:03 +00:00
John Baldwin	2f7ed219b2	Argh! We want to return the old signal set when the error return is zero (i.e. success), not non-zero (failure). Submitted by: tegge Pointy hat to: jhb	2003-04-28 19:43:11 +00:00
John Baldwin	19dde5cd3b	Use a switch to convert the Linux sigprocmask flags to the equivalent FreeBSD flags instead of just adding one to the Linux flags. This should be identical to the previous version except that I have at least one report of this patch fixing problems people were having with Linux apps after my last commit to this file. It is safer to use the switch than to make assumptions about the flag values anyways, esp. since we currently use MD defines for the values of the flags and this is MI code. Tested by: Michael Class <michael_class@gmx.net>	2003-04-25 19:26:18 +00:00
Eric Anholt	caa18809df	Add an ioctl handler for the DRM. This removes the need for the DRM_LINUX option, which has been a source of frustration for many users.	2003-04-24 23:36:35 +00:00
John Baldwin	c6004a6202	Fix a lock order reversal. Unlock the proc before calling fget(). Reported by: kris	2003-04-23 18:13:26 +00:00
John Baldwin	fe8cdcae87	- Replace inline implementations of sigprocmask() with calls to kern_sigprocmask() in the various binary compatibility emulators. - Replace calls to sigsuspend(), sigaltstack(), sigaction(), and sigprocmask() that used the stackgap with calls to the corresponding kern_sig*() functions instead without using the stackgap.	2003-04-22 18:23:49 +00:00
John Baldwin	9d8643eca6	Don't hold the proc lock while performing sigset conversions on local variables.	2003-04-17 22:07:56 +00:00
John Baldwin	8804bf6b03	Use local struct proc variables to reduce repeated td->td_proc dereferences and improve readability.	2003-04-17 22:02:47 +00:00
Poul-Henning Kamp	a300701213	Don't include <sys/disklabel.h>	2003-04-16 20:57:35 +00:00

820 commits