opnsense-src

mirror of https://github.com/opnsense/src.git synced 2026-04-23 23:28:37 -04:00

Author	SHA1	Message	Date
Marcel Moolenaar	47f756866a	Use direct mapped KVA for the sf_buf allocator, as made possible by the previous commit. While here, fix a typo, reformat comments and fix a long line. Tested with: ftpd	2003-09-01 00:12:27 +00:00
Alan Cox	411d10a600	Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy sockets into machine-dependent files. The rationale for this migration is illustrated by the modified amd64 allocator. It uses the amd64's direct map to avoid ephemeral mappings in the kernel's address space. On an SMP, the ephemeral mappings result in an IPI for TLB shootdown for each transmitted page. Yuck. Maintainers of other 64-bit platforms with direct maps should be able to use the amd64 allocator as a reference implementation.	2003-08-29 20:04:10 +00:00
Nate Lawson	5a4d072c93	Minor style cleanups.	2003-08-28 16:30:31 +00:00
Marcel Moolenaar	d0adfaea93	Change LOG2_PAGE_SIZE from 14 to 15 bits. This will cause the CTASSERT in vm_page.h to be reached and thus slightly increases the overall coverage of LINT on ia64.	2003-08-25 20:02:18 +00:00
Marcel Moolenaar	5b6a41bddf	Add the bits for a LINT kernel. It has been verified to compile. We may need to polish this.	2003-08-23 21:47:33 +00:00
Marcel Moolenaar	9539d5b4f6	Remove PAGE_SIZE_4K, PAGE_SIZE_8K and PAGE_SIZE_16K and replace them with LOG2_PAGE_SIZE. A single option is better to LINT than multiple mutually exclusive ones.	2003-08-23 03:39:55 +00:00
Marcel Moolenaar	ca668eda45	Remove unused inclusion of opt_acpi.h	2003-08-23 00:07:52 +00:00
John Baldwin	e7411b9d71	Regen.	2003-08-21 14:16:41 +00:00
John Baldwin	daf54a1e05	Swap sigaction/sigreturn since they are in the wrong order. Noticed indirectly by: peter	2003-08-21 14:16:00 +00:00
Marcel Moolenaar	4a98d8b095	Undo the mistake made in revision 1.77 of trap.c and which was the ultimate trigger for the follow-up fixes in revisions 1.78, 1.80, 1.81 and 1.82 of trap.c. I was simply too pre-occupied with the gateway page and how it blurs kernel space with user space and vice versa that I couldn't see that it was all a load of bollocks. It's not the IP address that matters, it's the privilege level that counts. We never run in user space with lifted permissions and we sure can not run in kernel space without it. Sure, the gateway page is the exception, but not if you look at the privilege level. It's user space if you run with user permissions and kernel space otherwise. So, we're back to looking at the privilege level like it should be. There's no other way. Pointy hat: marcel	2003-08-20 05:30:35 +00:00
Gordon Tetlow	df3d69c217	Fixup the ELF branding information to point to the new home of rtld.	2003-08-17 08:08:38 +00:00
Marcel Moolenaar	710338e94f	In vm_thread_swap{in|out}(), remove the alpha specific conditional compilation and replace it with a call to cpu_thread_swap{in|out}(). This allows us to add similar code on ia64 without cluttering the code even more.	2003-08-16 23:15:15 +00:00
Marcel Moolenaar	26502503e5	Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to cpu.h. This affects db_command.c and kern_shutdown.c. ia64: move all MD prototypes from cpu.h to md_var.h. This affects madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory(). It's not used (vm_machdep.c). alpha: the MD prototypes have been left in cpu.h with a comment that they should be there. Moving them is left for later. It was expected that the impact would be significant enough to be done in a separate commit. powerpc: MD prototypes left in cpu.h. Comment added. Suggested by: bde Tested with: make universe (pc98 incomplete)	2003-08-16 16:57:57 +00:00
Marcel Moolenaar	c6d402d3f2	Fix a range check bug. Don't left-shift the integer argument 'data'. Sign extension happens after the shift, not before so that boundary cases like 0x40000000 will not be caught properly. Instead, right shift ndirty. It is guaranteed to be a multiple of 8. While here, do some manual code motion and code commoning. Range check bug pointed out by: iedowse	2003-08-16 01:49:38 +00:00
Marcel Moolenaar	1fdb0ba9bb	Fix the generation of coredumps. We did not take the dirty registers that were on the kernel stack into account. For now we write them out to the register stack of the process before creating the dump. This however is not the final solution. The problem is that we may invalidate the coredump by overwriting vital information due to an invalid backing store pointer. Instead we need to write the dirty registers to an unused region of VM which will result in a separate segment in the coredump. For now we can at least get to all the registers from a coredump.	2003-08-15 05:52:48 +00:00
Marcel Moolenaar	b00555136c	Add an instruction group break after the move to application register and the move to control register to avoid dependency violations when these functions are used. Note that explicit data and instruction serialization also need to be in a subsequent instruction group. This too requires that we have an igrp break here.	2003-08-15 05:46:33 +00:00
Marcel Moolenaar	60518ee41c	Introduce two machine specific ptrace(2) requests: PT_GETKSTACK and PT_SETKSTACK. These requests allow the tracing process to access the dirty registers of the traced process that are on the kernel stack. Note that there's currently no way to access the rnat register for those dirty registers that are not (yet) covered by a nat collection point. The interface for this is still being slept on. Also note that implied by these requests is the division of work: The tracing process has to keep track of where registers are spilled and is responsible to figure out where the NaT bit of the stacked registers are at any time during the execution of the traced process. The kernel provides the interfaces but will not abstract the fact that the register stack can be split. This model does not follow the approach taken in Linux where PT_PEEK and PT_POKE deal with this automagically.	2003-08-15 05:40:59 +00:00
Marcel Moolenaar	6e1f209af1	Don't use VM_MIN_KERNEL_ADDRESS to check if the faulting address is in user space or kernel space. VM_MIN_KERNEL_ADDRESS starts after the gateway page, which means that improper memory accesses to the gateway page while in user mode would panic the kernel. Use VM_MAX_ADDRESS instead. It ends before the gateway page. The difference between VM_MIN_KERNEL_ADDRESS and VM_MAX_ADDRESS is exactly the gateway page.	2003-08-13 03:20:10 +00:00
Marcel Moolenaar	dfcba5aae3	Put an instruction group break between the move to ar.rnat and the move to ar.rsc. The RSE must be in enforced lazy mode when writing to RSE modifiable registers. In this case we restore the RSE NaT collection register ar.rnat. I have seen 2 general exception faults on pluto1 now that indicate that the move to ar.rsc has already happened prior to the move to ar.rnat, meaning that the RSE is not in enforced lazy mode anymore. The ia64 dependency and instruction ordering rules seem to allow having both registers written to in the same instruction group, provided ar.rsc is written to later than ar.rnat (based on the ordering semantics). It appears that we may be pushing our luck. For now, put them in separate cycles (by means of the instruction group break). If we ever get a general exception fault on the move to ar.rnat again, we have definite proof that something else is fishy.	2003-08-13 02:49:50 +00:00
Warner Losh	06b4bf3e55	Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's copyrighted files. Approved by: Matt Dillon	2003-08-12 23:24:05 +00:00
Marcel Moolenaar	75cf31a016	Extend identifycpu(): o Differentiate between CPU family and CPU model. There are multiple Itanium 2 models and it's nice to differentiate between them. o Separately export the CPU family and CPU model with sysctl. o Merced is the only model in the Itanium family. o Add Madison to the Itanium 2 family. We already knew about McKinley. o Print the CPU family between parentheses, like we do with the i386 CPU class. My prototype now identifies itself as: CPU: Merced (800.03-Mhz Itanium) pluto1 and pluto2 will eventually identify themselves as: CPU: McKinley (900.00-Mhz Itanium 2)	2003-08-12 08:10:16 +00:00
Marcel Moolenaar	e57196b3db	Cleanup prototypes in cpu.h, including fswintrberr and any references to it. Sort the remaining prototypes in cpu.h. No functional change.	2003-08-12 03:51:53 +00:00
Marcel Moolenaar	322d6e0236	Cleanup and style(9) fixes. No functional change.	2003-08-11 21:25:19 +00:00
Marcel Moolenaar	425963bb80	o move cpu_reset() from vm_machdep.c to machdep.c. o reorder cpu_boot(), cpu_halt() and identifycpu(). No functional change.	2003-08-10 21:33:07 +00:00
Marcel Moolenaar	29952636d3	Now that we can ignore up to 8KB of dirty registers, remove the RSE magic from exec_setregs(). In set_mcontext() we now also don't have to worry that we entered the kernel with more than 512 bytes of dirty registers on the kernel stack. Note that we cannot make any assumptions anymore WRT NaT collection points in exec_setregs(), so we have to deal with them now.	2003-08-10 08:04:21 +00:00
Marcel Moolenaar	f8e1f6d036	MFi386 1.422 & 1.423: lock page queues in pmap_insert_entry().	2003-08-08 00:30:26 +00:00
John Baldwin	8b149b5131	Consistently use the BSD u_int and u_short instead of the SYSV uint and ushort. In most of these files, there was a mixture of both styles and this change just makes them self-consistent. Requested by: bde (kern_ktrace.c)	2003-08-07 15:04:27 +00:00
Marcel Moolenaar	1634f50b1b	Better define the flags in the mcontext_t and properly set the flags when we create contexts. The meanings of the flags are documented in <machine/ucontext.h>. I only list them here to help browsing the commit logs: _MC_FLAGS_ASYNC_CONTEXT _MC_FLAGS_HIGHFP_VALID _MC_FLAGS_KSE_SET_MBOX _MC_FLAGS_RETURN_VALID _MC_FLAGS_SCRATCH_VALID Yes, _MC_FLAGS_KSE_SET_MBOX is a hack and I'm proud of it :-)	2003-08-07 07:52:39 +00:00
Marcel Moolenaar	a50bc30203	o Fix cut-n-paste whitespace corruption in previous commit o For trap-based upcalls the argument (the kse_mailbox) to the UTS must be written onto the kernel stack, not the user stack. While here, deal with the fact that we may be at a NaT collection point.	2003-08-07 07:40:19 +00:00
Marcel Moolenaar	bee4e73025	In cpu_set_upcall_kse(), create the upcall according to the entry path into the kernel. Normally it's due to a syscall, but one can also be created as the result of a clock interrupt (for example). This now even more looks like exec_setregs(). While here, add an assert that we don't expect more than 8KB of dirty registers on the kernel stack.	2003-08-06 23:28:19 +00:00
Marcel Moolenaar	5f20d75a5f	o In revision 1.45 of exception.S we changed exception_restore to unconditionally restore ar.k7 (kernel memory stack) and ar.k6 (kernel register stack). I don't know what I was smoking then, but if you unconditionally restore ar.k6, you also want to compute its value unconditionally. By having the computation predicated and dependent on whether we return to user mode, we would end up writing junk (= invalid value for ar.bspstore) if we would return to kernel mode. But the whole point of the unconditional restoration was that there is a grey area where we still need to have ar.k6 restored. If we restore with a junk value, we would end up wedging the machine on the next interrupt. So, unconditionally calculate the value we unconditionally write to ar.k6. o The previous braino was found while making the following change: We used to clear the lower 9 bits of the value we write to ar.k6. The meaning being that we know that the kernel register stack is at least 512 byte aligned and simply clearing the lower 9 bits allows us to return to a context of which we don't have dirty registers on the kernel stack, even though the context that entered the kernel does have dirty registers on the kernel stack. By masking-off the lower bits, we correctly obtain the base of the register stack without having to worry that we didn't actually reach the base while unwinding it. The change is to mask off the lower 13 bits, knowing that the kernel register stack is always 8KB aligned. The advantage is that we don't have to worry anymore if there's more than 512 bytes of dirty registers on the kernel stack. A situation that frequently occurs. In exec_setregs() in machdep.c:1.147 or older, we had to deal with that situation by copying the active portion of the register stack down in multiples of 512 bytes. Now that we mask off the lower 13 bits we don't have to do that at all. 
Contemporary IPF processors have a register file that can hold up to 96 stacked registers (=784 bytes [incl. 2 NaT collections]). With no indication that register files grow beyond a couple of hundred registers, we should not have to worry about it anymore... and yes, 640KB is enough for everybody :-) This change helps setcontext(2) and cpu_set_upcall_kse() in that they can return to completely different contexts without having to mess with the kernel stack. Of course exec_setregs() doesn't need to do that anymore as well.	2003-08-06 21:32:38 +00:00
Marcel Moolenaar	7f36189f8a	o Put the syscall return registers in the context. Not only do we need this for swapcontext(), KSE upcalls initiated from ast() also need to save them so that we properly return the syscall results after having had a context switch. Note that we don't use r11 in the kernel. However, the runtime specification has defined r8-r11 as return registers, so we put r11 in the context as well. I think deischen@ was trying to tell me that we should save the return registers before. I just wasn't ready for it :-) o The EPC syscall code has 2 return registers and 2 frame markers to save. The first (rp/pfs) belongs to the syscall stub itself. The second (iip/cfm) belongs to the caller of the syscall stub. We want to put the second in the context (note that iip and cfm relate to interrupts. They are only being misused by the syscall code, but are not part of a regular context). This way, when the context is switched to again, we return to the caller of setcontext(2) as one would expect. o Deal with dirty registers on the kernel stack. The getcontext() syscall will flush the RSE, so we don't expect any dirty registers in that case. However, in thread_userret() we also need to save the context in certain cases. When that happens, we are sure that there are dirty registers on the kernel stack. This implementation simply copies the registers, one at a time, from the kernel stack to the user stack. NAT collections are not dealt with. Hence we don't preserve NaT bits. A better solution needs to be found at some later time. We also don't deal with this in all cases in set_mcontext. No temporary solution is implemented because it's not a showstopper. The problem is that we need to ignore the dirty registers and we automatically do that for at most 62 registers. When there are more than 62 dirty registers we have a memory "leak". This commit is fundamental for KSE support.	2003-08-05 18:52:02 +00:00
Marcel Moolenaar	02cc6a6f35	Fix logic bug in the previous commit. Any region less than 5 is a user space region. Hence, we need to test if 5 is greater than the region; not greater equal. This bug caused us to call ast() while interrupting kernel mode.	2003-08-04 22:00:48 +00:00
John Baldwin	3bdbd658f1	- Since td_critnest is now initialized in MI code, it doesn't have to be set in cpu_critical_fork_exit() anymore. - As far as I can tell, cpu_thread_link() has never been used, not even when it was originally added, so remove it.	2003-08-04 20:32:45 +00:00
Marcel Moolenaar	46e31b2612	Cleanup the clock code. This includes: o Remove alpha specific timer code (mc146818A) and compiled-out calibration of said timer. o Remove i386 inherited timer code (i8253) and related acquire and release functions. o Move sysbeep() from clock.c to machdep.c and have it return ENODEV. Console beeps should be implemented using ACPI or if no such device is described, using the sound driver. o Move the sysctls related to adjkerntz, disable_rtc_set and wall_cmos_clock from machdep.c to clock.c, where the variables are. o Don't hardcode a hz value of 1024 in cpu_initclocks() and don't bother faking a stathz that's 1/8 of that. Keep it simple: hz defaults to HZ and stathz equals hz. This is also how it's done for sparc64. o Keep a per-CPU ITC counter (pc_clock) and adjustment (pc_clockadj) to calculate ITC skew and corrections. On average, we adjust the ITC match register once every ~1500 interrupts for a duration of 2 consecutive interrupts. This is to correct the non-deterministic behaviour of the ITC interrupt (there's a delay between the match and the raising of the interrupt). o Add 4 debugging sysctls to monitor clock behaviour. Those are debug.clock_adjust_edges, debug.clock_adjust_excess, debug.clock_adjust_lost and debug.clock_adjust_ticks. The first counts the individual adjustment cycles (when the skew first crosses the threshold), the second counts the number of times the adjustment was excessive (any non-zero value is to be considered a bug), the third counts lost clock interrupts and the last counts the number of interrupts for which we applied an adjustment (debug.clock_adjust_ticks / debug.clock_adjust_edges gives the average duration of an individual adjustment -- should be ~2). While here, remove some nearby (trivial) left-overs from alpha and other cleanups.	2003-08-04 05:13:18 +00:00
Marcel Moolenaar	5192a6fc07	Fix handling of external interrupts: we weren't calling ast() when interrupting user mode. The net effect of this bug is that a clock interrupt does not cause rescheduling and processes are not preempted. It only takes a "while (1);" to render the machine useless. This bug was introduced by the context changes and EPC syscall code. Handling of ASTs was moved to C for clarity and ease of maintenance, but was not added for the external interrupt case. This needs to be revisited. We now have calls to do_ast() in trap(), break_syscall() and ivt_External_Interrupt(). A single call in exception_restore covers these 3 places without duplication. This is where we handled ASTs prior to the overhaul, except that the meat has been moved to do_ast(), a C function. This was the goal to begin with. Pointy hat: marcel	2003-08-04 00:08:39 +00:00
David E. O'Brien	a98a5f06d3	Style sync.	2003-08-03 07:50:19 +00:00
Marcel Moolenaar	6a1909919b	Don't use uint64_t. Use unsigned long instead. One is supposed to use ucontext_t without having to include headers other than <ucontext.h>.	2003-08-02 01:12:31 +00:00
Marcel Moolenaar	b65384050e	Write the preserved registers to (and read them from) struct reg and struct fpreg.	2003-08-01 07:21:34 +00:00
Bosko Milekic	b053bc8407	Make sure that when the PV ENTRY zone is created in pmap, that it's created not only with UMA_ZONE_VM but also with UMA_ZONE_NOFREE. In the i386 case in particular, the pmap code would hook a special page allocation routine that allocated from kernel_map and not kmem_map, and so when/if the pageout daemon drained the zones, it could actually push out slabs from the PV ENTRY zone but call UMA's default page_free, which resulted in pages allocated from kernel_map being freed to kmem_map; bad. kmem_free() ignores the return value of the vm_map_delete and just returns. I'm not sure what the exact repercussions could be, but it doesn't look good. In the PAE case on i386, we also set-up a zone in pmap, so be conservative for now and make that zone also ZONE_NOFREE and ZONE_VM. Do this for the pmap zones for the other archs too, although in some cases it may not be entirely necessary. We'd rather be safe than sorry at this point. Perhaps all UMA_ZONE_VM zones should by default be also UMA_ZONE_NOFREE? May fix some of silby's crashes on the PV ENTRY zone.	2003-07-31 03:39:51 +00:00
Peter Wemm	ad7a226f9d	Deal with 'options KSTACK_PAGES' being a global option.	2003-07-31 01:31:32 +00:00
Peter Wemm	aac6412bcd	Cosmetic: fix some disorder of #include "opt_...." files	2003-07-31 01:29:09 +00:00
Peter Wemm	edc367db34	Remove leftover relic of pmap_new_thread() etc.	2003-07-31 01:28:41 +00:00
Maxime Henrion	d5afecd068	- Introduce a new busdma flag BUS_DMA_ZERO to request for zero'ed memory in bus_dmamem_alloc(). This is possible now that contigmalloc() supports the M_ZERO flag. - Remove the locking of Giant around calls to contigmalloc() since contigmalloc() now grabs Giant itself.	2003-07-27 13:52:10 +00:00
Marcel Moolenaar	e2fe99a2e0	Remove prototype of ia64_pa_access(). The function has been moved to mem.c where it's been made static.	2003-07-26 10:13:30 +00:00
Marcel Moolenaar	4f373ec187	Avoid using __aligned(16). Instead define the jmp_buf in terms of long doubles. This gives us 16-byte alignment. Add a CTASSERT for the size of the jmp_buf to detect ABI breakages.	2003-07-26 08:03:43 +00:00
Marcel Moolenaar	dc539f3ee0	Unbreak ia64 builds now -Werror is enabled again. Avoid obsolete memory operand construct.	2003-07-26 07:23:25 +00:00
Marcel Moolenaar	938b878e45	Revert previous commit. We don't use setjmp()/longjmp() for context switching anymore, so there's no need to save and restore GP. This change breaks threaded applications linked against libc_r. Pull the tier 2 card again: relink. This will link against libthr instead.	2003-07-25 22:36:48 +00:00
Alan Cox	059358675e	MFi386 revision 1.416 Add vm object locking to pmap_prefault(). Note: powerpc and sparc64 do not implement this function.	2003-07-25 18:58:39 +00:00
Marcel Moolenaar	c1e97bb458	Remove __aligned(16) from the definition of struct _ia64_fpreg. It's a non-standard construct. Instead, redefine struct _ia64_fpreg as a union and put a long double in it. On ia64 and for LP64, this is defined by the ABI to have 16-byte alignment. For ILP32 a long double has 4-byte alignment, but we don't support ILP32. Note that the in-memory image of a long double does not match the in-memory image of spilled FP registers. This means that one cannot use the fpr_flt field to interpret the bits. For this reason we continue to use an aggregate type.	2003-07-25 08:02:24 +00:00
Marcel Moolenaar	cd7e5d6eb4	Remove INVARIANT* and WITNESS. This makes the simulator much more pleasant to use.	2003-07-25 07:52:20 +00:00
Marcel Moolenaar	076f523998	Move ia64_pa_access() from machdep.c to mem.c and declare it static. It's only used in mem.c and cannot accidentally be used elsewhere this way.	2003-07-25 05:37:13 +00:00
Marcel Moolenaar	c5262d75ed	Disable the single-step trap on a debug related trap, including of course the single-step trap itself.	2003-07-25 00:11:14 +00:00
Marcel Moolenaar	793e17ba11	We sloppily created an array for the high FP registers (f32-f127), but this just created a weird inconsistency when porting gdb(1). Instead, we name each high FP register separately, like we do for all the other registers.	2003-07-23 03:08:34 +00:00
Marcel Moolenaar	e90153536e	Rename thread_siginfo to cpu_thread_siginfo.	2003-07-15 04:43:33 +00:00
Marcel Moolenaar	ed4ee6b2af	Enable the high FP registers when we call the FPSWA handler and disable them again afterwards. This fixes a disabled FP fault while in the FPSWA handler. While here, merge the FP fault and FP trap handling code to reduce code duplication. Where code was different, it was not sure it should be. Trigger case: ports/math/atlas	2003-07-13 04:08:16 +00:00
Marcel Moolenaar	480d3dd2ea	Add logic to trace across/over a trapframe. We have ABI markers in our unwind information for functions that are entry points into the kernel. When stepping to the next frame, the unwinder will let us know when such a marker was encountered. We use this to stop the current unwind session, query the trapframe and restart a new unwind session based on the new trapframe. The implementation is a bit sloppy, but at this time there are bigger fish to fry.	2003-07-12 04:35:09 +00:00
Marcel Moolenaar	2c75b47793	Add a body directive before the first instruction in epc_syscall(). This results in a zero length prologue and a body that covers the whole function. This is more correct.	2003-07-11 08:52:48 +00:00
Marcel Moolenaar	67f79f5a15	Remove a gratuitous align directive after the endp directive for IVT entries.	2003-07-11 08:49:26 +00:00
Marcel Moolenaar	290245ea4c	Don't call malloc() and free() while in the debugger and unwinding to get a stacktrace. This does not work even with M_NOWAIT when we have WITNESS and is generally a bad idea (pointed out by bde@). We allocate an 8K heap for use by the unwinder when ddb is active. A stack trace roughly takes up half of that in any case, so we have some room for complex unwind situations. We don't want to waste too much space though. Due to the nature of unwinding, we don't worry too much about fragmentation or performance of unwinding while in the debugger. For now we have our own heap management, but we may be able to leverage from existing code at some later time. While here: o Make sure we actually free the unwind environment after unwinding. This fixes a memory leak. o Replace Doug's license with mine in unwind.c and unwind.h. Both files don't have much, if any, of Doug's code left since the EPC syscall overhaul and the import of the unwinder. o Remove dead code. o Replace M_NOWAIT with M_WAITOK for all remaining malloc() calls.	2003-07-05 23:21:58 +00:00
Alan Cox	1f78f902a8	Background: pmap_object_init_pt() premaps the pages of a object in order to avoid the overhead of later page faults. In general, it implements two cases: one for vnode-backed objects and one for device-backed objects. Only the device-backed case is really machine-dependent, belonging in the pmap. This commit moves the vnode-backed case into the (relatively) new function vm_map_pmap_enter(). On amd64 and i386, this commit only amounts to code rearrangement. On alpha and ia64, the new machine independent (MI) implementation of the vnode case is smaller and more efficient than their pmap-based implementations. (The MI implementation takes advantage of the fact that objects in -CURRENT are ordered collections of pages.) On sparc64, pmap_object_init_pt() hadn't (yet) been implemented.	2003-07-03 20:18:02 +00:00
Ruslan Ermilov	0be33d3321	The .s files were repo-copied to .S files. Approved by: marcel Repocopied by: joe	2003-07-02 12:57:07 +00:00
Marcel Moolenaar	c55b999c72	The use of SYSINIT requires the inclusion of <sys/kernel.h>	2003-07-02 01:22:29 +00:00
Maxime Henrion	75f9bf73ec	Make this even closer to other busdma backends.	2003-07-01 21:21:45 +00:00
Maxime Henrion	02681c8bc2	Sync bounce pages support with the alpha backend. More precisely: o use a mutex to protect the bounce pages structure. o use a SYSINIT function to initialize the bounce pages structures and thus avoid a race condition in alloc_bounce_pages(). o add support for the BUS_DMA_NOWAIT flag in bus_dmamap_load(). o remove obsolete splhigh()/splx() calls. o remove printf() about incorrect locking in busdma_swi() and sync busdma_swi() with the one of the alpha backend. o use __FBSDID.	2003-07-01 18:08:05 +00:00
Maxime Henrion	4813f72a9b	Honor the boundary of the busdma tag when allocating bounce pages. This was fixed in revision 1.5 of alpha/alpha/busdma_machdep.c and was never fixed in other busdma backends using bounce pages.	2003-07-01 16:54:54 +00:00
Scott Long	f6b1c44d1f	Mega busdma API commit. Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg. Lockfunc allows a driver to provide a function for managing its locking semantics while using busdma. At the moment, this is used for the asynchronous busdma_swi and callback mechanism. Two lockfunc implementations are provided: busdma_lock_mutex() performs standard mutex operations on the mutex that is specified from lockfuncarg. dflt_lock() is a panic implementation and is defaulted to when NULL, NULL are passed to bus_dma_tag_create(). The only time that NULL, NULL should ever be used is when the driver ensures that bus_dmamap_load() will not be deferred. Drivers that do not provide their own locking can pass busdma_lock_mutex,&Giant args in order to preserve the former behaviour. sparc64 and powerpc do not provide real busdma_swi functions, so this is largely a noop on those platforms. The busdma_swi on ia64 is not properly locked yet, so warnings will be emitted on this platform when busdma callback deferrals happen. If anyone gets panics or warnings from dflt_lock() being called, please let me know right away. Reviewed by: tmm, gibbs	2003-07-01 15:52:06 +00:00
Alan Cox	dca96f1adc	- Export pmap_enter_quick() to the MI VM. This will permit the implementation of a largely MI pmap_object_init_pt() for vnode-backed objects. pmap_enter_quick() is implemented via pmap_enter() on sparc64 and powerpc. - Correct a mismatch between pmap_object_init_pt()'s prototype and its various implementations. (I plan to keep pmap_object_init_pt() as the MD hook for device-backed objects on i386 and amd64.) - Correct an error in ia64's pmap_enter_quick() and adjust its interface to match the other versions. Discussed with: marcel	2003-06-29 21:20:04 +00:00
Alan Cox	269acda954	- Remove the calls to pmap_install() from pmap_object_init_pt(); they are redundant. Discussed with: marcel - MFi386: Add vm object locking to pmap_object_init_pt().	2003-06-29 06:10:32 +00:00
Marcel Moolenaar	d9a4740f18	Implement cpu_set_upcall_kse(). Elementary testing shows that this function behaves correctly in principle, but is not expected to be 100% complete. In any case, with this commit we have KSE ported enough to start runtime testing with threaded applications and fix whatever bugs or omissions we encounter. Yay!	2003-06-28 09:22:25 +00:00
David Xu	b8f480ab94	Add a machine-dependent function thread_siginfo, SA signal code will use the function to construct a siginfo structure and use the result to export to userland. Reviewed by: julian	2003-06-28 06:34:08 +00:00
Scott Long	3eaffdf7e0	Do the first and mostly mechanical step of adding mutex support to the bus_dma async callback scheme. Note that sparc64 does not seem to do async callbacks. Note that ia64 callbacks might not be MPSAFE at the moment. Note that powerpc doesn't seem to do async callbacks due to the implementation being incomplete. Reviewed by: mostly silence on arch@	2003-06-27 08:31:48 +00:00
Marcel Moolenaar	e2905ce3a0	Add TLS related relocation.	2003-06-19 06:51:43 +00:00
Alan Cox	40ebf3e43a	Fix a performance bug in all of the various implementations of uma_small_alloc(): They always zeroed the page regardless of what the caller requested.	2003-06-18 02:57:38 +00:00
David Xu	0e2a4d3aeb	Rename P_THREADED to P_SA. P_SA means a process is using scheduler activations.	2003-06-15 00:31:24 +00:00
Alan Cox	49a2507bd1	Migrate the thread stack management functions from the machine-dependent to the machine-independent parts of the VM. At the same time, this introduces vm object locking for the non-i386 platforms. Two details: 1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The different machine-dependent implementations used various combinations of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set KSTACK_GUARD_PAGES to 0. 2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In 5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed to vm_page_alloc() or vm_page_grab().	2003-06-14 23:23:55 +00:00
Alan Cox	89f4fca265	Move the _new_altkstack() and _dispose_altkstack() functions out of the various pmap implementations into the machine-independent vm. They were all identical.	2003-06-14 06:20:25 +00:00
Marcel Moolenaar	222a7e518c	Remove kernel event tracing. The overhead is significant when running under ski.	2003-06-14 00:01:24 +00:00
Marcel Moolenaar	58f2d986a6	Make sure pcpu->pc_pcb is pointing to a 16-byte aligned address. The PCB contains FP registers, whose alignment must be 16 bytes at least. Since the PCB pointed to by pc_pcb is immediately after the PCPU itself, round-up the size of the PCPU to a multiple of 16 bytes. The PCPU is page aligned. This fixes a misalignment trap caused by stopping a CPU in a SMP kernel, as is done when entering the debugger. Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>	2003-06-12 00:15:18 +00:00
Peter Wemm	77e2a274d0	GC unused cpu_wait() function	2003-06-11 05:20:33 +00:00
Juli Mallett	d196a10856	Note that scbus is required for SCSI, not just "required" in general. Submitted by: Edward Kaplan (tmbg37 on IRC) Reviewed by: rwatson (in principle)	2003-06-08 02:03:02 +00:00
Marcel Moolenaar	6f2071769f	pmap_find_vhpt() has been observed to return a NULL pointer when the caller assumes this to not happen by means of performing an indirection without checking the return value. Add KASSERTs to force a kernel with INVARIANTS to panic. This is a short-term measure. The pmap code is scheduled to be overhauled.	2003-06-07 04:17:39 +00:00
Marcel Moolenaar	f09b81f8be	If we get a fault in the gateway page, which would happen if we try to deliver a signal and the RSE backing store has been exhausted or the backing store pointer has been clobbered, we need to make sure we call userret() and do_ast() when we exit from trap(). Not adjusting the local variable 'user' in this case will prevent the faulty process from being terminated and we end up in an infinite fault repetition. Faulty process provided by: bento	2003-06-07 04:10:07 +00:00
Marcel Moolenaar	0785ee125b	Use TRAPF_USERMODE() to replace an equivalent check in trap(). While here, amend the related comment.	2003-06-06 23:44:05 +00:00
Marcel Moolenaar	eaa7bda4a5	Have TRAPF_USERMODE() take into account that the gateway page is not always kernel space. It should be treated as user space when run with user privileges (which is the case for the signal trampolines). This fixes its only use in a KASSERT in subr_trap.c.	2003-06-06 23:27:18 +00:00
Marcel Moolenaar	3f52f44add	Fix the dreaded double counting that was present on alpha as well and got fixed two weeks after the ia64 version was copied from the alpha version (see rev 1.32 of sys/alpha/alpha/mem.c). As such, we were missing the same continue as on alpha. While here, add a default case for the device minor switch and do some general style(9) cleanups. WARNING: this file still has bugs. When reading from region 6 or region 7, we don't validate the physical address. One can trivially cause a machine check by trying to read from address 0xFFFFFFFFFFFFFFF0 or something that uses the unimplemented physical address bits. Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>	2003-06-04 21:56:10 +00:00
Marcel Moolenaar	11e0f8e16d	Change the second (and last) argument of cpu_set_upcall(). Previously we were passing in a void* representing the PCB of the parent thread. Now we pass a pointer to the parent thread itself. The prime reason for this change is to allow cpu_set_upcall() to copy (parts of) the trapframe instead of having it done in MI code in each caller of cpu_set_upcall(). Copying the trapframe cannot always be done with a simple bcopy() or may not always be optimal that way. On ia64 specifically the trapframe contains information that is specific to an entry into the kernel and can only be used by the corresponding exit from the kernel. A trapframe copied verbatim from another frame is in most cases useless without some additional normalization. Note that this change removes the assignment to td->td_frame in some implementations of cpu_set_upcall(). The assignment is redundant. A previous call to cpu_thread_setup() already did the exact same assignment. An added benefit of removing the redundant assignment is that we can now change td_pcb without nasty side-effects. This change officially marks the ability on ia64 for 1:1 threading. Not tested on: amd64, powerpc Compile & boot tested on: alpha, sparc64 Functionally tested on: i386, ia64	2003-06-04 21:13:21 +00:00
Marcel Moolenaar	0fa2b83829	Improve set_mcontext: o Don't copy psr verbatim from the user supplied context. Only allow userland to change the processor settings that are part of the user mask.	2003-06-01 23:22:56 +00:00
Marcel Moolenaar	86f4f6f7b8	Improve on cpu_set_upcall: o Use pcb and tf for the new pcb and the new trapframe and use pcb0 for the old (current) pcb. The mix of pcb, pcb2 and tf was slightly confusing. o Don't define td->td_frame here. It has already been set previously by cpu_thread_setup. Add a KASSERT to make sure pcb and tf are both non-NULL. o Make sure the number of dirty registers is 0 for the new thread. There are no user registers on the backing store because we haven't entered userland yet.	2003-06-01 23:19:21 +00:00
Marcel Moolenaar	798c9e50b0	Implement cpu_thread_setup(). This is mostly the same as on i386, except for the fact that trapframes have a size recorded in it that we set here too. We need this for proper thread setup. Pointed out by: mtm	2003-06-01 08:29:43 +00:00
Marcel Moolenaar	5e7019bf32	Now that we have the signal trampolines in the gateway page and the gateway page is considered kernel space, we can panic when we should only SIGSEGV. Hence, add the additional constraint that for page faults we also require running with kernel privileges. The gateway page is the only kernel code running with user privileges, so this is a correct way to exclude the gateway page from kernel land. We do not currently exclude the gateway page for other faults as it is not always the right way to do it. Further tuning will happen on a case-by-case basis.	2003-05-31 21:21:35 +00:00
Marcel Moolenaar	480a728dee	Implement cpu_set_upcall(). Required by libthr and used by thr_create(2). This implementation is so far only compile tested. But since this is also the last of the functions required to support libthr, we're now functionally complete (for some weird definition of functionally; and complete). Runtime testing can commence.	2003-05-31 21:14:25 +00:00
Marcel Moolenaar	c01da18e22	Implement set_mcontext() and get_mcontext(). Just as for sendsig() and sigreturn(), we cheat and assume the preserved registers are still on-chip and unmodified. This is actually the case, but more by accident than by design. We need to use unwinding eventually or explicitly compile the kernel in a way that the compiler steers clear from using the preserved registers completely.	2003-05-31 21:07:08 +00:00
Marcel Moolenaar	8ef6f226da	Make the regset pointers const pointers for the context restore functions. This works better with set_mcontext() and is more precise in general.	2003-05-31 21:02:18 +00:00
Marcel Moolenaar	3893a77138	Some ia32 related finetuning for the EPC syscall path: o The SDM states that flushing the RSE in the cycle prior to the call to ia32 code yields the best performance. We don't really care too much about performance here, but we do the same anyway. I'm being paranoid and conservative here. o Only initialize the ia32 state registers, not the registers used as scratch by the ia32 engine. This saves a couple of loads from the trapframe, but also helps debugging: we don't clobber useful debugging data (engineering hints :-) o Make sure all general registers constituting ia32 state have been initialized. If there's nothing useful to be loaded from the trapframe, clear the register. This avoids accidentally leaking NaT bits. o Make sure we set ar.k6 prior to clobbering ar.bspstore and also set ar.k7 prior to setting sp. This fixes a race seen for ia64 native code as well (and previously fixed too).	2003-05-31 20:57:26 +00:00
Marcel Moolenaar	62931aa266	Make sure we have all the dirty registers in user frames on the backing store before we discard them. It is possible that we enter the kernel (due to an execve in this case) with a lot of dirty user registers and that the RSE has only partially spilled them (to make room for new frames). We cannot move the backing store pointer down (to discard user registers) when not all of the user registers are on the backing store. So, we flush the register stack IFF this happens. Unconditionally doing the flush is too costly, because the condition in which we need to flush is very rare. This change appears to fix the SIGSEGV that sometimes happens for newly executed processes and so far also appears to fix the last of the corruption. It is possible, although not likely, that this change prevents some other bug from happening, even though it is itself not a fix. Hence the uncertainty. We'll know in a couple of months I guess :-)	2003-05-31 20:42:35 +00:00
Hiten Pandya	b77c32a07e	Rename BUS_DMAMEM_NOSYNC to BUS_DMA_COHERENT. The current name is confusing, because it indicates to the client that a bus_dmamap_sync() operation is not necessary when the flag is specified, which is wrong. The main purpose of this flag is to hint the underlying architecture that DMA memory should be mapped in a coherent way, but the architecture can ignore it. But if the architecture does support coherent mapping of memory, then it makes bus_dmamap_sync() calls cheap. This flag is the same as the one in NetBSD's Bus DMA. Reviewed by: gibbs, scottl, des (implicitly) Approved by: re@ (jhb)	2003-05-30 20:40:33 +00:00
Marcel Moolenaar	3a8c4f9f9c	Move the sysctls of the misalignment handler to where they belong and use OID_AUTO instead of fixed IDs. Approved by: re@ (blanket)	2003-05-29 06:30:36 +00:00
Marcel Moolenaar	12cd60b726	Fix what I think is a cut-n-paste bug: use OID_AUTO for the print_usertrap sysctl instead of CPU_UNALIGNED_PRINT. The latter is used already. Approved by: re@ (blanket)	2003-05-29 05:09:15 +00:00
Marcel Moolenaar	81d77e2eed	A flushrs must be the first in an instruction group. Approved by: re@ (blanket)	2003-05-27 07:10:58 +00:00

1088 commits