audit_bsm_fcntl.c contain utility routines to map local fcntl
commands into BSM constants. Adaptation to the FreeBSD kernel
environment will follow in a future commit.
Sponsored by: Apple, Inc.
Obtained from: TrustedBSD Project
MFC after: 2 weeks
use to identify if the socket is the same one that a cached request
came in on. It is set by nfsrvd_addsock() to a unique value generated
by incrementing an unsigned 64bit static variable for each assignment
and then the value of xp_sockref is tested to see if it is equal to
the value that was saved with the cached reply.
Submitted by: rmacklem
Reviewed by: dfr
Approved by: kib (mentor)
network load.
1. Leave the RX interrupt routine if there is no mbuf available.
2. Properly initialize and track tx_desc_used_count counter so as not to
leak mbuf while traversing used descriptors.
Obtained from: Semihalf
VFS_VGET() has returned in ufs_lookup(). If the '..' lookup started
immediately before the parent directory was removed, we might return
either cleared or unrelated inode otherwise.
Ufs_lookup() is split into new function ufs_lookup_() that either does
lookup, or verifies that directory entry exists and references supplied
inode number.
Reviewed by: tegge
Tested by: pho,
Andreas Tobler <andreast-list fgznet ch> (previous version)
MFC after: 1 month
naming of the partitions (GEOM_PART_EBR_COMPAT). When
compatibility is enabled, changes to the partitioning are
disallowed.
Remove the device name aliasing added previously to provide
backward compatibility, but which in practice doesn't give
us anything.
Enable compatibility on amd64 and i386.
In the good old days it was possible to have dev_t's that referred to
nonexistent devices. In these cases devtoname() automatically generated
names. This is no longer possible, so remove this dead code.
Discussed with: kib
Remove the `udev' variable, which has a different type than the original
function argument and si_drv0. The `udev' name is also misleading,
because it is not the number returned by dev2udev(). Rename this
argument to `unit'. It is the same number as returned by dev2unit().
We should still access si_drv0 using dev2unit(). Also change the
KASSERT() to really print the udev instead of the unit number. I suspect
it's still useful to print the unit number, especially for devices that
use clone lists, so keep the unit number in the panic string.
- Do not iterate int 15h, function e820h twice. Instead, we use STAILQ to
store each return buffer and copy all at once.
- Export optional extended attributes defined in ACPI 3.0 as separate
metadata. Currently, there are only two bits defined in the specification.
For example, if the descriptor has extended attributes and it is not
enabled, it has to be ignored by OS. We may implement it in the kernel
later if it is necessary and proven correct in reality.
- Check return buffer size strictly as suggested in ACPI 3.0.
Reviewed by: jhb
- add show as alias for get
- add weights to allow mpath to do more than equal cost
- add sticky / nostick to disable / re-enable per-connection load balancing
This adds a field to rt_metrics_lite so network bits of world will need to be re-built.
Reviewed by: jeli & qingli
- Remove redundant softc members for RIDs.
- Change some softc members to be unsigned where more appropriate.
- Add some missing const.
- Remove support for mmap(2)'ing VGA I/O as it was broken [1] and
not required by X.Org anyway.
- Fix some confusion between bus, physical and virtual addresses
which mostly consisted in using members of struct video_adapter
inappropriately but wasn't fatal except for the regular framebuffer
mmap(2)'ing.
- Remove redundant bzero(9)'ing of the softc.
- Don't map the framebuffer twice in case the firmware has already
mapped it as besides wasting resources this isn't possible with
all MMUs. This is a bit tricky as a) just because the firmware
provides a property with a virtual address doesn't mean it's
actually mapped (but typically is when the framebuffer is the
console) and b) the firmware doesn't necessarily map the it with
the same byteorder as we do. This make machfb(4) work on machines
with cheetah-class MMUs (including X.Org).
Reported by: Michael Plass [1]
MFC after: 3 days
addresses to BARs into new pci_read_bar() and pci_write_bar() routines.
pci_add_map(), pci_alloc_map(), and pci_delete_resource() now use these
routines to work with BARs.
- Just pass the device_t for the new PCI device to various routines instead
of passing the device, bus, slot, and function.
Reviewed by: imp
open partition. This fixes access to partitions whose starting offset
is >= 2 TB.
Submitted by: "James R. Van Artsdalen" james jrv.org
MFC after: 3 days
we recognize its a retransmit, ahead of the PR-SCTP
work. Without this fix, we end up NOT reducing flight
size and causing an miscalculation when PR-SCTP is active
and data is skipped.
Obtained from: Michael Tuexen.
identify a station operating in turbo-boost mode because it has a
pure ofdm rate set; add an explicit check for the channel type
instead of depending on IEEE80211_NODE_ERP being set
Remote host can advertise smaller MSS than that of sender so upper
stack might have adjusted the MSS which in turn generates IP
packets that are less size than that of interface MTU.
Reported by: Bjoern Koenig ( bkoenig <> alpha-tierchen dot de )
Tested by: Bjoern Koenig ( bkoenig <> alpha-tierchen dot de )
MFC after: 3 days
in sys/nfs/nfs_nfssvc.c by registering with it using the
nfsd_call_nfsserver function pointer. Also, add the build glue for
nfs_nfssvc.c optionally based on "nfsserver" and also as a loadable
module.
Submitted by: rmacklem
Reviewed by: kib
Approved by: kib (mentor)
and CARPSTATS_INC(), rather than directly manipulating the fields of
the structure. This will make it easier to change the implementation
of these statistics, such as using per-CPU versions of the data
structure.
MFC after: 3 days
and PIMSTAT_INC(), rather than directly manipulating the fields of
the structure. This will make it easier to change the
implementation of these statistics, such as using per-CPU versions
of the data structure.
MFC after: 3 days
and MRTSTAT_INC(), rather than directly manipulating the fields of
the structure. This will make it easier to change the
implementation of these statistics, such as using per-CPU versions
of the data structure.
MFC after: 3 days
IGMPSTAT_ADD() and IGMPSTAT_INC(), rather than directly
manipulating the fields of the structure. This will make it
easier to change the implementation of these statistics,
such as using per-CPU versions of the data structures.
MFC after: 3 days
macros: ICMPSTAT_ADD(), ICMPSTAT_INC(), ICMP6STAT_ADD(), and
ICMP6STAT_INC(), rather than directly manipulating the fields
of these structures across the kernel. This will make it
easier to change the implementation of these statistics,
such as using per-CPU versions of the data structures.
In on case, icmp6stat members are manipulated indirectly, by
icmp6_errcount(), and this will require further work to fix
for per-CPU stats.
MFC after: 3 days
Update stats in struct udpstat using two new macros, UDPSTAT_ADD()
and UDPSTAT_INC(), rather than directly manipulating the fields
across the kernel. This will make it easier to change the
implementation of these statistics, such as using per-CPU versions
of the data structures.
MFC after: 3 days
and UDPSTAT_INC(), rather than directly manipulating the fields
across the kernel. This will make it easier to change the
implementation of these statistics, such as using per-CPU versions
of the data structures.
MFC after: 3 days
CPUs known to use 128 byte cache lines and defaulting to 32, use the dcbz
instruction to measure it. Also make dcbz behave the way you would
expect on PPC 970.
IPSTAT_INC(), IPSTAT_SUB(), and IPSTAT_DEC(), rather than directly
manipulating the fields across the kernel. This will make it easier
to change the implementation of these statistics, such as using
per-CPU versions of the data structures.
MFC after: 3 days
TCPSTAT_INC(), rather than directly manipulating the fields across the
kernel. This will make it easier to change the implementation of
these statistics, such as using per-CPU versions of the data structures.
MFC after: 3 days
but do not actually invoke KDB. This includes recoverable machine checks
encountered in kernel mode.
This patch causes machines with Grackle host-PCI bridges to be able to
correctly enumerate them again.
MFC after: 3 days
not populated in parent directory if negative entry was being
created, yet entry itself was added to the nc_neg list. It was
possible for parent vnode to get discarded later, leaving negative
entry pointing to now unused memory block.
Reported by: dho
Revewed by: kib
-UdpAliasIn(): correctly check return code after modules ran.
-alias_nbt: in case of malformed packets (or some other unrecoverable
error), toss the packet.
Simplify in/out functions.
Remove a hack to generate more efficient code for port numbers below
0x100, which has been obsolete for at least ten years, because GCC has
an asm constraint to specify that.
Remove a hack to generate more efficient code for port numbers below
0x100, which has been obsolete for at least ten years, because GCC has
an asm constraint to specify that.
Submitted by: Christoph Mallon <christoph mallon gmx de>
a reservation, unless all of the reservation's pages were free, the
reservation was moved to the head of the partially-populated reservations
queue, where it would be the next reservation to be broken in case the
free page queues were emptied. Now, instead, I am moving it to the tail.
Very likely this reservation is in the process of being freed in its
entirety, so placing it at the tail of the queue makes it more likely that
the underlying physical memory will be returned to the free page queues as
one contiguous chunk. If a reservation must be broken, it will, instead,
be the longest unchanged reservation, which is arguably the reservation
that is least likely to ever achieve promotion or be freed in its entirety.
MFC after: 6 weeks
dependency tracking and ordering enforcement.
With this change, per-vnet initialization functions introduced with
r190787 are no longer directly called from traditional initialization
functions (which cc in most cases inlined to pre-r190787 code), but are
instead registered via the vnet framework first, and are invoked only
after all prerequisite modules have been initialized. In the long run,
this framework should allow us to both initialize and dismantle
multiple vnet instances in a correct order.
The problem this change aims to solve is how to replay the
initialization sequence of various network stack components, which
have been traditionally triggered via different mechanisms (SYSINIT,
protosw). Note that this initialization sequence was and still can be
subtly different depending on whether certain pieces of code have been
statically compiled into the kernel, loaded as modules by boot
loader, or kldloaded at run time.
The approach is simple - we record the initialization sequence
established by the traditional mechanisms whenever vnet_mod_register()
is called for a particular vnet module. The vnet_mod_register_multi()
variant allows a single initializer function to be registered multiple
times but with different arguments - currently this is only used in
kern/uipc_domain.c by net_add_domain() with different struct domain *
as arguments, which allows for protosw-registered initialization
routines to be invoked in a correct order by the new vnet
initialization framework.
For the purpose of identifying vnet modules, each vnet module has to
have a unique ID, which is statically assigned in sys/vimage.h.
Dynamic assignment of vnet module IDs is not supported yet.
A vnet module may specify a single prerequisite module at registration
time by filling in the vmi_dependson field of its vnet_modinfo struct
with the ID of the module it depends on. Unless specified otherwise,
all vnet modules depend on VNET_MOD_NET (container for ifnet list head,
rt_tables etc.), which thus has to and will always be initialized
first. The framework will panic if it detects any unresolved
dependencies before completing system initialization. Detection of
unresolved dependencies for vnet modules registered after boot
(kldloaded modules) is not provided.
Note that the fact that each module can specify only a single
prerequisite may become problematic in the long run. In particular,
INET6 depends on INET being already instantiated, due to TCP / UDP
structures residing in INET container. IPSEC also depends on INET,
which will in turn additionally complicate making INET6-only kernel
configs a reality.
The entire registration framework can be compiled out by turning on the
VIMAGE_GLOBALS kernel config option.
Reviewed by: bz
Approved by: julian (mentor)
1) Flag it and only access that command on the 3c1
2) The TX PLL appears to power down when not in use, so we have to power
it back up when we've been idle. Do this at the start of ifstart.
Otherwise we fall off the net.
have other media to test against, I've left that media reporting
unchanged.
o Enable the TX_PLL when we enable TX. This is harmless on most
cards, but required to get the 3c1 CF card working. Power savings
could be had by managing this better, but for now it gets my card
working.
when using the "self" keyword in tables or as ()-style host address and
fixes "ifconfig -g all" output.
PR: kern/130977, kern/131310
Submitted by: Mikolaj Golub
MFC after: 3 days
the removal of NQNFS, but was left in in case it was required for NFSv4.
Since our new NFSv4 client and server can't use it for their
requirements, GC the old mechanism, as well as other unused lease-
related code and interfaces.
Due to its impact on kernel programming and binary interfaces, this
change should not be MFC'd.
Proposed by: jeff
Reviewed by: jeff
Discussed with: rmacklem, zach loafman @ isilon
Check the condition and return ENOENT then.
In nfs_lookup(), respect ENOENT return from cache_lookup() when it is caused
by dvp reclaim.
Reported and tested by: pho
the mappings without any of read and execution rights, in particular,
the PROT_NONE entries. This makes mlockall(2) work for the process
address space that has such mappings.
Since protection mode of the entry may change between setting
MAP_ENTRY_IN_TRANSITION and final pass over the region that records
the wire status of the entries, allocate new map entry flag
MAP_ENTRY_WIRE_SKIPPED to mark the skipped PROT_NONE entries.
Reported and tested by: Hans Ottevanger <fbsdhackers beasties demon nl>
Reviewed by: alc
MFC after: 3 weeks
types of MAC overheads such as preambles, link level retransmissions
and more.
Note- this commit changes the userland/kernel ABI for pipes
(but not for ordinary firewall rules) so you need to rebuild
kernel and /sbin/ipfw to use dummynet features.
Please check the manpage for details on the new feature.
The MFC would be trivial but it breaks the ABI, so it will
be postponed until after 7.2 is released.
Interested users are welcome to apply the patch manually
to their RELENG_7 tree.
Work supported by the European Commission, Projects Onelab and
Onelab2 (contract 224263).
We typically wire translation to devices with TLB1 entries and
pmap_kextract() does not know about those and returns 0. This
causes false positives (read: all serial ports suddenly become
the console).
Do not use Giant for kbdmux(4) locking. This is wrong and apparently
causing more problems than it solves. This will re-open the issue
where interrupt handlers may race with kbdmux(4) in polling mode.
Typical symptoms include (but not limited to) duplicated and/or
missing characters when low level console functions (such as gets)
are used while interrupts are enabled (for example geli password
prompt, mountroot prompt etc.)
MFC after: 3 days
Most compilers nowadays (including GCC) are smart enough to know what's
going on and generate more efficient code anyway.
Submitted by: Christoph Mallon <christoph.mallon@gmx.de>
It turns out my handling of SIGTTOU and SIGTTIN didn't entirely comply
to the standards. It is true that in the SIGTTOU case we should not
return EIO when the signal is ignored/blocked, but in the SIGTTIN case
we must.
See also: POSIX issue 7 section 11.1.4
(framing, parity, etc), but does not indicate characters
being received. Since no chracters have been received,
ignore the line errors.
PR: 131006
MFC after: 3 days
the size and cost of name cache entries, but make adding debugging
and tracing easier.
Add SDT DTrace probes for various namecache events:
vfs:namecache:enter:done - new entry in the name cache, passed parent
directory vnode pointer, name added to the cache, and child vnode
pointer.
vfs:namecache:enter_negative:done - new negative entry in the name cache,
passed parent vnode pointer, name added to the cache.
vfs:namecache:fullpath:enter - call to vn_fullpath1() is made, passed
the vnode to resolve to a name.
vfs:namecache:fullpath:hit - vn_fullpath1() successfully resolved a
search for the parent of an object using the namecache, passed the
discovered parent directory vnode pointer, name, and child vnode
pointer.
vfs:namecache:fullpath:miss - vn_fullpath1() failed to resolve a search
for the parent of an object using the namecache, passed the child
vnode pointer.
vfs:namecache:fullpath:return - vn_fullpath1() has completed, passed the
error number, and if that is zero, the vnode to resolve, and the
returned path.
vfs:namecache:lookup:hit - postive name cache entry hit, passed the
parent directory vnode pointer, name, and child vnode pointer.
vfs:namecache:lookup:hit_negative - negative name cache entry hit,
passed the parent directory vnode pointer and name.
vfs:namecache:lookup:miss - name cache miss, passed the parent directory
pointer and the full remaining component name (not terminated after the
cache miss component).
vfs:namecache:purge:done - name cache purge for a vnode, passed the vnode
pointer to purge.
vfs:namecache:purge_negative:done - name cache purge of negative entries
for children of a vnode, passed the vnode pointer to purge.
vfs:namecache:purgevfs - name cache purge for a mountpoint, passed the
mount pointer. Separate probes will also be invoked for each cache
entry zapped.
vfs:namecache:zap:done - name cache entry zapped, passed the parent
directory vnode pointer, name, and child vnode pointer.
vfs:namecache:zap_negative:done - negative name cache entry zapped,
passed the parent directory vnode pointer and name.
For any probes involving an extant name cache entry (enter, hit, zapp),
we use the nul-terminated string for the name component. For misses,
the remainder of the path, including later components, is provided as
an argument instead since there is no handy nul-terminated version of
the string around. This is arguably a bug.
MFC after: 1 month
Sponsored by: Google, Inc.
Reviewed by: jhb, kan, kib (earlier version)
Because the "c" input constaint is used, the compiler will already place
the MSR_FSBASE/MSR_GSBASE constants in ecx. Using __asm("ecx") makes
LLVM crash. Even though this is also an LLVM bug, we'd better remove the
unnecessary GCCism as well.
Submitted by: Christoph Mallon <christoph.mallon@gmx.de>
sharing of the nfssvc() system call between nfsserver and the nfsv4
server. Building of nfs_nfssvc.c will be committed later, at the time
the .c files in sys/nfsserver are updated. To do so now would result in
nfssvc() multiply defined.
Submitted by: rmacklem
Reviewed by: dfr
Approved by: kib (mentor)
- First three fields of system UUID may be little-endian as described in
SMBIOS Specification v2.6. For now, we keep the network byte order for
backward compatibility (and consistency with popular dmidecode tool)
if SMBIOS table revision is less than 2.6. However, little-endian format
can be forced by defining BOOT_LITTLE_ENDIAN_UUID from make.conf(5) if it
is necessary.
- Replace overly ambitious optimizations with more readable code.
- Update comments to SMBIOS Specification v2.6 and clean up style(9) bugs.
instead of just register one for the first adapter. Without doing this
there would be some data loss upon shutdown because data could be ignored
when flushing to disk.
MFC after: 3 days
- override_kernel_driver() has been removed since this is an
in-tree version of driver.
- __DATE__ and __TIME__ removed from version string to make
binary update builders happy.
- Utilize pause(9) for __FreeBSDversion >= 700033 (redo 167086).
- Utilize kproc_suspend_check() for __FreeBSDversion >= 800002.
(redo 172836).
- Don't read past end of pVDevice (redo 143787).
- Make sure that controller and channel are initialized (redo 169823).
- Don't include cam/cam_xpt_periph.h (redo 158177).
MFC After: 3 days
and update comments about original patches doing this and it not
working. It works for both the DL10019 and DL10022 based cards that I
have. It really helps the DL10019 cards, since they were using 8k
instead of the normal 16k that regular NE-2000 cards help.
# Note to self: need to provide a common routine to setup memory
# parameters.
more adequate TCP performance with IPv6.
Changes for IPv4, r166403 and r172795, both ignored the
IPv6 counterpart and left it in the state of art of year 2000.
The same logic in syncache already shares code between v4 and v6 so
things do not need to be adapted there.
Reported by: Steinar Haug (sthaug nethelp.no)
Tested by: Steinar Haug (sthaug nethelp.no)
MFC after: 3 days
DP8390-based cards have no generic way of reporting status of the link
or setting the media type. Some specific versions of these cards do,
however, allow for this, and we already support some of them. Make
the 'ed' experience more uniform by providing "autoselect" as the
meida and status "active" always. This won't affect the chips that
provide more specific details.
from existing functions for initializing global state.
At this stage, the new per-vnet initializer functions are
directly called from the existing global initialization code,
which should in most cases result in compiler inlining those
new functions, hence yielding a near-zero functional change.
Modify the existing initializer functions which are invoked via
protosw, like ip_init() et. al., to allow them to be invoked
multiple times, i.e. per each vnet. Global state, if any,
is initialized only if such functions are called within the
context of vnet0, which will be determined via the
IS_DEFAULT_VNET(curvnet) check (currently always true).
While here, V_irtualize a few remaining global UMA zones
used by net/netinet/netipsec networking code. While it is
not yet clear to me or anybody else whether this is the right
thing to do, at this stage this makes the code more readable,
and makes it easier to track uncollected UMA-zone-backed
objects on vnet removal. In the long run, it's quite possible
that some form of shared use of UMA zone pools among multiple
vnets should be considered.
Bump __FreeBSD_version due to changes in layout of structs
vnet_ipfw, vnet_inet and vnet_net.
Approved by: julian (mentor)
an NFS node including the access and attribute caches. Previously the NFS
client only purged any name cache entries associated with the file.
PR: kern/123755
Submitted by: Jaakko Heinonen jh of saunalahti fi
Reported by: Timo Sirainen tss of iki fi
Reviewed by: rwatson, rmacklem
MFC after: 1 month
client from 30 seconds to 3 seconds. After the recent changes to add
caching of negative name cache lookups, a negative name cache hit will
persist until the client notices the parent directory has changed. The
higher the attribute cache timeout on directories, the longer that can take,
so lower the default timeout for directories to match that of regular files.
Suggested by: bde, mohans
MFC after: 1 month
It makes little sense to use 100 Hz polling in dcons. We cannot live
without polling, because that's just how dcons works. It polls the
buffer filled by the firewire hardware. 25 Hz is probably enough for
most use cases.
Discussed with: rwatson
Tested by: kan
vfs:namei:lookup:entry takes parent directory vnode pointer, path to
look up, and lookup flags.
vfs:namei:lookup:return takes an error value, and if successful, the
returned vnode pointer.
MFC after: 1 month
dcons_drv_init invocations. Testing return value for 0 does not work for
cases where dcons_drv_init was called already as part of low level
console initialization.
Refactor how we interface with the root HUB. This is achieved by making a
direct call from usb2_do_request to the host controller for root hub requests,
this call will perform the controller specific register read/writes and return
the error code.
This cuts out a lot of code in the host controller files and saves one thread
per USB bus.
Submitted by: Hans Petter Selasky
Not only did these two drivers depend on IFF_NEEDSGIANT, they were
broken 7 months ago during the MPSAFE TTY import. if_ppp(4) has been
replaced by ppp(8). There is no replacement for if_sl(4).
If we see regressions in for example the ports tree, we should just use
__FreeBSD_version 800045 to check whether if_ppp(4) and if_sl(4) are
present. Version 800045 is used to denote the import of MPSAFE TTY.
Discussed with: rwatson, but also rwatson's IFF_NEEDSGIANT emails on the
lists.
- add support for more complicated HID descriptors which can have multiple
definitions of the same field.
- remove old modulo patch in ums, which I think is due to bad HID parsing,
which should be fixed now.
Reported by: netchild
Submitted by: Hans Petter Selasky
at91_udp.c does not exist anymore, it is now replaced by at91dci in
src/sys/dev/usb/controller. Also remove the ohci_atmelarm.c because it is also
included in src/sys/conf/files
Submitted by: Sylvestre Gallon
Some cancelable flags are always true. Substitute these away. These cancelable
flags were mostly useful with the root HUB which is now handled differently.
Submitted by: Hans Petter Selasky
Refactor how we interface with the root HUB. This cuts around 1200 lines of
code totally and saves one thread per USB bus.
Submitted by: Hans Petter Selasky
- make usb2_power_mask_t 16-bit
- remove "usb2_config_sub" structure from "usb2_config". To compensate for this
"usb2_config" has a new field called "usb_mode" which select for which mode
the current xfer entry is active. Options are: a) Device mode only b) Host
mode only (default-by-zero) c) Both modes. This change was scripted using
the following sed script: "s/\.mh\././g".
- the standard packet size table in "usb_transfer.c" is now a function, hence
the code for the function uses less memory than the table itself.
Submitted by: Hans Petter Selasky
- bugfixes after the memory usage reduction patch
- Use "udev->pipes_max" instead of USB_EP_MAX
- Use correct "bmRequestType" for getting the config descriptor.
Submitted by: Hans Petter Selasky
- memory usage reduction by only allocating the required USB pipes and USB
interfaces.
- cleanup some USB parsing functions to be more flexible.
Submitted by: Hans Petter Selasky
1) Move the new field (brand_note) to the end of the Brandinfo structure.
2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer
is valid.
3) Use the brand_note field if the flag BI_BRAND_NOTE is set and as old
modules won't have the flag set, so the new field brand_note would be
ignored.
Suggested by: jhb
Reviewed by: jhb
Approved by: kib (mentor)
MFC after: 6 days
but I see no benefit from it today.
VM_PROT_READ_IS_EXEC was only intended for use on processors that do not
distinguish between read and execute permission. On an mmap(2) or
mprotect(2), it automatically added execute permission if the caller
specified permissions included read permission. The hope was that this
would reduce the number of vm map entries needed to implement an address
space because there would be fewer neighboring vm map entries that differed
only in the presence or absence of VM_PROT_EXECUTE. (See vm/vm_mmap.c
revision 1.56.)
Today, I don't see any real applications that benefit from
VM_PROT_READ_IS_EXEC. In any case, vm map entries are now organized
as a self-adjusting binary search tree instead of an ordered list. So,
the need for coalescing vm map entries is not as great as it once was.
being switched out may hold a reservation. The stwcx. will
clear the reservation. This is architecturally recommended.
The scenario this addresses is as follows:
1. Thread 1 performs a lwarx and as such holds a reservation.
2. Thread 1 gets switched out (before doing the matching
stwcx.) and thread 2 is switched in.
3. Thread 2 performs a stwcx. to the same reservation granule.
This will succeed because the processor has the reservation
even though thread 2 didn't do the lwarx.
Note that on some processors the address given the stwcx. is
not checked. On these processors the mere condition of having
a reservation would cause the stwcx. to succeed, irrespective
of whether the addresses are the same. The dummy stwcx. is
especially important for those processors.
- Fix style for A3N and for a comment
Submitted by: Akira Funahashi <funa@funa.org>
Tested by: Marcin Nowak <marcin.nowak@simplusnet.pl>,
Diego Sardina <diego.sardina@gmx.com>
PR: kern/128634
in the case where a single mbuf is allocated due to
m_getcl() returning NULL, we already call MH_ALIGN,
so do not increment m->m_data in this case.
Found during MLDv2 port.
The later may need blocks from the underlying device that belongs
to normal files, that should not be locked while snap lock is held.
Reported and tested by: pho
MFC after: 1 month
- PR-SCTP had major issues when skipping through a multi-part message.
o Did not look at socket buffer.
o Did not properly handle the reassmebly queue.
o The MARKED segments could interfere and un-skip a chunk causing
a problem with the proper FWD-TSN.
o No FR of FWD-TSN's was being done.
- NR-Sack code was basically disabled. It needed fixes that
never got into the real code.
- CMT code had issues when the two paths were NOT the same b/w. We
found a few small bugs, but also the critcal one here was not
dividing the rwnd amongst the paths.
Obtained from: Michael Tuexen and myself at the IETF hack-fest ;-)
on a generic dumper that creates an ELF core file and
uses PMAP functions to scan and iterate over memory
chunks, as well as handle memory mappings used during
dumping.
the PMAP layer can choose to return physical memory
chunks or virtual memory chunks. For minidumps, the
chunks should be virtual.
The default MMU I/F implementation for the scan_md()
method returns NULL. Thus, when a PMAP implementation
does not implement the required methods, an empty
core file is created. Here, empty means having an ELF
header only.
Obtained from: Juniper Networks
the ATA status register with a 4-byte read request. This updates it, and
subsequent 1-byte reads will return the correct result.
This commit adds a hack to do this, which is currently ifdef'd powerpc,
although Linux and Darwin do this unconditionally on all platforms.
provided, for example, on the PowerPC 970 (G5), as well as on related CPUs
like the POWER3 and POWER4.
This also adds support for various built-in hardware found on Apple G5
hardware (e.g. the IBM CPC925 northbridge).
Reviewed by: grehan
to be encapsulated before dispatching to the driver
o eliminate M_WDS now that we call ieee80211_encap directly and can supply
the wds vap to indicate a 4-address frame should be created
The DIROUT bit difference between the 19 and 22 is annoying. We can
set both bits on both parts without ill effect. Use this trick to
simplify the code.
The DELAYS in the MII bus bit-bang code for the DL100xx parts aren't
needed. Eliminate them.
packet data. However, the AX88190A moves this on-chip and reduces it
to the more traditional 16k from 16k-32k. The AX88790 follows the
'190A. Probe memory above 32k to see which flavor of the '190 we have
and use the extra memory if we have it.
Eliminate the kludgy read eeprom for the ID code. It really is just a
memory read at location 0x400, so just use that instead. Makes the
code easier to understand as well as eliminates some magic numbers.
ed cards. There's a number of minor nits in a lot of the PHYs on the
PC Cards that use the Axis AX88190 or DLink DL10019 and DL10022 chips.
Forcing the autonegotiation doesn't seem to cause problems on the
cards that have sane PHYs, but makes several cards I have work without
further workarounds.
I'm not 100% sure that kicking the PHY and resetting them is the right
thing to do on the media change callback. Other NICs seem to need
this and do similar things.
only for mic-type inputs. This gives better chances to use it.
Change default configuration for some AD1986A codec based ASUS boards,
use it also for ASUS P5PL2 board. This makes front mic preamplifier working.
Tested by: Vadim Frolov <frolov@frolov.ck.ua>
the arguments translations. Provide ABI-compatible definition of the
struct i386_ldt_args for freebsd32 compat layer.
In collaboration with: pho
Reviewed by: jhb
the kernel on amd64. Fill and read segment registers for mcontext and
signals. Handle traps caused by restoration of the
invalidated selectors.
Implement user-mode creation and manipulation of the process-specific
LDT descriptors for amd64, see sysarch(2).
Implement support for TSS i/o port access permission bitmap for amd64.
Context-switch LDT and TSS. Do not save and restore segment registers on
the context switch, that is handled by kernel enter/leave trampolines
now. Remove segment restore code from the signal trampolines for
freebsd/amd64, freebsd/ia32 and linux/i386 for the same reason.
Implement amd64-specific compat shims for sysarch.
Linuxolator (temporary ?) switched to use gsbase for thread_area pointer.
TODO:
Currently, gdb is not adapted to show segment registers from struct reg.
Also, no machine-depended ptrace command is added to set segment
registers for debugged process.
In collaboration with: pho
Discussed with: peter
Reviewed by: jhb
Linuxolator tested by: dchagin
Reorder amd64 gdt descriptors so that user-accessible selectors are the
same as on i386. At least Wine hard-codes this into the binary.
In collaboration with: pho
Reviewed by: jhb
Provides i386/freebsd API-compatible definitions for the argument
structures of the above sysarch commands. struct i386_ioperm_args
definition is ABI-compatible.
In collaboration with: pho
Reviewed by: jhb
To keep these structures ABI-compatible, half the size of r_trapno,
r_err, mc_trapno, mc_flags.
Add fsbase and gsbase to mcontext on both amd64 and i386.
Add flags to amd64 mcontext to indicate that it contains valid segments
or bases.
In collaboration with: pho
Discussed with: peter
Reviewed by: jhb
to the virtual one. I may had a reason at some point to use the later, but
can't remember which, and it can leads to issues.
Reported by: Guillaume Ballet <gballet gmail com>
as 'real memory' instead of Maxmem if the value is available.
Note amd64 displayed physmem as 'usable memory' since machdep.c r1.640
to unconfuse users. Now it is consistent across amd64 and i386 again.
While I am here, clean up smbios.c a bit and update copyright date.
Reviewed by: jhb
o Don't run through the register initialization in the read mac routine
for the AX88x90. It duplicates other stuff that we do.
o Eliminate the 10ms delay after we reset the AX88x90. We already wait for
the appropriate bits to indicate reset is done.
a device specific DMA tag. On amd64 it could exhaust all of bounce
pages when bus_dma_tag_create(9) is called at malo_pci_attach() then as
result in next turn it returns ENOMEM. This fix a attach fail on amd64.
Pointed by: yongari
Tested by: dchagin
MFC after: 3 days
It seems that RTL8168D and RTL8102EL requires additional settle
time to complete RL_PHYAR register write. Accessing RL_PHYAR
register right after the write causes errors for subsequent PHY
register accesses.
Tested by: george at luckytele dot com,
Steve Wills < STEVE at stevenwills dot com >
don't have one of the clock cycles (the turn cycle) that the AX88x90
chips have. Make this conditional. But this seems totally crazy and
can't possibly be right. Commit the fix for the moment until I can
explore this mystery more deeply.
On the plus side, the DL10022-based cards I have (D-Link DEF-670TXD
and SMC8040TX) work after this fix.
Add ch_suspend/ch_resume methods for PCI controllers and implement them
for AHCI. Refactor AHCI channel initialization according to it.
Fix Port Multipliers operation. It is far from perfect yet, but works now.
Tested with JMicron JMB363 AHCI + SiI 3726 PMP pair.
Previous version was also tested with SiI 4726 PMP.
Hardware sponsored by: Vitsch Electronics / VEHosting.nl
o call ieee80211_encap in ieee80211_start so frames passed down to drivers
are already encapsulated
o remove ieee80211_encap calls in drivers
o fixup wi so it recreates the 802.3 head it requires from the 802.11
header contents
o move fast-frame aggregation from ath to net80211 (conditional on
IEEE80211_SUPPORT_SUPERG):
- aggregation is now done in ieee80211_start; it is enabled when the
packets/sec exceeds ieee80211_ffppsmin (net.wlan.ffppsmin) and frames
are held on a staging queue according to ieee80211_ffagemax
(net.wlan.ffagemax) to wait for a frame to combine with
- drivers must call back to age/flush the staging queue (ath does this
on tx done, at swba, and on rx according to the state of the tx queues
and/or the contents of the staging queue)
- remove fast-frame-related data structures from ath
- add ieee80211_ff_node_init and ieee80211_ff_node_cleanup to handle
per-node fast-frames state (we reuse 11n tx ampdu state)
o change ieee80211_encap calling convention to include an explicit vap
so frames coming through a WDS vap are recognized w/o setting M_WDS
With these changes any device able to tx/rx 3Kbyte+ frames can use fast-frames.
Reviewed by: thompsa, rpaulo, avatar, imp, sephe
o Introduce new chip_type AX88790. There's a few places we need to know the
exact chip for workaronds.
o Explain the AX88190 workaround for the ISR bits being stuck, and don't
apply them to the AX88790. The datasheet says the bits are fixed, and
experience confirms.
o Fix mii bit-bang read code to read and discard the 'floating' bit.
o Remove empty ed_pccard_ax88x90_mii_reset routine
o Report error from mii_phy_probe
o Don't use ed_probe_Novel_generic for ax88x90 chips. It puts them into
an odd state sometimes. Instead, use a more stream-lined version that
avoids the trouble spots. This was copied and tweaked from the original.
o Move chip reset into its own routine.
o Minor code optimiation on getting MAC address
o Add code for coping with AX88790 cards that are in power down state and
need to be kicked before the PHY registers for the internal phy read right.
o Remove ugly cap of PHYs at 17.
o For AX88790, we need to set a special bit for accessig phy 16 (the internal
phy) and clear it for all others according to a chip erratum.
o streamline the bit-bang code for AX88x90: the delays aren't needed according
to the datasheet timing diagrams and also the Linux driver
o Fix minor bit definition for direction bit.
o Generally: Some comments reformatted
o Only try the toshiba probe on cards labelled as toshiba
# From another Akihabara card (this one from a few years ago from a
# friend in Japan). Fix the Corega FEther II PCC-TXD. This one is
# still on sale new, as of a few weeks ago. should fix all other AX88x90
# based cards, but I have some testing left to finish on my collection...
o PC98 uses 32-bit block numbers. Limit the scheme to 2^32-1
blocks when the media is larger. The 32-bit block numbers
are implicit (16-bit cylinder * 8-bit head * 8-bit sector).
o EBR uses 32-bit block numbers. Limit the scheme to 2^32-1
blocks when the media is larger.
o Calculate the number of entries based on the rounded media
size, rather than the raw media size.
in directory vnodes. Allow namecache dotdot entry to be created pointing
from child vnode to parent vnode if no existing links in opposite
direction exist. Use direct link from parent to child for dotdot lookups
otherwise.
This restores more efficient dotdot caching in NFS filesystems which
was lost when vnodes stoppped being type stable.
Reviewed by: kib
It's quite strange that nobody reported this issue before. It turns out
functions like ttyname(), ptsname() and fdevname() don't work in
compat32. This means it't not even possible to run applications like
script(1) inside a 32-bit FreeBSD jail.
Fix this by converting 32-bit fiodgname_arg structures to their 64-bit
equivalent.
Reported by: kris
Tested by: kris
o remove ic_myaddr from ieee80211com
o change ieee80211_ifattach to take the mac address of the physical device
and use that to setup the lladdr.
o replace all references to ic_myaddr in drivers by IF_LLADDR
o related cleanups (e.g. kill dead code)
PR: kern/133178
Reviewed by: thompsa, rpaulo
provider namespace. These are inserted dynamically into the
VOP_..._AP() functions created from vnode_if.src. Each VOP has
entry and return probes, as arg0 the primary vnode, arg1 the
vnode operation argument structure pointer, providing access to
IN and OUT arguments, and for return probes, arg2 the return
value.
MFC after: 1 month
Sponsored by: Google, Inc.
src/sys and not the entire src/ tree.
An earlier solution by peter had been comitted in r183528 and backed out
in r183566 due to problems with newvers.sh also called from other places
during world build. With the extra test this survived a make universe.
The work have been under testing and fixing since then, and it is mature enough
to be put into HEAD for further testing.
A lot have changed in this time, and here are the most important:
- Gvinum now uses one single workerthread instead of one thread for each
volume and each plex. The reason for this is that the previous scheme was
very complex, and was the cause of many of the bugs discovered in gvinum.
Instead, gvinum now uses one worker thread with an event queue, quite
similar to what used in gmirror.
- The rebuild/grow/initialize/parity check routines no longer runs in
separate threads, but are run as regular I/O requests with special flags.
This made it easier to support mounted growing and parity rebuild.
- Support for growing striped and raid5-plexes, meaning that one can extend the
volumes for these plex types in addition to the concat type. Also works while
the volume is mounted.
- Implementation of many of the missing commands from the old vinum:
attach/detach, start (was partially implemented), stop (was partially
implemented), concat, mirror, stripe, raid5 (shortcuts for creating volumes
with one plex of these organizations).
- The parity check and rebuild no longer goes between userland/kernel, meaning
that the gvinum command will not stay and wait forever for the rebuild to
finish. You can instead watch the status with the list command.
- Many problems with gvinum have been reported since 5.x, and some has been hard
to fix due to the complicated architecture. Hopefully, it should be more
stable and better handle edge cases that previously made gvinum crash.
- Failed drives no longer disappears entirely, but now leave behind a dummy
drive that makes sure the original state is not forgotten in case the system
is rebooted between drive failures/swaps.
- Update manpage to reflect new commands and extend it with some examples.
Sponsored by: Google Summer of Code 2007
Mentored by: le
Tested by: Rick C. Petty <rick-freebsd2008 -at- kiwi-computer.com>
It seems that some revision of controller hang while accessing
the VPD. Because VPD access routine are unused, nuke it.
o Let TWSI reload EEPROM if VPD capability is detected. Reloading
EEPROM will also set ethernet address so age(4) now reads AGE_PAR0
and AGE_PAR1 register to get ethernet address. This removes a lot
of hack and enhance readability a lot.
o Double PHY reset timeout as it takes more time to take PHY out of
power-saving state.
o Explicitly check power-saving state by checking undocumented PHY
registers. If link is not up, poke undocumented registers to take
PHY out of power-saving state. This is the same way what Linux
does. On resume, make sure to wake up PHY.
o Don't rely on auto-clearing feature of master reset bit, just wait
1ms and check idle status of MAC.
o Add PCI device revision information in bootverbose mode.
This should fix occasional controller hang in device attach phase.
Reported by: barbara < barbara.xxx1975 at libero DOT it >
Tested by: barbara < barbara.xxx1975 at libero DOT it >
up) rather than amount + 1 / 2, which is the same as amount, or 2x too
many words which leads to data corruption.
# This fixes the sbdrop panics I was seeing with the Toshiba LANCT00A.
Toshiba PCETC ISA card, and even has the same board type code in the
card ID (0x14). So, for this card, call ed_probe_WD80x3_generic after
setting things up apropriately. This makes the card attach and kinda
work (I'm seeing panics in sbdrop). Since history has shown that the
WD80x3 probe routine is dangerous, only do it for this card. Also,
disable the memory range check to make sure it is an valid ISA memory.
I think that it is bogus, but I'm not 100% sure, for these cards.
I removed probing for the WD80x3 in 2005 when I added support for the
AX88x90 and DL100xx cards since none of my cards had ever matched it
and PAO3 removed it and none of the cards in their database died.
It is possible there are other quirks about this card too, since no
other open source OS supports it, or even claims to support it. But
it was a fun half hour hack...
driver. Not sure who sold it/rebadged it.
Add stub entries for Mitsubishi B8895 and Toshiba LANCT00A to the
driver with a comment that they don't work /* NG */.[*] These are
DP83902A based cards, which should work, but don't seem to. Likely
they are from the days before the ne2000 roamed the earth and use a
non-standard hookup (see if_ed_isa or if_ed_cbus for some examples).
Unless I happen to stumble into the right one, these may never work,
but I'm tired of omitting them from commits.
[*] The Japanese adopted OK from English, but also use NG for its
opposite.
typedef l_long l_off_t;
Change l_mmap_argv's to l_ulong for pgoff. This restores prior behaviour
to consumers of l_off_t but allows mmap to mmap a 32bit position which a
Linux application requires to access SMBIOS data via /dev/mem.
Reviewed by: dchagin
Prompted by: rdivacky
rather than behind a seemingly accidental constant likely left over from one of
the related drivers which uses log levels rather than per-facility debugging
flags. This should get rid of contextless messages on the console for people
who have not set (or cleared the default) debugging flags.
o Don't create an APM scheme underneath another scheme when
the probe doesn't allow it.
o APM uses 32-bit block numbers. Limit the scheme to 2^32-1
blocks when the media is larger.
packet. Linux, OpenBSD and our iwn(4) all do this. It also results in
a huge performance improvement (and the rejection of a fair number of
apparently-bad packets on receive) on my hardware.
o) Like the wpi(4) driver in OpenBSD, and like our iwn(4), also drop runt
packets.
o) Don't bother doing IFQ_POLL and then IFQ_DRV_DEQUEUE, just do
IFQ_DRV_DEQUEUE outright. This is more similar to how OpenBSD and our
iwn(4) work.
Reviewed by: sam
into acpi_cpu_startup() which is where all the other code to update this
global variable lives. This fixes a bug where cpu_cx_count was not updated
correctly if acpi_cpu_generic_cx_probe() returned early.
PR: kern/108581
Debugged by: Bruce Cran
Reviewed by: avg, njl, sepotvin
MFC after: 3 days
via the Linux tool.
- Add Linux shim to ipmi(4)
- Create a partitions file to linprocfs to make Linux fdisk see
disks. This file is dynamic so we can see disks come and go.
- Convert msdosfs to vfat in mtab since Linux uses that for
msdosfs.
- In the Linux mount path convert vfat passed in to msdosfs
so Linux mount works on FreeBSD. Note that tasting works
so that if da0 is a msdos file system
/compat/linux/bin/mount /dev/da0 /mnt
works.
- fix a 64it bug for l_off_t.
Grabing sh, mount, fdisk, df from Linux, creating a symlink of mtab to
/compat/linux/etc/mtab and then some careful unpacking of the Linux bmc
update tool and hacking makes it work on newer Dell boxes. Note, probably
if you can't figure out how to do this, then you probably shouldn't be
doing it :-)
in AMD FPUs:
- Do not clear the affected state in the case that the FPU registers for
the thread that already owns the FPU are changed via fpu_setregs(). The
only local information the thread would see is its own state in that
case.
- Fix a type mismatch for the dummy variable used in a "fld". It accepts
a float, not a double.
Reviewed by: bde
Approved by: so (cperciva)
MFC after: 1 month
When a vt switch occurs the irq handler is uninstalled. Interrupts
and the state tracking of what was enabled/disabled wasn't working
properly. This should resolve the reports of "slow windows" after a
vt switch, among other things. The radeon 2d driver seems to work a
bit more correctly than the Intel driver. With the Intel driver,
vblank interrupts will be enabled at system startup and will only
be disabled after an additional modeset (vt switch, dpms, randr event).
With this patch, I am able to run glxgears synced to vblank and
vt switch while it is running without ill effects.
MFC after: 3 days
- Trace non-error loads into the access cache once, not zero times or
twice.
- Sometimes attr cache loads fail due to a race, in which case they are
aborted leading to an invalidation; in this case, trace only the flush,
not a load.
MFC after: 1 month
Sponsored by: Google, Inc.
the requested PCI bus falls outside of the bus range given in the ACPI
MCFG table. Several BIOSes seem to not include all of the PCI busses in
systems in their MCFG tables. It maybe that the BIOS is simply buggy and
does support all the busses, but it is more conservative to just fall back
to the old method unless it is certain that memory accesses will work.
events are:
nfsclient:accesscache:flush:done
nfsclient:accesscache:get:hit
nfsclient:accesscache:get:miss
nfsclient:accesscache:load:done
They pass the vnode, uid, and requested or loaded access mode (if any);
the load event may also report a load error if the RPC fails.
The attribute cache events are:
nfsclient:attrcache:flush:done
nfsclient:attrcache:get:hit
nfsclient:attrcache:get:miss
nfsclient:attrcache:load:done
They pass the vnode, optionally the vattr if one is present (hit or load),
and in the case of a load event, also a possible RPC error.
MFC after: 1 month
Sponsored by: Google, Inc.
- Call acpi_resync_clock() to reset system time before hardclock is ready
to tick. Note we assume the current timecounter hardware and RTC are
already available for read operation.
Tested by: mav
It was derived from i386 version long ago but never resync'ed again.
Originally, i386 version compared the current time from realtime clock
with time_second (which was just `time' in the old days). When this MI
version was written, it was wrongly compared against `base' AND never
used because of a bug (typo?) in the code. This check was killed
in i386 version when home-rolled calendaric calculation was removed.
Now, we just remove the code here as well to make the code simpler.
prevent individual transactions from crossing a 4GB address boundary. Due
to bus_size_t type limitations, the driver uses a 2GB boundary in PAE
kernels.
Reviewed by: scottl
MFC after: 1 week
quirk requiring it to be enabled even when using MSI. This makes
the latter work again after r189285.
- Remove a comment which no longer applies since r190194.
filtering handle this. Introduce a new function msk_rxfilter that
handles Rx filter configuration and multicast setup as well as
promiscuous mode. This simplifies code a lot.
Promiscuous mode always have preference to any other Rx
filtering so don't disable the mode when ALLMULTI is set.
to the devctl notification queue. Empty strings cause devctl read
call to return 0 and result in devd exiting prematurely.
The actual offender (ugen notes for root hubs) will be fixed
by separate commit.
Limit the size of malloced buffer when dumping environment
variables. [EN-09:01]
Approved by: so (cperciva)
Approved by: re (kensmith)
Security: FreeBSD-SA-09:06.ktimer
Errata: FreeBSD-EN-09:01.kenv
provider. The NFS client exposes 'start' and 'done' probes for NFSv2
and NFSv3 RPCs when using the new RPC implementation, passing in the
vnode, mbuf chain, credential, and NFSv2 or NFSv3 procedure number.
For 'done' probes, the error number is also available.
Probes are named in the following way:
...
nfsclient:nfs2:write:start
nfsclient:nfs2:write:done
...
nfsclient:nfs3:access:start
nfsclient:nfs3:access:done
...
Access to the unmarshalled arguments is not easily available at this
point in the stack, but the passed probe arguments are sufficient to
to a lot of interesting things in practice. Technically, these probes
may cover multiple RPC retransmits, and even transactions if the
transaction ID change as a result of authentication failure or a
jukebox error from the server, but usefully capture the intent of a
single NFS request, such as access, getattr, write, etc.
Typical use might involve profiling RPC latency by system call, number
of RPCs, how often a getattr leads to a call to access, when failed
access control checks occur, etc. More detailed RPC information might
best be provided by adding a krpc provider. It would also be useful
to add NFS client probes for events such as the access cache or
attribute cache satisfying requests without an RPC.
Sponsored by: Google, Inc.
MFC after: 1 month
Badly formed ELF note may cause the caclulated pointer to the next note
to point both after the note region, that was checked in the code, but
also to point before the region, that was not checked [1]. Remember the
first note location in note0 and leap out if the note is not between
note0 and note_end.
In the similar way, badly formed note may cause infinite loop by
pointing next note into the same or previous note. Guard against this by
limiting amount of loop iterations by arbitrary choosen big number.
For clarity, check the calculated note alignment in each iteration.
Reported by: Chris Palmer <chris noncombatant org> [1]
PR: kern/132886
Reviewed and tested by: dchagin
MFC after: 3 days
it is right for only a tiny fraction of these devices and this
wild-card entry is too broad.
# I run a kernel without this entry at all without ill effects...
stored in the pmap is from the direct map region. The two exceptions have
been the kernel pmap and the swapper's pmap. These pmaps have used a
kernel virtual address established by pmap_bootstrap() for their shared
pml4 page table page. However, there is no reason not to use the direct
map for these pmaps as well.
they were passed uninitialized to in6_pcblookup_hash. Instead, do as is done
for IPv4 and use the addresses within the sockaddr structure, which are
correctly populated.
This fixes tcpdrop(8) for IPv6 address pairs.
Reviewed by: bz
definitely doing an NFSv2 or NFSv3 RPC, rather than sometimes doing
so and sometimes not. This makes it easier to add a DTrace return
probe at a single point in the function.
MFC after: 1 week
The tick routine was not being restarted in the init_locked routine
which could resulted in loss of carrier when updating the MTU.
Submitted by: Navdeep Parhar at Chelsio
MFC after: 3 weeks
- If boot verbose, print asicrev, chiprev and bus type on attach.
- For PCI Express devices:
1) Adjust max read request size to 4Kbytes
2) Turn on FIFO_LONG_BURST in RDMA during bge_blockinit()
Though 1) does not seem to have much to do with the poor TX performance
observed on PCI Express bge(4), 2) does fix the problem. [1]
- Nuke the RX CPU self-diag, which prevents working cards from working
(Linux tg3 does not have this diag neither does OpenBSD's bge(4)).
The increasing of the firmware handshaking timeout to 20000 retries
done as part of the original commit isn't merged as way already have a
way higher BGE_TIMEOUT of 100000.
PR: 119361 [1]
Obtained from: tg3 via DragonflyBSD [1], DragonflyBSD
Workaround for buggy USB hardware not handling new SETUP packet before STATUS
stage is complete, this allows xfers to endpoint0 to return a short frame.
Submitted by: Hans Petter Selasky
Reported by: me
Workaround for buggy USB hardware not handling new SETUP packet before STATUS
stage is complete, this allows xfers to endpoint0 to return a short frame.
Submitted by: Hans Petter Selasky
Reported by: me
The number of entries in the cache defaults to 8 but is easily changed in
nfsclient/nfs.h. When the cache is filled, the oldest cache entry is
evicted when space is needed.
I mirrored the changes to the NFSv[23] client in the NFSv4 client to fix
compile breakage. However, the NFSv4 client doesn't actually use the
access cache currently.
Submitted by: rmacklem
- Move tunable defines into usb_core.h and dependancy towards usb_defs.h
- Leave hardcoded defines in "usb_defs.h".
- Allow overriding all tunable defines.
- Add more customisable typedefs.
- Correct maximum device number.
Submitted by: Hans Petter Selasky
It seems I didn't fix this issue before committing teken to the tree. My
initial idea was to somehow add an error mechanism to instruct the video
driver author to increase T_NUMCOL when using very big terminals. It
turns out we have platforms where we have gigantic consoles on systems
like the Apple PowerMac G5, which means we crash there right now.
Just ignore tabstops placed beyond column 160. Just force tabs to be
placed on each 8 columns.
Reported by: nwhitehorn
handle the ioctl. There are other paths that already call it, but this
allows for a non-interface socket (like AF_LOCAL which ifconfig now
uses) to use a broader class of interface ioctls.
Approved by: bz (mentor), rwatson
supported with these pseudo-PHYs. The MIIF_NOLOOP flag currently triggers
nothing but hopefully will be respected by mii_phy_setmedia() later on.
- Don't add IFM_NONE as isolation isn't supported by these pseudo-PHYs.
- Use mii_phy_add_media() instead of mii_add_media() so the latter can
be eventually retired.
supported burst sizes.
- Add support for 64-bit burst sizes (required for SBus GEM).
- Failing to register as interrupt controller during attach shouldn't
be fatal so just inform about this instead of panicing.
- Take advantage of KOBJMETHOD_END.
- Remove some redundant variables.
- Add missing const.
- Failing to register as interrupt controller during attach shouldn't
be fatal so just inform about this instead of panicing.
- Disable rerun of the streaming cache as workaround for a silicon bug
of certain Psycho versions.
- Remove the comment regarding lack of newbus'ified bus_dma(9) as being
able to associate a DMA tag with a device would allow to implement
CDMA flushing/syncing in bus_dmamap_sync(9) but that would totally
kill performance. Given that for devices not behind a PCI-PCI bridge
the host-to-PCI bridges also only do CDMA flushing/syncing based on
interrupts there's no additional disadvantage for polling(4) callbacks
in the case schizo(4) has to do the CDMA flushing/syncing but rather a
general problem.
- Don't panic if the power failure, power management or over-temperature
interrupts doesn't exist as these aren't mandatory and not available
with all controllers (not even Psychos). [1]
- Take advantage of KOBJMETHOD_END.
- Remove some redundant variables.
- Add missing const.
PR: 131371 [1]
- Hook up the streaming buffer (not used by iommu(4) by default, yet)
if available and usable. [1]
- Move the message regarding belated registration as interrupt control
under bootverbose as this isn't something the user should worry about.
Tested by: Michael Moll [1]
driver in Linux 2.6. uscanner was just a simple wrapper around a fifo and
contained no logic, the default interface is now libusb (supported by sane).
Reviewed by: HPS
"raw" names. While there, change the formatting of extended MSDOS partitions
so that the dot (".") is not used to separate two numbers (which kind of
looks like the whole is a decimal number). Use "+" instead, which also
hints that the second part of the name is the offset from the start of
the partition in the first part of the name. Also change the offset from
decimal to hexadecimal notation, simply for aesthetic reasons and future
compatibility.
GEOM_PART is the default in 8-CURRENT but not yet in 7-STABLE so this
changeset can be MFC-ed without causing major problems from the second
part.
Reviewed by: marcel
Approved by: gnn (mentor)
MFC after: 2 weeks
after vt switch or suspend. I can't really test this on Intel right now
but I think I've heard reports of it on radeon as well. I can't break
it on the radeon here.
MFC after: 3 days
This is purely a forwarding plane cleanup; no control plane
code is involved.
Summary:
* Split IPv4 and IPv6 MROUTING support. The static compile-time
kernel option remains the same, however, the modules may now
be built for IPv4 and IPv6 separately as ip_mroute_mod and
ip6_mroute_mod.
* Clean up the IPv4 multicast forwarding code to use BSD queue
and hash table constructs. Don't build our own timer abstractions
when ratecheck() and timevalclear() etc will do.
* Expose the multicast forwarding cache (MFC) and virtual interface
table (VIF) as sysctls, to reduce netstat's dependence on libkvm
for this information for running kernels.
* bandwidth meters however still require libkvm.
* Make the MFC hash table size a boot/load-time tunable ULONG,
net.inet.ip.mfchashsize (defaults to 256).
* Remove unused members from struct vif and struct mfc.
* Kill RSVP support, as no current RSVP implementation uses it.
These stubs could be moved to raw_ip.c.
* Don't share locks or initialization between IPv4 and IPv6.
* Don't use a static struct route_in6 in ip6_mroute.c.
The v6 code is still using a cached struct route_in6, this is
moved to mif6 for the time being.
* More cleanup remains to be merged from ip_mroute.c to ip6_mroute.c.
v4 path tested using ports/net/mcast-tools.
v6 changes are mostly mechanical locking and *have not* been tested.
As these changes partially break some kernel ABIs, they will not
be MFCed. There is a lot more work to be done here.
Reviewed by: Pavlin Radoslavov
o break out version-related code to simplify rev'ing the protocol
o add parameter validation macros so checks that appear multiple places
are consistent (and easy to change)
o add protocol version check when looking for a scan candidate
o improve scan debug output format
o rewrite beacon update handling to calculate a bitmask of changed values
and pass that down through the driver callback so drivers can optimize work
o do slot bounds check before use when parsing received beacons
directory for a znode. When the directory already exists, it returns a
referenced but unlocked vnode. When a directory does not yet exist, it
calls zfs_make_xattrdir() to create a new one. zfs_make_xattrdir() returns
the vnode both referenced and and locked and zfs_get_xattrdir() was leaking
this vnode lock to its callers. Fix this by dropping the vnode lock if
zfs_make_xattrdir() successfully creates a new extended attribute
directory.
Reviewed by: pjd
or URB_FUNCTION_CLASS_xxx with HAL preemption lock that means it's
non-sleepable during USB requests though usb2_do_request() requires a
sleep so it needs to send queries to the default pipe without those
interfaces to avoid sleep.
whatever the IRP flag is because some drivers (eg. RTL8187L NDIS driver)
call IoCompleteRequest() without setting flags. It will prevent waiting
a event forever at attach.
B_DELWRI cleanup and vnode disassociation should happen just before to
assign the buffer to a queue.
Reported by: miwi, Volker <volker at vwsoft dot com>,
Ben Kaduk <minimarmot at gmail dot com>,
Christopher Mallon <christoph dot mallon at gmx dot de>
Tested by: lulf, miwi
any IPv4 multicast operations which reference it.
There is a potential race because ifma_protospec is set to NULL
when we discover the underlying ifnet has gone away. This write
is not covered by the IF_ADDR_LOCK, and it's difficult to widen
its scope without making it a recursive lock. It isn't clear why
this manifests more quickly with 802.11 interfaces, but does not
seem to manifest at all with wired interfaces.
With this change, the 802.11 related panics reported by sam@
and cokane@ should go away. It is not the right fix, that requires
more thought before 8.0.
Idea from: sam
Tested by: cokane
Obtained from: Hideotshi Shimokawa
This update is based on comments from Hidetoshi.
Changeset 183550 removed the call to crom_load() in fw_busreset(). Restore
that call such that the Configuration ROM is valid.
Stash and update fwdev settings in fw_explore_node() so that negotiation
works again.
to the full path of the image that is being executed.
Increase AT_COUNT.
Remove no longer true comment about types used in Linux ELF binaries,
listed types contain FreeBSD-specific entries.
Reviewed by: kan
This fixes osrel fetching from the FreeBSD branding note for the 64bit
platforms.
Reported by: swell.k gmail com
Reviewed by: dchagin
Tested by: dchagin, swell.k gmail com
no-op's that I inadvertently added. Even if locking is needed in general
for the ioctl's, setting a single long will not need it due to the operation
being atomic.
Reported by: rwatson
Fix regression issue in the USB file system interface.
- Use cdev_privdata pointer as indicator of correct file handle.
- Remove redundant FIFO opened flags.
Don't send ZLP at close for ulpt and uscanner devices as this causes some
models to stop working. This reverts back to the USB1 behaviour.
Submitted by: Hans Petter Selasky
This code is heavily inspired by Takanori Watanabe's experimental SMP patch
for i386 and large portion was shamelessly cut and pasted from Peter Wemm's
AP boot code.
1. fix nointr check in intsmb_start, matters only if ENABLE_ALART is
defined (by default, it is not);
2. drop unnecessary inspection/reporting of power-management io registers
base address;
3. in verbose mode report errors from SMBus host controller and their
mapping to smbus(4) errors;
Approved by: jhb (mentor)
implement CD input in hardware, while unconditional showing it confuse users.
Also it was made in the way that sometimes improper with present driver.
Add patch for ALC268 based Acer TM5320 to make headphones jack sensing work.
Default configuration defines two separate playback associations, which
current driver unable to trace properly due to order they are defined and
limited codec uniformity.
Submitted by: G. Mirov <g.mirov AT gmail.com>
the "nbufkv" sleep.
First, ffs background cg group block write requests a new buffer for
the shadow copy. When ffs_bufwrite() is called from the bufdaemon due
to buffers shortage, requesting the buffer deadlock bufdaemon.
Introduce a new flag for getnewbuf(), GB_NOWAIT_BD, to request getblk
to not block while allocating the buffer, and return failure
instead. Add a flag argument to the geteblk to allow to pass the flags
to getblk(). Do not repeat the getnewbuf() call from geteblk if buffer
allocation failed and either GB_NOWAIT_BD is specified, or geteblk()
is called from bufdaemon (or its helper, see below). In
ffs_bufwrite(), fall back to synchronous cg block write if shadow
block allocation failed.
Since r107847, buffer write assumes that vnode owning the buffer is
locked. The second problem is that buffer cache may accumulate many
buffers belonging to limited number of vnodes. With such workload,
quite often threads that own the mentioned vnodes locks are trying to
read another block from the vnodes, and, due to buffer cache
exhaustion, are asking bufdaemon for help. Bufdaemon is unable to make
any substantial progress because the vnodes are locked.
Allow the threads owning vnode locks to help the bufdaemon by doing
the flush pass over the buffer cache before getnewbuf() is going to
uninterruptible sleep. Move the flushing code from buf_daemon() to new
helper function buf_do_flush(), that is called from getnewbuf(). The
number of buffers flushed by single call to buf_do_flush() from
getnewbuf() is limited by new sysctl vfs.flushbufqtarget. Prevent
recursive calls to buf_do_flush() by marking the bufdaemon and threads
that temporarily help bufdaemon by TDP_BUFNEED flag.
In collaboration with: pho
Reviewed by: tegge (previous version)
Tested by: glebius, yandex ...
MFC after: 3 weeks
LO_CSUM_FEATURES - a bitmask of supported transmit offload features, which
will be stored in if_hwassist if IFCAP_TXCSUM is enabled, and be cleared
from mbuf packet header csum flags on transmit. (1)
LO_CSUM_SET - a bitmask of supported receive offload features, which will
be set on the mbuf packet header csum flags on transmit if IFCAP_RXCSUM
is enabled.
While here, fix SCTP offload for loopback: offer generation on the
transmit side, don't just skip validation on the receive side.
Obtained from: DragonflyBSD (1)
MFC after: 1 week
have its MTU set higher than 1500 (ETHERMTU). Its new limit is now
65535 as enforced by ifhwioctl() in if.c
This allows a tap(4) device to be added to a bridge, which requires all
interface members to have the same MTU, with an interface configured for
jumbo frames. QEMU may now connect to a network via tap(4) without
requiring the real interface to have its MTU set to 1500 or lower.
Reviewed by: rpaulo, bms
MFC after: 1 week
avoidance:
- Enable setting the RXCSUM and TXCSUM flags for loopback interfaces;
set both by default.
- When RXCSUM is set, flag packets sent over the loopback interface as
having checked and valid IP, UDP, TCP checksums so that higher
protocol layers won't check them.
- Always clear CSUM_{IP,UDP_TCP} checksum required flags on transmit,
as they will have gotten there as a result of TXCSUM being set.
This is done only for packets explicitly sent over the loopback, not
simulated loopback via if_simloop() due to !SIMPLEX interfaces, etc.
Note that enabling TXCSUM but not RXCSUM will lead to unhappiness, as
checksums won't be generated but will be validated.
Kris reports that this leads to significant performance improvements
in loopback benchmarking with TCP and UDP for throughput:
RXCSUM RXCSUM+TXCSUM
TCP 15% 37%
UDP 10% 74%
Update man page.
Reviewed by: sam
Tested by: kris
MFC after: 1 week
in FreeBSD 5.x to allow network device drivers to run with Giant
despite the network stack being Giant-free. This significantly
simplifies calls into ioctl() on network interfaces, especially
in the multicast code, as well as eliminates deferred invocation
of interface if_start routines.
Disable the build on device drivers still depending on
IFF_NEEDSGIANT as they no longer compile. They will be removed
in a few weeks if they haven't been made MPSAFE in that time.
Disabled drivers:
if_ar
if_axe
if_aue
if_cdce
if_cue
if_kue
if_ray
if_rue
if_rum
if_sr
if_udav
if_ural
if_zyd
Drivers that were already disabled because of tty changes:
if_ppp
if_sl
Discussed on: arch@
certain flags that should have been in inp_flags ended up in inp_vflag,
meaning that they were inconsistently locked, and in one case,
interpreted. Move the following flags from inp_vflag to gaps in the
inp_flags space (and clean up the inp_flags constants to make gaps
more obvious to future takers):
INP_TIMEWAIT
INP_SOCKREF
INP_ONESBCAST
INP_DROPPED
Some aspects of this change have no effect on kernel ABI at all, as these
are UDP/TCP/IP-internal uses; however, netstat and sockstat detect
INP_TIMEWAIT when listing TCP sockets, so any MFC will need to take this
into account.
MFC after: 1 week (or after dependencies are MFC'd)
Reviewed by: bz
guarantee that all cpus have acknowledged the cleared enable int by
scheduling the resetting thread on each cpu in succession. Since all
lock profiling happens within a critical section this guarantees that
all cpus have left lock profiling before we clear the datastructures.
- Assert that the per-thread queue of locks lock profiling is aware of
is clear on thread exit. There were several cases where this was not
true that slows lock profiling and leaks information.
- Remove all objects from all lists before clearing any per-cpu
information in reset. Lock profiling objects can migrate between
per-cpu caches and previously these migrated objects could be zero'd
before they'd been removed
Discussed with: attilio
Sponsored by: Nokia
C-NET(PC) has a cfe at location 1 that has both an odd irq mask (it
matches pc98 machines, so maybe it was a flag for pc98 operation) as
well as a memory map. Since this driver doesn't know how to cope, we
start with cfe2, which is purely an I/O space mapped and that seems to
make it work. I say 'seems' here, because the card I have doesn't
seem to have the right dongle for full testing...
when someone is passing new rules, not when he only want to read them.
Because of this bug, even if the given rules were incorrect, they
ended up in rule_string.
- Add missing protection for rule_string when coping it.
Reviewed by: rwatson
MFC after: 1 week
pthread_sigmask() to signal.h. In principle, this shouldn't break anything,
since they're already in signal.h on other systems, and the FreeBSD
manpage says that both pthread.h and signal.h need to be included to
get these functions.
Add a hack to declare pthread_t in the P1003.1-2008 namespace
in signal.h.
in the POSIX namespace, and hiding eaccess() and setproctitle().
Also move mknodat() from unistd.h to sys/stat.h where it belongs.
The *at() syscalls are only in CURRENT, so this shouldn't cause
problems.
improve performance:
- Eliminate custom reference count and condition variable to monitor
threads entering the framework, as this had both significant overhead
and behaved badly in the face of contention.
- Replace reference count with two locks: an rwlock and an sx lock,
which will be read-acquired by threads entering the framework
depending on whether a give policy entry point is permitted to sleep
or not.
- Replace previous mutex locking of the reference count for exclusive
access with write acquiring of both the policy list sx and rw locks,
which occurs only when policies are attached or detached.
- Do a lockless read of the dynamic policy list head before acquiring
any locks in order to reduce overhead when no dynamic policies are
loaded; this a race we can afford to lose.
- For every policy entry point invocation, decide whether sleeping is
permitted, and if not, use a _NOSLEEP() variant of the composition
macros, which will use the rwlock instead of the sxlock. In some
cases, we decide which to use based on allocation flags passed to the
MAC Framework entry point.
As with the move to rwlocks/rmlocks in pfil, this may trigger witness
warnings, but these should (generally) be false positives as all
acquisition of the locks is for read with two very narrow exceptions
for policy load/unload, and those code blocks should never acquire
other locks.
Sponsored by: Google, Inc.
Obtained from: TrustedBSD Project
Discussed with: csjp (idea, not specific patch)
(1) Fix pcib_read/write_config prototypes.
(2) When contrainting a resource request for a 'subtractive' bridge,
it is important to select a range outside the base/limit
registers, since those are the only values known to not
possibly work. On my HP laptop, the base bridge excludes I/O
ports 0xa000-0xafff, however that was the range we were passing
up the tree. Instead, when a range spans the "hole" we now
arbitrarily pick the range just above the hole to allocate from.
All of my rl and xl cards, at a minimum, started working again on this
laptop with those fixes.
- When sending large PR-SCTP messages over a
lossy link we would incorrectly calculate the fwd-tsn
- When receiving large multipart pr-sctp packets we would
incorrectly send back a SACK that would renege improperly
on already received packets thus causing unneeded retransmissions.
is calculated as 0 which causes errors elsewhere.
Submitted by: KOIE Hidetaka <koie@suri.co.jp>
- When sched_affinity() is called with a thread that is not curthread we
need to handle the ON_RUNQ() case by adding the thread to the correct
run queue.
Submitted by: Justin Teller <justin.teller@gmail.com>
MFC after: 1 Week
".note.ABI-tag" section.
The search order of a brand is changed, now first of all the
".note.ABI-tag" is looked through.
Move code which fetch osreldate for ELF binary to check_note() handler.
PR: 118473
Approved by: kib (mentor)
booting because the CD driver did not use bounce buffers to ensure
request buffers sent to the BIOS were always in the first 1MB. Copy over
the bounce buffer logic from the BIOS disk driver (minus the 64k boundary
code for floppies) to fix this.
Reported by: kensmith
look like a temperature.
This driver will most likely be renamed to something more meaningful in
the near future.
Submitted by: nork
MFC after: 2 weeks
o add 9280 attach that sets up ini, cal, etc.
o new rf backend for 9280 and later parts
o split ini setup and spur mitigation support out to methods
and provide 9280-specific support
o minor fixups to shared code to handle 9280-specific work
Obtained from: Atheros (ini values and some code)
Provide a custom lock around initializing and tearing down EA area,
to prevent both memory leaks and double-free of it. Count the number
of EA area accessors.
Lock protocol requires either holding exclusive vnode lock to modify
i_ea_area, or shared vnode lock and owning IN_EA_LOCKED flag in i_flag.
Noted by: YAMAMOTO, Taku <taku tackymt homeip net>
Tested by: pho (previous version)
MFC after: 2 weeks
both disks, or if we should suppress the slave drive. Default to
suppressing the slave, in the case that this REQIURED tuple turns out
to not actually be present...
Based on the HAL preemption lock there is a problem on SMP machines
and causes a panic.
o When a device detached the current tactic to detach NDIS USB driver is
to call SURPRISE_REMOVED event. So it don't need to call
ndis_halt_nic() again. This fixes some page faults when some drivers
work abnormal.
o it assumes now that URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER is in
DISPATCH_LEVEL (non-sleepable) and as further work
URB_FUNCTION_VENDOR_XXX and URB_FUNCTION_CLASS_XXX should be.
Reviewed by: Hans Petter Selasky <hselasky_at_freebsd.org>
Tested by: Paul B. Mahol <onemda_at_gmail.com>
More HID parsing fixes for usb mice.
- be less strict on the last HID item usage.
- preserve item size and count accross items
- improve default HID usage selection.
Tested by: ache
Submitted by: Hans Petter Selasky
o Header file cleanup.
o bus_dma(9) conversion.
- Removed all consumers of vtophys(9) and converted to use
bus_dma(9).
- Typhoon2 functional specification says the controller supports
64bit DMA addressing. However all Typhoon controllers are known
to lack of DAC support so 64bit DMA support was disabled.
- The hardware can't handle more 16 fragmented Tx DMA segments so
teach txp(4) to collapse these segments to be less than 16.
- Added Rx buffer alignment requirements(4 bytes alignment) and
implemented fixup code to align receive frame. Previously
txp(4) always copied Rx frame to align it on 2 byte boundary
but its copy overhead is much higher than unaligned access on
i386/amd64. Alignment fixup code is now applied only for
strict-alignment architectures. With this change i386 and
amd64 will get instant Rx performance boost. Typhoon2 datasheet
mentions a command that pads arbitrary bytes in Rx buffer but
that command does not work.
- Nuked pointer trick in descriptor ring. This does not work on
sparc64 and replaced it with bcopy. Alternatively txp(4) can
embed a 32 bits index value into the descriptor and compute
real buffer address but it may make code complicated.
- Added endianness support code in various Tx/Rx/command/response
descriptor access. With this change txp(4) should work on all
architectures.
o Added comments for known firmware bugs(Tx checksum offloading,
TSO, VLAN stripping and Rx buffer padding control).
o Prefer faster memory space register access to I/O space access.
Added fall-back mechanism to use alternative I/O space access.
The hardware supports both memory and I/O mapped access. Users
can still force to use old I/O space access by setting
hw.txp.prefer_iomap tunable to 1 in /boot/loader.conf.
o Added experimental suspend/resume methods.
o Nuke error prone Rx buffer handling code and implemented local
buffer management with TAILQ. Be definition the controller can't
pass the last received frame to host if no Rx free buffers are
available to use as head and tail pointer of Rx descriptor ring
can't have the same value. In that case the Rx buffer pointer in
Rx buffer ring still holds a valid buffer and txp_rxbuf_reclaim()
can't fill Rx buffers as the first buffer is still valid. Instead
of relying on the value of Rx buffer ring, introduce local buffer
management code to handle empty buffer situation. This should fix
a long standing bug which completely hangs the controller under
high network load. I could easily trigger the issue by sending 64
bytes UDP frames with netperf. I have no idea how this bugs was
not fixed for a long time.
o Converted ithread interrupt handler to filter based one.
o Rearranged txp_detach routine such that it's now used for general
clean-up routine.
o Show sleep image version on device attach time. This will help
to know what action should be taken depending on sleep image
version. The version information in datasheet was wrong for newer
NV images so I followed Linux which seems to correctly extract
version numbers from response descriptors.
o Firmware image is no longer downloaded in device attach time. Now
it is reloaded whenever if_init is invoked. This is to ensure
correct operation of hardware when something goes wrong.
Previously the controller always run without regard to running
state of firmware. This change will add additional controller
initialization time but it give more robust operation as txp(4)
always start off from a known state. The controller is put into
sleep state until administrator explicitly up the interface.
o As firmware is loaded in if_init handler, it's now possible to
implement real watchdog timeout handler. When watchdog timer is
expired, full-reset the controller and initialize the hardware
again as most other drivers do. While I'm here use our own timer
for watchdog instead of using if_watchdog/if_timer interface.
o Instead of masking specific interrupts with TXP_IMR register,
program TXP_IER register with the interrupts to be raised and
use TXP_IMR to toggle interrupt generation.
o Implemented txp_wait() to wait a specific state of a controller.
o Separate boot related code from txp_download_fw() and name it
txp_boot() to handle boot process.
o Added bus_barrier(9) to host to ARM communication.
o Added endianness to all typhoon command processing. The ARM93C
always expects little-endian format of command/data.
o Removed __STRICT_ALIGNMENT which is not valid on FreeBSD.
__NO_STRICT_ALIGNMENT is provided for that purpose on FreeBSD.
Previously __STRICT_ALIGNMENT was unconditionally defined for
all architectures.
o Rewrote SIOCSIFCAP ioctl handler such that each capability can be
controlled by ifconfig(8). Note, disabling VLAN hardware tagging
has no effect due to the bug of firmware.
o Don't send TXP_CMD_CLEAR_STATISTICS to clear MAC statistics in
txp_tick(). The command is not atomic. Instead, just read the
statistics and reflect saved statistics to the statistics.
dev.txp.%d.stats sysctl node provides detailed MAC statistics.
This also reduces a lot of waste of CPU cycles as processing a
command ring takes a very long time on ARM93C. Note, Rx
multicast and broadcast statistics does not seem to right. It
might be another bug of firmware.
o Implemented link state change handling in txp_tick(). Now sending
packets is allowed only after establishing a valid link. Also
invoke link state change notification whenever its state is
changed so pseudo drivers like lagg(4) that relies on link state
can work with failover or link aggregation without hacks.
if_baudrate is updated to resolved speed so SNMP agents can get
correct bandwidth parameters.
o Overhauled Tx routine such that it now honors number of allowable
DMA segments and checks for 4 free descriptors before trying to
send a frame. A frame may require 4 descriptors(1 frame
descriptor, 1 or more frame descriptors, 1 TSO option descriptor,
one free descriptor to prevent descriptor wrap-around) at least
so it's necessary to check available free descriptors prior to
setting up DMA operation.
o Added a sysctl variable dev.txp.%d.process_limit to control
how many received frames should be served in Rx handler. Valid
ranges are 16 to 128(default 64) in unit of frames.
o Added ALTQ(4) support.
o Added missing IFCAP_VLAN_HWCSUM as txp(4) can offload checksum
calculation as well as VLAN tag insertion/stripping.
o Fixed media header length for VLAN.
o Don't set if_mtu in device attach, it's already set in
ether_ifattach().
o Enabled MWI.
o Fixed module unload panic when bpf listeners are active.
o Rearranged ethernet address programming logic such that it works
on strict-alignment architectures.
o Removed unused member variables in softc.
o Added support for WOL.
o Removed now unused TXP_PCI_LOMEM/TXP_PCI_LOIO.
o Added wakeup command TXP_BOOTCMD_WAKEUP definition.
o Added a new firmware version query command, TXP_CMD_READ_VERSION.
o Removed volatile keyword in softc as bus_dmamap_sync(9) should
take care of this.
o Removed embedded union trick of a structure used to to access
a pointer on LP64 systems.
o Added a few TSO related definitions for struct txp_tcpseg_desc.
However TSO is not used at all due to the limitation of hardware.
o Redefined PKT_MAX_PKTLEN to theoretical maximum size of a frame.
o Switched from bus_space_{read|write}_4 to bus_{read|write}_4.
o Added a new macro TXP_DESC_INC to compute next descriptor index.
Tested by: don.nasco <> gmail dot com
poll(), only copy out the revents field, not the whole pollfd
structure. Otherwise, if the events field is updated
concurrently by another thread, that update may be lost.
This issue apparently causes problems for the JDK on FreeBSD,
which expects the Linux behavior of not updating all fields
(somewhat oddly, Solaris does not implement the required
behavior, but presumably our adaptation of the JDK is based
on the Linux port?).
MFC after: 2 weeks
PR: kern/130924
Submitted by: Kurt Miller <kurt @ intricatesoftware.com>
Discussed with: kib
internal sysctl_sysctl_name() handler to map the MIB array to a string
name and logs this name in the trace log. This can be useful to see
exactly which sysctls a thread is invoking.
MFC after: 1 month
filesystem supports additional operations using shared vnode locks.
Currently this is used to enable shared locks for open() and close() of
read-only file descriptors.
- When an ISOPEN namei() request is performed with LOCKSHARED, use a
shared vnode lock for the leaf vnode only if the mount point has the
extended shared flag set.
- Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but
not O_CREAT.
- Use a shared vnode lock around VOP_CLOSE() if the file was opened with
O_RDONLY and the mountpoint has the extended shared flag set.
- Adjust md(4) to upgrade the vnode lock on the vnode it gets back from
vn_open() since it now may only have a shared vnode lock.
- Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since
FIFO's require exclusive vnode locks for their open() and close()
routines. (My recent MPSAFE patches for UDF and cd9660 already included
this change.)
- Enable extended shared operations on UFS, cd9660, and UDF.
Submitted by: ups
Reviewed by: pjd (ZFS bits)
MFC after: 1 month
legal in the spec. Add newline to the verbose messages we print when
debugging when this happens. The Hitachi HT-4840-11 is the only card
to hit these in years, and it works well enough if we're liberal about
what we accept.
the unmanaged flag set in the PVO attributes. Without doing this,
pmap_remove() could try to remove fictitious pages (like those created
by mmap of physical memory) from the wrong UMA zone, causing a panic.
Reported by: Justin Hibbits
MFC after: 1 week
to resurrect (maybe honor foot shooting bit in kern.geom_debugflags)
o fix match macro so we now recognize we want to merge FIS dir with RedBoot
config parameters even if we don't actually do it
or not the inpcb is currenty on various hash lookup lists, rather
than using (lport != 0) to detect this. This means that the full
4-tuple of a connection can be retained after close, which should
lead to more sensible netstat output in the window between TCP
close and socket close.
MFC after: 2 weeks
this fixes geom_redboot on boards that have multiple parts/regions as it
uses the value to locate the FIS directory which is in the last erase
region of flash
Firmware upgraded to 7.1.0 (from 5.0.0).
T3C EEPROM and SRAM added; Code to update eeprom/sram fixed.
fl_empty and rx_fifo_ovfl counters can be observed via sysctl.
Two new cxgbtool commands to get uP logic analyzer info and uP IOQs
Synced up with Chelsio's "common code" (as of 03/03/09)
Submitted by: Navdeep Parhar at Chelsio
Reviewed by: gnn
MFC after: 2 weeks
operate on the resource (we have no local resources to manage); this
fixes drivers that alloc/release resources in their probe method and
then do it again in attach
o while here add some prints to catch failures and massage style a bit
hit in the name cache. cache_lookup() doesn't actually return ENOENT
for such requests to force the filesystem to do an explicit lookup, so
this was effectively dead code.
- Grab the nfsnode mutex while writing to n_dmtime. We don't grab the lock
when comparing the time against the cached directory mod time (just as
we don't when comparing ctime's for +ve name cache hits) since the
attribute caching is already racy for NFS clients as it is.
Discussed with: bde
in6p_ip6_nxt
in6p_vflag
in6p_flags
in6p_socket
in6p_lport
in6p_fport
in6p_ppcb
Remove unused v6 macro aliases for inpcb flags:
IN6P_HIGHPORT
IN6P_LOWPORT
IN6P_ANONPORT
IN6P_RECVIF
IN6P_MTUDISC
IN6P_FAITH
IN6P_CONTROLOPTS
References to in6p_lport and in6_fport in sockstat are also replaced with
normal inp_lport and inp_fport references.
MFC after: 3 days
Reviewed by: bz
o encode need for A4 bus space tag hackery according to the memory
address; checking for "uart" breaks down with the GPS chip support
which is also a uart but does not require the same hackery
o encode the correct memory window instead of carving up all of i/o
space, potentially with a larger window than a device should have;
this likely should be handled in the drivers by using a proper bus
alloc call but since some drivers depend on the bus support to figure
this out we cannot simply mod them
o add optional GPS and RS485 support (conditionally as the support
isn't ready yet)
needed.
- Move the release of the sysctl sx lock after the vsunlock() in
userland_sysctl() to restore the original memlock behavior of
minimizing the amount of memory wired to handle sysctl requests.
MFC after: 1 week
implementation instead. The bypass does not assume that returned vnode
is only held.
Reported by: Paul B. Mahol <onemda gmail com>, pluknet <pluknet gmail com>
Reviewed by: jhb
Tested by: pho, pluknet <pluknet gmail com>
consumers which fork after the shared pages have been setup. pflogd(8)
is an example. The problem is understood and there is a fix coming in
shortly.
Folks who want to continue using it can do so by setting
net.bpf.zerocopy_enable
to 1.
Discussed with: rwatson
the PORTEN and MEMEN bits in the command register than to zero the
bars.
Use pci_write_ivar directly instead of a one-line wrapper that adds no
value.
Track verbosity changes in pci.
Remove a stray blank line.
After I imported libteken into the source tree, I noticed syscons didn't
store the cursor position inside the terminal emulator, but inside the
virtual terminal stat. This is not very useful, because when you
implement more complex forms of line wrapping, you need to keep track of
more state than just the cursor position.
Because the kernel messages didn't share the same terminal emulator as
ttyv0, this caused a lot of strange things, like kernel messages being
misplaced and a missing notification to resize the terminal emulator for
kernel messages never to be resized when using vidcontrol.
This patch just removes kernel_console_ts and adds a special parameter
to te_puts to determine whether messages should be printed using regular
colors or the ones for kernel messages.
Reported by: ache
Tested by: nyan, garga (older version)
table pages. Now, all accesses to user-space page table pages are
performed through the direct map. (The recursive mapping is only used
to access kernel-space page table pages.)
Eliminate the TLB invalidation on the recursive mapping when a user-space
page table page is removed from the page table and when a user-space
superpage is demoted.
We should just leave the underlying TTY objects alone when scrolling
around in KDB. It should be handled by Syscons exclusively.
Reported by: pluknet gmail com
address space sizes to be longs instead of ints. Specifically, the follow
values are now longs: runningbufspace, bufspace, maxbufspace,
bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace,
hirunningspace, maxswzone, maxbcache, and maxpipekva. Previously, a
relatively small number (~ 44000) of buffers set in kern.nbuf would result
in integer overflows resulting either in hangs or bogus values of
hidirtybuffers and lodirtybuffers. Now one has to overflow a long to see
such problems. There was a check for a nbuf setting that would cause
overflows in the auto-tuning of nbuf. I've changed it to always check and
cap nbuf but warn if a user-supplied tunable would cause overflow.
Note that this changes the ABI of several sysctls that are used by things
like top(1), etc., so any MFC would probably require a some gross shims
to allow for that.
MFC after: 1 month
debug.hashstat.rawnchash sysctl in particular as taking 7 milliseconds on
a 3GHz Intel Xeon (4x2) running 7.1. It accounted for almost a quarter of
the total runtime of 'sysctl -a'. It also performs lots of copyout's while
holding the namecache lock (this does not attempt to fix that).
MFC after: 2 weeks
IPv4 stack.
Diffs are minimized against p4.
PCS has been used for some protocol verification, more widespread
testing of recorded sources in Group-and-Source queries is needed.
sizeof(struct igmpstat) has changed.
__FreeBSD_version is bumped to 800070.
in make.conf or src.conf.
- When GPT is enabled (which it is by default), use memory above 1 MB and
leave the memory from the end of the bss to the end of the 640k window
purely for the stack. The loader has grown and now it is much more
common for the heap and stack to grow into each other when both are
located in the 640k window.
PR: kern/129526
MFC after: 1 week
ports tree so that programs use libusb from the base by default. Thanks to
Stanislav Sedov for sorting out the ports build.
Bump __FreeBSD_version to 800069
Help and testing by: stas
was introduced. If you have a bus, say cardbus, that is derived from
a base-bus (say PCI), then ordinarily all PCI drivers would attach to
cardbus devices. However, there had been one exception: kldload
wouldn't work.
The problem is in devclass_add_driver. In this routine, all we did
was call to the pci device's BUS_DRIVER_ADDED routine. However, since
cardbus bus instances had a different devclass, none of them were
called.
The solution is to call all subclass devclasses, recursively down the
tree, of the class that was loaded. Since we don't have a 'children
class' pointer, we search the whole list of devclasses for a class
whose parent matches. Since just done a kldload time, this isn't as
bad as it sounds. In addition, we short-circuit the whole process by
marking those classes with subclasses with a flag. We'll likely have
to reevaluate this method the number of devclasses with subclasses
gets large.
This means we can remove the "cardbus" lines from all the PCI drivers
since we have no cardbus specific attach device attachments in the
tree.
# Also: minor tweak to an error message
unlikely but not impossible given modern thread counts) wrap-around,
and the compiler was padding it out to an int (at least) anyway.
MFC after: 3 days (but confirm ABI impact)
system call entry path and i386 IP checksum generation: we now assume
all code is MPSAFE unless explicitly marked otherwise. Remove XXX
Giant comments along similar lines: the code by the comments either
doesn't need or doesn't want Giant (especially the NMI handler).
MFC after: 3 days
not there is an audit record hung off of td_ar on the current thread.
Test this flag instead of td_ar when auditing syscall arguments or
checking for an audit record to commit on syscall return. Under
these circumstances, td_pflags is much more likely to be in the cache
(especially if there is no auditing of the current system call), so
this should help reduce cache misses in the system call return path.
MFC after: 1 week
Reported by: kris
Obtained from: TrustedBSD Project
get default next page configuration. While I'm here explicitly set
IP1000PHY_ANAR_CSMA bit. This bit is read-only and always set
by hardware so setting it has no effect but it would clear the
intention. With this change controllers that couldn't establish
1000baseT link should work.
PR: kern/130846
mapping. The tunable is OFF for all controllers except RTL8169SC
family. RTL8169SC seems to require more magic to use memory
register mapping. r187483 added a fix for RTL8169SCe controller but
it does not looke like fix other variants of RTL8169SC.
Tested by: Gavin Stone-Tolcher g.stone-tolcher <> its dot uq dot edu dot au
o correct dBm<->mW conversion logic
o set net80211 TXPMGT capability only if driver reports it is capable
PR: kern/132342
Submitted by: "Paul B. Mahol" <onemda@gmail.com>
Fix bugs and improve HID parsing.
- fix possible memory leak found
- fix possible NULL pointer access
- fix possible invalid memory read
- parsing improvements
- reset item data position when a new report ID is detected.
Submitted by: Hans Petter Selasky
query functions in the kernel, as these effectively serialize
parallel calls to the gettimeofday(2) system call, as well as
other kernel services that use timestamps.
Use the NetBSD version of the fix (kern_tc.c:1.32 by ad@) as
they have picked up our timecounter code and also ran into the
same problem.
Reported by: kris
Obtained from: NetBSD
MFC after: 3 days
locks: a global list/counter/generation counter protected by a new
mutex unp_list_lock, and a global linkage rwlock, unp_global_rwlock,
which protects the connections between UNIX domain sockets.
This eliminates conditional lock acquisition that was previously a
property of the global lock being held over sonewconn() leading to a
call to uipc_attach(), which also required the global lock, but
couldn't rely on it as other paths existed to uipc_attach() that
didn't hold it: now uipc_attach() uses only the list lock, which
follows the linkage lock in the lock order. It may also reduce
contention on the global lock for some workloads.
Add global UNIX domain socket locks to hard-coded witness lock
order.
MFC after: 1 week
Discussed with: kris
directory of a vnode to find a dirent with a matching file number. The
name from that dirent is then used to provide the component name.
Note: if the initial vnode argument is not a directory itself, then
the default VOP_VPTOCNP(9) implementation still returns ENOENT.
Reviewed by: kib
Approved by: kib
Tested by: pho
extended attribute get/set; in the case of get an uninitialized user
buffer was passed before the EA was retrieved, making it of relatively
little use; the latter was simply unused by any policies.
Obtained from: TrustedBSD Project
Sponsored by: Google, Inc.
naming by renaming certain "proc" entry points to "cred" entry points,
reflecting their manipulation of credentials. For some entry points,
the process was passed into the framework but not into policies; in
these cases, stop passing in the process since we don't need it.
mac_proc_check_setaudit -> mac_cred_check_setaudit
mac_proc_check_setaudit_addr -> mac_cred_check_setaudit_addr
mac_proc_check_setauid -> mac_cred_check_setauid
mac_proc_check_setegid -> mac_cred_check_setegid
mac_proc_check_seteuid -> mac_cred_check_seteuid
mac_proc_check_setgid -> mac_cred_check_setgid
mac_proc_check_setgroups -> mac_cred_ceck_setgroups
mac_proc_check_setregid -> mac_cred_check_setregid
mac_proc_check_setresgid -> mac_cred_check_setresgid
mac_proc_check_setresuid -> mac_cred_check_setresuid
mac_proc_check_setreuid -> mac_cred_check_setreuid
mac_proc_check_setuid -> mac_cred_check_setuid
Obtained from: TrustedBSD Project
Sponsored by: Google, Inc.
privilege grants so that dtrace can be more easily used to monitor
the security decisions being generated by the MAC Framework following
policy invocation.
Successful access control checks will be reported by:
mac_framework:kernel:<entrypoint>:mac_check_ok
Failed access control checks will be reported by:
mac_framework:kernel:<entrypoint>:mac_check_err
Successful privilege grants will be reported by:
mac_framework:kernel:priv_grant:mac_grant_ok
Failed privilege grants will be reported by:
mac_framework:kernel:priv_grant:mac_grant_err
In all cases, the return value (always 0 for _ok, otherwise an errno
for _err) will be reported via arg0 on the probe, and subsequent
arguments will hold entrypoint-specific data, in a style similar to
privilege tracing.
Obtained from: TrustedBSD Project
Sponsored by: Google, Inc.
are not currently owned by userspace before clearing or rotating them.
Otherwise we may not play by the rules of the shared memory protocol,
potentially corrupting packet data or causing userspace applications
that are playing by the rules to spin due to being notified that a
buffer is complete but the shared memory header not reflecting that.
This behavior was seen with pflogd by a number of reporters; note that
this fix is not sufficient to get pflogd properly working with
zero-copy BPF, due to pflogd opening the BPF device before forking,
leading to the shared memory buffer not being propery inherited in the
privilege-separated child. We're still deciding how to fix that
problem.
This change exposes buffer-model specific strategy information in
reset_d(), which will be fixed at a later date once we've decided how
best to improve the BPF buffer abstraction.
Reviewed by: csjp
Reported by: keramida
the disklabel in the 2nd sector for boot code. Even with both UFS1
and UFS2 supported, there's enough bytes left that we don't have to
nibble from the disklabel.
Thus, the entire 2nd sector is now reserved for the disklabel, which
makes the bootcode compatible again with disklabels that have more
than 8 partitions -- such as those created and supported by gpart.
i386: 135 bytes available
amd64: 151 bytes available
Ok'd by: jhb
Tested on an HD3850 (RV670) on loan from Warren Block.
Currently, you need one of the following for this to be useful:
x11-drivers/xf86-video-radeonhd-devel (not tested)
xf86-video-ati from git (EXA works, xv is too fast)
xf86-video-radeonhd from git (EXA works, xv works)
There is no 3d support available from dri just yet.
MFC after: 2 weeks
o add Transaction Translator support (still missing ISOC xfers)
o add EHCI_SCFLG_BIGEMMIO flag to force big-endian byte-select to be
set in USBMODE
o split reset work into new public routine ehci_reset so bus shim drivers
can force big-endian byte-select before ehci_init
o enable TT and big-endian MMIO
o force a reset before ehci_init to get byte-select setup
Also go back to using USB_EHCI_BIG_ENDIAN_DESC at compile time to enable the
byteswapping and reduce diffs to the original commits.
This fixes the new USB stack on the Cambria board.
o implement URB_FUNCTION_ABORT_PIPE handling.
o remove unused code related with canceling the timer list for USB
drivers.
o whitespace cleanup and style(9)
Obtained from: hps's original patch
o improves understandability by replacing numerous relative address
calculations with fixed addresses; everything should now match up
more easily with the vm layout shown at the top of the file
o move the expansion bus chip select regions to be contiguous with
the expansion bus configuration area; this is not exploited right
now but allows map consolidation in the future
o leave a gap between the expansion bus regions and the pci config
space in case we want to map more exp bus cs regions
Reviewed by: imp, thompsa
poll_no_poll().
Return a poll_no_poll() result from devfs_poll_f() when
filedescriptor does not reference the live cdev, instead of ENXIO.
Noted and tested by: hps
MFC after: 1 week
1) WP should never be marked unless flight size is 0
2) When recovering from wp if the peer ack's it we don't mark for retran
3) When recovering, we must assure a timer is still running.
ABIs:
- Store the FPU initial control word in the pcb for each thread.
- When first using the FPU, load the initial control word after restoring
the clean state if it is not the standard control word.
- Provide a correct control word for Linux/i386 binaries under
FreeBSD/amd64.
- Adjust the control word returned for fpugetregs()/npxgetregs() when a
thread hasn't used the FPU yet to reflect the real initial control
word for the current ABI.
- The Linux/i386 ABI for FreeBSD/i386 now properly sets the right control
word instead of trashing whatever the current state of the FPU is.
Reviewed by: bde
- Enable keyboard autodetection by default for ISA syscons attachments.
- If there are no syscons hints at all, assume there is a single sc0 device
anyway. The console probe will still fail unless a VGA adapter is found.
MFC after: 2 weeks
- Remove the control word parameter to npxinit(). It was always set
to __INITIAL_NPXCW__.
- Remove npx_cleanstate_ready as the cleanstate is always initalized
when it is used.
- Improve the handling of the case when the FPU isn't present. Now
the npx0 device no longer succeeds in its probe so all of npx_attach()
is skipped. Also, we allow this case with SMP (though that shouldn't
actually occur as all i386 systems that support SMP have FPUs) now.
SMP was only an issue back when we had an FPU emulator which was not
per-CPU.
- MFamd64: Clear some of the state in npx_cleanstate rather than leaving
it as garbage.
- MFamd64: When a user thread first uses the FPU, use npx_cleanstate for
the initial FPU state.
Reviewed by: bde
- fpudna() always returned 1 since amd64 CPUs always have FPUs. Change
the function to return void and adjust the calling code in trap() to
assume the return 1 case is the only case.
- Remove fpu_cleanstate_ready as it is always true when it is tested.
Also, only initialize fpu_cleanstate when fpuinit() is called on the BSP.
Reviewed by: bde
entry is a specific entry to override the generic NetMos entry so that
puc(4) will leave this device alone and let uart(4) claim it.
Submitted by: Navdeep Parhar nparhar @ gmail
Reviewed by: marcel
MFC after: 1 week
bogus entries have a starting IRQ that is invalid (> 255, so won't fit
into a PCI intline config register). It had the side effect of breaking
MSI by "claiming" several IRQs in the MSI range. Fix this by ignoring such
I/O APICs.
MFC after: 2 weeks
when determining the size of a BAR by writing all 1's to the BAR and
reading back the result, always operate on the full 64-bit size.
Reviewed by: imp
MFC after: 1 month
flag when calling bus_alloc_resource() to allocate resources from a parent
PCI bridge. For PCI-PCI bridges this asks the bridge to satisfy the
request using the prefetchable memory range rather than the normal
memory range.
Reviewed by: imp
Reported by: scottl
MFC after: 1 week
Do not overload the local variable size in kern_shmat() due to vm_size_t
change.
Fix style bug by adding explicit comparision with 0.
Discussed with: bde
MFC after: 1 week
BAR could be allocated twice by different children of a vgapci0 device.
To fix this, change the vgapci0 device to track references on its associated
resources so that they are only allocated once from the parent PCI bus and
released when no children are using them. Previously this leaked a small
amount of KVA on at least some architectures.
into the advance_peer_ack point so we would incorrectly
send a wrong value in the FWD-TSN
- PR-SCTP bug, where an PR packet is used for a window
probe which could incorrectly get the packet moved
back into the send_queue, which will cause major issues and
should not happen.
- Fix a trace to use the proper macro.
We now explicitly enable INTx during bus_setup_intr() if it is needed.
Several of the ata drivers were managing this bit internally. This is
better handled in pci and it should work for all drivers now.
We also mask INTx during bus_teardown_intr() by setting this bit.
Reviewed by: jhb
MFC after: 3 days
We fail mapping for any udf_bmap_internal error and there can be
different reasons for it, so no need to (over-)emphasize files with
data in fentry.
Submitted by: bde
Approved by: jhb
are used by glibc. This silents the message "2.4+ kernel w/o ELF notes?"
from some programs at start, among them are top and pkill.
Do the assignment of the vector entries in elf_linux_fixup()
as it is done in glibc.
Fix some minor style issues.
Submitted by: Marcin Cieslak <saper at SYSTEM PL>
Approved by: kib (mentor)
MFC after: 1 week
and do not attempt to perform a group lookup.
This is a socket layer lock, and the bottom half of IP
really has no business taking it.
Use the value of the in_mcast_loop sysctl to determine
if we should loop back by default, in the absence of
any multicast socket options. Because the check on
group membership is now deferred to the input path,
an m_copym() is now required.
This should increase multicast send performance where the
source has not requested loopback, although this has not been
benchmarked or measured.
It is also a necessary change for IN_MULTI_LOCK to become
non-recursive, which is required in order to implement IGMPv3
in a thread-safe way.
IPv4 multicast sends are looped back to senders by default
on a stack-wide basis, rather than relying on the socket option.
Note that the sysctl only applies to newly created multicast sockets.
- Added missing firmware for 5709 A1 controllers.
- Changed some debug statistic variable names to be more consistent.
Submitted by: davidch
MFC after: Two weeks
while developing and compiling with kernel options that change the
size of at least one structure. The current kernel build framework
does not allow us to pass -Dxxx to module builds so we would possibly
need a kernel option to disable the checks and that might not work
for people just building modules alone.
For now they helped to identify possibly API problems and bring
those back into minds of developers seeking for better solutions.
Problems reported by: kib, warner
Reviewed by: warner
Clang disallows structs with variable length arrays to be nested inside
other structs, because this is in violation with ISO C99. Even though we
can keep bugging the LLVM folks about this issue, we'd better just fix
our code to not do this. This code seems to be the only code in the
entire source tree that does this.
I haven't tested this patch by using the kernel modules in question, but
Diane Bruce and I have compared disassembled versions of these kernel
modules. We would have expected them to be exactly the same, but due to
randomness in the register allocator and reordering of instructions,
there were some minor differences.
Approved by: julian
wrapper macros that allow trace points and arguments to be declared
using a single macro rather than several. This means a lot less
repetition and vertical space for each trace point.
Use these macros when defining privilege and MAC Framework trace points.
Reviewed by: jb
MFC after: 1 week
A while back, Warner changed the PCI bus code to reserve resources when
enumerating devices and simply give devices the previously allocated
resources when they call bus_alloc_resource(). This ensures that address
ranges being decoded by a BAR are always allocated in the nexus0 device
(or whatever device the PCI bus gets its address space from) even if a
device driver is not attached to the device. This patch extends this
behavior further:
- To let the PCI bus distinguish between a resource being allocated by
a device driver vs. merely being allocated by the bus, use
rman_set_device() to assign the device to the bus when it is owned
by the bus and to the child device when it is allocated by the child
device's driver. We can now prevent a device driver from allocating
the same device twice. Doing so could result in odd things like
allocating duplicate virtual memory to map the resource on some
archs and leaking the original mapping.
- When a PCI device driver releases a resource, don't pass the request
all the way up the tree and release it in the nexus (or similar device)
since the BAR is still active and decoding. Otherwise, another device
could later allocate the same range even though it is still in use.
Instead, deactivate the resource and assign it back to the PCI bus
using rman_set_device().
- pci_delete_resource() will actually completely free a BAR including
attemping to disable it.
- Disable BAR decoding via the command register when sizing a BAR in
pci_alloc_map() which is used to allocate resources for a BAR when
the BIOS/firmware did not assign a usable resource range during boot.
This mirrors an earlier fix to pci_add_map() which is used when to
size BARs during boot.
- Move the activation of I/O decoding in the PCI command register into
pci_activate_resource() instead of doing it in pci_alloc_resource().
Previously we could actually enable decoding before a BAR was
initialized via pci_alloc_map().
Glanced at by: bsdimp
RH0 was deprecated by RFC 5095.
While most of the code had been disabled by #if 0 already, leave a
bit of infrastructure for possible RH2 code and a log message under
BURN_BRIDGES in case a user still tries to send RH0 packets.
Reviewed by: gnn (a bit back, earlier version)
Previosly readdir missed some directory entries because there was
no space for them in current uio but directory stream offset
was advanced nevertheless.
jhb has discoved the issue and provided a test-case.
Reviewed by: bde
Approved by: jhb (mentor)
vfsopt and the vfs_buildopts function public, and add some new fields
to struct vfsopt (pos and seen), and new functions vfs_getopt_pos and
vfs_opterror.
Further extend the interface to allow reading options from the kernel
in addition to sending them to the kernel, with vfs_setopt and related
functions.
While this allows the "name=value" option interface to be used for more
than just FS mounts (planned use is for jails), it retains the current
"vfsopt" name and <sys/mount.h> requirement.
Approved by: bz (mentor)
as a handler.
The variable was exported only for debugging, but there is little reason
to do it now that the timekeeping is supported by various other variables.
For the time being just comment out the sysctl, but I think this
should go away.
of sysctl_variables.
I would also remove it from the VNET record but I am unsure if
there is any ABI issue -- so for the time being just mark it as
unused in ip_fw.h, and then we will collect the garbage at some
appropriate time in the future.
MFC after: 3 days
operation is known and to retry or fail accordingly to that
outcome. This fixes the problem with namespace traversing
programs failing with random ENOENT errors if someone just
happened to try to unmount that same filesystem at the same
time.
Reported by: dhw
Reviewed by: kib, attilio
Sponsored by: Juniper Networks, Inc.
This addresses interrupt storms that were noticed after enabling MSI
in drm. I think this is due to a loose interpretation of the PCI 2.3
spec, which states that a function using MSI is prohibitted from using
INTx. It appears that some vendors interpretted that to mean that they
should handle it in hardware, while others felt it was the drivers
responsibility.
This fix will also likely resolve interrupt storm related issues with
devices other than drm.
Reviewed by: jhb@
MFC after: 3 days
memory from int to size_t. Implement a workaround for current ABI not
allowing to properly save size for and report more then 2Gb sized segment
of shared memory.
This makes it possible to use > 2 Gb shared memory segments on 64bit
architectures. Please note the new BUGS section in shmctl(2) and
UPDATING note for limitations of this temporal solution.
Reviewed by: csjp
Tested by: Nikolay Dzham <i levsha org ua>
MFC after: 2 weeks
name i2c-address instead of reg. Change the OFW I2C probe to check both
locations for the address.
Submitted by: Marco Trillo
Reported by: Justin Hibbits
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge).
OpenBSM history for imported revision below for reference.
MFC after: 1 month
Sponsored by: Apple, Inc.
Obtained from: TrustedBSD Project
OpenBSM 1.1 beta 1
- The filesz parameter in audit_control(5) now accepts suffixes: 'B' for
Bytes, 'K' for Kilobytes, 'M' for Megabytes, and 'G' for Gigabytes.
For legacy support no suffix defaults to bytes.
- Audit trail log expiration support added. It is configured in
audit_control(5) with the expire-after parameter. If there is no
expire-after parameter in audit_control(5), the default, then the audit
trail files are not expired and removed. See audit_control(5) for
more information.
- Change defaults in audit_control: warn at 5% rather than 20% free for audit
partitions, rotate automatically at 2mb, and set the default policy to
cnt,argv rather than cnt so that execve(2) arguments are captured if
AUE_EXECVE events are audited. These may provide more usable defaults for
many users.
- Use au_domain_to_bsm(3) and au_socket_type_to_bsm(3) to convert
au_to_socket_ex(3) arguments to BSM format.
- Fix error encoding AUT_IPC_PERM tokens.
changes since the last imported OpenBSM release:
OpenBSM 1.1 beta 1
- The filesz parameter in audit_control(5) now accepts suffixes: 'B' for
Bytes, 'K' for Kilobytes, 'M' for Megabytes, and 'G' for Gigabytes.
For legacy support no suffix defaults to bytes.
- Audit trail log expiration support added. It is configured in
audit_control(5) with the expire-after parameter. If there is no
expire-after parameter in audit_control(5), the default, then the audit
trail files are not expired and removed. See audit_control(5) for
more information.
- Change defaults in audit_control: warn at 5% rather than 20% free for audit
partitions, rotate automatically at 2mb, and set the default policy to
cnt,argv rather than cnt so that execve(2) arguments are captured if
AUE_EXECVE events are audited. These may provide more usable defaults for
many users.
- Use au_domain_to_bsm(3) and au_socket_type_to_bsm(3) to convert
au_to_socket_ex(3) arguments to BSM format.
- Fix error encoding AUT_IPC_PERM tokens.
Obtained from: TrustedBSD Project
Sponsored by: Apple Inc.
ready status. Most of controllers managed to issue coommand and set BUSY
bit almost simultaneously, before we will read it, but at least JMicron JMB363
don't. Ignore timeout errors to keep old behavior when error there was
impossible.
For me this fixes timeout errors on the first command after channel attach
or reinit. Boot in my case is not affected, as there is much time passing
between reset and next command giving reset time to complete.
Unlike GCC, LLVM defines __STDC_VERSION__ to 199901L by default. This
means `restrict' keywords in files end up being given to lint, which
results in errors during compilation of usr.bin/xlint.
Other keywords are also expanded to nothing when using lint, so do the
same with restrict.
done in other places. Until we have no support for command queueing we have
no any benefit from FBS, while enabling it only here somehow leads to
"port not ready" errors on Intel 63XXESB2 controller.
Tested by: Larry Rosenman <ler AT lerctr.org>
pointers together, move padding to the bottom of the structure, and add
two new integer spares due to attrition over time. Remove unused spare
"flags" field, we can use one of the spare ints if we need it later.
This change requires a rebuild of device driver modules that depend on
the layout of ifnet for binary compatibility reasons.
Discussed with: kmacy
which are not in a module of their own like gif.
Single kernel compiles and universe will fail if the size of the struct
changes. Th expected values are given in sys/vimage.h.
See the comments where how to handle this.
Requested by: peter
architecture to implement size-guards on the vimage vnet_* structures.
As CTASSERT_EQUAL() needs special compile time options we back it
by CTASSERT() in the default case. Unfortunately CTASSERT() triggers
first, thus add an option to allow compilation with CTASSERT_EQUAL() only.
See the comments how to get new values if you trigger the assert
and what to do in that case.
Reviewed by: rwatson, zec (earlier versions)
It's better to just use internal language constructs, because it is
likely the compiler has a better opinion on whether to perform inlining,
which is very likely to happen to struct winsize.
Submitted by: Christoph Mallon <christoph mallon gmx de>
It takes a positive integer constant (the expected value) and
another positive integer, usually compile-time evaluated,
e.g. CTASSERT_EQUAL(FOO_EXPECTED_SIZE, sizeof (struct foo));
While the classic CTASSERT() gives:
error: size of array '__assert60' is negative
this gives you:
In function '__ctassert_equal_at_line_60':
warning: '__expected_42_but_got[464ul]' is used uninitialized in this function
and you can directly see the difference in the expected and the
real value.
CTASSERT_EQUAL() needs special compile time options to trigger
thus keep it locally to this header. If it proves to be of general
interest it can be moved to systm.h.
Submitted by: jmallett
Reviewed by: sam, warner, rwatson, jmallett (earlier versions)
* Add RB_FOREACH_FROM() which continues traversal *at*
the y-node provided. There is no pre-increment.
* Nuke RB_FOREACH_SAFE as it was buggy; it would omit the final node.
* Replace RB_FOREACH_SAFE() with a working implementation
derived from RB_FOREACH_FROM().
The key observation is that we now only check the loop-control
variable, but still cache the next member pointer.
* Add RB_FOREACH_REVERSE_FROM() which continues backwards
traversal *at* the y-node provided. There is no pre-increment.
Typically this is used to back out of allocations made
whilst walking an RB-tree.
* Add RB_FOREACH_REVERSE_SAFE() which performs insertion and
deletion safe backwards traversal.
and partially r188903. Revert breaks new drives detection on reinit to the
state as it was before me, but fixes series of new bugs reported by some
people.
Unconditional queueing of ata_completed() calls can lead to deadlock if
due to timeout ata_reinit() was called at the same thread by previous
ata_completed(). Calling of ata_identify() on ata_reinit() in current
implementation opens numerous races and deadlocks.
Problems I was touching here are still exist and should be addresed, but
probably in different way.
When copying big structures, LLVM generates calls to memmove(), because
it may not be able to figure out whether structures overlap. This caused
linker errors to occur. memmove() is now implemented using bcopy().
Ideally it would be the other way around, but that can be solved in the
future. On ARM we don't do add anything, because it already has
memmove().
Discussed on: arch@
Reviewed by: rdivacky
drivers' probe routines. It allows not to sleep and so not drop Giant inside
ata_identify() critical section and so avoid crash if it reentered on
request timeout. Reentering of probe call checked inside of it.
Give device own knowledge about it's type (ata/atapi/atapicam). It is not
a good idea to ask channel status for device type inside ata_getparam().
Add softc memory deallocation on device destruction.
On FreeBSD, this is the default behaviour. According to the spec, we may
give this flag a value of zero, but I'd rather not do this. If we define
it to a non-zero value, we can always change default behaviour without
changing the ABI. This is very unlikely to happen, though.
wcscasecmp(), and wcsncasecmp().
- Make some previously non-standard extensions visible
if POSIX_VISIBLE >= 200809.
- Use restrict qualifiers in stpcpy().
- Declare off_t and size_t in stdio.h.
- Bump __FreeBSD_version in case the new symbols (particularly
getline()) cause issues with ports.
Reviewed by: standards@
at irq install/uninstall time, but when we vt switch, we uninstall the
irq handler. When the irq handler is reinstalled, the modeset ioctl
happens first. The modeset ioctl is supposed to tell us that we can
disable vblank interrupts if there are no active consumers. This will
fail after a vt switch until another modeset ioctl is called via dpms
or xrandr. Leading to cases where either interrupts are on and can't
be disabled, or worse, no interrupts at all.
MFC after: 2 weeks
usb stack rather than with the rest of the processor support code.
Not sure that's a good idea, as we were moving away from it, but this
fixes the build in the mean time so we can have that discussion.
- Fix the copy, we can't do a blind copy but must transfer
the data from the old to the new.
- Fix the ACK processing so we properly stop retransmitting
the thing.
- Fix it so if we get a retran we will properly reply with
the saved response without doing anything.
MFC after: 1 month
the devfs clone handler to open the (invisible) devices on the fly.
The /dev entries are layed out as follows,
/dev/usbctl = master device
/dev/usb/0.1.0.5 = usb device, (<bus>.<dev>.<iface>.<endpoint>)
/dev/ugen0.1 -> usb/0.1.0.0 = ugen link to ctrl endpoint
This also removes the custom permissions model from USB. Bump
__FreeBSD_version to 800066.
Submitted by: rink (earlier version)
net/route.h.
Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.
We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.
This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.
printf() and vprintf() are exactly the same, except the way arguments
are passed. Just like we see in other pieces of code (i.e. libc's
printf()), implement printf() using vprintf().
Submitted by: Christoph Mallon <christoph mallon gmx de>
As mentioned by bz and bde, the change I made wasn't the proper way to
fix. Inspired by bde's patch, perform some small cleanups to uprintf().
Reviewed by: bz
Previously, DBCR0 flags were set "globally", but this leads to problems
because Book-E fine grained debug settings work only in conjuction with the
debug master enable bit in MSR: in scenarios when the DBCR0 was set with
intention to debug one process, but another one with MSR[DE] set got
scheduled, the latter would immediately cause debug exceptions to occur upon
execution of its own code instructions (and not the one intended for
debugging).
To avoid such problems and properly handle debugging context, DBCR0 state
should be managed individually per process.
Submitted by: Grzegorz Bernacki gjb ! semihalf dot com
Reviewed by: marcel
The function pow() in libmp(3) clashes with pow(3) in libm. We could
rename this single function, but we can just take the same approach as
the Solaris folks did, which is to prefix all function names with mp_.
libmp(3) isn't really popular nowadays. I suspect not a single
application in ports depends on it. There's still a chance, so I've
increased the SHLIB_MAJOR and __FreeBSD_version.
Reviewed by: deischen, rdivacky
kernel dumping case.
ata_completed() may initiate ata_reinit() on error, that may lead to drives
attach or detach. Attach and detach are sending requests to drives and sleep
waiting for results. But ata_finish() can be called directly from
interrupt handler where sleeping is prohibited, so we must break this chain
somewhere. This place seems to fit best.
go before the standard -I$S to have the search happen before /sys/.
Make this conditional on 'makeoptions WITH_LEGACY=1' in the kernel config.
Prodded by: sam
Disable MSI for nVidia MCP51 controller. Enabling MSI there leads to
unexpected errors and timeouts, that should not happen even if interrupts
are not working completely.
Currently bread()-ing through device vnode with
(1) VMIO enabled,
(2) bo_bsize != DEV_BSIZE
(3) more than 1 block
results in data being incorrectly cached.
So instead a more common approach of using a vnode belonging to fs is now
employed.
Also, prevent attempt to bread more than MAXBSIZE bytes because of
adjustments made to account for offset that doesn't start on block
boundary.
Add expanded comments to explain the calculations.
Also drop unused inline function while here.
PR: kern/120967
PR: kern/129084
Reviewed by: scottl, kib
Approved by: jhb (mentor)
Inside do_execve(), we have a pointer `ndp', which always points to
`&nd'. I can imagine a primitive (non-optimizing) compiler to really
reserve space for such a pointer, so just remove the variable and use
`&nd' directly.
kern_time.c:
- Unused variable `p'.
kern_thr.c:
- Variable `error' is always caught immediately, so no reason to
initialize it. There is no way that error != 0 at the end of
create_thread().
kern_sig.c:
- Unused variable `code'.
kern_synch.c:
- `rval' is always assigned in all different cases.
kern_rwlock.c:
- `v' is always overwritten with RW_UNLOCKED further on.
kern_malloc.c:
- `size' is always initialized with the proper value before being used.
kern_exit.c:
- `error' is always caught and returned immediately. abort2() never
returns a non-zero value.
kern_exec.c:
- `len' is always assigned inside the if-statement right below it.
tty_info.c:
- `td' is always overwritten by FOREACH_THREAD_IN_PROC().
Found by: LLVM's scan-build
... as opposed to files with data in extents.
Some UDF authoring tools produce this type of file for sufficiently small
data files.
Approved by: jhb (mentor)
Use bit-shift instead of division/multiplication.
Act on error as soon as it is detected.
Report attempt to read data embedded in file entry via regular way.
While there, fix lblktosize macro and make use of it.
No functionality should change as a result.
Approved by: jhb (mentor)
`p' is already initialized with `td->td_proc'. Because td is always
curthread, it is safe to initialize it without any locks.
Found by: LLVM's scan-build
priv:kernel:priv_check:priv_ok fires for granted privileges
priv:kernel:priv_check:priv_errr fires for denied privileges
The first argument is the requested privilege number. The naming
convention is a little different from the OpenSolaris equivilent
because we can't have '-' in probefunc names, and our privilege
namespace is different.
MFC after: 1 week
fault. In r188331 this update was relocated because of synchronization
changes to a place where it would occur on both hard and soft faults. This
change again restricts the update to hard faults.
required to make 3CR990 familiy controllers run on NV flash
firmware version 03.001.008.
The latest firmware added HMAC digest information so teach txp(4)
to pass them to sleep image before downloading is started.
While I'm here restore previous IMR/IER register if firmware
downloading have failed.
PR: kern/89876, kern/132047
The old BTX passed the general purpose registers from the 32-bit client to
the routines called via virtual 86 mode. The new BTX did the same thing.
However, it turns out that some instructions behave differently in virtual 86
mode and real mode (even though this is under-documented). For example, the
LEAVE instruction will cause an exception in real mode if any of the upper
16-bits of %ebp are non-zero after it executes. In virtual 8086 mode the
upper 16-bits are simply ignored. This could cause faults in hardware
interrupt handlers that inherited an %ebp larger than 0xffff from the 32-bit
client (loader, boot2, etc.) while running in real mode.
To fix, when executing hardware interrupt handlers provide an explicit clean
state where all the general purpose and segment registers are zero upon
entry to the interrupt handler. While here, I attempted to simplify the
control flow in the 'intusr' code that sets up the various stack frames
and exits protected mode to invoke the requested routine via real mode.
A huge thanks to Tor Egge (tegge@) for debugging this issue.
Submitted by: tegge
Reviewed by: tegge
Tested by: bz
MFC after: 1 week
function, done in r188334. Instead, collect the entries that shall be
freed, in the deferred_freelist member of the map. Automatically purge
the deferred freelist when map is unlocked.
Tested by: pho
Reviewed by: alc
an NFS file. Now the priming is conditional on a new
vfs.nfs.prime_access_cache sysctl. For now I've left the default setting
to disabling the priming.
Requested by: scottl
checks for the tcpcb, previously used to detect complete disconnection,
with INP_DROPPED checks. Correct that, preventing shutdown() from
improperly generating a TCP segment with destination IP and port of
0.0.0.0:0.
PR: kern/132050
Reported by: david gueluy <david.gueluy at netasq.com>
MFC after: 3 weeks
- We don't need to exit the Giant mutex when sleeping. This is done
automatically. Replace Giant by NULL mutex for all control requests in the
enumeration path.
- Optimise away duplicate alternate interface selection requests in USB Host
mode.
Submitted by: Hans Petter Selasky
Improvements to "usb2_transfer_setup()" and "usb2_transfer_unsetup()". Set
"ppxfer[n]" when the transfer setup is complete to prevent races. Remove
redundant NULL-checks from "usb2_transfer_unsetup()".
Submitted by: Hans Petter Selasky
- The software computed HID size is not always correct, because the algoritm
does not handle unsorted HID descriptors.
- Change the way we obtain the report ID.
- Use the X/Y/Z+button locations instead for report ID source for ums.
- Add more range checks.
- Remove Microsoft Mouse quirks. If the positions are moduloed the report
length multiplied by 8, the values seem correct.
- Some minor style changes.
Submitted by: Hans Petter Selasky
o add ah_configPCIE and ah_disablePCIE for drivers to configure PCIE
power save operation (modeled after ath9k, may need changes)
o add private state flag to indicate if device is PCIE (replaces private
hack in 5212 code)
o add serdes programming ini bits for 5416 and later parts and setup
for each part (5416 and 9160 logic hand-crafted from existing routines);
5212 remains open-coded but is now hooked in via ah_configPCIE
o add PCIE workaround gunk
o add ar5416AttachPCIE for iodomatic code used by 5416 and later parts