* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (30 commits)
MAINTAINERS: Add tomoyo-dev-en ML.
SELinux: define permissions for DCB netlink messages
encrypted-keys: style and other cleanup
encrypted-keys: verify datablob size before converting to binary
trusted-keys: kzalloc and other cleanup
trusted-keys: additional TSS return code and other error handling
syslog: check cap_syslog when dmesg_restrict
Smack: Transmute labels on specified directories
selinux: cache sidtab_context_to_sid results
SELinux: do not compute transition labels on mountpoint labeled filesystems
This patch adds a new security attribute to Smack called SMACK64EXEC. It defines label that is used while task is running.
SELinux: merge policydb_index_classes and policydb_index_others
selinux: convert part of the sym_val_to_name array to use flex_array
selinux: convert type_val_to_struct to flex_array
flex_array: fix flex_array_put_ptr macro to be valid C
SELinux: do not set automatic i_ino in selinuxfs
selinux: rework security_netlbl_secattr_to_sid
SELinux: standardize return code handling in selinuxfs.c
SELinux: standardize return code handling in selinuxfs.c
SELinux: standardize return code handling in policydb.c
...
Using "iptables -L" with a lot of rules have a too big BH latency.
Jesper mentioned ~6 ms and worried of frame drops.
Switch to a per_cpu seqlock scheme, so that taking a snapshot of
counters doesnt need to block BH (for this cpu, but also other cpus).
This adds two increments on seqlock sequence per ipt_do_table() call,
its a reasonable cost for allowing "iptables -L" not block BH
processing.
Reported-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When I enable EXT2_XATTR_DEBUG in fs/ext2/xattr.c I get a build error stating
the following:
CC fs/ext2/xattr.o
fs/ext2/xattr.c: In function 'ext2_xattr_cache_insert':
fs/ext2/xattr.c:841: error: dereferencing pointer to incomplete type
fs/ext2/xattr.c:846: error: dereferencing pointer to incomplete type
make[2]: *** [fs/ext2/xattr.o] Error 1
make[1]: *** [fs/ext2] Error 2
make: *** [fs] Error 2
These lines reference ext2_xattr_cache->c_entry_count which is defined
in struct mb_cache. struct mb_cache is currently only defined in fs/mbcache.c.
Moving struct mb_cache definition to include/linux/mbcache.h to resolve the
issue.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Jan Kara <jack@suse.cz>
The addition of 64k block capability in the rec_len_from_disk
and rec_len_to_disk functions added a bit of math overhead which
slows down file create workloads needlessly when the architecture
cannot even support 64k blocks, thanks to page size limits.
Similar changes already exist in the ext4 codebase.
The directory entry checking can also be optimized a bit
by sprinkling in some unlikely() conditions to move the
error handling out of line.
bonnie++ sequential file creates on a 512MB ramdisk speeds up
from about 77,000/s to about 82,000/s, about a 6% improvement.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Use %pV in __quota_error so a single printk can not be
interleaved with other logging messages.
Add __attribute__((format (printf, 3, 4))) so format
and arguments can be verified by compiler.
Make sure printk formats and arguments match.
Block # needed a pointer dereference.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Walk through allocation groups and trim all free extents. It can be
invoked through FITRIM ioctl on the file system. The main idea is to
provide a way to trim the whole file system if needed, since some SSD's
may suffer from performance loss after the whole device was filled (it
does not mean that fs is full!).
It search for free extents in allocation groups specified by Byte range
start -> start+len. When the free extent is within this range, blocks are
marked as used and then trimmed. Afterwards these blocks are marked as
free in per-group bitmap.
[JK: Fixed up error handling and trimming of a single group]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Replace the jbd2_inode structure (which is 48 bytes) with a pointer
and only allocate the jbd2_inode when it is needed --- that is, when
the file system has a journal present and the inode has been opened
for writing. This allows us to further slim down the ext4_inode_info
structure.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add new mappings for assist, VAIO, zoom and eject buttons present on
refurbished P, Z and EC models.
Reported-by: Gyorgy Jeney <nog.lkml@gmail.com>
Reported-by: Stephan Mueller <smueller@chronox.de>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging: (44 commits)
hwmon: Support for Dallas Semiconductor DS620
hwmon: driver for Sensirion SHT21 humidity and temperature sensor
hwmon: Add humidity attribute to sysfs ABI
hwmon: sysfs ABI updates
hwmon: (via-cputemp) sync hotplug handling with coretemp/pkgtemp
hwmon: (lm95241) Rewrite to avoid using macros
hwmon: (applesmc) Fix checkpatch errors and fix value range checks
hwmon: (applesmc) Update copyright information
hwmon: (applesmc) Silence driver
hwmon: (applesmc) Simplify feature sysfs handling
hwmon: (applesmc) Dynamic creation of fan files
hwmon: (applesmc) Extract all features generically
hwmon: (applesmc) Handle new temperature format
hwmon: (applesmc) Dynamic creation of temperature files
hwmon: (applesmc) Introduce a register lookup table
hwmon: (applesmc) Use pr_fmt and pr_<level>
hwmon: (applesmc) Relax the severity of device init failure
hwmon: (applesmc) Add MacBookAir3,1(3,2) support
hwmon: (w83627hf) Use pr_fmt and pr_<level>
hwmon: (w83627ehf) Use pr_fmt and pr_<level>
...
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (29 commits)
of/flattree: forward declare struct device_node in of_fdt.h
ipmi: explicitly include of_address.h and of_irq.h
sparc: explicitly cast negative phandle checks to s32
powerpc/405: Fix missing #{address,size}-cells in i2c node
powerpc/5200: dts: refactor dts files
powerpc/5200: dts: Change combatible strings on localbus
powerpc/5200: dts: remove unused properties
powerpc/5200: dts: rename nodes to prepare for refactoring dts files
of/flattree: Update dtc to current mainline.
of/device: Don't register disabled devices
powerpc/dts: fix syntax bugs in bluestone.dts
of: Fixes for OF probing on little endian systems
of: make drivers depend on CONFIG_OF instead of CONFIG_PPC_OF
of/flattree: Add of_flat_dt_match() helper function
of_serial: explicitly include of_irq.h
of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree
of/flattree: Reorder unflatten_dt_node
of/flattree: Refactor unflatten_dt_node
of/flattree: Add non-boottime device tree functions
of/flattree: Add Kconfig for EARLY_FLATTREE
...
Fix up trivial conflict in arch/sparc/prom/tree_32.c as per Grant.
Remove kobject.h from files which don't need it, notably,
sched.h and fs.h.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove path.h from sched.h and other files.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: Fix a crash during slabinfo -v
tracing/slab: Move kmalloc tracepoint out of inline code
slub: Fix slub_lock down/up imbalance
slub: Fix build breakage in Documentation/vm
slub tracing: move trace calls out of always inlined functions to reduce kernel code size
slub: move slabinfo.c to tools/slub/slabinfo.c
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
mkuboot.sh: Fail if mkimage is missing
gen_init_cpio: checkpatch fixes
gen_init_cpio: Avoid race between call to stat() and call to open()
modpost: Fix address calculation in reloc_location()
Make fixdep error handling more explicit
checksyscalls: Fix stand-alone usage
modpost: Put .zdebug* section on white list
kbuild: fix interaction of CONFIG_IKCONFIG and KCONFIG_CONFIG
kbuild: export linux/{a.out,kvm,kvm_para}.h on headers_install_all
kbuild: introduce HDR_ARCH_LIST for headers_install_all
headers_install: check exit status of unifdef
gen_init_cpio: remove leading `/' from file names
scripts/genksyms: fix header usage
fixdep: use hash table instead of a single array
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (34 commits)
HID: roccat: Update sysfs attribute doc
HID: roccat: don't use #pragma pack
HID: roccat: Add support for Roccat Kone[+] v2
HID: roccat: reduce number of functions in kone and pyra drivers
HID: roccat: declare meaning of pack pragma usage in driver headers
HID: roccat: use class for char device for sysfs attribute creation
sysfs: Introducing binary attributes for struct class
HID: hidraw: add compatibility ioctl() for 32-bit applications.
HID: hid-picolcd: Fix memory leak in picolcd_debug_out_report()
HID: picolcd: fix misuse of logical operation in place of bitop
HID: usbhid: base runtime PM on modern API
HID: replace offsets values with their corresponding BTN_* defines
HID: hid-mosart: support suspend/resume
HID: hid-mosart: ignore buttons report
HID: hid-picolcd: don't use flush_scheduled_work()
HID: simplify an index check in hid_lookup_collection
HID: Hoist assigns from ifs
HID: Remove superfluous __inline__
HID: Use vzalloc for vmalloc/memset(,0...)
HID: Add and use hid_<level>: dev_<level> equivalents
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
spi / PM: Support dev_pm_ops
PM: Prototype the pm_generic_ operations
PM / Runtime: Generic resume shouldn't set RPM_ACTIVE unconditionally
PM: Use dev_name() in core device suspend and resume routines
PM: Permit registration of parentless devices during system suspend
PM: Replace the device power.status field with a bit field
PM: Remove redundant checks from core device resume routines
PM: Use a different list of devices for each stage of device suspend
PM: Avoid compiler warning in pm_noirq_op()
PM: Use pm_wakeup_pending() in __device_suspend()
PM / Wakeup: Replace pm_check_wakeup_events() with pm_wakeup_pending()
PM: Prevent dpm_prepare() from returning errors unnecessarily
PM: Fix references to basic-pm-debugging.txt in drivers-testing.txt
PM / Runtime: Add synchronous runtime interface for interrupt handlers (v3)
PM / Hibernate: When failed, in_suspend should be reset
PM / Hibernate: hibernation_ops->leave should be checked too
Freezer: Fix a race during freezing of TASK_STOPPED tasks
PM: Use proper ccflag flag in kernel/power/Makefile
PM / Runtime: Fix comments to match runtime callback code
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: use split transaction timeout only for split transactions
firewire: ohci: consolidate context status flags
firewire: ohci: cache the context run bit
firewire: ohci: flush AT contexts after bus reset - addendum
firewire: ohci: flush AT contexts after bus reset for OHCI 1.2
firewire: net: set carrier state at ifup
firewire: net: add carrier detection
firewire: net: ratelimit error messages
firewire: ohci: restart iso DMA contexts on resume from low power mode
firewire: ohci: restore GUID on resume.
firewire: ohci: use common buffer for self IDs and AR descriptors
firewire: ohci: optimize iso context checks in the interrupt handler
firewire: make PHY packet header format consistent
firewire: ohci: properly clear posted write errors
firewire: ohci: flush MMIO writes in the interrupt handler
firewire: ohci: fix AT context initialization error handling
firewire: ohci: Asynchronous Reception rewrite
firewire: core: Update WARN uses
firewire: nosy: char device is not seekable
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: fix ioctl ABI
fuse: allow batching of FORGET requests
fuse: separate queue for FORGET requests
fuse: ioctl cleanup
Fix up trivial conflict in fs/fuse/inode.c due to RCU lookup having done
the RCU-freeing of the inode in fuse_destroy_inode().
Fix kernel-doc notation warnings in pipe_fs_i.h:
Warning(include/linux/pipe_fs_i.h:58): No description found for parameter 'buffers'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix new kernel-doc notation warning in hrtimer.h:
Warning(include/linux/hrtimer.h:150): Excess struct/union/enum/typedef member 'first' description in 'hrtimer_clock_base'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix new kernel-doc notation warning in dcache.h:
Warning(include/linux/dcache.h:316): Excess function parameter 'Returns' description in '__d_rcu_to_refcount'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that there is a single function that can compute the device
features relevant to a packet, we don't want to run it for each
offload. This converts netif_needs_gso() to take the features
of the device, rather than computing them itself.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
netif_get_vlan_features() is currently only used by netif_needs_gso(),
so it only concerns itself with GSO features. However, several other
places also should take into account the contents of the packet when
deciding whether to offload to hardware. This generalizes the function
to return features about all of the various forms of offloading. Since
offloads tend to be linked together, this avoids duplicating the logic
in each location (i.e. the scatter/gather code also needs the checksum
logic).
Suggested-by: Michał Mirosław <mirqus@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add mac field into fec_platform_data and consolidate function
fec_get_mac to get mac address in following order.
1) module parameter via kernel command line fec.macaddr=0x00,0x04,...
2) from flash in case of CONFIG_M5272 or fec_platform_data mac
field for others, which typically have mac stored in fuse
3) fec mac address registers set by bootloader
Signed-off-by: Shawn Guo <shawn.guo@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
security/smack/smack_lsm.c
Verified and added fix by Stephen Rothwell <sfr@canb.auug.org.au>
Ok'd by Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
Driver for Dallas Semiconductor DS620 temperature sensor and thermostat
Signed-off-by: Roland Stigge <stigge@antcom.de>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
This patch implements SDIO IRQ support for mfds which
announce the TMIO_MMC_SDIO_IRQ flag for tmio_mmc.
If MMC_CAP_SDIO_IRQ is also set SDIO IRQ signalling is activated.
Tested with a b43-based wireless SDIO card and sh_mobile_sdhi.
Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
For example, with SDIO WLAN cards, some transfers happen with buffers at
odd addresses, whereas the SH-Mobile DMA engine requires even addresses
for SDHI. This patch extends the tmio driver with a bounce buffer, that
is used for single entry scatter-gather lists both for sending and
receiving. If we ever encounter unaligned transfers with multi-element
sg lists, this patch will have to be extended. For now it just falls
back to PIO in this and other unsupported cases.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This adds the mmc host driver for the Synopsys DesignWare mmc
host controller, found in a number of embedded SoC designs.
Signed-off-by: Will Newton <will.newton@imgtec.com>
Reviewed-by: Matt Fleming <matt@console-pimps.org>
Reviewed-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Some controllers misparse segment length 0 as being 0, not 65536. Add
a quirk to deal with it.
Signed-off-by: Olof Johansson <olof@lixom.net>
Reviewed-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Some old MMC devices fail with the 4/8 bits the driver tries to use
exclusively. This patch adds a test for the given bus setup and falls
back to the lower bit mode (until 1-bit mode) when the test fails.
[Major rework and refactoring by tiwai]
[Quirk addition and many fixes by prakity]
Signed-off-by: Aries Lee <arieslee@jmicron.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Philip Rakity <prakity@marvell.com>
Tested-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch forward declares struct device_node to fix a compile error
when of_fdt.h is included, but of.h is not. Alternately, including
linux/of.h could have been added to of_fdt.h, but that pulls in a lot
of unnecessary declarations when only working with the flattened
form.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Upon system resume, SDIO core must reinitialize cards that were
powered off during suspend.
If the card had its power kept during suspend (and thus it is
'powered-resumed'), SDIO core performs only a limited reinitializing,
mainly needed to make sure that the card wasn't removed/replaced.
If a __nonremovable__ card is powered-resumed, we can safely skip the
reinitializing phase.
Note: 9b966aa (mmc: sdio: fully reconfigure oldcard on resume) removed
the bus width reconfiguration since mmc_sdio_init_card already does it.
It is brought back now in case mmc_sdio_init_card is skipped.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
JMicron 388 SD/MMC combo controller supports the 1.8V low-voltage for
SD, but MMC doesn't work with the low-voltage, resulting in an error
at probing.
This patch adds the support for multiple voltage mask per device type,
so that SD works with 1.8V while MMC forces 3.3V. Here new ocr_avail_*
fields for each device are introduced, so that the actual OCR mask is
switched dynamically.
Also, the restriction of low-voltage in core/sd.c is removed when the
bit is allowed explicitly via ocr_avail_sd mask.
This patch was rewritten from scratch based on Aries' original code.
Signed-off-by: Aries Lee <arieslee@jmicron.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch modifies the MMC core code to optionally call the set_ios()
operation on the driver with the clock frequency set to 0 (gate) after
a grace period of at least 8 MCLK cycles, then restore it (ungate)
before any new request. This gives the driver the option to shut down
the MCI clock to the MMC/SD card when the clock frequency is 0, i.e.
the core has stated that the MCI clock does not need to be generated.
It is inspired by existing clock gating code found in the OMAP and
Atmel drivers and brings this up to the host abstraction. Gating is
performed before and after any MMC request.
This patchset implements this for the MMCI/PL180 MMC/SD host controller,
but it should be simple to switch OMAP/Atmel over to using this instead.
mmc_set_{gated,ungated}() add variable protection to the state holders
for the clock gating code. This is particularly important when ordinary
.set_ios() calls would race with the .set_ios() call resulting from a
delayed gate operation.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Reviewed-by: Chris Ball <cjb@laptop.org>
Tested-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch disables the broken ADMA on selected O2Micro devices.
Signed-off-by: Jennifer Li <Jennifer.li@o2micro.com>
Reviewed-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
On older gcc (3.3) dynamic debug fails to compile:
include/net/inet_connection_sock.h: In function `inet_csk_reset_xmit_timer':
include/net/inet_connection_sock.h:236: error: duplicate label declaration `do_printk'
include/net/inet_connection_sock.h:219: error: this is a previous declaration
include/net/inet_connection_sock.h:236: error: duplicate label declaration `out'
include/net/inet_connection_sock.h:219: error: this is a previous declaration
include/net/inet_connection_sock.h:236: error: duplicate label `do_printk'
include/net/inet_connection_sock.h:236: error: duplicate label `out'
Fix, by reverting the usage of JUMP_LABEL() in dynamic debug for now.
Cc: <stable@kernel.org>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (77 commits)
spi/omap: Fix DMA API usage in OMAP MCSPI driver
spi/imx: correct the test on platform_get_irq() return value
spi/topcliff: Typo fix threhold to threshold
spi/dw_spi Typo change diable to disable.
spi/fsl_espi: change the read behaviour of the SPIRF
spi/mpc52xx-psc-spi: move probe/remove to proper sections
spi/dw_spi: add DMA support
spi/dw_spi: change to EXPORT_SYMBOL_GPL for exported APIs
spi/dw_spi: Fix too short timeout in spi polling loop
spi/pl022: convert running variable
spi/pl022: convert busy flag to a bool
spi/pl022: pass the returned sglen to the DMA engine
spi/pl022: map the buffers on the DMA engine
spi/topcliff_pch: Fix data transfer issue
spi/imx: remove autodetection
spi/pxa2xx: pass of_node to spi device and set a parent device
spi/pxa2xx: Modify RX-Tresh instead of busy-loop for the remaining RX bytes.
spi/pxa2xx: Add chipselect support for Sodaville
spi/pxa2xx: Consider CE4100's FIFO depth
spi/pxa2xx: Add CE4100 support
...
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits)
gameport: use this_cpu_read instead of lookup
x86: udelay: Use this_cpu_read to avoid address calculation
x86: Use this_cpu_inc_return for nmi counter
x86: Replace uses of current_cpu_data with this_cpu ops
x86: Use this_cpu_ops to optimize code
vmstat: User per cpu atomics to avoid interrupt disable / enable
irq_work: Use per cpu atomics instead of regular atomics
cpuops: Use cmpxchg for xchg to avoid lock semantics
x86: this_cpu_cmpxchg and this_cpu_xchg operations
percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
percpu,x86: relocate this_cpu_add_return() and friends
connector: Use this_cpu operations
xen: Use this_cpu_inc_return
taskstats: Use this_cpu_ops
random: Use this_cpu_inc_return
fs: Use this_cpu_inc_return in buffer.c
highmem: Use this_cpu_xx_return() operations
vmstat: Use this_cpu_inc_return for vm statistics
x86: Support for this_cpu_add, sub, dec, inc_return
percpu: Generic support for this_cpu_add, sub, dec, inc_return
...
Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c}
as per Tejun.
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits)
usb: don't use flush_scheduled_work()
speedtch: don't abuse struct delayed_work
media/video: don't use flush_scheduled_work()
media/video: explicitly flush request_module work
ioc4: use static work_struct for ioc4_load_modules()
init: don't call flush_scheduled_work() from do_initcalls()
s390: don't use flush_scheduled_work()
rtc: don't use flush_scheduled_work()
mmc: update workqueue usages
mfd: update workqueue usages
dvb: don't use flush_scheduled_work()
leds-wm8350: don't use flush_scheduled_work()
mISDN: don't use flush_scheduled_work()
macintosh/ams: don't use flush_scheduled_work()
vmwgfx: don't use flush_scheduled_work()
tpm: don't use flush_scheduled_work()
sonypi: don't use flush_scheduled_work()
hvsi: don't use flush_scheduled_work()
xen: don't use flush_scheduled_work()
gdrom: don't use flush_scheduled_work()
...
Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c
as per Tejun.
Added dev_bin_attrs to struct class similar to existing dev_attrs.
Signed-off-by: Stefan Achatz <erazor_de@users.sourceforge.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Constify function scope static struct sched_param usage
sched: Fix strncmp operation
sched: Move sched_autogroup_exit() to free_signal_struct()
sched: Fix struct autogroup memory leak
sched: Mark autogroup_init() __init
sched: Consolidate the name of root_task_group and init_task_group
* 'rmobile-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (67 commits)
ARM: mach-shmobile: update for SMP changes.
ARM: mach-shmobile: update for GIC changes.
ARM: mach-shmobile: Fix up clkdev fallout for SH73A0.
dma: shdma: don't register the global die notifier multiple times
ARM: mach-shmobile: Rely on run-time IRQ handlers
ARM: mach-shmobile: Run-time IRQ handler for GIC
ARM: mach-shmobile: Run-time IRQ handler for INTCA
ARM: mach-shmobile: Enable CONFIG_MULTI_IRQ_HANDLER
ARM: mach-shmobile: Use shared GIC entry macros
ARM: mach-shmobile: mackerel: Add zboot support
ARM: mach-shmobile: mackerel: Add HDMI sound support
ARM: mach-shmobile: mackerel: add HDMI video support
ARM: mach-shmobile: ap4evb: fixup clk_put timing of fsib_clk
ARM: mach-shmobile: sh73a0: fix div4 table
ARM: mach-shmobile: ap4/mackerel: modify wrong comment out of USB
ARM: mach-shmobile: Mackerel VGA camera support
mmc: sh_mmcif: make DMA support by the driver unconditional
ARM: mach-shmobile: Add eMMC support through MMCIF on AG5EVM
ARM: mach-shmobile: Use pullups for AG5EVM KEYSC pins
ARM: mach-shmobile: sh73a0 GPIO pullup improvement
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (58 commits)
Input: wacom_w8001 - support pen or touch only devices
Input: wacom_w8001 - use __set_bit to set keybits
Input: bu21013_ts - fix misuse of logical operation in place of bitop
Input: i8042 - add Acer Aspire 5100 to the Dritek list
Input: wacom - add support for digitizer in Lenovo W700
Input: psmouse - disable the synaptics extension on OLPC machines
Input: psmouse - fix up Synaptics comment
Input: synaptics - ignore bogus mt packet
Input: synaptics - add multi-finger and semi-mt support
Input: synaptics - report clickpad property
input: mt: Document interface updates
Input: fix double equality sign in uevent
Input: introduce device properties
hid: egalax: Add support for Wetab (726b)
Input: include MT library as source for kerneldoc
MAINTAINERS: Update input-mt entry
hid: egalax: Add support for Samsung NB30 netbook
hid: egalax: Document the new devices in Kconfig
hid: egalax: Add support for Wetab
hid: egalax: Convert to MT slots
...
Fixed up trivial conflict in drivers/input/keyboard/Kconfig
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (36 commits)
serial: apbuart: Fixup apbuart_console_init()
TTY: Add tty ioctl to figure device node of the system console.
tty: add 'active' sysfs attribute to tty0 and console device
drivers: serial: apbuart: Handle OF failures gracefully
Serial: Avoid unbalanced IRQ wake disable during resume
tty: fix typos/errors in tty_driver.h comments
pch_uart : fix warnings for 64bit compile
8250: fix uninitialized FIFOs
ip2: fix compiler warning on ip2main_pci_tbl
specialix: fix compiler warning on specialix_pci_tbl
rocket: fix compiler warning on rocket_pci_ids
8250: add a UPIO_DWAPB32 for 32 bit accesses
8250: use container_of() instead of casting
serial: omap-serial: Add support for kernel debugger
serial: fix pch_uart kconfig & build
drivers: char: hvc: add arm JTAG DCC console support
RS485 documentation: add 16C950 UART description
serial: ifx6x60: fix memory leak
serial: ifx6x60: free IRQ on error
Serial: EG20T: add PCH_UART driver
...
Fixed up conflicts in drivers/serial/apbuart.c with evil merge that
makes the code look fairly sane (unlike either side).
* 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (144 commits)
USB: add support for Dream Cheeky DL100B Webmail Notifier (1d34:0004)
USB: serial: ftdi_sio: add support for TIOCSERGETLSR
USB: ehci-mxc: Setup portsc register prior to accessing OTG viewport
USB: atmel_usba_udc: fix freeing irq in usba_udc_remove()
usb: ehci-omap: fix tll channel enable mask
usb: ohci-omap3: fix trivial typo
USB: gadget: ci13xxx: don't assume that PAGE_SIZE is 4096
USB: gadget: ci13xxx: fix complete() callback for no_interrupt rq's
USB: gadget: update ci13xxx to work with g_ether
USB: gadgets: ci13xxx: fix probing of compiled-in gadget drivers
Revert "USB: musb: pm: don't rely fully on clock support"
Revert "USB: musb: blackfin: pm: make it work"
USB: uas: Use GFP_NOIO instead of GFP_KERNEL in I/O submission path
USB: uas: Ensure we only bind to a UAS interface
USB: uas: Rename sense pipe and sense urb to status pipe and status urb
USB: uas: Use kzalloc instead of kmalloc
USB: uas: Fix up the Sense IU
usb: musb: core: kill unneeded #include's
DA8xx: assign name to MUSB IRQ resource
usb: gadget: g_ncm added
...
Manually fix up trivial conflicts in USB Kconfig changes in:
arch/arm/mach-omap2/Kconfig
arch/sh/Kconfig
drivers/usb/Kconfig
drivers/usb/host/ehci-hcd.c
and annoying chip clock data conflicts in:
arch/arm/mach-omap2/clock3xxx_data.c
arch/arm/mach-omap2/clock44xx_data.c
root_task_group is the leftover of USER_SCHED, now it's always
same to init_task_group.
But as Mike suggested, root_task_group is maybe the suitable name
to keep for a tree.
So in this patch:
init_task_group --> root_task_group
init_task_group_load --> root_task_group_load
INIT_TASK_GROUP_LOAD --> ROOT_TASK_GROUP_LOAD
Suggested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110107071736.GA32635@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the PCI northbridge device id for AMD CPU
families 12h and 14h. Both families have implemented the same
PCI northbridge device.
There are some future use cases that use this PCI device and
we would like to clarify its naming.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: xen-devel@lists.xensource.com <xen-devel@lists.xensource.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <JBeulich@novell.com>
LKML-Reference: <20110106165107.GL4739@erda.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The problem that this patch aims to fix is vfsmount refcounting scalability.
We need to take a reference on the vfsmount for every successful path lookup,
which often go to the same mount point.
The fundamental difficulty is that a "simple" reference count can never be made
scalable, because any time a reference is dropped, we must check whether that
was the last reference. To do that requires communication with all other CPUs
that may have taken a reference count.
We can make refcounts more scalable in a couple of ways, involving keeping
distributed counters, and checking for the global-zero condition less
frequently.
- check the global sum once every interval (this will delay zero detection
for some interval, so it's probably a showstopper for vfsmounts).
- keep a local count and only taking the global sum when local reaches 0 (this
is difficult for vfsmounts, because we can't hold preempt off for the life of
a reference, so a counter would need to be per-thread or tied strongly to a
particular CPU which requires more locking).
- keep a local difference of increments and decrements, which allows us to sum
the total difference and hence find the refcount when summing all CPUs. Then,
keep a single integer "long" refcount for slow and long lasting references,
and only take the global sum of local counters when the long refcount is 0.
This last scheme is what I implemented here. Attached mounts and process root
and working directory references are "long" references, and everything else is
a short reference.
This allows scalable vfsmount references during path walking over mounted
subtrees and unattached (lazy umounted) mounts with processes still running
in them.
This results in one fewer atomic op in the fastpath: mntget is now just a
per-CPU inc, rather than an atomic inc; and mntput just requires a spinlock
and non-atomic decrement in the common case. However code is otherwise bigger
and heavier, so single threaded performance is basically a wash.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
The standard memcmp function on a Westmere system shows up hot in
profiles in the `git diff` workload (both parallel and single threaded),
and it is likely due to the costs associated with trapping into
microcode, and little opportunity to improve memory access (dentry
name is not likely to take up more than a cacheline).
So replace it with an open-coded byte comparison. This increases code
size by 8 bytes in the critical __d_lookup_rcu function, but the
speedup is huge, averaging 10 runs of each:
git diff st user sys elapsed CPU
before 1.15 2.57 3.82 97.1
after 1.14 2.35 3.61 96.8
git diff mt user sys elapsed CPU
before 1.27 3.85 1.46 349
after 1.26 3.54 1.43 333
Elapsed time for single threaded git diff at 95.0% confidence:
-0.21 +/- 0.01
-5.45% +/- 0.24%
It's -0.66% +/- 0.06% elapsed time on my Opteron, so rep cmp costs on the
fam10h seem to be relatively smaller, but there is still a win.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Regardless of how much we possibly try to scale dcache, there is likely
always going to be some fundamental contention when adding or removing children
under the same parent. Pseudo filesystems do not seem need to have connected
dentries because by definition they are disconnected.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
dcache_inode_lock can be replaced with per-inode locking. Use existing
inode->i_lock for this. This is slightly non-trivial because we sometimes
need to find the inode from the dentry, which requires d_inode to be
stabilised (either with refcount or d_lock).
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Introduce a type of hlist that can support the use of the lowest bit in the
hlist_head. This will be subsequently used to implement per-bucket bit spinlock
for inode and dentry hashes, and may be useful in other cases such as network
hashes.
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
This simple implementation just checks for no ACLs on the inode, and
if so, then the rcu-walk may proceed, otherwise fail it.
This could easily be extended to put acls under RCU and check them
under seqlock, if need be. But this implementation is enough to show
the rcu-walk aware permissions code for path lookups is working, and
will handle cases where there are no ACLs or ACLs in just the final
element.
This patch implicity converts tmpfs to rcu-aware permission check.
Subsequent patches onvert ext*, xfs, and, btrfs. Each of these uses
acl/permission code in a different way, so convert them all to provide
templates and proof of concept.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Require filesystems be aware of .d_revalidate being called in rcu-walk
mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning
-ECHILD from all implementations.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Put dentry and inode fields into top of data structure. This allows RCU path
traversal to perform an RCU dentry lookup in a path walk by touching only the
first 56 bytes of the dentry.
We also fit in 8 bytes of inline name in the first 64 bytes, so for short
names, only 64 bytes needs to be touched to perform the lookup. We should
get rid of the hash->prev pointer from the first 64 bytes, and fit 16 bytes
of name in there, which will take care of 81% rather than 32% of the kernel
tree.
inode is also rearranged so that RCU lookup will only touch a single cacheline
in the inode, plus one in the i_ops structure.
This is important for directory component lookups in RCU path walking. In the
kernel source, directory names average is around 6 chars, so this works.
When we reach the last element of the lookup, we need to lock it and take its
refcount which requires another cacheline access.
Align dentry and inode operations structs, so members will be at predictable
offsets and we can group common operations into head of structure.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry->d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.
Patched with:
git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Rather than keep a d_mounted count in the dentry, set a dentry flag instead.
The flag can be cleared by checking the hash table to see if there are any
mounts left, which is not time critical because it is performed at detach time.
The mounted state of a dentry is only used to speculatively take a look in the
mount hash table if it is set -- before following the mount, vfsmount lock is
taken and mount re-checked without races.
This saves 4 bytes on 32-bit, nothing on 64-bit but it does provide a hole I
might use later (and some configs have larger than 32-bit spinlocks which might
make use of the hole).
Autofs4 conversion and changelog by Ian Kent <raven@themaw.net>:
In autofs4, when expring direct (or offset) mounts we need to ensure that we
block user path walks into the autofs mount, which is covered by another mount.
To do this we clear the mounted status so that follows stop before walking into
the mount and are essentially blocked until the expire is completed. The
automount daemon still finds the correct dentry for the umount due to the
follow mount logic in fs/autofs4/root.c:autofs4_follow_link(), which is set as
an inode operation for direct and offset mounts only and is called following
the lookup that stopped at the covered mount.
At the end of the expire the covering mount probably has gone away so the
mounted status need not be restored. But we need to check this and only restore
the mounted status if the expire failed.
XXX: autofs may not work right if we have other mounts go over the top of it?
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Use a seqlock in the fs_struct to enable us to take an atomic copy of the
complete cwd and root paths. Use this in the RCU lookup path to avoid a
thread-shared spinlock in RCU lookup operations.
Multi-threaded apps may now perform path lookups with scalability matching
multi-process apps. Operations such as stat(2) become very scalable for
multi-threaded workload.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Perform common cases of path lookups without any stores or locking in the
ancestor dentry elements. This is called rcu-walk, as opposed to the current
algorithm which is a refcount based walk, or ref-walk.
This results in far fewer atomic operations on every path element,
significantly improving path lookup performance. It also avoids cacheline
bouncing on common dentries, significantly improving scalability.
The overall design is like this:
* LOOKUP_RCU is set in nd->flags, which distinguishes rcu-walk from ref-walk.
* Take the RCU lock for the entire path walk, starting with the acquiring
of the starting path (eg. root/cwd/fd-path). So now dentry refcounts are
not required for dentry persistence.
* synchronize_rcu is called when unregistering a filesystem, so we can
access d_ops and i_ops during rcu-walk.
* Similarly take the vfsmount lock for the entire path walk. So now mnt
refcounts are not required for persistence. Also we are free to perform mount
lookups, and to assume dentry mount points and mount roots are stable up and
down the path.
* Have a per-dentry seqlock to protect the dentry name, parent, and inode,
so we can load this tuple atomically, and also check whether any of its
members have changed.
* Dentry lookups (based on parent, candidate string tuple) recheck the parent
sequence after the child is found in case anything changed in the parent
during the path walk.
* inode is also RCU protected so we can load d_inode and use the inode for
limited things.
* i_mode, i_uid, i_gid can be tested for exec permissions during path walk.
* i_op can be loaded.
When we reach the destination dentry, we lock it, recheck lookup sequence,
and increment its refcount and mountpoint refcount. RCU and vfsmount locks
are dropped. This is termed "dropping rcu-walk". If the dentry refcount does
not match, we can not drop rcu-walk gracefully at the current point in the
lokup, so instead return -ECHILD (for want of a better errno). This signals the
path walking code to re-do the entire lookup with a ref-walk.
Aside from the final dentry, there are other situations that may be encounted
where we cannot continue rcu-walk. In that case, we drop rcu-walk (ie. take
a reference on the last good dentry) and continue with a ref-walk. Again, if
we can drop rcu-walk gracefully, we return -ECHILD and do the whole lookup
using ref-walk. But it is very important that we can continue with ref-walk
for most cases, particularly to avoid the overhead of double lookups, and to
gain the scalability advantages on common path elements (like cwd and root).
The cases where rcu-walk cannot continue are:
* NULL dentry (ie. any uncached path element)
* parent with d_inode->i_op->permission or ACLs
* dentries with d_revalidate
* Following links
In future patches, permission checks and d_revalidate become rcu-walk aware. It
may be possible eventually to make following links rcu-walk aware.
Uncached path elements will always require dropping to ref-walk mode, at the
very least because i_mutex needs to be grabbed, and objects allocated.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Add branch annotations for seqlock read fastpath, and introduce
__read_seqcount_begin and __read_seqcount_end functions, that can avoid the
smp_rmb() if used carefully. These will be used by store-free path walking
algorithm performance is critical and seqlocks are in use.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Pseudo filesystems that don't put inode on RCU list or reachable by
rcu-walk dentries do not need to RCU free their inodes.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
RCU free the struct inode. This will allow:
- Subsequent store-free path walking patch. The inode must be consulted for
permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
to take i_lock no longer need to take sb_inode_list_lock to walk the list in
the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
page lock to follow page->mapping.
The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.
In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.
The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
dget_locked was a shortcut to avoid the lazy lru manipulation when we already
held dcache_lock (lru manipulation was relatively cheap at that point).
However, how that the lru lock is an innermost one, we never hold it at any
caller, so the lock cost can now be avoided. We already have well working lazy
dcache LRU, so it should be fine to defer LRU manipulations to scan time.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
The remaining usages for dcache_lock is to allow atomic, multi-step read-side
operations over the directory tree by excluding modifications to the tree.
Also, to walk in the leaf->root direction in the tree where we don't have
a natural d_lock ordering.
This could be accomplished by taking every d_lock, but this would mean a
huge number of locks and actually gets very tricky.
Solve this instead by using the rename seqlock for multi-step read-side
operations, retry in case of a rename so we don't walk up the wrong parent.
Concurrent dentry insertions are not serialised against. Concurrent deletes
are tricky when walking up the directory: our parent might have been deleted
when dropping locks so also need to check and retry for that.
We can also use the rename lock in cases where livelock is a worry (and it
is introduced in subsequent patch).
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Add a new lock, dcache_inode_lock, to protect the inode's i_dentry list
from concurrent modification. d_alias is also protected by d_lock.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Protect d_subdirs and d_child with d_lock, except in filesystems that aren't
using dcache_lock for these anyway (eg. using i_mutex).
Note: if we change the locking rule in future so that ->d_child protection is
provided only with ->d_parent->d_lock, it may allow us to reduce some locking.
But it would be an exception to an otherwise regular locking scheme, so we'd
have to see some good results. Probably not worthwhile.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Make d_count non-atomic and protect it with d_lock. This allows us to ensure a
0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when
we start protecting many other dentry members with d_lock.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Add a new lock, dcache_hash_lock, to protect the dcache hash table from
concurrent modification. d_hash is also protected by d_lock.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Remove dcache_lock locking from hostfs filesystem, and move it into dcache
helpers. All that is required is a coherent path name. Protection from
concurrent modification of the namespace after path name generation is not
provided in current code, because dcache_lock is dropped before the path is
used.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Change d_hash so it may be called from lock-free RCU lookups. See similar
patch for d_compare for details.
For in-tree filesystems, this is just a mechanical change.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Change d_compare so it may be called from lock-free RCU lookups. This
does put significant restrictions on what may be done from the callback,
however there don't seem to have been any problems with in-tree fses.
If some strange use case pops up that _really_ cannot cope with the
rcu-walk rules, we can just add new rcu-unaware callbacks, which would
cause name lookup to drop out of rcu-walk mode.
For in-tree filesystems, this is just a mechanical change.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
smpfs and ncpfs want to update a live dentry name in-place. Rather than
have them open code the locking, provide a documented dcache API.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Change d_delete from a dentry deletion notification to a dentry caching
advise, more like ->drop_inode. Require it to be constant and idempotent,
and not take d_lock. This is how all existing filesystems use the callback
anyway.
This makes fine grained dentry locking of dput and dentry lru scanning
much simpler.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Remove redundant (and incorrect, since dcache RCU lookup) dentry locking
documentation and point to the canonical document.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
* 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (243 commits)
omap2: Make OMAP2PLUS select OMAP_DM_TIMER
OMAP4: hwmod data: Fix alignment and end of line in structurefields
OMAP4: hwmod data: Move the DMA structures
OMAP4: hwmod data: Move the smartreflex structures
OMAP4: hwmod data: Fix missing SIDLE_SMART_WKUP in smartreflexsysc
arm: omap: tusb6010: add name for MUSB IRQ
arm: omap: craneboard: Add USB EHCI support
omap2+: Initialize serial port for dynamic remuxing for n8x0
omap2+: Add struct omap_board_data and use it for platform level serial init
omap2+: Allow hwmod state changes to mux pads based on the state changes
omap2+: Add support for hwmod specific muxing of devices
omap2+: Add omap_mux_get_by_name
OMAP2: PM: fix compile error when !CONFIG_SUSPEND
MAINTAINERS: OMAP: hwmod: update hwmod code, data maintainership
OMAP4: Smartreflex framework extensions
OMAP4: hwmod: Add inital data for smartreflex modules.
OMAP4: PM: Program correct init voltages for scalable VDDs
OMAP4: Adding voltage driver support
OMAP4: Register voltage PMIC parameters with the voltage layer
OMAP3: PM: Program correct init voltages for VDD1 and VDD2
...
Fix up trivial conflict in arch/arm/plat-omap/Kconfig
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (416 commits)
ARM: DMA: add support for DMA debugging
ARM: PL011: add DMA burst threshold support for ST variants
ARM: PL011: Add support for transmit DMA
ARM: PL011: Ensure IRQs are disabled in UART interrupt handler
ARM: PL011: Separate hardware FIFO size from TTY FIFO size
ARM: PL011: Allow better handling of vendor data
ARM: PL011: Ensure error flags are clear at startup
ARM: PL011: include revision number in boot-time port printk
ARM: vexpress: add sched_clock() for Versatile Express
ARM i.MX53: Make MX53 EVK bootable
ARM i.MX53: Some bug fix about MX53 MSL code
ARM: 6607/1: sa1100: Update platform device registration
ARM: 6606/1: sa1100: Fix platform device registration
ARM i.MX51: rename IPU irqs
ARM i.MX51: Add ipu clock support
ARM: imx/mx27_3ds: Add PMIC support
ARM: DMA: Replace page_to_dma()/dma_to_page() with pfn_to_dma()/dma_to_pfn()
mx51: fix usb clock support
MX51: Add support for usb host 2
arch/arm/plat-mxc/ehci.c: fix errors/typos
...
In order to enable migration support, we will want to move some of the
structures that are subject to migration into the struct nfs_server.
In particular, if we are to move the state_owner and state_owner_id to
being a per-filesystem structure, then we should label the resulting
open/lock owners with a per-filesytem label to ensure global uniqueness.
This patch does so by adding the super block s_dev to the open/lock owner
name.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1436 commits)
cassini: Use local-mac-address prom property for Cassini MAC address
net: remove the duplicate #ifdef __KERNEL__
net: bridge: check the length of skb after nf_bridge_maybe_copy_header()
netconsole: clarify stopping message
netconsole: don't announce stopping if nothing happened
cnic: Fix the type field in SPQ messages
netfilter: fix export secctx error handling
netfilter: fix the race when initializing nf_ct_expect_hash_rnd
ipv4: IP defragmentation must be ECN aware
net: r6040: Return proper error for r6040_init_one
dcb: use after free in dcb_flushapp()
dcb: unlock on error in dcbnl_ieee_get()
net: ixp4xx_eth: Return proper error for eth_init_one
include/linux/if_ether.h: Add #define ETH_P_LINK_CTL for HPNA and wlan local tunnel
net: add POLLPRI to sock_def_readable()
af_unix: Avoid socket->sk NULL OOPS in stream connect security hooks.
net_sched: pfifo_head_drop problem
mac80211: remove stray extern
mac80211: implement off-channel TX using hw r-o-c offload
mac80211: implement hardware offload for remain-on-channel
...
Delegations are per-inode, not per-nfs_client. When a server file
system is migrated, delegations on the client must be moved from the
source to the destination nfs_server. Make it easier to manage a
mount point's delegation list across a migration event by moving the
list to the nfs_server struct.
Clean up: I added documenting comments to public functions I changed
in this patch. For consistency I added comments to all the other
public functions in fs/nfs/delegation.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4 migration needs to reassociate state owners from the source to
the destination nfs_server data structures. To make that easier, move
the cl_state_owners field to the nfs_server struct. cl_openowner_id
and cl_lockowner_id accompany this move, as they are used in
conjunction with cl_state_owners.
The cl_lock field in the parent nfs_client continues to protect all
three of these fields.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A layout can request return-on-close. How this interacts with the
forgetful model of never sending LAYOUTRETURNS is a bit ambiguous.
We forget any layouts marked roc, and wait for them to be completely
forgotten before continuing with the close. In addition, to compensate
for races with any inflight LAYOUTGETs, and the fact that we do not get
any layout stateid back from the server, we set the barrier to the worst
case scenario of current_seqid + number of outstanding LAYOUTGETS.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>