linux-2.6

dect

Archived

Author	SHA1	Message	Date
Len Brown	3b70b2e5fc	x86 idle floppy: deprecate disable_hlt() Plan to remove floppy_disable_hlt in 2012, an ancient workaround with comments that it should be removed. This allows us to remove clutter and a run-time branch from the idle code. WARN_ONCE() on invocation until it is removed. cc: x86@kernel.org cc: stable@kernel.org # .39.x Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:39:15 -04:00
Len Brown	06ae40ce07	x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it In the long run, we don't want default_idle() or (pm_idle)() to be exported outside of process.c. Start by not exporting them to modules, unless the APM build demands it. cc: x86@kernel.org cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:39:14 -04:00
Len Brown	02c68a0201	x86 idle: clarify AMD erratum 400 workaround The workaround for AMD erratum 400 uses the term "c1e" falsely suggesting: 1. Intel C1E is somehow involved 2. All AMD processors with C1E are involved Use the string "amd_c1e" instead of simply "c1e" to clarify that this workaround is specific to AMD's version of C1E. Use the string "e400" to clarify that the workaround is specific to AMD processors with Erratum 400. This patch is text-substitution only, with no functional change. cc: x86@kernel.org Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:38:57 -04:00
Zhang Rui	08b53f0e6b	ACPI EC: remove redundant code ec->handle is set in ec_parse_device(), so don't bother to set it again. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 02:59:50 -04:00
Lin Ming	28c2103dad	ACPI: Add D3 cold state _SxW returns an Integer containing the lowest D-state supported in state Sx. If OSPM has not indicated that it supports _PR3, then the value “3” corresponds to D3. If it has indicated _PR3 support, the value “3” represents D3hot and the value “4” represents D3cold. Linux does set _OSC._PR3, so we should fix it to expect that _SxW can return 4. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 02:21:08 -04:00
Lin Ming	932df74143	ACPI: processor: fix processor_physically_present in UP kernel Usually, there are multiple processors defined in ACPI table, for example Scope (_PR) { Processor (CPU0, 0x00, 0x00000410, 0x06) {} Processor (CPU1, 0x01, 0x00000410, 0x06) {} Processor (CPU2, 0x02, 0x00000410, 0x06) {} Processor (CPU3, 0x03, 0x00000410, 0x06) {} } processor_physically_present(...) will be called to check whether those processors are physically present. Currently we have below codes in processor_physically_present, cpuid = acpi_get_cpuid(...); if ((cpuid == -1) && (num_possible_cpus() > 1)) return false; return true; In UP kernel, acpi_get_cpuid(...) always return -1 and num_possible_cpus() always return 1, so processor_physically_present(...) always returns true for all passed in processor handles. This is wrong for UP processor or SMP processor running UP kernel. This patch removes the !SMP version of acpi_get_cpuid(), so both UP and SMP kernel use the same acpi_get_cpuid function. And for UP kernel, only processor 0 is valid. https://bugzilla.kernel.org/show_bug.cgi?id=16548 https://bugzilla.kernel.org/show_bug.cgi?id=16357 Tested-by: Anton Kochkov <anton.kochkov@gmail.com> Tested-by: Ambroz Bizjak <ambrop7@gmail.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 02:17:56 -04:00
Linus Torvalds	139f37f5e1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin: Blackfin: debug-mmrs: include RSI_PID[4567] MMRs Blackfin: bf51x: fix up RSI_PID# MMR defines Blackfin: bf52x/bf54x: fix up usb MMR defines Blackfin: debug-mmrs: fix typos with gptimers/mdma/ppi Blackfin: gptimers: add structure for hardware register layout Blackfin: wire up new sendmmsg syscall Blackfin: mach/bfin_serial_5xx.h: punt now-unused header Blackfin: bfin_serial.h: turn default port wrappers into stubs	2011-05-28 23:12:28 -07:00
Randy Dunlap	5be7ef0024	scsi: fix scsi_proc new kernel-doc warning Fix kernel-doc warnings in scsi_proc.c: Warning(drivers/scsi/scsi_proc.c:390): No description found for parameter 'dev' Warning(drivers/scsi/scsi_proc.c:390): No description found for parameter 'data' Warning(drivers/scsi/scsi_proc.c:390): Excess function parameter 's' description in 'always_match' Warning(drivers/scsi/scsi_proc.c:390): Excess function parameter 'p' description in 'always_match' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-28 23:12:11 -07:00
Thomas Renninger	526b4af47f	ACPI: Split out custom_method functionality into an own driver With /sys/kernel/debug/acpi/custom_method root can write to arbitrary memory and increase his priveleges, even if these are restricted. -> Make this an own debug .config option and warn about the security issue in the config description. -> Still keep acpi/debugfs.c which now only creates an empty /sys/kernel/debug/acpi directory. There might be other users of it later. Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: rui.zhang@intel.com Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 01:50:40 -04:00
Thomas Renninger	aecad432fd	ACPI: Cleanup custom_method debug stuff - Move param aml_debug_output to other params into sysfs.c - Split acpi_debugfs_init to prepare custom_method to be an own .config option and driver. Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: rui.zhang@intel.com Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 01:50:04 -04:00
Zhang Rui	534bc4e3d2	ACPI EC: enable MSI workaround for Quanta laptops Enable MSI workaround for Quanta laptops. https://bugzilla.kernel.org/show_bug.cgi?id=20242 Tested-by: Jan-Matthias Braun <jan_braun@gmx.net> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 01:35:46 -04:00
Tim Chen	333c5ae994	idle governor: Avoid lock acquisition to read pm_qos before entering idle Thanks to the reviews and comments by Rafael, James, Mark and Andi. Here's version 2 of the patch incorporating your comments and also some update to my previous patch comments. I noticed that before entering idle state, the menu idle governor will look up the current pm_qos target value according to the list of qos requests received. This look up currently needs the acquisition of a lock to access the list of qos requests to find the qos target value, slowing down the entrance into idle state due to contention by multiple cpus to access this list. The contention is severe when there are a lot of cpus waking and going into idle. For example, for a simple workload that has 32 pair of processes ping ponging messages to each other, where 64 cpu cores are active in test system, I see the following profile with 37.82% of cpu cycles spent in contention of pm_qos_lock: - 37.82% swapper [kernel.kallsyms] [k] _raw_spin_lock_irqsave - _raw_spin_lock_irqsave - 95.65% pm_qos_request menu_select cpuidle_idle_call - cpu_idle 99.98% start_secondary A better approach will be to cache the updated pm_qos target value so reading it does not require lock acquisition as in the patch below. With this patch the contention for pm_qos_lock is removed and I saw a 2.2X increase in throughput for my message passing workload. cc: stable@kernel.org Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: James Bottomley <James.Bottomley@suse.de> Acked-by: mark gross <markgross@thegnar.org> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 00:50:59 -04:00
Tero Kristo	7467571f44	cpuidle: menu: fixed wrapping timers at 4.294 seconds Cpuidle menu governor is using u32 as a temporary datatype for storing nanosecond values which wrap around at 4.294 seconds. This causes errors in predicted sleep times resulting in higher than should be C state selection and increased power consumption. This also breaks cpuidle state residency statistics. cc: stable@kernel.org # .32.x through .39.x Signed-off-by: Tero Kristo <tero.kristo@nokia.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 00:35:47 -04:00
Hugh Dickins	eee0f252c6	mm: fix page_lock_anon_vma leaving mutex locked On one machine I've been getting hangs, a page fault's anon_vma_prepare() waiting in anon_vma_lock(), other processes waiting for that page's lock. This is a replay of last year's `f18194275c` "mm: fix hang on anon_vma->root->lock". The new page_lock_anon_vma() places too much faith in its refcount: when it has acquired the mutex_trylock(), it's possible that a racing task in anon_vma_alloc() has just reallocated the struct anon_vma, set refcount to 1, and is about to reset its anon_vma->root. Fix this by saving anon_vma->root, and relying on the usual page_mapped() check instead of a refcount check: if page is still mapped, the anon_vma is still ours; if page is not still mapped, we're no longer interested. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-28 16:55:32 -07:00
Hugh Dickins	5dbe0af47f	mm: fix kernel BUG at mm/rmap.c:1017! I've hit the "address >= vma->vm_end" check in do_page_add_anon_rmap() just once. The stack showed khugepaged allocation trying to compact pages: the call to page_add_anon_rmap() coming from remove_migration_pte(). That path holds anon_vma lock, but does not hold mmap_sem: it can therefore race with a split_vma(), and in commit `5f70b962cc` "mmap: avoid unnecessary anon_vma lock" we just took away the anon_vma lock protection when adjusting vma->vm_end. I don't think that particular BUG_ON ever caught anything interesting, so better replace it by a comment, than reinstate the anon_vma locking. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-28 16:09:26 -07:00
Hugh Dickins	826267cf1e	tmpfs: fix race between truncate and writepage While running fsx on tmpfs with a memhog then swapoff, swapoff was hanging (interruptibly), repeatedly failing to locate the owner of a 0xff entry in the swap_map. Although shmem_writepage() does abandon when it sees incoming page index is beyond eof, there was still a window in which shmem_truncate_range() could come in between writepage's dropping lock and updating swap_map, find the half-completed swap_map entry, and in trying to free it, leave it in a state that swap_shmem_alloc() could not correct. Arguably a bug in __swap_duplicate()'s and swap_entry_free()'s handling of the different cases, but easiest to fix by moving swap_shmem_alloc() under cover of the lock. More interesting than the bug: it's been there since 2.6.33, why could I not see it with earlier kernels? The mmotm of two weeks ago seems to have some magic for generating races, this is just one of three I found. With yesterday's git I first saw this in mainline, bisected in search of that magic, but the easy reproducibility evaporated. Oh well, fix the bug. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-28 16:09:26 -07:00
Mike Frysinger	c320afe965	Blackfin: debug-mmrs: include RSI_PID[4567] MMRs The documentation is a little iffy as to whether these are actual MMRs, but reading them on the hardware works, and the previous version of this logic (the SDH) had PID[4567]. So add it for RSI too. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-05-28 17:02:56 -04:00
Mike Frysinger	fcb243918f	Blackfin: bf51x: fix up RSI_PID# MMR defines Looks like the copying of MMR defines from the SDH block missed updating the addresses of the RSI_PID# registers. So tweak them to reflect the actual hardware. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-05-28 17:02:56 -04:00
Mike Frysinger	61aa818f7b	Blackfin: bf52x/bf54x: fix up usb MMR defines The bf52x/bf54x have the incorrect addresses for USB_EP_NI7_RXINTERVAL and USB_EP_NI7_TXCOUNT, so adjust those. Further, the bf54x header puts the USB defines in the wrong place, so shuffle them back to the right grouping. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-05-28 17:02:56 -04:00
Mike Frysinger	d09fb60203	Blackfin: debug-mmrs: fix typos with gptimers/mdma/ppi This code was mostly developed against a BF54x, so some BF537-specific issues were missed. The PPI block starts at PPI_CONTROL, not PPI_STATUS (which is the reverse of the EPPI block). The MDMA block starts at MDMA_NEXT_DESC_PTR, not MDMA_CONFIG. Seems the sim does not catch misreads here so that'll need to get fixed. The gptimer block is mostly 32bit regs, not 16bit. Use the gptimer struct to figure that out rather than hardcoding it locally. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-05-28 17:02:56 -04:00
Mike Frysinger	a4ffd95692	Blackfin: gptimers: add structure for hardware register layout Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-05-28 17:02:55 -04:00
Mike Frysinger	427472c967	Blackfin: wire up new sendmmsg syscall Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-05-28 17:02:55 -04:00
Mike Frysinger	63917efc4f	Blackfin: mach/bfin_serial_5xx.h: punt now-unused header Now that the serial code has been unified in bfin_serial.h, and the Blackfin UART driver pushed its resources to the boards files, we don't need these headers anymore. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-05-28 17:01:55 -04:00
Mike Frysinger	091c75985e	Blackfin: bfin_serial.h: turn default port wrappers into stubs Any consumer that needs to access the MMRs has to provide these helpers, so make the default into useless stubs. Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2011-05-28 17:01:55 -04:00
Linus Torvalds	36947a7682	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (36 commits) Cache xattr security drop check for write v2 fs: block_page_mkwrite should wait for writeback to finish mm: Wait for writeback when grabbing pages to begin a write configfs: remove unnecessary dentry_unhash on rmdir, dir rename fat: remove unnecessary dentry_unhash on rmdir, dir rename hpfs: remove unnecessary dentry_unhash on rmdir, dir rename minix: remove unnecessary dentry_unhash on rmdir, dir rename fuse: remove unnecessary dentry_unhash on rmdir, dir rename coda: remove unnecessary dentry_unhash on rmdir, dir rename afs: remove unnecessary dentry_unhash on rmdir, dir rename affs: remove unnecessary dentry_unhash on rmdir, dir rename 9p: remove unnecessary dentry_unhash on rmdir, dir rename ncpfs: fix rename over directory with dangling references ncpfs: document dentry_unhash usage ecryptfs: remove unnecessary dentry_unhash on rmdir, dir rename hostfs: remove unnecessary dentry_unhash on rmdir, dir rename hfsplus: remove unnecessary dentry_unhash on rmdir, dir rename hfs: remove unnecessary dentry_unhash on rmdir, dir rename omfs: remove unnecessary dentry_unhash on rmdir, dir rneame udf: remove unnecessary dentry_unhash from rmdir, dir rename ...	2011-05-28 13:03:41 -07:00
Linus Torvalds	a947e23a8e	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, asm: Clean up desc.h a bit x86, amd: Do not enable ARAT feature on AMD processors below family 0x12 x86: Move do_page_fault()'s error path under unlikely() x86, efi: Retain boot service code until after switching to virtual mode x86: Remove unnecessary check in detect_ht() x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct x86, UV: Clean up uv_tlb.c x86, UV: Add support for SGI UV2 hub chip x86, cpufeature: Update CPU feature RDRND to RDRAND	2011-05-28 12:57:01 -07:00
Linus Torvalds	08a8b79600	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: cpuset: Fix cpuset_cpus_allowed_fallback(), don't update tsk->rt.nr_cpus_allowed sched: Fix ->min_vruntime calculation in dequeue_entity() sched: Fix ttwu() for __ARCH_WANT_INTERRUPTS_ON_CTXSW sched: More sched_domain iterations fixes	2011-05-28 12:56:46 -07:00
Linus Torvalds	1ba4b8cb94	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: Start RCU kthreads in TASK_INTERRUPTIBLE state rcu: Remove waitqueue usage for cpu, node, and boost kthreads rcu: Avoid acquiring rcu_node locks in timer functions atomic: Add atomic_or() Documentation: Add statistics about nested locks rcu: Decrease memory-barrier usage based on semi-formal proof rcu: Make rcu_enter_nohz() pay attention to nesting rcu: Don't do reschedule unless in irq rcu: Remove old memory barriers from rcu_process_callbacks() rcu: Add memory barriers rcu: Fix unpaired rcu_irq_enter() from locking selftests	2011-05-28 12:56:32 -07:00
Linus Torvalds	c4a227d89f	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) perf: Fix SIGIO handling perf top: Don't stop if no kernel symtab is found perf top: Handle kptr_restrict perf top: Remove unused macro perf events: initialize fd array to -1 instead of 0 perf tools: Make sure kptr_restrict warnings fit 80 col terms perf tools: Fix build on older systems perf symbols: Handle /proc/sys/kernel/kptr_restrict perf: Remove duplicate headers ftrace: Add internal recursive checks tracing: Update btrfs's tracepoints to use u64 interface tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine ftrace: Set ops->flag to enabled even on static function tracing tracing: Have event with function tracer check error return ftrace: Have ftrace_startup() return failure code jump_label: Check entries limit in __jump_label_update ftrace/recordmcount: Avoid STT_FUNC symbols as base on ARM scripts/tags.sh: Add magic for trace-events for etags too scripts/tags.sh: Fix ctags for DEFINE_EVENT() x86/ftrace: Fix compiler warning in ftrace.c ...	2011-05-28 12:55:55 -07:00
Linus Torvalds	87367a0b71	Merge branch 'for-usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci * 'for-usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci: Intel xhci: Limit number of active endpoints to 64. Intel xhci: Ignore spurious successful event. Intel xhci: Support EHCI/xHCI port switching. Intel xhci: Add PCI id for Panther Point xHCI host. xhci: STFU: Be quieter during URB submission and completion. xhci: STFU: Don't print event ring dequeue pointer. xhci: STFU: Remove function tracing. xhci: Don't submit commands when the host is dead. xhci: Clear stopped_td when Stop Endpoint command completes.	2011-05-28 12:36:15 -07:00
Linus Torvalds	4cb865deec	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (33 commits) x86: poll waiting for I/OAT DMA channel status maintainers: add dma engine tree details dmaengine: add TODO items for future work on dma drivers dmaengine: Add API documentation for slave dma usage dmaengine/dw_dmac: Update maintainer-ship dmaengine: move link order dmaengine/dw_dmac: implement pause and resume in dwc_control dmaengine/dw_dmac: Replace spin_lock* with irqsave variants and enable submission from callback dmaengine/dw_dmac: Divide one sg to many desc, if sg len is greater than DWC_MAX_COUNT dmaengine/dw_dmac: set residue as total len in dwc_tx_status if status is !DMA_SUCCESS dmaengine/dw_dmac: don't call callback routine in case dmaengine_terminate_all() is called dmaengine: at_hdmac: pause: no need to wait for FIFO empty pch_dma: modify pci device table definition pch_dma: Support new device ML7223 IOH pch_dma: Support I2S for ML7213 IOH pch_dma: Fix DMA setting issue pch_dma: modify for checkpatch pch_dma: fix dma direction issue for ML7213 IOH video-in dmaengine: at_hdmac: use descriptor chaining help function dmaengine: at_hdmac: implement pause and resume in atc_control ... Fix up trivial conflict in drivers/dma/dw_dmac.c	2011-05-28 12:35:15 -07:00
Linus Torvalds	55f08e1baa	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: mfd: Fix build breakage caused by tps65910 gpio directory move mfd: Use mfd cell platform_data for db8500-prcmu cells platform bits	2011-05-28 12:33:51 -07:00
Linus Torvalds	d02bf062fb	Merge branch 'spi/next' of git://git.secretlab.ca/git/linux-2.6 * 'spi/next' of git://git.secretlab.ca/git/linux-2.6: spi/spi_bfin_sport: new driver for a SPI bus via the Blackfin SPORT peripheral spi/tle620x: add missing device_remove_file()	2011-05-28 10:57:39 -07:00
Linus Torvalds	04830fccdc	Merge branch 'gpio/next' of git://git.secretlab.ca/git/linux-2.6 * 'gpio/next' of git://git.secretlab.ca/git/linux-2.6: gpio/pch_gpio: Support new device ML7223 gpio: make gpio_{request,free}_array gpio array parameter const GPIO: OMAP: move to drivers/gpio GPIO: OMAP: move register offset defines into <plat/gpio.h> gpio: Convert gpio_is_valid to return bool gpio: Move the s5pc100 GPIO to drivers/gpio gpio: Move the s5pv210 GPIO to drivers/gpio gpio: Move the exynos4 GPIO to drivers/gpio gpio: Move to Samsung common GPIO library to drivers/gpio gpio/nomadik: add function to read GPIO pull down status gpio/nomadik: show all pins in debug gpio: move Nomadik GPIO driver to drivers/gpio gpio: move U300 GPIO driver to drivers/gpio langwell_gpio: add runtime pm support gpio/pca953x: Add support for pca9574 and pca9575 devices gpio/cs5535: Show explicit dependency between gpio_cs5535 and mfd_cs5535	2011-05-28 10:56:34 -07:00
Linus Torvalds	571503e100	Merge branch 'setns' * setns: ns: Wire up the setns system call Done as a merge to make it easier to fix up conflicts in arm due to addition of sendmmsg system call	2011-05-28 10:51:01 -07:00
Eric W. Biederman	7b21fddd08	ns: Wire up the setns system call 32bit and 64bit on x86 are tested and working. The rest I have looked at closely and I can't find any problems. setns is an easy system call to wire up. It just takes two ints so I don't expect any weird architecture porting problems. While doing this I have noticed that we have some architectures that are very slow to get new system calls. cris seems to be the slowest where the last system calls wired up were preadv and pwritev. avr32 is weird in that recvmmsg was wired up but never declared in unistd.h. frv is behind with perf_event_open being the last syscall wired up. On h8300 the last system call wired up was epoll_wait. On m32r the last system call wired up was fallocate. mn10300 has recvmmsg as the last system call wired up. The rest seem to at least have syncfs wired up which was new in the 2.6.39. v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com> v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com> v4: Moved wiring up of the system call to another patch v5: ported to v2.6.39-rc6 v6: rebased onto parisc-next and net-next to avoid syscall conflicts. v7: ported to Linus's latest post 2.6.39 tree. > arch/blackfin/include/asm/unistd.h \| 3 ++- > arch/blackfin/mach-common/entry.S \| 1 + Acked-by: Mike Frysinger <vapier@gentoo.org> Oh - ia64 wiring looks good. Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-28 10:48:39 -07:00
Andi Kleen	69b4573296	Cache xattr security drop check for write v2 Some recent benchmarking on btrfs showed that a major scaling bottleneck on large systems on btrfs is currently the xattr lookup on every write. Why xattr lookup on every write I hear you ask? write wants to drop suid and security related xattrs that could set o capabilities for executables. To do that it currently looks up security.capability on EVERY write (even for non executables) to decide whether to drop it or not. In btrfs this causes an additional tree walk, hitting some per file system locks and quite bad scalability. In a simple read workload on a 8S system I saw over 90% CPU time in spinlocks related to that. Chris Mason tells me this is also a problem in ext4, where it hits the global mbcache lock. This patch adds a simple per inode to avoid this problem. We only do the lookup once per file and then if there is no xattr cache the decision. All xattr changes clear the flag. I also used the same flag to avoid the suid check, although that one is pretty cheap. A file system can also set this flag when it creates the inode, if it has a cheap way to do so. This is done for some common file systems in followon patches. With this patch a major part of the lock contention disappears for btrfs. Some testing on smaller systems didn't show significant performance changes, but at least it helps the larger systems and is generally more efficient. v2: Rename is_sgid. add file system helper. Cc: chris.mason@oracle.com Cc: josef@redhat.com Cc: viro@zeniv.linux.org.uk Cc: agruen@linbit.com Cc: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-05-28 12:02:09 -04:00
Paul E. McKenney	cc3ce5176d	rcu: Start RCU kthreads in TASK_INTERRUPTIBLE state Upon creation, kthreads are in TASK_UNINTERRUPTIBLE state, which can result in softlockup warnings. Because some of RCU's kthreads can legitimately be idle indefinitely, start them in TASK_INTERRUPTIBLE state in order to avoid those warnings. Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:41:56 +02:00
Peter Zijlstra	08bca60a69	rcu: Remove waitqueue usage for cpu, node, and boost kthreads It is not necessary to use waitqueues for the RCU kthreads because we always know exactly which thread is to be awakened. In addition, wake_up() only issues an actual wakeup when there is a thread waiting on the queue, which was why there was an extra explicit wake_up_process() to get the RCU kthreads started. Eliminating the waitqueues (and wake_up()) in favor of wake_up_process() eliminates the need for the initial wake_up_process() and also shrinks the data structure size a bit. The wakeup logic is placed in a new rcu_wait() macro. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:41:52 +02:00
Paul E. McKenney	8826f3b039	rcu: Avoid acquiring rcu_node locks in timer functions This commit switches manipulations of the rcu_node ->wakemask field to atomic operations, which allows rcu_cpu_kthread_timer() to avoid acquiring the rcu_node lock. This should avoid the following lockdep splat reported by Valdis Kletnieks: [ 12.872150] usb 1-4: new high speed USB device number 3 using ehci_hcd [ 12.986667] usb 1-4: New USB device found, idVendor=413c, idProduct=2513 [ 12.986679] usb 1-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0 [ 12.987691] hub 1-4:1.0: USB hub found [ 12.987877] hub 1-4:1.0: 3 ports detected [ 12.996372] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input10 [ 13.071471] udevadm used greatest stack depth: 3984 bytes left [ 13.172129] [ 13.172130] ======================================================= [ 13.172425] [ INFO: possible circular locking dependency detected ] [ 13.172650] 2.6.39-rc6-mmotm0506 #1 [ 13.172773] ------------------------------------------------------- [ 13.172997] blkid/267 is trying to acquire lock: [ 13.173009] (&p->pi_lock){-.-.-.}, at: [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa [ 13.173009] [ 13.173009] but task is already holding lock: [ 13.173009] (rcu_node_level_0){..-...}, at: [<ffffffff810901cc>] rcu_cpu_kthread_timer+0x27/0x58 [ 13.173009] [ 13.173009] which lock already depends on the new lock. [ 13.173009] [ 13.173009] [ 13.173009] the existing dependency chain (in reverse order) is: [ 13.173009] [ 13.173009] -> #2 (rcu_node_level_0){..-...}: [ 13.173009] [<ffffffff810679b9>] check_prevs_add+0x8b/0x104 [ 13.173009] [<ffffffff81067da1>] validate_chain+0x36f/0x3ab [ 13.173009] [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2 [ 13.173009] [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c [ 13.173009] [<ffffffff815697f1>] _raw_spin_lock+0x36/0x45 [ 13.173009] [<ffffffff81090794>] rcu_read_unlock_special+0x8c/0x1d5 [ 13.173009] [<ffffffff8109092c>] __rcu_read_unlock+0x4f/0xd7 [ 13.173009] [<ffffffff81027bd3>] rcu_read_unlock+0x21/0x23 [ 13.173009] [<ffffffff8102cc34>] cpuacct_charge+0x6c/0x75 [ 13.173009] [<ffffffff81030cc6>] update_curr+0x101/0x12e [ 13.173009] [<ffffffff810311d0>] check_preempt_wakeup+0xf7/0x23b [ 13.173009] [<ffffffff8102acb3>] check_preempt_curr+0x2b/0x68 [ 13.173009] [<ffffffff81031d40>] ttwu_do_wakeup+0x76/0x128 [ 13.173009] [<ffffffff81031e49>] ttwu_do_activate.constprop.63+0x57/0x5c [ 13.173009] [<ffffffff81031e96>] scheduler_ipi+0x48/0x5d [ 13.173009] [<ffffffff810177d5>] smp_reschedule_interrupt+0x16/0x18 [ 13.173009] [<ffffffff815710f3>] reschedule_interrupt+0x13/0x20 [ 13.173009] [<ffffffff810b66d1>] rcu_read_unlock+0x21/0x23 [ 13.173009] [<ffffffff810b739c>] find_get_page+0xa9/0xb9 [ 13.173009] [<ffffffff810b8b48>] filemap_fault+0x6a/0x34d [ 13.173009] [<ffffffff810d1a25>] __do_fault+0x54/0x3e6 [ 13.173009] [<ffffffff810d447a>] handle_pte_fault+0x12c/0x1ed [ 13.173009] [<ffffffff810d48f7>] handle_mm_fault+0x1cd/0x1e0 [ 13.173009] [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de [ 13.173009] [<ffffffff8156a75f>] page_fault+0x1f/0x30 [ 13.173009] [ 13.173009] -> #1 (&rq->lock){-.-.-.}: [ 13.173009] [<ffffffff810679b9>] check_prevs_add+0x8b/0x104 [ 13.173009] [<ffffffff81067da1>] validate_chain+0x36f/0x3ab [ 13.173009] [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2 [ 13.173009] [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c [ 13.173009] [<ffffffff815697f1>] _raw_spin_lock+0x36/0x45 [ 13.173009] [<ffffffff81027e19>] __task_rq_lock+0x8b/0xd3 [ 13.173009] [<ffffffff81032f7f>] wake_up_new_task+0x41/0x108 [ 13.173009] [<ffffffff810376c3>] do_fork+0x265/0x33f [ 13.173009] [<ffffffff81007d02>] kernel_thread+0x6b/0x6d [ 13.173009] [<ffffffff8153a9dd>] rest_init+0x21/0xd2 [ 13.173009] [<ffffffff81b1db4f>] start_kernel+0x3bb/0x3c6 [ 13.173009] [<ffffffff81b1d29f>] x86_64_start_reservations+0xaf/0xb3 [ 13.173009] [<ffffffff81b1d393>] x86_64_start_kernel+0xf0/0xf7 [ 13.173009] [ 13.173009] -> #0 (&p->pi_lock){-.-.-.}: [ 13.173009] [<ffffffff81067788>] check_prev_add+0x68/0x20e [ 13.173009] [<ffffffff810679b9>] check_prevs_add+0x8b/0x104 [ 13.173009] [<ffffffff81067da1>] validate_chain+0x36f/0x3ab [ 13.173009] [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2 [ 13.173009] [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c [ 13.173009] [<ffffffff815698ea>] _raw_spin_lock_irqsave+0x44/0x57 [ 13.173009] [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff81032f3c>] wake_up_process+0x10/0x12 [ 13.173009] [<ffffffff810901e9>] rcu_cpu_kthread_timer+0x44/0x58 [ 13.173009] [<ffffffff81045286>] call_timer_fn+0xac/0x1e9 [ 13.173009] [<ffffffff8104556d>] run_timer_softirq+0x1aa/0x1f2 [ 13.173009] [<ffffffff8103e487>] __do_softirq+0x109/0x26a [ 13.173009] [<ffffffff8157144c>] call_softirq+0x1c/0x30 [ 13.173009] [<ffffffff81003207>] do_softirq+0x44/0xf1 [ 13.173009] [<ffffffff8103e8b9>] irq_exit+0x58/0xc8 [ 13.173009] [<ffffffff81017f5a>] smp_apic_timer_interrupt+0x79/0x87 [ 13.173009] [<ffffffff81570fd3>] apic_timer_interrupt+0x13/0x20 [ 13.173009] [<ffffffff810bd51a>] get_page_from_freelist+0x2aa/0x310 [ 13.173009] [<ffffffff810bdf03>] __alloc_pages_nodemask+0x178/0x243 [ 13.173009] [<ffffffff8101fe2f>] pte_alloc_one+0x1e/0x3a [ 13.173009] [<ffffffff810d27fe>] __pte_alloc+0x22/0x14b [ 13.173009] [<ffffffff810d48a8>] handle_mm_fault+0x17e/0x1e0 [ 13.173009] [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de [ 13.173009] [<ffffffff8156a75f>] page_fault+0x1f/0x30 [ 13.173009] [ 13.173009] other info that might help us debug this: [ 13.173009] [ 13.173009] Chain exists of: [ 13.173009] &p->pi_lock --> &rq->lock --> rcu_node_level_0 [ 13.173009] [ 13.173009] Possible unsafe locking scenario: [ 13.173009] [ 13.173009] CPU0 CPU1 [ 13.173009] ---- ---- [ 13.173009] lock(rcu_node_level_0); [ 13.173009] lock(&rq->lock); [ 13.173009] lock(rcu_node_level_0); [ 13.173009] lock(&p->pi_lock); [ 13.173009] [ 13.173009] * DEADLOCK * [ 13.173009] [ 13.173009] 3 locks held by blkid/267: [ 13.173009] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8156cdb4>] do_page_fault+0x1f3/0x5de [ 13.173009] #1: (&yield_timer){+.-...}, at: [<ffffffff810451da>] call_timer_fn+0x0/0x1e9 [ 13.173009] #2: (rcu_node_level_0){..-...}, at: [<ffffffff810901cc>] rcu_cpu_kthread_timer+0x27/0x58 [ 13.173009] [ 13.173009] stack backtrace: [ 13.173009] Pid: 267, comm: blkid Not tainted 2.6.39-rc6-mmotm0506 #1 [ 13.173009] Call Trace: [ 13.173009] <IRQ> [<ffffffff8154a529>] print_circular_bug+0xc8/0xd9 [ 13.173009] [<ffffffff81067788>] check_prev_add+0x68/0x20e [ 13.173009] [<ffffffff8100c861>] ? save_stack_trace+0x28/0x46 [ 13.173009] [<ffffffff810679b9>] check_prevs_add+0x8b/0x104 [ 13.173009] [<ffffffff81067da1>] validate_chain+0x36f/0x3ab [ 13.173009] [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2 [ 13.173009] [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c [ 13.173009] [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82 [ 13.173009] [<ffffffff815698ea>] _raw_spin_lock_irqsave+0x44/0x57 [ 13.173009] [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa [ 13.173009] [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82 [ 13.173009] [<ffffffff81032f3c>] wake_up_process+0x10/0x12 [ 13.173009] [<ffffffff810901e9>] rcu_cpu_kthread_timer+0x44/0x58 [ 13.173009] [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82 [ 13.173009] [<ffffffff81045286>] call_timer_fn+0xac/0x1e9 [ 13.173009] [<ffffffff810451da>] ? del_timer+0x75/0x75 [ 13.173009] [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82 [ 13.173009] [<ffffffff8104556d>] run_timer_softirq+0x1aa/0x1f2 [ 13.173009] [<ffffffff8103e487>] __do_softirq+0x109/0x26a [ 13.173009] [<ffffffff8106365f>] ? tick_dev_program_event+0x37/0xf6 [ 13.173009] [<ffffffff810a0e4a>] ? time_hardirqs_off+0x1b/0x2f [ 13.173009] [<ffffffff8157144c>] call_softirq+0x1c/0x30 [ 13.173009] [<ffffffff81003207>] do_softirq+0x44/0xf1 [ 13.173009] [<ffffffff8103e8b9>] irq_exit+0x58/0xc8 [ 13.173009] [<ffffffff81017f5a>] smp_apic_timer_interrupt+0x79/0x87 [ 13.173009] [<ffffffff81570fd3>] apic_timer_interrupt+0x13/0x20 [ 13.173009] <EOI> [<ffffffff810bd384>] ? get_page_from_freelist+0x114/0x310 [ 13.173009] [<ffffffff810bd51a>] ? get_page_from_freelist+0x2aa/0x310 [ 13.173009] [<ffffffff812220e7>] ? clear_page_c+0x7/0x10 [ 13.173009] [<ffffffff810bd1ef>] ? prep_new_page+0x14c/0x1cd [ 13.173009] [<ffffffff810bd51a>] get_page_from_freelist+0x2aa/0x310 [ 13.173009] [<ffffffff810bdf03>] __alloc_pages_nodemask+0x178/0x243 [ 13.173009] [<ffffffff810d46b9>] ? __pmd_alloc+0x87/0x99 [ 13.173009] [<ffffffff8101fe2f>] pte_alloc_one+0x1e/0x3a [ 13.173009] [<ffffffff810d46b9>] ? __pmd_alloc+0x87/0x99 [ 13.173009] [<ffffffff810d27fe>] __pte_alloc+0x22/0x14b [ 13.173009] [<ffffffff810d48a8>] handle_mm_fault+0x17e/0x1e0 [ 13.173009] [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de [ 13.173009] [<ffffffff810d915f>] ? sys_brk+0x32/0x10c [ 13.173009] [<ffffffff810a0e4a>] ? time_hardirqs_off+0x1b/0x2f [ 13.173009] [<ffffffff81065c4f>] ? trace_hardirqs_off_caller+0x3f/0x9c [ 13.173009] [<ffffffff812235dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c [ 13.173009] [<ffffffff8156a75f>] page_fault+0x1f/0x30 [ 14.010075] usb 5-1: new full speed USB device number 2 using uhci_hcd Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:41:49 +02:00
Paul E. McKenney	55c2945aa9	atomic: Add atomic_or() An atomic_or() function is needed by TREE_RCU to avoid deadlock, so add a generic version. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:41:46 +02:00
Ingo Molnar	29f742f88a	Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/urgent	2011-05-28 17:41:05 +02:00
Peter Zijlstra	f506b3dc0e	perf: Fix SIGIO handling Vince noticed that unless we mmap() a buffer, SIGIO gets lost. So explicitly push the wakeup (including signals) when requested. Reported-by: Vince Weaver <vweaver1@eecs.utk.edu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/n/tip-2euus3f3x3dyvdk52cjxw8zu@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:04:59 +02:00
Juri Lelli	f62508f68d	Documentation: Add statistics about nested locks Explain what the trailing "/1" on some lock class names of lock_stat output means. Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4DD4F6C1.5090701@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:03:29 +02:00
KOSAKI Motohiro	1e1b6c511d	cpuset: Fix cpuset_cpus_allowed_fallback(), don't update tsk->rt.nr_cpus_allowed The rule is, we have to update tsk->rt.nr_cpus_allowed if we change tsk->cpus_allowed. Otherwise RT scheduler may confuse. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4DD4B3FA.5060901@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:02:57 +02:00
Peter Zijlstra	1e87623178	sched: Fix ->min_vruntime calculation in dequeue_entity() Dima Zavin <dima@android.com> reported: "After pulling the thread off the run-queue during a cgroup change, the cfs_rq.min_vruntime gets recalculated. The dequeued thread's vruntime then gets normalized to this new value. This can then lead to the thread getting an unfair boost in the new group if the vruntime of the next task in the old run-queue was way further ahead." Reported-by: Dima Zavin <dima@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Recalls-having-tested-once-upon-a-time-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1305674470-23727-1-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:02:56 +02:00
Peter Zijlstra	d6aa8f85f1	sched: Fix ttwu() for __ARCH_WANT_INTERRUPTS_ON_CTXSW Marc reported that `e4a52bcb9` (sched: Remove rq->lock from the first half of ttwu()) broke his ARM-SMP machine. Now ARM is one of the few __ARCH_WANT_INTERRUPTS_ON_CTXSW users, so that exception in the ttwu() code was suspect. Yong found that the interrupt could hit after context_switch() changes current but before it clears p->on_cpu, if that interrupt were to attempt a wake-up of p we would indeed find ourselves spinning in IRQ context. Fix this by reverting to the old behaviour for this situation and perform a full remote wake-up. Cc: Frank Rowand <frank.rowand@am.sony.com> Cc: Yong Zhang <yong.zhang0@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Reported-by: Marc Zyngier <Marc.Zyngier@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:02:55 +02:00
Xiaotian Feng	cd4ae6adf8	sched: More sched_domain iterations fixes sched_domain iterations needs to be protected by rcu_read_lock() now, this patch adds another two places which needs the rcu lock, which is spotted by following suspicious rcu_dereference_check() usage warnings. kernel/sched_rt.c:1244 invoked rcu_dereference_check() without protection! kernel/sched_stats.h:41 invoked rcu_dereference_check() without protection! Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1303469634-11678-1-git-send-email-dfeng@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-28 17:02:54 +02:00
Namhyung Kim	5988ce2396	nbd: adjust 'max_part' according to part_shift The 'max_part' parameter determines how many partitions are supported on each nbd device. However the actual number can be changed to the power of 2 minus 1 form during the module initialization as alloc_disk() is called with (1 << part_shift) for some reason. So adjust 'max_part' also at least for consistency with loop and brd. It is exported via sysfs already, and a user should check this value after module loading if [s]he wants to use that number correctly (i.e. fdisk or something). Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: Paul Clements <Paul.Clements@steeleye.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-28 14:44:46 +02:00
Namhyung Kim	3b2710824e	nbd: limit module parameters to a sane value The 'max_part' parameter controls the number of maximum partition a nbd device can have. However if a user specifies very large value it would exceed the limitation of device minor number and can cause a kernel oops (or, at least, produce invalid device nodes in some cases). In addition, specifying large 'nbds_max' value causes same problem for the same reason. On my desktop, following command results to the kernel bug: $ sudo modprobe nbd max_part=100000 kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/virtual/block/nbd4/range CPU 1 Modules linked in: nbd(+) bridge stp llc kvm_intel kvm asus_atk0110 sg sr_mod cdrom Pid: 2522, comm: modprobe Tainted: G W 2.6.39-leonard+ #159 System manufacturer System Product Name/P5G41TD-M PRO RIP: 0010:[<ffffffff8115aa08>] [<ffffffff8115aa08>] internal_create_group+0x2f/0x166 RSP: 0018:ffff8801009f1de8 EFLAGS: 00010246 RAX: 00000000ffffffef RBX: ffff880103920478 RCX: 00000000000a7bd3 RDX: ffffffff81a2dbe0 RSI: 0000000000000000 RDI: ffff880103920478 RBP: ffff8801009f1e38 R08: ffff880103920468 R09: ffff880103920478 R10: ffff8801009f1de8 R11: ffff88011eccbb68 R12: ffffffff81a2dbe0 R13: ffff880103920468 R14: 0000000000000000 R15: ffff880103920400 FS: 00007f3c49de9700(0000) GS:ffff88011f800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f3b7fe7c000 CR3: 00000000cd58d000 CR4: 00000000000406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process modprobe (pid: 2522, threadinfo ffff8801009f0000, task ffff8801009a93a0) Stack: ffff8801009f1e58 ffffffff812e8f6e ffff8801009f1e58 ffffffff812e7a80 ffff880000000010 ffff880103920400 ffff8801002fd0c0 ffff880103920468 0000000000000011 ffff880103920400 ffff8801009f1e48 ffffffff8115ab6a Call Trace: [<ffffffff812e8f6e>] ? device_add+0x4f1/0x5e4 [<ffffffff812e7a80>] ? dev_set_name+0x41/0x43 [<ffffffff8115ab6a>] sysfs_create_group+0x13/0x15 [<ffffffff810b857e>] blk_trace_init_sysfs+0x14/0x16 [<ffffffff811ee58b>] blk_register_queue+0x4c/0xfd [<ffffffff811f3bdf>] add_disk+0xe4/0x29c [<ffffffffa007e2ab>] nbd_init+0x2ab/0x30d [nbd] [<ffffffffa007e000>] ? 0xffffffffa007dfff [<ffffffff8100020f>] do_one_initcall+0x7f/0x13e [<ffffffff8107ab0a>] sys_init_module+0xa1/0x1e3 [<ffffffff814f3542>] system_call_fastpath+0x16/0x1b Code: 41 57 41 56 41 55 41 54 53 48 83 ec 28 0f 1f 44 00 00 48 89 fb 41 89 f6 49 89 d4 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe b9 ea ff ff ff 48 83 7f 30 00 0f 84 09 01 00 00 49 RIP [<ffffffff8115aa08>] internal_create_group+0x2f/0x166 RSP <ffff8801009f1de8> ---[ end trace 753285ffbf72c57c ]--- Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-28 14:44:46 +02:00

... 3 4 5 6 7 ...

253081 Commits All Branches Search

253081 Commits

All Branches