linux-2.6

dect

Archived

Author	SHA1	Message	Date
Linus Torvalds	cc21fe518a	Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Hyper-V: Integrate the clocksource with Hyper-V detection code Fix up conflicts in drivers/staging/hv/Makefile manually (some of the hv code has moved out of staging to drivers/hv/)	2011-10-28 05:08:40 -07:00
Linus Torvalds	e34eb39c1c	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, amd: Include linux/elf.h since we use stuff from asm/elf.h x86: cache_info: Update calculation of AMD L3 cache indices x86: cache_info: Kill the atomic allocation in amd_init_l3_cache() x86: cache_info: Kill the moronic shadow struct x86: cache_info: Remove bogus free of amd_l3_cache data x86, amd: Include elf.h explicitly, prepare the code for the module.h split x86-32, amd: Move va_align definition to unbreak 32-bit build x86, amd: Move BSP code to cpu_dev helper x86: Add a BSP cpu_dev helper x86, amd: Avoid cache aliasing penalties on AMD family 15h	2011-10-28 05:03:12 -07:00
Linus Torvalds	dfa4a423cf	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86-64, unistd: Remove bogus __IGNORE_getcpu x86, mm, trivial: Remove unnecessary get_order() in free_thread_info() x86, cleanup: Remove unneeded version.h include from arch/x86/	2011-10-26 17:43:08 +02:00
Linus Torvalds	7b86572a7a	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86-64: Fix CFI data for interrupt frames x86-64: Don't apply destructive erratum workaround on unaffected CPUs	2011-10-26 17:42:03 +02:00
Linus Torvalds	0791e98dd1	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Standardize on CONFIG_SPARSE_IRQ=y x86, ioapic: Clean up ioapic/apic_id usage x86, ioapic: Factor out print_IO_APIC() to only print one io apic x86, ioapic: Print out irte with right ioapic index x86, ioapic: Split up setup_ioapic_entry() x86, ioapic: Pass struct irq_attr * to setup_ioapic_irq() apic, i386/bigsmp: Fix false warnings regarding logical APIC ID mismatches	2011-10-26 17:30:33 +02:00
Linus Torvalds	7115e3fcf4	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits) perf symbols: Increase symbol KSYM_NAME_LEN size perf hists browser: Refuse 'a' hotkey on non symbolic views perf ui browser: Use libslang to read keys perf tools: Fix tracing info recording perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads perf hists: Don't consider filtered entries when calculating column widths perf hists: Don't decay total_period for filtered entries perf hists browser: Honour symbol_conf.show_{nr_samples,total_period} perf hists browser: Do not exit on tab key with single event perf annotate browser: Don't change selection line when returning from callq perf tools: handle endianness of feature bitmap perf tools: Add prelink suggestion to dso update message perf script: Fix unknown feature comment perf hists browser: Apply the dso and thread filters when merging new batches perf hists: Move the dso and thread filters from hist_browser perf ui browser: Honour the xterm colors perf top tui: Give color hints just on the percentage, like on --stdio perf ui browser: Make the colors configurable and change the defaults perf tui: Remove unneeded call to newtCls on startup perf hists: Don't format the percentage on hist_entry__snprintf ... Fix up conflicts in arch/x86/kernel/kprobes.c manually. Ingo's tree did the insane "add volatile to const array", which just doesn't make sense ("volatile const"?). But we could remove the const and make the array volatile to make doubly sure that gcc doesn't optimize it away.. Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the reader_lock has been turned into a raw lock by the core locking merge, and there was a new user of it introduced in this perf core merge. Make sure that new use also uses the raw accessor functions.	2011-10-26 17:03:38 +02:00
Linus Torvalds	3cfef95246	Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) rtmutex: Add missing rcu_read_unlock() in debug_rt_mutex_print_deadlock() lockdep: Comment all warnings lib: atomic64: Change the type of local lock to raw_spinlock_t locking, lib/atomic64: Annotate atomic64_lock::lock as raw locking, x86, iommu: Annotate qi->q_lock as raw locking, x86, iommu: Annotate irq_2_ir_lock as raw locking, x86, iommu: Annotate iommu->register_lock as raw locking, dma, ipu: Annotate bank_lock as raw locking, ARM: Annotate low level hw locks as raw locking, drivers/dca: Annotate dca_lock as raw locking, powerpc: Annotate uic->lock as raw locking, x86: mce: Annotate cmci_discover_lock as raw locking, ACPI: Annotate c3_lock as raw locking, oprofile: Annotate oprofilefs lock as raw locking, video: Annotate vga console lock as raw locking, latencytop: Annotate latency_lock as raw locking, timer_stats: Annotate table_lock as raw locking, rwsem: Annotate inner lock as raw locking, semaphores: Annotate inner lock as raw locking, sched: Annotate thread_group_cputimer as raw ... Fix up conflicts in kernel/posix-cpu-timers.c manually: making cputimer->cputime a raw lock conflicted with the ABBA fix in commit `bcd5cff721` ("cputimer: Cure lock inversion").	2011-10-26 16:17:32 +02:00
Linus Torvalds	982653009b	Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, ioapic: Consolidate the explicit EOI code x86, ioapic: Restore the mask bit correctly in eoi_ioapic_irq() x86, kdump, ioapic: Reset remote-IRR in clear_IO_APIC iommu: Rename the DMAR and INTR_REMAP config options x86, ioapic: Define irq_remap_modify_chip_defaults() x86, msi, intr-remap: Use the ioapic set affinity routine iommu: Cleanup ifdefs in detect_intel_iommu() iommu: No need to set dmar_disabled in check_zero_address() iommu: Move IOMMU specific code to intel-iommu.c intr_remap: Call dmar_dev_scope_init() explicitly x86, x2apic: Enable the bios request for x2apic optout	2011-10-26 16:11:53 +02:00
Jeremy Fitzhardinge	e71a5be15e	x86/jump_label: add arch_jump_label_transform_static() This allows jump-label entries to be cheaply updated on code which is not yet live. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Jason Baron <jbaron@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2011-10-25 11:54:42 -07:00
Jeremy Fitzhardinge	b7e3155818	x86/jump_label: drop arch_jump_label_text_poke_early() It is no longer used. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Jason Baron <jbaron@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2011-10-25 11:54:21 -07:00
Linus Torvalds	59e5253417	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (59 commits) MAINTAINERS: linux-m32r is moderated for non-subscribers linux@lists.openrisc.net is moderated for non-subscribers Drop default from "DM365 codec select" choice parisc: Kconfig: cleanup Kernel page size default Kconfig: remove redundant CONFIG_ prefix on two symbols cris: remove arch/cris/arch-v32/lib/nand_init.S microblaze: add missing CONFIG_ prefixes h8300: drop puzzling Kconfig dependencies MAINTAINERS: microblaze-uclinux@itee.uq.edu.au is moderated for non-subscribers tty: drop superfluous dependency in Kconfig ARM: mxc: fix Kconfig typo 'i.MX51' Fix file references in Kconfig files aic7xxx: fix Kconfig references to READMEs Fix file references in drivers/ide/ thinkpad_acpi: Fix printk typo 'bluestooth' bcmring: drop commented out line in Kconfig btmrvl_sdio: fix typo 'btmrvl_sdio_sd6888' doc: raw1394: Trivial typo fix CIFS: Don't free volume_info->UNC until we are entirely done with it. treewide: Correct spelling of successfully in comments ...	2011-10-25 12:11:02 +02:00
Josh Stone	315eb8a2a1	x86: Fix compilation bug in kprobes' twobyte_is_boostable When compiling an i386_defconfig kernel with gcc-4.6.1-9.fc15.i686, I noticed a warning about the asm operand for test_bit in kprobes' can_boost. I discovered that this caused only the first long of twobyte_is_boostable[] to be output. Jakub filed and fixed gcc PR50571 to correct the warning and this output issue. But to solve it for less current gcc, we can make kprobes' twobyte_is_boostable[] non-const, and it won't be optimized out. Before: CC arch/x86/kernel/kprobes.o In file included from include/linux/bitops.h:22:0, from include/linux/kernel.h:17, from [...]/arch/x86/include/asm/percpu.h:44, from [...]/arch/x86/include/asm/current.h:5, from [...]/arch/x86/include/asm/processor.h:15, from [...]/arch/x86/include/asm/atomic.h:6, from include/linux/atomic.h:4, from include/linux/mutex.h:18, from include/linux/notifier.h:13, from include/linux/kprobes.h:34, from arch/x86/kernel/kprobes.c:43: [...]/arch/x86/include/asm/bitops.h: In function ‘can_boost.part.1’: [...]/arch/x86/include/asm/bitops.h:319:2: warning: use of memory input without lvalue in asm operand 1 is deprecated [enabled by default] $ objdump -rd arch/x86/kernel/kprobes.o \| grep -A1 -w bt 551: 0f a3 05 00 00 00 00 bt %eax,0x0 554: R_386_32 .rodata.cst4 $ objdump -s -j .rodata.cst4 -j .data arch/x86/kernel/kprobes.o arch/x86/kernel/kprobes.o: file format elf32-i386 Contents of section .data: 0000 48000000 00000000 00000000 00000000 H............... Contents of section .rodata.cst4: 0000 4c030000 L... Only a single long of twobyte_is_boostable[] is in the object file. After, without the const on twobyte_is_boostable: $ objdump -rd arch/x86/kernel/kprobes.o \| grep -A1 -w bt 551: 0f a3 05 20 00 00 00 bt %eax,0x20 554: R_386_32 .data $ objdump -s -j .rodata.cst4 -j .data arch/x86/kernel/kprobes.o arch/x86/kernel/kprobes.o: file format elf32-i386 Contents of section .data: 0000 48000000 00000000 00000000 00000000 H............... 0010 00000000 00000000 00000000 00000000 ................ 0020 4c030000 0f000200 ffff0000 ffcff0c0 L............... 0030 0000ffff 3bbbfff8 03ff2ebb 26bb2e77 ....;.......&..w Now all 32 bytes are output into .data instead. Signed-off-by: Josh Stone <jistone@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-10-25 09:00:14 +02:00
Borislav Petkov	bcb80e5387	x86, microcode, AMD: Add microcode revision to /proc/cpuinfo Enable microcode revision output for AMD after `506ed6b53e` ("x86, intel: Output microcode revision in /proc/cpuinfo") did it for Intel. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-10-19 16:07:30 +02:00
Borislav Petkov	881e23e567	x86, microcode: Correct microcode revision format `506ed6b53e` ("x86, intel: Output microcode revision in /proc/cpuinfo") added microcode revision format to /proc/cpuinfo and the MCE handler in decimal format but both AMD and Intel patch levels are handled as hex numbers. Fix it. Acked-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-10-19 15:47:48 +02:00
Josh Stone	db45bd90be	x86, perf, kprobes: Make kprobes's twobyte_is_boostable volatile When compiling an i386_defconfig kernel with gcc-4.6.1-9.fc15.i686, I noticed a warning about the asm operand for test_bit in kprobes' can_boost. I discovered that this caused only the first long of twobyte_is_boostable[] to be output. Jakub filed and fixed gcc PR50571 to correct the warning and this output issue. But to solve it for less current gcc, we can make kprobes' twobyte_is_boostable[] volatile, and it won't be optimized out. Before: CC arch/x86/kernel/kprobes.o In file included from include/linux/bitops.h:22:0, from include/linux/kernel.h:17, from [...]/arch/x86/include/asm/percpu.h:44, from [...]/arch/x86/include/asm/current.h:5, from [...]/arch/x86/include/asm/processor.h:15, from [...]/arch/x86/include/asm/atomic.h:6, from include/linux/atomic.h:4, from include/linux/mutex.h:18, from include/linux/notifier.h:13, from include/linux/kprobes.h:34, from arch/x86/kernel/kprobes.c:43: [...]/arch/x86/include/asm/bitops.h: In function ‘can_boost.part.1’: [...]/arch/x86/include/asm/bitops.h:319:2: warning: use of memory input without lvalue in asm operand 1 is deprecated [enabled by default] $ objdump -rd arch/x86/kernel/kprobes.o \| grep -A1 -w bt 551: 0f a3 05 00 00 00 00 bt %eax,0x0 554: R_386_32 .rodata.cst4 $ objdump -s -j .rodata.cst4 -j .data arch/x86/kernel/kprobes.o arch/x86/kernel/kprobes.o: file format elf32-i386 Contents of section .data: 0000 48000000 00000000 00000000 00000000 H............... Contents of section .rodata.cst4: 0000 4c030000 L... Only a single long of twobyte_is_boostable[] is in the object file. After, with volatile: $ objdump -rd arch/x86/kernel/kprobes.o \| grep -A1 -w bt 551: 0f a3 05 20 00 00 00 bt %eax,0x20 554: R_386_32 .data $ objdump -s -j .rodata.cst4 -j .data arch/x86/kernel/kprobes.o arch/x86/kernel/kprobes.o: file format elf32-i386 Contents of section .data: 0000 48000000 00000000 00000000 00000000 H............... 0010 00000000 00000000 00000000 00000000 ................ 0020 4c030000 0f000200 ffff0000 ffcff0c0 L............... 0030 0000ffff 3bbbfff8 03ff2ebb 26bb2e77 ....;.......&..w Now all 32 bytes are output into .data instead. Signed-off-by: Josh Stone <jistone@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Jakub Jelinek <jakub@redhat.com> Link: http://lkml.kernel.org/r/1318899645-4068-1-git-send-email-jistone@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-18 08:43:08 +02:00
Andi Kleen	30963c0ac7	x86, intel: Use c->microcode for Atom errata check Now that the cpu update level is available the Atom PSE errata check can use it directly without reading the MSR again. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1318466795-7393-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-14 13:16:38 +02:00
Andi Kleen	506ed6b53e	x86, intel: Output microcode revision in /proc/cpuinfo I got a request to make it easier to determine the microcode update level on Intel CPUs. This patch adds a new "microcode" field to /proc/cpuinfo. The microcode level is also outputed on fatal machine checks together with the other CPUID model information. I removed the respective code from the microcode update driver, it just reads the field from cpu_data. Also when the microcode is updated it fills in the new values too. I had to add a memory barrier to native_cpuid to prevent it being optimized away when the result is not used. This turns out to clean up further code which already got this information manually. This is done in followon patches. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1318466795-7393-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-14 13:16:35 +02:00
Srivatsa S. Bhat	70989449da	x86, microcode: Don't request microcode from userspace unnecessarily Requesting the microcode from userspace every time when onlining CPUs (during a CPU hotplug operation) is unnecessary. Thus, ensure that once the kernel gets the microcode after booting, it is not freed nor invalidated when a CPU goes offline, so that it can be reused when that CPU comes back online, without requesting userspace for it again. As a result, the CPU hotplug operations become faster as well. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/r/4E91F908.5010006@linux.vnet.ibm.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-10-13 16:20:35 +02:00
Yinghai Lu	141d55e6cc	x86/irq: Standardize on CONFIG_SPARSE_IRQ=y Sparseirq got introduced in v2.6.28 and Thomas did a huge cleanup around v2.6.38 that eliminated basically all disadvantages of it. So we can remove non-sparseirq support now and simplify our IRQ degrees of freedom a bit. Suggested-and-acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/4E95E21D.6090200@oracle.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-13 12:12:12 +02:00
Ingo Molnar	910e94dd0c	Merge branch 'tip/perf/core' of git://github.com/rostedt/linux into perf/core	2011-10-12 17:14:47 +02:00
Yinghai Lu	6f50d45fae	x86, ioapic: Clean up ioapic/apic_id usage While looking at the code, apic_id sometime is referred to index of ioapic, but sometime is used for phys apic id. and some even use apic for real apic id. It is very confusing. So try to limit apic_id or ioapic_id to be real apic id for ioapic, and use ioapic_idx for ioapic index in the array. -v2: Suggested by Ingo, use ioapic_idx consistently, instead of ioapic Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/4E9542DC.3090509@oracle.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-12 09:55:28 +02:00
Yinghai Lu	cda417dd87	x86, ioapic: Factor out print_IO_APIC() to only print one io apic It is getting too big after the interrupt remaping entries debug print out was added. Original print_IO_APIC() becomes print_IO_APICs(). New print_IO_APIC() will only print one ioapic's registers As a side-effect this clean-up also made checkpatch.pl happier. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/4E9542D3.5000008@oracle.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-12 09:55:25 +02:00
Yinghai Lu	3a61d7feca	x86, ioapic: Print out irte with right ioapic index While checking irte dump in dmesg, the print out is confusing ioapic index with real io apic id: IOAPIC[0]: Set routing entry (1-1 -> 0x31 -> IRQ 1 Mode:0 Active:0 Dest:1) IOAPIC[1]: Set IRTE entry (P:1 FPD:0 Dst_Mode:1 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:1 Avail:0 Vector:31 Dest:00000001 SID:00FF SQ:0 SVT:1) IOAPIC[0]: Set routing entry (1-2 -> 0x30 -> IRQ 0 Mode:0 Active:0 Dest:1) IOAPIC[1]: Set IRTE entry (P:1 FPD:0 Dst_Mode:1 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:1 Avail:0 Vector:30 Dest:00000001 SID:00FF SQ:0 SVT:1) The system's first ioapic id is 1. This commit: \| commit `3040db92ee` \| Author: Naga Chumbalkar <nagananda.chumbalkar@hp.com> \| Date: Tue Jul 12 21:17:41 2011 +0000 \| \| x86, ioapic: Print IRTE when IR is enabled Confused apic_id with the ioapic ID - fix it. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/4E9542C8.8040209@oracle.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-12 09:55:21 +02:00
Yinghai Lu	c5b4712c3f	x86, ioapic: Split up setup_ioapic_entry() Ingo pointed out that setup_ioapic_entry() is way too big now. Split the intr-remap code out into setup_ir_ioapic_entry(). Also pass struct io_apic_irq_attr * instead of 5 parameters in those two functions. At last in setup_ir_ioapic_entry() we don't need to panic. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/4E9542BB.4070807@oracle.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-12 09:55:18 +02:00
Yinghai Lu	e4aff81182	x86, ioapic: Pass struct irq_attr * to setup_ioapic_irq() Do not expand that struct, and just pass pointer to reduce the number of parameters in related functions. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/4E9542B1.7050800@oracle.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-12 09:55:15 +02:00
Adrian Bunk	2b666859ec	x86: Default to vsyscall=native for now This UML breakage: linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 Is caused by commit `3ae36655` ("x86-64: Rework vsyscall emulation and add vsyscall= parameter") - the vsyscall emulation code is not fully cooked yet as UML relies on some rather fragile SIGSEGV semantics. Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to vsyscall=native for now, this patch implements that. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Andrew Lutomirski <luto@mit.edu> Cc: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/20111005214047.GE14406@localhost.pp.htv.fi Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-11 08:23:34 +02:00
Ingo Molnar	d48b0e1737	x86, nmi, drivers: Fix nmi splitup build bug nmi.c needs an #include <linux/mca.h>: arch/x86/kernel/nmi.c: In function ‘unknown_nmi_error’: arch/x86/kernel/nmi.c:286:6: error: ‘MCA_bus’ undeclared (first use in this function) arch/x86/kernel/nmi.c:286:6: note: each undeclared identifier is reported only once for each function it appears in Another one is the hpwdt driver: drivers/watchdog/hpwdt.c:507:9: error: ‘NMI_DONE’ undeclared (first use in this function) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:57:21 +02:00
Robert Richter	b716916679	perf, x86: Implement IBS initialization This patch implements IBS feature detection and initialzation. The code is shared between perf and oprofile. If IBS is available on the system for perf, a pmu is setup. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1316597423-25723-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:57:16 +02:00
Robert Richter	ee5789dbcc	perf, x86: Share IBS macros between perf and oprofile Moving IBS macros from oprofile to <asm/perf_event.h> to make it available to perf. No additional changes. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1316597423-25723-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:57:11 +02:00
Don Zickus	efc3aac5f3	x86, nmi: Track NMI usage stats Now that the NMI handler are broken into lists, increment the appropriate stats for each list. This allows us to see what is going on when they get printed out in the next patch. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317409584-23662-6-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:57:06 +02:00
Don Zickus	b227e23399	x86, nmi: Add in logic to handle multiple events and unknown NMIs Previous patches allow the NMI subsystem to process multipe NMI events in one NMI. As previously discussed this can cause issues when an event triggered another NMI but is processed in the current NMI. This causes the next NMI to go unprocessed and become an 'unknown' NMI. To handle this, we first have to flag whether or not the NMI handler handled more than one event or not. If it did, then there exists a chance that the next NMI might be already processed. Once the NMI is flagged as a candidate to be swallowed, we next look for a back-to-back NMI condition. This is determined by looking at the %rip from pt_regs. If it is the same as the previous NMI, it is assumed the cpu did not have a chance to jump back into a non-NMI context and execute code and instead handled another NMI. If both of those conditions are true then we will swallow any unknown NMI. There still exists a chance that we accidentally swallow a real unknown NMI, but for now things seem better. An optimization has also been added to the nmi notifier rountine. Because x86 can latch up to one NMI while currently processing an NMI, we don't have to worry about executing _all_ the handlers in a standalone NMI. The idea is if multiple NMIs come in, the second NMI will represent them. For those back-to-back NMI cases, we have the potentail to drop NMIs. Therefore only execute all the handlers in the second half of a detected back-to-back NMI. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317409584-23662-5-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:57:01 +02:00
Don Zickus	9c48f1c629	x86, nmi: Wire up NMI handlers to new routines Just convert all the files that have an nmi handler to the new routines. Most of it is straight forward conversion. A couple of places needed some tweaking like kgdb which separates the debug notifier from the nmi handler and mce removes a call to notify_die. [Thanks to Ying for finding out the history behind that mce call https://lkml.org/lkml/2010/5/27/114 And Boris responding that he would like to remove that call because of it https://lkml.org/lkml/2011/9/21/163] The things that get converted are the registeration/unregistration routines and the nmi handler itself has its args changed along with code removal to check which list it is on (most are on one NMI list except for kgdb which has both an NMI routine and an NMI Unknown routine). Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Corey Minyard <minyard@acm.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Corey Minyard <minyard@acm.org> Cc: Jack Steiner <steiner@sgi.com> Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:56:57 +02:00
Don Zickus	c9126b2ee8	x86, nmi: Create new NMI handler routines The NMI handlers used to rely on the notifier infrastructure. This worked great until we wanted to support handling multiple events better. One of the key ideas to the nmi handling is to process _all_ the handlers for each NMI. The reason behind this switch is because NMIs are edge triggered. If enough NMIs are triggered, then they could be lost because the cpu can only latch at most one NMI (besides the one currently being processed). In order to deal with this we have decided to process all the NMI handlers for each NMI. This allows the handlers to determine if they recieved an event or not (the ones that can not determine this will be left to fend for themselves on the unknown NMI list). As a result of this change it is now possible to have an extra NMI that was destined to be received for an already processed event. Because the event was processed in the previous NMI, this NMI gets dropped and becomes an 'unknown' NMI. This of course will cause printks that scare people. However, we prefer to have extra NMIs as opposed to losing NMIs and as such are have developed a basic mechanism to catch most of them. That will be a later patch. To accomplish this idea, I unhooked the nmi handlers from the notifier routines and created a new mechanism loosely based on doIRQ. The reason for this is the notifier routines have a couple of shortcomings. One we could't guarantee all future NMI handlers used NOTIFY_OK instead of NOTIFY_STOP. Second, we couldn't keep track of the number of events being handled in each routine (most only handle one, perf can handle more than one). Third, I wanted to eventually display which nmi handlers are registered in the system in /proc/interrupts to help see who is generating NMIs. The patch below just implements the new infrastructure but doesn't wire it up yet (that is the next patch). Its design is based on doIRQ structs and the atomic notifier routines. So the rcu stuff in the patch isn't entirely untested (as the notifier routines have soaked it) but it should be double checked in case I copied the code wrong. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:56:52 +02:00
Don Zickus	1d48922c14	x86, nmi: Split out nmi from traps.c The nmi stuff is changing a lot and adding more functionality. Split it out from the traps.c file so it doesn't continue to pollute that file. This makes it easier to find and expand all the future nmi related work. No real functional changes here. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317409584-23662-2-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:56:47 +02:00
Gleb Natapov	144d31e6f1	perf, intel: Use GO/HO bits in perf-ctr Intel does not have guest/host-only bit in perf counters like AMD does. To support GO/HO bits KVM needs to switch EVENTSELn values (or PERF_GLOBAL_CTRL if available) at a guest entry. If a counter is configured to count only in a guest mode it stays disabled in a host, but VMX is configured to switch it to enabled value during guest entry. This patch adds GO/HO tracking to Intel perf code and provides interface for KVM to get a list of MSRs that need to be switched on a guest entry. Only cpus with architectural PMU (v1 or later) are supported with this patch. To my knowledge there is not p6 models with VMX but without architectural PMU and p4 with VMX are rare and the interface is general enough to support them if need arise. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317816084-18026-7-git-send-email-gleb@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-10 06:56:42 +02:00
Joerg Roedel	011af85784	perf, amd: Use GO/HO bits in perf-ctr The AMD perf-counters support counting in guest or host-mode only. Make use of that feature when user-space specified guest/host-mode only counting. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317816084-18026-3-git-send-email-gleb@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-06 13:00:31 +02:00
Ingo Molnar	7b4f86ac05	Merge branch 'ras' of git://amd64.org/linux/bp into perf/core	2011-10-06 12:54:36 +02:00
Ingo Molnar	9d01402023	Merge commit 'v3.1-rc9' into perf/core Merge reason: pick up latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-06 12:49:21 +02:00
Jan Beulich	eab9e6137f	x86-64: Fix CFI data for interrupt frames The patch titled "x86: Don't use frame pointer to save old stack on irq entry" did not properly adjust CFI directives, so this patch is a follow-up to that one. With the old stack pointer no longer stored in a callee-saved register (plus some offset), we now have to use a CFA expression to describe the memory location where it is being found. This requires the use of .cfi_escape (allowing arbitrary byte streams to be emitted into .eh_frame), as there is no .cfi_def_cfa_expression (which also cannot reasonably be expected, as it would require a full expression parser). Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/4E8360200200007800058467@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-28 19:04:52 +02:00
Jan Beulich	838312be46	apic, i386/bigsmp: Fix false warnings regarding logical APIC ID mismatches These warnings (generally one per CPU) are a result of initializing x86_cpu_to_logical_apicid while apic_default is still in use, but the check in setup_local_APIC() being done when apic_bigsmp was already used as an override in default_setup_apic_routing(): Overriding APIC driver with bigsmp Enabling APIC mode: Physflat. Using 5 I/O APICs ------------[ cut here ]------------ WARNING: at .../arch/x86/kernel/apic/apic.c:1239 ... CPU 1 irqstacks, hard=f1c9a000 soft=f1c9c000 Booting Node 0, Processors #1 smpboot cpu 1: start_ip = 9e000 Initializing CPU#1 ------------[ cut here ]------------ WARNING: at .../arch/x86/kernel/apic/apic.c:1239 setup_local_APIC+0x137/0x46b() Hardware name: ... CPU1 logical APIC ID: 2 != 8 ... Fix this (for the time being, i.e. until x86_32_early_logical_apicid() will get removed again, as Tejun says ought to be possible) by overriding the previously stored values at the point where the APIC driver gets overridden. v2: Move this and the pre-existing override logic into arch/x86/kernel/apic/bigsmp_32.c. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: <stable@kernel.org> (2.6.39 and onwards) Link: http://lkml.kernel.org/r/4E835D16020000780005844C@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-28 19:01:53 +02:00
Stephen Rothwell	910b2c5122	x86, amd: Include linux/elf.h since we use stuff from asm/elf.h After merging the moduleh tree, today's linux-next build (x86_64 allmodconfig) failed like this: arch/x86/kernel/sys_x86_64.c:28:10: warning: 'enum align_flags' declared inside parameter list arch/x86/kernel/sys_x86_64.c:28:10: warning: its scope is only this definition or declaration, which is probably not what you want arch/x86/kernel/sys_x86_64.c:28:22: error: parameter 3 ('flags') has incomplete type [...] Presumably caused by the module.h split interacting with a new commit `dfb09f9b7a` ("x86, amd: Avoid cache aliasing penalties on AMD family 15h") from the x8 tree. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/20110928174214.17a58be15d84d67c185930e1@canb.auug.org.au Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-28 10:34:31 +02:00
Randy Dunlap	d6eed550a9	x86: Perf_event_amd.c needs <asm/apicdef.h> Fix (rare) build error by adding <asm/apicdef.h> header file: arch/x86/kernel/cpu/perf_event_amd.c:350:2: error: 'BAD_APICID' undeclared (first use in this function) Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Robert Richter <robert.richter@amd.com> Cc: Andre Przywara <andre.przywara@amd.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/4E820138.90301@xenotime.net Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-27 19:55:09 +02:00
Paul Bolle	395cf9691d	doc: fix broken references There are numerous broken references to Documentation files (in other Documentation files, in comments, etc.). These broken references are caused by typo's in the references, and by renames or removals of the Documentation files. Some broken references are simply odd. Fix these broken references, sometimes by dropping the irrelevant text they were part of. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-09-27 18:08:04 +02:00
Kevin Winchester	de0428a7ad	x86, perf: Clean up perf_event cpu code The CPU support for perf events on x86 was implemented via included C files with #ifdefs. Clean this up by creating a new header file and compiling the vendor-specific files as needed. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1314747665-2090-1-git-send-email-kjwinchester@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-26 12:58:00 +02:00
Ingo Molnar	ed3982cf37	Merge commit 'v3.1-rc7' into perf/core Merge reason: Pick up the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-26 12:54:28 +02:00
Matt Fleming	47997d756a	x86/rtc: Don't recursively acquire rtc_lock A deadlock was introduced on x86 in commit `ef68c8f87e` ("x86: Serialize EFI time accesses on rtc_lock") because efi_get_time() and friends can be called with rtc_lock already held by read_persistent_time(), e.g.: timekeeping_init() read_persistent_clock() <-- acquire rtc_lock efi_get_time() phys_efi_get_time() <-- acquire rtc_lock <DEADLOCK> To fix this let's push the locking down into the get_wallclock() and set_wallclock() implementations. Only the clock implementations that access the x86 RTC directly need to acquire rtc_lock, so it makes sense to push the locking down into the rtc, vrtc and efi code. The virtualization implementations don't require rtc_lock to be held because they provide their own serialization. Signed-off-by: Matt Fleming <matt.fleming@intel.com> Acked-by: Jan Beulich <jbeulich@novell.com> Acked-by: Avi Kivity <avi@redhat.com> [for the virtualization aspect] Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Josh Boyer <jwboyer@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 16:16:09 +02:00
Jack Steiner	6a469e4665	x86: uv2: Workaround for UV2 Hub bug (system global address format) This is a workaround for a UV2 hub bug that affects the format of system global addresses. The GRU API for UV2 was inadvertently broken by a hardware change. The format of the physical address used for TLB dropins and for addresses used with instructions running in unmapped mode has changed. This change was not documented and became apparent only when diags failed running on system simulators. For UV1, TLB and GRU instruction physical addresses are identical to socket physical addresses (although high NASID bits must be OR'ed into the address). For UV2, socket physical addresses need to be converted. The NODE portion of the physical address needs to be shifted so that the low bit is in bit 39 or bit 40, depending on an MMR value. It is not yet clear if this bug will be fixed in a silicon respin. If it is fixed, the hub revision will be incremented & the workaround disabled. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-21 11:23:15 +02:00
Suresh Siddha	c020570138	x86, ioapic: Consolidate the explicit EOI code Consolidate the io-apic EOI code in clear_IO_APIC_pin() and eoi_ioapic_irq(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Rafael Wysocki <rjw@novell.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: lchiquitto@novell.com Cc: jbeulich@novell.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/20110825190657.259696697@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:26:28 +02:00
Suresh Siddha	e57253a81d	x86, ioapic: Restore the mask bit correctly in eoi_ioapic_irq() For older IO-APIC's, we were clearing the remote-IRR by changing the RTE trigger mode to edge and then back to level. We wanted to mask the RTE during this process, so we were essentially doing mask+edge and then to unmask+level. As part of the commit `ca64c47cec`, we moved this EOI process earlier where the IO-APIC RTE is masked. So we were wrongly unmasking it in the eoi_ioapic_irq(). So change the remote-IRR clear sequence in eoi_ioapic_irq() to mask + edge and then restore the previous RTE entry which will restore the mask status as well as the level trigger. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Thomas Renninger <trenn@suse.de> Cc: Rafael Wysocki <rjw@novell.com> Cc: lchiquitto@novell.com Cc: jbeulich@novell.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/20110825190657.210286410@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:26:26 +02:00
Suresh Siddha	1e75b31d63	x86, kdump, ioapic: Reset remote-IRR in clear_IO_APIC In the kdump scenario mentioned below, we can have a case where the device using level triggered interrupt will not generate any interrupts in the kdump kernel. 1. IO-APIC sends a level triggered interrupt to the CPU's local APIC. 2. Kernel crashed before the CPU services this interrupt, leaving the remote-IRR in the IO-APIC set. 3. kdump kernel boot sequence does clear_IO_APIC() as part of IO-APIC initialization. But this fails to reset remote-IRR bit of the IO-APIC RTE as the remote-IRR bit is read-only. 4. Device using that level triggered entry can't generate any more interrupts because of the remote-IRR bit. In clear_IO_APIC_pin(), check if the remote-IRR bit is set and if so do an explicit attempt to clear it (by doing EOI write on modern io-apic's and changing trigger mode to edge/level on older io-apic's). Also before doing the explicit EOI to the io-apic, ensure that the trigger mode is indeed set to level. This will enable the explicit EOI to the io-apic to reset the remote-IRR bit. Tested-by: Leonardo Chiquitto <lchiquitto@novell.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Fixes: https://bugzilla.novell.com/show_bug.cgi?id=701686 Cc: Rafael Wysocki <rjw@novell.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Thomas Renninger <trenn@suse.de> Cc: jbeulich@novell.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/20110825190657.157502602@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:26:25 +02:00
Suresh Siddha	d3f138106b	iommu: Rename the DMAR and INTR_REMAP config options Change the CONFIG_DMAR to CONFIG_INTEL_IOMMU to be consistent with the other IOMMU options. Rename the CONFIG_INTR_REMAP to CONFIG_IRQ_REMAP to match the irq subsystem name. And define the CONFIG_DMAR_TABLE for the common ACPI DMAR routines shared by both CONFIG_INTEL_IOMMU and CONFIG_IRQ_REMAP. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: yinghai@kernel.org Cc: youquan.song@intel.com Cc: joerg.roedel@amd.com Cc: tony.luck@intel.com Cc: dwmw2@infradead.org Link: http://lkml.kernel.org/r/20110824001456.558630224@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:22:03 +02:00
Suresh Siddha	c39d77ffa2	x86, ioapic: Define irq_remap_modify_chip_defaults() Define irq_remap_modify_chip_defaults() and remove the duplicate code, cleanup the unnecessary ifdefs. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: yinghai@kernel.org Cc: youquan.song@intel.com Cc: joerg.roedel@amd.com Cc: tony.luck@intel.com Cc: dwmw2@infradead.org Link: http://lkml.kernel.org/r/20110824001456.499225692@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:22:01 +02:00
Suresh Siddha	13ea20f7a2	x86, msi, intr-remap: Use the ioapic set affinity routine IRQ set affinity routine is same for the IO-APIC IRQ's aswell as the MSI IRQ's in the presence of interrupt-remapping. This is because we modify the interrupt-remapping table entry and doesn't touch the IO-APIC RTE or the MSI entry. So remove the ir_msi_set_affinity() and re-use the ir_ioapic_set_affinity() Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: yinghai@kernel.org Cc: youquan.song@intel.com Cc: joerg.roedel@amd.com Cc: tony.luck@intel.com Cc: dwmw2@infradead.org Link: http://lkml.kernel.org/r/20110824001456.452760446@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:21:59 +02:00
Suresh Siddha	41750d31fc	x86, x2apic: Enable the bios request for x2apic optout On the platforms which are x2apic and interrupt-remapping capable, Linux kernel is enabling x2apic even if the BIOS doesn't. This is to take advantage of the features that x2apic brings in. Some of the OEM platforms are running into issues because of this, as their bios is not x2apic aware. For example, this was resulting in interrupt migration issues on one of the platforms. Also if the BIOS SMI handling uses APIC interface to send SMI's, then the BIOS need to be aware of x2apic mode that OS has enabled. On some of these platforms, BIOS doesn't have a HW mechanism to turnoff the x2apic feature to prevent OS from enabling it. To resolve this mess, recent changes to the VT-d2 specification: http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf includes a mechanism that provides BIOS a way to request system software to opt out of enabling x2apic mode. Look at the x2apic optout flag in the DMAR tables before enabling the x2apic mode in the platform. Also print a warning that we have disabled x2apic based on the BIOS request. Kernel boot parameter "intremap=no_x2apic_optout" can be used to override the BIOS x2apic optout request. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: yinghai@kernel.org Cc: joerg.roedel@amd.com Cc: tony.luck@intel.com Cc: dwmw2@infradead.org Link: http://lkml.kernel.org/r/20110824001456.171766616@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-21 10:21:50 +02:00
Jiri Kosina	e060c38434	Merge branch 'master' into for-next Fast-forward merge with Linus to be able to merge patches based on more recent version of the tree.	2011-09-15 15:08:18 +02:00
Kamalesh Babulal	ea70ef3d9d	sched: x86_32 Fix typo in switch_to() description This patch fixes the typo in parameters passed to x86_32 switch_to() description. Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-09-15 14:13:48 +02:00
Hidetoshi Seto	9aaef96f61	x86, mce: Do not call del_timer_sync() in IRQ context del_timer_sync() can cause a deadlock when called in interrupt context. It is used with on_each_cpu() in some parts for sysfs files like bank*, check_interval, cmci_disabled and ignore_ce. However, use of on_each_cpu() results in calling the function passed as the argument in interrupt context. This causes a flood of nested warnings from del_timer_sync() (it runs on each CPU) caused even by a simple file access like: $ echo 300 > /sys/devices/system/machinecheck/machinecheck0/check_interval Fortunately, these MCE-specific files are rarely used and AFAIK only few MCE geeks experience this warning. To remove the warning, move timer deletion outside of the interrupt context. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-09-14 15:50:15 +02:00
Thomas Gleixner	59d958d2c7	locking, x86: mce: Annotate cmci_discover_lock as raw The cmci_discover_lock can be taken in atomic context (cpu bring up sequence) and therefore cannot be preempted on -rt. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-13 11:12:09 +02:00
Frank Arnold	77e75fc764	x86: cache_info: Update calculation of AMD L3 cache indices L3 subcaches 0 and 1 of AMD Family 15h CPUs can have a size of 2MB. Update the calculation routine for the number of L3 indices to reflect that. Signed-off-by: Frank Arnold <frank.arnold@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Rosenfeld Hans <Hans.Rosenfeld@amd.com> Cc: Herrmann3 Andreas <Andreas.Herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Cc: Frank Arnold <Frank.Arnold@amd.com> Link: http://lkml.kernel.org/r/20110726170449.GB32536@aftab Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-12 19:29:03 +02:00
Thomas Gleixner	d2946041ff	x86: cache_info: Kill the atomic allocation in amd_init_l3_cache() It's not a good reason to allocate memory in the smp function call just because someone thought it's the most conveniant place. The AMD L3 data is coupled to the northbridge info by a pointer to the corresponding north bridge data. So allocating it with the northbridge data and referencing the northbridge in the cache_info code instead uses less memory and gets rid of that atomic allocation hack in the smp function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20110723212626.688229918@linutronix.de Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-12 19:28:37 +02:00
Thomas Gleixner	b7d11a768b	x86: cache_info: Kill the moronic shadow struct Commit `f9b90566c` ("x86: reduce stack usage in init_intel_cacheinfo") introduced a shadow structure to reduce the stack usage on large machines instead of making the smaller structure embedded into the large one. That's definitely a candidate for the bad taste award. Move the small struct into the large one and get rid of the ugly type casts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20110723212626.625651773@linutronix.de Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-12 19:28:33 +02:00
Thomas Gleixner	05b217b021	x86: cache_info: Remove bogus free of amd_l3_cache data free_cache_attributes() kfree's: per_cpu(ici_cpuid4_info, cpu)->l3 which is a pointer to memory which was allocated as a block in amd_init_l3_cache(). l3 of a particular cpu points to a part of this memory blob. The part and the rest of the blob are still referenced by other cpus. As far as I can tell from the git history this is a leftover from the conversion from per cpu to node data with commit ba06edb63(x86, cacheinfo: Make L3 cache info per node) and the following commit f658bcfb2(x86, cacheinfo: Cleanup L3 cache index disable support) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20110723212626.550539989@linutronix.de Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-09-12 19:28:29 +02:00
K. Y. Srinivasan	6f4151c89b	x86: Hyper-V: Integrate the clocksource with Hyper-V detection code The Hyper-V clocksource driver is best integrated with Hyper-V detection code since: (a) Linux guests running on Hyper-V require it (b) Integration into that code significanly reduces code size Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Cc: gregkh@suse.de Cc: devel@linuxdriverproject.org Cc: virtualization@lists.osdl.org Link: http://lkml.kernel.org/r/1315434310-4827-1-git-send-email-kys@microsoft.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 10:33:59 +02:00
Andrey Vagin	20afc60f89	x86, perf: Check that current->mm is alive before getting user callchain An event may occur when an mm is already released. I added an event in dequeue_entity() and caught a panic with the following backtrace: [ 434.421110] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050 [ 434.421258] IP: [<ffffffff810464ac>] __get_user_pages_fast+0x9c/0x120 ... [ 434.421258] Call Trace: [ 434.421258] [<ffffffff8101ae81>] copy_from_user_nmi+0x51/0xf0 [ 434.421258] [<ffffffff8109a0d5>] ? sched_clock_local+0x25/0x90 [ 434.421258] [<ffffffff8101b048>] perf_callchain_user+0x128/0x170 [ 434.421258] [<ffffffff811154cd>] ? __perf_event_header__init_id+0xed/0x100 [ 434.421258] [<ffffffff81116690>] perf_prepare_sample+0x200/0x280 [ 434.421258] [<ffffffff81118da8>] __perf_event_overflow+0x1b8/0x290 [ 434.421258] [<ffffffff81065240>] ? tg_shares_up+0x0/0x670 [ 434.421258] [<ffffffff8104fe1a>] ? walk_tg_tree+0x6a/0xb0 [ 434.421258] [<ffffffff81118f44>] perf_swevent_overflow+0xc4/0xf0 [ 434.421258] [<ffffffff81119150>] do_perf_sw_event+0x1e0/0x250 [ 434.421258] [<ffffffff81119204>] perf_tp_event+0x44/0x70 [ 434.421258] [<ffffffff8105701f>] ftrace_profile_sched_block+0xdf/0x110 [ 434.421258] [<ffffffff8106121d>] dequeue_entity+0x2ad/0x2d0 [ 434.421258] [<ffffffff810614ec>] dequeue_task_fair+0x1c/0x60 [ 434.421258] [<ffffffff8105818a>] dequeue_task+0x9a/0xb0 [ 434.421258] [<ffffffff810581e2>] deactivate_task+0x42/0xe0 [ 434.421258] [<ffffffff814bc019>] thread_return+0x191/0x808 [ 434.421258] [<ffffffff81098a44>] ? switch_task_namespaces+0x24/0x60 [ 434.421258] [<ffffffff8106f4c4>] do_exit+0x464/0x910 [ 434.421258] [<ffffffff8106f9c8>] do_group_exit+0x58/0xd0 [ 434.421258] [<ffffffff8106fa57>] sys_exit_group+0x17/0x20 [ 434.421258] [<ffffffff8100b202>] system_call_fastpath+0x16/0x1b Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1314693156-24131-1-git-send-email-avagin@openvz.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-31 15:56:31 +02:00
NeilBrown	f5b9409973	All Arch: remove linkage for sys_nfsservctl system call The nfsservctl system call is now gone, so we should remove all linkage for it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-26 15:09:58 -07:00
Andy Lutomirski	b4ca46e4e8	x86-32: Fix boot with CONFIG_X86_INVD_BUG entry_32.S contained a hardcoded alternative instruction entry, and the format changed in commit `59e97e4d6f` ("x86: Make alternative instruction pointers relative"). Replace the hardcoded entry with the altinstruction_entry macro. This fixes the 32-bit boot with CONFIG_X86_INVD_BUG=y. Reported-and-tested-by: Arnaud Lacombe <lacombar@gmail.com> Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-25 13:27:14 -07:00
Tejun Heo	cbbfa38fcb	mtrr: fix UP breakage caused during switch to stop_machine While removing custom rendezvous code and switching to stop_machine, commit `192d885742` ("x86, mtrr: use stop_machine APIs for doing MTRR rendezvous") completely dropped mtrr setting code on !CONFIG_SMP breaking MTRR settting on UP. Fix it by removing the incorrect CONFIG_SMP. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Anders Eriksson <aeriksson@fastmail.fm> Tested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-25 11:02:29 -07:00
Linus Torvalds	14c62e78dc	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-32, vdso: On system call restart after SYSENTER, use int $0x80 x86, UV: Remove UV delay in starting slave cpus x86, olpc: Wait for last byte of EC command to be accepted	2011-08-23 18:09:08 -07:00
Zhao Jin	c812d8f7ad	x86, mm, trivial: Remove unnecessary get_order() in free_thread_info() Because THREAD_SIZE is defined as PAGE_SIZE << THREAD_ORDER on x86, the call of get_order(THREAD_SIZE) can be replaced with THREAD_ORDER. Signed-off-by: Zhao Jin <cronozhj@gmail.com> Link: http://lkml.kernel.org/r/4E4FB5A9.700@gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-23 21:49:35 +02:00
Kevin Winchester	2f3aa7a06f	x86: jump_label: arch_jump_label_text_poke_early: add missing __init arch_jump_label_text_poke_early calls text_poke_early, which is an __init function. Thus arch_jump_label_text_poke_early should be the same. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: jbaron@redhat.com Cc: tglx@linutronix.de Cc: mingo@redhat.com Cc: hpa@zytor.com Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1313539478-30303-1-git-send-email-kjwinchester@gmail.com [ Use __init_or_module instead of __init ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-08-19 14:35:04 -04:00
Ingo Molnar	8bc84f8731	Merge branch 'perf/urgent' into perf/core Merge reason: add the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-18 21:56:47 +02:00
Peter Zijlstra	7fdba1ca10	perf, x86: Avoid kfree() in CPU_STARTING On -rt kfree() can schedule, but CPU_STARTING is before the CPU is fully up and running. These are contradictory, so avoid it. Instead push the kfree() to CPU_ONLINE where we're free to schedule. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-kwd4j6ayld5thrscvaxgjquv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-14 11:53:02 +02:00
Linus Torvalds	06e727d2a5	Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip: x86-64: Rework vsyscall emulation and add vsyscall= parameter x86-64: Wire up getcpu syscall x86: Remove unnecessary compile flag tweaks for vsyscall code x86-64: Add vsyscall:emulate_vsyscall trace event x86-64: Add user_64bit_mode paravirt op x86-64, xen: Enable the vvar mapping x86-64: Work around gold bug 13023 x86-64: Move the "user" vsyscall segment out of the data segment. x86-64: Pad vDSO to a page boundary	2011-08-12 20:46:24 -07:00
Andy Lutomirski	3ae36655b9	x86-64: Rework vsyscall emulation and add vsyscall= parameter There are three choices: vsyscall=native: Vsyscalls are native code that issues the corresponding syscalls. vsyscall=emulate (default): Vsyscalls are emulated by instruction fault traps, tested in the bad_area path. The actual contents of the vsyscall page is the same as the vsyscall=native case except that it's marked NX. This way programs that make assumptions about what the code in the page does will not be confused when they read that code. vsyscall=none: Trying to execute a vsyscall will segfault. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/8449fb3abf89851fd6b2260972666a6f82542284.1312988155.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-10 19:26:46 -05:00
Andy Lutomirski	f3fb5b7bb7	x86: Remove unnecessary compile flag tweaks for vsyscall code As of commit `98d0ac38ca` Author: Andy Lutomirski <luto@mit.edu> Date: Thu Jul 14 06:47:22 2011 -0400 x86-64: Move vread_tsc and vread_hpet into the vDSO user code no longer directly calls into code in arch/x86/kernel/, so we don't need compile flag hacks to make it safe. All vdso code is in the vdso directory now. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/835cd05a4c7740544d09723d6ba48f4406f9826c.1312988155.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-10 18:55:29 -05:00
Stephen Rothwell	5cdd174fea	x86, amd: Include elf.h explicitly, prepare the code for the module.h split When the moduleu.h splitting tree is merged to the latest tip:x86/cpu tree, the x86_64 allmodconfig build fails like this: arch/x86/kernel/cpu/amd.c: In function 'bsp_init_amd': arch/x86/kernel/cpu/amd.c:437:3: error: 'va_align' undeclared (first use in this function) arch/x86/kernel/cpu/amd.c:438:23: error: 'ALIGN_VA_32' undeclared (first use in this function) arch/x86/kernel/cpu/amd.c:438:37: error: 'ALIGN_VA_64' undeclared (first use in this function) This is caused by the module.h split up intreacting with commit `dfb09f9b7a` ("x86, amd: Avoid cache aliasing penalties on AMD family 15h") from the tip:x86/cpu tree. I have added the following patch for today (this, or something similar, could be applied to the tip tree directly - the export.h include below was added by the module.h splitup). So include elf.h to use va_align and remove this implicit dependency on module.h doing it for us. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/20110810114956.238d66772883636e3040d29f@canb.auug.org.au Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-10 10:14:09 +02:00
Youquan Song	a34668f6be	perf, x86: Add model 45 SandyBridge support Add support to Romely-EP SandyBridge. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Anhua Xu <anhua.xu@intel.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1312264895-2010-1-git-send-email-youquan.song@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-09 11:58:06 +02:00
Borislav Petkov	9387f774d6	x86-32, amd: Move va_align definition to unbreak 32-bit build hpa reported that `dfb09f9b7a` breaks 32-bit builds with the following error message: /home/hpa/kernel/linux-tip.cpu/arch/x86/kernel/cpu/amd.c:437: undefined reference to `va_align' /home/hpa/kernel/linux-tip.cpu/arch/x86/kernel/cpu/amd.c:436: undefined reference to `va_align' This is due to the fact that va_align is a global in a 64-bit only compilation unit. Move it to mmap.c where it is visible to both subarches. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1312633899-1131-1-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2011-08-06 11:44:57 -07:00
Jack Steiner	05e33fc20e	x86, UV: Remove UV delay in starting slave cpus Delete the 10 msec delay between the INIT and SIPI when starting slave cpus. I can find no requirement for this delay. BIOS also has similar code sequences without the delay. Removing the delay reduces boot time by 40 sec. Every bit helps. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/20110805140900.GA6774@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-05 23:48:34 +02:00
Borislav Petkov	8fa8b03508	x86, amd: Move BSP code to cpu_dev helper Move code which is run once on the BSP during boot into the cpu_dev helper. [ hpa: removed bogus cpu_has -> static_cpu_has conversion ] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/20110805180409.GC26217@aftab Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-05 12:32:33 -07:00
Borislav Petkov	a110b5ec73	x86: Add a BSP cpu_dev helper Add a function ptr to struct cpu_dev which is destined to be run only once on the BSP during boot. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/20110805180116.GB26217@aftab Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-05 12:26:49 -07:00
Borislav Petkov	dfb09f9b7a	x86, amd: Avoid cache aliasing penalties on AMD family 15h This patch provides performance tuning for the "Bulldozer" CPU. With its shared instruction cache there is a chance of generating an excessive number of cache cross-invalidates when running specific workloads on the cores of a compute module. This excessive amount of cross-invalidations can be observed if cache lines backed by shared physical memory alias in bits [14:12] of their virtual addresses, as those bits are used for the index generation. This patch addresses the issue by clearing all the bits in the [14:12] slice of the file mapping's virtual address at generation time, thus forcing those bits the same for all mappings of a single shared library across processes and, in doing so, avoids instruction cache aliases. It also adds the command line option "align_va_addr=(32\|64\|on\|off)" with which virtual address alignment can be enabled for 32-bit or 64-bit x86 individually, or both, or be completely disabled. This change leaves virtual region address allocation on other families and/or vendors unaffected. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1312550110-24160-2-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-05 12:26:44 -07:00
H. Peter Anvin	13f9a3737c	Merge commit 'v3.0' into x86/cpu	2011-08-05 12:25:56 -07:00
Andy Lutomirski	c149a665ac	x86-64: Add vsyscall:emulate_vsyscall trace event Vsyscall emulation is slow, so make it easy to track down. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/cdaad7da946a80b200df16647c1700db3e1171e9.1312378163.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:53 -07:00
Andy Lutomirski	318f5a2a67	x86-64: Add user_64bit_mode paravirt op Three places in the kernel assume that the only long mode CPL 3 selector is __USER_CS. This is not true on Xen -- Xen's sysretq changes cs to the magic value 0xe033. Two of the places are corner cases, but as of "x86-64: Improve vsyscall emulation CS and RIP handling" (`c9712944b2`), vsyscalls will segfault if called with Xen's extra CS selector. This causes a panic when older init builds die. It seems impossible to make Xen use __USER_CS reliably without taking a performance hit on every system call, so this fixes the tests instead with a new paravirt op. It's a little ugly because ptrace.h can't include paravirt.h. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/f4fcb3947340d9e96ce1054a432f183f9da9db83.1312378163.git.luto@mit.edu Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:49 -07:00
Andy Lutomirski	f670bb760e	x86-64: Work around gold bug 13023 Gold has trouble assigning numbers to the location counter inside of an output section description. The bug was triggered by `9fd67b4ed0`, which consolidated all of the vsyscall sections into a single section. The workaround is IMO still nicer than the old way of doing it. This produces an apparently valid kernel image and passes my vdso tests on both GNU ld version 2.21.51.0.6-2.fc15 20110118 and GNU gold (version 2.21.51.0.6-2.fc15 20110118) 1.10 as distributed by Fedora 15. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/0b260cb806f1f9a25c00ce8377a5f035d57f557a.1312378163.git.luto@mit.edu Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:38 -07:00
Andy Lutomirski	9c40818da5	x86-64: Move the "user" vsyscall segment out of the data segment. The kernel's loader doesn't seem to care, but gold complains. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/f0716870c297242a841b949953d80c0d87bf3d3f.1312378163.git.luto@mit.edu Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-08-04 16:13:35 -07:00
H. Peter Anvin	17b0436077	Merge commit 'v3.0' into x86/vdso	2011-08-04 16:13:20 -07:00
Linus Torvalds	35e51fe82d	Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6: cpuidle: stop depending on pm_idle x86 idle: move mwait_idle_with_hints() to where it is used cpuidle: replace xen access to x86 pm_idle and default_idle cpuidle: create bootparam "cpuidle.off=1" mrst_pmu: driver for Intel Moorestown Power Management Unit	2011-08-03 21:54:15 -10:00
Len Brown	a0bfa13738	cpuidle: stop depending on pm_idle cpuidle users should call cpuidle_call_idle() directly rather than via (pm_idle)() function pointer. Architecture may choose to continue using (pm_idle)(), but cpuidle need not depend on it: my_arch_cpu_idle() ... if(cpuidle_call_idle()) pm_idle(); cc: Kevin Hilman <khilman@deeprootsystems.com> cc: Paul Mundt <lethal@linux-sh.org> cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-08-03 19:06:37 -04:00
Len Brown	4bfc8288bc	x86 idle: move mwait_idle_with_hints() to where it is used ...and make it static no functional change cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-08-03 19:06:36 -04:00
H. Peter Anvin	49d859d78c	x86, random: Verify RDRAND functionality and allow it to be disabled If the CPU declares that RDRAND is available, go through a guranteed reseed sequence, and make sure that it is actually working (producing data.) If it does not, disable the CPU feature flag. Allow RDRAND to be disabled on the command line (as opposed to at compile time) for a user who has special requirements with regards to random numbers. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Theodore Ts'o" <tytso@mit.edu>	2011-07-31 14:02:19 -07:00
Arun Sharma	60063497a9	atomic: use <linux/atomic.h> This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:47 -07:00
Linus Torvalds	d3ec4844d4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) fs: Merge split strings treewide: fix potentially dangerous trailing ';' in #defined values/expressions uwb: Fix misspelling of neighbourhood in comment net, netfilter: Remove redundant goto in ebt_ulog_packet trivial: don't touch files that are removed in the staging tree lib/vsprintf: replace link to Draft by final RFC number doc: Kconfig: `to be' -> `be' doc: Kconfig: Typo: square -> squared doc: Konfig: Documentation/power/{pm => apm-acpi}.txt drivers/net: static should be at beginning of declaration drivers/media: static should be at beginning of declaration drivers/i2c: static should be at beginning of declaration XTENSA: static should be at beginning of declaration SH: static should be at beginning of declaration MIPS: static should be at beginning of declaration ARM: static should be at beginning of declaration rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check Update my e-mail address PCIe ASPM: forcedly -> forcibly gma500: push through device driver tree ... Fix up trivial conflicts: - arch/arm/mach-ep93xx/dma-m2p.c (deleted) - drivers/gpio/gpio-ep93xx.c (renamed and context nearby) - drivers/net/r8169.c (just context changes)	2011-07-25 13:56:39 -07:00
Linus Torvalds	fcda12e7f6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: modpost: Fix modpost's license checking V3 module: add /sys/module/<name>/uevent files module: change attr callbacks to take struct module_kobject modules: make arch's use default loader hooks modules: add default loader hook implementations param: fix return value handling in param_set_*	2011-07-24 09:54:54 -07:00
Linus Torvalds	5fabc487c9	Merge branch 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits) KVM: IOMMU: Disable device assignment without interrupt remapping KVM: MMU: trace mmio page fault KVM: MMU: mmio page fault support KVM: MMU: reorganize struct kvm_shadow_walk_iterator KVM: MMU: lockless walking shadow page table KVM: MMU: do not need atomicly to set/clear spte KVM: MMU: introduce the rules to modify shadow page table KVM: MMU: abstract some functions to handle fault pfn KVM: MMU: filter out the mmio pfn from the fault pfn KVM: MMU: remove bypass_guest_pf KVM: MMU: split kvm_mmu_free_page KVM: MMU: count used shadow pages on prepareing path KVM: MMU: rename 'pt_write' to 'emulate' KVM: MMU: cleanup for FNAME(fetch) KVM: MMU: optimize to handle dirty bit KVM: MMU: cache mmio info on page fault path KVM: x86: introduce vcpu_mmio_gva_to_gpa to cleanup the code KVM: MMU: do not update slot bitmap if spte is nonpresent KVM: MMU: fix walking shadow page table KVM guest: KVM Steal time registration ...	2011-07-24 09:07:03 -07:00
Jonas Bonn	66574cc054	modules: make arch's use default loader hooks This patch removes all the module loader hook implementations in the architecture specific code where the functionality is the same as that now provided by the recently added default hooks. Signed-off-by: Jonas Bonn <jonas@southpole.se> Acked-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Michal Simek <monstr@monstr.eu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-07-24 22:06:04 +09:30
Glauber Costa	d910f5c106	KVM guest: KVM Steal time registration This patch implements the kvm bits of the steal time infrastructure. The most important part of it, is the steal time clock. It is an continuous clock that shows the accumulated amount of steal time since vcpu creation. It is supposed to survive cpu offlining/onlining. [marcelo: fix build with CONFIG_KVM_GUEST=n] Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Avi Kivity <avi@redhat.com> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-24 11:49:36 +03:00
Linus Torvalds	b4db920c7f	Merge branches 'x86-detect-hyper-for-linus', 'x86-fpu-for-linus', 'x86-kexec-for-linus', 'x86-platform-for-linus', 'x86-quirks-for-linus', 'x86-tsc-for-linus' and 'x86-smpboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-detect-hyper-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hyper: Change hypervisor detection order * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-32, fpu: Fix DNA exception during check_fpu() * 'x86-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: kexec, x86: Fix incorrect jump back address if not preserving context * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, config: Introduce an INTEL_MID configuration * 'x86-quirks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, quirks: Use pci_dev->revision * 'x86-tsc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: tsc: Remove unneeded DMI-based blacklisting * 'x86-smpboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, boot: Wait for boot cpu to show up if nr_cpus limit is about to hit	2011-07-23 10:38:21 -07:00
Linus Torvalds	9d0715630e	Merge branch 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: clocksource: apb: Share APB timer code with other platforms	2011-07-23 10:34:47 -07:00
Linus Torvalds	8e204874db	Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64, vdso: Do not allocate memory for the vDSO clocksource: Change __ARCH_HAS_CLOCKSOURCE_DATA to a CONFIG option x86, vdso: Drop now wrong comment Document the vDSO and add a reference parser ia64: Replace clocksource.fsys_mmio with generic arch data x86-64: Move vread_tsc and vread_hpet into the vDSO clocksource: Replace vread with generic arch data x86-64: Add --no-undefined to vDSO build x86-64: Allow alternative patching in the vDSO x86: Make alternative instruction pointers relative x86-64: Improve vsyscall emulation CS and RIP handling x86-64: Emulate legacy vsyscalls x86-64: Fill unused parts of the vsyscall page with 0xcc x86-64: Remove vsyscall number 3 (venosys) x86-64: Map the HPET NX x86-64: Remove kernel.vsyscall64 sysctl x86-64: Give vvars their own page x86-64: Document some of entry_64.S x86-64: Fix alignment of jiffies variable	2011-07-22 17:05:15 -07:00
Linus Torvalds	8051207959	Merge branch 'x86-signal-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-signal-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Kill handle_signal()->set_fs() x86, do_signal: Simplify the TS_RESTORE_SIGMASK logic x86, signals: Convert the X86_32 code to use set_current_blocked() x86, signals: Convert the IA32_EMULATION code to use set_current_blocked()	2011-07-22 17:04:32 -07:00
Linus Torvalds	dc43d9fa73	Merge branch 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mtrr: Use pci_dev->revision x86, mtrr: use stop_machine APIs for doing MTRR rendezvous stop_machine: implement stop_machine_from_inactive_cpu() stop_machine: reorganize stop_cpus() implementation x86, mtrr: lock stop machine during MTRR rendezvous sequence	2011-07-22 17:04:04 -07:00
Linus Torvalds	80775068db	Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, microcode, AMD: Fix section header size check x86, microcode, AMD: Correct buf references	2011-07-22 17:03:52 -07:00
Linus Torvalds	7c6582b28a	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mce: Use mce_sysdev_ prefix to group functions x86, mce: Use mce_chrdev_ prefix to group functions x86, mce: Cleanup mce_read() x86, mce: Cleanup mce_create()/remove_device() x86, mce: Check the result of ancient_init() x86, mce: Introduce mce_gather_info() x86, mce: Replace MCM_ with MCI_MISC_ x86, mce: Replace MCE_SELF_VECTOR by irq_work x86, mce, severity: Clean up trivial coding style problems x86, mce, severity: Cleanup severity table x86, mce, severity: Make formatting a bit more readable x86, mce, severity: Fix two severities table signatures	2011-07-22 17:03:40 -07:00
Linus Torvalds	35b004cce1	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, intel, power: Correct the MSR_IA32_ENERGY_PERF_BIAS message x86, msr: Fix typo in ENERGY_PERF_BIAS_POWERSAVE x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS	2011-07-22 17:02:54 -07:00
Linus Torvalds	0a613b647b	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, smpboot: Mark the names[] array in __inquire_remote_apic() as const x86: Convert vmalloc()+memset() to vzalloc()	2011-07-22 17:02:38 -07:00
Linus Torvalds	eb47418dc5	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix write lock scalability 64-bit issue x86: Unify rwsem assembly implementation x86: Unify rwlock assembly implementation x86, asm: Fix binutils 2.16 issue with __USER32_CS x86, asm: Cleanup thunk_64.S x86, asm: Flip RESTORE_ARGS arguments logic x86, asm: Flip SAVE_ARGS arguments logic x86, asm: Thin down SAVE/RESTORE_* asm macros	2011-07-22 17:02:24 -07:00
Linus Torvalds	2c9e88a108	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, ioapic: Print IR_IO_APIC_route_entry when IR is enabled x86, ioapic: Print IRTE when IR is enabled x86, x2apic: Preserve high 32-bits of IA32_APIC_BASE MSR x86, ioapic: Also print Dest field x86, ioapic: Format clean up for IOAPIC output x86: print APIC data a little later during boot	2011-07-22 17:02:07 -07:00
Linus Torvalds	a99a7d1436	Merge branch 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: mips: Fix i8253 clockevent fallout i8253: Cleanup outb/inb magic arm: Footbridge: Use common i8253 clockevent mips: Use common i8253 clockevent x86: Use common i8253 clockevent i8253: Create common clockevent implementation i8253: Export i8253_lock unconditionally pcpskr: MIPS: Make config dependencies finer grained pcspkr: Cleanup Kconfig dependencies i8253: Move remaining content and delete asm/i8253.h i8253: Consolidate definitions of PIT_LATCH x86: i8253: Consolidate definitions of global_clock_event i8253: Alpha, PowerPC: Remove unused asm/8253pit.h alpha: i8253: Cleanup remaining users of i8253pit.h i8253: Remove I8253_LOCK config i8253: Make pcsp sound driver use the shared i8253_lock i8253: Make pcspkr input driver use the shared i8253_lock i8253: Consolidate all kernel definitions of i8253_lock i8253: Unify all kernel declarations of i8253_lock i8253: Create linux/i8253.h and use it in all 8253 related files	2011-07-22 16:51:56 -07:00
Linus Torvalds	4d4abdcb1d	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (123 commits) perf: Remove the nmi parameter from the oprofile_perf backend x86, perf: Make copy_from_user_nmi() a library function perf: Remove perf_event_attr::type check x86, perf: P4 PMU - Fix typos in comments and style cleanup perf tools: Make test use the preset debugfs path perf tools: Add automated tests for events parsing perf tools: De-opt the parse_events function perf script: Fix display of IP address for non-callchain path perf tools: Fix endian conversion reading event attr from file header perf tools: Add missing 'node' alias to the hw_cache[] array perf probe: Support adding probes on offline kernel modules perf probe: Add probed module in front of function perf probe: Introduce debuginfo to encapsulate dwarf information perf-probe: Move dwarf library routines to dwarf-aux.{c, h} perf probe: Remove redundant dwarf functions perf probe: Move strtailcmp to string.c perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END tracing/kprobe: Update symbol reference when loading module tracing/kprobes: Support module init function probing kprobes: Return -ENOENT if probe point doesn't exist ...	2011-07-22 16:44:39 -07:00
Linus Torvalds	6d16d6d9bb	Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: iommu/core: Fix build with INTR_REMAP=y && CONFIG_DMAR=n iommu/amd: Don't use MSI address range for DMA addresses iommu/amd: Move missing parts to drivers/iommu iommu: Move iommu Kconfig entries to submenu x86/ia64: intel-iommu: move to drivers/iommu/ x86: amd_iommu: move to drivers/iommu/ msm: iommu: move to drivers/iommu/ drivers: iommu: move to a dedicated folder x86/amd-iommu: Store device alias as dev_data pointer x86/amd-iommu: Search for existind dev_data before allocting a new one x86/amd-iommu: Allow dev_data->alias to be NULL x86/amd-iommu: Use only dev_data in low-level domain attach/detach functions x86/amd-iommu: Use only dev_data for dte and iotlb flushing routines x86/amd-iommu: Store ATS state in dev_data x86/amd-iommu: Store devid in dev_data x86/amd-iommu: Introduce global dev_data_list x86/amd-iommu: Remove redundant device_flush_dte() calls iommu-api: Add missing header file Fix up trivial conflicts (independent additions close to each other) in drivers/Makefile and include/linux/pci.h	2011-07-22 16:39:42 -07:00
Linus Torvalds	acb41c0f92	Merge branch 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: pci/of: Consolidate pci_bus_to_OF_node() pci/of: Consolidate pci_device_to_OF_node() x86/devicetree: Use generic PCI <-> OF matching microblaze/pci: Move the remains of pci_32.c to pci-common.c microblaze/pci: Remove powermac originated cruft pci/of: Match PCI devices to OF nodes dynamically	2011-07-22 14:54:02 -07:00
Linus Torvalds	951cc93a74	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1287 commits) icmp: Fix regression in nexthop resolution during replies. net: Fix ppc64 BPF JIT dependencies. acenic: include NET_SKB_PAD headroom to incoming skbs ixgbe: convert to ndo_fix_features ixgbe: only enable WoL for magic packet by default ixgbe: remove ifdef check for non-existent define ixgbe: Pass staterr instead of re-reading status and error bits from descriptor ixgbe: Move interrupt related values out of ring and into q_vector ixgbe: add structure for containing RX/TX rings to q_vector ixgbe: inline the ixgbe_maybe_stop_tx function ixgbe: Update ATR to use recorded TX queues instead of CPU for routing igb: Fix for DH89xxCC near end loopback test e1000: always call e1000_check_for_link() on e1000_ce4100 MACs. netxen: add fw version compatibility check be2net: request native mode each time the card is reset ipv4: Constrain UFO fragment sizes to multiples of 8 bytes virtio_net: Fix panic in virtnet_remove ipv6: make fragment identifications less predictable ipv6: unshare inetpeers can: make function can_get_bittiming static ...	2011-07-22 14:43:13 -07:00
Rusty Russell	5dea1c88ed	lguest: use a special 1:1 linear pagetable mode until first switch. The Host used to create some page tables for the Guest to use at the top of Guest memory; it would then tell the Guest where this was. In particular, it created linear mappings for 0 and 0xC0000000 addresses because lguest used to switch to its real page tables quite late in boot. However, since `d50d8fe19` Linux initialized boot page tables in head_32.S even before the "are we lguest?" boot jump. So, now we can simplify things: the Host pagetable code assumes 1:1 linear mapping until it first calls the LHCALL_NEW_PGTABLE hypercall, which we now do before we reach C code. This also means that the Host doesn't need to know anything about the Guest's PAGE_OFFSET. (Non-Linux guests might not even have such a thing). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-07-22 14:39:48 +09:30
H. Peter Anvin	a536877e77	x86: Make Dell Latitude E6420 use reboot=pci Yet another variant of the Dell Latitude series which requires reboot=pci. From the E5420 bug report by Daniel J Blueman: > The E6420 is affected also (same platform, different casing and > features), which provides an external confirmation of the issue; I can > submit a patch for that later or include it if you prefer: > http://linux.koolsolutions.com/2009/08/04/howto-fix-linux-hangfreeze-during-reboots-and-restarts/ Reported-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org>	2011-07-21 11:47:17 -07:00
Daniel J Blueman	b7798d28ec	x86: Make Dell Latitude E5420 use reboot=pci Rebooting on the Dell E5420 often hangs with the keyboard or ACPI methods, but is reliable via the PCI method. [ hpa: this was deferred because we believed for a long time that the recent reshuffling of the boot priorities in commit `660e34cebf` fixed this platform. Unfortunately that turned out to be incorrect. ] Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Link: http://lkml.kernel.org/r/1305248699-2347-1-git-send-email-daniel.blueman@gmail.com Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org>	2011-07-21 11:45:49 -07:00
Robert Richter	1ac2e6ca44	x86, perf: Make copy_from_user_nmi() a library function copy_from_user_nmi() is used in oprofile and perf. Moving it to other library functions like copy_from_user(). As this is x86 code for 32 and 64 bits, create a new file usercopy.c for unified code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110607172413.GJ20052@erda.amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 20:41:57 +02:00
Cyrill Gorcunov	f53173e47d	x86, perf: P4 PMU - Fix typos in comments and style cleanup This patch: - fixes typos in comments and clarifies the text - renames obscure p4_event_alias::original and ::alter members to ::original and ::alternative as appropriate - drops parenthesis from the return of p4_get_alias_event() No functional changes. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Link: http://lkml.kernel.org/r/20110721160625.GX7492@sun Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 20:41:54 +02:00
Phil Carmody	497888cf69	treewide: fix potentially dangerous trailing ';' in #defined values/expressions All these are instances of #define NAME value; or #define NAME(params_opt) value; These of course fail to build when used in contexts like if(foo $OP NAME) while(bar $OP NAME) and may silently generate the wrong code in contexts such as foo = NAME + 1; /* foo = value; + 1; / bar = NAME - 1; / bar = value; - 1; / baz = NAME & quux; / baz = value; & quux; */ Reported on comp.lang.c, Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com> Initial analysis of the dangers provided by Keith Thompson in that thread. There are many more instances of more complicated macros having unnecessary trailing semicolons, but this pile seems to be all of the cases of simple values suffering from the problem. (Thus things that are likely to be found in one of the contexts above, more complicated ones aren't.) Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-07-21 14:10:00 +02:00
Huang Ying	050438ed5a	kexec, x86: Fix incorrect jump back address if not preserving context In kexec jump support, jump back address passed to the kexeced kernel via function calling ABI, that is, the function call return address is the jump back entry. Furthermore, jump back entry == 0 should be used to signal that the jump back or preserve context is not enabled in the original kernel. But in the current implementation the stack position used for function call return address is not cleared context preservation is disabled. The patch fixes this bug. Reported-and-tested-by: Yin Kangkai <kangkai.yin@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1310607277-25029-1-git-send-email-ying.huang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 11:19:28 +02:00
Sergei Shtylyov	38175051f8	x86, quirks: Use pci_dev->revision This code uses PCI_CLASS_REVISION instead of PCI_REVISION_ID, so it wasn't converted by commit `44c10138fd` ("PCI: Change all drivers to use pci_device->revision") before being moved to arch/x86/... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/r/201107111901.39281.sshtylyov@ru.mvista.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 10:26:00 +02:00
Greg Dietsche	a6c23905ff	x86, smpboot: Mark the names[] array in __inquire_remote_apic() as const This array is read-only. Make it explicit by marking as const. Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu> Link: http://lkml.kernel.org/r/1309482653-23648-1-git-send-email-Gregory.Dietsche@cuw.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-21 10:04:51 +02:00
Len Brown	17edf2d79f	x86, intel, power: Correct the MSR_IA32_ENERGY_PERF_BIAS message Fix the printk_once() so that it actually prints (didn't print before due to a stray comma.) [ hpa: changed to an incremental patch and adjusted the description accordingly. ] Signed-off-by: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107151732480.18606@x980 Cc: <table@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-15 15:13:55 -07:00
Oleg Nesterov	73d382decc	x86: Kill handle_signal()->set_fs() handle_signal()->set_fs() has a nice comment which explains what set_fs() is, but it doesn't explain why it is needed and why it depends on CONFIG_X86_64. Afaics, the history of this confusion is: 1. I guess today nobody can explain why it was needed in arch/i386/kernel/signal.c, perhaps it was always wrong. This predates 2.4.0 kernel. 2. then it was copy-and-past'ed to the new x86_64 arch. 3. then it was removed from i386 (but not from x86_64) by `b93b6ca3` "i386: remove unnecessary code". 4. then it was reintroduced under CONFIG_X86_64 when x86 unified i386 and x86_64, because the patch above didn't touch x86_64. Remove it. ->addr_limit should be correct. Even if it was possible that it is wrong, it is too late to fix it after setup_rt_frame(). Linus commented in: http://lkml.kernel.org/r/alpine.LFD.0.999.0707170902570.19166@woody.linux-foundation.org ... about the equivalent bit from i386: Heh. I think it's entirely historical. Please realize that the whole reason that function is called "set_fs()" is that it literally used to set the %fs segment register, not "->addr_limit". So I think the "set_fs(USER_DS)" is there _only_ to match the other regs->xds = __USER_DS; regs->xes = __USER_DS; regs->xss = __USER_DS; regs->xcs = __USER_CS; things, and never mattered. And now it matters even less, and has been copied to all other architectures where it is just totally insane. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20110710164424.GA20261@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 21:46:20 -07:00
Oleg Nesterov	9b42962074	x86, do_signal: Simplify the TS_RESTORE_SIGMASK logic 1. do_signal() looks at TS_RESTORE_SIGMASK and calculates the mask which should be stored in the signal frame, then it passes "oldset" to the callees, down to setup_rt_frame(). This is ugly, setup_rt_frame() can do this itself and nobody else needs this sigset_t. Move this code into setup_rt_frame. 2. do_signal() also clears TS_RESTORE_SIGMASK if handle_signal() succeeds. We can move this to setup_rt_frame() as well, this avoids the unnecessary checks and makes the logic more clear. 3. use set_current_blocked() instead of sigprocmask(SIG_SETMASK), sigprocmask() should be avoided. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20110710182203.GA27979@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 21:22:11 -07:00
Oleg Nesterov	3982294b03	x86, signals: Convert the X86_32 code to use set_current_blocked() sys_sigsuspend() and sys_sigreturn() change ->blocked directly. This is not correct, see the changelog in `e6fa16ab` "signal: sigprocmask() should do retarget_shared_pending()" Change them to use set_current_blocked(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20110710192727.GA31759@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 21:21:57 -07:00
Andy Lutomirski	98d0ac38ca	x86-64: Move vread_tsc and vread_hpet into the vDSO The vsyscall page now consists entirely of trap instructions. Cc: John Stultz <johnstul@us.ibm.com> Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/637648f303f2ef93af93bae25186e9a1bea093f5.1310639973.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 17:57:05 -07:00
Cyrill Gorcunov	f912987097	perf, x86: P4 PMU - Introduce event alias feature Instead of hw_nmi_watchdog_set_attr() weak function and appropriate x86_pmu::hw_watchdog_set_attr() call we introduce even alias mechanism which allow us to drop this routines completely and isolate quirks of Netburst architecture inside P4 PMU code only. The main idea remains the same though -- to allow nmi-watchdog and perf top run simultaneously. Note the aliasing mechanism applies to generic PERF_COUNT_HW_CPU_CYCLES event only because arbitrary event (say passed as RAW initially) might have some additional bits set inside ESCR register changing the behaviour of event and we can't guarantee anymore that alias event will give the same result. P.S. Thanks a huge to Don and Steven for for testing and early review. Acked-by: Don Zickus <dzickus@redhat.com> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Ingo Molnar <mingo@elte.hu> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Stephane Eranian <eranian@google.com> CC: Lin Ming <ming.m.lin@intel.com> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20110708201712.GS23657@sun Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-07-14 17:25:04 -04:00
Len Brown	abe48b1082	x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS Since 2.6.36 (`23016bf0d2`), Linux prints the existence of "epb" in /proc/cpuinfo, Since 2.6.38 (`d5532ee7b4`), the x86_energy_perf_policy(8) utility has been available in-tree to update MSR_IA32_ENERGY_PERF_BIAS. However, the typical BIOS fails to initialize the MSR, presumably because this is handled by high-volume shrink-wrap operating systems... Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8). As a result, WSM-EP, SNB, and later hardware from Intel will run in its default hardware power-on state (performance), which assumes that users care for performance at all costs and not for energy efficiency. While that is fine for performance benchmarks, the hardware's intended default operating point is "normal" mode... Initialize the MSR to the "normal" by default during kernel boot. x86_energy_perf_policy(8) is available to change the default after boot, should the user have a different preference. Signed-off-by: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107140051020.18606@x980 Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>	2011-07-14 12:13:42 -07:00
Tejun Heo	24aa07882b	memblock, x86: Replace memblock_x86_reserve/free_range() with generic ones Other than sanity check and debug message, the x86 specific version of memblock reserve/free functions are simple wrappers around the generic versions - memblock_reserve/free(). This patch adds debug messages with caller identification to the generic versions and replaces x86 specific ones and kills them. arch/x86/include/asm/memblock.h and arch/x86/mm/memblock.c are empty after this change and removed. Signed-off-by: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1310462166-31469-14-git-send-email-tj@kernel.org Cc: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 11:47:53 -07:00
Tejun Heo	6b5d41a1b9	memblock, x86: Reimplement memblock_find_dma_reserve() using iterators memblock_find_dma_reserve() wants to find out how much memory is reserved under MAX_DMA_PFN. memblock_x86_memory_[free_]in_range() are used to find out the amounts of all available and free memory in the area, which are then subtracted to find out the amount of reservation. memblock_x86_memblock_[free_]in_range() are implemented using __memblock_x86_memory_in_range() which builds ranges from memblock and then count them, which is rather unnecessarily complex. This patch open codes the counting logic directly in memblock_find_dma_reserve() using memblock iterators and removes now unused __memblock_x86_memory_in_range() and find_range_array(). Signed-off-by: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1310462166-31469-11-git-send-email-tj@kernel.org Cc: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 11:47:50 -07:00
Tejun Heo	8d89ac8084	x86: Replace memblock_x86_find_in_range_size() with for_each_free_mem_range() setup_bios_corruption_check() and memtest do_one_pass() open code memblock free area iteration using memblock_x86_find_in_range_size(). Convert them to use for_each_free_mem_range() instead. This leaves memblock_x86_find_in_range_size() and memblock_x86_check_reserved_size() unused. Kill them. Signed-off-by: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1310462166-31469-8-git-send-email-tj@kernel.org Cc: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 11:47:48 -07:00
Tejun Heo	ab5d140b9e	x86: Use __memblock_alloc_base() in early_reserve_e820() early_reserve_e820() implements its own ad-hoc early allocator using memblock_x86_find_in_range_size(). Use __memblock_alloc_base() instead and remove the unnecessary @startt parameter (it's top-down allocation anyway). Signed-off-by: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1310462166-31469-6-git-send-email-tj@kernel.org Cc: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-14 11:47:47 -07:00
David S. Miller	6a7ebdf2fd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/bluetooth/l2cap_core.c	2011-07-14 07:56:40 -07:00
Glauber Costa	3c404b578f	KVM guest: Add a pv_ops stub for steal time This patch adds a function pointer in one of the many paravirt_ops structs, to allow guests to register a steal time function. Besides a steal time function, we also declare two jump_labels. They will be used to allow the steal time code to be easily bypassed when not in use. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-14 12:59:44 +03:00
Tejun Heo	1f5026a7e2	memblock: Kill MEMBLOCK_ERROR `25818f0f28` (memblock: Make MEMBLOCK_ERROR be 0) thankfully made MEMBLOCK_ERROR 0 and there already are codes which expect error return to be 0. There's no point in keeping MEMBLOCK_ERROR around. End its misery. Signed-off-by: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1310457490-3356-6-git-send-email-tj@kernel.org Cc: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-13 16:36:01 -07:00
Andy Lutomirski	433bd805e5	clocksource: Replace vread with generic arch data The vread field was bloating struct clocksource everywhere except x86_64, and I want to change the way this works on x86_64, so let's split it out into per-arch data. Cc: x86@kernel.org Cc: Clemens Ladisch <clemens@ladisch.de> Cc: linux-ia64@vger.kernel.org Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/3ae5ec76a168eaaae63f08a2a1060b91aa0b7759.1310563276.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-13 11:23:12 -07:00
Andy Lutomirski	59e97e4d6f	x86: Make alternative instruction pointers relative This save a few bytes on x86-64 and means that future patches can apply alternatives to unrelocated code. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/ff64a6b9a1a3860ca4a7b8b6dc7b4754f9491cd7.1310563276.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-13 11:22:56 -07:00
Andy Lutomirski	c9712944b2	x86-64: Improve vsyscall emulation CS and RIP handling Three fixes here: - Send SIGSEGV if called from compat code or with a funny CS. - Don't BUG on impossible addresses. - Add a missing local_irq_disable. This patch also removes an unused variable. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/6fb2b13ab39b743d1e4f466eef13425854912f7f.1310563276.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-13 11:22:55 -07:00
Maxime Ripard	3628c3f5c8	x86. reboot: Make Dell Latitude E6320 use reboot=pci The Dell Latitude E6320 doesn't reboot unless reboot=pci is set. Force it thanks to DMI. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Link: http://lkml.kernel.org/r/1309269451-4966-1-git-send-email-maxime.ripard@free-electrons.com Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2011-07-12 21:42:48 -07:00
Naga Chumbalkar	42f0efc5aa	x86, ioapic: Print IR_IO_APIC_route_entry when IR is enabled When IR (interrupt remapping) is enabled print_IO_APIC() displays output according to legacy RTE (redirection table entry) definitons: NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect: 00 00 1 0 0 0 0 0 0 00 01 00 0 0 0 0 0 0 0 01 02 00 0 0 0 0 0 0 0 02 03 00 1 0 0 0 0 0 0 03 04 00 1 0 0 0 0 0 0 04 05 00 1 0 0 0 0 0 0 05 06 00 1 0 0 0 0 0 0 06 ... The above output is as per Sec 3.2.4 of the IOAPIC datasheet: 82093AA I/O Advanced Programmable Interrupt Controller (IOAPIC): http://download.intel.com/design/chipsets/datashts/29056601.pdf Instead the output should display the fields as discussed in Sec 5.5.1 of the VT-d specification: (Intel Virtualization Technology for Directed I/O: http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf) After the fix: NR Indx Fmt Mask Trig IRR Pol Stat Indx2 Zero Vect: 00 0000 0 1 0 0 0 0 0 0 00 01 000F 1 0 0 0 0 0 0 0 01 02 0001 1 0 0 0 0 0 0 0 02 03 0002 1 1 0 0 0 0 0 0 03 04 0011 1 1 0 0 0 0 0 0 04 05 0004 1 1 0 0 0 0 0 0 05 06 0005 1 1 0 0 0 0 0 0 06 ... Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Link: http://lkml.kernel.org/r/20110712211658.2939.93123.sendpatchset@nchumbalkar.americas.cpqcorp.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-12 20:17:58 -07:00
Naga Chumbalkar	3040db92ee	x86, ioapic: Print IRTE when IR is enabled When "apic=debug" is used as a boot parameter, Linux prints the IOAPIC routing entries in "dmesg". Below is output from IOAPIC whose apic_id is 8: # dmesg \| grep "routing entry" IOAPIC[8]: Set routing entry (8-1 -> 0x31 -> IRQ 1 Mode:0 Active:0 Dest:0) IOAPIC[8]: Set routing entry (8-2 -> 0x30 -> IRQ 0 Mode:0 Active:0 Dest:0) IOAPIC[8]: Set routing entry (8-3 -> 0x33 -> IRQ 3 Mode:0 Active:0 Dest:0) ... Similarly, when IR (interrupt remapping) is enabled, and the IRTE (interrupt remapping table entry) is set up we should display it. After the fix: # dmesg \| grep IRTE IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:31 Dest:00000000 SID:00F1 SQ:0 SVT:1) IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:30 Dest:00000000 SID:00F1 SQ:0 SVT:1) IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:33 Dest:00000000 SID:00F1 SQ:0 SVT:1) ... The IRTE is defined in Sec 9.5 of the Intel VT-d Specification. Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Link: http://lkml.kernel.org/r/20110712211704.2939.71291.sendpatchset@nchumbalkar.americas.cpqcorp.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-12 14:34:00 -07:00
Naga Chumbalkar	2597085228	x86, x2apic: Preserve high 32-bits of IA32_APIC_BASE MSR If there's no special reason to zero-out the "high" 32-bits of the IA32_APIC_BASE MSR, let's preserve it. The x2APIC Specification doesn't explicitly state any such requirement. (Sec 2.2 in: http://www.intel.com/Assets/PDF/manual/318148.pdf). Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Link: http://lkml.kernel.org/r/20110712055831.2498.78521.sendpatchset@nchumbalkar.americas.cpqcorp.net Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-12 14:33:49 -07:00
Naga Chumbalkar	7fece83235	x86, ioapic: Also print Dest field The code in setup_ioapic_irq() determines the Destination Field, so why not also include it in the debug printk output that gets displayed when the boot parameter "apic=debug" is used. Before the change, "dmesg" will show: IOAPIC[0]: Set routing entry (8-1 -> 0x31 -> IRQ 1 Mode:0 Active:0) IOAPIC[0]: Set routing entry (8-2 -> 0x30 -> IRQ 0 Mode:0 Active:0) IOAPIC[0]: Set routing entry (8-3 -> 0x33 -> IRQ 3 Mode:0 Active:0) ... After the change, you will see: IOAPIC[0]: Set routing entry (8-1 -> 0x31 -> IRQ 1 Mode:0 Active:0 Dest:0) IOAPIC[0]: Set routing entry (8-2 -> 0x30 -> IRQ 0 Mode:0 Active:0 Dest:0) IOAPIC[0]: Set routing entry (8-3 -> 0x33 -> IRQ 3 Mode:0 Active:0 Dest:0) ... Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Link: http://lkml.kernel.org/r/20110708184603.2734.91071.sendpatchset@nchumbalkar.americas.cpqcorp.net Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-11 16:31:05 +02:00
Naga Chumbalkar	bd6a46e087	x86, ioapic: Format clean up for IOAPIC output When IOAPIC data is displayed in "dmesg" with the help of the boot parameter "apic=debug" certain values are not formatted correctly wrt their size. In the "dmesg" snippet below, note that the output for "max redirection entries", and "IO APIC version" which are each defined to be just 8-bits long are displayed as 2 bytes in length. Similarly, "Dst" under the "IRQ redirection table" should only be 8-bits long. IO APIC #0...... ... ... .... register #01: 00170020 ....... : max redirection entries: 0017 ....... : PRQ implemented: 0 ....... : IO APIC version: 0020 ... ... .... IRQ redirection table: NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect: 00 000 1 0 0 0 0 0 0 00 01 000 0 0 0 0 0 0 0 31 02 000 0 0 0 0 0 0 0 30 03 000 1 0 0 0 0 0 0 33 ... ... Do some formatting clean up, so you will see output like below: IO APIC #0...... ... ... .... register #01: 00170020 ....... : max redirection entries: 17 ....... : PRQ implemented: 0 ....... : IO APIC version: 20 ... ... .... IRQ redirection table: NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect: 00 00 1 0 0 0 0 0 0 00 01 00 0 0 0 0 0 0 0 31 02 00 0 0 0 0 0 0 0 30 03 00 1 0 0 0 0 0 0 33 ... ... Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Link: http://lkml.kernel.org/r/20110708184557.2734.61830.sendpatchset@nchumbalkar.americas.cpqcorp.net Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-11 16:31:05 +02:00
Jiri Kosina	b7e9c223be	Merge branch 'master' into for-next Sync with Linus' tree to be able to apply pending patches that are based on newer code already present upstream.	2011-07-11 14:15:55 +02:00
Anupam Chanda	24a42bae68	x86, hyper: Change hypervisor detection order Detect Xen before HyperV because in Viridian compatibility mode Xen presents itself as HyperV. Move Xen to the top since it seems more likely that Xen would emulate VMware than vice versa. Signed-off-by: Anupam Chanda <achanda@nicira.com> Link: http://lkml.kernel.org/r/1310150570-26810-1-git-send-email-achanda@nicira.com Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Reviewed-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-08 16:22:29 -07:00
Vivek Goyal	14cb6dcf0a	x86, boot: Wait for boot cpu to show up if nr_cpus limit is about to hit nr_cpus allows one to specify number of possible cpus in the system. Current assumption seems to be that first cpu to show up is boot cpu and this assumption will be broken in kdump scenario where we can be booting on a non boot cpu with nr_cpus=1. It might happen that first cpu we parse is not the cpu we boot on and later we ignore boot cpu. Though code later seems to recognize this anomaly and forcibly sets boot cpu in physical cpu map with following warning. if (!physid_isset(hard_smp_processor_id(), phys_cpu_present_map)) { printk(KERN_WARNING "weird, boot CPU (#%d) not listed by the BIOS.\n", hard_smp_processor_id()); physid_set(hard_smp_processor_id(), phys_cpu_present_map); } This patch waits for boot cpu to show up and starts ignoring the cpus once we have hit (nr_cpus - 1) number of cpus. So effectively we are reserving one slot out of nr_cpus for boot cpu explicitly. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20110708171926.GF2930@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-07-08 15:33:35 -07:00
Naga Chumbalkar	ded1f6ab43	x86: print APIC data a little later during boot To view IOAPIC data you could boot with "apic=debug". When booting in such a way then the kernel will dump the IO-APIC's registers, for example: NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect: 00 000 1 0 0 0 0 0 0 00 01 000 0 0 0 0 0 0 0 31 02 000 0 0 0 0 0 0 0 30 03 000 0 0 0 0 0 0 0 33 04 000 0 0 0 0 0 0 0 34 05 000 0 0 0 0 0 0 0 35 06 000 0 0 0 0 0 0 0 36 07 000 0 0 0 0 0 0 0 37 08 000 0 0 0 0 0 0 0 38 09 000 0 1 0 0 0 0 0 39 0a 000 0 0 0 0 0 0 0 3A 0b 000 0 0 0 0 0 0 0 3B 0c 000 0 0 0 0 0 0 0 3C 0d 000 0 0 0 0 0 0 0 3D 0e 000 0 0 0 0 0 0 0 3E 0f 000 0 0 0 0 0 0 0 3F 10 000 1 0 0 0 0 0 0 00 11 000 1 0 0 0 0 0 0 00 12 000 1 0 0 0 0 0 0 00 13 000 1 0 0 0 0 0 0 00 14 000 1 0 0 0 0 0 0 00 15 000 1 0 0 0 0 0 0 00 16 000 1 0 0 0 0 0 0 00 17 000 1 0 0 0 0 0 0 00 Delaying the call to print_ICs() gives better results: NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect: 00 000 1 0 0 0 0 0 0 00 01 000 0 0 0 0 0 0 0 31 02 000 0 0 0 0 0 0 0 30 03 000 1 0 0 0 0 0 0 33 04 000 1 0 0 0 0 0 0 34 05 000 1 0 0 0 0 0 0 35 06 000 1 0 0 0 0 0 0 36 07 000 1 0 0 0 0 0 0 37 08 000 0 0 0 0 0 0 0 38 09 000 0 1 0 0 0 0 0 39 0a 000 1 0 0 0 0 0 0 3A 0b 000 1 0 0 0 0 0 0 3B 0c 000 0 0 0 0 0 0 0 3C 0d 000 1 0 0 0 0 0 0 3D 0e 000 1 0 0 0 0 0 0 3E 0f 000 1 0 0 0 0 0 0 3F 10 000 1 1 0 1 0 0 0 29 11 000 1 0 0 0 0 0 0 00 12 000 1 0 0 0 0 0 0 00 13 000 1 0 0 0 0 0 0 00 14 000 0 1 0 1 0 0 0 51 15 000 1 0 0 0 0 0 0 00 16 000 0 1 0 1 0 0 0 61 17 000 0 1 0 1 0 0 0 59 Notice that the entries beyond interrupt input signal 0x0f also get populated and arent just the hw-initialization default of all zeroes. Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Link: http://lkml.kernel.org/r/20110708083555.2598.42216.sendpatchset@nchumbalkar.americas.hpqcorp.net Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-08 13:20:14 +02:00
Kees Cook	7a3136666b	x86, suspend: Restore MISC_ENABLE MSR in realmode wakeup Some BIOSes will reset the Intel MISC_ENABLE MSR (specifically the XD_DISABLE bit) when resuming from S3, which can interact poorly with `ebba638ae7`. In 32bit PAE mode, this can lead to a fault when EFER is restored by the kernel wakeup routines, due to it setting the NX bit for a CPU that (thanks to the BIOS reset) now incorrectly thinks it lacks the NX feature. (64bit is not affected because it uses a common CPU bring-up that specifically handles the XD_DISABLE bit.) The need for MISC_ENABLE being restored so early is specific to the S3 resume path. Normally, MISC_ENABLE is saved in save_processor_state(), but this happens after the resume header is created, so just reproduce the logic here. (acpi_suspend_lowlevel() creates the header, calls do_suspend_lowlevel, which calls save_processor_state(), so the saved processor context isn't available during resume header creation.) [ hpa: Consider for stable if OK in mainline ] Signed-off-by: Kees Cook <kees.cook@canonical.com> Link: http://lkml.kernel.org/r/20110707011034.GA8523@outflux.net Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: <stable@kernel.org> 2.6.38+	2011-07-06 20:09:34 -07:00
Peter Chubb	b49c78d482	x86, reboot: Acer Aspire One A110 reboot quirk Since git commit `660e34cebf` x86: reorder reboot method preferences, my Acer Aspire One hangs on reboot. It appears that its ACPI method for rebooting is broken. The attached patch adds a quirk so that the machine will reboot via the BIOS. [ hpa: verified that the ACPI control on this machine is just plain broken. ] Signed-off-by: Peter Chubb <peter.chubb@nicta.com.au> Link: http://lkml.kernel.org/r/w439iki5vl.wl%25peter@chubb.wattle.id.au Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2011-07-05 19:43:23 -07:00
Ingo Molnar	931da6137e	Merge branch 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core	2011-07-05 11:55:43 +02:00
Ingo Molnar	729aa21ab8	Merge branch 'perf/stacktrace' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core	2011-07-03 20:39:40 +02:00
Frederic Weisbecker	a2bbe75089	x86: Don't use frame pointer to save old stack on irq entry rbp is used in SAVE_ARGS_IRQ to save the old stack pointer in order to restore it later in ret_from_intr. It is convenient because we save its value in the irq regs and it's easily restored using the leave instruction. However this is a kind of abuse of the frame pointer which role is to help unwinding the kernel by chaining frames together, each node following the return address to the previous frame. But although we are breaking the frame by changing the stack pointer, there is no preceding return address before the new frame. Hence using the frame pointer to link the two stacks breaks the stack unwinders that find a random value instead of a return address here. There is no workaround that can work in every case. We are using the fixup_bp_irq_link() function to dereference that abused frame pointer in the case of non nesting interrupt (which means stack changed). But that doesn't fix the case of interrupts that don't change the stack (but we still have the unconditional frame link), which is the case of hardirq interrupting softirq. We have no way to detect this transition so the frame irq link is considered as a real frame pointer and the return address is dereferenced but it is still a spurious one. There are two possible results of this: either the spurious return address, a random stack value, luckily belongs to the kernel text and then the unwinding can continue and we just have a weird entry in the stack trace. Or it doesn't belong to the kernel text and unwinding stops there. This is the reason why stacktraces (including perf callchains) on irqs that interrupted softirqs don't work very well. To solve this, we don't save the old stack pointer on rbp anymore but we save it to a scratch register that we push on the new stack and that we pop back later on irq return. This preserves the whole frame chain without spurious return addresses in the middle and drops the need for the horrid fixup_bp_irq_link() workaround. And finally irqs that interrupt softirq are sanely unwinded. Before: 99.81% perf [kernel.kallsyms] [k] perf_pending_event \| --- perf_pending_event irq_work_run smp_irq_work_interrupt irq_work_interrupt \| \|--41.60%-- __read \| \| \| \|--99.90%-- create_worker \| \| bench_sched_messaging \| \| cmd_bench \| \| run_builtin \| \| main \| \| __libc_start_main \| --0.10%-- [...] After: 1.64% swapper [kernel.kallsyms] [k] perf_pending_event \| --- perf_pending_event irq_work_run smp_irq_work_interrupt irq_work_interrupt \| \|--95.00%-- arch_irq_work_raise \| irq_work_queue \| __perf_event_overflow \| perf_swevent_overflow \| perf_swevent_event \| perf_tp_event \| perf_trace_softirq \| __do_softirq \| call_softirq \| do_softirq \| irq_exit \| \| \| \|--73.68%-- smp_apic_timer_interrupt \| \| apic_timer_interrupt \| \| \| \| \| \|--96.43%-- amd_e400_idle \| \| \| cpu_idle \| \| \| start_secondary Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jan Beulich <JBeulich@novell.com>	2011-07-02 18:06:36 +02:00
Frederic Weisbecker	48ffee7d9e	x86: Remove useless unwinder backlink from irq regs saving The unwinder backlink in interrupt entry is very useless. It's actually not part of the stack frame chain and thus is never used. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jan Beulich <JBeulich@novell.com>	2011-07-02 18:06:21 +02:00
Frederic Weisbecker	3b99a3ef55	x86,64: Separate arg1 from rbp handling in SAVE_REGS_IRQ Just for clarity in the code. Have a first block that handles the frame pointer and a separate one that handles pt_regs pointer and its use. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jan Beulich <JBeulich@novell.com>	2011-07-02 18:05:46 +02:00
Frederic Weisbecker	1871853f7a	x86,64: Simplify save_regs() The save_regs function that saves the regs on low level irq entry is complicated because of the fact it changes its stack in the middle and also because it manipulates data allocated in the caller frame and accesses there are directly calculated from callee rsp value with the return address in the middle of the way. This complicates the static stack offsets calculation and require more dynamic ones. It also needs a save/restore of the function's return address. To simplify and optimize this, turn save_regs() into a macro. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jan Beulich <JBeulich@novell.com>	2011-07-02 18:05:31 +02:00
Frederic Weisbecker	47ce11a2b6	x86: Fetch stack from regs when possible in dump_trace() When regs are passed to dump_stack(), we fetch the frame pointer from the regs but the stack pointer is taken from the current frame. Thus the frame and stack pointers may not come from the same context. For example this can result in the unwinder to think the context is in irq, due to the current value of the stack, but the frame pointer coming from the regs points to a frame from another place. It then tries to fix up the irq link but ends up dereferencing a random frame pointer that doesn't belong to the irq stack: [ 9131.706906] ------------[ cut here ]------------ [ 9131.707003] WARNING: at arch/x86/kernel/dumpstack_64.c:129 dump_trace+0x2aa/0x330() [ 9131.707003] Hardware name: AMD690VM-FMH [ 9131.707003] Perf: bad frame pointer = 0000000000000005 in callchain [ 9131.707003] Modules linked in: [ 9131.707003] Pid: 1050, comm: perf Not tainted 3.0.0-rc3+ #181 [ 9131.707003] Call Trace: [ 9131.707003] <IRQ> [<ffffffff8104bd4a>] warn_slowpath_common+0x7a/0xb0 [ 9131.707003] [<ffffffff8104be21>] warn_slowpath_fmt+0x41/0x50 [ 9131.707003] [<ffffffff8178b873>] ? bad_to_user+0x6d/0x10be [ 9131.707003] [<ffffffff8100c2da>] dump_trace+0x2aa/0x330 [ 9131.707003] [<ffffffff810107d3>] ? native_sched_clock+0x13/0x50 [ 9131.707003] [<ffffffff8101b164>] perf_callchain_kernel+0x54/0x70 [ 9131.707003] [<ffffffff810d391f>] perf_prepare_sample+0x19f/0x2a0 [ 9131.707003] [<ffffffff810d546c>] __perf_event_overflow+0x16c/0x290 [ 9131.707003] [<ffffffff810d5430>] ? __perf_event_overflow+0x130/0x290 [ 9131.707003] [<ffffffff810107d3>] ? native_sched_clock+0x13/0x50 [ 9131.707003] [<ffffffff8100fbb9>] ? sched_clock+0x9/0x10 [ 9131.707003] [<ffffffff810752e5>] ? T.375+0x15/0x90 [ 9131.707003] [<ffffffff81084da4>] ? trace_hardirqs_on_caller+0x64/0x180 [ 9131.707003] [<ffffffff810817bd>] ? trace_hardirqs_off+0xd/0x10 [ 9131.707003] [<ffffffff810d5764>] perf_event_overflow+0x14/0x20 [ 9131.707003] [<ffffffff810d588c>] perf_swevent_hrtimer+0x11c/0x130 [ 9131.707003] [<ffffffff817821a1>] ? error_exit+0x51/0xb0 [ 9131.707003] [<ffffffff81072e93>] __run_hrtimer+0x83/0x1e0 [ 9131.707003] [<ffffffff810d5770>] ? perf_event_overflow+0x20/0x20 [ 9131.707003] [<ffffffff81073256>] hrtimer_interrupt+0x106/0x250 [ 9131.707003] [<ffffffff812a3bfd>] ? trace_hardirqs_off_thunk+0x3a/0x3c [ 9131.707003] [<ffffffff81024833>] smp_apic_timer_interrupt+0x53/0x90 [ 9131.707003] [<ffffffff81789053>] apic_timer_interrupt+0x13/0x20 [ 9131.707003] <EOI> [<ffffffff817821a1>] ? error_exit+0x51/0xb0 [ 9131.707003] [<ffffffff8178219c>] ? error_exit+0x4c/0xb0 [ 9131.707003] ---[ end trace b2560d4876709347 ]--- Fix this by simply taking the stack pointer from regs->sp when regs are provided. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>	2011-07-02 18:04:20 +02:00
Sergei Shtylyov	50c31e4a24	x86, mtrr: Use pci_dev->revision This code uses PCI_CLASS_REVISION instead of PCI_REVISION_ID, so it wasn't converted by commit `44c10138fd` ("PCI: Change all drivers to use pci_device->revision") before being moved to arch/x86/... Do it now at last -- and save one level of indentation... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/201107012242.08347.sshtylyov@ru.mvista.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-02 11:10:07 +02:00
Avi Kivity	0af3ac1fdb	x86, perf: Add constraints for architectural PMU The v1 PMU does not have any fixed counters. Using the v2 constraints, which do have fixed counters, causes an additional choice to be present in the weight calculation, but not when actually scheduling the event, leading to an event being not scheduled at all. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1309362157-6596-3-git-send-email-avi@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:39 +02:00
Avi Kivity	4dc0da8696	perf: Add context field to perf_event The perf_event overflow handler does not receive any caller-derived argument, so many callers need to resort to looking up the perf_event in their local data structure. This is ugly and doesn't scale if a single callback services many perf_events. Fix by adding a context parameter to perf_event_create_kernel_counter() (and derived hardware breakpoints APIs) and storing it in the perf_event. The field can be accessed from the callback as event->overflow_handler_context. All callers are updated. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:38 +02:00
Peter Zijlstra	89d6c0b5bd	perf, arch: Add generic NODE cache events Add a NODE level to the generic cache events which is used to measure local vs remote memory accesses. Like all other cache events, an ACCESS is HIT+MISS, if there is no way to distinguish between reads and writes do reads only etc.. The below needs filling out for !x86 (which I filled out with unsupported events). I'm fairly sure ARM can leave it like that since it doesn't strike me as an architecture that even has NUMA support. SH might have something since it does appear to have some NUMA bits. Sparc64, PowerPC and MIPS certainly want a good look there since they clearly are NUMA capable. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Miller <davem@davemloft.net> Cc: Anton Blanchard <anton@samba.org> Cc: David Daney <ddaney@caviumnetworks.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:38 +02:00
Peter Zijlstra	b79e8941fb	perf, intel: Try alternative OFFCORE encodings Since the OFFCORE registers are fully symmetric, try the other one when the specified one is already in use. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1306141897.18455.8.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:37 +02:00
Stephane Eranian	ee89cbc2d4	perf_events: Add Intel Sandy Bridge offcore_response low-level support This patch adds Intel Sandy Bridge offcore_response support by providing the low-level constraint table for those events. On Sandy Bridge, there are two offcore_response events. Each uses its own dedictated extra register. But those registers are NOT shared between sibling CPUs when HT is on unlike Nehalem/Westmere. They are always private to each CPU. But they still need to be controlled within an event group. All events within an event group must use the same value for the extra MSR. That's not controlled by the second patch in this series. Furthermore on Sandy Bridge, the offcore_response events have NO counter constraints contrary to what the official documentation indicates, so drop the events from the contraint table. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110606145712.GA7304@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:37 +02:00
Stephane Eranian	cd8a38d33e	perf_events: Fix validation of events using an extra reg The validate_group() function needs to validate events with extra shared regs. Within an event group, only events with the same value for the extra reg can co-exist. This was not checked by validate_group() because it was missing the shared_regs logic. This patch changes the allocation of the fake cpuc used for validation to also point to a fake shared_regs structure such that group events be properly testing. It modifies __intel_shared_reg_get_constraints() to use spin_lock_irqsave() to avoid lockdep issues. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110606145708.GA7279@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:36 +02:00
Stephane Eranian	efc9f05df2	perf_events: Update Intel extra regs shared constraints management This patch improves the code managing the extra shared registers used for offcore_response events on Intel Nehalem/Westmere. The idea is to use static allocation instead of dynamic allocation. This simplifies greatly the get and put constraint routines for those events. The patch also renames per_core to shared_regs because the same data structure gets used whether or not HT is on. When HT is off, those events still need to coordination because they use a extra MSR that has to be shared within an event group. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110606145703.GA7258@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:36 +02:00
Peter Zijlstra	a7ac67ea02	perf: Remove the perf_output_begin(.sample) argument Since only samples call perf_output_sample() its much saner (and more correct) to put the sample logic in there than in the perf_output_begin()/perf_output_end() pair. Saves a useless argument, reduces conditionals and shrinks struct perf_output_handle, win! Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:35 +02:00
Peter Zijlstra	a8b0ca17b8	perf: Remove the nmi parameter from the swevent and overflow interface The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:35 +02:00
Cyrill Gorcunov	1880c4ae18	perf, x86: Add hw_watchdog_set_attr() in a sake of nmi-watchdog on P4 Due to restriction and specifics of Netburst PMU we need a separated event for NMI watchdog. In particular every Netburst event consumes not just a counter and a config register, but also an additional ESCR register. Since ESCR registers are grouped upon counters (i.e. if ESCR is occupied for some event there is no room for another event to enter until its released) we need to pick up the "least" used ESCR (or the most available one) for nmi-watchdog purposes -- so MSR_P4_CRU_ESCR2/3 was chosen. With this patch nmi-watchdog and perf top should be able to run simultaneously. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Lin Ming <ming.m.lin@intel.com> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: Frederic Weisbecker <fweisbec@gmail.com> Tested-and-reviewed-by: Don Zickus <dzickus@redhat.com> Tested-and-reviewed-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110623124918.GC13050@sun Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:34 +02:00
Thomas Gleixner	01898e3e29	i8253: Cleanup outb/inb magic Remove the hysterical outb/inb_pit defines and use outb_p/inb_p in the code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20110609130622.348437125@linutronix.de	2011-07-01 10:37:15 +02:00
Thomas Gleixner	0a779c5713	x86: Use common i8253 clockevent Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20110609130622.026152527@linutronix.de	2011-07-01 10:37:14 +02:00
Suresh Siddha	3e7cf5b00d	x86-32, fpu: Fix DNA exception during check_fpu() Before check_fpu() is called, we have cr0.TS bit set and hence the floating point code to check the FDIV bug was generating a DNA exception. Use kernel_fpu_begin()/kernel_fpu_end() around the floating point code to avoid this unnecessary device not available exception during boot. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1309479572.2665.1372.camel@sbsiddha-MOBL3.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-06-30 17:29:47 -07:00
Suresh Siddha	192d885742	x86, mtrr: use stop_machine APIs for doing MTRR rendezvous MTRR rendezvous sequence is not implemened using stop_machine() before, as this gets called both from the process context aswell as the cpu online paths (where the cpu has not come online and the interrupts are disabled etc). Now that we have a new stop_machine_from_inactive_cpu() API, use it for rendezvous during mtrr init of a logical processor that is coming online. For the rest (runtime MTRR modification, system boot, resume paths), use stop_machine() to implement the rendezvous sequence. This will consolidate and cleanup the code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/20110623182057.076997177@sbsiddha-MOBL3.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-06-27 15:17:13 -07:00
Jamie Iles	06c3df4952	clocksource: apb: Share APB timer code with other platforms The APB timers are an IP block from Synopsys (DesignWare APB timers) and are also found in other systems including ARM SoC's. This patch adds functions for creating clock_event_devices and clocksources from APB timers but does not do the resource allocation. This is handled in a higher layer to allow the timers to be created from multiple methods such as platform_devices. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-06-27 15:16:21 -07:00
Suresh Siddha	6d3321e8e2	x86, mtrr: lock stop machine during MTRR rendezvous sequence MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially happen in parallel with another system wide rendezvous using stop_machine(). This can lead to deadlock (The order in which works are queued can be different on different cpu's. Some cpu's will be running the first rendezvous handler and others will be running the second rendezvous handler. Each set waiting for the other set to join for the system wide rendezvous, leading to a deadlock). MTRR rendezvous sequence is not implemented using stop_machine() as this gets called both from the process context aswell as the cpu online paths (where the cpu has not come online and the interrupts are disabled etc). stop_machine() works with only online cpus. For now, take the stop_machine mutex in the MTRR rendezvous sequence that gets called from an online cpu (here we are in the process context and can potentially sleep while taking the mutex). And the MTRR rendezvous that gets triggered during cpu online doesn't need to take this stop_machine lock (as the stop_machine() already ensures that there is no cpu hotplug going on in parallel by doing get_online_cpus()) TBD: Pursue a cleaner solution of extending the stop_machine() infrastructure to handle the case where the calling cpu is still not online and use this for MTRR rendezvous sequence. fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008 Reported-by: Vadim Kotelnikov <vadimuzzz@inbox.ru> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-06-27 14:00:46 -07:00
Alexey Dobriyan	b7f080cfe2	net: remove mm.h inclusion from netdevice.h Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually). To prevent mm.h inclusion via other channels also extract "enum dma_data_direction" definition into separate header. This tiny piece is what gluing netdevice.h with mm.h via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h". Removal of mm.h from scatterlist.h was tried and was found not feasible on most archs, so the link was cutoff earlier. Hope people are OK with tiny include file. Note, that mm_types.h is still dragged in, but it is a separate story. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-06-21 19:17:20 -07:00
Joerg Roedel	801019d59d	Merge branches 'amd/transparent-bridge' and 'core' Conflicts: arch/x86/include/asm/amd_iommu_types.h arch/x86/kernel/amd_iommu.c Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-21 11:14:10 +02:00
Joerg Roedel	403f81d8ee	iommu/amd: Move missing parts to drivers/iommu A few parts of the driver were missing in drivers/iommu. Move them there to have the complete driver in that directory. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-21 10:49:31 +02:00
Ohad Ben-Cohen	29b68415e3	x86: amd_iommu: move to drivers/iommu/ This should ease finding similarities with different platforms, with the intention of solving problems once in a generic framework which everyone can use. Compile-tested on x86_64. Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-21 10:49:29 +02:00
Borislav Petkov	40b7f3dfcc	x86, microcode, AMD: Fix section header size check The ucode size check has to take the section header size into account too when sanity checking the section length. Shorten and clarify define names, while at it. Caught-by: Ben Hutchings <ben@decadent.org.uk> Link: http://lkml.kernel.org/r/1302752223.5282.674.camel@localhost Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 17:23:54 +02:00
Hidetoshi Seto	c7cece89f1	x86, mce: Use mce_sysdev_ prefix to group functions There are many functions named mce_* so use a new prefix for the subset of functions related to sysfs support. And since `f3c6ea1b06` introduces syscore_ops, use the prefix mce_syscore for some functions related to power management which were in sysdev_class before. Before: After: mce_device mce_sysdev mce_sysclass mce_sysdev_class mce_attrs mce_sysdev_attrs mce_dev_initialized mce_sysdev_initialized mce_create_device mce_sysdev_create mce_remove_device mce_sysdev_remove mce_suspend mce_syscore_suspend mce_shutdown mce_syscore_shutdown mce_resume mce_syscore_resume Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED81B.8020506@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:16 +02:00
Hidetoshi Seto	93b62c3cf5	x86, mce: Use mce_chrdev_ prefix to group functions There are many functions named mce_* so use a new prefix for the subset of functions dealing with the character device /dev/mcelog. This change doesn't impact the mce-inject module because the exported symbol mce_chrdev_ops already has the prefix, therefore it is left unchanged. Before: After: mce_wait mce_chrdev_wait mce_state_lock mce_chrdev_state_lock open_count mce_chrdev_open_count open_exclu mce_chrdev_open_exclu mce_open mce_chrdev_open mce_release mce_chrdev_release mce_read_mutex mce_chrdev_read_mutex mce_read mce_chrdev_read mce_poll mce_chrdev_poll mce_ioctl mce_chrdev_ioctl mce_log_device mce_chrdev_device Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED7CD.3040500@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:15 +02:00
Hidetoshi Seto	559faa6be1	x86, mce: Cleanup mce_read() Use a temporary local variable m to simplify the code. No change in logic. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED7A8.8020307@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:13 +02:00
Hidetoshi Seto	f6783c4234	x86, mce: Cleanup mce_create()/remove_device() Use temporary local variable sysdev to simplify the code. No change in logic. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED777.7080205@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:12 +02:00
Hidetoshi Seto	3a97fc3413	x86, mce: Check the result of ancient_init() Because "ancient CPUs" like p5 and winchip don't have X86_FEATURE_MCA (I suppose so), mcheck_cpu_init() on such CPUs will return at check of mce_available() after __mcheck_cpu_ancient_init(). It is hard to know this implicit behavior without knowing the CPUs well. So make it clear that we leave mcheck_cpu_init() when the CPU is initialized in __mcheck_cpu_ancient_init(). Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED74B.20502@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:12 +02:00
Hidetoshi Seto	b8325c5b11	x86, mce: Introduce mce_gather_info() This patch introduces mce_gather_info() which is to be called at the beginning of error handling and gathers minimum error information from proper error registers (and saved registers). As the result of mce_get_rip() is integrated, unnecessary zeroing is removed. This also takes care of saving RIP which is required to make some decision about error severity for SRAR errors, instead of retrieving it later in the handler. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED71A.1060906@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:10 +02:00
Hidetoshi Seto	2b90e77eae	x86, mce: Replace MCM_ with MCI_MISC_ Follow other MCi register defines. Plus define MCI_MISC_ADDR_LSB() and MCI_MISC_ADDR_MODE(). Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED6E8.9090509@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:10 +02:00
Hidetoshi Seto	b77e70bf35	x86, mce: Replace MCE_SELF_VECTOR by irq_work The MCE handler uses a special vector for self IPI to invoke post-emergency processing in an interrupt context, e.g. call an NMI-unsafe function, wakeup loggers, schedule time-consuming work for recovery, etc. This mechanism is now generalized by the following commit: > `e360adbe29` > Author: Peter Zijlstra <a.p.zijlstra@chello.nl> > Date: Thu Oct 14 14:01:34 2010 +0800 > > irq_work: Add generic hardirq context callbacks > > Provide a mechanism that allows running code in IRQ context. It is > most useful for NMI code that needs to interact with the rest of the > system -- like wakeup a task to drain buffers. : So change to use provided generic mechanism. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED6B2.6080005@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:08 +02:00
Hidetoshi Seto	7639bfc753	x86, mce, severity: Clean up trivial coding style problems More specifically: - sort bits in the macros - use BITCLR/BITSET - coordinate message pattern - use m for struct mce - cleanup for severities_debugfs_init() No functional change. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED679.9090503@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:10:07 +02:00
Hidetoshi Seto	a17957cdec	x86, mce, severity: Cleanup severity table The current format of an item in this table is: condition(param, ..., level, message [, condition2 ...]) So we have to check both an item's head and tail to find the conditions which match the item. Format them in a more straight forward manner: item(level, message, condition [, condition2 ...]) Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED61F.5010502@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 12:09:42 +02:00
Hidetoshi Seto	901d7691d3	x86, mce, severity: Make formatting a bit more readable The table looks very complicated and hard to read for people other than skilled developers. So let's clean it up a bit. At first, change format to ease reading elements in the table. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/4DEED5EB.6050400@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 11:40:21 +02:00
Tony Luck	880a317abc	x86, mce, severity: Fix two severities table signatures The "Spurious not enabled" entry is redundant: the "Not enabled" entry earlier in the table will cover this case. The "Action required; unknown MCACOD" entry shouldn't specify MCACOD in the .mask field. Current code will only match for mcacod==0 rather than all AR=1 entries. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Link: http://lkml.kernel.org/r/4DEED5BC.8030703@jp.fujitsu.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-16 11:37:57 +02:00
Borislav Petkov	86b445676d	x86, microcode, AMD: Correct buf references Both the equivalence table and the microcode patch types are u32. Access them properly through the buf-ptr. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-06-15 15:13:49 +02:00
Masami Hiramatsu	395810627b	x86: Swap save_stack_trace_regs parameters Swap the 1st and 2nd parameters of save_stack_trace_regs() as same as the parameters of save_stack_trace_tsk(). Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Namhyung Kim <namhyung@gmail.com> Link: http://lkml.kernel.org/r/20110608070921.17777.31103.stgit@fedora15 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-06-14 22:48:51 -04:00
Andy Whitcroft	60b8b1de0d	x86 idle: APM requires pm_idle/default_idle unconditionally when a module [ Also from Ben Hutchings <ben@decadent.org.uk> and Vitaliy Ivanov <vitalivanov@gmail.com> ] Commit `06ae40ce07` ("x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it") removed the export for pm_idle/default_idle unless the apm module was modularised and CONFIG_APM_CPU_IDLE was set. But the apm module uses pm_idle/default_idle unconditionally, CONFIG_APM_CPU_IDLE only affects the bios idle threshold. Adjust the export accordingly. [ Used #ifdef instead of #if defined() as it's shorter, and what both Ben and Vitaliy used.. Andy, you're out-voted ;) - Linus ] Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Len Brown <len.brown@intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-14 13:42:20 -07:00
Linus Torvalds	f39e840995	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm: Compare only lower 32 bits of framebuffer map offsets drm/i915: Don't leak in i915_gem_shmem_pread_slow() drm/radeon/kms: do bounds checking for 3D_LOAD_VBPNTR and bump array limit drm/radeon/kms: fix mac g5 quirk x86/uv/x2apic: update for change in pci bridge handling. alpha, drm: Remove obsolete Alpha support in MGA DRM code alpha/drm: Cleanup Alpha support in DRM generic code savage: remove unnecessary if statement drm/radeon: fix GUI idle IH debug statements drm/radeon/kms: check modes against max pixel clock drm: fix fbs in DRM_IOCTL_MODE_GETRESOURCES ioctl	2011-06-14 11:25:32 -07:00
Joerg Roedel	71f7758090	x86/amd-iommu: Store device alias as dev_data pointer This finally allows PCI-Device-IDs to be handled by the IOMMU driver that have no corresponding struct device present in the system. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:58 +02:00
Joerg Roedel	3b03bb745e	x86/amd-iommu: Search for existind dev_data before allocting a new one Search for existing dev_data first will allow to switch dev_data->alias to just store dev_data instead of struct device. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:58 +02:00
Joerg Roedel	2b02b091ab	x86/amd-iommu: Allow dev_data->alias to be NULL Let dev_data->alias be just NULL if the device has no alias. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:58 +02:00
Joerg Roedel	ec9e79ef06	x86/amd-iommu: Use only dev_data in low-level domain attach/detach functions With this patch the low-level attach/detach functions only work on dev_data structures. This allows to remove the dev_data->dev pointer. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:58 +02:00
Joerg Roedel	6c54204793	x86/amd-iommu: Use only dev_data for dte and iotlb flushing routines This patch make the functions flushing the DTE and IOTLBs only take the dev_data structure instead of the struct device directly. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:57 +02:00
Joerg Roedel	ea61cddb9d	x86/amd-iommu: Store ATS state in dev_data This allows the low-level functions to operate on dev_data exclusivly later. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:57 +02:00
Joerg Roedel	f62dda66b5	x86/amd-iommu: Store devid in dev_data This allows to use dev_data independent of struct device later. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:57 +02:00
Joerg Roedel	8fa5f802ab	x86/amd-iommu: Introduce global dev_data_list This list keeps all allocated iommu_dev_data structs in a list together. This is needed for instances that have no associated device. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:57 +02:00
Joerg Roedel	39c555460c	x86/amd-iommu: Remove redundant device_flush_dte() calls Remove these function calls from places where the function has already been called by another function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-14 12:49:57 +02:00
Dave Airlie	7ad35cf288	x86/uv/x2apic: update for change in pci bridge handling. When I added `3448a19da4` I forgot about the special uv handling code for this, so this patch fixes it up. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Ingo Molnar Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-06-14 09:50:12 +10:00
Linus Torvalds	c78a9b9b8e	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: ftrace: Revert `8ab2b7efd` ftrace: Remove unnecessary disabling of irqs kprobes/trace: Fix kprobe selftest for gcc 4.6 ftrace: Fix possible undefined return code oprofile, dcookies: Fix possible circular locking dependency oprofile: Fix locking dependency in sync_start() oprofile: Free potentially owned tasks in case of errors oprofile, x86: Add comments to IBS LVT offset initialization	2011-06-13 10:45:49 -07:00
Linus Torvalds	842c895d14	Merge branches 'x86-urgent-for-linus' and 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: devicetree: Add missing early_init_dt_setup_initrd_arch stub x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Prevent potential NULL dereference in irq_set_irq_wake()	2011-06-13 10:45:10 -07:00
Joe Perches	28f65c11f2	treewide: Convert uses of struct resource to resource_size(ptr) Several fixes as well where the +1 was missing. Done via coccinelle scripts like: @@ struct resource *ptr; @@ - ptr->end - ptr->start + 1 + resource_size(ptr) and some grep and typing. Mostly uncompiled, no cross-compilers. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-06-10 14:55:36 +02:00
Mathias Krause	dac853ae89	exec: delay address limit change until point of no return Unconditionally changing the address limit to USER_DS and not restoring it to its old value in the error path is wrong because it prevents us using kernel memory on repeated calls to this function. This, in fact, breaks the fallback of hard coded paths to the init program from being ever successful if the first candidate fails to load. With this patch applied switching to USER_DS is delayed until the point of no return is reached which makes it possible to have a multi-arch rootfs with one arch specific init binary for each of the (hard coded) probed paths. Since the address limit is already set to USER_DS when start_thread() will be invoked, this redundancy can be safely removed. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-09 12:50:05 -07:00
Florian Fainelli	977cb76d52	x86: devicetree: Add missing early_init_dt_setup_initrd_arch stub This patch fixes the following build failure: drivers/built-in.o: In function `early_init_dt_check_for_initrd': /home/florian/dev/kernel/x86/linux-2.6-x86/drivers/of/fdt.c:571: undefined reference to `early_init_dt_setup_initrd_arch' make: *** [.tmp_vmlinux1] Error 1 which happens as soon as we enable initrd support on a x86 devicetree platform such as Intel CE4100. Signed-off-by: Florian Fainelli <ffainelli@freebox.fr> Acked-by: Grant Likely <grant.likely@secretlab.ca> Cc: Maxime Bizon <mbizon@freebox.fr> Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: stable@kernel.org # 2.6.39 Link: http://lkml.kernel.org/r/201106061015.50039.ffainelli@freebox.fr Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-06-09 15:39:43 +02:00
Ralf Baechle	16f871bc30	x86: i8253: Consolidate definitions of global_clock_event There are multiple declarations of global_clock_event in header files specific to particular clock event implementations. Consolidate them in <asm/time.h> and make sure all users include that header. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Venkatesh Pallipadi (Venki) <venki@google.com> Link: http://lkml.kernel.org/r/20110601180610.762763451@duck.linux-mips.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-06-09 15:01:40 +02:00
Ralf Baechle	15f304b664	i8253: Consolidate all kernel definitions of i8253_lock Move them to drivers/clocksource/i8253.c and remove the implementations in arch/ [ tglx: Avoid the extra file in lib - folded arch patches in. The export will become conditional in a later step ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Link: http://lkml.kernel.org/r/20110601180610.221426078@duck.linux-mips.net Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-06-09 15:01:38 +02:00
Ralf Baechle	334955ef96	i8253: Create linux/i8253.h and use it in all 8253 related files Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Link: http://lkml.kernel.org/r/20110601180610.054254048@duck.linux-mips.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de> arch/arm/mach-footbridge/isa-timer.c \| 2 +- arch/mips/cobalt/time.c \| 2 +- arch/mips/jazz/irq.c \| 2 +- arch/mips/kernel/i8253.c \| 2 +- arch/mips/mti-malta/malta-time.c \| 2 +- arch/mips/sgi-ip22/ip22-time.c \| 2 +- arch/mips/sni/time.c \| 2 +- arch/x86/kernel/apic/apic.c \| 2 +- arch/x86/kernel/apm_32.c \| 2 +- arch/x86/kernel/hpet.c \| 2 +- arch/x86/kernel/i8253.c \| 2 +- arch/x86/kernel/time.c \| 2 +- drivers/block/hd.c \| 2 +- drivers/clocksource/i8253.c \| 2 +- drivers/input/gameport/gameport.c \| 2 +- drivers/input/joystick/analog.c \| 2 +- drivers/input/misc/pcspkr.c \| 2 +- include/linux/i8253.h \| 11 +++++++++++ sound/drivers/pcsp/pcsp.h \| 2 +- 19 files changed, 29 insertions(+), 18 deletions(-)	2011-06-09 15:01:37 +02:00
Ingo Molnar	86dd7909c2	Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent	2011-06-08 15:49:03 +02:00
Thomas Gleixner	fd8a7de177	x86: cpu-hotplug: Prevent softirq wakeup on wrong CPU After a newly plugged CPU sets the cpu_online bit it enables interrupts and goes idle. The cpu which brought up the new cpu waits for the cpu_online bit and when it observes it, it sets the cpu_active bit for this cpu. The cpu_active bit is the relevant one for the scheduler to consider the cpu as a viable target. With forced threaded interrupt handlers which imply forced threaded softirqs we observed the following race: cpu 0 cpu 1 bringup(cpu1); set_cpu_online(smp_processor_id(), true); local_irq_enable(); while (!cpu_online(cpu1)); timer_interrupt() -> wake_up(softirq_thread_cpu1); -> enqueue_on(softirq_thread_cpu1, cpu0); ^^^^ cpu_notify(CPU_ONLINE, cpu1); -> sched_cpu_active(cpu1) -> set_cpu_active((cpu1, true); When an interrupt happens before the cpu_active bit is set by the cpu which brought up the newly onlined cpu, then the scheduler refuses to enqueue the woken thread which is bound to that newly onlined cpu on that newly onlined cpu due to the not yet set cpu_active bit and selects a fallback runqueue. Not really an expected and desirable behaviour. So far this has only been observed with forced hard/softirq threading, but in theory this could happen without forced threaded hard/softirqs as well. It's probably unobservable as it would take a massive interrupt storm on the newly onlined cpu which causes the softirq loop to wake up the softirq thread and an even longer delay of the cpu which waits for the cpu_online bit. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Peter Zijlstra <peterz@infradead.org> Cc: stable@kernel.org # 2.6.39	2011-06-08 11:21:19 +02:00
Benjamin Herrenschmidt	3d5fe5a65a	x86/devicetree: Use generic PCI <-> OF matching Instead of walking the whole PCI tree to update the of_node's for PCI busses and devices after the fact, enable the new generic core code for doing so by providing the proper device nodes for the PCI host bridges Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>	2011-06-08 09:08:40 +10:00
Joerg Roedel	26018874e3	x86/amd-iommu: Fix boot crash with hidden PCI devices Some PCIe cards ship with a PCI-PCIe bridge which is not visible as a PCI device in Linux. But the device-id of the bridge is present in the IOMMU tables which causes a boot crash in the IOMMU driver. This patch fixes by removing these cards from the IOMMU handling. This is a pure -stable fix, a real fix to handle this situation appriatly will follow for the next merge window. Cc: stable@kernel.org # > 2.6.32 Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-07 10:06:59 +02:00
Andy Lutomirski	5cec93c216	x86-64: Emulate legacy vsyscalls There's a fair amount of code in the vsyscall page. It contains a syscall instruction (in the gettimeofday fallback) and who knows what will happen if an exploit jumps into the middle of some other code. Reduce the risk by replacing the vsyscalls with short magic incantations that cause the kernel to emulate the real vsyscalls. These incantations are useless if entered in the middle. This causes vsyscalls to be a little more expensive than real syscalls. Fortunately sensible programs don't use them. The only exception is time() which is still called by glibc through the vsyscall - but calling time() millions of times per second is not sensible. glibc has this fixed in the development tree. This patch is not perfect: the vread_tsc and vread_hpet functions are still at a fixed address. Fixing that might involve making alternative patching work in the vDSO. Signed-off-by: Andy Lutomirski <luto@mit.edu> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/e64e1b3c64858820d12c48fa739efbd1485e79d5.1307292171.git.luto@mit.edu [ Removed the CONFIG option - it's simpler to just do it unconditionally. Tidied up the code as well. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-07 10:02:35 +02:00
Joerg Roedel	27c2127a15	x86/amd-iommu: Use only per-device dma_ops Unfortunatly there are systems where the AMD IOMMU does not cover all devices. This breaks with the current driver as it initializes the global dma_ops variable. This patch limits the AMD IOMMU to the devices listed in the IVRS table fixing DMA for devices not covered by the IOMMU. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-06 17:37:27 +02:00
Joerg Roedel	0de66d5b35	x86/amd-iommu: Fix 3 possible endless loops The driver contains several loops counting on an u16 value where the exit-condition is checked against variables that can have values up to 0xffff. In this case the loops will never exit. This patch fixed 3 such loops. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2011-06-06 16:10:15 +02:00
Andy Lutomirski	5dfcea629a	x86-64: Fill unused parts of the vsyscall page with 0xcc Jumping to 0x00 might do something depending on the following bytes. Jumping to 0xcc is a trap. So fill the unused parts of the vsyscall page with 0xcc to make it useless for exploits to jump there. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/ed54bfcfbe50a9070d20ec1edbe0d149e22a4568.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-06 09:43:14 +02:00
Andy Lutomirski	bb5fe2f78e	x86-64: Remove vsyscall number 3 (venosys) It just segfaults since April 2008 (`a4928cff`), so I'm pretty sure that nothing uses it. And having an empty section makes the linker script a bit fragile. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/4a4abcf47ecadc269f2391a313576fe6d06acef7.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-06 09:43:14 +02:00
Andy Lutomirski	d319bb79af	x86-64: Map the HPET NX Currently the HPET mapping is a user-accessible syscall instruction at a fixed address some of the time. A sufficiently determined hacker might be able to guess when. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/ab41b525a4ca346b1ca1145d16fb8d181861a8aa.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-05 21:30:33 +02:00
Andy Lutomirski	0d7b8547fb	x86-64: Remove kernel.vsyscall64 sysctl It's unnecessary overhead in code that's supposed to be highly optimized. Removing it allows us to remove one of the two syscall instructions in the vsyscall page. The only sensible use for it is for UML users, and it doesn't fully address inconsistent vsyscall results on UML. The real fix for UML is to stop using vsyscalls entirely. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/973ae803fe76f712da4b2740e66dccf452d3b1e4.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-05 21:30:33 +02:00
Andy Lutomirski	9fd67b4ed0	x86-64: Give vvars their own page Move vvars out of the vsyscall page into their own page and mark it NX. Without this patch, an attacker who can force a daemon to call some fixed address could wait until the time contains, say, 0xCD80, and then execute the current time. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/b1460f81dc4463d66ea3f2b5ce240f58d48effec.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-05 21:30:32 +02:00
Andy Lutomirski	8b4777a4b5	x86-64: Document some of entry_64.S Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/fc134867cc550977cc996866129e11a16ba0f9ea.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-05 21:30:32 +02:00
Borislav Petkov	838feb4754	x86, asm: Flip RESTORE_ARGS arguments logic ... thus getting rid of the "else" part of the conditional statement in the macro. No functionality change. Signed-off-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1306873314-32523-4-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-06-03 14:38:53 -07:00
Borislav Petkov	cac0e0a78f	x86, asm: Flip SAVE_ARGS arguments logic This saves us the else part of the conditional statement in the macro. No functionality change. Signed-off-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1306873314-32523-3-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-06-03 14:38:51 -07:00
Tero Roponen	df049672dd	x86: tsc: Remove unneeded DMI-based blacklisting The blacklist was added in response to my bug report (http://lkml.org/lkml/2006/1/19/362) and has never contained more than the one entry describing my old now dead ThinkPad 380XD laptop. As found out later (http://lkml.org/lkml/2007/11/29/50), this special treatment has been unnecessary for a long time, so it can be removed. Signed-off-by: Tero Roponen <tero.roponen@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-05-31 23:19:51 -07:00
Linus Torvalds	af0d6a0a3a	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix mwait_play_dead() faulting on mwait-incapable cpus x86 idle: Fix mwait deprecation warning message Evil merge to remove extra quote noticed by Joe Perches	2011-06-01 02:07:22 +09:00
Linus Torvalds	643d2d7992	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Put back -pg to tsc.o and add no GCOV to vread_tsc_64.o	2011-05-31 20:32:54 +09:00
Robert Richter	cbf74cea07	oprofile, x86: Add comments to IBS LVT offset initialization Adding a comment in the code as IBS LVT setup is not obvious at all ... Signed-off-by: Robert Richter <robert.richter@amd.com>	2011-05-30 16:36:54 +02:00
Avi Kivity	4f3c125c74	x86: Fix mwait_play_dead() faulting on mwait-incapable cpus A logic error in mwait_play_dead() causes the kernel to use mwait even on cpus which don't support it, such as KVM virtual cpus. Introduced by: 349c004e3d31: x86: A fast way to check capabilities of the current cpu Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=36222 Reported-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com> Cc: Christoph Lameter <cl@linux.com> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1306758237-9327-1-git-send-email-avi@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-30 14:37:54 +02:00
Borislav Petkov	598e887d8b	x86 idle: Fix mwait deprecation warning message Fix: arch/x86/kernel/process.c:645:1: warning: unknown escape sequence '\i' due to missing escape backslash, introduced by this commit: 5d4c47e0195b: x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param Signed-off-by: Borislav Petkov <bp@alien8.de> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1306748286-24701-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-30 13:02:04 +02:00
Linus Torvalds	f310642123	Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6: x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param x86 idle: deprecate "no-hlt" cmdline param x86 idle APM: deprecate CONFIG_APM_CPU_IDLE x86 idle floppy: deprecate disable_hlt() x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it x86 idle: clarify AMD erratum 400 workaround idle governor: Avoid lock acquisition to read pm_qos before entering idle cpuidle: menu: fixed wrapping timers at 4.294 seconds	2011-05-29 11:18:09 -07:00
Len Brown	5d4c47e019	x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param mwait_idle() is a C1-only idle loop intended to be more efficient than HLT on SMP hardware that supports it. But mwait_idle() has been replaced by the more general mwait_idle_with_hints(), which handles both C1 and deeper C-states. ACPI uses only mwait_idle_with_hints(), and never uses mwait_idle(). Deprecate mwait_idle() and the "idle=mwait" cmdline param to simplify the x86 idle code. After this change, kernels configured with (!CONFIG_ACPI=n && !CONFIG_INTEL_IDLE=n) when run on hardware that support MWAIT will simply use HLT. If MWAIT is desired on those systems, cpuidle and the cpuidle drivers above can be used. cc: x86@kernel.org cc: stable@kernel.org # .39.x Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:39:17 -04:00
Len Brown	cdaab4a0d3	x86 idle: deprecate "no-hlt" cmdline param We'd rather that modern machines not check if HLT works on every entry into idle, for the benefit of machines that had marginal electricals 15-years ago. If those machines are still running the upstream kernel, they can use "idle=poll". The only difference will be that they'll now invoke HLT in machine_hlt(). cc: x86@kernel.org # .39.x Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:39:16 -04:00
Len Brown	99c6322143	x86 idle APM: deprecate CONFIG_APM_CPU_IDLE We don't want to export the pm_idle function pointer to modules. Currently CONFIG_APM_CPU_IDLE w/ CONFIG_APM_MODULE forces us to. CONFIG_APM_CPU_IDLE is of dubious value, it runs only on 32-bit uniprocessor laptops that are over 10 years old. It calls into the BIOS during idle, and is known to cause a number of machines to fail. Removing CONFIG_APM_CPU_IDLE and will allow us to stop exporting pm_idle. Any systems that were calling into the APM BIOS at run-time will simply use HLT instead. cc: x86@kernel.org cc: Jiri Kosina <jkosina@suse.cz> cc: stable@kernel.org # .39.x Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:39:15 -04:00
Len Brown	06ae40ce07	x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it In the long run, we don't want default_idle() or (pm_idle)() to be exported outside of process.c. Start by not exporting them to modules, unless the APM build demands it. cc: x86@kernel.org cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:39:14 -04:00
Len Brown	02c68a0201	x86 idle: clarify AMD erratum 400 workaround The workaround for AMD erratum 400 uses the term "c1e" falsely suggesting: 1. Intel C1E is somehow involved 2. All AMD processors with C1E are involved Use the string "amd_c1e" instead of simply "c1e" to clarify that this workaround is specific to AMD's version of C1E. Use the string "e400" to clarify that the workaround is specific to AMD processors with Erratum 400. This patch is text-substitution only, with no functional change. cc: x86@kernel.org Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:38:57 -04:00
Linus Torvalds	a947e23a8e	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, asm: Clean up desc.h a bit x86, amd: Do not enable ARAT feature on AMD processors below family 0x12 x86: Move do_page_fault()'s error path under unlikely() x86, efi: Retain boot service code until after switching to virtual mode x86: Remove unnecessary check in detect_ht() x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct x86, UV: Clean up uv_tlb.c x86, UV: Add support for SGI UV2 hub chip x86, cpufeature: Update CPU feature RDRND to RDRAND	2011-05-28 12:57:01 -07:00
Linus Torvalds	c4a227d89f	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) perf: Fix SIGIO handling perf top: Don't stop if no kernel symtab is found perf top: Handle kptr_restrict perf top: Remove unused macro perf events: initialize fd array to -1 instead of 0 perf tools: Make sure kptr_restrict warnings fit 80 col terms perf tools: Fix build on older systems perf symbols: Handle /proc/sys/kernel/kptr_restrict perf: Remove duplicate headers ftrace: Add internal recursive checks tracing: Update btrfs's tracepoints to use u64 interface tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine ftrace: Set ops->flag to enabled even on static function tracing tracing: Have event with function tracer check error return ftrace: Have ftrace_startup() return failure code jump_label: Check entries limit in __jump_label_update ftrace/recordmcount: Avoid STT_FUNC symbols as base on ARM scripts/tags.sh: Add magic for trace-events for etags too scripts/tags.sh: Fix ctags for DEFINE_EVENT() x86/ftrace: Fix compiler warning in ftrace.c ...	2011-05-28 12:55:55 -07:00
Linus Torvalds	571503e100	Merge branch 'setns' * setns: ns: Wire up the setns system call Done as a merge to make it easier to fix up conflicts in arm due to addition of sendmmsg system call	2011-05-28 10:51:01 -07:00
Eric W. Biederman	7b21fddd08	ns: Wire up the setns system call 32bit and 64bit on x86 are tested and working. The rest I have looked at closely and I can't find any problems. setns is an easy system call to wire up. It just takes two ints so I don't expect any weird architecture porting problems. While doing this I have noticed that we have some architectures that are very slow to get new system calls. cris seems to be the slowest where the last system calls wired up were preadv and pwritev. avr32 is weird in that recvmmsg was wired up but never declared in unistd.h. frv is behind with perf_event_open being the last syscall wired up. On h8300 the last system call wired up was epoll_wait. On m32r the last system call wired up was fallocate. mn10300 has recvmmsg as the last system call wired up. The rest seem to at least have syncfs wired up which was new in the 2.6.39. v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com> v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com> v4: Moved wiring up of the system call to another patch v5: ported to v2.6.39-rc6 v6: rebased onto parisc-next and net-next to avoid syscall conflicts. v7: ported to Linus's latest post 2.6.39 tree. > arch/blackfin/include/asm/unistd.h \| 3 ++- > arch/blackfin/mach-common/entry.S \| 1 + Acked-by: Mike Frysinger <vapier@gentoo.org> Oh - ia64 wiring looks good. Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-28 10:48:39 -07:00
Steven Rostedt	89e1be50c6	x86: Put back -pg to tsc.o and add no GCOV to vread_tsc_64.o The commit `44259b1abf` Author: Andy Lutomirski <luto@MIT.EDU> x86-64: Move vread_tsc into a new file with sensible options Removed the -pg from tsc.o which caused the function graph tracer to go into an infinite function call recursion as it uses the tsc internally outside its recursion protection, thus tracing the tsc breaks the function graph tracer. This commit also added the file vread_tsc_64.c that gets used by vdso but failed to prevent GCOV from monkeying with it, causing userspace to try to access kernel data when GCOV was enabled. Thanks to Thomas Gleixner for pointing out GCOV as the likely culprit that added strange kernel accesses into the vread_tsc() call. Cc: Author: Andy Lutomirski <luto@MIT.EDU> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2011-05-27 23:47:16 -04:00
Ingo Molnar	d6a72fe465	Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent	2011-05-27 14:28:09 +02:00
Linus Torvalds	14587a2a25	Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: vdso: Remove unused variable x86-64: Optimize vDSO time() x86-64: Add time to vDSO x86-64: Turn off -pg and turn on -foptimize-sibling-calls for vDSO x86-64: Move vread_tsc into a new file with sensible options x86-64: Vclock_gettime(CLOCK_MONOTONIC) can't ever see nsec < 0 x86-64: Don't generate cmov in vread_tsc x86-64: Remove unnecessary barrier in vread_tsc x86-64: Clean up vdso/kernel shared variables	2011-05-26 12:19:31 -07:00
Linus Torvalds	fce637e392	Merge branches 'core-fixes-for-linus' and 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: seqlock: Get rid of SEQLOCK_UNLOCKED * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq: Remove smp_affinity_list when unregister irq proc	2011-05-26 12:19:11 -07:00

... 3 4 5 6 7 ...

7983 Commits