dect/linux-2.6 (archived)
Commit Graph

70142 Commits

Author SHA1 Message Date
Steffen Rumler 3c75296562 powerpc: Fix kernel panic during kernel module load
This fixes a problem which can cause kernel oopses while loading
a kernel module.

According to the PowerPC EABI specification, GPR r11 is assigned
the dedicated function to point to the previous stack frame.
In the powerpc-specific kernel module loader, do_plt_call()
(in arch/powerpc/kernel/module_32.c), GPR r11 is also used
to generate trampoline code.

This combination crashes the kernel, in the case where the compiler
chooses to use a helper function for saving GPRs on entry, and the
module loader has placed the .init.text section far away from the
.text section, meaning that it has to generate a trampoline for
functions in the .init.text section to call the GPR save helper.
Because the trampoline trashes r11, references to the stack frame
using r11 can cause an oops.

The fix just uses GPR r12 instead of GPR r11 for generating the
trampoline code.  According to the statements from Freescale, this is
safe from an EABI perspective.

I've tested the fix for kernel 2.6.33 on MPC8541.

Cc: stable@vger.kernel.org
Signed-off-by: Steffen Rumler <steffen.rumler.ext@nsn.com>
[paulus@samba.org: reworded the description]
Signed-off-by: Paul Mackerras <paulus@samba.org>
2012-06-08 19:59:08 +10:00
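The register clash described in the powerpc commit above can be illustrated with a toy model. This is a minimal Python sketch, not kernel code: the register file, the `save_helper` function, and both addresses are hypothetical, and only the roles of r11 and r12 are taken from the commit message.

```python
# Toy model of the module-loader trampoline (not kernel code).
# All names and addresses are illustrative assumptions.

def call_via_trampoline(regs, target, scratch):
    """Model a trampoline: load the target address into a scratch
    register, then branch to it. The scratch register is overwritten."""
    regs[scratch] = target["address"]
    return target["func"](regs)

def save_helper(regs):
    """Model the compiler's GPR save helper, which addresses the stack
    frame through r11. Returns the address it would store to."""
    return regs["r11"]

def run(scratch):
    frame = 0xC7FF0000                       # hypothetical stack frame address
    regs = {"r11": frame, "r12": 0}
    helper = {"address": 0xC0012340, "func": save_helper}  # hypothetical address
    used = call_via_trampoline(regs, helper, scratch)
    return used == frame                     # True if the helper saw the real frame

print("scratch=r11 keeps frame pointer:", run("r11"))
print("scratch=r12 keeps frame pointer:", run("r12"))
```

With r11 as the scratch register the helper receives the trampoline's target address instead of the frame address, which is the failure the commit describes. With r12 the frame address survives.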
Cliff Wickman d5d2d2eea8 x86/uv: Fix UV2 BAU legacy mode
The SGI Altix UV2 BAU (Broadcast Assist Unit) as used for
tlb-shootdown (selective broadcast mode) always uses UV2
broadcast descriptor format. There is no need to clear the
'legacy' (UV1) mode, because the hardware always uses UV2 mode
for selective broadcast.

But the BIOS uses general broadcast and legacy mode, and the
hardware pays attention to the legacy mode bit for general
broadcast. So the kernel must not clear that mode bit.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/E1SccoO-0002Lh-Cb@eag09.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:48:28 +02:00
Yinghai Lu bd2753b2dd x86/mm: Only add extra pages count for the first memory range during pre-allocation early page table space
Robin found this regression:

| I just tried to boot an 8TB system.  It fails very early in boot with:
| Kernel panic - not syncing: Cannot find space for the kernel page tables

git bisect points to commit 722bc6b167.

A git revert of that commit does boot past that point on the 8TB
configuration.

That commit adds extra pages for every memory range, even those
above 4G.

Limit the extra page count to the first entry only.

Bisected-by: Robin Holt <holt@sgi.com>
Tested-by: Robin Holt <holt@sgi.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/CAE9FiQUj3wyzQxtq9yzBNc9u220p8JZ1FYHG7t%3DMOzJ%3D9BZMYA@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:40:50 +02:00
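The effect of adding the extra pages once rather than per range can be sketched as follows. This is a simplified Python model under stated assumptions: `EXTRA_PAGES`, the memory ranges, and the table-size formula are hypothetical and do not reproduce the kernel's actual calculation.

```python
# Toy estimate of early page-table space (not the kernel's formula).
PAGE_SIZE = 4096
PTES_PER_PAGE = 512        # 8-byte entries in one 4 KiB table
EXTRA_PAGES = 64           # hypothetical fixed overhead (an assumption)

def table_pages(start, end):
    """Last-level table pages needed to map [start, end) with 4 KiB pages."""
    pages_to_map = (end - start + PAGE_SIZE - 1) // PAGE_SIZE
    return (pages_to_map + PTES_PER_PAGE - 1) // PTES_PER_PAGE

def estimate(ranges, extra_for_every_range):
    """Sum the table pages for all ranges, adding the fixed overhead
    either to every range or to the first range only."""
    total = 0
    for index, (start, end) in enumerate(ranges):
        total += table_pages(start, end)
        if extra_for_every_range or index == 0:
            total += EXTRA_PAGES
    return total

GIB = 1 << 30
ranges = [(0, GIB), (4 * GIB, 5 * GIB), (8 * GIB, 9 * GIB)]  # hypothetical layout
print(estimate(ranges, True), estimate(ranges, False))
```

In this model the over-estimate grows with the number of ranges, which is one way a machine with many memory ranges could run out of room for the early page tables.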
Paul Mackerras 860aed25a1 powerpc/time: Sanity check of decrementer expiration is necessary
This reverts 68568add2c ("powerpc/time: Remove unnecessary sanity check
of decrementer expiration").  We do need to check whether we have reached
the expiration time of the next event, because we sometimes get an early
decrementer interrupt, most notably when we set the decrementer to 1 in
arch_irq_work_raise().  The effect of not having the sanity check is that
if timer_interrupt() gets called early, we leave the decrementer set to
its maximum value, which means we then don't get any more decrementer
interrupts for about 4 seconds (or longer, depending on timebase
frequency).  I saw these pauses as a consequence of getting a stray
hypervisor decrementer interrupt left over from exiting a KVM guest.

This isn't quite a straight revert because of changes to the surrounding
code, but it restores the same algorithm as was previously used.

Cc: stable@vger.kernel.org
Acked-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2012-06-08 14:07:35 +10:00
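The restored sanity check in the decrementer commit above can be modelled in a few lines. This Python sketch is an illustration only: the function name and return shape are invented for the example, and re-arming for the following event after a handler run is deliberately left out.

```python
DECREMENTER_MAX = 0x7FFFFFFF   # largest value of a 32-bit signed decrementer

def timer_interrupt(now, next_event, sanity_check):
    """Return (decrementer_value, handler_ran) after one interrupt.

    `now` and `next_event` are timebase ticks. The handler first parks
    the decrementer at its maximum, as the commit message describes.
    """
    decrementer = DECREMENTER_MAX
    if sanity_check and now < next_event:
        # Early interrupt: the event is not due yet, so re-arm the
        # decrementer for the remaining ticks instead of running it.
        remaining = next_event - now
        if remaining <= DECREMENTER_MAX:
            decrementer = remaining
        return decrementer, False
    # No check, or the event really is due: run the event handler.
    # In this model an early call leaves the decrementer at its maximum.
    return decrementer, True

print(timer_interrupt(1000, 1500, sanity_check=True))
print(timer_interrupt(1000, 1500, sanity_check=False))
```

Without the check, an early interrupt leaves the decrementer at `DECREMENTER_MAX`, so the next interrupt arrives only after the full countdown. That matches the multi-second pauses reported in the message.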
Jordan Justen 743628e868 x86, efi stub: Add .reloc section back into image
Some UEFI firmware will not load a .efi with a .reloc section
with a size of 0.

Therefore, we create a .efi image with 4 main areas and 3 sections.
1. PE/COFF file header
2. .setup section (covers all setup code following the first sector)
3. .reloc section (contains 1 dummy reloc entry, created in build.c)
4. .text section (covers the remaining kernel image)

To make room for the new .setup section data, the header
bugger_off_msg had to be shortened.

Reported-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Link: http://lkml.kernel.org/r/1339085121-12760-1-git-send-email-jordan.l.justen@intel.com
Tested-by: Lee G Rosenbaum <lee.g.rosenbaum@intel.com>
Tested-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-07 09:52:33 -07:00
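For readers unfamiliar with PE/COFF base relocations, the following Python sketch builds a relocation block that holds a single padding entry, so the .reloc section has a non-zero size. The commit only says that build.c creates one dummy entry; the exact bytes it writes are not shown here, so the page RVA and the choice of an absolute (skipped) entry are assumptions based on the general PE/COFF block layout.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # relocation type that a loader skips

def dummy_reloc_block(page_rva):
    """Build one base relocation block containing one padding entry.

    Layout: 4-byte page RVA, 4-byte block size, then 2-byte entries
    (high 4 bits: type, low 12 bits: offset within the page).
    """
    entry = (IMAGE_REL_BASED_ABSOLUTE << 12) | 0
    block_size = 8 + 2          # 8-byte header plus one 2-byte entry
    return struct.pack("<IIH", page_rva, block_size, entry)

block = dummy_reloc_block(0x1000)   # 0x1000 is a hypothetical page RVA
print(len(block), block.hex())
```

A block like this asks the loader to change nothing, but it gives firmware that rejects empty .reloc sections something to parse.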
Linus Torvalds 513335f964 PARISC fixes on 20120607
This is a set of three bug fixes for minor build breakages that got introduced
 just before 3.5-rc1 was released.

Merge tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6

Pull PARISC fixes from James Bottomley:
 "This is a set of three bug fixes for minor build breakages that got
  introduced just before 3.5-rc1 was released."

* tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
  [PARISC] fix code to find libgcc
  [PARISC] fix compile break in use of lib/strncopy_from_user.c
  [PARISC] fix missing TAINT_WARN problem
2012-06-07 09:06:54 -07:00
Linus Torvalds 0c30989cc9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull tile fixes from Chris Metcalf:
 "These two minor bug fixes fix build failures from some changes that
  were merged in during the 3.5 merge window."

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: add #include to unbreak build after generic init_task conversion
  tile: remove cpu_idle_on_new_stack
2012-06-07 09:06:13 -07:00
Chris Metcalf 2ded5c2484 tile: add #include to unbreak build after generic init_task conversion
Some code was moved from init_task.c to setup.c but the appropriate
header needed to be moved as well.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-06-06 11:29:35 -04:00
Chris Metcalf 10db9e009a tile: remove cpu_idle_on_new_stack
This routine isn't used unless CONFIG_HOMECACHE is enabled, which
isn't even available as a public configuration option yet.
Since it no longer links correctly in 3.4, just remove it for now.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-06-06 11:29:31 -04:00
Arun Sharma db0dc75d64 perf/x86: Check user address explicitly in copy_from_user_nmi()
Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-5-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:08:04 +02:00
Arun Sharma bc6ca7b342 perf/x86: Check if user fp is valid
Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-4-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:08:01 +02:00
Arun Sharma 302fa4b58a perf/x86: Allow multiple stacks
Without this patch, applications with two different stack
regions (eg: native stack vs JIT stack) get truncated
callchains even when RBP chaining is present. GDB shows proper
stack traces and the frame pointer chaining is intact.

This patch disables the (fp < RSP) check, hoping that other checks
in the code save the day for us. In our limited testing, this
didn't seem to break anything.

In the long term, we could potentially have userspace advise
the kernel on the range of valid stack addresses, so we don't
spend a lot of time unwinding from bogus addresses.

Signed-off-by: Arun Sharma <asharma@fb.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-2-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:07:58 +02:00
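The truncated callchains described in "perf/x86: Allow multiple stacks" can be reproduced with a toy frame-pointer walker. This Python sketch is not the perf unwinder: the `memory` map, the addresses, and the `require_fp_above_sp` switch are hypothetical, and the `fp in memory` test merely stands in for the other validity checks the message mentions.

```python
def walk(memory, fp, sp, require_fp_above_sp, max_depth=16):
    """Follow a frame-pointer chain.

    memory maps a frame address to (next_frame_pointer, return_address).
    """
    chain = []
    while fp in memory and len(chain) < max_depth:
        if require_fp_above_sp and fp < sp:
            break                      # old behaviour: treat the frame as bogus
        next_fp, return_address = memory[fp]
        chain.append(return_address)
        fp = next_fp
    return chain

# A JIT stack at high addresses whose oldest frame links into a native
# stack at lower addresses (all values hypothetical).
memory = {
    0x70000100: (0x10000200, 0x401000),
    0x10000200: (0x10000300, 0x402000),
    0x10000300: (0, 0x403000),
}
sp = 0x70000000
print(len(walk(memory, 0x70000100, sp, True)),
      len(walk(memory, 0x70000100, sp, False)))
```

With the check enabled, the walk stops as soon as the chain crosses into the second stack region. With it disabled, all three frames are reported.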
Peter Zijlstra 8440ccb43f perf/x86: Update SNB PEBS constraints
Afaict there's no need to (incompletely) iterate the
MEM_UOPS_RETIRED.* umask state.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:59:52 +02:00
Peter Zijlstra b6db437ba8 perf/x86: Enable/Add IvyBridge hardware support
Implement rudimentary IVB perf support. The SDM states it's identical
to SNB with the exception of the exact event tables, but a quick look
suggests they're similar enough.

Also mark SNB-EP as broken for now.

Requested-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:59:49 +02:00
Peter Zijlstra cccb9ba9e4 perf/x86: Implement cycles:p for SNB/IVB
Now that there's finally a chip with working PEBS (IvyBridge), we can
enable the hardware and implement cycles:p for SNB/IVB.

Cc: Stephane Eranian <eranian@google.com>
Requested-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:59:47 +02:00
Peter Zijlstra b430f7c470 perf/x86: Fix Intel shared extra MSR allocation
Zheng Yan reported that event group validation can wreck event state
when Intel extra_reg allocation changes event state.

Validation shouldn't change any persistent state. Cloning events in
validate_{event,group}() isn't really pretty either, so add a few
special cases to avoid modifying the event state.

The code is restructured to minimize the special case impact.

Reported-by: Zheng Yan <zheng.z.yan@linux.intel.com>
Acked-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338903031.28282.175.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:59:44 +02:00
Kamalesh Babulal ceb1cbac8e sched/x86: Calculate booted cores after construction of sibling_mask
Commit 316ad24830 ("sched/x86: Rewrite set_cpu_sibling_map()")
broke the booted_cores accounting.

The problem is that the booted_cores accounting needs all the
sibling links set up. So restore the second loop and add a comment as
to why it's needed.

On qemu booted with -smp sockets=1,cores=2,threads=2;
Before:
 $ grep cores /proc/cpuinfo
 cpu cores       : 2
 cpu cores       : 1
 cpu cores       : 4
 cpu cores       : 3

With the patch:
 $ grep cores /proc/cpuinfo
 cpu cores       : 2
 cpu cores       : 2
 cpu cores       : 2
 cpu cores       : 2

Reported-by: Prarit Bhargava <prarit@redhat.com>
Reported-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120531073738.GH7511@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 16:37:59 +02:00
Geert Uytterhoeven d8ce7263e1 m68k: Use generic strncpy_from_user(), strlen_user(), and strnlen_user()
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2012-06-06 15:31:28 +02:00
Tomoki Sekiyama f6175f5bfb x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
In current Linux, the percpu variable `vector_irq' is not cleared on
offlined cpus while disabling devices' irqs. If a cpu that still has
the disabled irqs in vector_irq is hotplugged,
__setup_vector_irq() hits an invalid irq vector and may crash.

This bug can be reproduced as follows:

  # echo 0 > /sys/devices/system/cpu/cpu7/online
  # modprobe -r some_driver_using_interrupts      # vector_irq@cpu7 uncleared
  # echo 1 > /sys/devices/system/cpu/cpu7/online  # kernel may crash

This patch fixes this bug by clearing vector_irq in
__clear_irq_vector() even if the cpu is offlined.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: ltc-kernel@ml.yrl.intra.hitachi.co.jp
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/4FC340BE.7080101@hitachi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 12:03:25 +02:00
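The stale-entry problem from the x86/ioapic commit above can be shown with a small model of the per-CPU vector tables. This is a Python sketch under stated assumptions: the table layout, the `clear_offline` flag, and the vector and irq numbers are hypothetical and only mirror the behaviour the message describes.

```python
def clear_irq_vector(vector_irq, online, domain, vector, clear_offline):
    """Drop `vector` from the per-CPU tables of the CPUs in `domain`.

    vector_irq maps cpu -> {vector: irq}. With clear_offline=False only
    online CPUs are touched, which models the behaviour before the fix.
    """
    for cpu in domain:
        if cpu in online or clear_offline:
            vector_irq[cpu].pop(vector, None)

def entries_seen_by_cpu7(clear_offline):
    vector_irq = {6: {0x41: 30}, 7: {0x41: 30}}   # irq 30 on vector 0x41
    online = {6}                                  # cpu7 was taken offline first
    clear_irq_vector(vector_irq, online, domain=[6, 7], vector=0x41,
                     clear_offline=clear_offline)
    return vector_irq[7]                          # what cpu7 finds when re-onlined

print(entries_seen_by_cpu7(False), entries_seen_by_cpu7(True))
```

When offline CPUs are skipped, cpu7 comes back with an entry for an irq that no longer exists. Clearing regardless of online state leaves its table empty.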
Feng Tang 55c844a4dd x86/reboot: Fix a warning message triggered by stop_other_cpus()
When rebooting our 24 CPU Westmere servers with 3.4-rc6, we
always see this warning msg:

Restarting system.
machine restart
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:125
native_smp_send_reschedule+0x74/0xa7() Hardware name: X8DTN
Modules linked in: igb [last unloaded: scsi_wait_scan]
Pid: 1, comm: systemd-shutdow Not tainted 3.4.0-rc6+ #22
Call Trace:
 <IRQ>  [<ffffffff8102a41f>] warn_slowpath_common+0x7e/0x96
 [<ffffffff8102a44c>] warn_slowpath_null+0x15/0x17
 [<ffffffff81018cf7>] native_smp_send_reschedule+0x74/0xa7
 [<ffffffff810561c1>] trigger_load_balance+0x279/0x2a6
 [<ffffffff81050112>] scheduler_tick+0xe0/0xe9
 [<ffffffff81036768>] update_process_times+0x60/0x70
 [<ffffffff81062f2f>] tick_sched_timer+0x68/0x92
 [<ffffffff81046e33>] __run_hrtimer+0xb3/0x13c
 [<ffffffff81062ec7>] ? tick_nohz_handler+0xd0/0xd0
 [<ffffffff810474f2>] hrtimer_interrupt+0xdb/0x198
 [<ffffffff81019a35>] smp_apic_timer_interrupt+0x81/0x94
 [<ffffffff81655187>] apic_timer_interrupt+0x67/0x70
 <EOI>  [<ffffffff8101a3c4>] ? default_send_IPI_mask_allbutself_phys+0xb4/0xc4
 [<ffffffff8101c680>] physflat_send_IPI_allbutself+0x12/0x14
 [<ffffffff81018db4>] native_nmi_stop_other_cpus+0x8a/0xd6
 [<ffffffff810188ba>] native_machine_shutdown+0x50/0x67
 [<ffffffff81018926>] machine_shutdown+0xa/0xc
 [<ffffffff8101897e>] native_machine_restart+0x20/0x32
 [<ffffffff810189b0>] machine_restart+0xa/0xc
 [<ffffffff8103b196>] kernel_restart+0x47/0x4c
 [<ffffffff8103b2e6>] sys_reboot+0x13e/0x17c
 [<ffffffff8164e436>] ? _raw_spin_unlock_bh+0x10/0x12
 [<ffffffff810fcac9>] ? bdi_queue_work+0xcf/0xd8
 [<ffffffff810fe82f>] ? __bdi_start_writeback+0xae/0xb7
 [<ffffffff810e0d64>] ? iterate_supers+0xa3/0xb7
 [<ffffffff816547a2>] system_call_fastpath+0x16/0x1b
---[ end trace 320af5cb1cb60c5b ]---

The root cause seems to be that
default_send_IPI_mask_allbutself_phys() takes quite some time (I
measured several ms) to finish sending NMIs to all
the other 23 CPUs. On a HZ=250/1000 system that is long
enough for a timer interrupt to arrive, which in turn
kicks load balancing on a stopped CPU and causes this
warning in native_smp_send_reschedule().

Disabling local irqs before stop_other_cpus() fixes this
problem (25 test reboots were ok). This is fine because nothing
should care about the timer interrupt at this stage of the
reboot.

The latest 3.4 kernel slightly changes this behavior by sending
REBOOT_VECTOR first and only sending NMI_VECTOR if the REBOOT_VECTOR
fails, and this patch is still needed to prevent the problem.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20120530231541.4c13433a@feng-i7
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 12:03:23 +02:00
Sebastian Andrzej Siewior 7071f6b288 x86/intel/moorestown: Change intel_scu_devices_create() to __devinit
The allmodconfig hits:

 WARNING: vmlinux.o(.text+0x6553d): Section mismatch in
          reference from the function intel_scu_devices_create() to the
          function .devinit.text: spi_register_board_info()
	  [...]

This patch marks intel_scu_devices_create() as devinit because
it only calls a devinit function, spi_register_board_info().

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Link: http://lkml.kernel.org/r/20120531212025.GA8519@breakpoint.cc
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 11:58:40 +02:00
Yasuaki Ishimatsu 4af463d28f x86/numa: Set numa_nodes_parsed at acpi_numa_memory_affinity_init()
When hot-adding a CPU, the system outputs the following messages
because no memory was allocated for node_to_cpumask_map[2].

Booting Node 2 Processor 32 APIC 0xc0
node_to_cpumask_map[2] NULL
Pid: 0, comm: swapper/32 Tainted: G       A     3.3.5-acd #21
Call Trace:
 [<ffffffff81048845>] debug_cpumask_set_cpu+0x155/0x160
 [<ffffffff8105e28a>] ? add_timer_on+0xaa/0x120
 [<ffffffff8150665f>] numa_add_cpu+0x1e/0x22
 [<ffffffff815020bb>] identify_cpu+0x1df/0x1e4
 [<ffffffff815020d6>] identify_secondary_cpu+0x16/0x1d
 [<ffffffff81504614>] smp_store_cpu_info+0x3c/0x3e
 [<ffffffff81505263>] smp_callin+0x139/0x1be
 [<ffffffff815052fb>] start_secondary+0x13/0xeb

The reason is that the bit for node 2 was not set in
numa_nodes_parsed. numa_nodes_parsed is set only by
acpi_numa_processor_affinity_init /
acpi_numa_x2apic_affinity_init. Thus even if hot-added memory
with the same PXM as the hot-added CPU is described in the ACPI SRAT
table, numa_nodes_parsed is not set when the hot-added CPU itself
is not described there.

The ACPI Spec Rev 5.0 describes the ACPI SRAT
table as follows: This optional table provides information that
allows OSPM to associate processors and memory ranges, including
ranges of memory provided by hot-added memory devices, with
system localities / proximity domains and clock domains.

This means that the ACPI SRAT table only provides information for CPUs
present at boot time and for memory, including hot-added memory.
So hot-added memory is described in the ACPI SRAT table, but a
hot-added CPU is not. Thus numa_nodes_parsed should be set
not only by acpi_numa_processor_affinity_init /
acpi_numa_x2apic_affinity_init but also by
acpi_numa_memory_affinity_init in this case.

Additionally, if the system has a cpuless memory node,
acpi_numa_processor_affinity_init /
acpi_numa_x2apic_affinity_init cannot set numa_nodes_parsed
since these functions cannot find a cpu description for the node.
In this case, numa_nodes_parsed needs to be set by
acpi_numa_memory_affinity_init.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: liuj97@gmail.com
Cc: kosaki.motohiro@gmail.com
Link: http://lkml.kernel.org/r/4FCC2098.4030007@jp.fujitsu.com
[ merged it ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 11:58:39 +02:00
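The gap described in the x86/numa commit above comes down to which SRAT entry types mark a node as parsed. The following Python sketch models that with a plain set. The entry tuples and the `memory_sets_node` flag are hypothetical and stand in for the three ACPI affinity init functions named in the message.

```python
def parse_srat(entries, memory_sets_node):
    """Return the set of NUMA nodes marked as parsed.

    entries are (kind, node) pairs with kind "cpu" or "memory".
    memory_sets_node=False models the behaviour before the fix, where
    only processor affinity entries mark a node.
    """
    parsed = set()
    for kind, node in entries:
        if kind == "cpu" or (kind == "memory" and memory_sets_node):
            parsed.add(node)
    return parsed

# Node 2 appears only through a memory entry, as with hot-add memory
# or a cpuless memory node (hypothetical table).
srat = [("cpu", 0), ("memory", 0), ("cpu", 1), ("memory", 1), ("memory", 2)]
print(sorted(parse_srat(srat, False)), sorted(parse_srat(srat, True)))
```

Node 2 is missing from the first result, which corresponds to the unallocated node_to_cpumask_map[2] in the report.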
Xiaotian Feng aff5a62d52 x86/gart: Fix kmemleak warning
aperture_64.c now uses memblock, so the previous
kmemleak_ignore() for alloc_bootmem() should be removed.

Otherwise, with kmemleak enabled, the kernel will throw warnings
like:

[    0.000000] kmemleak: Trying to color unknown object at 0xffff8800c4000000 as Black
[    0.000000] Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc1-next-20120605+ #130
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff811b27e6>] paint_ptr+0x66/0xc0
[    0.000000]  [<ffffffff816b90fb>] kmemleak_ignore+0x2b/0x60
[    0.000000]  [<ffffffff81ef7bc0>] kmemleak_init+0x217/0x2c1
[    0.000000]  [<ffffffff81ed2b97>] start_kernel+0x32d/0x3eb
[    0.000000]  [<ffffffff81ed25e4>] ? repair_env_string+0x5a/0x5a
[    0.000000]  [<ffffffff81ed2356>] x86_64_start_reservations+0x131/0x135
[    0.000000]  [<ffffffff81ed2120>] ? early_idt_handlers+0x120/0x120
[    0.000000]  [<ffffffff81ed245c>] x86_64_start_kernel+0x102/0x111
[    0.000000] kmemleak: Early log backtrace:
[    0.000000]    [<ffffffff816b911b>] kmemleak_ignore+0x4b/0x60
[    0.000000]    [<ffffffff81ee6a38>] gart_iommu_hole_init+0x3e7/0x547
[    0.000000]    [<ffffffff81edb20b>] pci_iommu_alloc+0x44/0x6f
[    0.000000]    [<ffffffff81ee81ad>] mem_init+0x19/0xec
[    0.000000]    [<ffffffff81ed2a54>] start_kernel+0x1ea/0x3eb
[    0.000000]    [<ffffffff81ed2356>] x86_64_start_reservations+0x131/0x135
[    0.000000]    [<ffffffff81ed245c>] x86_64_start_kernel+0x102/0x111
[    0.000000]    [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Cc: Xiaotian Feng <xtfeng@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1338922831-2847-1-git-send-email-xtfeng@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 11:58:38 +02:00
Thomas Gleixner 1a87fc1ec7 x86: mce: Add the dropped timer interval init back
commit 82f7af09 ("x86/mce: Cleanup timer mess") dropped the
initialization of the per cpu timer interval. Duh :(

Restore the previous behaviour.

Reported-by: Chen Gong <gong.chen@linux.intel.com>
Cc: bp@amd64.org
Cc: tony.luck@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-06-06 11:33:21 +02:00
Masami Hiramatsu 436d03faf6 x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix
Fix the x86 instruction decoder to decode bsr/bsf/jmpe with
operand-size prefix (66h). This fixes the test case failure
reported by Linus, attached below.

bsf/bsr/jmpe have a special encoding. The opcode map in the
Intel Software Developer's Manual vol2 says they have
TZCNT/LZCNT variants when used with the F3h prefix. However, it
gives no information about the other prefixes, 66h or F2h.
The current instruction decoder assumes that those are
bad instructions, but the hardware actually accepts at least
operand-size prefixes.

H. Peter Anvin further explains:

 " TZCNT/LZCNT are F3 + BSF/BSR exactly because the F2 and
   F3 prefixes have historically been no-ops with most instructions.
   This allows software to unconditionally use the prefixed versions
   and get TZCNT/LZCNT on the processors that have them if they don't
   care about the difference. "

This fixes errors reported by test_get_len:

  Warning: arch/x86/tools/test_get_len found difference at <em_bsf>:ffffffff81036d87
  Warning: ffffffff81036de5:	66 0f bc c2          	bsf    %dx,%ax
  Warning: objdump says 4 bytes, but insn_get_length() says 3
  Warning: arch/x86/tools/test_get_len found difference at <em_bsr>:ffffffff81036ea6
  Warning: ffffffff81036f04:	66 0f bd c2          	bsr    %dx,%ax
  Warning: objdump says 4 bytes, but insn_get_length() says 3
  Warning: decoded and checked 13298882 instructions with 2 warnings

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <yrl.pp-manager.tt@hitachi.com>
Link: http://lkml.kernel.org/r/20120604150911.22338.43296.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 08:54:18 +02:00
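The length mismatch in the test_get_len output above (4 bytes versus 3) can be checked with a very small length calculator. This Python sketch is not the kernel's decoder: it handles only two-byte (0F xx) opcodes with a register-direct ModRM byte, which is enough for the two instructions in the report, and the function name is reused here purely for illustration.

```python
LEGACY_PREFIXES = {0x66, 0xF2, 0xF3}

def insn_length(code):
    """Length in bytes of a prefixed 0F xx instruction with a
    register-direct ModRM byte (mod == 3, so no SIB or displacement)."""
    pos = 0
    while code[pos] in LEGACY_PREFIXES:
        pos += 1                      # each legacy prefix is one byte
    if code[pos] != 0x0F:
        raise ValueError("only two-byte (0F xx) opcodes are modelled")
    pos += 2                          # 0F escape byte plus opcode byte
    modrm = code[pos]
    if modrm >> 6 != 0b11:
        raise ValueError("only register-direct operands are modelled")
    return pos + 1                    # count the ModRM byte itself

print(insn_length(bytes.fromhex("660fbcc2")))   # bsf %dx,%ax
print(insn_length(bytes.fromhex("660fbdc2")))   # bsr %dx,%ax
```

Counting the 66h prefix as part of the instruction gives the same 4-byte length that objdump reports.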
Chen Gong 958fb3c512 x86/mce: Fix the MCE poll timer logic
In commit 82f7af09 ("x86/mce: Cleanup timer mess"), Thomas just
forgot the "/ 2" there while cleaning up.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@amd64.org
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/1338863702-9245-1-git-send-email-gong.chen@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 08:28:21 +02:00
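The commit message only says that a "/ 2" was lost. As a sketch of why that matters, assume an adaptive poll timer that shortens its interval after an error is logged and lengthens it otherwise. The Python model below is written under that assumption, and `HZ`, `CHECK_INTERVAL`, and the `halve` switch are hypothetical values chosen for the example.

```python
HZ = 1000                        # hypothetical tick rate
CHECK_INTERVAL = 5 * 60 * HZ     # hypothetical upper bound, in jiffies

def next_interval(interval, error_logged, halve=True):
    """Return the next poll interval in jiffies.

    halve=False models the regression, where the division by two is
    missing and the interval never shrinks after an error.
    """
    if error_logged:
        shorter = interval // 2 if halve else interval
        return max(shorter, HZ // 100)
    return min(interval * 2, CHECK_INTERVAL)

print(next_interval(8000, True), next_interval(8000, True, halve=False))
```

Under this assumption, the missing division means an error no longer speeds up polling, which is a semantic change rather than a cosmetic one.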
Linus Torvalds eea5b5510f Typo/thinko in a cleanup caused a semantic change. Fix it.

Merge tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull MCE regression fix from Tony Luck:
 "Typo/thinko in a cleanup caused a semantic change. Fix it."

* tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  x86/mce: Fix the MCE poll timer logic
2012-06-05 15:15:04 -07:00
Linus Torvalds ecc728467f Merge branch 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull arm CMA fix from Marek Szyprowski:
 "This removes the ARMv6+ CMA dependency and lets one use old, well-
  tested dma-mapping implementation also on ARMv6+ systems without the
  need to use EXPERIMENTAL stuff."

Russell King complained (rightly) about the experimental feature being
forced on by the ARM config.

Here CMA is "contiguous memory allocator", not "cross-memory attach".
We really need to stop using insane TLA's for things that aren't big
industry standards.

* 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma-mapping: remove unconditional dependency on CMA
2012-06-05 13:23:17 -07:00
Chen Gong c2238f10e0 x86/mce: Fix the MCE poll timer logic
In commit 82f7af09 (x86/mce: Cleanup timer mess), Thomas just forgot
the "/ 2" there while cleaning up.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-06-05 10:15:07 -07:00
Linus Torvalds 0b3e9f3f21 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar.

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Remove NULL assignment of dattr_cur
  sched: Remove the last NULL entry from sched_feat_names
  sched: Make sched_feat_names const
  sched/rt: Fix SCHED_RR across cgroups
  sched: Move nr_cpus_allowed out of 'struct sched_rt_entity'
  sched: Make sure to not re-read variables after validation
  sched: Fix SD_OVERLAP
  sched: Don't try allocating memory from offline nodes
  sched/nohz: Fix rq->cpu_load calculations some more
  sched/x86: Use cpu_llc_shared_mask(cpu) for coregroup_mask
2012-06-05 09:47:15 -07:00
Tomi Valkeinen c3a21fc79b OMAPDSS: fix registration of DPI and SDI devices
The omapdss arch initialization code registers all the output devices as
omap_devices. However, DPI and SDI are not proper omap_devices, as they
do not have any corresponding HWMOD. This leads to crashes or problems
when the platform code tries to use omap_device functions for DPI and
SDI devices.

One such crash was reported by John Stultz <johnstul@us.ibm.com>:

[   18.756835] Unable to handle kernel NULL pointer dereference at virtual addr8
[   18.765319] pgd = ea6b8000
[   18.768188] [00000018] *pgd=aa942831, *pte=00000000, *ppte=00000000
[   18.774749] Internal error: Oops: 17 [#1] SMP ARM
[   18.779663] Modules linked in:
[   18.782836] CPU: 0    Not tainted  (3.5.0-rc1-dirty #456)
[   18.788482] PC is at _od_resume_noirq+0x1c/0x78
[   18.793212] LR is at _od_resume_noirq+0x6c/0x78
[   18.797943] pc : [<c00307ec>]    lr : [<c003083c>]    psr: 20000113
[   18.797943] sp : ec3abe80  ip : ec3abdb8  fp : 00000006
[   18.809936] r10: ec1148b8  r9 : c08a48f0  r8 : c00307d0
[   18.815368] r7 : 00000000  r6 : 00000000  r5 : ec114800  r4 : ec114808
[   18.822174] r3 : 00000000  r2 : 00000000  r1 : ec154fe8  r0 : 00000006
[   18.829010] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   18.836456] Control: 10c5387d  Table: aa6b804a  DAC: 00000015
[   18.842437] Process sh (pid: 1139, stack limit = 0xec3aa2f0)
[   18.848358] Stack: (0xec3abe80 to 0xec3ac000)

DPI and SDI can be plain platform_devices. This patch changes the
registration from omap_device_register() to platform_device_add().

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Reported-by: John Stultz <johnstul@us.ibm.com>
Tested-by: Jean Pihet <jean.pihet@newoldbits.com>
2012-06-05 17:15:24 +03:00
James Bottomley 4c01acc01d [PARISC] fix code to find libgcc
Sam broke this with

commit 1f2bfbd00e
Author: Sam Ravnborg <sam@ravnborg.org>
Date:   Sat May 5 10:18:41 2012 +0200

    kbuild: link of vmlinux moved to a script

But we should be deriving the location of libgcc in the same way as all
the other archs, so fix by adding a LIBGCC variable which is evaluated
in the makefile

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-06-05 14:10:23 +09:00
James Bottomley 731455624f [PARISC] fix compile break in use of lib/strncopy_from_user.c
Linus broke us with

commit 36126f8f2e
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sat May 26 10:43:17 2012 -0700

    word-at-a-time: make the interfaces truly generic

By moving functions defined in strncopy_from_user.c into the asm-generic
version word-at-a-time.h.  Sparc and OpenRisc were fixed to use this, but
not parisc.  Fix by adding to generic-y in asm/Kbuild

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-06-05 14:10:22 +09:00
James Bottomley f1ea8b66e5 [PARISC] fix missing TAINT_WARN problem
Al Viro broke us with

commit edd63a2763
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Fri Apr 27 13:42:45 2012 -0400

    set_restore_sigmask() is never called without SIGPENDING (and never should be)

Although it's pretty much our fault since parisc's asm/bug.h uses
BUGWARN_TAINT but doesn't include the file that defines it.  Fix that.

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-06-05 14:10:17 +09:00
Al Viro 03240b279d fixups for signal breakage
Obvious brainos spotted by Geert.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-04 17:47:34 -04:00
Linus Torvalds c22072bdf0 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The clocksource driver is pure hardware enablement and the skew option
  is default off, well tested and non dangerous."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick: Move skew_tick option into the HIGH_RES_TIMER section
  clocksource: em_sti: Add DT support
  clocksource: em_sti: Emma Mobile STI driver
  clockevents: Make clockevents_config() a global symbol
  tick: Add tick skew boot option
2012-06-04 11:25:31 -07:00
Marek Szyprowski f1ae98da85 ARM: dma-mapping: remove unconditional dependency on CMA
CMA has been enabled unconditionally on all ARMv6+ systems to solve the
long standing issue of double kernel mappings for all dma coherent
buffers. This however created a dependency on CONFIG_EXPERIMENTAL for
the whole ARM architecture, which should really be avoided. This patch
removes this dependency and lets one use the old, well-tested dma-mapping
implementation on ARMv6+ systems as well, without the need to use
EXPERIMENTAL stuff.

Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-06-04 08:01:24 +02:00
Linus Torvalds 63004afa71 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull straggler x86 fixes from Peter Anvin:
 "Three groups of patches:

  - EFI boot stub documentation and the ability to print error messages;
  - Removal of PTRACE_ARCH_PRCTL for x32 (obsolete interface which
    should never have been ported, and the port is broken and
    potentially dangerous.)
  - ftrace stack corruption fixes.  I'm not super-happy about the
    technical implementation, but it is probably the least invasive in
    the short term.  In the future I would like a single method for
    nesting the debug stack, however."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32
  x86, efi: Add EFI boot stub documentation
  x86, efi; Add EFI boot stub console support
  x86, efi: Only close open files in error path
  ftrace/x86: Do not change stacks in DEBUG when calling lockdep
  x86: Allow nesting of the debug stack IDT setting
  x86: Reset the debug_stack update counter
  ftrace: Use breakpoint method to update ftrace caller
  ftrace: Synchronize variable setting with breakpoints
2012-06-02 16:17:03 -07:00
Linus Torvalds 233e562eac Merge 'for-linus' branches from git://git.kernel.org/pub/scm/linux/kernel/git/viro/{vfs,signal}
Pull vfs fix and a fix from the signal changes for frv from Al Viro.

The __kernel_nlink_t for powerpc got scrogged because 64-bit powerpc
actually depended on the default "unsigned long", while 32-bit powerpc
had an explicit override to "unsigned short".  Al didn't notice, and
made both of them be the unsigned short.

The frv signal fix is fallout from simplifying the do_notify_resume()
code, and leaving an extra parenthesis.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  powerpc: Fix size of st_nlink on 64bit

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  frv: Remove bogus closing parenthesis
2012-06-02 09:03:54 -07:00
Anton Blanchard 0fd7bee1e9 powerpc: Fix size of st_nlink on 64bit
commit e57f93cc53 (powerpc: get rid of nlink_t uses, switch to
explicitly-sized type) changed the size of st_nlink on ppc64 from
a long to a short, resulting in boot failures.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-02 10:44:11 -04:00
Geert Uytterhoeven a393624969 frv: Remove bogus closing parenthesis
Introduced by commit 6fd84c0831
("TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set")

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-02 10:38:19 -04:00
Linus Torvalds 804ce9866d fbdev updates for 3.5
It includes:
 - driver for AUO-K1900 and AUO-K1901 epaper controller
 - large updates for OMAP (e.g. decouple HDMI audio and video)
 - some updates for Exynos and SH Mobile
 - various other small fixes and cleanups

Merge tag 'fbdev-updates-for-3.5' of git://github.com/schandinat/linux-2.6

Pull fbdev updates from Florian Tobias Schandinat:
 - driver for AUO-K1900 and AUO-K1901 epaper controller
 - large updates for OMAP (e.g. decouple HDMI audio and video)
 - some updates for Exynos and SH Mobile
 - various other small fixes and cleanups

* tag 'fbdev-updates-for-3.5' of git://github.com/schandinat/linux-2.6: (130 commits)
  video: bfin_adv7393fb: Fix cleanup code
  video: exynos_dp: reduce delay time when configuring video setting
  video: exynos_dp: move sw reset prioir to enabling sw defined function
  video: exynos_dp: use devm_ functions
  fb: handle NULL pointers in framebuffer release
  OMAPDSS: HDMI: OMAP4: Update IRQ flags for the HPD IRQ request
  OMAPDSS: Apply VENC timings even if panel is disabled
  OMAPDSS: VENC/DISPC: Delay dividing Y resolution for managers connected to VENC
  OMAPDSS: DISPC: Support rotation through TILER
  OMAPDSS: VRFB: remove compiler warnings when CONFIG_BUG=n
  OMAPFB: remove compiler warnings when CONFIG_BUG=n
  OMAPDSS: remove compiler warnings when CONFIG_BUG=n
  OMAPDSS: DISPC: fix usage of dispc_ovl_set_accu_uv
  OMAPDSS: use DSI_FIFO_BUG workaround only for manual update displays
  OMAPDSS: DSI: Support command mode interleaving during video mode blanking periods
  OMAPDSS: DISPC: Update Accumulator configuration for chroma plane
  drivers/video: fsl-diu-fb: don't initialize the THRESHOLDS registers
  video: exynos mipi dsi: support reverse panel type
  video: exynos mipi dsi: Properly interpret the interrupt source flags
  video: exynos mipi dsi: Avoid races in probe()
  ...
2012-06-01 16:57:51 -07:00
Linus Torvalds f5e7e844a5 - More robust parsing especially of xattr data in JFFS2
- Updates to mxc_nand and gpmi drivers to support new boards and device tree
  - Improve consistency of information about ECC strength in NAND devices
  - Clean up partition handling of plat_nand
  - Support NAND drivers without dedicated access to OOB area
  - BCH hardware ECC support for OMAP
  - Other fixes and cleanups, and a few new device IDs

Merge tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd

Pull mtd update from David Woodhouse:
 - More robust parsing especially of xattr data in JFFS2
 - Updates to mxc_nand and gpmi drivers to support new boards and device tree
 - Improve consistency of information about ECC strength in NAND devices
 - Clean up partition handling of plat_nand
 - Support NAND drivers without dedicated access to OOB area
 - BCH hardware ECC support for OMAP
 - Other fixes and cleanups, and a few new device IDs

Fixed trivial conflict in drivers/mtd/nand/gpmi-nand/gpmi-nand.c due to
added include files next to each other.

* tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd: (75 commits)
  mtd: mxc_nand: move ecc strengh setup before nand_scan_tail
  mtd: block2mtd: fix recursive call of mtd_writev
  mtd: gpmi-nand: define ecc.strength
  mtd: of_parts: fix breakage in Kconfig
  mtd: nand: fix scan_read_raw_oob
  mtd: docg3 fix in-middle of blocks reads
  mtd: cfi_cmdset_0002: Slight cleanup of fixup messages
  mtd: add fixup for S29NS512P NOR flash.
  jffs2: allow to complete xattr integrity check on first GC scan
  jffs2: allow to discriminate between recoverable and non-recoverable errors
  mtd: nand: omap: add support for hardware BCH ecc
  ARM: OMAP3: gpmc: add BCH ecc api and modes
  mtd: nand: check the return code of 'read_oob/read_oob_raw'
  mtd: nand: remove 'sndcmd' parameter of 'read_oob/read_oob_raw'
  mtd: m25p80: Add support for Winbond W25Q80BW
  jffs2: get rid of jffs2_sync_super
  jffs2: remove unnecessary GC pass on sync
  jffs2: remove unnecessary GC pass on umount
  jffs2: remove lock_super
  mtd: gpmi: add gpmi support for mx6q
  ...
2012-06-01 16:55:42 -07:00
H. Peter Anvin 40b46a7d29 Merge remote-tracking branch 'rostedt/tip/perf/urgent-2' into x86-urgent-for-linus
2012-06-01 15:55:31 -07:00
Linus Torvalds efff0471b0 Merge branch 'ux500/hickup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull arm fixes for ux500 mismerge mishap from Arnd Bergmann:
 "The device tree conversion for arm/ux500 in 3.5 turns out to be
  incomplete because of a mismerge done by Linus Walleij that I failed
  to notice early enough and that Lee Jones as the original author of
  those patches did not manage to fix during the -next cycle.  While we
  originally wanted to get a much larger set of ux500 device tree enablement
  patches merged, this did not happen in time.

  After some discussion at Linaro Connect conference this week, Lee has
  been able to do damage control and provide a series to put the broken
  platform back into usable shape for both DT and non-DT based booting.

  This series has not been part of linux-next and is based on top of the
  current state of the upstream kernel rather than an -rc, but this is
  the best we could manage given the earlier breakage."

* 'ux500/hickup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: ux500: Enable probing of pinctrl through Device Tree
  ARM: ux500: Add support for ab8500 regulators into the Device Tree
  ARM: ux500: Provide regulator support for SMSC911x via Device Tree
  ARM: ux500: Allow PRCMU regulator to be probed during a DT enabled boot
  ARM: ux500: Apply db8500-prcmu regulator information to db8500 Device Tree
  ARM: ux500: Only initialise STE's UIBs on boards which support them
  ARM: ux500: Disable platform setup of the ab8500 when DT is enabled
  ARM: ux500: Use correct format for dynamic IRQ assignment
  ARM: ux500: Re-enable SMSC911x platform code registration during non-DT boots
  ARM: ux500: PRCMU related configuration and layout corrections for Device Tree
  ARM: ux500: Remove DB8500 PRCMU platform registration when DT is enabled
  ARM: ux500: Disable SMSC911x platform code registration when DT is enabled
  ARM: ux500: New DT:ed u8500_init_devices for one-by-one device enablement
  ARM: ux500: New DT:ed snowball_platform_devs for one-by-one device enablement
  pinctrl-nomadik: Allow Device Tree driver probing
2012-06-01 15:46:46 -07:00
H.J. Lu bad1a753d4 x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32
When I added x32 ptrace to the 3.4 kernel, I also included PTRACE_ARCH_PRCTL
support for x32 GDB.  For ARCH_GET_FS/GS, it takes a pointer to int64.  But
at user level, ARCH_GET_FS/GS takes a pointer to int32.  So I have to add
x32 ptrace to glibc to handle it with a temporary int64 passed to kernel and
copy it back to GDB as int32.  Roland suggested that PTRACE_ARCH_PRCTL
is obsolete and x32 GDB should use fs_base and gs_base fields of
user_regs_struct instead.

Accordingly, remove PTRACE_ARCH_PRCTL completely from the x32 code to
avoid a possible memory overrun when a pointer to int32 is passed to the
kernel.

Link: http://lkml.kernel.org/r/CAMe9rOpDzHfS7NH7m1vmD9QRw8SSj4Sc%2BaNOgcWm_WJME2eRsQ@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org> v3.4
2012-06-01 13:54:21 -07:00
Linus Torvalds 86c47b70f6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull third pile of signal handling patches from Al Viro:
 "This time it's mostly helpers and conversions to them; there's a lot
  of stuff remaining in the tree, but that'll either go in -rc2
  (isolated bug fixes, ideally via arch maintainers' trees) or will sit
  there until the next cycle."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  x86: get rid of calling do_notify_resume() when returning to kernel mode
  blackfin: check __get_user() return value
  whack-a-mole with TIF_FREEZE
  FRV: Optimise the system call exit path in entry.S [ver #2]
  FRV: Shrink TIF_WORK_MASK [ver #2]
  FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions
  new helper: signal_delivered()
  powerpc: get rid of restore_sigmask()
  most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set
  set_restore_sigmask() is never called without SIGPENDING (and never should be)
  TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set
  don't call try_to_freeze() from do_signal()
  pull clearing RESTORE_SIGMASK into block_sigmask()
  sh64: failure to build sigframe != signal without handler
  openrisc: tracehook_signal_handler() is supposed to be called on success
  new helper: sigmask_to_save()
  new helper: restore_saved_sigmask()
  new helpers: {clear,test,test_and_clear}_restore_sigmask()
  HAVE_RESTORE_SIGMASK is defined on all architectures now
2012-06-01 11:53:44 -07:00
Linus Torvalds 1193755ac6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs changes from Al Viro.
 "A lot of misc stuff.  The obvious groups:
   * Miklos' atomic_open series; kills the damn abuse of
     ->d_revalidate() by NFS, which was the major stumbling block for
     all work in that area.
   * ripping security_file_mmap() and dealing with deadlocks in the
     area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in
     general.
   * ->encode_fh() switched to saner API; insane fake dentry in
     mm/cleancache.c gone.
   * assorted annotations in fs (endianness, __user)
   * parts of Artem's ->s_dirty work (jffs2 and reiserfs parts)
   * ->update_time() work from Josef.
   * other bits and pieces all over the place.

  Normally it would've been in two or three pull requests, but
  signal.git stuff had eaten a lot of time during this cycle ;-/"

Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the
'truncate_range' inode method was removed by the VM changes, the VFS
update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due
to sparse fix added twice, with other changes nearby).

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits)
  nfs: don't open in ->d_revalidate
  vfs: retry last component if opening stale dentry
  vfs: nameidata_to_filp(): don't throw away file on error
  vfs: nameidata_to_filp(): inline __dentry_open()
  vfs: do_dentry_open(): don't put filp
  vfs: split __dentry_open()
  vfs: do_last() common post lookup
  vfs: do_last(): add audit_inode before open
  vfs: do_last(): only return EISDIR for O_CREAT
  vfs: do_last(): check LOOKUP_DIRECTORY
  vfs: do_last(): make ENOENT exit RCU safe
  vfs: make follow_link check RCU safe
  vfs: do_last(): use inode variable
  vfs: do_last(): inline walk_component()
  vfs: do_last(): make exit RCU safe
  vfs: split do_lookup()
  Btrfs: move over to use ->update_time
  fs: introduce inode operation ->update_time
  reiserfs: get rid of resierfs_sync_super
  reiserfs: mark the superblock as dirty a bit later
  ...
2012-06-01 10:34:35 -07:00
Al Viro 44fbbb3dc6 x86: get rid of calling do_notify_resume() when returning to kernel mode
If we end up calling do_notify_resume() with !user_mode(regs), it
does nothing (do_signal() explicitly bails out and we can't get there
with TIF_NOTIFY_RESUME in such situations).  Then we jump to
resume_userspace_sig, which rechecks the same thing and bails out
to resume_kernel, thus breaking the loop.

It's easier and cheaper to check *before* calling do_notify_resume()
and bail out to resume_kernel immediately.  And kill the check in
do_signal()...

Note that on amd64 we can't get there with !user_mode() at all - asm
glue takes care of that.

Acked-and-reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 13:01:51 -04:00
Al Viro 29bf5dd895 blackfin: check __get_user() return value
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 13:01:27 -04:00
Al Viro 35d5180757 whack-a-mole with TIF_FREEZE
blackfin has reintroduced it, completely unused.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 13:00:49 -04:00
David Howells a2eddc7c49 FRV: Optimise the system call exit path in entry.S [ver #2]
Optimise the system call exit path in entry.S by packing some instructions.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:59:38 -04:00
David Howells 1e5ef91556 FRV: Shrink TIF_WORK_MASK [ver #2]
Shrink TIF_WORK_MASK so that it will fit in the 12-bit signed immediate
operand field of an ANDI instruction.
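
As a rough illustration of the constraint mentioned above: a 12-bit signed
(two's-complement) immediate can hold values from -2048 to 2047, so a
non-negative flag mask fits only if it is at most 0x7ff.  The helper name
fits_simm12() is hypothetical and is not part of the FRV code; this is only
a sketch of the range check.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true if value is representable as a 12-bit
 * signed (two's-complement) immediate, i.e. lies in [-2048, 2047].
 */
static bool fits_simm12(long value)
{
	return value >= -2048 && value <= 2047;
}
```

A mask with bit 11 or above set (0x800 and up) fails this check, which is
why such a mask has to be shrunk before it can be used as an immediate.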

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:59:37 -04:00
David Howells 137c3c469f FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions
Move the test for kernel mode processing from do_signal() into entry.S to also
prevent system call exit tracing and userspace resumption notification handling
happening when returning from kernel exceptions.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:59:18 -04:00
Al Viro efee984c27 new helper: signal_delivered()
Does block_sigmask() + tracehook_signal_handler();  called when
sigframe has been successfully built.  All architectures converted
to it; block_sigmask() itself is gone now (merged into this one).

I'm still not too happy with the signature, but that's a separate
story (IMO we need a structure that would contain signal number +
siginfo + k_sigaction, so that get_signal_to_deliver() would fill one,
signal_delivered(), handle_signal() and probably setup...frame() -
take one).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:52 -04:00
Al Viro 17440f171e powerpc: get rid of restore_sigmask()
... it's just a call of set_current_blocked() now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:51 -04:00
Al Viro 77097ae503 most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set
Only 3 out of 63 do not.  Renamed the current variant to __set_current_blocked(),
added set_current_blocked() that will exclude unblockable signals, switched
open-coded instances to it.
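
The split described above can be sketched in plain C.  This is a simplified
model and not the kernel code: the mask is a single unsigned long, the
signal numbers are the Linux/x86 values, and the names raw_set_blocked(),
set_blocked() and current_blocked are made up for the example.

```c
/* Model of a signal mask: bit (sig - 1) represents signal number sig. */
typedef unsigned long model_sigset_t;

#define MODEL_SIGKILL		9	/* values as on Linux/x86 */
#define MODEL_SIGSTOP		19
#define MODEL_SIGMASK(sig)	(1UL << ((sig) - 1))

/* Stands in for the current task's blocked mask. */
static model_sigset_t current_blocked;

/* Raw variant: installs the mask exactly as given. */
static void raw_set_blocked(model_sigset_t set)
{
	current_blocked = set;
}

/* Wrapper: drops the unblockable signals first, then installs the mask. */
static void set_blocked(model_sigset_t set)
{
	set &= ~(MODEL_SIGMASK(MODEL_SIGKILL) | MODEL_SIGMASK(MODEL_SIGSTOP));
	raw_set_blocked(set);
}
```

With this shape, the common case calls the wrapper and cannot forget to
strip SIGKILL/SIGSTOP, while the few callers that need the mask untouched
use the raw variant.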

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:51 -04:00
Al Viro edd63a2763 set_restore_sigmask() is never called without SIGPENDING (and never should be)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:50 -04:00
Al Viro 6fd84c0831 TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:50 -04:00
Al Viro bf343dfd87 don't call try_to_freeze() from do_signal()
get_signal_to_deliver() will handle it itself

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:49 -04:00
Al Viro a610d6e672 pull clearing RESTORE_SIGMASK into block_sigmask()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:49 -04:00
Al Viro 5754f412a3 sh64: failure to build sigframe != signal without handler
it's actually "send me SIGSEGV"...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:49 -04:00
Al Viro 39974d085d openrisc: tracehook_signal_handler() is supposed to be called on success
... not if sigframe couldn't have been built.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:48 -04:00
Al Viro b7f9a11a6c new helper: sigmask_to_save()
replace boilerplate "should we use ->saved_sigmask or ->blocked?"
with calls of obvious inlined helper...
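
A minimal sketch of the choice such a helper makes, assuming a heavily
simplified task structure.  struct model_task and sigmask_to_save_model()
are hypothetical names for illustration; the real helper works on the
current task and the kernel's own sigset_t.

```c
/* Hypothetical, simplified task state; not the kernel's task_struct. */
struct model_task {
	unsigned long blocked;		/* currently blocked signals */
	unsigned long saved_sigmask;	/* mask to restore afterwards */
	int restore_sigmask;		/* models the RESTORE_SIGMASK flag */
};

/* Pick the mask that should be stored in the signal frame. */
static unsigned long *sigmask_to_save_model(struct model_task *t)
{
	return t->restore_sigmask ? &t->saved_sigmask : &t->blocked;
}
```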

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:48 -04:00
Al Viro 51a7b448d4 new helper: restore_saved_sigmask()
first fruits of ..._restore_sigmask() helpers: now we can take
boilerplate "signal didn't have a handler, clear RESTORE_SIGMASK
and restore the blocked mask from ->saved_mask" into a common
helper.  Open-coded instances switched...
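
The boilerplate being folded into the helper can be modelled as below.  The
structure and function names are hypothetical and the masks are reduced to
a single unsigned long; this only sketches the "test and clear the flag,
then restore the mask" pattern the message describes.

```c
/* Hypothetical, simplified task state; not the kernel's task_struct. */
struct model_task {
	unsigned long blocked;		/* currently blocked signals */
	unsigned long saved_sigmask;	/* mask saved before the syscall */
	int restore_sigmask;		/* models the RESTORE_SIGMASK flag */
};

/*
 * No handler ran: if the flag is set, clear it and put the saved mask
 * back in place.  If the flag is not set, nothing changes.
 */
static void restore_saved_sigmask_model(struct model_task *t)
{
	if (t->restore_sigmask) {
		t->restore_sigmask = 0;
		t->blocked = t->saved_sigmask;
	}
}
```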

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:47 -04:00
Al Viro 4ebefe3ec7 new helpers: {clear,test,test_and_clear}_restore_sigmask()
helpers parallel to set_restore_sigmask(), used in the next commits

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01 12:58:47 -04:00
Matt Fleming 0c7596621e x86, efi: Add EFI boot stub documentation
Since we can't expect every user to read the EFI boot stub code it
seems prudent to have a couple of paragraphs explaining what it is and
how it works.

The "initrd=" option in particular is tricky because it only
understands absolute EFI-style paths (backslashes as directory
separators), and until now this hasn't been documented anywhere. This
has tripped up a couple of users.

Cc: Matthew Garrett <mjg@redhat.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1331907517-3985-4-git-send-email-matt@console-pimps.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-01 09:11:41 -07:00
Matt Fleming 9fa7dedad3 x86, efi; Add EFI boot stub console support
We need a way of printing useful messages to the user, for example
when we fail to open an initrd file, instead of just hanging the
machine without giving the user any indication of what went wrong. So
sprinkle some error messages throughout the EFI boot stub code to make
it easier for users to diagnose/report problems.

Reported-by: Keshav P R <the.ridikulus.rat@gmail.com>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1331907517-3985-3-git-send-email-matt@console-pimps.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-01 09:11:26 -07:00
Matt Fleming 30dc0d0fe5 x86, efi: Only close open files in error path
The loop at the 'close_handles' label in handle_ramdisks() should be
using 'i', which represents the number of initrd files that were
successfully opened, not 'nr_initrds' which is the number of initrd=
arguments passed on the command line.

Currently, if we execute the loop to close all file handles and we
failed to open any initrds we'll try to call the close function on a
garbage pointer, causing the machine to hang.
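
The bug class described above can be shown with a small user-space sketch.
All names here (struct fake_file, fake_close(), open_all(), close_count)
are hypothetical and none of this is the boot stub's code; the point is
only that the cleanup loop must be bounded by the number of handles that
were actually opened.

```c
#include <stddef.h>

/* Hypothetical stand-in for a file handle; not the EFI boot stub's types. */
struct fake_file {
	int open;	/* 1 while the handle is open */
};

static int close_count;	/* number of close operations performed */

static void fake_close(struct fake_file *f)
{
	f->open = 0;
	close_count++;
}

/*
 * Try to open nr files; ok[k] says whether the k-th open succeeds.
 * On the first failure, close only the i handles opened so far.
 * Bounding the cleanup loop by nr instead would touch entries that
 * were never initialised.  Returns 0 on success, -1 on failure.
 */
static int open_all(struct fake_file *files, const int *ok, size_t nr)
{
	size_t i, k;

	for (i = 0; i < nr; i++) {
		if (!ok[i])
			goto close_handles;
		files[i].open = 1;
	}
	return 0;

close_handles:
	for (k = 0; k < i; k++)
		fake_close(&files[k]);
	return -1;
}
```

If the very first open fails, i is 0 and the cleanup loop does nothing,
which is the case that used to hang the machine.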

Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1331907517-3985-2-git-send-email-matt@console-pimps.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-01 09:11:10 -07:00
Steven Rostedt 5963e317b1 ftrace/x86: Do not change stacks in DEBUG when calling lockdep
When both DYNAMIC_FTRACE and LOCKDEP are set, the TRACE_IRQS_ON/OFF
will call into the lockdep code. The lockdep code can call lots of
functions that may be traced by ftrace. When ftrace is updating its
code and hits a breakpoint, the breakpoint handler will call into
lockdep. If lockdep happens to call a function that also has a breakpoint
attached, it will jump back into the breakpoint handler resetting
the stack to the debug stack and corrupt the contents currently on
that stack.

The 'do_sym' call that calls do_int3() is protected by modifying the
IST table to point to a different location if another breakpoint is
hit. But the TRACE_IRQS_OFF/ON are outside that protection, and if
a breakpoint is hit from those, the stack will get corrupted, and
the kernel will crash:

[ 1013.243754] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
[ 1013.272665] IP: [<ffff880145cc0000>] 0xffff880145cbffff
[ 1013.285186] PGD 1401b2067 PUD 14324c067 PMD 0
[ 1013.298832] Oops: 0010 [#1] PREEMPT SMP
[ 1013.310600] CPU 2
[ 1013.317904] Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr iTCO_wdt i2c_i801 iTCO_vendor_support e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
[ 1013.401848]
[ 1013.407399] Pid: 112, comm: kworker/2:1 Not tainted 3.4.0+ #30
[ 1013.437943] RIP: 8eb8:[<ffff88014630a000>]  [<ffff88014630a000>] 0xffff880146309fff
[ 1013.459871] RSP: ffffffff8165e919:ffff88014780f408  EFLAGS: 00010046
[ 1013.477909] RAX: 0000000000000001 RBX: ffffffff81104020 RCX: 0000000000000000
[ 1013.499458] RDX: ffff880148008ea8 RSI: ffffffff8131ef40 RDI: ffffffff82203b20
[ 1013.521612] RBP: ffffffff81005751 R08: 0000000000000000 R09: 0000000000000000
[ 1013.543121] R10: ffffffff82cdc318 R11: 0000000000000000 R12: ffff880145cc0000
[ 1013.564614] R13: ffff880148008eb8 R14: 0000000000000002 R15: ffff88014780cb40
[ 1013.586108] FS:  0000000000000000(0000) GS:ffff880148000000(0000) knlGS:0000000000000000
[ 1013.609458] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1013.627420] CR2: 0000000000000002 CR3: 0000000141f10000 CR4: 00000000001407e0
[ 1013.649051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1013.670724] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1013.692376] Process kworker/2:1 (pid: 112, threadinfo ffff88013fe0e000, task ffff88014020a6a0)
[ 1013.717028] Stack:
[ 1013.724131]  ffff88014780f570 ffff880145cc0000 0000400000004000 0000000000000000
[ 1013.745918]  cccccccccccccccc ffff88014780cca8 ffffffff811072bb ffffffff81651627
[ 1013.767870]  ffffffff8118f8a7 ffffffff811072bb ffffffff81f2b6c5 ffffffff81f11bdb
[ 1013.790021] Call Trace:
[ 1013.800701] Code: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a <e7> d7 64 81 ff ff ff ff 01 00 00 00 00 00 00 00 65 d9 64 81 ff
[ 1013.861443] RIP  [<ffff88014630a000>] 0xffff880146309fff
[ 1013.884466]  RSP <ffff88014780f408>
[ 1013.901507] CR2: 0000000000000002

The solution was to reuse the NMI functions that change the IDT table to make the debug
stack keep its current stack (in kernel mode) when hitting a breakpoint:

  call debug_stack_set_zero
  TRACE_IRQS_ON
  call debug_stack_reset

If the TRACE_IRQS_ON happens to hit a breakpoint then it will keep the current stack
and not crash the box.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-31 23:12:22 -04:00
Steven Rostedt f8988175fd x86: Allow nesting of the debug stack IDT setting
When the NMI handler runs, it checks if it preempted a debug handler
and if that handler is using the debug stack. If it is, it changes the
IDT table not to update the stack, otherwise it will reset the debug
stack and corrupt the debug handler it preempted.

Now that ftrace uses breakpoints to change functions from nops to
callers, many more places may hit a breakpoint. Unfortunately this
includes some of the calls that lockdep performs, which causes issues
with the debug stack. It too needs to change the debug stack before
tracing (if called from the debug handler).

Allow the debug_stack_set_zero() and debug_stack_reset() to be nested
so that the debug handlers can take advantage of them too.
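
One way to make such a pair of calls nestable is a use counter, sketched
below.  This is a simplified single-CPU model with hypothetical names
(model_set_zero(), model_reset(), use_count, active_idt); the real
functions operate on per-CPU state and load an actual IDT.

```c
/* Which descriptor table the model currently has "loaded". */
enum idt_kind { IDT_NORMAL, IDT_NO_STACK_SWITCH };

static int use_count;				/* nesting depth */
static enum idt_kind active_idt = IDT_NORMAL;

/* Enter a region in which breakpoints must not switch stacks. */
static void model_set_zero(void)
{
	use_count++;
	active_idt = IDT_NO_STACK_SWITCH;
}

/* Leave the region; only the outermost exit restores the normal table. */
static void model_reset(void)
{
	if (use_count == 0)
		return;		/* unbalanced call: ignore it */
	if (--use_count == 0)
		active_idt = IDT_NORMAL;
}
```

An inner set/reset pair then leaves the outer caller's setting intact,
because the normal table is restored only when the count returns to zero.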

[ Used this_cpu_*() over __get_cpu_var() as suggested by H. Peter Anvin ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-31 23:12:21 -04:00
Steven Rostedt c0525a6972 x86: Reset the debug_stack update counter
When an NMI goes off and it sees that it preempted the debug stack,
to keep the debug stack safe, it changes the IDT to point to one that
does not modify the stack on breakpoint (to allow breakpoints in NMIs).

But the variable that gets set to know to undo it on exit never gets
cleared on exit. Thus every NMI will reset it on exit the first time
it is done even if it does not need to be reset.

[ Added H. Peter Anvin's suggestion to use this_cpu_read/write ]

Cc: <stable@vger.kernel.org> # v3.3
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-31 23:12:20 -04:00
Steven Rostedt 8a4d0a687a ftrace: Use breakpoint method to update ftrace caller
On boot up and module load, it is fine to modify the code directly,
without the use of breakpoints. This is because boot up modification
is done before SMP is initialized, thus the modification is serial,
and module load is done before the module executes.

But after that we must use a SMP safe method to modify running code.
Otherwise, if we are running the function tracer and update its
function (by starting off the stack tracer, or perf tracing)
the change of the function called by the ftrace trampoline is done
directly. If this is being executed on another CPU, that CPU may
take a GPF and crash the kernel.

The breakpoint method is used to change the nops at all the functions, but
the change of the ftrace callback handler itself was still using a
direct modification. If tracing was enabled and the function callback
was changed then another CPU could fault if it was currently calling
the original callback. This modification must use the breakpoint method
too.

Note, the direct method is still used for boot up and module load.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-31 23:12:19 -04:00
Steven Rostedt a192cd0413 ftrace: Synchronize variable setting with breakpoints
When the function tracer starts modifying the code via breakpoints
it sets a variable (modifying_ftrace_code) to inform the breakpoint
handler to call the ftrace int3 code.

But there's no synchronization between setting this code and the
handler, thus it is possible for the handler to be called on another
CPU before it sees the variable. This will cause a kernel crash as
the int3 handler will not know what to do with it.

I originally added smp_mb()'s to force the visibility of the variable
but H. Peter Anvin suggested that I just make it atomic.
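
The idea of using an atomic flag instead of explicit barriers can be
sketched with C11 atomics.  This is a user-space model, not the kernel
code: the kernel has its own atomic API, and the names modifying_code,
begin_modify(), end_modify() and trap_is_ours() are made up for the
example.

```c
#include <stdatomic.h>

/* Hypothetical model of the "code is being modified" flag. */
static atomic_int modifying_code;

/* Called by the updater before it starts planting breakpoints. */
static void begin_modify(void)
{
	atomic_fetch_add(&modifying_code, 1);
}

/* Called by the updater once all breakpoints have been removed. */
static void end_modify(void)
{
	atomic_fetch_sub(&modifying_code, 1);
}

/* What a breakpoint handler checks before treating the trap as its own. */
static int trap_is_ours(void)
{
	return atomic_load(&modifying_code) != 0;
}
```

The default C11 atomic operations used here are sequentially consistent, so
a handler running on another CPU after begin_modify() has completed will
observe the non-zero value.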

[ Added comments as suggested by Peter Zijlstra ]

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-31 23:12:17 -04:00
Linus Torvalds fb21affa49 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull second pile of signal handling patches from Al Viro:
 "This one is just task_work_add() series + remaining prereqs for it.

  There probably will be another pull request from that tree this
  cycle - at least for helpers, to get them out of the way for per-arch
  fixes remaining in the tree."

Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
had brought in commit 97fd75b7b8 ("kernel/irq/manage.c: use the
pr_foo() infrastructure to prefix printks") which changed one of the
pr_err() calls that this merge moves around.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  keys: kill task_struct->replacement_session_keyring
  keys: kill the dummy key_replace_session_keyring()
  keys: change keyctl_session_to_parent() to use task_work_add()
  genirq: reimplement exit_irq_thread() hook via task_work_add()
  task_work_add: generic process-context callbacks
  avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
  parisc: need to check NOTIFY_RESUME when exiting from syscall
  move key_repace_session_keyring() into tracehook_notify_resume()
  TIF_NOTIFY_RESUME is defined on all targets now
2012-05-31 18:47:30 -07:00
Linus Torvalds 08615d7d85 Merge branch 'akpm' (Andrew's patch-bomb)
Merge misc patches from Andrew Morton:

 - the "misc" tree - stuff from all over the map

 - checkpatch updates

 - fatfs

 - kmod changes

 - procfs

 - cpumask

 - UML

 - kexec

 - mqueue

 - rapidio

 - pidns

 - some checkpoint-restore feature work.  Reluctantly.  Most of it
   delayed a release.  I'm still rather worried that we don't have a
   clear roadmap to completion for this work.

* emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches)
  kconfig: update compression algorithm info
  c/r: prctl: add ability to set new mm_struct::exe_file
  c/r: prctl: extend PR_SET_MM to set up more mm_struct entries
  c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat
  syscalls, x86: add __NR_kcmp syscall
  fs, proc: introduce /proc/<pid>/task/<tid>/children entry
  sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE
  aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector()
  eventfd: change int to __u64 in eventfd_signal()
  fs/nls: add Apple NLS
  pidns: make killed children autoreap
  pidns: use task_active_pid_ns in do_notify_parent
  rapidio/tsi721: add DMA engine support
  rapidio: add DMA engine support for RIO data transfers
  ipc/mqueue: add rbtree node caching support
  tools/selftests: add mq_perf_tests
  ipc/mqueue: strengthen checks on mqueue creation
  ipc/mqueue: correct mq_attr_ok test
  ipc/mqueue: improve performance of send/recv
  selftests: add mq_open_tests
  ...
2012-05-31 18:10:18 -07:00
Cyrill Gorcunov d97b46a646 syscalls, x86: add __NR_kcmp syscall
While doing checkpoint-restore in user space, one needs to determine
whether various kernel objects (like mm_struct-s or files_struct-s) are
shared between tasks and restore this state.

The 2nd step can be solved by using appropriate CLONE_ flags and the
unshare syscall, while there is currently no way to solve the 1st one.

One way to check whether two tasks share e.g. an mm_struct is to expose
some mm_struct ID of a task in its proc file, but showing such info is
considered undesirable for security reasons.

Thus, after some debate, we concluded that a so-called 'comparison'
syscall might be the best candidate.  So here it is: __NR_kcmp.

It takes up to 5 arguments: the pids of the two tasks (whose
characteristics should be compared), the comparison type and (when
comparing files) two file descriptors.

Lookups for pids are done in the caller's PID namespace only.
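
A minimal user-space sketch of the call described above, comparing two
file descriptors. The helper name same_open_file() and the SKETCH_KCMP_FILE
macro are hypothetical and not part of this patch; the sketch assumes a
libc that defines SYS_kcmp and a kernel that has the syscall enabled, and
it reports ENOSYS otherwise.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Value of KCMP_FILE in <linux/kcmp.h>; repeated here (assumption) so the
 * sketch builds even where that header is not installed. */
#define SKETCH_KCMP_FILE 0

/*
 * Hypothetical helper: ask the kernel whether fd1 in task pid1 and fd2 in
 * task pid2 refer to the same open file.
 * Returns 0 when they do, a positive value when they differ, and -1 with
 * errno set on failure (for example ENOSYS or EPERM).
 */
long same_open_file(pid_t pid1, pid_t pid2, int fd1, int fd2)
{
#ifdef SYS_kcmp
	/* Five arguments: two pids, the comparison type, two descriptors. */
	return syscall(SYS_kcmp, pid1, pid2, SKETCH_KCMP_FILE,
		       (unsigned long)fd1, (unsigned long)fd2);
#else
	errno = ENOSYS;
	return -1;
#endif
}
```

A caller could dup() a descriptor and pass both to same_open_file() with
getpid() for each pid. Where the syscall is available, a duplicated
descriptor is expected to compare equal to the original.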

At the moment only x86 is supported and tested.

[akpm@linux-foundation.org: fix up selftests, warnings]
[akpm@linux-foundation.org: include errno.h]
[akpm@linux-foundation.org: tweak comment text]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: Michal Marek <mmarek@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:32 -07:00
Anton Vorontsov 2c922c51e6 um: properly check all process' threads for a live mm
kill_off_processes() might miss a valid process because checking
process->mm is not enough.  A process' main thread may exit or detach
its mm via use_mm(), but other threads may still have a valid mm.

To catch this we use find_lock_task_mm(), which walks up all threads and
returns an appropriate task (with task lock held).
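
The thread walk described above can be modelled in user space roughly as
follows. This is only a sketch of the idea, not the kernel function: the
model_* types and helpers are hypothetical stand-ins, and the per-task
lock is reduced to a plain flag.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption: only the fields
 * the sketch needs are modelled). */
struct model_mm { int users; };

struct model_task {
	struct model_mm *mm;      /* NULL once the thread has dropped its mm */
	int locked;               /* stand-in for the per-task lock */
	struct model_task *next;  /* circular list of threads in the group */
};

void model_task_lock(struct model_task *t)   { t->locked = 1; }
void model_task_unlock(struct model_task *t) { t->locked = 0; }

/*
 * Visit every thread in p's group and return the first one that still has
 * an mm, with its lock held. Returns NULL if no thread has an mm.
 */
struct model_task *model_find_lock_task_mm(struct model_task *p)
{
	struct model_task *t = p;

	do {
		model_task_lock(t);
		if (t->mm)
			return t;  /* caller is responsible for unlocking */
		model_task_unlock(t);
		t = t->next;
	} while (t != p);

	return NULL;
}
```

With a group whose leader has already dropped its mm while a second thread
still holds one, the walk skips the leader and returns the second thread
locked, which is the case a bare process->mm check would miss.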

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Anton Vorontsov 137d1a26c8 um: fix possible race on task->mm
Checking for task->mm is dangerous as ->mm might disappear (exit_mm()
assigns NULL under task_lock(), so tasklist lock is not enough).

We can't use get_task_mm()/mmput() pair as mmput() might sleep, so let's
take the task lock while we care about its mm.

Note that we should also use find_lock_task_mm() to check all process'
threads for a valid mm, but for uml we'll do it in a separate patch.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Anton Vorontsov 9bd0a07712 um: should hold tasklist_lock while traversing processes
Traversing the tasks requires holding tasklist_lock, otherwise it is
unsafe.

p.s.  However, I'm not sure that calling os_kill_ptraced_process() in
atomic context is correct.  It seems to work, but please take a closer
look.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Anton Vorontsov af1be5a578 blackfin: fix possible deadlock in decode_address()
Oleg Nesterov found an interesting deadlock possibility:

> sysrq_showregs_othercpus() does smp_call_function(showacpu)
> and showacpu() show_stack()->decode_address(). Now suppose that IPI
> interrupts the task holding read_lock(tasklist).

To fix this, blackfin should not grab the write_ variant of the
tasklist lock; the read_ one is enough.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Anton Vorontsov 2214f707de blackfin: a couple of task->mm handling fixes
The patch fixes two problems:

1. Working with task->mm w/o getting mm or grabbing the task lock is
   dangerous as ->mm might disappear (exit_mm() assigns NULL under
   task_lock(), so tasklist lock is not enough).

   We can't use get_task_mm()/mmput() pair as mmput() might sleep,
   so we have to take the task lock while handling its mm.

2. Checking for process->mm is not enough because process' main
   thread may exit or detach its mm via use_mm(), but other threads
   may still have a valid mm.

   To catch this we use find_lock_task_mm(), which walks up all
   threads and returns an appropriate task (with task lock held).

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Anton Vorontsov 1198c8b9af sh: use clear_tasks_mm_cpumask()
Checking for process->mm is not enough because process' main thread may
exit or detach its mm via use_mm(), but other threads may still have a
valid mm.

To fix this we would need to use find_lock_task_mm(), which would walk up
all threads and return an appropriate task (with task lock held).

clear_tasks_mm_cpumask() has the issue fixed, so let's use it.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:30 -07:00
Anton Vorontsov 73863ab028 powerpc: use clear_tasks_mm_cpumask()
Current CPU hotplug code has some task->mm handling issues:

1. Working with task->mm w/o getting mm or grabbing the task lock is
   dangerous as ->mm might disappear (exit_mm() assigns NULL under
   task_lock(), so tasklist lock is not enough).

   We can't use get_task_mm()/mmput() pair as mmput() might sleep,
   so we must take the task lock while handling its mm.

2. Checking for process->mm is not enough because process' main
   thread may exit or detach its mm via use_mm(), but other threads
   may still have a valid mm.

   To fix this we would need to use find_lock_task_mm(), which would
   walk up all threads and return an appropriate task (with task
   lock held).

clear_tasks_mm_cpumask() has all the issues fixed, so let's use it.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:29 -07:00
Anton Vorontsov 3eaa73bde2 arm: use clear_tasks_mm_cpumask()
Checking for process->mm is not enough because process' main thread may
exit or detach its mm via use_mm(), but other threads may still have a
valid mm.

To fix this we would need to use find_lock_task_mm(), which would walk up
all threads and return an appropriate task (with task lock held).

clear_tasks_mm_cpumask() has this issue fixed, so let's use it.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:29 -07:00
Kautuk Consul 1cefe28f95 um/kernel/trap.c: port OOM changes to handle_page_fault()
Commit d065bd810b ("mm: retry page fault when blocking on disk
transfer") and commit 37b23e0525 ("x86,mm: make pagefault killable")
introduced changes into the x86 pagefault handler for making the page
fault handler retryable as well as killable.

These changes reduce the mmap_sem hold time, which is crucial during OOM
killer invocation.

Port these changes to um.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:26 -07:00
Lee Jones 5910de9e2d ARM: ux500: Enable probing of pinctrl through Device Tree
The Nomadik GPIO controller now relies on Nomadik pinctrl, however
the pinctrl driver is not currently started by any ux500 platform.
This is required or GPIOs do not work at all.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:53 +02:00
Lee Jones 4a85c7fa52 ARM: ux500: Add support for ab8500 regulators into the Device Tree
Here we supply the information required to setup regulators successfully
on Snowball and other db8500 variants which use the ab8500 regulators.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:49 +02:00
Lee Jones bc36748153 ARM: ux500: Provide regulator support for SMSC911x via Device Tree
This patch adds a fixed regulator for use by the SMSC911x Ethernet
chip driver into the db8500 Device Tree. It also references other
regulators required by the same device.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:46 +02:00
Lee Jones 890d84fac0 ARM: ux500: Allow PRCMU regulator to be probed during a DT enabled boot
This patch adds the correct compatible string for use during Device Tree
population. Without it the DB8500 PRCMU regulators would not be processed
when DT is enabled.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:41 +02:00
Lee Jones e5999f2890 ARM: ux500: Apply db8500-prcmu regulator information to db8500 Device Tree
Here we inform Device Tree of which regulators are provided by the
db8500-prcmu. This way we can reference some of their consumers directly
from the Device Tree, e.g. the SMSC911x Ethernet chip.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:37 +02:00
Lee Jones fd6948bb2a ARM: ux500: Only initialise STE's UIBs on boards which support them
ST-Ericsson uses User Interface Boards to extend functionality of
some of their development boards. However, these aren't compatible
with all the supported boards found in Mainline (Snowball for
instance). This patch ensures that the UIBs are only probed on
boards which can actually support them. This in turn saves lots of
unnecessary error messages normally found in Snowball's boot log.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:33 +02:00
Lee Jones 48a4ea626d ARM: ux500: Disable platform setup of the ab8500 when DT is enabled
The final piece of the ab8500 puzzle. Here we prevent any of the ab8500-*
drivers from being registered from platform code when Device Tree is
enabled, as we expect DT to probe each of these individually. We also
provide the relevant compatible strings, so that DT knows which nodes
it needs to pay attention to during population.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:29 +02:00
Lee Jones 93b5698aae ARM: ux500: Use correct format for dynamic IRQ assignment
This patch applies the correct format requested by the irq
domain. For chained IRQs which use GPIO lines as IRQs, we
stipulate that a two cell request is required. The first cell
contains the requested IRQ and the second can contain flags
pertaining to edge detection and level sensitive values. The
zeroth cell specifies the GPIO controller by use of a phandle.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:23 +02:00
Lee Jones e6fada59d4 ARM: ux500: Re-enable SMSC911x platform code registration during non-DT boots
The patch to disable SMSC911x registration was applied twice in the upstream
kernel by mistake. Git interpreted this as 'take the same entry from a
similar struct' which was close by. This was the wrong thing to do. This patch
rectifies this error and re-enables SMSC911x registration when Device Tree is
not enabled.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:18 +02:00
Lee Jones ccf74f7677 ARM: ux500: PRCMU related configuration and layout corrections for Device Tree
Apply db8500 related PRCMU Device Tree settings and clean up some formatting
errors. We also remove one of the PRCMU assigned IRQs, as it is currently not
used.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:14 +02:00
Lee Jones dee42ebe45 ARM: ux500: Remove DB8500 PRCMU platform registration when DT is enabled
Now that the DB8500 has Device Tree support, it will be probed when the
DT is parsed, making platform registration unnecessary.
This patch removes DB8500 PRCMU platform registration.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:10 +02:00
Lee Jones ada46cda50 ARM: ux500: Disable SMSC911x platform code registration when DT is enabled
Now that the SMSC911x is correctly enabled in Device Tree, there is no
need to continue registering it from platform code. In fact, if we
continue doing so, the system will throw an error on boot.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:04:05 +02:00
Lee Jones f65c1982fa ARM: ux500: New DT:ed u8500_init_devices for one-by-one device enablement
During Device Tree enablement it is necessary to remove
<hw_component>_add_<device> calls one at a time, as and when particular
devices are DT enabled. This patch provides a temporary solution. Once
the new *of_init_devices function has been fully unpopulated it will be
removed again.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:03:58 +02:00
Lee Jones 11a0b5f09c ARM: ux500: New DT:ed snowball_platform_devs for one-by-one device enablement
During Device Tree enablement it is necessary to remove snowball_<device>*
platform_data segments one at a time, as and when particular devices are
DT enabled. This patch provides a temporary solution. Once this new struct
is empty it will be removed again.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-06-01 02:03:42 +02:00