Archived
14
0
Fork 0
Commit graph

966 commits

Author SHA1 Message Date
Andi Kleen
3cfc348bf9 [PATCH] x86: Add portable getcpu call
For NUMA optimization and some other algorithms it is useful to have a fast
to get the current CPU and node numbers in user space.

x86-64 added a fast way to do this in a vsyscall. This adds a generic
syscall for other architectures to make it a generic portable facility.

I expect some of them will also implement it as a faster vsyscall.

The cache is an optimization for the x86-64 vsyscall optimization. Since
what the syscall returns is an approximation anyways and user space
often wants very fast results it can be cached for some time.  The norma
methods to get this information in user space are relatively slow

The vsyscall is in a better position to manage the cache because it has direct
access to a fast time stamp (jiffies). For the generic syscall optimization
it doesn't help much, but enforce a valid argument to keep programs
portable

I only added an i386 syscall entry for now. Other architectures can follow
as needed.

AK: Also added some cleanups from Andrew Morton

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Vojtech Pavlik
c08c820508 [PATCH] Add the vgetcpu vsyscall
This patch adds a vgetcpu vsyscall, which depending on the CPU RDTSCP
capability uses either the RDTSCP or CPUID to obtain a CPU and node
numbers and pass them to the program.

AK: Lots of changes over Vojtech's original code:
Better prototype for vgetcpu()
It's better to pass the cpu / node numbers as separate arguments
to avoid mistakes when going from SMP to NUMA.
Also add a fast time stamp based cache using a user supplied
argument to speed things more up.
Use fast method from Chuck Ebbert to retrieve node/cpu from
GDT limit instead of CPUID
Made sure RDTSCP init is always executed after node is known.
Drop printk

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Vojtech Pavlik
a670fad0ad [PATCH] Add initalization of the RDTSCP auxilliary values
This patch adds initalization of the RDTSCP auxilliary values to CPU numbers
to time.c. If RDTSCP is available, the MSRs are written with the respective
values. It can be later used to initalize per-cpu timekeeping variables.

AK: Some cleanups. Move externs into headers and fix CPU hotplug.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Venkatesh Pallipadi
248dcb2fff [PATCH] x86: i386/x86-64 Add nmi watchdog support for new Intel CPUs
AK: This redoes the changes I temporarily reverted.

Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Vivek Goyal
3f22c5789e [PATCH] kdump x86_64 nmi event notification fix
After a crash we should wait for NMI IPI event and not for external NMI or
NMI watchdog tick.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Andi Kleen
fac58550e8 [PATCH] Fix up panic messages for different NMI panics
When a unknown NMI happened the panic would claim a NMI watchdog timeout.
Also it would check the variable set by nmi_watchdog=panic and panic then.

Fix up the panic message to be generic
Unconditionally panic on unknown NMI when panic on unknown nmi is enabled.

Noticed by Jan Beulich

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Shaohua Li
4038f901cf [PATCH] i386/x86-64: Fix NMI watchdog suspend/resume
Making NMI suspend/resume work with SMP. We use CPU hotplug to offline
APs in SMP suspend/resume. Only BSP executes sysdev's .suspend/.resume
method. APs should follow CPU hotplug code path.

And:

+From: Don Zickus <dzickus@redhat.com>

Makes the start/stop paths of nmi watchdog more robust to handle the
suspend/resume cases more gracefully.

AK: I merged the two patches together

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus
c41c5cd3b2 [PATCH] x86: x86 clean up nmi panic messages
Clean up some of the output messages on the nmi error paths to make more
sense when they are displayed.  This is mainly a cosmetic fix and
shouldn't impact any normal code path.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus
8da5adda91 [PATCH] x86: Allow users to force a panic on NMI
To quote Alan Cox:

The default Linux behaviour on an NMI of either memory or unknown is to
continue operation. For many environments such as scientific computing
it is preferable that the box is taken out and the error dealt with than
an uncorrected parity/ECC error get propogated.

A small number of systems do generate NMI's for bizarre random reasons
such as power management so the default is unchanged. In other respects
the new proc/sys entry works like the existing panic controls already in
that directory.

This is separate to the edac support - EDAC allows supported chipsets to
handle ECC errors well, this change allows unsupported cases to at least
panic rather than cause problems further down the line.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus
e33e89ab1a [PATCH] x86: Add abilty to enable/disable nmi watchdog from procfs (update)
Adds a new /proc/sys/kernel/nmi_watchdog call that will enable/disable the
nmi watchdog.

By entering a non-zero value here, a user can enable the nmi watchdog to
monitor the online cpus in the system.  By entering a zero value here, a
user can disable the nmi watchdog and free up a performance counter which
could then be utilized by the oprofile subsystem, otherwise oprofile may be
short a counter when in use.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus
407984f1af [PATCH] x86: Add abilty to enable/disable nmi watchdog with sysctl
Adds a new /proc/sys/kernel/nmi call that will enable/disable the nmi
watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus
2fbe7b25c8 [PATCH] i386/x86-64: Remove un/set_nmi_callback and reserve/release_lapic_nmi functions
Removes the un/set_nmi_callback and reserve/release_lapic_nmi functions as
they are no longer needed.  The various subsystems are modified to register
with the die_notifier instead.

Also includes compile fixes by Andrew Morton.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Andi Kleen
957dc87c1b [PATCH] Add ppoll/pselect syscalls
Needed TIF_RESTORE_SIGMASK first

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Andi Kleen
1d001df19d [PATCH] Add TIF_RESTORE_SIGMASK
We need TIF_RESTORE_SIGMASK in order to support ppoll() and pselect()
system calls. This patch originally came from Andi, and was based
heavily on David Howells' implementation of same on i386. I fixed a typo
which was causing do_signal() to use the wrong signal mask.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus
3adbbcce9a [PATCH] x86: Cleanup NMI interrupt path
This patch cleans up the NMI interrupt path.  Instead of being gated by if
the 'nmi callback' is set, the interrupt handler now calls everyone who is
registered on the die_chain and additionally checks the nmi watchdog,
reseting it if enabled.  This allows more subsystems to hook into the NMI if
they need to (without being block by set_nmi_callback).

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus
f2802e7f57 [PATCH] Add SMP support on x86_64 to reservation framework
This patch includes the changes to make the nmi watchdog on x86_64 SMP
aware.  A bunch of code was moved around to make it simpler to read.  In
addition, it is now possible to determine if a particular NMI was the result
of the watchdog or not.  This feature allows the kernel to filter out
unknown NMIs easier.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus
828f0afda1 [PATCH] x86: Add performance counter reservation framework for UP kernels
Adds basic infrastructure to allow subsystems to reserve performance
counters on the x86 chips.  Only UP kernels are supported in this patch to
make reviewing easier.  The SMP portion makes a lot more changes.

Think of this as a locking mechanism where each bit represents a different
counter.  In addition, each subsystem should also reserve an appropriate
event selection register that will correspond to the performance counter it
will be using (this is mainly neccessary for the Pentium 4 chips as they
break the 1:1 relationship to performance counters).

This will help prevent subsystems like oprofile from interfering with the
nmi watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Andi Kleen
b07f8915cd [PATCH] x86: Temporarily revert parts of the Core 2 nmi nmi watchdog support
This makes merging easier.  They are readded a few patches later.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Andi Kleen
265baba316 [PATCH] Update defconfig
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Herbert Xu
560c06ae1a [CRYPTO] api: Get rid of flags argument to setkey
Now that the tfm is passed directly to setkey instead of the ctx, we no
longer need to pass the &tfm->crt_flags pointer.

This patch also gets rid of a few unnecessary checks on the key length
for ciphers as the cipher layer guarantees that the key length is within
the bounds specified by the algorithm.

Rather than testing dia_setkey every time, this patch does it only once
during crypto_alloc_tfm.  The redundant check from crypto_digest_setkey
is also removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:41:02 +10:00
Joachim Fritschi
eaf44088ff [CRYPTO] twofish: x86-64 assembly version
The patch passed the trycpt tests and automated filesystem tests.
This rewrite resulted in some nice perfomance increase over my last patch.

Short summary of the tcrypt benchmarks:

Twofish Assembler vs. Twofish C (256bit 8kb block CBC)
encrypt: -27% Cycles
decrypt: -23% Cycles

Twofish Assembler vs. AES Assembler (128bit 8kb block CBC)
encrypt: +18%  Cycles
decrypt: +15% Cycles

Twofish Assembler vs. AES Assembler (256bit 8kb block CBC)
encrypt: -9% Cycles
decrypt: -8% Cycles

Full Output:
http://homepages.tu-darmstadt.de/~fritschi/twofish/tcrypt-speed-twofish-c-x86_64.txt
http://homepages.tu-darmstadt.de/~fritschi/twofish/tcrypt-speed-twofish-asm-x86_64.txt
http://homepages.tu-darmstadt.de/~fritschi/twofish/tcrypt-speed-aes-asm-x86_64.txt


Here is another bonnie++ benchmark with encrypted filesystems. Most runs maxed
out the hd. It should give some idea what the module can do for encrypted filesystem
performance even though you can't see the full numbers.

http://homepages.tu-darmstadt.de/~fritschi/twofish/output_20060610_130806_x86_64.html

Signed-off-by: Joachim Fritschi <jfritschi@freenet.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:16:29 +10:00
Linus Torvalds
79e453d49b Revert mmiocfg heuristics and blacklist changes
This reverts commits 11012d419c and
40dd2d20f2, which allowed us to use the
MMIO accesses for PCI config cycles even without the area being marked
reserved in the e820 memory tables.

Those changes were needed for EFI-environment Intel macs, but broke some
newer Intel 965 boards, so for now it's better to revert to our old
2.6.17 behaviour and at least avoid introducing any new breakage.

Andi Kleen has a set of patches that work with both EFI and the broken
Intel 965 boards, which will be applied once they get wider testing.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-19 08:15:22 -07:00
Al Viro
e65e1fc2d2 [PATCH] syscall class hookup for all normal targets
Take default arch/*/kernel/audit.c to lib/, have those with special
needs (== biarch) define AUDIT_ARCH in their Kconfig.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-12 03:04:40 -04:00
Al Viro
55669bfa14 [PATCH] audit: AUDIT_PERM support
add support for AUDIT_PERM predicate

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-11 13:32:30 -04:00
Al Viro
dc104fb323 [PATCH] audit: more syscall classes added
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-11 13:32:27 -04:00
Suleiman Souhlal
ec0063b40a [PATCH] x86_64: Don't write out segments from vsyscall32 DSO if it is not mapped
It's possible to get an invalid page fault in kernel mode when we try to
write out segments from vsyscall32 when dumping core for a 32bit process if
the vsyscall32 DSO is not mapped in its address space (which can happen if,
for example, ulimit -v 100 is run).

Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Keith Owens
01ebb77b31 [PATCH] x86_64: Save original IST values for checking stack addresses
The values in init_tss.ist[] can change when an IST event occurs.  Save
the original IST values for checking stack addresses when debugging or
doing stack traces.

Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Andi Kleen
40dd2d20f2 [PATCH] x86: Disable MMCONFIG on Intel SDV using DMI blacklist
As a replacement for the earlier removal of the e820 MCFG check
we blacklist the Intel SDV with the original BIOS bug that
motivated that check. On those machines don't use MMCONFIG.

This also adds a new pci=mmconf parameter to override the blacklist.

Cc: Greg KH <gregkh@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Andi Kleen
ceee882230 [PATCH] x86_64: Recover 1MB of kernel memory
Noticed by Jan Beulich.

When the kernel was moved from 1MB to 2MB in 2.6.17 the kernel reservation
code wasn't adjusted and it still reserved starting with 1MB. This means 1MB always
were lost.

This patch fixes this by reserving only starting with _text.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jan Beulich
ea424055b7 [PATCH] x86: Make backtracer fallback logic more bullet-proof
The unwinder fallback logic still had potential for falling through to
the legacy stack trace code without printing an indication (at once
serving as a separator) of this.

Further, the stack pointer retrieval for the fallback should be as
restrictive as possible (in order to avoid having the legacy stack
tracer try to access invalid memory). The patch tightens that, but
this could certainly be further improved.

Also making the call_trace command line option now conditional upon
CONFIG_STACK_UNWIND (as it's meaningless otherwise).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen
c05991ed12 [PATCH] x86_64: Add kernel thread stack frame termination for properly stopping stack unwinds.
One open question: Should these added pushes perhaps be made
conditional upon CONFIG_STACK_UNWIND or CONFIG_UNWIND_INFO?
[AK: Not needed -- these are all very slow paths]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen
11012d419c [PATCH] x86: Revert e820 MCFG heuristics
The check for the MCFG table being reserved in the e820 map was originally
added to detect a broken BIOS in a preproduction Intel SDV. However it also
breaks the Apple x86 Macs, which can't supply this properly, but need
a working MCFG. With this patch they wouldn't use the MCFG and not work.

After some discussion I think it's best to remove the heuristic again.
It also failed on some other boxes (although it didn't cause much
problems there because old style port access for PCI config space
still works as fallback), but the preproduction SDVs can just use
pci=nommcfg. Supporting production machines properly is more
important.

Edgar Hucek did all the debugging work.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen
ddcf36511d [PATCH] x86_64: Update defconfig
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Horms
012c437d03 [PATCH] Change panic_on_oops message to "Fatal exception"
Previously the message was "Fatal exception: panic_on_oops", as introduced
in a recent patch whith removed a somewhat dangerous call to ssleep() in
the panic_on_oops path.  However, Paul Mackerras suggested that this was
somewhat confusing, leadind people to believe that it was panic_on_oops
that was the root cause of the fatal exception.  On his suggestion, this
patch changes the message to simply "Fatal exception".  A suitable oops
message should already have been displayed.

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-14 12:54:29 -07:00
Alexey Dobriyan
825e037fb8 [PATCH] Fix more per-cpu typos
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06 08:57:47 -07:00
Muli Ben-Yehuda
a166222cde [PATCH] x86_64: Fix CONFIG_IOMMU_DEBUG
If CONFIG_IOMMU_DEBUG is set force_iommu defaults to 1. In the case
where no HW IOMMU is present in the machine and we end up using nommu,
leaving force_iommu set to 1 causes dma_alloc_coherent to do the wrong
thing. Therefore, if we end up using nommu, make sure force_iommu is
0.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-02 20:19:54 -07:00
Andi Kleen
2699500b31 [PATCH] x86_64: Fix backtracing for interrupt stacks
Re-add backlink for old style unwinder to stack switching.  Add proper
stack frame and CFI annotations to call_softirq

This prevents a oops when backtracing with fallback through the
interrupt stack top.

Suggested by Jan Beulich and Herbert Xu wanted it in 2.6.18.

Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-02 20:19:54 -07:00
Roland McGrath
0b0bf7a3cc [PATCH] vDSO hash-style fix
The latest toolchains can produce a new ELF section in DSOs and
dynamically-linked executables.  The new section ".gnu.hash" replaces
".hash", and allows for more efficient runtime symbol lookups by the
dynamic linker.  The new ld option --hash-style={sysv|gnu|both} controls
whether to produce the old ".hash", the new ".gnu.hash", or both.  In some
new systems such as Fedora Core 6, gcc by default passes --hash-style=gnu
to the linker, so that a standard invocation of "gcc -shared" results in
producing a DSO with only ".gnu.hash".  The new ".gnu.hash" sections need
to be dealt with the same way as ".hash" sections in all respects; only the
dynamic linker cares about their contents.  To work with older dynamic
linkers (i.e.  preexisting releases of glibc), a binary must have the old
".hash" section.  The --hash-style=both option produces binaries that a new
dynamic linker can use more efficiently, but an old dynamic linker can
still handle.

The new section runs afoul of the custom linker scripts used to build vDSO
images for the kernel.  On ia64, the failure mode for this is a boot-time
panic because the vDSO's PT_IA_64_UNWIND segment winds up ill-formed.

This patch addresses the problem in two ways.

First, it mentions ".gnu.hash" in all the linker scripts alongside ".hash".
 This produces correct vDSO images with --hash-style=sysv (or old tools),
with --hash-style=gnu, or with --hash-style=both.

Second, it passes the --hash-style=sysv option when building the vDSO
images, so that ".gnu.hash" is not actually produced.  This is the most
conservative choice for compatibility with any old userland.  There is some
concern that some ancient glibc builds (though not any known old production
system) might choke on --hash-style=both binaries.  The optimizations
provided by the new style of hash section do not really matter for a DSO
with a tiny number of symbols, as the vDSO has.  If someone wants to use
=gnu or =both for their vDSO builds and worry less about that
compatibility, just change the option and the linker script changes will
make any choice work fine.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:43 -07:00
Chandra Seetharaman
be6b5a3505 [PATCH] cpu hotplug: use hotplug version of registration in late inits
Use hotplug version of register_cpu_notifier in late init functions.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:39 -07:00
Horms
cea6a4ba8a [PATCH] panic_on_oops: remove ssleep()
This patch is part of an effort to unify the panic_on_oops behaviour across
all architectures that implement it.

It was pointed out to me by Andi Kleen that if an oops has occured in
interrupt context, then calling sleep() in the oops path will only cause a
panic, and that it would be really better for it not to be in the path at
all.

This patch removes the ssleep() call and reworks the console message
accordinly.  I have a slght concern that the resulting console message is
too long, feedback welcome.

For powerpc it also unifies the 32bit and 64bit behaviour.

Fror x86_64, this patch only updates the console message, as ssleep() is
already not present.

Signed-off-by: Horms <horms@verge.net.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:39 -07:00
Eric W. Biederman
2a8a3d5b65 [PATCH] machine_kexec.c: Fix the description of segment handling
One of my original comments in machine_kexec was unclear
and this should fix it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Horms <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:38 -07:00
Andi Kleen
65f87d8a8a [PATCH] x86_64: Fix swiotlb=force
It was broken before. But having it is important as possible hardware
bug workaround.

And previously there was no way to force swiotlb if there is another IOMMU.
Side effect is that iommu=force won't force swiotlb anymore even if there
isn't another IOMMU.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen
355540f333 [PATCH] x86_64: Revert k8-bus.c northbridge access change
As Travis Betak points out it accesses the wrong northbridge subfunction
now. Switch back to the old code.

Cc: "Travis Betak" <betak@mpdtxmail.amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Jon Mason
d2105b10fe [PATCH] x86_64: Calgary IOMMU - Multi-Node NULL pointer dereference fix
Calgary hits a NULL pointer dereference when booting in a multi-chassis
NUMA system.  See Redhat bugzilla number 198498, found by Konrad
Rzeszutek (konradr@redhat.com).

There are many issues that had to be resolved to fix this problem.
Firstly when I originally wrote the code to handle NUMA systems, I
had a large misunderstanding that was not corrected until now.  That was
that I thought the "number of nodes online" referred to number of
physical systems connected.  So that if NUMA was disabled, there
would only be 1 node and it would only show that node's PCI bus.
In reality if NUMA is disabled, the system displays all of the
connected chassis as one node but is only ignorant of the delays
in accessing main memory.  Therefore, references to num_online_nodes()
and MAX_NUMNODES are incorrect and need to be set to the maximum
number of nodes that can be accessed (which are 8).  I created a
variable, MAX_NUM_CHASSIS, and set it to 8 to fix this.

Secondly, when walking the PCI in detect_calgary, the code only
checked the first "slot" when looking to see if a device is present.
This will work for most cases, but unfortunately it isn't always the
case.  In the NUMA MXE drawers, there are USB devices present on the
3rd slot (with slot 1 being empty).  So, to work around this, all
slots (up to 8) are scanned to see if there are any devices present.

Lastly, the bus is being enumerated on large systems in a different
way the we originally thought.  This throws the ugly logic we had
out the window.  To more elegantly handle this, I reorganized the
kva array to be sparse (which removed the need to have any bus number
to kva slot logic in tce.c) and created a secondary space array to
contain the bus number to phb mapping.

With these changes Calgary boots on an x460 with 4 nodes with and
without NUMA enabled.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Muli Ben-Yehuda
089bbbcb36 [PATCH] x86_64: Calgary IOMMU - fix off by one error
Fixed off-by-one error in detect_calgary and calgary_init which will
cause arrays to overflow.  Also, removed impossible to hit BUG_ON.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen
0e5f61b00c [PATCH] x86_64: On Intel systems when CPU has C3 don't use TSC
On Intel systems generally the TSC stops in C3 or deeper,
so don't use it there. Follows similar logic on i386.

This should fix problems on Meroms.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen
260f659b23 [PATCH] x86_64: Update defconfig
Update defconfig

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen
b13761ecd1 [PATCH] x86_64: Dump leftover backtrace entries when dwarf2 unwinder got stuck
The dwarf2 unwinder currently often gets stuck because a lot
of assembly code doesn't have proper dwarf2 annotiation yet.

This currently often happens with __down. Should fix this by
adding proper dwarf2 annotation to all inline assembly. However
until that's done we need a quick fix for 2.6.18 to avoid
incomplete backtraces.

So when this happens dump the rest of the stack with the old unwinder
instead of silently not dumping it. There was already a optional
"both" mode that dumped both, but that was too ugly.

I also clarified the headers for the different backtraces a bit.

Also add a clear error message for missing dwarf2
annotation that people can work on.

And I removed a dead variable left over from Ingo's changes.

Cc: mingo@elte.hu
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28 19:28:00 -07:00
Andi Kleen
0e92da4acb [PATCH] x86_64: Don't clobber r8-r11 in int 0x80 handler
When int 0x80 is called from long mode r8-r11 would leak out of the
kernel (or rather they would be filled with some values from
the kernel stack). I don't think it's a security issue because
the values come from the fixed stack frame which should be near
always user registers from a previous interrupt.

Still better fix it.

Longer term the register save macros need to be cleaned up
to avoid such mistakes in the future.

Original analysis from Richard Brunner, fix by me.

Cc: Richard.Brunner@amd.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28 19:28:00 -07:00
Andi Kleen
d5a2601734 [PATCH] i386/x86-64: Add user_mode checks to profile_pc for oprofile
Fixes a obscure user space triggerable crash during oprofiling.

Oprofile calls profile_pc from NMIs even when user_mode(regs) is not true and
the program counter is inside the kernel lock section. This opens
a race - when a user program jumps to a kernel lock address and
a NMI happens before the illegal page fault exception is raised
and the program has a unmapped esp or ebp then the kernel could
oops. NMIs have a higher priority than exceptions so that could
happen.

Add user_mode checks to i386/x86-64 profile_pc to prevent that.

Cc: John Levon <levon@movementarian.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28 19:28:00 -07:00
Andi Kleen
2c87e2cd0b [PATCH] x86_64: Fix access check in ptrace compat
We can't safely directly access an compat_alloc_user_space() pointer
with the siginfo copy functions. Bounce it through the stack.

Noticed by Al Viro using sparse

[ This was only added post 2.6.17, not in any released kernel ]

Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:33 -07:00
Muli Ben-Yehuda
aa0a9f373e [PATCH] x86_64: Fix Calgary copyright statements per IBM guidelines
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:33 -07:00
Jacob Shin
0d2caebd56 [PATCH] x86_64: Fix hotplug problem in mce amd
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:32 -07:00
Markus Schoder
3391c22e5b [PATCH] x86_64: Bring x86-64 ia32 emul in sync with i386 on READ_IMPLIES_EXEC enabling
Currently ia32 binaries behave differently with respect to enabling
READ_IMPLIES_EXEC.  On i386 a binary with the exec_stack flag set is
executed with READ_IMPLIES_EXEC enabled as well.  The same binary
executes without READ_IMPLIES_EXEC on x86-64.

This causes binaries that work on i386 to fail on x86-64 which goes
somewhat against the whole 32 bit emulation idea.

It has been argued that READ_IMPLIES_EXEC should not be enabled at all
for binaries that have the exec_stack flag.  Which is probably a valid
point.  However until this is clarified I think x86-64 should behave the
same for ia32 binaries as i386.

The following patch brings x86-64 in sync with i386 for ia32 binaries.

Signed-off-by: Markus Schoder <lists@gammarayburst.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:32 -07:00
Andi Kleen
d5d8ad78b0 [PATCH] x86_64: Update defconfig
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:32 -07:00
Jon Smirl
894673ee61 [PATCH] tty: Remove include of screen_info.h from tty.h
screen_info.h doesn't have anything to do with the tty layer and shouldn't be
included by tty.h.  This patches removes the include and modifies all users to
directly include screen_info.h.  struct screen_info is mainly used to
communicate with the console drivers in drivers/video/console.  Note that this
patch touches every arch and I have no way of testing it.  If there is a
mistake the worst thing that will happen is a compile error.

[akpm@osdl.org: fix arm build]
[akpm@osdl.org: fix alpha build]
Signed-off-by: Jon Smirl <jonsmir@gmail.com>
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:16 -07:00
Arjan van de Ven
1454aed92b [PATCH] put a comment at register_die_notifier that the export is used
{un}register_die_notifier() is used by kdb... document this so that future
"remove dead export" rounds can skip this export.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:15 -07:00
Ingo Molnar
f86bf9b7bc [PATCH] lockdep: clean up completion initializer in smpboot.c
Clean up lockdep on-stack-completion initializer.  (This also removes the
dependency on waitqueue_lock_key.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:14 -07:00
Andrew Morton
1a91023a9f [PATCH] x86_64: e820.c needs pgtable.h
arch/x86_64/kernel/e820.c:42: error: 'MAXMEM' undeclared here (not in a function)

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:13 -07:00
Linus Torvalds
51bece910d Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
  kbuild: introduce utsrelease.h
  kbuild: explicit turn off gcc stack-protector
2006-07-03 21:26:12 -07:00
Ingo Molnar
60be6b9a41 [PATCH] lockdep: annotate on-stack completions
lockdep needs to have the waitqueue lock initialized for on-stack waitqueues
implicitly initialized by DECLARE_COMPLETION().  Annotate on-stack completions
accordingly.

Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:09 -07:00
Ingo Molnar
366c7f554e [PATCH] lockdep: annotate enable_in_hardirq()
Make use of local_irq_enable_in_hardirq() API to annotate places that enable
hardirqs in hardirq context.

Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:09 -07:00
Ingo Molnar
1e9505279a [PATCH] lockdep: enable on x86_64
Enable LOCKDEP_SUPPORT on x86_64.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:05 -07:00
Ingo Molnar
2148270cd2 [PATCH] lockdep: x86_64 early init
x86_64 uses spinlocks very early - earlier than start_kernel().  So call
lockdep_init() from the arch setup code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:04 -07:00
Ingo Molnar
6375e2b74c [PATCH] lockdep: irqtrace cleanup of include/asm-x86_64/irqflags.h
Clean up the x86-64 irqflags.h file:

 - macro => inline function transformation
 - simplifications
 - style fixes

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar
2601e64d26 [PATCH] lockdep: irqtrace subsystem, x86_64 support
Add irqflags-tracing support to x86_64.

[akpm@osdl.org: build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar
21b32bbff9 [PATCH] lockdep: stacktrace subsystem, x86_64 support
Framework to generate and save stacktraces quickly, without printing anything
to the console.  x86_64 support.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Ingo Molnar
c9ca1ba5bd [PATCH] lockdep: x86_64 document stack frame internals
Document stack frame nesting internals some more.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Ingo Molnar
3ac94932a2 [PATCH] lockdep: beautify x86_64 stacktraces
Beautify x86_64 stacktraces to be more readable.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Sam Ravnborg
63104eec23 kbuild: introduce utsrelease.h
include/linux/version.h contained both actual KERNEL version
and UTS_RELEASE that contains a subset from git SHA1 for when
kernel was compiled as part of a git repository.
This had the unfortunate side-effect that all files including version.h
would be recompiled when some git changes was made due to changes SHA1.
Split it out so we keep independent parts in separate files.

Also update checkversion.pl script to no longer check for UTS_RELEASE.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2006-07-03 23:30:54 +02:00
Thomas Gleixner
b1e05aa230 [PATCH] irq-flags: x86_64: Use the new IRQF_ constants
Use the new IRQF_ constants and remove the SA_INTERRUPT define

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02 13:58:49 -07:00
Linus Torvalds
fc25465f09 Merge branch 'audit.b22' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b22' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] audit syscall classes
  [PATCH] audit: support for object context filters
  [PATCH] audit: rename AUDIT_SE_* constants
  [PATCH] add rule filterkey
2006-07-01 09:59:08 -07:00
Heiko Carstens
a581c2a469 [PATCH] add __[start|end]_rodata sections to asm-generic/sections.h
Add __start_rodata and __end_rodata to sections.h to avoid extern
declarations.  Needed by s390 code (see following patch).

[akpm@osdl.org: update architectures]
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-01 09:56:03 -07:00
Al Viro
b915543b46 [PATCH] audit syscall classes
Allow to tie upper bits of syscall bitmap in audit rules to kernel-defined
sets of syscalls.  Infrastructure, a couple of classes (with 32bit counterparts
for biarch targets) and actual tie-in on i386, amd64 and ia64.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-07-01 07:44:10 -04:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Adrian Bunk
80f7228b59 typo fixes: occuring -> occurring
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 18:27:16 +02:00
Adrian Bunk
5bba17127e [NET]: make skb_release_data() static
skb_release_data() no longer has any users in other files.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:30 -07:00
Matt LaPlante
1f1332f727 [PATCH] KConfig: Spellchecking 'similarity' and 'independent'
Several KConfig files had 'similarity' and 'independent' spelled incorrectly...

Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 11:59:14 -07:00
Ingo Molnar
c0ad90a32f [PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()
Add ->retrigger() irq op to consolidate hw_irq_resend() implementations.
(Most architectures had it defined to NOP anyway.)

NOTE: ia64 needs testing. i386 and x86_64 tested.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:23 -07:00
Ingo Molnar
a53da52fd7 [PATCH] genirq: cleanup: merge irq_affinity[] into irq_desc[]
Consolidation: remove the irq_affinity[NR_IRQS] array and move it into the
irq_desc[NR_IRQS].affinity field.

[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:22 -07:00
Ingo Molnar
d1bef4ed5f [PATCH] genirq: rename desc->handler to desc->chip
This patch-queue improves the generic IRQ layer to be truly generic, by adding
various abstractions and features to it, without impacting existing
functionality.

While the queue can be best described as "fix and improve everything in the
generic IRQ layer that we could think of", and thus it consists of many
smaller features and lots of cleanups, the one feature that stands out most is
the new 'irq chip' abstraction.

The irq-chip abstraction is about describing and coding and IRQ controller
driver by mapping its raw hardware capabilities [and quirks, if needed] in a
straightforward way, without having to think about "IRQ flow"
(level/edge/etc.) type of details.

This stands in contrast with the current 'irq-type' model of genirq
architectures, which 'mixes' raw hardware capabilities with 'flow' details.
The patchset supports both types of irq controller designs at once, and
converts i386 and x86_64 to the new irq-chip design.

As a bonus side-effect of the irq-chip approach, chained interrupt controllers
(master/slave PIC constructs, etc.) are now supported by design as well.

The end result of this patchset intends to be simpler architecture-level code
and more consolidation between architectures.

We reused many bits of code and many concepts from Russell King's ARM IRQ
layer, the merging of which was one of the motivations for this patchset.

This patch:

rename desc->handler to desc->chip.

Originally i did not want to do this, because it's a big patch.  But having
both "desc->handler", "desc->handle_irq" and "action->handler" caused a
large degree of confusion and made the code appear alot less clean than it
truly is.

I have also attempted a dual approach as well by introducing a
desc->chip alias - but that just wasnt robust enough and broke
frequently.

So lets get over with this quickly.  The conversion was done automatically
via scripts and converts all the code in the kernel.

This renaming patch is the first one amongst the patches, so that the
remaining patches can stay flexible and can be merged and split up
without having some big monolithic patch act as a merge barrier.

[akpm@osdl.org: build fix]
[akpm@osdl.org: another build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:21 -07:00
Yasunori Goto
cc57637b0b [PATCH] solve config broken: undefined reference to `online_page'
Memory hotplug code of i386 adds memory to only highmem.  So, if
CONFIG_HIGHMEM is not set, CONFIG_MEMORY_HOTPLUG shouldn't be set.
Otherwise, it causes compile error.

In addition, many architecture can't use memory hotplug feature yet.  So, I
introduce CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:20 -07:00
Andrew Morton
d9a5685436 [PATCH] x86_64: oprofile build fix
WARNING: "unset_nmi_callback" [arch/x86_64/oprofile/oprofile.ko] undefined!
WARNING: "set_nmi_callback" [arch/x86_64/oprofile/oprofile.ko] undefined!

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:07 -07:00
Andrew Morton
a052b68b1e [PATCH] x86: do_IRQ(): check irq number
We recently changed x86 to handle more than 256 IRQs.  Add a check in do_IRQ()
just to make sure that nothing went wrong with that implementation.

[chrisw@sous-sol.org: do x86_64 too]
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@muc.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:02 -07:00
Siddha, Suresh B
5c45bf279d [PATCH] sched: mc/smt power savings sched policy
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in
/sys/devices/system/cpu/ control the MC/SMT power savings policy for the
scheduler.

Based on the values (1-enable, 0-disable) for these controls, sched groups
cpu power will be determined for different domains.  When power savings
policy is enabled and under light load conditions, scheduler will minimize
the physical packages/cpu cores carrying the load and thus conserving
power(with a perf impact based on the workload characteristics...  see OLS
2005 CMP kernel scheduler paper for more details..)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Chandra Seetharaman
74b85f3790 [PATCH] cpu hotplug: make cpu_notifier related notifier blocks __cpuinit only
Make notifier_blocks associated with cpu_notifier as __cpuinitdata.

__cpuinitdata makes sure that the data is init time only unless
CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman
9c7b216d23 [PATCH] cpu hotplug: revert init patch submitted for 2.6.17
In 2.6.17, there was a problem with cpu_notifiers and XFS.  I provided a
band-aid solution to solve that problem.  In the process, i undid all the
changes you both were making to ensure that these notifiers were available
only at init time (unless CONFIG_HOTPLUG_CPU is defined).

We deferred the real fix to 2.6.18.  Here is a set of patches that fixes the
XFS problem cleanly and makes the cpu notifiers available only at init time
(unless CONFIG_HOTPLUG_CPU is defined).

If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
time.

This patch reverts the notifier_call changes made in 2.6.17

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
Randy Dunlap
c9cf55285e [PATCH] add poison.h and patch primary users
Localize poison values into one header file for better documentation and
easier/quicker debugging and so that the same values won't be used for
multiple purposes.

Use these constants in core arch., mm, driver, and fs code.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Rusty Russell
19eadf98c8 [PATCH] x86: increase interrupt vector range
Remove the limit of 256 interrupt vectors by changing the value stored in
orig_{e,r}ax to be the complemented interrupt vector.  The orig_{e,r}ax
needs to be < 0 to allow the signal code to distinguish between return from
interrupt and return from syscall.  With this change applied, NR_IRQS can
be > 256.

Xen extends the IRQ numbering space to include room for dynamically
allocated virtual interrupts (in the range 256-511), which requires a more
permissive interface to do_IRQ.

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Yasunori Goto
bc02af93dd [PATCH] pgdat allocation for new node add (specify node id)
Change the name of old add_memory() to arch_add_memory.  And use node id to
get pgdat for the node at NODE_DATA().

Note: Powerpc's old add_memory() is defined as __devinit. However,
      add_memory() is usually called only after bootup.
      I suppose it may be redundant. But, I'm not well known about powerpc.
      So, I keep it. (But, __meminit is better at least.)

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:35 -07:00
Linus Torvalds
da206c9e68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  typo fixes
  Clean up 'inline is not at beginning' warnings for usb storage
  Storage class should be first
  i386: Trivial typo fixes
  ixj: make ixj_set_tone_off() static
  spelling fixes
  fix paniced->panicked typos
  Spelling fixes for Documentation/atomic_ops.txt
  move acknowledgment for Mark Adler to CREDITS
  remove the bouncing email address of David Campbell
2006-06-26 13:33:14 -07:00
Linus Torvalds
972d19e837 Merge master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  [CRYPTO] tcrypt: Forbid tcrypt from being built-in
  [CRYPTO] aes: Add wrappers for assembly routines
  [CRYPTO] tcrypt: Speed benchmark support for digest algorithms
  [CRYPTO] tcrypt: Return -EAGAIN from module_init()
  [CRYPTO] api: Allow replacement when registering new algorithms
  [CRYPTO] api: Removed const from cra_name/cra_driver_name
  [CRYPTO] api: Added cra_init/cra_exit
  [CRYPTO] api: Fixed incorrect passing of context instead of tfm
  [CRYPTO] padlock: Rearrange context structure to reduce code size
  [CRYPTO] all: Pass tfm instead of ctx to algorithms
  [CRYPTO] digest: Remove unnecessary zeroing during init
  [CRYPTO] aes-i586: Get rid of useless function wrappers
  [CRYPTO] digest: Add alignment handling
  [CRYPTO] khazad: Use 32-bit reads on key
2006-06-26 11:03:29 -07:00
Linus Torvalds
81a07d7588 Merge branch 'x86-64'
* x86-64: (83 commits)
  [PATCH] x86_64: x86_64 stack usage debugging
  [PATCH] x86_64: (resend) x86_64 stack overflow debugging
  [PATCH] x86_64: msi_apic.c build fix
  [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
  [PATCH] x86_64: Avoid broadcasting NMI IPIs
  [PATCH] x86_64: fix apic error on bootup
  [PATCH] x86_64: enlarge window for stack growth
  [PATCH] x86_64: Minor string functions optimizations
  [PATCH] x86_64: Move export symbols to their C functions
  [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR
  [PATCH] x86_64: Fix modular pc speaker
  [PATCH] x86_64: remove sys32_ni_syscall()
  [PATCH] x86_64: Do not use -ffunction-sections for modules
  [PATCH] x86_64: Add cpu_relax to apic_wait_icr_idle
  [PATCH] x86_64: adjust kstack_depth_to_print default
  [PATCH] i386/x86-64: adjust /proc/interrupts column headings
  [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels
  [PATCH] x86_64: Fix fast check in safe_smp_processor_id
  [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information
  [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
  ...

Manual resolve of trivial conflict in arch/i386/kernel/Makefile
2006-06-26 10:51:09 -07:00
Eric Sandeen
8501a2fbe7 [PATCH] x86_64: x86_64 stack usage debugging
Applies to git & 2.6.17-rc6 after CONFIG_DEBUG_STACKOVERFLOW patch

uses same stack-zeroing mechanism as on i386 to discover maximum stack
excursions.

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Eric Sandeen
4961f10e22 [PATCH] x86_64: (resend) x86_64 stack overflow debugging
Take two, now without spurious whitespace :(  Applies to git & 2.6.17-rc6

CONFIG_DEBUG_STACKOVERFLOW existed for x86_64 in 2.4, but seems to have gone AWOL in 2.6.

I've pretty much just copied this over from the 2.4 code, with
appropriate tweaks for the 2.6 kernel, plus a bugfix.  I'd personally
rather see it printed out the way other arches do it, i.e.
bytes-remaining-until-overflow, rather than having to do the subtraction
yourself.  Also, only 128 bytes remaining seems awfully late to issue a
warning.  But I'll start here :)

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Venkatesh Pallipadi
0080e66755 [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Keith Owens
e77deacb7b [PATCH] x86_64: Avoid broadcasting NMI IPIs
On some i386/x86_64 systems, sending an NMI IPI as a broadcast will
reset the system.  This seems to be a BIOS bug which affects machines
where one or more cpus are not under OS control.  It occurs on HT
systems with a version of the OS that is not compiled without HT
support.  It also occurs when a system is booted with max_cpus=n where
2 <= n < cpus known to the BIOS.  The fix is to always send NMI IPI as
a mask instead of as a broadcast.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Siddha, Suresh B
704fc59e1d [PATCH] x86_64: fix apic error on bootup
Appended patch fixes the "APIC error on CPUX: 00(40)" observed during bootup.

From SDM Vol-3A "Valid Interrupt Vectors" section:
	"When an illegal vector value (0-15) is written to an LVT entry
	and the delivery mode is Fixed, the APIC may signal an illegal
	vector error, with out regard to whether the mask bit is set
	or whether an interrupt is actually seen on input."

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Chuck Ebbert
03fdc2c277 [PATCH] x86_64: enlarge window for stack growth
Allow stack growth so the 'enter' instruction works.  Also
fixes problem in compat_sys_kexec_load() which could allocate
more than 128 bytes using compat_alloc_user_space().

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Andi Kleen
6bfa9bb519 [PATCH] x86_64: Minor string functions optimizations
- Use tail call from clear_user to __clear_user to save some code size
 - Use standard memcpy for forward memmove

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00