Archived
14
0
Fork 0
Commit graph

513 commits

Author SHA1 Message Date
Siddha, Suresh B
f6c2e3330d [PATCH] x86_64: Unmap NULL during early bootup
We should zap the low mappings, as soon as possible, so that we can catch
kernel bugs more effectively. Previously early boot had NULL mapped
and didn't trap on NULL references.

This patch introduces boot_level4_pgt, which will always have low identity
addresses mapped.  Druing boot, all the processors will use this as their
level4 pgt.  On BP, we will switch to init_level4_pgt as soon as we enter C
code and zap the low mappings as soon as we are done with the usage of
identity low mapped addresses.  On AP's we will zap the low mappings as
soon as we jump to C code.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
Suresh Siddha
f5f786d045 [PATCH] x86-64/i386: Fix CPU model for family 6
According to cpuid instruction in IA32 SDM-Vol2, when computing cpu model,
we need to consider extended model ID for family 0x6 also.

AK: Also added fixes/simplifcation from Petr Vandrovec

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:14 -08:00
James Cleverdon
6004e1b7ef [PATCH] i386/x86-64: Share interrupt vectors when there is a large number of interrupt sources
Here's a patch that builds on Natalie Protasevich's IRQ compression
patch and tries to work for MPS boots as well as ACPI.  It is meant for
a 4-node IBM x460 NUMA box, which was dying because it had interrupt
pins with GSI numbers > NR_IRQS and thus overflowed irq_desc.

The problem is that this system has 270 GSIs (which are 1:1 mapped with
I/O APIC RTEs) and an 8-node box would have 540.  This is much bigger
than NR_IRQS (224 for both i386 and x86_64).  Also, there aren't enough
vectors to go around.  There are about 190 usable vectors, not counting
the reserved ones and the unused vectors at 0x20 to 0x2F.  So, my patch
attempts to compress the GSI range and share vectors by sharing IRQs.

Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Andi Kleen
fed644132f [PATCH] x86_64: Make i386 compile again with fourth DMA32 zone
The code should deal with an additional empty zone, so fix up the
#error.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-14 19:55:13 -08:00
Nick Piggin
53e86b91b7 [PATCH] i386: generic cmpxchg
- Make cmpxchg generally available on the i386 platform.

- Provide emulation of cmpxchg suitable for uniprocessor if built and run on
  386.

From: Christoph Lameter <clameter@sgi.com>

- Cut down patch and small style changes.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-13 18:14:15 -08:00
Tim Mann
7feacd5334 [PATCH] x86: fix cpu_khz with clock=pit
Fix http://bugzilla.kernel.org/show_bug.cgi?id=5546

The cpu_khz global is not initialized and remains 0 if you boot with
clock=pit, even if the processor does have a TSC.  This may have bad
ramifications since the variable is used in various places scattered around
the kernel, though I didn't check them all to see if they can tolerate cpu_khz
= 0.  You can observe the problem by doing "cat /proc/cpuinfo"; the cpu MHz
line says 0.000.

The fix is trivial; call init_cpu_khz() from init_pit(), just as it's called
from the timers/timer_foo.c:init_foo() for other values of foo.

Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-13 18:14:13 -08:00
Jan Beulich
e27182088e [PATCH] i386: NMI pointer comparison fix
Instruction pointer comparisons for the NMI on debug stack check/fixup
were incorrect.

From: Jan Beulich <jbeulich@novell.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Zwane Mwaikambo <zwane@holomorphy.com>
Acked-by: "Seth, Rohit" <rohit.seth@intel.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-13 18:14:13 -08:00
Adrian Bunk
27d99f7ead [PATCH] arch/i386/mm/init.c: small cleanups
This patch contains the following cleanups:
- make a needlessly global function static
- every file should include the headers containing the prototypes for
  it's global functions

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-13 18:14:13 -08:00
Jeff Garzik
bca73e4bf8 [PATCH] move pm_register/etc. to CONFIG_PM_LEGACY, pm_legacy.h
Since few people need the support anymore, this moves the legacy
pm_xxx functions to CONFIG_PM_LEGACY, and include/linux/pm_legacy.h.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-13 18:14:10 -08:00
Jesse Barnes
6e6ece5dc6 [PATCH] PCI: fix for Toshiba ohci1394 quirk
After much testing and agony, I've discovered that my previous ohci1394
quirk for Toshiba laptops is not 100% reliable.  It apparently fails to
do the interrupt line change either correctly or in time, since in about
2 out of 5 boots, the kernel's irqdebug code will *still* disable irq 11
when the ohci1394 driver is loaded (at pci_enable_device time I think).

This patch switches things around a little in the workaround.  First, it
removes the mdelay.  I didn't see a need for it and my testing has shown
that it's not necessary for the quirk to work.

Secondly, instead of trying to change the interrupt line to what ACPI
tells us it should be, this patch makes the quirk use the value in the
PCI_INTERRUPT_LINE register.  On this laptop at least, that seems to be
the right thing to do, though additional testing on other laptops and/or
with actual firewire devices would be appreciated.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-11-10 16:09:18 -08:00
Adrian Bunk
78ac151c83 [PATCH] i386: EXPORT_SYMBOL(screen_info) even #ifndef CONFIG_VT
The folllowing modules require screen_info but don't depend
on CONFIG_VT:
- vga16fb.ko
- intelfb.ko

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:36 -08:00
Nick Piggin
64c7c8f885 [PATCH] sched: resched and cpu_idle rework
Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce
confusion, and make their semantics rigid.  Improves efficiency of
resched_task and some cpu_idle routines.

* In resched_task:
- TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
  and as we hold it during resched_task, then there is no need for an
  atomic test and set there. The only other time this should be set is
  when the task's quantum expires, in the timer interrupt - this is
  protected against because the rq lock is irq-safe.

- If TIF_NEED_RESCHED is set, then we don't need to do anything. It
  won't get unset until the task get's schedule()d off.

- If we are running on the same CPU as the task we resched, then set
  TIF_NEED_RESCHED and no further action is required.

- If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
  after TIF_NEED_RESCHED has been set, then we need to send an IPI.

Using these rules, we are able to remove the test and set operation in
resched_task, and make clear the previously vague semantics of
POLLING_NRFLAG.

* In idle routines:
- Enter cpu_idle with preempt disabled. When the need_resched() condition
  becomes true, explicitly call schedule(). This makes things a bit clearer
  (IMO), but haven't updated all architectures yet.

- Many do a test and clear of TIF_NEED_RESCHED for some reason. According
  to the resched_task rules, this isn't needed (and actually breaks the
  assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
  held). So remove that. Generally one less locked memory op when switching
  to the idle thread.

- Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the inner
  most polling idle loops. The above resched_task semantics allow it to be
  set until before the last time need_resched() is checked before going into
  a halt requiring interrupt wakeup.

  Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
  can be always left set, completely eliminating resched IPIs when rescheduling
  the idle task.

  POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:33 -08:00
Nick Piggin
5bfb5d690f [PATCH] sched: disable preempt in idle tasks
Run idle threads with preempt disabled.

Also corrected a bugs in arm26's cpu_idle (make it actually call schedule()).
How did it ever work before?

Might fix the CPU hotplugging hang which Nigel Cunningham noted.

We think the bug hits if the idle thread is preempted after checking
need_resched() and before going to sleep, then the CPU offlined.

After calling stop_machine_run, the CPU eventually returns from preemption and
into the idle thread and goes to sleep.  The CPU will continue executing
previous idle and have no chance to call play_dead.

By disabling preemption until we are ready to explicitly schedule, this bug is
fixed and the idle threads generally become more robust.

From: alexs <ashepard@u.washington.edu>

  PPC build fix

From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>

  MIPS build fix

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:33 -08:00
Linus Torvalds
dad2ad82c5 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq 2005-11-07 13:28:20 -08:00
Adrian Bunk
5fed0578be [PATCH] unexport phys_proc_id and cpu_core_id
EXPORT_SYMBOL's for phys_proc_id and cpu_core_id were added this year but
never used.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:54:09 -08:00
Ananth N Mavinakayanahalli
d217d5450f [PATCH] Kprobes: preempt_disable/enable() simplification
Reorganize the preempt_disable/enable calls to eliminate the extra preempt
depth.  Changes based on Paul McKenney's review suggestions for the kprobes
RCU changeset.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli
991a51d83a [PATCH] Kprobes: Use RCU for (un)register synchronization - arch changes
Changes to the arch kprobes infrastructure to take advantage of the locking
changes introduced by usage of RCU for synchronization.  All handlers are now
run without any locks held, so they have to be re-entrant or provide their own
synchronization.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli
9a0e3a8683 [PATCH] Kprobes: Track kprobe on a per_cpu basis - i386 changes
I386 changes to track kprobe execution on a per-cpu basis.  We now track the
kprobe state machine independently on each cpu, using an arch specific kprobe
control block.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:45 -08:00
Ananth N Mavinakayanahalli
66ff2d0691 [PATCH] Kprobes: rearrange preempt_disable/enable() calls
The following set of patches are aimed at improving kprobes scalability.  We
currently serialize kprobe registration, unregistration and handler execution
using a single spinlock - kprobe_lock.

With these changes, kprobe handlers can run without any locks held.  It also
allows for simultaneous kprobe handler executions on different processors as
we now track kprobe execution on a per processor basis.  It is now necessary
that the handlers be re-entrant since handlers can run concurrently on
multiple processors.

All changes have been tested on i386, ia64, ppc64 and x86_64, while sparc64
has been compile tested only.

The patches can be viewed as 3 logical chunks:

patch 1: 	Reorder preempt_(dis/en)able calls
patches 2-7: 	Introduce per_cpu data areas to track kprobe execution
patches 8-9: 	Use RCU to synchronize kprobe (un)registration and handler
		execution.

Thanks to Maneesh Soni, James Keniston and Anil Keshavamurthy for their
review and suggestions. Thanks again to Anil, Hien Nguyen and Kevin Stafford
for testing the patches.

This patch:

Reorder preempt_disable/enable() calls in arch kprobes files in preparation to
introduce locking changes.  No functional changes introduced by this patch.

Signed-off-by: Ananth N Mavinakayahanalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:45 -08:00
Christoph Hellwig
481bed4542 [PATCH] consolidate sys_ptrace()
The sys_ptrace boilerplate code (everything outside the big switch
statement for the arch-specific requests) is shared by most architectures.
This patch moves it to kernel/ptrace.c and leaves the arch-specific code as
arch_ptrace.

Some architectures have a too different ptrace so we have to exclude them.
They continue to keep their implementations.  For sh64 I had to add a
sh64_ptrace wrapper because it does some initialization on the first call.
For um I removed an ifdefed SUBARCH_PTRACE_SPECIAL block, but
SUBARCH_PTRACE_SPECIAL isn't defined anywhere in the tree.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-By: David Howells <dhowells@redhat.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Prasanna S Panchamukhi
cd6b0762a0 [PATCH] Move Kprobes and Oprofile to "Instrumentation Support" menu
Andrew Morton suggested to move kprobes from kernel hacking menu, since
kernel hacking menu is in-appropriate for the Kprobes.  This patch moves
Kprobes and Oprofile under instrumentation menu.

(akpm: it's not a natural fit, but things like djprobes and the s390 guys'
statistics library need a home)

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: John Levon <levon@movementarian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:35 -08:00
Shaohua Li
31ab269a03 [PATCH] x86: add MCE resume
It's widely seen a MCE non-fatal error reported after resume.  It seems MCE
resume is lacked under ia32.  This patch tries to fix the gap.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:30 -08:00
Adrian Bunk
cc658cfe3c [PATCH] arch/i386/kernel/scx200.c should #include <linux/scx200_gpio.h>
Every file should #include the header files containing the prototypes of
its global functions

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:29 -08:00
Adrian Bunk
5cc6135af7 [PATCH] arch/i386/kernel/reboot_fixups.c should #include <linux/reboot_fixups.h>
Every file should #include the header files containing the prototypes of
its global functions

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:29 -08:00
Adrian Bunk
8d1ed6366b [PATCH] arch/i386/kernel/ldt.c should #include <asm/mmu_context.h>
Every file should #include the header files containing the prototypes of
its global functions

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:29 -08:00
Zwane Mwaikambo
77f72b192f [PATCH] i386: LVT entries remaining unmasked on reboot
Excerpt from bugzilla entry

http://bugzilla.kernel.org/show_bug.cgi?id=5518

"i386 version of Reboot-through-BIOS is unsafe: it forgets to mask APIC LVT
interrupts before jumping to a BIOS entry point.  As a result, BIOS ends up
bombarded with interrupts early on boot.  The BIOS does not expect it since
following a "normal" hardware cpu reset, all APIC LVT registers have the
Mask bit (16) set and can't generate interrupts.

For example, the version of Phoenix BIOS used by VMware enables interrupts
for the first time before masking/clearing APIC LVT.  The APIC Timer LVT
register is still set up for a timer interrupt delivery with a high vector
from the previous Linux incarnation (0xef in our case).  The BIOS has not
fully initialized its IDT at this point and the real mode gate for 0xef
remains all zeros.  Vector 0xef dispatches BIOS to address 0:0, BIOS takes
a #GP and eventually hangs.

machine_shutdown() does attempt to shut down APIC before jumping to BIOS,
but it is ineffective"

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:28 -08:00
Tobias Klauser
38e548ee1a [PATCH] arch/i386: Use ARRAY_SIZE macro
Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0])

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:28 -08:00
Bart Oldeman
f912696ab3 [PATCH] reset tss->io_bitmap_owner in sys_ioperm()
my patch "x86: initialise tss->io_bitmap_owner to something" (commit ID
d5cd4aadd3) introduced a problem with a
program (DOSEMU) that called ioperm after already doing some port i/o.

The problem is that a process switch return causes tss->io_bitmap_base
to be set to IO_BITMAP_OFFSET so that the fault (that *really* sets the
io bitmap) never triggers.

This fixes that regression.

Signed-off-by: Bart Oldeman <bartoldeman@users.sourceforge.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-05 16:31:36 -08:00
Roland Dreier
1d37374197 [PATCH] toshiba_ohci1394_dmi_table should be __devinitdata, not __devinit
I don't really understand why gcc gives the error it does, but without
this patch, when building with CONFIG_HOTPLUG=n, I get errors like:

      CC      arch/x86_64/pci/../../i386/pci/fixup.o
    arch/x86_64/pci/../../i386/pci/fixup.c: In function `pci_fixup_i450nx':
    arch/x86_64/pci/../../i386/pci/fixup.c:13: error: pci_fixup_i450nx causes a section type conflict

The change is obviously correct: an array should be declared
__devinitdata rather that __devinit.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Martin J. Bligh <mbligh@mbligh.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-01 21:27:22 -08:00
Linus Torvalds
1e4c85f97f Revert "i386: move apic init in init_IRQs"
Commit f2b36db692 causes a bootup hang on
at least one machine.  Revert for now until we understand why.  The old
code may be ugly, but it works.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-31 19:16:17 -08:00
Arthur Othieno
f2c84c0e84 [PATCH] i386: CONFIG_PC removal
CONFIG_PC is left-over cruft after the introduction of CONFIG_X86_PC with
the subarch split.  Remove it, and fixup the remaining users to depend on
CONFIG_X86_PC instead.

Signed-off-by: Arthur Othieno <a.othieno@bluewin.ch>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-31 09:20:54 -08:00
Tim Schmielau
4e57b68178 [PATCH] fix missing includes
I recently picked up my older work to remove unnecessary #includes of
sched.h, starting from a patch by Dave Jones to not include sched.h
from module.h. This reduces the number of indirect includes of sched.h
by ~300. Another ~400 pointless direct includes can be removed after
this disentangling (patch to follow later).
However, quite a few indirect includes need to be fixed up for this.

In order to feed the patches through -mm with as little disturbance as
possible, I've split out the fixes I accumulated up to now (complete for
i386 and x86_64, more archs to follow later) and post them before the real
patch.  This way this large part of the patch is kept simple with only
adding #includes, and all hunks are independent of each other.  So if any
hunk rejects or gets in the way of other patches, just drop it.  My scripts
will pick it up again in the next round.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:32 -08:00
Clemens Ladisch
7811fb8f40 [PATCH] hpet-RTC: cache the comparator register
Reads from an HPET register require a round trip to the south bridge and are
almost as slow as PCI reads.  By caching the last value we've written to the
comparator register, we can eliminate all HPET reads from the fast path in the
emulated RTC interrupt handler.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:30 -08:00
Clemens Ladisch
5f819949ee [PATCH] hpet-RTC: fix timer config register accesses
Make sure that the RTC timer is in non-periodic mode; some stupid BIOS might
have initialized it to periodic mode.

Furthermore, don't set the SETVAL bit in the config register.  This wouldn't
have any effect unless the timer was in period mode (which it isn't), and then
the actual timer frequency would be half that of the desired one because
incrementing the comparator in the interrupt handler would be done after the
hardware has already incremented it itself.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:29 -08:00
Clemens Ladisch
f00c96f313 [PATCH] hpet-RTC: disable interrupt when no longer needed
When the emulated RTC interrupt is no longer needed, we better disable it;
otherwise, we get a spurious interrupt whenever the timer has rolled over and
reaches the same comparator value.

Having a superfluous interrupt every five minutes doesn't hurt much, but it's
bad style anyway.  ;-)

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:29 -08:00
Randy Dunlap
874ec33ff9 [PATCH] sparse cleanups: NULL pointers, C99 struct init.
Convert most of the remaining "Using plain integer as NULL pointer" sparse
warnings to use NULL.  (Not duplicating patches that are already in -mm,
-bird, or -kj.)

Convert isdn driver struct initializer to use C99 syntax.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:29 -08:00
Thomas Gleixner
ecea8d19c9 [PATCH] jiffies_64 cleanup
Define jiffies_64 in kernel/timer.c rather than having 24 duplicated
defines in each architecture.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:25 -08:00
Christoph Hellwig
dfb7dac3af [PATCH] unify sys_ptrace prototype
Make sure we always return, as all syscalls should.  Also move the common
prototype to <linux/syscalls.h>

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:20 -08:00
Paolo 'Blaisorblade' Giarrusso
d89ea9b8bb [PATCH] i386: use -mcpu, not -mtune, for GCCs older than 3.4
I just noted that -mtune is used, which is only supported on recent GCCs; by
reading http://gcc.gnu.org/gcc-3.4/changes.html, you see "-mcpu has been
renamed to -mtune.", so for GCC < 3.4 we're not using any specific tuning in
the appropriate cases.  However -mcpu is deprecated, so use -mtune when
possible.

This was introduced by commit e9d4dce954a60dc23dd1d967766ca2347b780e54 of the
old tree (between 2.6.10-rc3 and 2.6.10) by Linus Torvalds, to remove the use
of -march, since that could trigger gcc using SSE on its own.  But no
attention was used about using -mcpu vs.  -mtune.

And btw, the old 2.6.4 code (for instance) was:
cflags-$(CONFIG_MPENTIUMII)     += $(call check_gcc,-march=pentium2,-march=i686)
cflags-$(CONFIG_MPENTIUMIII)    += $(call check_gcc,-march=pentium3,-march=i686)
cflags-$(CONFIG_MPENTIUMM)      += $(call check_gcc,-march=pentium3,-march=i686)
cflags-$(CONFIG_MPENTIUM4)      += $(call check_gcc,-march=pentium4,-march=i686)

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:16 -08:00
Paolo 'Blaisorblade' Giarrusso
96d55b882b [PATCH] uml: reuse i386 cpu-specific tuning
Make UML share the underlying cpu-specific tuning done on i386.

Actually, for now many config options aren't used a lot - but that can be done
later.  Also, UML relies on GCC optimization for things like memcpy and such
more than i386, so specifying the correct -march and -mtune should be enough.
Later, we may want to correct some other stuff.

For instance, since FPU context switching, for us, is done (at least
partially, i.e.  between our kernelspace and userspace) by the host, we may
allow usage of FPU operations by GCC.  This doesn't hold for kernelspace vs.
kernelspace, but we don't support preemption.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:16 -08:00
Ashok Raj
1aa1a9f98f [PATCH] create and destroy cache sysfs entries based on cpu notifiers
cpu cache entries should be populated only when cpu is online and removed
when they are logically offlined.

Without which entries are not removed when cpu is offlined, or dont appear
when we boot with maxcpus=1 and then kick the rest of the cpus via echo 1
to the sysfs online file.

- Changed __devinit to __cpuinit for consistency.
- Changed sysfs_driver_register to register_cpu_notifier.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Zwane Mwaikambo <zwane@holomorphy.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:14 -08:00
Magnus Damm
5d35704028 [PATCH] i386: srat on non-acpi hw fix
This patch adds a check for the return value of acpi_find_root_pointer().
Without this patch systems without ACPI support such as QEMU crashes when
booting a NUMA kernel with CONFIG_ACPI_SRAT=y.

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Eric W. Biederman
6c180d94ab [PATCH] i386 mpparse: Only ignore lapic information we can't store
After staring at mpparse.c for a little longer I noticed that when we hit
our limit of num_processors we are filtering out information about other
processors that we can still store.

This patch just reorders the code so we store everything we can.

This should avoid the incorrect warning about our boot CPU not being listed
by the BIOS that we are now getting in the kexec on panic case, and it
should allow us to detect all apicid conflicts even when our physical
number of cpus exceeds maxcpus.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Vivek Goyal
009b29d90f [PATCH] kdump/i386: apic verification failure fix
o Removes the unnecessary call to local_irq_disable().

o Kdump was failing while second kernel was coming up. Check for presence
  of boot cpu apic id was failing in (apic_id_registered), hence hitting
  BUG().

o This should not have failed because before calling setup_local_APIC(), it is
  ensured that even if BIOS has not reported boot cpu, then hard set the
  prence of it. Problem happens because of usage of hard_smp_processor_id()
  which is hardcoded to zero in case of non SMP kernel. In kdump case second
  kernel can boot on a cpu whose boot cpu id is not zero.

o Using boot_cpu_physical_apicid instead to hard set the presence of boot cpu.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Brian Gerst
c531178157 [PATCH] Clean up mtrr compat ioctl code
Handle 32-bit mtrr ioctls in the mtrr driver instead of the ia32
compatability layer.

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Kamble, Nitin A
daedb82d6b [PATCH] x86: vmx cpu feature detection
If VMX feature is available in the CPU, this patch will make it visible in
the /proc/cpuinfo with the cpuid detection.

Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Eric W. Biederman
3d1675b41b [PATCH] i386 kexec-on-panic: Don't shutdown the apics.
It is dangerous to shutdown the apics in machine_crash_shutdown.

With my previous patch to initialize apics in init_IRQ we should be able to
boot a kernel without this.  As long as we reinitialize the APICs we don't
care what state they were in during bootup.

This should make machine_crash_shutdown noticeably more reliable.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Eric W. Biederman
f2b36db692 [PATCH] i386: move apic init in init_IRQs
All kinds of ugliness exists because we don't initialize
the apics during init_IRQs.
- We calibrate jiffies in non apic mode even when we are using apics.
- We have to have special code to initialize the apics when non-smp.
- The legacy i8259 must exist and be setup correctly, even
  when we won't use it past initialization.
- The kexec on panic code must restore the state of the io_apics.
- init/main.c needs a special case for !smp smp_init on x86

In addition to pure code movement I needed a couple
of non-obvious changes:
- Move setup_boot_APIC_clock into APIC_late_time_init for
  simplicity.
- Use cpu_khz to generate a better approximation of loops_per_jiffies
  so I can verify the timer interrupt is working.
- Call setup_apic_nmi_watchdog again after cpu_khz is initialized on
  the boot cpu.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Eric W. Biederman
29b70081f7 [PATCH] i386 nmi_watchdog: Merge check_nmi_watchdog fixes from x86_64
The per cpu nmi watchdog timer is based on an event counter.  idle cpus
don't generate events so the NMI watchdog doesn't fire and the test to see
if the watchdog is working fails.

- Add nmi_cpu_busy so idle cpus don't mess up the test.
- kmalloc prev_nmi_count to keep kernel stack usage bounded.
- Improve the error message on failure so there is enough
  information to debug problems.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
Eric W. Biederman
fcfd636a72 [PATCH] i386 io_apic.c: Memorize at bootup where the i8259 is connected
Currently we attempt to restore virtual wire mode on reboot, which only
works if we can figure out where the i8259 is connected.  This is very
useful when we kexec another kernel and likely helpful when dealing with a
BIOS that make assumptions about how the system is setup.

Since the acpi MADT table does not provide the location where the i8259 is
connected we have to look at the hardware to figure it out.

Most systems have the i8259 connected the local apic of the cpu so won't be
affected but people running Opteron and some serverworks chipsets should be
able to use kexec now.

In addition this patch removes the hard coded assumption that the io_apic
that delivers isa interrups is always known to the kernel as io_apic 0.  As
there does not appear to be anything to guarantee that assumption is true.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
Natalie.Protasevich@unisys.com
9338316c93 [PATCH] ES7000 platform update
This is platform code update for ES7000: disables IRQ overrides for the
recent ES7000 (Rascal/Zorro), cleans up the compile warning.  The patch
only affects the ES7000 subarch.

Signed-off-by: <Natalie.Protasevich@unisys.com>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
Venkatesh Pallipadi
30037f66ce [PATCH] x86: when L3 is present show its size in /proc/cpuinfo
The code that prints the cache size assumes that L3 always lives in chipset
and is shared across CPUs.  Which is not really true.

I think all the cachesizes reported by cpuid are in the processor itself.
The attached patch changes the code to reflect that.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
Dave Hansen
f014a556e7 [PATCH] fixup bogus e820 entry with mem=
This was reported because someone was getting oopses reading /proc/iomem.
It was tracked down to a zero-sized 'struct resource' entry which was
located right at 4GB.

You need two conditions to hit this bug: a BIOS E820_RAM area starting at
exactly the boundary where you specify mem= (to get a zero-sized entry),
and for the legacy_init_iomem_resources() loop to skip that resource (which
only happens at exactly 4G).

I think the killing zero-sized e820 entry is the easiest way to fix this.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
aleksey_gorelov@phoenix.com
750deaa402 [PATCH] asus vt8235 router buggy bios workaround
Hopefully fix http://bugzilla.kernel.org/show_bug.cgi?id=5235

Similar problem has been reported before here:
http://groups.google.com/group/linux.kernel/browse_thread/thread/def4ca19dbc3cd4/5cffbf349f2c87a4?tvc=2&q=Aleksey+Gorelov&hl=en#5cffbf349f2c87a4
and was related to bug in BIOS reporting 82C686 router compatible to 586.

I suspect BIOS on this board has similar issue: reports VT8235 router to be
compatible with 586 one - which is obviously not true.  Patch from the link
above has already incorporated in both 2.6 & 2.4 series, but might not work
in this particular case.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
Venkatesh Pallipadi
434440a280 [PATCH] x86: bug fix in P6 Machine check initialization
Make P6 MCA initialization code complaint with guidelines in IA-32 SDM
Vol3.  Bank 0 control register should not be set by OS and clear status
registers on all banks on reset.

This will prevent false MCE alarms on the systems that has some non-MCE
information left-over in MC0_STATUS on reboot.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
Zachary Amsden
251e6912df [PATCH] x86: add an accessor function for getting the per-CPU gdt
Add an accessor function for getting the per-CPU gdt.  Callee must already
have the CPU.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
Zachary Amsden
72e12b76fe [PATCH] x86: bogus tls from gdt
The per-CPU initialization code is copying in bogus data into
thread->tls_array.  Note that it copies &per_cpu(cpu_gdt_table, cpu), not
&per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TLS_MIN).  That is totally broken
and unnecessary.  Make the initialization explicitly NULL.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
Natalie Protasevich
9f40a72a7e [PATCH] x86: hot plug CPU to support physical add of new processors
The patch allows physical bring-up of new processors (not initially present
in the configuration) from facilities such as driver/utility implemented on
a platform.  The actual method of making processors available is up to the
platform implementation.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Zwane Mwaikambo <zwane@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:12 -08:00
Siddha, Suresh B
d16aafff25 [PATCH] intel_cacheinfo: remove MAX_CACHE_LEAVES limit
Initial internal version of Venki's cpuid(4) deterministic cache parameter
identification patch used static arrays of size MAX_CACHE_LEAVES.  Final patch
which made to the base used dynamic array allocation, with this
MAX_CACHE_LEAVES limit hunk still in place.

cpuid(4) already has a mechanism to find out the number of cache levels
implemented and there is no need for this hardcoded MAX_CACHE_LEAVES limit.

So remove the MAX_CACHE_LEAVES limit from the routine which calculates the
number of cache levels using cpuid(4)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Bart Oldeman
d5cd4aadd3 [PATCH] x86: initialise tss->io_bitmap_owner to something
There exists a field io_bitmap_owner in the TSS that is only checked, but
never set to anything else but NULL.

Signed-off-by: Bart Oldeman <bartoldeman@users.sourceforge.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Shaohua Li
08967f941a [PATCH] FPU context corrupted after resume
mxcsr_feature_mask_init isn't needed in suspend/resume time (we can use
boot time mask).  And actually it's harmful, as it clear task's saved
fxsave in resume.  This bug is widely seen by users using zsh.

(akpm: my eyes.  Fixed some surrounding whitespace mess)

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Jan Beulich
8896fab35e [PATCH] x86: cmpxchg improvements
This adjusts i386's cmpxchg patterns so that

- for word and long cmpxchg-es the compiler can utilize all possible
  registers

- cmpxchg8b gets disabled when the minimum specified hardware architectur
  doesn't support it (like was already happening for the byte, word, and
  long ones).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Mathieu Desnoyers
dacb16b1a0 [PATCH] i386 and x86_64 TSC set_cyc2ns_scale imprecision
I just found out that some precision is unnecessarily lost in the
arch/i386/kernel/timers/timer_tsc.c:set_cyc2ns_scale function.  It uses a
cpu_mhz parameter when it could use a cpu_khz.  In the specific case of an
Intel P4 running at 3001.171 Mhz, the truncation to 3001 Mhz leads to an
imprecision of 19 microseconds per second : this is very sad for a timer with
nearly nanosecond accuracy.

Fix the x86_64 architecture too.

Cc: george anzinger <george@mvista.com>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Brian Gerst
0d078f6f96 [PATCH] CONFIG_IA32
Add CONFIG_X86_32 for i386.  This allows selecting options that only apply
to 32-bit systems.

(X86 && !X86_64) becomes X86_32
(X86 ||  X86_64) becomes X86

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:10 -08:00
Dave Hansen
05039b9263 [PATCH] memory hotplug: i386 addition functions
Adds the necessary for non-NUMA hot-add of highmem to an existing zone on
i386.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:45 -07:00
Dave Hansen
208d54e551 [PATCH] memory hotplug locking: node_size_lock
pgdat->node_size_lock is basically only neeeded in one place in the normal
code: show_mem(), which is the arch-specific sysrq-m printing function.

Strictly speaking, the architectures not doing memory hotplug do no need this
locking in show_mem().  However, they are all included for completeness.  This
should also make any future consolidation of all of the implementations a
little more straightforward.

This lock is also held in the sparsemem code during a memory removal, as
sections are invalidated.  This is the place there pfn_valid() is made false
for a memory area that's being removed.  The lock is only required when doing
pfn_valid() operations on memory which the user does not already have a
reference on the page, such as in show_mem().

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:44 -07:00
Hugh Dickins
4c21e2f244 [PATCH] mm: split page table lock
Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
a many-threaded application which concurrently initializes different parts of
a large anonymous area.

This patch corrects that, by using a separate spinlock per page table page, to
guard the page table entries in that page, instead of using the mm's single
page_table_lock.  (But even then, page_table_lock is still used to guard page
table allocation, and anon_vma allocation.)

In this implementation, the spinlock is tucked inside the struct page of the
page table page: with a BUILD_BUG_ON in case it overflows - which it would in
the case of 32-bit PA-RISC with spinlock debugging enabled.

Splitting the lock is not quite for free: another cacheline access.  Ideally,
I suppose we would use split ptlock only for multi-threaded processes on
multi-cpu machines; but deciding that dynamically would have its own costs.
So for now enable it by config, at some number of cpus - since the Kconfig
language doesn't support inequalities, let preprocessor compare that with
NR_CPUS.  But I don't think it's worth being user-configurable: for good
testing of both split and unsplit configs, split now at 4 cpus, and perhaps
change that to 8 later.

There is a benefit even for singly threaded processes: kswapd can be attacking
one part of the mm while another part is busy faulting.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:42 -07:00
Hugh Dickins
60ec558549 [PATCH] mm: i386 sh sh64 ready for split ptlock
Use pte_offset_map_lock, instead of pte_offset_map (or inappropriate
pte_offset_kernel) and mm-wide page_table_lock, in sundry arch places.

The i386 vm86 mark_screen_rdonly: yes, there was and is an assumption that the
screen fits inside the one page table, as indeed it does.

The sh __do_page_fault: which handles both kernel faults (without lock) and
user mm faults (locked - though it set_pte without locking before).

The sh64 flush_cache_range and helpers: which wrongly thought callers held
page_table_lock before (only its tlb_start_vma did, and no longer does so);
moved the flush loop down, and adjusted the large versus small range decision
to consider a range which spans page tables as large.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins
c34d1b4d16 [PATCH] mm: kill check_user_page_readable
check_user_page_readable is a problematic variant of follow_page.  It's used
only by oprofile's i386 and arm backtrace code, at interrupt time, to
establish whether a userspace stackframe is currently readable.

This is problematic, because we want to push the page_table_lock down inside
follow_page, and later split it; whereas oprofile is doing a spin_trylock on
it (in the i386 case, forgotten in the arm case), and needs that to pin
perhaps two pages spanned by the stackframe (which might be covered by
different locks when we split).

I think oprofile is going about this in the wrong way: it doesn't need to know
the area is readable (neither i386 nor arm uses read protection of user
pages), it doesn't need to pin the memory, it should simply
__copy_from_user_inatomic, and see if that succeeds or not.  Sorry, but I've
not got around to devising the sparse __user annotations for this.

Then we can eliminate check_user_page_readable, and return to a single
follow_page without the __follow_page variants.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins
872fec16d9 [PATCH] mm: init_mm without ptlock
First step in pushing down the page_table_lock.  init_mm.page_table_lock has
been used throughout the architectures (usually for ioremap): not to serialize
kernel address space allocation (that's usually vmlist_lock), but because
pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it.

Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
and drop it when allocating a new one, to check lest a racing task already
did.  Similarly no page_table_lock in vmalloc's map_vm_area.

Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
user mms, which are converted only by a later patch, for now they have to lock
differently according to whether or not it's init_mm.

If sources get muddled, there's a danger that an arch source taking
init_mm.page_table_lock will be mixed with common source also taking it (or
neither take it).  So break the rules and make another change, which should
break the build for such a mismatch: remove the redundant mm arg from
pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).

Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
took page_table_lock for no good reason.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Jesse Barnes
f8977d0a9b [PATCH] PCI fixup for Toshiba laptops and ohci1394
This is a fix for a bug I see on my Toshiba laptop, where the ohci1394
controller gets initialized improperly.  The patch adds two PCI fixups
to arch/i386/pci/fixup.c, one that happens early on to cache the value
of the PCI_CACHE_LINE_SIZE config register, and another that later
restores the value, along with a valid IRQ number and some BAR values.
I've tested it on my laptop, and it prevents me from running into what I
consider to be a major bug: IRQ 11 is disabled by the IRQ debug code,
causing my wireless to break.

Thanks to Rob for the original patch to ohci1394.c and Stefan for lots
of proofreading (and a last minute bug caught in review!) and additional
information collection.  I think the DMI system list is correct, but we
may need to add some more PCI IDs to the PCI_FIXUP macros over time.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 15:37:02 -07:00
Greg Kroah-Hartman
53f4654272 [PATCH] Driver Core: fix up all callers of class_device_create()
The previous patch adding the ability to nest struct class_device
changed the paramaters to the call class_device_create().  This patch
fixes up all in-kernel users of the function.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 09:52:52 -07:00
Chris Wright
63172cb3d5 [PATCH] typo fix in last cpufreq powernow patch
Not sure how it slipped by, but here's a trivial typo fix for powernow.

Signed-off-by: Chris Wright <chrisw@osdl.org>
[ It's "nurter" backwards.. Maybe we have a hillbilly The Shining fan? ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-21 17:08:30 -07:00
Dave Jones
0213df7431 [PATCH] cpufreq: fix pending powernow timer stuck condition
AMD recently discovered that on some hardware, there is a race condition
possible when a C-state change request goes onto the bus at the same
time as a P-state change request.

Both requests happen, but the southbridge hardware only acknowledges the
C-state change.  The PowerNow! driver is then stuck in a loop, waiting
for the P-state change acknowledgement.  The driver eventually times
out, but can no longer perform P-state changes.

It turns out the solution is to resend the P-state change, which the
southbridge will acknowledge normally.

Thanks to Johannes Winkelmann for reporting this and testing the fix.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-21 14:28:58 -07:00
Dave Jones
bfdc708dc7 [CPUFREQ] kzalloc conversions for i386 drivers.
Signed-off-by: Dave Jones <davej@redhat.com>
2005-10-20 15:16:15 -07:00
Andi Kleen
3c92c2ba33 [PATCH] i386: Don't discard upper 32bits of HWCR on K8
Need to use long long, not long when RMWing a MSR. I think
it's harmless right now, but still should be better fixed
if AMD adds any bits in the upper 32bit of HWCR.

Bug was introduced with the TLB flush filter fix for i386

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 16:34:09 -07:00
Markus F.X.J. Oberhumer
d347f37227 [PATCH] i386: fix stack alignment for signal handlers
This fixes the setup of the alignment of the signal frame, so that all
signal handlers are run with a properly aligned stack frame.

The current code "over-aligns" the stack pointer so that the stack frame
is effectively always mis-aligned by 4 bytes.  But what we really want
is that on function entry ((sp + 4) & 15) == 0, which matches what would
happen if the stack were aligned before a "call" instruction.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-10 08:45:06 -07:00
Al Viro
dd0fc66fb3 [PATCH] gfp flags annotations - part 1
- added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08 15:00:57 -07:00
Nick Piggin
b33fa1f3c3 [PATCH] i386: include linux/irq.h rather than asm/hw_irq.h
I need the following patch to compile -git8 here, otherwise these
files fail to compile (asm/hw_irq.h needs definitions from
linux/irq.h and that file provides the required include ordering).

I did not do a full audit, though there looks to be many other
places that should get the same treatment, if this is  the right
way to do it.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-30 10:58:37 -07:00
Andi Kleen
7d318d7747 [PATCH] Fix up TLB flush filter disabling
I checked with AMD and they requested to only disable it for family 15.
Also disable it for i386 too. And some style fixes.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29 15:41:42 -07:00
Al Viro
ce3a161e69 [PATCH] useless includes of linux/irq.h in arch/i386
Most of these guys are simply not needed (pulled by other stuff
via asm-i386/hardirq.h).  One that is not entirely useless is hilarious -
arch/i386/oprofile/nmi_timer_int.c includes linux/irq.h... as a way to
get linux/errno.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-26 18:29:50 -07:00
Dave Jones
b9111b7b7f [CPUFREQ] Remove preempt_disable from powernow-k8
Via reading the code, my understanding is that powernow-k8 uses
preempt_disable to ensure that driver->target doesn't migrate across cpus
whilst it's accessing per processor registers, however set_cpus_allowed
will provide this for us.  Additionally, remove schedule() calls from
set_cpus_allowed as set_cpus_allowed ensures that you're executing on the
target processor on return.

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-09-23 11:10:42 -07:00
David S. Miller
4db2ce0199 [LIB]: Consolidate _atomic_dec_and_lock()
Several implementations were essentialy a common piece of C code using
the cmpxchg() macro.  Put the implementation in one spot that everyone
can share, and convert sparc64 over to using this.

Alpha is the lone arch-specific implementation, which codes up a
special fast path for the common case in order to avoid GP reloading
which a pure C version would require.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 21:47:01 -07:00
Linus Torvalds
1619cca292 Partially revert "Fix time going twice as fast problem on ATI Xpress chipsets"
Commit 66759a01ad introduced the fix for
time ticking too fast on some boards by disabling one of the doubly
connected timer pins on ATI boards.

However, it ends up being _much_ too broad a brush, and that just makes
some other ATI boards not work at all since they now have no timer
source.

So disable the automatic ATI southbridge detection, and just rely on
people who see this problem disabling it by hand with the option
"disable_timer_pin_1" on the kernel command line.

Maybe somebody can figure out the proper tests at a later date.

Acked-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-14 15:56:27 -07:00
Cal Peake
0a305d2e1b [PATCH] Even more fallout from ATI Xpress timer workaround
disable_timer_pin_1 needs IO-APIC, not just local APIC.

Signed-off-by: Cal Peake <cp@absolutedigital.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 15:07:06 -07:00
Chuck Ebbert
33333373c4 [PATCH] i386: Ignore masked FPU exceptions
Masked FPU exceptions should obviously not happen in the first place,
but if they do, ignoring them seems to be the right thing to do.

Although there is no documentation available for Cyrix MII, I did find
erratum F-7 for Winchip C6, "FPU instruction may result in spurious
exception under certain conditions" which seems to indicate that this
can happen.

That would also explain the behaviour Ondrej Zary reported on the MII.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 09:59:04 -07:00
Tobias Klauser
6f673d83ca [PATCH] arch/i386: Replace custom macro with isdigit()
Replace the custom is_digit() macro with isdigit() from <linux/ctype.h>

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 08:22:33 -07:00
Randy Dunlap
9f1583339a [PATCH] use add_taint() for setting tainted bit flags
Use the add_taint() interface for setting tainted bit flags instead of
doing it manually.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 08:22:29 -07:00
Linus Torvalds
a217e8c181 Fix fallout from ATI Xpress timer workaround
ACPI earlyquirks needs to honor the proper config variables, and include
the right header file.

(Fixes commit 66759a01ad)

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 12:32:31 -07:00
Chuck Ebbert
66759a01ad [PATCH] x86-64: i386/x86-64: Fix time going twice as fast problem on ATI Xpress chipsets
Original patch from Bertro Simul

This is probably still not quite correct, but seems to be
the best solution so far.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:50:58 -07:00
Andi Kleen
69e1a33f62 [PATCH] x86-64: Use ACPI PXM to parse PCI<->node assignments
Since this is shared code I had to implement it for i386 too

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:57 -07:00
Andi Kleen
f343bb4cd7 [PATCH] x86{-64}: Remove old hack that disabled mmconfig support on AMD systems.
Now that Greg implemented MCFG/_SEG support this shouldn't be needed
anymore

Cc: gregkh@suse.de

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 10:49:55 -07:00
Roland McGrath
c3ff8ec31c [PATCH] i386: Don't miss pending signals returning to user mode after signal processing
Signed-off-by: Roland McGrath <roland@redhat.com>
Acked-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-12 07:52:16 -07:00
Paolo 'Blaisorblade' Giarrusso
a7d0c21033 [PATCH] i386 / uml: add dwarf sections to static link script
Inside the linker script, insert the code for DWARF debug info sections. This
may help GDB'ing a Uml binary. Actually, it seems that ld is able to guess
what I added correctly, but normal linker scripts include this section so it
should be correct anyway adding it.

On request by Sam Ravnborg <sam@ravnborg.org>, I've added it to
asm-generic/vmlinux.lds.s. I've also moved there the stabs debug section,
used the new macro in i386 linker script and added DWARF debug section to
that.

In the truth, I've not been able to verify the difference in GDB behaviour
after this change (I've seen large improvements with another patch). This
may depend on my binutils version, older one may have worse defaults.

However, this section is present in normal linker script, so add it at
least for the sake of cleanness.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 12:00:17 -07:00
Nishanth Aravamudan
52e6e63088 [PATCH] i386: fix-up schedule_timeout() usage
Use schedule_timeout_interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:37 -07:00
Adrian Bunk
672289e9fa [PATCH] i386/x86_64: make get_cpu_vendor() static
get_cpu_vendor() no longer has any users in other files.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:35 -07:00
Pavel Machek
b01d8684e9 [PATCH] remove ACPI S4bios support
Remove S4BIOS support.  It is pretty useless, and only ever worked for _me_
once.  (I do not think anyone else ever tried it).  It was in feature-removal
for a long time, and it should have been removed before.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:35 -07:00
Nishanth Aravamudan
aeb8397b6a [PATCH] i386/smpboot: use msleep() instead of schedule_timeout()
Replace schedule_timeout() with msleep() to guarantee the task delays as
expected.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:29 -07:00
Linus Torvalds
486a153f0e Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild 2005-09-09 15:46:49 -07:00
Daniel Ritz
f0eca9626c [PATCH] Update PCI IOMEM allocation start
This fixes the problem with "Averatec 6240 pcmcia_socket0: unable to
apply power", which was due to the CardBus IOMEM register region being
allocated at an address that was actually inside the RAM window that had
been reserved for video frame-buffers in an UMA setup.

The BIOS _should_ have marked that region reserved in the e820 memory
descriptor tables, but did not.

It is fixed by rounding up the default starting address of PCI memory
allocations, so that we leave a bigger gap after the final known memory
location.  The amount of rounding depends on how big the unused memory
gap is that we can allocate IOMEM from.

Based on example code by Linus.

Acked-by: Greg KH <greg@kroah.com>
Acked-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 15:25:46 -07:00
Ingo Molnar
8d06afab73 [PATCH] timer initialization cleanup: DEFINE_TIMER
Clean up timer initialization by introducing DEFINE_TIMER a'la
DEFINE_SPINLOCK.  Build and boot-tested on x86.  A similar patch has been
been in the -RT tree for some time.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 14:03:48 -07:00
Antonino A. Daplas
5e518d7672 [PATCH] fbdev: Resurrect hooks to get EDID from firmware
For the i386, code is already present in video.S that gets the EDID from the
video BIOS.  Make this visible so drivers can also use this data as fallback
when i2c does not work.

To ensure that the EDID block is returned for the primary graphics adapter
only, by check if the IORESOURCE_ROM_SHADOW flag is set.

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:59 -07:00
Antonino A. Daplas
d2d58384fc [PATCH] vesafb: Add blanking support
Add rudimentary support by manipulating the VGA registers.  However, not
all vesa modes are VGA compatible, so VGA compatiblity is checked first.
Only 2 levels are supported, powerup and powerdown.

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:58 -07:00
Andrea Arcangeli
4c7fc7220f [PATCH] i386: seccomp fix for auditing/ptrace
This is the same issue as ppc64 before, when returning to userland we
shouldn't re-compute the seccomp check or the task could be killed during
sigreturn when orig_eax is overwritten by the sigreturn syscall.  This was
found by Roland.

This was harmless from a security standpoint, but some i686 users reported
failures with auditing enabled system wide (some distro surprisingly makes
it the default) and I reproduced it too by keeping the whole workload under
strace -f.

Patch is tested and works for me under strace -f.

nobody@athlon:~/cpushare> strace -o /tmp/o -f python seccomp_test.py
make: Nothing to be done for `seccomp_test'.
Starting computing some malicious bytecode
init
load
start
stop
receive_data failure
kill
exit_code 0 signal 9
The malicious bytecode has been killed successfully by seccomp
Starting computing some safe bytecode
init
load
start
stop
174 counts
kill
exit_code 0 signal 0
The seccomp_test.py completed successfully, thank you for testing.

(akpm: collaterally cleaned up a bit of do_syscall_trace() too)

Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:30 -07:00
Andrew Morton
1299232b57 [PATCH] x86: MP_processor_info fix
Remove the weird and apparently unnecessary logic in MP_processor_info() which
assumes that the BSP is the first one to run MP_processor_info().  On one of
my boxes that isn't true and cpu_possible_map gets the wrong value.

Cc: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Cc: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:56:43 -07:00
Karsten Wiese
0b968d2361 [PATCH] Fix misspelled i8259 typo in io_apic.c
The legacy PIC's name is "i8259".

Signed-off-by: Karsten Wiese <annabellesgarden@yahoo.de>
Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 10:37:10 -07:00
viro@ZenIV.linux.org.uk
fc0b1af257 [PATCH] __user annotations for pointers in i386 sigframe
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 10:31:59 -07:00
Sam Ravnborg
86feeaa812 kbuild: full dependency check on asm-offsets.h
Building asm-offsets.h has been moved to a seperate Kbuild file
located in the top-level directory. This allow us to share the
functionality across the architectures.

The old rules in architecture specific Makefiles will die
in subsequent patches.

Furhtermore the usual kbuild dependency tracking is now used
when deciding to rebuild asm-offsets.s. So we no longer risk
to fail a rebuild caused by asm-offsets.c dependencies being touched.

With this common rule-set we now force the same name across
all architectures. Following patches will fix the rest.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-09-09 19:28:28 +02:00
Linus Torvalds
529980c8b0 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq 2005-09-08 17:24:34 -07:00
Michael S. Tsirkin
346d38823b [PATCH] arch/386/pci: remap_pfn_range -> io_remap_pfn_range
Convert i386/pci to use io_remap_pfn_range instead of remap_pfn_range.
This is good for Xen which reuses i386/pci/i386.c for domain 0 code.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-08 14:57:24 -07:00
Len Brown
64e47488c9 Merge linux-2.6 with linux-acpi-2.6 2005-09-08 01:45:47 -04:00
viro@ZenIV.linux.org.uk
a08b6b7968 [PATCH] Kconfig fix (BLK_DEV_FD dependencies)
Sanitized and fixed floppy dependencies: split the messy dependencies for
BLK_DEV_FD by introducing a new symbol (ARCH_MAY_HAVE_PC_FDC), making
BLK_DEV_FD depend on that one and taking declarations of ARCH_MAY_HAVE_PC_FDC
to arch/*/Kconfig.  While we are at it, fixed several obvious cases when
BLK_DEV_FD should have been excluded (architectures lacking asm/floppy.h
are *not* going to have floppy.c compile, let alone work).

If you can come up with better name for that ("this architecture might
have working PC-compatible floppy disk controller"), you are more than
welcome - just s/ARCH_MAY_HAVE_PC_FDC/your_prefered_name/g in the patch
below...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 17:17:12 -07:00
Keshavamurthy Anil S
deac66ae45 [PATCH] kprobes: fix bug when probed on task and isr functions
This patch fixes a race condition where in system used to hang or sometime
crash within minutes when kprobes are inserted on ISR routine and a task
routine.

The fix has been stress tested on i386, ia64, pp64 and on x86_64.  To
reproduce the problem insert kprobes on schedule() and do_IRQ() functions
and you should see hang or system crash.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:58:01 -07:00
Jim Keniston
bce0649417 [PATCH] kprobes: fix handling of simultaneous probe hit/unregister
This patch fixes a bug in kprobes's handling of a corner case on i386 and
x86_64.  On an SMP system, if one CPU unregisters a kprobe just after
another CPU hits that probepoint, kprobe_handler() on the latter CPU sees
that the kprobe has been unregistered, and attempts to let the CPU continue
as if the probepoint hadn't been hit.  The bug is that on i386 and x86_64,
we were neglecting to set the IP back to the beginning of the probed
instruction.  This could cause an oops or crash.

This bug doesn't exist on ppc64 and ia64, where a breakpoint instruction
leaves the IP pointing to the beginning of the instruction.  I don't know
about sparc64.  (Dave, could you please advise?)

This fix has been tested on i386 and x86_64 SMP systems.  To reproduce the
problem, set one CPU to work registering and unregistering a kprobe
repeatedly, and another CPU pounding the probepoint in a tight loop.

Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:58:01 -07:00
Prasanna S Panchamukhi
3d97ae5b95 [PATCH] kprobes: prevent possible race conditions i386 changes
This patch contains the i386 architecture specific changes to prevent the
possible race conditions.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:59 -07:00
Robert Love
640e803376 [PATCH] fix: dmi_check_system
Background:

	1) dmi_check_system() returns the count of the number of
	   matches.  Zero thus means no matches.
	2) A match callback can return nonzero to stop the match
	   checking.

Bug: The count is incremented after we check for the nonzero return value,
so it does not reflect the actual count.  We could say this is intended,
for some dumb reason, except that it means that a match on the first check
returns zero--no matches--if the callback returns nonzero.

Attached patch implements the count before calling the callback and thus
before potentially short-circuiting.

Signed-off-by: Robert Love <rml@novell.com>
Cc: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:44 -07:00
Andrey Panin
ebad6a4230 [PATCH] dmi: add onboard devices discovery
This patch adds onboard devices and IPMI BMC discovery into DMI scan code.
Drivers can use dmi_find_device() function to search for devices by type and
name.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:44 -07:00
Andrey Panin
c3c7120d55 [PATCH] dmi: make dmi_string() behave like strdup()
This patch changes dmi_string() function to allocate string copy by itself, to
avoid code duplication in the next patch.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:44 -07:00
Andrey Panin
4e70b9a3d6 [PATCH] dmi: remove old debugging code
DMI debugging code is unused for ages.  This patch removes it.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:44 -07:00
Andrey Panin
61e032fa2f [PATCH] dmi: remove uneeded function
After elimination of central DMI blacklist dmi_scan_machine() function became
a wrapper for dmi_iterate().  This patch moves some code around to kill
unneeded function.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:44 -07:00
john stultz
b149ee2233 [PATCH] NTP: ntp-helper functions
This patch cleans up a commonly repeated set of changes to the NTP state
variables by adding two helper inline functions:

ntp_clear(): Clears the ntp state variables

ntp_synced(): Returns 1 if the system is synced with a time server.

This was compile tested for alpha, arm, i386, x86-64, ppc64, s390, sparc,
sparc64.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:34 -07:00
Ravikiran G Thirumalai
6c231b7bab [PATCH] Additions to .data.read_mostly section
Mark variables which are usually accessed for reads with __readmostly.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:33 -07:00
Adrian Bunk
7f4bde9a34 [PATCH] remove the second arg of do_timer_interrupt()
The second arg of do_timer_interrupt() is not used in the functions, and
all callers pass NULL.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Paul Mundt <lethal@Linux-SH.ORG>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:32 -07:00
David Gibson
96d0821cac [PATCH] Fix function/macro name collision on i386 oprofile
The i386 OProfile code has a function named nmi_exit(), which collides with
the nmi_exit() macro in linux/hardirq.h.  At the moment, we get away with
it, because hardirq.h isn't included in the oprofile code.  I hit this as a
bug when working with a patch which (indirectly) adds a #include of
hardirq.h to oprofile.

Regardless, the name collision is probably not a good idea, so this patch
fixes it, renaming the oprofile function to op_nmi_exit().  It also renames
the nmi_init() and nmi_timer_init() functions similarly, for consistency.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:29 -07:00
H. Peter Anvin
f8eeaaf418 [PATCH] Make the bzImage format self-terminating
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Frank Sorenson <frank@tuxrocks.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:29 -07:00
Paul E. McKenney
19306059cd [PATCH] NMI: Update NMI users of RCU to use new API
Uses of RCU for dynamically changeable NMI handlers need to use the new
rcu_dereference() and rcu_assign_pointer() facilities.  This change makes
it clear that these uses are safe from a memory-barrier viewpoint, but the
main purpose is to document exactly what operations are being protected by
RCU.  This has been tested on x86 and x86-64, which are the only
architectures affected by this change.

Signed-off-by: <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:19 -07:00
Christoph Lameter
c3d8c14145 [PATCH] More __read_mostly variables
Move some more frequently read variables that showed up during some of our
performance tests as sometimes ending up in hot cachelines to the
read_mostly section.

Fix: Move the __read_mostly from before hpet_usec_quotient to follow the
variable like the other uses of __read_mostly.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:18 -07:00
Ingo Molnar
8446f1d391 [PATCH] detect soft lockups
This patch adds a new kernel debug feature: CONFIG_DETECT_SOFTLOCKUP.

When enabled then per-CPU watchdog threads are started, which try to run
once per second.  If they get delayed for more than 10 seconds then a
callback from the timer interrupt detects this condition and prints out a
warning message and a stack dump (once per lockup incident).  The feature
is otherwise non-intrusive, it doesnt try to unlock the box in any way, it
only gets the debug info out, automatically, and on all CPUs affected by
the lockup.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-Off-By: Matthias Urlichs <smurf@smurf.noris.de>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:17 -07:00
Ashok Raj
a888cebe17 [PATCH] x86_64: create sysfs entries for cpu only for present cpus
Need to create sysfs only for cpus that are present.  Without which we see
NR_CPUS entries created when we have CONFIG_HOTPLUG and CONFIG_HOTPLUG_CPU
enabled.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:16 -07:00
Ashok Raj
54d5d42404 [PATCH] x86/x86_64: deferred handling of writes to /proc/irqxx/smp_affinity
When handling writes to /proc/irq, current code is re-programming rte
entries directly. This is not recommended and could potentially cause
chipset's to lockup, or cause missing interrupts.

CONFIG_IRQ_BALANCE does this correctly, where it re-programs only when the
interrupt is pending. The same needs to be done for /proc/irq handling as well.
Otherwise user space irq balancers are really not doing the right thing.

- Changed pending_irq_balance_cpumask to pending_irq_migrate_cpumask for
  lack of a generic name.
- added move_irq out of IRQ_BALANCE, and added this same to X86_64
- Added new proc handler for write, so we can do deferred write at irq
  handling time.
- Display of /proc/irq/XX/smp_affinity used to display CPU_MASKALL, instead
  it now shows only active cpu masks, or exactly what was set.
- Provided a common move_irq implementation, instead of duplicating
  when using generic irq framework.

Tested on i386/x86_64 and ia64 with CONFIG_PCI_MSI turned on and off.
Tested UP builds as well.

MSI testing: tbd: I have cards, need to look for a x-over cable, although I
did test an earlier version of this patch.  Will test in a couple days.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Zwane Mwaikambo <zwane@holomorphy.com>
Grudgingly-acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:15 -07:00
Paolo 'Blaisorblade' Giarrusso
640aa46e25 [PATCH] uml: SYSEMU: slight cleanup and speedup
As a follow-up to "UML Support - Ptrace: adds the host SYSEMU support, for
UML and general usage" (i.e.  uml-support-* in current mm).

Avoid unconditionally jumping to work_pending and code copying, just reuse
the already existing resume_userspace path.

One interesting note, from Charles P.  Wright, suggested that the API is
improvable with no downsides for UML (except that it will have to support
yet another host API, since dropping support for the current API, for UML,
is not reasonable from users' point of view).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Charles P. Wright <cwright@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Bodo Stroesser
ab1c23c244 [PATCH] SYSEMU: fix sysaudit / singlestep interaction
Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>

This is simply an adjustment for "Ptrace - i386: fix Syscall Audit interaction
with singlestep" to work on top of SYSEMU patches, too.  On this patch, I have
some doubts: I wonder why we need to alter that way ptrace_disable().

I left the patch this way because it has been extensively tested, but I don't
understand the reason.

The current PTRACE_DETACH handling simply clears child->ptrace; actually this
is not enough because entry.S just looks at the thread_flags; actually,
do_syscall_trace checks current->ptrace but I don't think depending on that is
good, at least for performance, so I think the clearing is done elsewhere.
For instance, on PTRACE_CONT it's done, but doing PTRACE_DETACH without
PTRACE_CONT is possible (and happens when gdb crashes and one kills it
manually).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Roland McGrath <roland@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Bodo Stroesser
1b38f0064e [PATCH] Uml support: add PTRACE_SYSEMU_SINGLESTEP option to i386
This patch implements the new ptrace option PTRACE_SYSEMU_SINGLESTEP, which
can be used by UML to singlestep a process: it will receive SINGLESTEP
interceptions for normal instructions and syscalls, but syscall execution will
be skipped just like with PTRACE_SYSEMU.

Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Bodo Stroesser
c8c86cecd1 [PATCH] Uml support: reorganize PTRACE_SYSEMU support
With this patch, we change the way we handle switching from PTRACE_SYSEMU to
PTRACE_{SINGLESTEP,SYSCALL}, to free TIF_SYSCALL_EMU from double use as a
preparation for PTRACE_SYSEMU_SINGLESTEP extension, without changing the
behavior of the host kernel.

Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Laurent Vivier
ed75e8d580 [PATCH] UML Support - Ptrace: adds the host SYSEMU support, for UML and general usage
Jeff Dike <jdike@addtoit.com>,
      Paolo 'Blaisorblade' Giarrusso <blaisorblade_spam@yahoo.it>,
      Bodo Stroesser <bstroesser@fujitsu-siemens.com>

Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
except that the kernel does not execute the requested syscall; this is useful
to improve performance for virtual environments, like UML, which want to run
the syscall on their own.

In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
and on exit, and each time you also have two context switches; with SYSEMU you
avoid the 2nd stop and so save two context switches per syscall.

Also, some architectures don't have support in the host for changing the
syscall number via ptrace(), which is currently needed to skip syscall
execution (UML turns any syscall into getpid() to avoid it being executed on
the host).  Fixing that is hard, while SYSEMU is easier to implement.

* This version of the patch includes some suggestions of Jeff Dike to avoid
  adding any instructions to the syscall fast path, plus some other little
  changes, by myself, to make it work even when the syscall is executed with
  SYSENTER (but I'm unsure about them). It has been widely tested for quite a
  lot of time.

* Various fixed were included to handle the various switches between
  various states, i.e. when for instance a syscall entry is traced with one of
  PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
  Basically, this is done by remembering which one of them was used even after
  the call to ptrace_notify().

* We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
  to make do_syscall_trace() notice that the current syscall was started with
  SYSEMU on entry, so that no notification ought to be done in the exit path;
  this is a bit of a hack, so this problem is solved in another way in next
  patches.

* Also, the effects of the patch:
"Ptrace - i386: fix Syscall Audit interaction with singlestep"
are cancelled; they are restored back in the last patch of this series.

Detailed descriptions of the patches doing this kind of processing follow (but
I've already summed everything up).

* Fix behaviour when changing interception kind #1.

  In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
  only after doing the debugger notification; but the debugger might have
  changed the status of this flag because he continued execution with
  PTRACE_SYSCALL, so this is wrong.  This patch fixes it by saving the flag
  status before calling ptrace_notify().

* Fix behaviour when changing interception kind #2:
  avoid intercepting syscall on return when using SYSCALL again.

  A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
  crashes.

  The problem is in arch/i386/kernel/entry.S.  The current SYSEMU patch
  inhibits the syscall-handler to be called, but does not prevent
  do_syscall_trace() to be called after this for syscall completion
  interception.

  The appended patch fixes this.  It reuses the flag TIF_SYSCALL_EMU to
  remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
  the flag is unused in the depicted situation.

* Fix behaviour when changing interception kind #3:
  avoid intercepting syscall on return when using SINGLESTEP.

  When testing 2.6.9 and the skas3.v6 patch, with my latest patch and had
  problems with singlestepping on UML in SKAS with SYSEMU.  It looped
  receiving SIGTRAPs without moving forward.  EIP of the traced process was
  the same for all SIGTRAPs.

What's missing is to handle switching from PTRACE_SYSCALL_EMU to
PTRACE_SINGLESTEP in a way very similar to what is done for the change from
PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.

I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
notified and then wake ups the process; the syscall is executed (or skipped,
when do_syscall_trace() returns 0, i.e.  when using PTRACE_SYSEMU), and
do_syscall_trace() is called again.  Since we are on the return path of a
SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
we must still avoid notifying the parent of the syscall exit.  Now, this
behaviour is extended even to resuming with PTRACE_SINGLESTEP.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:20 -07:00
Bodo Stroesser
94c80b2598 [PATCH] Ptrace/i386: fix "syscall audit" interaction with singlestep
Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>

Avoid giving two traps for singlestep instead of one, when syscall auditing is
enabled.

In fact no singlestep trap is sent on syscall entry, only on syscall exit, as
can be seen in entry.S:

# Note that in this mask _TIF_SINGLESTEP is not tested !!! <<<<<<<<<<<<<<
        testb $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),TI_flags(%ebp)
        jnz syscall_trace_entry
	...
syscall_trace_entry:
	...
	call do_syscall_trace

But auditing a SINGLESTEP'ed process causes do_syscall_trace to be called, so
the tracer will get one more trap on the syscall entry path, which it
shouldn't.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Roland McGrath <roland@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:19 -07:00
Shaohua Li
c3c433e4f3 [PATCH] add suspend/resume for timer
The timers lack .suspend/.resume methods.  Because of this, jiffies got a
big compensation after a S3 resume.  And then softlockup watchdog reports
an oops.  This occured with HPET enabled, but it's also possible for other
timers.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:18 -07:00
Pavel Machek
829ca9a30a [PATCH] swsusp: fix remaining u32 vs. pm_message_t confusion
Fix remaining bits of u32 vs.  pm_message confusion.  Should not break
anything.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:15 -07:00
Pierre Ossman
795312e763 [PATCH] ISA DMA suspend for i386
Reset the ISA DMA controller into a known state after a suspend.  Primary
concern was reenabling the cascading DMA channel (4).

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:14 -07:00
Benjamin LaHaise
52fdd08903 [PATCH] unify x86/x86-64 semaphore code
This patch moves the common code in x86 and x86-64's semaphore.c into a
single file in lib/semaphore-sleepers.c.  The arch specific asm stubs are
left in the arch tree (in semaphore.c for i386 and in the asm for x86-64).
There should be no changes in code/functionality with this patch.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:14 -07:00
Zwane Mwaikambo
4ad8d38342 [PATCH] i386 boottime for_each_cpu broken
for_each_cpu walks through all processors in cpu_possible_map, which is
defined as cpu_callout_map on i386 and isn't initialised until all
processors have been booted. This breaks things which do for_each_cpu
iterations early during boot. So, define cpu_possible_map as a bitmap with
NR_CPUS bits populated. This was triggered by a patch i'm working on which
does alloc_percpu before bringing up secondary processors.

From: Alexander Nyberg <alexn@telia.com>

i386-boottime-for_each_cpu-broken.patch
i386-boottime-for_each_cpu-broken-fix.patch

The SMP version of __alloc_percpu checks the cpu_possible_map before
allocating memory for a certain cpu.  With the above patches the BSP cpuid
is never set in cpu_possible_map which breaks CONFIG_SMP on uniprocessor
machines (as soon as someone tries to dereference something allocated via
__alloc_percpu, which in fact is never allocated since the cpu is not set
in cpu_possible_map).

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
Zachary Amsden
d7271b14b2 [PATCH] i386: encapsulate copying of pgd entries
Add a clone operation for pgd updates.

This helps complete the encapsulation of updates to page tables (or pages
about to become page tables) into accessor functions rather than using
memcpy() to duplicate them.  This is both generally good for consistency
and also necessary for running in a hypervisor which requires explicit
updates to page table entries.

The new function is:

clone_pgd_range(pgd_t *dst, pgd_t *src, int count);

   dst - pointer to pgd range anwhere on a pgd page
   src - ""
   count - the number of pgds to copy.

   dst and src can be on the same page, but the range must not overlap
   and must not cross a page boundary.

Note that I ommitted using this call to copy pgd entries into the
software suspend page root, since this is not technically a live paging
structure, rather it is used on resume from suspend.  CC'ing Pavel in case
he has any feedback on this.

Thanks to Chris Wright for noticing that this could be more optimal in
PAE compiles by eliminating the memset.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
George Anzinger
748f2edb52 [PATCH] x86 NMI: better support for debuggers
This patch adds a notify to the die_nmi notify that the system is about to
be taken down.  If the notify is handled with a NOTIFY_STOP return, the
system is given a new lease on life.

We also change the nmi watchdog to carry on if die_nmi returns.

This give debug code a chance to a) catch watchdog timeouts and b) possibly
allow the system to continue, realizing that the time out may be due to
debugger activities such as single stepping which is usually done with
"other" cpus held.

Signed-off-by: George Anzinger<george@mvista.com>
Cc: Keith Owens <kaos@ocs.com.au>
Signed-off-by: George Anzinger <george@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
Zachary Amsden
f2f30ebca6 [PATCH] x86: introduce a write acessor for updating the current LDT
Introduce a write acessor for updating the current LDT.  This is required
for hypervisors like Xen that do not allow LDT pages to be directly
written.

Testing - here's a fun little LDT test that can be trivially modified to
test limits as well.

/*
 * Copyright (c) 2005, Zachary Amsden (zach@vmware.com)
 * This is licensed under the GPL.
 */

#include <stdio.h>
#include <signal.h>
#include <asm/ldt.h>
#include <asm/segment.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/mman.h>
#define __KERNEL__
#include <asm/page.h>

void main(void)
{
        struct user_desc desc;
        char *code;
        unsigned long long tsc;

        code = (char *)mmap(0, 8192, PROT_EXEC|PROT_READ|PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        desc.entry_number = 0;
        desc.base_addr = code;
        desc.limit = 1;
        desc.seg_32bit = 1;
        desc.contents = MODIFY_LDT_CONTENTS_CODE;
        desc.read_exec_only = 0;
        desc.limit_in_pages = 1;
        desc.seg_not_present = 0;
        desc.useable = 1;
        if (modify_ldt(1, &desc, sizeof(desc)) != 0) {
                perror("modify_ldt");
        }
        printf("code base is 0x%08x\n", (unsigned)code);
        code[0x0ffe] = 0x0f;  /* rdtsc */
        code[0x0fff] = 0x31;
        code[0x1000] = 0xcb;  /* lret */
        __asm__ __volatile("lcall $7,$0xffe" : "=A" (tsc));
        printf("TSC is 0x%016llx\n", tsc);
}

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
Zachary Amsden
e9f86e351f [PATCH] x86: remove redundant TSS clearing
When reviewing GDT updates, I found the code:

	set_tss_desc(cpu,t);	/* This just modifies memory; ... */
        per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS].b &= 0xfffffdff;

This second line is unnecessary, since set_tss_desc() has already cleared
the busy bit.

Commented disassembly, line 1:

c028b8bd:       8b 0c 86                mov    (%esi,%eax,4),%ecx
c028b8c0:       01 cb                   add    %ecx,%ebx
c028b8c2:       8d 0c 39                lea    (%ecx,%edi,1),%ecx

  => %ecx = per_cpu(cpu_gdt_table, cpu)

c028b8c5:       8d 91 80 00 00 00       lea    0x80(%ecx),%edx

  => %edx = &per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS]

c028b8cb:       66 c7 42 00 73 20       movw   $0x2073,0x0(%edx)
c028b8d1:       66 89 5a 02             mov    %bx,0x2(%edx)
c028b8d5:       c1 cb 10                ror    $0x10,%ebx
c028b8d8:       88 5a 04                mov    %bl,0x4(%edx)
c028b8db:       c6 42 05 89             movb   $0x89,0x5(%edx)

  => ((char *)%edx)[5] = 0x89
  (equivalent) ((char *)per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS])[5] = 0x89

c028b8df:       c6 42 06 00             movb   $0x0,0x6(%edx)
c028b8e3:       88 7a 07                mov    %bh,0x7(%edx)
c028b8e6:       c1 cb 10                ror    $0x10,%ebx

  => other bits

Commented disassembly, line 2:

c028b8e9:       8b 14 86                mov    (%esi,%eax,4),%edx
c028b8ec:       8d 04 3a                lea    (%edx,%edi,1),%eax

  => %eax = per_cpu(cpu_gdt_table, cpu)

c028b8ef:       81 a0 84 00 00 00 ff    andl   $0xfffffdff,0x84(%eax)

  => per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS].b &= 0xfffffdff;
  (equivalent) ((char *)per_cpu(cpu_gdt_table, cpu)[GDT_ENTRY_TSS])[5] &= 0xfd

Note that (0x89 & ~0xfd) == 0; i.e, set_tss_desc(cpu,t) has already stored
the type field in the GDT with the busy bit clear.

Eliminating redundant and obscure code is always a good thing; in fact, I
pointed out this same optimization many moons ago in arch/i386/setup.c,
back when it used to be called that.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:13 -07:00
Zachary Amsden
a520112930 [PATCH] x86: make IOPL explicit
The pushf/popf in switch_to are ONLY used to switch IOPL.  Making this
explicit in C code is more clear.  This pushf/popf pair was added as a
bugfix for leaking IOPL to unprivileged processes when using
sysenter/sysexit based system calls (sysexit does not restore flags).

When requesting an IOPL change in sys_iopl(), it is just as easy to change
the current flags and the flags in the stack image (in case an IRET is
required), but there is no reason to force an IRET if we came in from the
SYSENTER path.

This change is the minimal solution for supporting a paravirtualized Linux
kernel that allows user processes to run with I/O privilege.  Other
solutions require radical rewrites of part of the low level fault / system
call handling code, or do not fully support sysenter based system calls.

Unfortunately, this added one field to the thread_struct.  But as a bonus,
on P4, the fastest time measured for switch_to() went from 312 to 260
cycles, a win of about 17% in the fast case through this performance
critical path.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Zachary Amsden
0998e4228a [PATCH] x86: privilege cleanup
Privilege checking cleanup.  Originally, these diffs were much greater, but
recent cleanups in Linux have already done much of the cleanup.  I added
some explanatory comments in places where the reasoning behind certain
tests is rather subtle.

Also, in traps.c, we can skip the user_mode check in handle_BUG().  The
reason is, there are only two call chains - one via die_if_kernel() and one
via do_page_fault(), both entering from die().  Both of these paths already
ensure that a kernel mode failure has happened.  Also, the original check
here, if (user_mode(regs)) was insufficient anyways, since it would not
rule out BUG faults from V8086 mode execution.

Saving the %ss segment in show_regs() rather than assuming a fixed value
also gives better information about the current kernel state in the
register dump.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Zachary Amsden
f2ab446124 [PATCH] x86: more asm cleanups
Some more assembler cleanups I noticed along the way.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Zachary Amsden
c9b02a2413 [PATCH] i386: use set_pte macros in a couple places where they were missing
Also, setting PDPEs in PAE mode does not require atomic operations, since the
PDPEs are cached by the processor, and only reloaded on an explicit or
implicit reload of CR3.

Since the four PDPEs must always be present in an active root, and the kernel
PDPE is never updated, we are safe even from SMIs and interrupts / NMIs using
task gates (which reload CR3).  Actually, much of this is moot, since the user
PDPEs are never updated either, and the only usage of task gates is by the
doublefault handler.  It appears the only place PGDs get updated in PAE mode
is in init_low_mappings() / zap_low_mapping() for initial page table creation
and recovery from ACPI sleep state, and these sites are safe by inspection.
Getting rid of the cmpxchg8b saves code space and 720 cycles in pgd_alloc on
P4.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:12 -07:00
Zachary Amsden
e7a2ff593c [PATCH] i386: load_tls() fix
Subtle fix: load_TLS has been moved after saving %fs and %gs segments to avoid
creating non-reversible segments.  This could conceivably cause a bug if the
kernel ever needed to save and restore fs/gs from the NMI handler.  It
currently does not, but this is the safest approach to avoiding fs/gs
corruption.  SMIs are safe, since SMI saves the descriptor hidden state.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:11 -07:00