dect
/
linux-2.6
Archived
13
0
Fork 0
Commit Graph

245171 Commits

Author SHA1 Message Date
Linus Torvalds ddb503b429 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
  alpha: Wire up syscalls new to 2.6.39
  alpha: convert to clocksource_register_hz
2011-05-13 17:29:03 -07:00
Michael Cree 90b57f3516 alpha: Wire up syscalls new to 2.6.39
Wire up the syscalls:
   name_to_handle_at
   open_by_handle_at
   clock_adjtime
   syncfs
and adjust some whitespace in the neighbourhood to align commments.

Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-05-13 19:16:11 -04:00
John Stultz f550806a7f alpha: convert to clocksource_register_hz
Converts alpha to use clocksource_register_hz.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-05-13 19:16:10 -04:00
Linus Torvalds 298eaaad0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  bridge: fix forwarding of IPv6
  bonding,llc: Fix structure sizeof incompatibility for some PDUs
  ipv6: restore correct ECN handling on TCP xmit
  ne-h8300: Fix regression caused during net_device_ops conversion
  hydra: Fix regression caused during net_device_ops conversion
  zorro8390: Fix regression caused during net_device_ops conversion
  sfc: Always map MCDI shared memory as uncacheable
  ehea: Fix memory hotplug oops
  libertas: fix cmdpendingq locking
  iwlegacy: fix IBSS mode crashes
  ath9k: Fix a warning due to a queued work during S3 state
  mac80211: don't start the dynamic ps timer if not associated
2011-05-13 15:20:51 -07:00
Linus Torvalds cf70cc5b9d Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFSv4.1: Ensure that layoutget uses the correct gfp modes
  NFSv4.1: remove pnfs_layout_hdr from pnfs_destroy_all_layouts tmp_list
  NFSv41: Resend on NFS4ERR_RETRY_UNCACHED_REP
2011-05-13 15:19:39 -07:00
Yehuda Sadeh 1fec70932d rbd: fix split bio handling
The rbd driver currently splits bios when they span an object boundary.
However, the blk_end_request expects the completions to roll up the results
in block device order, and the split rbd/ceph ops can complete in any
order.  This patch adds a struct rbd_req_coll to track completion of split
requests and ensures that the results are passed back up to the block layer
in order.

This fixes errors where the file system gets completion of a read operation
that spans an object boundary before the data has actually arrived.  The
bug is easily reproduced with iozone with a working set larger than
available RAM.

Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-13 13:52:57 -07:00
Stephen Hemminger cb68552858 bridge: fix forwarding of IPv6
The commit 6b1e960fdb
    bridge: Reset IPCB when entering IP stack on NF_FORWARD
broke forwarding of IPV6 packets in bridge because it would
call bp_parse_ip_options with an IPV6 packet.

Reported-by: Noah Meyerhans <noahm@debian.org>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:03:24 -04:00
Andy Lutomirski 087fbc9962 drm/i915: Revert i915.semaphore=1 default from i915 merge
My Q67 / i7-2600 box has rev09 Sandy Bridge graphics.  It hangs
instantly when GNOME loads and it hangs so hard the reset button
doesn't work.  Setting i915.semaphore=0 fixes it.

Semaphores were disabled in a1656b9090 ("drm/i915: Disable GPU
semaphores by default") in 2.6.38 but were then re-enabled (by mistake?)
by the merge 47ae63e0c2 ("Merge branch 'drm-intel-fixes' into
drm-intel-next").

(It's worth noting that the offending change is i915_drv.c, which was
not marked as a conflict - although a 'git show --cc' on the merge does
show that neither parent had it set to 1)

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13 12:22:51 -07:00
Vitalii Demianets a10e146676 bonding,llc: Fix structure sizeof incompatibility for some PDUs
With some combinations of arch/compiler (e.g. arm-linux-gcc) the sizeof
operator on structure returns value greater than expected. In cases when the
structure is used for mapping PDU fields it may lead to unexpected results
(such as holes and alignment problems in skb data). __packed prevents this
undesired behavior.

Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 15:13:24 -04:00
Linus Torvalds 26cf46be95 vfs: micro-optimize acl_permission_check()
It's a hot function, and we're better off not mixing types in the mask
calculations.  The compiler just ends up mixing 16-bit and 32-bit
operations, for no good reason.

So do everything in 'unsigned int' rather than mixing 'unsigned int'
masking with a 'umode_t' (16-bit) mode variable.

This, together with the parent commit (47a150edc2ae: "Cache user_ns in
struct cred") makes acl_permission_check() much nicer.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13 11:51:01 -07:00
Serge E. Hallyn 47a150edc2 Cache user_ns in struct cred
If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns).

Get rid of _current_user_ns.  This requires nsown_capable() to be
defined in capability.c rather than as static inline in capability.h,
so do that.

Request_key needs init_user_ns defined at current_user_ns if
!CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS
at current_user_ns() define.

Compile-tested with and without CONFIG_USERNS.

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
[ This makes a huge performance difference for acl_permission_check(),
  up to 30%.  And that is one of the hottest kernel functions for loads
  that are pathname-lookup heavy.  ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13 11:45:33 -07:00
Julia Lawall d9a5ac9ef3 x86, mce, AMD: Fix leaving freed data in a list
b may be added to a list, but is not removed before being freed
in the case of an error.  This is done in the corresponding
deallocation function, so the code here has been changed to
follow that.

The sematic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression E,E1,E2;
identifier l;
@@

*list_add(&E->l,E1);
... when != E1
    when != list_del(&E->l)
    when != list_del_init(&E->l)
    when != E = E2
*kfree(E);// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1305294731-12127-1-git-send-email-julia@diku.dk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-13 17:11:02 +02:00
Avinash H.M 5fd2a84ab3 OMAP3: set the core dpll clk rate in its set_rate function
The debug l3_ick/rate is not displaying the actual rate of the clock in
hardware. This is because, the core dpll set_rate function doesn't update the
clk.rate. After fixing, the l3_ick/rate is displaying proper values.

Signed-off-by: Shweta Gulati <shweta.gulati@ti.com>
Signed-off-by: Avinash.H.M <avinashhm@ti.com>
Cc: Rajendra Nayak <rnayak@ti.com>
Cc: Paul Wamsley <paul@pwsan.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2011-05-13 07:08:18 -07:00
Alex Deucher 3a8ab79eae drm/radeon/kms: add some evergreen/ni safe regs
need to programmed from the userspace drivers.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-13 16:16:36 +10:00
Alex Deucher 05fa7ea7d2 drm/radeon/kms: fix extended lvds info parsing
On rev <= 1.1 tables, the offset is absolute,
on newer tables, it's relative.

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=700326

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-13 16:16:33 +10:00
Alex Deucher d9282fca8a drm/radeon/kms: fix tiling reg on fusion
The location of MC_ARB_RAMCFG changed on fusion.
I've diffed all the other regs in evergreend.h and this
is the only other reg that changed.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-13 16:15:39 +10:00
Sage Weil 11f770027b rbd: fix leak of ops struct
The ops vector must be freed by the rbd_do_request caller.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-12 20:59:14 -07:00
Linus Torvalds 381e7863d9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  SELinux: delete debugging printks from filename_trans rule processing
2011-05-12 18:16:13 -07:00
Linus Torvalds 446cc6345d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  net/9p/protocol.c: Fix a memory leak
2011-05-12 18:00:09 -07:00
Linus Torvalds 9bbd055808 Merge branch 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux
* 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux:
  i2c: pnx: Fix crash due to wrong init of timer->data
2011-05-12 16:55:33 -07:00
James Morris ca7d120008 Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux into for-linus 2011-05-13 09:52:16 +10:00
Wolfram Sang 9ddabb055d i2c: pnx: Fix crash due to wrong init of timer->data
alg_data is already a pointer which must be passed directly.

Reported-by: Dieter Ripp <ripp@systecnet.com>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ben Dooks <ben-i2c@fluff.org>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2011-05-13 00:10:36 +01:00
Steinar H. Gunderson ca06707022 ipv6: restore correct ECN handling on TCP xmit
Since commit e9df2e8fd8 (Use appropriate sock tclass setting for
routing lookup) we lost ability to properly add ECN codemarks to ipv6
TCP frames.

It seems like TCP_ECN_send() calls INET_ECN_xmit(), which only sets the
ECN bit in the IPv4 ToS field (inet_sk(sk)->tos), but after the patch,
what's checked is inet6_sk(sk)->tclass, which is a completely different
field.

Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34322

[Eric Dumazet] : added the INET_ECN_dontxmit() fix and replace macros
by inline functions for clarity.

Signed-off-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12 18:52:14 -04:00
Ingo Molnar 411f05f123 vsprintf: Turn kptr_restrict off by default
kptr_restrict has been triggering bugs in apps such as perf, and it also makes
the system less useful by default, so turn it off by default.

This is how we generally handle security features that remove functionality,
such as firewall code or SELinux - they have to be configured and activated
from user-space.

Distributions can turn kptr_restrict on again via this line in
/etc/sysctrl.conf:

kernel.kptr_restrict = 1

( Also mark the variable __read_mostly while at it, as it's typically modified
  only once per bootup, or not at all. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12 15:18:16 -07:00
Pedro Scarapicchia Junior 1b0bcbcf62 net/9p/protocol.c: Fix a memory leak
When p9pdu_readf() is called with "s" attribute, it allocates a pointer that
will store a string. In p9dirent_read(), this pointer is not being released,
leading to out of memory errors.
This patch releases this pointer after string is copyed to dirent->d_name.

Signed-off-by: Pedro Scarapicchia Junior <pedro.scarapiccha@br.flextronics.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-05-12 17:05:37 -05:00
Cliff Wickman 77ed23f8d9 x86: Fix UV BAU for non-consecutive nasids
This is a fix for the SGI Altix-UV Broadcast Assist Unit code,
which is used for TLB flushing.

Certain hardware configurations (that customers are ordering)
cause nasids (numa address space id's) to be non-consecutive.
Specifically, once you have more than 4 blades in a IRU
(Individual Rack Unit - or 1/2 rack) but less than the maximum
of 16, the nasid numbering becomes non-consecutive.  This
currently results in a 'catastrophic error' (CATERR) detected by
the firmware during OS boot.  The BAU is generating an 'INTD'
request that is targeting a non-existent nasid value. Such
configurations may also occur when a blade is configured off
because of hardware errors. (There is one UV hub per blade.)

This patch is required to support such configurations.

The problem with the tlb_uv.c code is that is using the
consecutive hub numbers as indices to the BAU distribution bit
map. These are simply the ordinal position of the hub or blade
within its partition.  It should be using physical node numbers
(pnodes), which correspond to the physical nasid values. Use of
the hub number only works as long as the nasids in the partition
are consecutive and increase with a stride of 1.

This patch changes the index to be the pnode number, thus
allowing nasids to be non-consecutive.
It also provides a table in local memory for each cpu to
translate target cpu number to target pnode and nasid.
And it improves naming to properly reflect 'node' and 'uvhub'
versus 'nasid'.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/E1QJmxX-0002Mz-Fk@eag09.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-12 23:45:42 +02:00
Geert Uytterhoeven 2592a73540 ne-h8300: Fix regression caused during net_device_ops conversion
Changeset dcd39c9029 ("ne-h8300: convert to
net_device_ops") broke ne-h8300 by adding 8390.o to the link. That
meant that lib8390.c was included twice, once in ne-h8300.c and once in
8390.c, subject to different macros. This patch reverts that by
avoiding the wrappers in 8390.c.

Fix based on commits 217cbfa856 ("mac8390:
fix regression caused during net_device_ops conversion") and
4e0168fa48 ("mac8390: fix build with
NET_POLL_CONTROLLER").

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12 16:59:57 -04:00
Geert Uytterhoeven 0b25e0157d hydra: Fix regression caused during net_device_ops conversion
Changeset 5618f0d119 ("hydra: convert to
net_device_ops") broke hydra by adding 8390.o to the link. That
meant that lib8390.c was included twice, once in hydra.c and once in
8390.c, subject to different macros. This patch reverts that by
avoiding the wrappers in 8390.c.

Fix based on commits 217cbfa856 ("mac8390:
fix regression caused during net_device_ops conversion") and
4e0168fa48 ("mac8390: fix build with
NET_POLL_CONTROLLER").

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12 16:59:57 -04:00
Geert Uytterhoeven cf7e032fc8 zorro8390: Fix regression caused during net_device_ops conversion
Changeset b6114794a1 ("zorro8390: convert to
net_device_ops") broke zorro8390 by adding 8390.o to the link. That
meant that lib8390.c was included twice, once in zorro8390.c and once in
8390.c, subject to different macros. This patch reverts that by
avoiding the wrappers in 8390.c.

Fix based on commits 217cbfa856 ("mac8390:
fix regression caused during net_device_ops conversion") and
4e0168fa48 ("mac8390: fix build with
NET_POLL_CONTROLLER").

Reported-by: Christian T. Steigies <cts@debian.org>
Suggested-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Christian T. Steigies <cts@debian.org>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12 16:59:57 -04:00
David S. Miller d44cf14ddf Merge branch 'sfc-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6 2011-05-12 16:26:45 -04:00
Eric Paris 93826c092c SELinux: delete debugging printks from filename_trans rule processing
The filename_trans rule processing has some printk(KERN_ERR ) messages
which were intended as debug aids in creating the code but weren't removed
before it was submitted.  Remove them.

Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Eric Paris <eparis@redhat.com>
2011-05-12 16:02:42 -04:00
Linus Torvalds ca1376d108 Merge branch 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ASoC: WM8903: Fix Digital Capture Volume range
  ASoC: UDA134x: Remove POWER_OFF_ON_STANDBY define.
  ASoC: SSM2602: Fix reg_cache_size
  ASoC: SSM2602: Fix 'Mic Boost2' control
  ASoC: SSM2602: Properly annotate i2c probe and remove functions
  ASoC: sst_platform: add hw_free callback to fix resource leak
  ASoC: Don't crash on PM operations
  ASoC: JZ4740: Fix i2s shutdown
2011-05-12 12:41:30 -07:00
Linus Torvalds 0c5e1577f1 Merge branch 'stable/bug-fixes-for-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug-fixes-for-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  x86/mm: Fix section mismatch derived from native_pagetable_reserve()
  x86,xen: introduce x86_init.mapping.pagetable_reserve
  Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""
2011-05-12 12:21:51 -07:00
Linus Torvalds 982b2035d9 Revert "drm/i915: Only enable the plane after setting the fb base (pre-ILK)"
This reverts commit 49183b2818.

Quoth Franz Melchior:

  "This patch introduces a bug on my infamous "Acer Travelmate
   5735Z-452G32Mnss": when KMS takes over, the frame buffer contents get
   completely garbled up on screen, with colored stripes and unreadable
   text (photo on request).  Only when X11 is started, the screen gets
   restored again.  Closing and re-opening the lid partly cures the
   mess, too: it makes the font readable, though horizontally stretched."

Acked-by: Keith Packard <keithp@keithp.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12 12:19:43 -07:00
Linus Torvalds df43938bc5 Merge branch 'fbmem'
* fbmem:
  fbmem: make read/write/ioctl use the frame buffer at open time
  fbcon: add lifetime refcount to opened frame buffers
2011-05-12 10:42:36 -07:00
Linus Torvalds 49f019c188 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ads7846 - remove unused variable from struct ads7845_ser_req
  Input: ads7846 - make transfer buffers DMA safe
2011-05-12 10:41:31 -07:00
Sedat Dilek 53f8023feb x86/mm: Fix section mismatch derived from native_pagetable_reserve()
With CONFIG_DEBUG_SECTION_MISMATCH=y I see these warnings in next-20110415:

  LD      vmlinux.o
  MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x1ba48): Section mismatch in reference from the function native_pagetable_reserve() to the function .init.text:memblock_x86_reserve_range()
The function native_pagetable_reserve() references
the function __init memblock_x86_reserve_range().
This is often because native_pagetable_reserve lacks a __init
annotation or the annotation of memblock_x86_reserve_range is wrong.

This patch fixes the issue.
Thanks to pipacs from PaX project for help on IRC.

Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 13:05:05 -04:00
Stefano Stabellini 279b706bf8 x86,xen: introduce x86_init.mapping.pagetable_reserve
Introduce a new x86_init hook called pagetable_reserve that at the end
of init_memory_mapping is used to reserve a range of memory addresses for
the kernel pagetable pages we used and free the other ones.

On native it just calls memblock_x86_reserve_range while on xen it also
takes care of setting the spare memory previously allocated
for kernel pagetable pages from RO to RW, so that it can be used for
other purposes.

A detailed explanation of the reason why this hook is needed follows.

As a consequence of the commit:

commit 4b239f458c
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Fri Dec 17 16:58:28 2010 -0800

    x86-64, mm: Put early page table high

at some point init_memory_mapping is going to reach the pagetable pages
area and map those pages too (mapping them as normal memory that falls
in the range of addresses passed to init_memory_mapping as argument).
Some of those pages are already pagetable pages (they are in the range
pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and
everything is fine.
Some of these pages are not pagetable pages yet (they fall in the range
pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
are going to be mapped RW.  When these pages become pagetable pages and
are hooked into the pagetable, xen will find that the guest has already
a RW mapping of them somewhere and fail the operation.
The reason Xen requires pagetables to be RO is that the hypervisor needs
to verify that the pagetables are valid before using them. The validation
operations are called "pinning" (more details in arch/x86/xen/mmu.c).

In order to fix the issue we mark all the pages in the entire range
pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation
is completed only the range pgt_buf_start-pgt_buf_end is reserved by
init_memory_mapping. Hence the kernel is going to crash as soon as one
of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those
ranges are RO).

For this reason we need a hook to reserve the kernel pagetable pages we
used and free the other ones so that they can be reused for other
purposes.
On native it just means calling memblock_x86_reserve_range, on Xen it
also means marking RW the pagetable pages that we allocated before but
that haven't been used before.

Another way to fix this is without using the hook is by adding a 'if
(xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen
counterpart, but that is just nasty.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 13:05:04 -04:00
Konrad Rzeszutek Wilk 92bdaef7b2 Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""
This reverts commit a38647837a.

It does not work with certain AMD machines.

last_pfn = 0x100000 max_arch_pfn = 0x400000000
initial memory mapped : 0 - 02c3a000
Base memory trampoline at [ffff88000009b000] 9b000 size 20480
init_memory_mapping: 0000000000000000-0000000100000000
 0000000000 - 0100000000 page 4k
kernel direct mapping tables up to 100000000 @ ff7fb000-100000000
init_memory_mapping: 0000000100000000-00000001e0800000
 0100000000 - 01e0800000 page 4k
kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000
xen: setting RW the range fffdc000 - 100000000
RAMDISK: 0203b000 - 02c3a000
No NUMA configuration found
Faking a node at 0000000000000000-00000001e0800000
NUMA: Using 63 for the hash shift.
Initmem setup node 0 0000000000000000-00000001e0800000
  NODE_DATA [00000001dfffb000 - 00000001dfffffff]
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
PGD 0
Oops: 0003 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:

Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1
RIP: e030:[<ffffffff81cf6a75>]  [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
RSP: e02b:ffffffff81c01e38  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040
RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000
RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400
FS:  0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020)
Stack:
 0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff
 ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000
 ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45
Call Trace:
 [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181
 [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
 [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c
 [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c
 [<ffffffff81cf6f67>] numa_init+0x6c/0x70
 [<ffffffff81cf7057>] initmem_init+0x39/0x3b
 [<ffffffff81ce5865>] setup_arch+0x64e/0x769
 [<ffffffff815e43c1>] ? printk+0x51/0x53
 [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3
 [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136
 [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f
Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89
RIP  [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea
 RSP <ffffffff81c01e38>
CR2: 0000000000000000
---[ end trace a7919e7f17c0a725 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
Pid: 0, comm: swapper Tainted: G      D     2.6.39-0-virtual #6~smb1

Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 13:04:29 -04:00
Linus Torvalds 6eaed0a438 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix oops in revalidate when called with NULL nameidata
2011-05-12 08:06:53 -07:00
Linus Torvalds 8043f4eb85 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic
  sparc32: fix sparcstation 5 boot
  sparc32: fix section mismatch warnings in apc, pmc and time_32
2011-05-12 07:53:34 -07:00
Linus Torvalds 75c0b3b466 Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses
  ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls
  ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM
  ARM: zImage: the page table memory must be considered before relocation
  ARM: zImage: make sure not to relocate on top of the relocation code
  ARM: zImage: Fix bad SP address after relocating kernel
  ARM: zImage: make sure the stack is 64-bit aligned
  ARM: RiscPC: acornfb: fix section mismatches
  ARM: RiscPC: etherh: fix section mismatches
2011-05-12 07:53:06 -07:00
Linus Torvalds c47747fde9 fbmem: make read/write/ioctl use the frame buffer at open time
read/write/ioctl on a fbcon file descriptor has traditionally used the
fbcon not when it was opened, but as it was at the time of the call.
That makes no sense, but the lack of sense is much more obvious now that
we properly ref-count the usage - it means that the ref-counting doesn't
actually protect operations we do on the frame buffer.

This changes it to look at the fb_info that we got at open time, but in
order to avoid using a frame buffer long after it has been unregistered,
we do verify that it is still current, and return -ENODEV if not.

Acked-by: Tim Gardner <tim.gardner@canonical.com>
Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
Tested-by: Anca Emanuel <anca.emanuel@gmail.com>
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12 07:46:43 -07:00
Linus Torvalds 698b368275 fbcon: add lifetime refcount to opened frame buffers
This just adds the refcount and the new registration lock logic.  It
does not (for example) actually change the read/write/ioctl routines to
actually use the frame buffer that was opened: those function still end
up alway susing whatever the current frame buffer is at the time of the
call.

Without this, if something holds the frame buffer open over a
framebuffer switch, the close() operation after the switch will access a
fb_info that has been free'd by the unregistering of the old frame
buffer.

(The read/write/ioctl operations will normally not cause problems,
because they will - illogically - pick up the new fbcon instead.  But a
switch that happens just as one of those is going on might see problems
too, the window is just much smaller: one individual op rather than the
whole open-close sequence.)

This use-after-free is apparently fairly easily triggered by the Ubuntu
11.04 boot sequence.

Acked-by: Tim Gardner <tim.gardner@canonical.com>
Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
Tested-by: Anca Emanuel <anca.emanuel@gmail.com>
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12 07:37:51 -07:00
Ben Hutchings 747df2258b sfc: Always map MCDI shared memory as uncacheable
We enabled write-combining for memory-mapped registers in commit
65f0b417de, but inhibited it for the
MCDI shared memory where this is not supported.  However,
write-combining mappings also allow read-reordering, which may also
be a problem.

I found that when an SFC9000-family controller is connected to an
Intel 3000 chipset, and write-combining is enabled, the controller
stops responding to PCIe read requests during driver initialisation
while the driver is polling for completion of an MCDI command.  This
results in an NMI and system hang.  Adding read memory barriers
between all reads to the shared memory area appears to reduce but not
eliminate the probability of this.

We have not yet established whether this is a bug in our BIU or in the
PCIe bridge.  For now, work around by mapping the shared memory area
separately.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-05-12 15:16:32 +01:00
Catalin Marinas a904f5f9eb ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses
Since mandatory barriers may be used (explicitly or implicitly via readl
etc.) to ensure the ordering between Device and Normal memory accesses,
a DMB is not enough. This patch converts it to a DSB.

Cc: Colin Cross <ccross@android.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-12 10:52:00 +01:00
Arnd Bergmann 2af68df02f ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls
GDB's interrupt.exp test cases currenly fail on ARM.  The problem is how do_signal
handled restarting interrupted system calls:

The entry.S assembler code determines that we come from a system call; and that
information is passed as "syscall" parameter to do_signal.  That routine then
calls get_signal_to_deliver [*] and if a signal is to be delivered, calls into
handle_signal.  If a system call is to be restarted either after the signal
handler returns, or if no handler is to be called in the first place, the PC
is updated after the get_signal_to_deliver call, either in handle_signal (if
we have a handler) or at the end of do_signal (otherwise).

Now the problem is that during [*], the call to get_signal_to_deliver, a ptrace
intercept may happen.  During this intercept, the debugger may change registers,
including the PC.  This is done by GDB if it wants to execute an "inferior call",
i.e. the execution of some code in the debugged program triggered by GDB.

To this purpose, GDB will save all registers, allocate a stack frame, set up
PC and arguments as appropriate for the call, and point the link register to
a dummy breakpoint instruction.  Once the process is restarted, it will execute
the call and then trap back to the debugger, at which point GDB will restore
all registers and continue original execution.

This generally works fine.  However, now consider what happens when GDB attempts
to do exactly that while the process was interrupted during execution of a to-be-
restarted system call:  do_signal is called with the syscall flag set; it calls
get_signal_to_deliver, at which point the debugger takes over and changes the PC
to point to a completely different place.  Now get_signal_to_deliver returns
without a signal to deliver; but now do_signal decides it should be restarting
a system call, and decrements the PC by 2 or 4 -- so it now points to 2 or 4
bytes before the function GDB wants to call -- which leads to a subsequent crash.

To fix this problem, two things need to be supported:
- do_signal must be able to recognize that get_signal_to_deliver changed the PC
  to a different location, and skip the restart-syscall sequence
- once the debugger has restored all registers at the end of the inferior call
  sequence, do_signal must recognize that *now* it needs to restart the pending
  system call, even though it was now entered from a breakpoint instead of an
  actual svc instruction

This set of issues is solved on other platforms, usually by one of two
mechanisms:

- The status information "do_signal is handling a system call that may need
  restarting" is itself carried in some register that can be accessed via
  ptrace.  This is e.g. on Intel the "orig_eax" register; on Sparc the kernel
  defines a magic extra bit in the flags register for this purpose.
  This allows GDB to manage that state: reset it when doing an inferior call,
  and restore it after the call is finished.

- On s390, do_signal transparently handles this problem without requiring
  GDB interaction, by performing system call restarting in the following
  way: first, adjust the PC as necessary for restarting the call.  Then,
  call get_signal_to_deliver; and finally just continue execution at the
  PC.  This way, if GDB does not change the PC, everything is as before.
  If GDB *does* change the PC, execution will simply continue there --
  and once GDB restores the PC it saved at that point, it will automatically
  point to the *restarted* system call.  (There is the minor twist how to
  handle system calls that do *not* need restarting -- do_signal will undo
  the PC change in this case, after get_signal_to_deliver has returned, and
  only if ptrace did not change the PC during that call.)

Because there does not appear to be any obvious register to carry the
syscall-restart information on ARM, we'd either have to introduce a new
artificial ptrace register just for that purpose, or else handle the issue
transparently like on s390.  The patch below implements the second option;
using this patch makes the interrupt.exp test cases pass on ARM, with no
regression in the GDB test suite otherwise.

Cc: patches@linaro.org
Signed-off-by: Ulrich Weigand <ulrich.weigand@linaro.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-12 10:52:00 +01:00
Will Deacon 9af386c8dc ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM
The SPARSEMEM code allocates memmap entries only for sections which are
present (i.e. those which contain some valid memory). The membank checks
in free_unused_memmap do not take this into account and can incorrectly
attempt to free memory which is not allocated, resulting in a BUG() in
the bootmem code.

However, if memory is configured as follows:

    |<----section---->|<----hole---->|<----section---->|
    +--------+--------+--------------+--------+--------+
    | bank 0 | unused |              | bank 1 | unused |
    +--------+--------+--------------+--------+--------+

where a bank only occupies part of a section, the memmap allocated for
the remainder of the section *can* be freed.

This patch modifies the checks in free_unused_memmap so that only valid
memmap entries are considered for removal.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-12 10:52:00 +01:00
Thomas Gleixner 557d97d574 Merge branch 'fortglx/39/tip/timers/rtc' of git://git.linaro.org/people/jstultz/linux into timers/urgent 2011-05-12 11:31:13 +02:00
Tkhai Kirill b1054282d7 sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic
When we are in the label cc_dword_align, registers %o0 and %o1 have the same last 2 bits,
but it's not guaranteed one of them is zero. So we can get unaligned memory access
in label ccte. Example of parameters which lead to this:
%o0=0x7ff183e9, %o1=0x8e709e7d, %g1=3

With the parameters I had a memory corruption, when the additional 5 bytes were rewritten.
This patch corrects the error.

One comment to the patch. We don't care about the third bit in %o1, because cc_end_cruft
stores word or less.

Signed-off-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-11 21:35:04 -07:00