Commit graph

2728 commits

Author SHA1 Message Date
Ian Kent
5c0a32fc2c [PATCH] autofs4: add new packet type for v5 communications
This patch defines a new autofs packet for autofs v5 and updates the waitq.c
functions to handle the additional packet type.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
f75ba3ade8 [PATCH] autofs4: increase module version
Update autofs4 version.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ben Woodard
837c787877 [BLOCK] increase size of disk stat counters
The kernel's representation of the disk statistics uses the type unsigned
which is 32b on both 32b and 64b platforms.  Unfortunately, most system
tools that work with these numbers that are exported in /proc/diskstats
including iostat read these numbers into unsigned longs.  This works fine
on 32b platforms and when the number of IO transactions is small on 64b
platforms.  However, when the numbers wrap on 64b platforms & you read the
numbers into unsigned longs, and compare the numbers to previous readings,
then you get an unsigned representation of a negative number.  This looks
like a very large 64b number & gives you bizarre readouts in iostat:

ilc4: Device:    rrqm/s wrqm/s r/s    w/s  rsec/s  wsec/s    rkB/s wkB/s avgrq-sz avgqu-sz   await  svctm  %util
ilc4: sda        5.50   0.00   143.96 0.00 307496983987862656.00 0.00 153748491993931328.00     0.00 2136028725038430.00     7.94   55.12    5.59  80.42

Fixing iostat in user space is possible, but a quick survey indicates that
several other similar tools also use unsigned longs when processing
/proc/diskstats.  Therefore, it seems like a better approach
would be to extend the length of the disk_stats structure on 64b
architectures to 64b.  The following patch does that.  It should not affect
the operation on 32b platforms.
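
A minimal sketch of the wraparound described above, not the patch itself.  It
assumes a 32-bit counter that wrapped between two samples and compares the
delta a tool gets in a 64-bit variable with the delta taken modulo 2^32.  The
names delta_wide() and delta_u32() are made up for illustration.

```c
#include <stdint.h>

/*
 * What a 64-bit tool computes when it has read two samples of a 32-bit
 * kernel counter into 64-bit variables: after a wrap, curr < prev and the
 * subtraction yields a huge 64-bit value.
 */
uint64_t delta_wide(uint64_t prev, uint64_t curr)
{
    return curr - prev;
}

/* The same subtraction done in 32 bits gives the real number of events. */
uint32_t delta_u32(uint32_t prev, uint32_t curr)
{
    return (uint32_t)(curr - prev);
}
```

With prev = 0xFFFFFFF0 and curr = 0x10, the wide delta is 0xFFFFFFFF00000020
while the 32-bit delta is 0x20, which is the kind of gap behind the iostat
readout shown above.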

Signed-off-by: Ben Woodard <woodard@redhat.com>
Cc: Rick Lindsley <ricklind@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-27 09:29:02 +02:00
Akinobu Mita
e9bebd6f3a [PATCH] bitops: remove unused generic bitops in include/linux/bitops.h
generic_{ffs,fls,fls64,hweight{64,32,16,8}}() were moved into
include/asm-generic/bitops.h.  So no architecture uses them any more.

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:15 -08:00
Akinobu Mita
0b28002fdf [PATCH] more s/fucn/func/ typo fixes
s/fucntion/function/ typo fixes

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:09 -08:00
Hansjoerg Lipp
ee8a4b7f85 [PATCH] isdn4linux: Siemens Gigaset drivers - tty interface
And: Tilman Schmidt <tilman@imap.cc>

This patch adds the tty interface to the gigaset module.  The tty interface
provides direct access to the AT command set of the Gigaset devices.

Signed-off-by: Hansjoerg Lipp <hjlipp@web.de>
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:05 -08:00
Roman Zippel
05cfb614dd [PATCH] hrtimers: remove data field
The nanosleep cleanup makes it possible to remove the data field of hrtimer.  The
callback function can use container_of() to get its own data.  Since the
hrtimer structure is anyway embedded in other structures, this adds no
overhead.
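
A user-space sketch of the container_of() pattern mentioned above.  The
macro here is a simplified stand-in built on offsetof() (the kernel version
adds type checking), and struct demo_timer, struct demo_sleeper and
demo_callback() are hypothetical names, not the hrtimer API.

```c
#include <stddef.h>

/* Simplified container_of(): step back from a member to its container. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for a timer structure that carries no private data pointer. */
struct demo_timer {
    int expired;
};

/* The user embeds the timer in its own structure. */
struct demo_sleeper {
    int id;
    struct demo_timer timer;
    int woken;
};

/* The callback only receives the timer and recovers its container. */
int demo_callback(struct demo_timer *t)
{
    struct demo_sleeper *s = container_of(t, struct demo_sleeper, timer);

    s->woken = 1;
    return s->id;
}
```

Because the timer is embedded, the callback needs no separate data pointer,
which is why dropping the field costs nothing.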

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:03 -08:00
Roman Zippel
df869b630d [PATCH] hrtimers: remove nsec_t typedef
nsec_t predates ktime_t and has mostly been superseded by it.  In the few
places that are left it's better to make it explicit that we're dealing with
64 bit values here.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:03 -08:00
Roman Zippel
272705c597 [PATCH] hrtimers: remove DEFINE_KTIME and ktime_to_clock_t()
Now that it_real_value is gone, the last users of DEFINE_KTIME and
ktime_to_clock_t are also gone, so remove them before someone starts using
them again.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:03 -08:00
Roman Zippel
b75f7a51ca [PATCH] hrtimers: remove state field
Remove the state field and encode this information in the rb_node, similar to
normal timers.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:02 -08:00
Roman Zippel
432569bb9d [PATCH] hrtimers: simplify nanosleep
nanosleep is the only user of the expired state, so let it manage this itself,
which makes the hrtimer code a bit simpler.  The remaining time is also only
calculated if requested.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:02 -08:00
Roman Zippel
44f2147551 [PATCH] hrtimers: pass current time to hrtimer_forward()
Pass current time to hrtimer_forward().  This allows the softirq time in the
timer base to be used when the forward function is called from the timer callback.
 Other places pass current time with a call to timer->base->get_time().
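
A rough sketch of a forward function that takes the current time from its
caller, assuming plain 64-bit nanosecond values instead of ktime_t.
struct demo_timer and demo_forward() are illustrative names; this is not the
kernel's hrtimer_forward().

```c
#include <stdint.h>

struct demo_timer {
    int64_t expires;    /* absolute expiry time in nanoseconds */
};

/*
 * Push the expiry past 'now' in whole multiples of 'interval' and return
 * the number of intervals added (0 if the timer has not expired yet).
 * The caller supplies 'now', so a callback can reuse a time that was
 * already sampled instead of reading the clock again.
 */
uint64_t demo_forward(struct demo_timer *t, int64_t now, int64_t interval)
{
    int64_t delta = now - t->expires;
    uint64_t overruns;

    if (interval <= 0 || delta < 0)
        return 0;

    overruns = (uint64_t)(delta / interval) + 1;
    t->expires += (int64_t)overruns * interval;
    return overruns;
}
```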

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:02 -08:00
Thomas Gleixner
92127c7a45 [PATCH] hrtimers: optimize softirq runqueues
The hrtimer softirq is called from the timer softirq every tick.  Retrieve the
current time from xtime and wall_to_monotonic instead of calling
base->get_time() for each timer base.  Store the time in the base structure
and provide a hook once clock source abstractions are in place and to keep the
code open for new base clocks.

Based on a patch from: Roman Zippel <zippel@linux-m68k.org>

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:02 -08:00
Badari Pulavarty
1d8fa7a2b9 [PATCH] remove ->get_blocks() support
Now that get_block() can handle mapping multiple disk blocks, no need to have
->get_blocks().  This patch removes fs specific ->get_blocks() added for DIO
and makes its users use get_block() instead.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Badari Pulavarty
b0cf2321c6 [PATCH] pass b_size to ->get_block()
Pass the amount of disk that needs to be mapped to get_block().  This way one can
modify the fs ->get_block() functions to map multiple blocks at the same time.
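
A toy model of the idea, assuming a file made of one contiguous 8-block
extent and 4096-byte blocks.  struct demo_bh and demo_get_block() are
hypothetical; they only mimic the convention that b_size carries the
requested size in and the mapped size out.

```c
#include <stddef.h>

#define DEMO_BLOCK_SIZE   4096u   /* bytes per block (assumed) */
#define DEMO_EXTENT_LEN   8ul     /* toy file: 8 contiguous blocks */
#define DEMO_EXTENT_START 1000ul  /* first physical block of the extent */

struct demo_bh {
    unsigned long b_blocknr;  /* out: first physical block mapped */
    size_t b_size;            /* in: bytes wanted, out: bytes mapped */
    int mapped;
};

/* Map as many contiguous blocks as requested, limited by the extent. */
int demo_get_block(unsigned long iblock, struct demo_bh *bh)
{
    size_t wanted, avail, n;

    if (iblock >= DEMO_EXTENT_LEN)
        return -1;                       /* beyond the toy file */

    wanted = bh->b_size / DEMO_BLOCK_SIZE;
    if (wanted == 0)
        wanted = 1;                      /* always map at least one block */
    avail = DEMO_EXTENT_LEN - iblock;
    n = wanted < avail ? wanted : avail;

    bh->b_blocknr = DEMO_EXTENT_START + iblock;
    bh->b_size = n * DEMO_BLOCK_SIZE;
    bh->mapped = 1;
    return 0;
}
```

A caller that asks for 16 blocks starting at logical block 5 gets back a
b_size covering only the 3 blocks that remain, and is expected to call again
for the rest.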

[akpm@osdl.org: performance tweak]
[akpm@osdl.org: remove unneeded assignments]
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Badari Pulavarty
205f87f6b3 [PATCH] change buffer_head.b_size to size_t
Increase the size of the buffer_head b_size field (only) for 64 bit platforms.
Update some old and moldy comments in and around the structure as well.

The b_size increase allows us to perform larger mappings and allocations for
large I/O requests from userspace, which tie in with other changes allowing
the get_block_t() interface to map multiple blocks at once.

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Mingming Cao
b54e41ec17 [PATCH] ext3_get_blocks: support multiple blocks allocation in ext3_new_block()
Change ext3_try_to_allocate() (called via ext3_new_blocks()) to try to
allocate the requested number of blocks on a best effort basis: After
allocating the first block, it will always attempt to allocate the next few (up
to the requested size and not beyond the reservation window) adjacent blocks
at the same time.
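
The best-effort behaviour can be sketched as follows, assuming one byte per
block in place of the real block bitmap.  demo_try_to_allocate() and its
parameters are illustrative and do not reflect the ext3 function signature.

```c
/*
 * Allocate the block at 'goal' and then as many adjacent free blocks as
 * possible, stopping at 'count' blocks, at the first block already in
 * use, or at 'window_end' (exclusive).  Returns the number allocated.
 */
int demo_try_to_allocate(unsigned char *in_use, int goal,
                         int window_end, int count)
{
    int got = 0;

    if (count <= 0 || goal < 0 || goal >= window_end)
        return 0;

    while (got < count && goal + got < window_end && !in_use[goal + got]) {
        in_use[goal + got] = 1;   /* mark the block as allocated */
        got++;
    }
    return got;
}
```

The caller has to cope with getting fewer blocks than it asked for, which is
what "best effort" means here.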

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Mingming Cao
89747d369d [PATCH] ext3_get_blocks: Mapping multiple blocks at a once
Currently ext3_get_block() only maps or allocates one block at a time.  This
is quite inefficient for sequential IO workload.

I have posted an early implementation of simple multiple block mapping and
allocation with current ext3.  The basic idea is allocating the 1st block in the
existing way, and attempting to allocate the next adjacent blocks on a best effort
basis.  More description of the implementation can be found here:
http://marc.theaimsgroup.com/?l=ext2-devel&m=112162230003522&w=2

The following is the latest version of the patch: it breaks the original patch
into 5 patches, reworks some logic, and fixes some bugs.  The breakups are:

 [patch 1] Adding map multiple blocks at a time in ext3_get_blocks()
 [patch 2] Extend ext3_get_blocks() to support multiple block allocation
 [patch 3] Implement multiple block allocation in ext3-try-to-allocate
 (called via ext3_new_block()).
 [patch 4] Proper accounting updates in ext3_new_blocks()
 [patch 5] Adjust reservation window size properly (by the given number
 of blocks to allocate) before block allocation to increase the
 possibility of allocating multiple blocks in a single call.

Tests done so far include fsx, tiobench and dbench.  The following numbers
collected from Direct IO tests (1G file creation/read) show that the system time
has been greatly reduced (more than 50% on my 8 cpu system) with the patches.

 1G file DIO write:
 	2.6.15		2.6.15+patches
 real    0m31.275s	0m31.161s
 user    0m0.000s	0m0.000s
 sys     0m3.384s	0m0.564s

 1G file DIO read:
 	2.6.15		2.6.15+patches
 real    0m30.733s	0m30.624s
 user    0m0.000s	0m0.004s
 sys     0m0.748s	0m0.380s

Some previous tests we did on buffered IO using multiple block allocation
and delayed allocation showed noticeable improvement in throughput and system
time.

This patch:

Add support of mapping multiple blocks in one call.

This is useful for DIO reads and re-writes (where blocks are already
allocated), also is in line with Christoph's proposal of using getblocks() in
mpage_readpage() or mpage_readpages().

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Takashi Sato
e2d53f9525 [PATCH] 2TB files: change type of kstatfs entries
This fix was proposed by Trond Myklebust.  He says: The type "sector_t" is
heavily tied in to the block layer interface as an offset/handle to a block,
and is subject to a supposedly block-specific configuration option:
CONFIG_LBD.  Despite this, it is used in struct kstatfs to save a couple of
bytes on the stack whenever we call the filesystems' ->statfs().

So kstatfs's entries related to blocks are invalid on statfs64 for a network
filesystem which has more than 2^32-1 blocks when CONFIG_LBD is disabled.

- struct kstatfs
  Change the type of following entries from sector_t to u64.
  f_blocks
  f_bfree
  f_bavail
  f_files
  f_ffree

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Takashi Sato <sho@tnes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Takashi Sato
a0f62ac636 [PATCH] 2TB files: add blkcnt_t
Add blkcnt_t as the type of inode.i_blocks.  This enables you to make the size
of blkcnt_t either 4 bytes or 8 bytes on 32-bit architectures with CONFIG_LSF.

- CONFIG_LSF
  Add new configuration parameter.
- blkcnt_t
  On h8300, i386, mips, powerpc, s390 and sh that define sector_t,
  blkcnt_t is defined as u64 if CONFIG_LSF is enabled; otherwise it is
  defined as unsigned long.
  On other architectures, it is defined as unsigned long.
- inode.i_blocks
  Change the type from sector_t to blkcnt_t.

Signed-off-by: Takashi Sato <sho@tnes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Takashi Sato
abcb6c9fd1 [PATCH] 2TB files: st_blocks is invalid when calling stat64
This patch series fixes the following problems on 32-bit architectures.

o stat64 returns the lower 32 bits of blocks, although userland st_blocks
  has 64 bits, because i_blocks has only 32 bits.  The ioctl with FIOQSIZE has
  the same problem.

o As Dave Kleikamp said, making >2TB file on JFS results in writing an
  invalid block number to disk inode.  The cause is the same as above too.

o In generic quota code dquot_transfer(), the file usage is calculated from
  i_blocks via inode_get_bytes().  If the file is over 2TB, the change of
  usage is less than expected.  The cause is the same as above too.

o As Trond Myklebust said, statfs64's entries related to blocks are invalid
  on statfs64 for a network filesystem which has more than 2^32-1 blocks with
  CONFIG_LBD disabled.  [PATCH 3/3]
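
To illustrate the first problem in the list above, here is a small sketch
that assumes block counts in 512-byte units and compares a 32-bit count with
a 64-bit one.  The function names are made up for the example.

```c
#include <stdint.h>

/* Block count as stored in a 32-bit field: the upper bits are lost. */
uint32_t blocks_in_32bit(uint64_t bytes)
{
    return (uint32_t)(bytes / 512);
}

/* Block count as stored in a 64-bit field. */
uint64_t blocks_in_64bit(uint64_t bytes)
{
    return bytes / 512;
}
```

A 3 TiB file has 6442450944 blocks of 512 bytes, but the 32-bit field holds
only 2147483648, and a file of exactly 2 TiB shows up as 0 blocks.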

We made patches to fix problems that occur when handling a large filesystem
and a large file.  It was discussed on the mails titled "stat64 for over 2TB
file returned invalid st_blocks".

Signed-off-by: Takashi Sato <sho@tnes.nec.co.jp>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Matthew Dobson
93d2341c75 [PATCH] mempool: use mempool_create_slab_pool()
Modify well over a dozen mempool users to call mempool_create_slab_pool()
rather than calling mempool_create() with extra arguments, saving about 30
lines of code and increasing readability.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Matthew Dobson
fec433aaaa [PATCH] mempool: add mempool_create_slab_pool()
Create a simple wrapper function for the common case of creating a slab-based
mempool.
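
The shape of such a wrapper, sketched in user space with malloc() standing
in for the real allocators.  Every demo_* name is hypothetical, and the pool
below keeps no reserve of elements; it only shows how a wrapper hides the
alloc/free callback arguments of the generic create call.

```c
#include <stdlib.h>

typedef void *(*demo_alloc_t)(void *pool_data);
typedef void (*demo_free_t)(void *element, void *pool_data);

struct demo_pool {
    int min_nr;           /* elements the pool should keep in reserve */
    demo_alloc_t alloc;
    demo_free_t free;
    void *pool_data;      /* passed back to the callbacks */
};

/* Generic constructor: the caller supplies both callbacks. */
struct demo_pool *demo_pool_create(int min_nr, demo_alloc_t alloc_fn,
                                   demo_free_t free_fn, void *pool_data)
{
    struct demo_pool *p = malloc(sizeof(*p));

    if (!p)
        return NULL;
    p->min_nr = min_nr;
    p->alloc = alloc_fn;
    p->free = free_fn;
    p->pool_data = pool_data;
    return p;
}

/* Callbacks for fixed-size elements; pool_data points at the size. */
void *demo_alloc_sized(void *pool_data)
{
    return malloc(*(size_t *)pool_data);
}

void demo_free_sized(void *element, void *pool_data)
{
    (void)pool_data;
    free(element);
}

/* Wrapper for the common case: the caller no longer names the callbacks. */
struct demo_pool *demo_pool_create_sized(int min_nr, size_t *size)
{
    return demo_pool_create(min_nr, demo_alloc_sized, demo_free_sized, size);
}
```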

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Matthew Dobson
f183323d38 [PATCH] mempool: add kzalloc allocator
Add another allocator to the common mempool code: a kzalloc/kfree allocator

This will be used by the next patch in the series to replace a mempool-backed
kzalloc allocator.  It is also very likely that there will be more users in
the future.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:59 -08:00
Matthew Dobson
53184082b0 [PATCH] mempool: add kmalloc allocator
Add another allocator to the common mempool code: a kmalloc/kfree allocator

This will be used by the next patch in the series to replace duplicate
mempool-backed kmalloc allocators in several places in the kernel.  It is also
very likely that there will be more users in the future.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:59 -08:00
Matthew Dobson
6e0678f394 [PATCH] mempool: add page allocator
This will be used by the next patch in the series to replace duplicate
mempool-backed page allocators in 2 places in the kernel.  It is also likely
that there will be more users in the future.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:59 -08:00
Maneesh Soni
22e6c1b39c [PATCH] Use loff_t for size in struct proc_dir_entry
Change proc_dir_entry->size to be loff_t to represent files like
/proc/vmcore for 32bit systems with more than 4G memory.

Needed for seeing correct size for /proc/vmcore for 32-bit systems with >
4G RAM.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:57 -08:00
Stephen Rothwell
3158e9411a [PATCH] consolidate sys32/compat_adjtimex
Create compat_sys_adjtimex and use it in all appropriate places.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:57 -08:00
Stephen Rothwell
88959ea968 [PATCH] create struct compat_timex and use it everywhere
We had a copy of the compatibility version of struct timex in each 64 bit
architecture.  This patch just creates a global one and replaces all the
usages of the old ones.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:57 -08:00
Andy Adamson
5842add2f3 [PATCH] VFS,fs/locks.c,NFSD4: add race_free posix_lock_file_conf() interface
Lockd and the NFSv4 server both exercise a race condition where
posix_test_lock() is called either before or after posix_lock_file() to
deal with a denied lock request due to a conflicting lock.

Remove the race condition for the NFSv4 server by adding a new conflicting
lock parameter to __posix_lock_file() , changing the name to
__posix_lock_file_conf().

Keep posix_lock_file() interface, add posix_lock_conf() interface, both
call __posix_lock_file_conf().
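
A single-threaded sketch of the interface idea: the lock attempt itself
reports the conflicting lock, so no separate test call is needed before or
after it.  The structures and demo_lock_conf() are hypothetical, and the real
code would do both steps while holding the lock that protects the lock list.

```c
#define DEMO_MAX_LOCKS 8

struct demo_lock {
    long start;
    long end;     /* inclusive */
    int owner;
};

struct demo_lock_table {
    struct demo_lock locks[DEMO_MAX_LOCKS];
    int count;
};

/*
 * Try to add 'req'.  Returns 0 on success, -2 if the table is full, and
 * -1 if another owner holds an overlapping range; in that case the
 * conflicting lock is copied to 'conflock' (if non-NULL) in the same pass.
 */
int demo_lock_conf(struct demo_lock_table *t, const struct demo_lock *req,
                   struct demo_lock *conflock)
{
    int i;

    for (i = 0; i < t->count; i++) {
        const struct demo_lock *l = &t->locks[i];

        if (l->owner != req->owner &&
            l->start <= req->end && req->start <= l->end) {
            if (conflock)
                *conflock = *l;
            return -1;
        }
    }
    if (t->count == DEMO_MAX_LOCKS)
        return -2;
    t->locks[t->count++] = *req;
    return 0;
}
```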

[akpm@osdl.org: Put the EXPORT_SYMBOL() where it belongs]
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:56 -08:00
Randy Dunlap
878a9f30d7 [PATCH] hpet header sanitization
Add __KERNEL__ block.
Use __KERNEL__ to allow ioctl interface to be usable.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:56 -08:00
Corey Minyard
50c812b2b9 [PATCH] ipmi: add full sysfs support
Add full driver model support for the IPMI driver.  It links in the proper
bus and device support.

It adds an "ipmi" driver interface that has each BMC discovered by the
driver (as a device).  These BMCs appear in the devices/platform directory.
 If there are multiple interfaces to the same BMC, the driver should
discover this and will only have one BMC entry.  The BMC entry will have
pointers to each interface device that connects to it.

The device information (statistics and config information) has not yet been
ported over to the driver model from proc; that will come later.

This work was based on work by Yani Ioannou.  I basically rewrote it using
that code as a guide, but he still deserves credit :).

[bunk@stusta.de: make ipmi_find_bmc_guid() static]
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Yani Ioannou <yani.ioannou@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:56 -08:00
Con Kolivas
3c30b06df4 [PATCH] cleanup smp_call_function UP build
net/core/flow.c: In function 'flow_cache_flush':
net/core/flow.c:299: warning: statement with no effect

Signed-off-by: Con Kolivas <kernel@kolivas.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
NeilBrown
2ff28e22bd [PATCH] Make address_space_operations->invalidatepage return void
The return value of this function is never used, so let's be honest and
declare it as void.

Some places where invalidatepage returned 0, I have inserted comments
suggesting a BUG_ON.

[akpm@osdl.org: JBD BUG fix]
[akpm@osdl.org: rework for git-nfs]
[akpm@osdl.org: don't go BUG in block_invalidate_page()]
Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
NeilBrown
3978d7179d [PATCH] Make address_space_operations->sync_page return void
The only user ignores the return value, and the only instance
(block_sync_page) always returns 0...

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
Bjorn Helgaas
b2c99e3c70 [PATCH] EFI: keep physical table addresses in efi structure
Almost all users of the table addresses from the EFI system table want
physical addresses.  So rather than doing the pa->va->pa conversion, just keep
physical addresses in struct efi.

This fixes a DMI bug: the efi structure contained the physical SMBIOS address
on x86 but the virtual address on ia64, so dmi_scan_machine() used ioremap()
on a virtual address on ia64.

This is essentially the same as an earlier patch by Matt Tolentino:
	http://marc.theaimsgroup.com/?l=linux-kernel&m=112130292316281&w=2
except that this changes all table addresses, not just ACPI addresses.

Matt's original patch was backed out because it caused MCAs on HP sx1000
systems.  That problem is resolved by the ioremap() attribute checking added
for ia64.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:54 -08:00
Bjorn Helgaas
136939a2b5 [PATCH] EFI, /dev/mem: simplify efi_mem_attribute_range()
Pass the size, not a pointer to the size, to efi_mem_attribute_range().

This function validates memory regions for the /dev/mem read/write/mmap paths.
The pointer allows arches to reduce the size of the range, but I think that's
unnecessary complexity.  Simplifying it will let me use
efi_mem_attribute_range() to improve the ia64 ioremap() implementation.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: "Tolentino, Matthew E" <matthew.e.tolentino@intel.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:54 -08:00
James Bottomley
5a3a5a98b6 [PATCH] Add flush_kernel_dcache_page() API
We have a problem in a lot of emulated storage in that it takes a page from
get_user_pages() and does something like

kmap_atomic(page)
modify page
kunmap_atomic(page)

However, nothing has flushed the kernel cache view of the page before the
kunmap.  We need a lightweight API to do this, so this new API would
specifically be for flushing the kernel cache view of a user page which the
kernel has modified.  The driver would need to add
flush_kernel_dcache_page(page) before the final kunmap.
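
The intended call order, sketched with user-space stubs that only record the
sequence of calls.  All demo_* functions are stand-ins written for this
example; they are not the kernel implementations and do no cache
maintenance.

```c
#include <string.h>

static char demo_trace[64];   /* records the order of the calls */

static void demo_log(const char *step)
{
    strcat(demo_trace, step);
}

struct demo_page {
    unsigned char data[16];
};

/* Stubs standing in for kmap_atomic(), the new flush call, and kunmap. */
static void *demo_kmap_atomic(struct demo_page *p)
{
    demo_log("kmap,");
    return p->data;
}

static void demo_flush_kernel_dcache_page(struct demo_page *p)
{
    (void)p;
    demo_log("flush,");
}

static void demo_kunmap_atomic(void *addr)
{
    (void)addr;
    demo_log("kunmap");
}

/* Driver-side pattern: map, modify, flush the kernel view, then unmap. */
const char *demo_modify_user_page(struct demo_page *page, unsigned char value)
{
    unsigned char *addr;

    demo_trace[0] = '\0';
    addr = demo_kmap_atomic(page);
    addr[0] = value;
    demo_log("modify,");
    demo_flush_kernel_dcache_page(page);   /* before the final kunmap */
    demo_kunmap_atomic(addr);
    return demo_trace;
}
```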

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:53 -08:00
James Bottomley
03beb07664 [PATCH] Add API for flushing Anon pages
Currently, get_user_pages() returns fully coherent pages to the kernel for
anything other than anonymous pages.  This is a problem for things like
fuse and the SCSI generic ioctl SG_IO which can potentially wish to do DMA
to anonymous pages passed in by users.

The fix is to add a new memory management API: flush_anon_page() which
is used in get_user_pages() to make anonymous pages coherent.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:53 -08:00
Steven Rostedt
64a07bd82e [PATCH] protect remove_proc_entry
It has been discovered that the remove_proc_entry has a race in the removing
of entries in the proc file system that are siblings.  There's no protection
around the traversing and removing of elements that belong in the same
subdirectory.

This subdirectory list is protected in other areas by the BKL.  So the BKL was
at first used to protect this area too, but unfortunately, remove_proc_entry
may be called with spinlocks held.  The BKL may schedule, so this was not a
solution.

The final solution was to add a new global spin lock to protect this list,
called proc_subdir_lock.  This lock now protects the list in
remove_proc_entry, and I also went around looking for other areas where this
list is modified and added this protection there too.  Care must be taken
since these locations call several functions that may also schedule.

Since I don't see any location where the functions that modify the
subdirectory list are called from interrupts, the irqsave/restore versions of
the spin lock were _not_ used.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:53 -08:00
Linus Torvalds
36ddf5bbde Merge master.kernel.org:/home/rmk/linux-2.6-serial
* master.kernel.org:/home/rmk/linux-2.6-serial:
  [ARM] 3383/3: ixp2000: ixdp2x01 platform serial conversion
  [SERIAL] amba-pl010: Remove accessor macros
  [SERIAL] remove 8250_acpi (replaced by 8250_pnp and PNPACPI)
  [SERIAL] icom: select FW_LOADER
2006-03-25 20:31:32 -08:00
Linus Torvalds
a41622eaa9 Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 3030/2: fix permission check in the obscur cmpxchg syscall
  [ARM] nommu: rename compressed/head.S symbols to a new style
  [ARM] select TLS_REG_EMUL and NEEDS_SYSCALL_FOR_CMPXCHG
  [ARM] nommu: Move hardware page table definitions to pgtable-hwdef.h
  [ARM] Move read of processor ID out of lookup_processor_type()
  [ARM] Fix typo in tlbflush.h
  [ARM] noMMU: removes TLB codes in nommu mode
  [ARM] noMMU: block sys_fork in nommu mode
  [ARM] 3399/1: Fix link problem when CONFIG_PRINTK is disabled
  [ARM] 3398/1: Fix the VFP registers loading/storing base address
  [ARM] 3397/1: AT91RM9200 Header update
  [ARM] 3385/1: Battery support for sharp zaurus sl-5500 (collie)
  [ARM] SMP: don't set cpu_*_map in smp_prepare_boot_cpu
  include/linux/clk.h is betraying its ARM origins
  [ARM] Move enable_irq and disable_irq to assembler.h
  [ARM] 3391/1: use PLAT8250_DEV_PLATFORM{,1} for platform device id instead of 0/1
2006-03-25 20:29:54 -08:00
Lennert Buytenhek
104c7b03ea [ARM] 3383/3: ixp2000: ixdp2x01 platform serial conversion
Patch from Lennert Buytenhek

Add a PLAT8250_DEV_PLATFORM2, and convert the two ixdp2x01 CPLD serial
ports to use platform serial devices with ids PLAT8250_DEV_PLATFORM[12].
(The on-chip xscale UART is PLAT8250_DEV_PLATFORM, id #0.)

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-25 23:03:13 +00:00
Todd Poynor
686f8c5d77 include/linux/clk.h is betraying its ARM origins
include/linux/clk.h is betraying its ARM origins.

Signed-off-by: Todd Poynor <tpoynor@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-25 18:15:24 +00:00
Linus Torvalds
1b9a391736 Merge branch 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
  [PATCH] fix audit_init failure path
  [PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
  [PATCH] sem2mutex: audit_netlink_sem
  [PATCH] simplify audit_free() locking
  [PATCH] Fix audit operators
  [PATCH] promiscuous mode
  [PATCH] Add tty to syscall audit records
  [PATCH] add/remove rule update
  [PATCH] audit string fields interface + consumer
  [PATCH] SE Linux audit events
  [PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
  [PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
  [PATCH] Fix IA64 success/failure indication in syscall auditing.
  [PATCH] Miscellaneous bug and warning fixes
  [PATCH] Capture selinux subject/object context information.
  [PATCH] Exclude messages by message type
  [PATCH] Collect more inode information during syscall processing.
  [PATCH] Pass dentry, not just name, in fsnotify creation hooks.
  [PATCH] Define new range of userspace messages.
  [PATCH] Filter rule comparators
  ...

Fixed trivial conflict in security/selinux/hooks.c
2006-03-25 09:24:53 -08:00
Linus Torvalds
53846a21c1 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (103 commits)
  SUNRPC,RPCSEC_GSS: spkm3--fix config dependencies
  SUNRPC,RPCSEC_GSS: spkm3: import contexts using NID_cast5_cbc
  LOCKD: Make nlmsvc_traverse_shares return void
  LOCKD: nlmsvc_traverse_blocks return is unused
  SUNRPC,RPCSEC_GSS: fix krb5 sequence numbers.
  NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.
  SUNRPC,RPCSEC_GSS: remove unnecessary kmalloc of a checksum
  SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
  SUNRPC: Fix memory barriers for req->rq_received
  NFS: Fix a race in nfs_sync_inode()
  NFS: Clean up nfs_flush_list()
  NFS: Fix a race with PG_private and nfs_release_page()
  NFSv4: Ensure the callback daemon flushes signals
  SUNRPC: Fix a 'Busy inodes' error in rpc_pipefs
  NFS, NLM: Allow blocking locks to respect signals
  NFS: Make nfs_fhget() return appropriate error values
  NFSv4: Fix an oops in nfs4_fill_super
  lockd: blocks should hold a reference to the nlm_file
  NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE
  NFSv4: Send the delegation stateid for SETATTR calls
  ...
2006-03-25 09:18:27 -08:00
Andi Kleen
267b48014a [PATCH] x86_64: Try to allocate node memmap near the end of node
This fixes problems with very large nodes (over 128GB) filling up all of
the first 4GB with their mem_map and not leaving enough space for the
swiotlb.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:56 -08:00
Andi Kleen
f083a329e6 [PATCH] x86_64: Clean up and tweak ACPI blacklist year code
 - Move the core parser into dmi_scan.c.  It can be useful for other
   subsystems too.
 - Differentiate between field doesn't exist and field is 0 or
   unparseable.  The first case is likely an old BIOS with broken ACPI,
   the latter is likely a slightly buggy BIOS where someone forgot to
   edit the date.  Don't blacklist in the latter case.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:54 -08:00
Linus Torvalds
1e8c573933 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits)
  BUG_ON() Conversion in drivers/video/
  BUG_ON() Conversion in drivers/parisc/
  BUG_ON() Conversion in drivers/block/
  BUG_ON() Conversion in sound/sparc/cs4231.c
  BUG_ON() Conversion in drivers/s390/block/dasd.c
  BUG_ON() Conversion in lib/swiotlb.c
  BUG_ON() Conversion in kernel/cpu.c
  BUG_ON() Conversion in ipc/msg.c
  BUG_ON() Conversion in block/elevator.c
  BUG_ON() Conversion in fs/coda/
  BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
  BUG_ON() Conversion in input/serio/hil_mlc.c
  BUG_ON() Conversion in md/dm-hw-handler.c
  BUG_ON() Conversion in md/bitmap.c
  The comment describing how MS_ASYNC works in msync.c is confusing
  rcu: undeclared variable used in documentation
  fix typos "wich" -> "which"
  typo patch for fs/ufs/super.c
  Fix simple typos
  tabify drivers/char/Makefile
  ...
2006-03-25 08:41:09 -08:00
Linus Torvalds
b55813a2e5 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NETFILTER] x_table.c: sem2mutex
  [IPV4]: Aggregate route entries with different TOS values
  [TCP]: Mark tcp_*mem[] __read_mostly.
  [TCP]: Set default max buffers from memory pool size
  [SCTP]: Fix up sctp_rcv return value
  [NET]: Take RTNL when unregistering notifier
  [WIRELESS]: Fix config dependencies.
  [NET]: Fill in a 32-bit hole in struct sock on 64-bit platforms.
  [NET]: Ensure device name passed to SO_BINDTODEVICE is NULL terminated.
  [MODULES]: Don't allow statically declared exports
  [BRIDGE]: Unaligned accesses in the ethernet bridge
2006-03-25 08:39:20 -08:00
Linus Torvalds
368d17e068 Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (33 commits)
  V4L/DVB (3604): V4l printk fix
  V4L/DVB (3599c): Whitespace cleanups under Documentation/video4linux
  V4L/DVB (3599b): Whitespace cleanups under drivers/media
  V4L/DVB (3599a): Move drivers/usb/media to drivers/media/video
  V4L/DVB (3599): Implement new routing commands for wm8775 and cs53l32a.
  V4L/DVB (3598): Add bit algorithm adapter for the Conexant CX2341X boards.
  V4L/DVB (3597): Vivi: fix warning: implicit declaration of function 'in_interrupt'
  V4L/DVB (3588): Remove VIDIOC_G/S_AUDOUT from msp3400
  V4L/DVB (3587): Always wake thread after routing change.
  V4L/DVB (3584): Implement V4L2_TUNER_MODE_LANG1_LANG2 audio mode
  V4L/DVB (3582): Implement correct msp3400 input/output routing
  V4L/DVB (3581): Add new media/msp3400.h header containing the routing macros
  V4L/DVB (3580): Last round of msp3400 cleanups before adding routing commands
  V4L/DVB (3579): Move msp_modus to msp3400-kthreads, add JP and KR std detection
  V4L/DVB (3578): Make scart definitions easier to handle
  V4L/DVB (3577): Cleanup audio input handling
  V4L/DVB (3575): Cxusb: fix i2c debug messages for bluebird devices
  V4L/DVB (3574): Cxusb: fix debug messages
  V4L/DVB (3573): Cxusb: remove FIXME: comment in bluebird_patch_dvico_firmware_download
  V4L/DVB (3572): Cxusb: conditionalize gpio write for the medion box
  ...
2006-03-25 08:37:36 -08:00
Roman Zippel
5ddcfa878d [PATCH] remove pps support
This removes the support for pps.  It's completely unused within the kernel
and is basically in the way for further cleanups.  It should be easier to
re-add proper support for it after the rest has been converted to NTP4
(where the pps mechanisms are quite different from NTP3 anyway).

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:02 -08:00
Pekka Enberg
d12ddde2bb [PATCH] udf: remove duplicate definitions
This patch removes duplicate definitions from include/linux/udf_fs_i.h
which are already defined in fs/udf/ecma_167.h.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:01 -08:00
Ashok Raj
34f361ade2 [PATCH] Check if cpu can be onlined before calling smp_prepare_cpu()
- Moved check for online cpu out of smp_prepare_cpu()

- Moved default declaration of smp_prepare_cpu() to kernel/cpu.c

- Moved lock_cpu_hotplug() out of smp_prepare_cpu() to around the call,
  since it's called from cpu_up() as well now.

- Removed clearing from cpu_present_map during cpu_offline as it breaks
  using cpu_up() directly during a subsequent online operation.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:01 -08:00
Andrew Morton
96a9b4d31e [PATCH] cpumask: uninline any_online_cpu()
           text    data     bss     dec     hex filename
before: 3605597 1363528  363328 5332453  515de5 vmlinux
after:  3605295 1363612  363200 5332107  515c8b vmlinux

218 bytes saved.

Also, optimise any_online_cpu() out of existence on CONFIG_SMP=n.

This function seems inefficient.  Can't we simply AND the two masks, then use
find_first_bit()?
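
The AND-then-scan idea raised above can be sketched in userspace as follows. This is only an illustration under stated assumptions: the mask is a single 64-bit word, and `mask64_t`, `NO_CPU`, `first_bit()` and `any_online()` are hypothetical names, not the kernel's cpumask API.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask that fits in one 64-bit word. */
typedef uint64_t mask64_t;

/* Hypothetical sentinel returned when no CPU is set in both masks. */
#define NO_CPU 64u

/* Index of the lowest set bit in word, or NO_CPU if word is zero. */
static unsigned int first_bit(mask64_t word)
{
    unsigned int bit = 0;

    if (word == 0)
        return NO_CPU;
    while ((word & 1u) == 0) {
        word >>= 1;
        bit++;
    }
    return bit;
}

/* AND the candidate mask with the online mask, then scan the result once. */
static unsigned int any_online(mask64_t candidates, mask64_t online)
{
    return first_bit(candidates & online);
}
```

For example, `any_online(0xF0, 0x30)` yields 4, the lowest bit set in both words. A real cpumask can span several words, so the kernel version would have to AND word by word.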

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:00 -08:00
Andrew Morton
8630282070 [PATCH] cpumask: uninline highest_possible_processor_id()
Shrinks the only caller (net/bridge/netfilter/ebtables.c) by 174 bytes.

Also, optimise highest_possible_processor_id() out of existence on
CONFIG_SMP=n.

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:00 -08:00
Andrew Morton
3d18bd74a2 [PATCH] cpumask: uninline next_cpu()
           text    data     bss     dec     hex filename
before: 3488027 1322496  360128 5170651  4ee5db vmlinux
after:  3485112 1322480  359968 5167560  4ed9c8 vmlinux

2931 bytes saved

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:59 -08:00
Andrew Morton
ccb46000f4 [PATCH] cpumask: uninline first_cpu()
           text    data     bss     dec     hex filename
before: 3490577 1322408  360000 5172985  4eeef9 vmlinux
after:  3488027 1322496  360128 5170651  4ee5db vmlinux

Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:59 -08:00
Jonathan Corbet
daff89f324 [PATCH] radix-tree documentation cleanups
Documentation changes to help radix tree users avoid overrunning the tags
array.  RADIX_TREE_TAGS moves to linux/radix-tree.h and is now known as
RADIX_TREE_MAX_TAGS (Nick Piggin's idea).  Tag parameters are changed to
unsigned, and some comments are updated.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:59 -08:00
Andrew Morton
962749af67 [PATCH] roundup_pow_of_two() 64-bit fix
fls() takes an integer, so roundup_pow_of_two() is busted for ulongs larger
than 2^32-1.

Fix this by implementing and using fls_long().

(Why does roundup_pow_of_two() return a long?)

(Why is roundup_pow_of_two() __attribute_const__ whereas long_log2() is
__attribute_pure__?)

(Why does long_log2() suck so much?  Because we were missing fls_long()?)
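
A userspace sketch of the truncation problem described above, assuming a 64-bit long. The helpers `fls32()`, `fls64()`, `roundup_pow2_truncating()` and `roundup_pow2_wide()` are hypothetical stand-ins, not the kernel's fls(), fls_long() or roundup_pow_of_two().

```c
#include <stdint.h>

/* Find last set bit: 1-based index of the highest set bit, 0 if none. */
static int fls32(uint32_t x)
{
    int pos = 0;

    while (x) {
        x >>= 1;
        pos++;
    }
    return pos;
}

/* Same operation on a 64-bit value, standing in for a long-sized fls. */
static int fls64(uint64_t x)
{
    int pos = 0;

    while (x) {
        x >>= 1;
        pos++;
    }
    return pos;
}

/* Broken variant: the argument is truncated to 32 bits before the scan. */
static uint64_t roundup_pow2_truncating(uint64_t x)
{
    return (uint64_t)1 << fls32((uint32_t)(x - 1));
}

/* Fixed variant: scans the full 64-bit value. Valid for 1 <= x <= 2^63. */
static uint64_t roundup_pow2_wide(uint64_t x)
{
    return (uint64_t)1 << fls64(x - 1);
}
```

Both variants agree on small inputs such as 5, which rounds up to 8. For 2^33 + 1 the truncating variant sees zero after the cast and returns 1, while the wide variant returns 2^34.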

Cc: Roland Dreier <rdreier@cisco.com>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:58 -08:00
Chris Wright
12b5989be1 [PATCH] refactor capable() to one implementation, add __capable() helper
Move capable() to kernel/capability.c and eliminate duplicate
implementations.  Add __capable() function which can be used to check for
capability of any process.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Adrian Bunk
77d47582c2 [PATCH] add a proper prototype for setup_arch()
This patch adds a proper prototype for setup_arch() in init.h.

This patch is based on a patch by Ben Dooks <ben-linux@fluff.org>.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Andrew Morton
c777ac5594 [PATCH] irq: uninline migration functions
Uninline some massive IRQ migration functions.  Put them in the new
kernel/irq/migration.c.

Cc: Andi Kleen <ak@muc.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:55 -08:00
Bjorn Helgaas
33d8675ea6 [PATCH] amiga: fix driver_register() return handling, remove zorro_module_init()
Remove the assumption that driver_register() returns the number of devices
bound to the driver.  In fact, it returns zero for success or a negative
error value.

zorro_module_init() used the device count to automatically unregister and
unload drivers that found no devices.  That might have worked at one time,
but has been broken for some time because zorro_register_driver() returned
either a negative error or a positive count (never zero).  So it could only
unregister on failure, when it's not needed anyway.

This functionality could be resurrected in individual drivers by counting
devices in their .probe() methods.
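
The return convention and the suggested per-driver device count can be sketched with mock types. Everything here (`struct fake_driver`, `fake_driver_register()`, `fake_probe()`, `fake_module_init()`) is hypothetical and only mirrors the convention of zero on success and a negative errno on failure. It is not the driver core API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver record; bound counts devices seen by probe. */
struct fake_driver {
    const char *name;
    int bound;
};

/* Mock registration: 0 on success, negative errno on failure, never a count. */
static int fake_driver_register(const struct fake_driver *drv)
{
    if (drv == NULL || drv->name == NULL)
        return -EINVAL;
    return 0;
}

/* Mock probe: the driver itself counts each device it binds. */
static int fake_probe(struct fake_driver *drv)
{
    drv->bound++;
    return 0;
}

/* Init path: treat only negative values as errors, then check our own count. */
static int fake_module_init(struct fake_driver *drv, int ndevices)
{
    int ret = fake_driver_register(drv);
    int i;

    if (ret < 0)
        return ret;
    for (i = 0; i < ndevices; i++)
        fake_probe(drv);
    return drv->bound > 0 ? 0 : -ENODEV;
}
```

With two devices the init path returns 0 and `bound` is 2. With none it returns -ENODEV, which is the point where a driver could choose to unregister itself.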

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:53 -08:00
Bjorn Helgaas
e51c01b084 [PATCH] hp300: fix driver_register() return handling, remove dio_module_init()
Remove the assumption that driver_register() returns the number of devices
bound to the driver.  In fact, it returns zero for success or a negative
error value.

dio_module_init() used the device count to automatically unregister and
unload drivers that found no devices.  That might have worked at one time,
but has been broken for some time because dio_register_driver() returned
either a negative error or a positive count (never zero).  So it could only
unregister on failure, when it's not needed anyway.

This functionality could be resurrected in individual drivers by counting
devices in their .probe() methods.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Philip Blundell <philb@gnu.org>
Cc: Jochen Friedrich <jochen@scram.de>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:53 -08:00
Vladimir V. Saveliev
cd02b966bf [PATCH] reiserfs: cleanups
Clean up several places where gcc issues warnings when -W is specified.
Thanks to Neil for finding that.

Signed-off-by: Vladimir V. Saveliev <vs@namesys.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Hans Reiser <reiser@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:53 -08:00
Rene Herman
e6a6784627 [PATCH] parport: move PP_MAJOR from ppdev.h to major.h
Today I wondered about /dev/parport<n> after not seeing anything in
drivers/parport register char-major-99.  Having PP_MAJOR in
include/linux/major.h would've allowed me to more quickly determine that it
was the ppdev driver driving these.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:53 -08:00
Nick Piggin
c32ccd87bf [PATCH] inotify: lock avoidance with parent watch status in dentry
Previous inotify work avoidance is good when inotify is completely unused,
but it breaks down if even a single watch is in place anywhere in the
system.  Robin Holt notices that udev is one such culprit - it slows down a
512-thread application on a 512 CPU system from 6 seconds to 22 minutes.

Solve this by adding a flag in the dentry that tells inotify whether or not
its parent inode has a watch on it.  Event queueing to parent will skip
taking locks if this flag is cleared.  Setting and clearing of this flag on
all child dentries versus event delivery: this is no in terms of race
cases, and that was shown to be equivalent to always performing the check.

The essential behaviour is that activity occurring _after_ a watch has been
added and _before_ it has been removed, will generate events.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:53 -08:00
Adrian Bunk
9871728b75 [PATCH] kernel/params.c: make param_array() static
param_array() in kernel/params.c can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:52 -08:00
Rusty Russell
8d3b33f67f [PATCH] Remove MODULE_PARM
MODULE_PARM was actually breaking: recent gcc versions optimize them out as
unused.  It's time to replace the last users, which are generally in the
most unloved drivers anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:52 -08:00
Thomas Koeller
1aef821a6b [PATCH] constify tty flip buffer handling
Add a couple of 'const' qualifiers to the TTY flip buffer APIs, where
appropriate.

Signed-off-by: Thomas Koeller <thomas@koeller.dyndns.org>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:52 -08:00
Oleg Drokin
b500531e6f [PATCH] Introduce FMODE_EXEC file flag
Introduce FMODE_EXEC file flag, to indicate that file is being opened for
execution.  This is useful for distributed filesystems to maintain
consistent behavior for returning ETXTBUSY when opening for write and
execution happens on different nodes.

akpm:

  Needed by Lustre at present.  I assume their objective is to work towards
  being able to install Lustre on an unmodified distro kernel, which seems
  sane.  It should have zero runtime cost.

  Trond and Chuck indicate that NFS4 can probably use this too, for the same
  thing.

  Steven says it's also on the GFS todo list.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Alexander Zarochentzev
23f9e0f891 [PATCH] reiserfs: fix transaction overflowing
This patch fixes a bug in reiserfs truncate.  A transaction might overflow
when truncating a long, highly fragmented file.  The fix is to split
truncation into several transactions to avoid overflowing.

Signed-off-by: Vladimir V. Saveliev <vs@namesys.com>
Cc: Charles McColgan <cm@chuck.net>
Cc: Alexander Zarochentsev <zam@namesys.com>
Cc: Hans Reiser <reiser@namesys.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Adrian Bunk
bdfc326614 [PATCH] fs/inode.c: make iprune_mutex static
There's no reason for iprune_mutex being global.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Jan Kara
ca5734db60 [PATCH] Small cleanup in quota.h
Remove unused quota flag.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Andrew Morton
e3df18983e [PATCH] jbd: embed j_commit_timer in journal struct
The kjournald timer is currently on the kernel thread's stack and the journal
structure points at it.  Save a pointer hop by moving the timer into the
journal structure.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:50 -08:00
Pekka Enberg
40c07ae8da [PATCH] slab: optimize constant-size kzalloc calls
As suggested by Eric Dumazet, optimize kzalloc() calls that pass a
compile-time constant size.  Please note that the patch increases kernel
text slightly (~200 bytes for defconfig on x86).

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:49 -08:00
Pekka Enberg
a8c0f9a41f [PATCH] slab: introduce kmem_cache_zalloc allocator
Introduce a memory-zeroing variant of kmem_cache_alloc.  The allocator
already exists in XFS and there are potential users for it so this patch
makes the allocator available for the general public.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:49 -08:00
Al Viro
871751e25d [PATCH] slab: implement /proc/slab_allocators
Implement /proc/slab_allocators.   It produces output like:

idr_layer_cache: 80 idr_pre_get+0x33/0x4e
buffer_head: 2555 alloc_buffer_head+0x20/0x75
mm_struct: 9 mm_alloc+0x1e/0x42
mm_struct: 20 dup_mm+0x36/0x370
vm_area_struct: 384 dup_mm+0x18f/0x370
vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
vm_area_struct: 1 split_vma+0x5a/0x10e
vm_area_struct: 11 do_brk+0x206/0x2e2
vm_area_struct: 2 copy_vma+0xda/0x142
vm_area_struct: 9 setup_arg_pages+0x99/0x214
fs_cache: 8 copy_fs_struct+0x21/0x133
fs_cache: 29 copy_process+0xf38/0x10e3
files_cache: 30 alloc_files+0x1b/0xcf
signal_cache: 81 copy_process+0xbaa/0x10e3
sighand_cache: 77 copy_process+0xe65/0x10e3
sighand_cache: 1 de_thread+0x4d/0x5f8
anon_vma: 241 anon_vma_prepare+0xd9/0xf3
size-2048: 1 add_sect_attrs+0x5f/0x145
size-2048: 2 journal_init_revoke+0x99/0x302
size-2048: 2 journal_init_revoke+0x137/0x302
size-2048: 2 journal_init_inode+0xf9/0x1c4

Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
DESC
slab-leaks3-locking-fix
EDESC
From: Andrew Morton <akpm@osdl.org>

Update for slab-remove-cachep-spinlock.patch

Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:49 -08:00
Thomas Gleixner
c08b8a4910 [PATCH] sys_alarm() unsigned signed conversion fixup
alarm() calls the kernel with an unsigned int timeout in seconds.  The
value is stored in the tv_sec field of a struct timeval to set up the
itimer.  The tv_sec field of struct timeval is of type long, which causes
the tv_sec value to be negative on 32 bit machines if seconds > INT_MAX.

Before the hrtimer merge (pre 2.6.16) such a negative value was converted
to the maximum jiffies timeout by the timeval_to_jiffies conversion.  It's
not clear whether this was intended or just happened to be done by the
timeval_to_jiffies code.

hrtimers expect a timeval in canonical form and treat a negative timeout as
already expired.  This breaks the legitimate usage of alarm() with a
timeout value > INT_MAX seconds.

For 32 bit machines it is therefore necessary to limit the internal seconds
value to avoid API breakage.  Instead of doing this in all implementations
of sys_alarm the duplicated sys_alarm code is moved into a common function
in itimer.c
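
The limiting step can be sketched like this, assuming a 32-bit signed seconds field. The name `alarm_seconds_clamped()` and the fixed-width types are illustrative assumptions, not the code that went into itimer.c.

```c
#include <stdint.h>

/*
 * Clamp an unsigned seconds count so that it fits a 32-bit signed field.
 * Without the clamp, a value above INT32_MAX stored into such a field
 * would read back as negative on common two's complement machines, and a
 * negative timeout is treated as already expired.
 */
static int32_t alarm_seconds_clamped(uint32_t seconds)
{
    if (seconds > (uint32_t)INT32_MAX)
        return INT32_MAX;
    return (int32_t)seconds;
}
```

A request for 0x80000000 seconds then becomes INT32_MAX seconds rather than a negative value, while small values pass through unchanged.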

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:48 -08:00
a1a8feed17 [MODULES]: Don't allow statically declared exports
Add an extern declaration for exported symbols to make the compiler warn
on symbols declared statically.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-24 15:44:58 -08:00
Hans Verkuil
a20c522498 V4L/DVB (3598): Add bit algorithm adapter for the Conexant CX2341X boards.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-03-24 16:27:00 -03:00
Hans Verkuil
301e22d691 V4L/DVB (3584): Implement V4L2_TUNER_MODE_LANG1_LANG2 audio mode
Add a new audio mode V4L2_TUNER_MODE_LANG1_LANG2 (used by VIDIOC_G/S_TUNER).
This mode allows the user to select both languages of a bilingual transmission,
one language on the left, one on the right audio channel. If there is no
bilingual transmission, or it is not supported, then this mode should act like
V4L2_TUNER_MODE_STEREO.
This mode is introduced for PVR-like drivers where it is useful to be able to
record both languages of a bilingual broadcast.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-03-24 16:26:58 -03:00
Andrzej Zaborowski
13fce80629 Fix simple typos
This corrects some trivial errors in ARM docs and comments.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-24 18:13:37 +01:00
Linus Torvalds
e93252faca Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [PATCH] libata: Remove dependence on host_set->dev for SAS
  [PATCH] libata: ata_scsi_ioctl cleanup
  [PATCH] libata: ata_scsi_queuecmd cleanup
  [libata] export ata_dev_pair; trim trailing whitespace
  [PATCH] libata: add ata_dev_pair helper
  [PATCH] Make libata not powerdown drivers on PM_EVENT_FREEZE.
  [PATCH] libata: make ata_set_mode() responsible for failure handling
  [PATCH] libata: use ata_dev_disable() in ata_bus_probe()
  [PATCH] libata: implement ata_dev_disable()
  [PATCH] libata: check if port is disabled after internal command
  [PATCH] libata: make per-dev transfer mode limits per-dev
  [PATCH] libata: add per-dev pio/mwdma/udma_mask
  [PATCH] libata: implement ata_unpack_xfermask()
  [libata] Move some bmdma-specific code to libata-bmdma.c
  [libata sata_uli] kill scr_addr abuse
  [libata sata_nv] eliminate duplicate codepaths with iomap
  [libata sata_nv] cleanups: convert #defines to enums; remove in-file history
  [libata sata_sil24] cleanups: use pci_iomap(), kzalloc()
2006-03-24 08:19:51 -08:00
Davi Arnaut
96840aa00a [PATCH] strndup_user()
This patch series creates a strndup_user() function to ease copying C strings
from userspace.  Also we avoid common pitfalls like userspace modifying the
final \0 after the strlen_user().

Signed-off-by: Davi Arnaut <davi.arnaut@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:31 -08:00
Ingo Molnar
6687a97d40 [PATCH] timer-irq-driven soft-watchdog, cleanups
Make the softlockup detector purely timer-interrupt driven, removing
softirq-context (timer) dependencies.  This means that if the softlockup
watchdog triggers, it has truly observed a scheduling delay of more than
10 seconds for a SCHED_FIFO prio 99 task.

(the patch also turns off the softlockup detector during the initial bootup
phase and does small style fixes)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:30 -08:00
Kumar Gala
208a08f7cc [PATCH] ide: Allow IDE interface to specify it's not capable of 32-bit operations
In some embedded systems the IDE hardware interface may only support 16-bit
or smaller accesses.  Allow the interface to specify if this is the case
and don't allow the drive or user to override the setting.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:28 -08:00
Pierre Ossman
97f2478db1 [PATCH] Secure Digital Host Controller id and regs
Class code and register definitions for the Secure Digital Host Controller
standard.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:27 -08:00
Andrew Morton
18e79b40ed [PATCH] fsync: extract internal code
Pull the guts out of do_fsync() - we can use it elsewhere.

Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:27 -08:00
Andrew Morton
4741c9fd36 [PATCH] set_page_dirty() return value fixes
We need set_page_dirty() to return true if it actually transitioned the page
from a clean to dirty state.  This wasn't right in a couple of places.  Do a
kernel-wide audit, fix things up.

This leaves open the possibility of returning a negative errno from
set_page_dirty() sometime in the future.  But we don't do that at present.
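
The intended return contract can be illustrated with a toy page structure. `struct fake_page` and `fake_set_page_dirty()` are hypothetical, and the sketch ignores locking and atomicity entirely.

```c
#include <stdbool.h>

/* Hypothetical page with a single dirty flag. */
struct fake_page {
    bool dirty;
};

/* Returns true only when this call moved the page from clean to dirty. */
static bool fake_set_page_dirty(struct fake_page *page)
{
    bool was_dirty = page->dirty;

    page->dirty = true;
    return !was_dirty;
}
```

Calling it twice on a clean page returns true and then false, so a caller can count newly dirtied pages by summing the return values.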

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:26 -08:00
Andrew Morton
fa5a734e40 [PATCH] balance_dirty_pages_ratelimited: take nr_pages arg
Modify balance_dirty_pages_ratelimited() so that it can take a
number-of-pages-which-I-just-dirtied argument.  For msync().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:26 -08:00
Andrew Morton
ebcf28e1c7 [PATCH] fadvise(): write commands
Add two new Linux-specific fadvise() extensions:

LINUX_FADV_ASYNC_WRITE: start async writeout of any dirty pages between file
offsets `offset' and `offset+len'.  Any pages which are currently under
writeout are skipped, whether or not they are dirty.

LINUX_FADV_WRITE_WAIT: wait upon writeout of any dirty pages between file
offsets `offset' and `offset+len'.

By combining these two operations the application may do several things:

LINUX_FADV_ASYNC_WRITE: push some or all of the dirty pages at the disk.

LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE: push all of the currently dirty
pages at the disk.

LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE, LINUX_FADV_WRITE_WAIT: push all
of the currently dirty pages at the disk, wait until they have been written.

It should be noted that none of these operations write out the file's
metadata.  So unless the application is strictly performing overwrites of
already-instantiated disk blocks, there are no guarantees here that the data
will be available after a crash.

To complete this suite of operations I guess we should have a "sync file
metadata only" operation.  This gives applications access to all the building
blocks needed for all sorts of sync operations.  But sync-metadata doesn't fit
well with the fadvise() interface.  Probably it should be a new syscall:
sys_fmetadatasync().

The patch also diddles with the meaning of `endbyte' in sys_fadvise64_64().
It is made to represent the last affected byte in the file (ie: it is
inclusive).  Generally, all these byterange and pagerange functions are
inclusive so we can easily represent EOF with -1.
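
A sketch of the inclusive-end convention, assuming non-negative arguments and the usual fadvise rule that a length of 0 means "to end of file". The helper `inclusive_endbyte()` is a hypothetical name, not the code in sys_fadvise64_64().

```c
#include <stdint.h>

/*
 * Convert (offset, len) into the last affected byte, inclusive.
 * -1 stands for end of file: it is returned for len == 0 and when
 * offset + len would overflow. Assumes offset >= 0 and len >= 0.
 */
static int64_t inclusive_endbyte(int64_t offset, int64_t len)
{
    if (len == 0)
        return -1;
    if (len > INT64_MAX - offset)
        return -1;
    return offset + len - 1;
}
```

For a first page, `inclusive_endbyte(0, 4096)` gives 4095, and a one-byte range at offset 10 ends at byte 10.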

As Ulrich notes, these two functions are somewhat abusive of the fadvise()
concept, which appears to be "set the future policy for this fd".

But these commands are a perfect fit with the fadvise() implementation, and
several of the existing fadvise() commands are synchronous and don't affect
future policy either.   I think we can live with the slight incongruity.

Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:25 -08:00
Jan Beulich
ab7efcc97e [PATCH] abstract type/size specification for assembly
Provide abstraction for generating type and size information of assembly
routines and data, while permitting architectures to override these
defaults.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: "Russell King" <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "Andi Kleen" <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:25 -08:00
Paul Jackson
c61afb181c [PATCH] cpuset memory spread slab cache optimizations
The hooks in the slab cache allocator code path for support of NUMA
mempolicies and cpuset memory spreading are in an important code path.  Many
systems will use neither feature.

This patch optimizes those hooks down to a single check of some bits in the
current task's task_struct flags.  For non-NUMA systems, this hook and related
code is already ifdef'd out.

The optimization is done by using another task flag, set if the task is using
a non-default NUMA mempolicy.  Taking this flag bit along with the
PF_SPREAD_PAGE and PF_SPREAD_SLAB flag bits added earlier in this 'cpuset
memory spreading' patch set, one can check for the combination of any of these
special case memory placement mechanisms with a single test of the current
task's task_struct flags.

This patch also tightens up the code, to save a few bytes of kernel text
space, and moves some of it out of line.  Due to the nested inlines called
from multiple places, we were ending up with three copies of this code, which
once we get off the main code path (for local node allocation) seems a bit
wasteful of instruction memory.
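
The single-test idea can be sketched as a combined bitmask check. The flag names and values below are hypothetical placeholders and do not match the kernel's actual PF_* definitions.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real kernel values differ. */
#define FAKE_PF_SPREAD_PAGE 0x01u
#define FAKE_PF_SPREAD_SLAB 0x02u
#define FAKE_PF_MEMPOLICY   0x04u
#define FAKE_PF_UNRELATED   0x08u

/* One mask covering every special memory placement mechanism. */
#define FAKE_PF_SPECIAL_PLACEMENT \
    (FAKE_PF_SPREAD_PAGE | FAKE_PF_SPREAD_SLAB | FAKE_PF_MEMPOLICY)

/*
 * Hot path check: a single AND tells whether any placement mechanism is
 * active. Only when it is does the caller need the out-of-line slow path.
 */
static bool needs_placement_slow_path(unsigned int task_flags)
{
    return (task_flags & FAKE_PF_SPECIAL_PLACEMENT) != 0;
}
```

A task with none of the three bits set, which the message expects to be the common case, pays for one test and stays on the local-node allocation path.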

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:23 -08:00
Paul Jackson
101a50019a [PATCH] cpuset memory spread slab cache implementation
Provide the slab cache infrastructure to support cpuset memory spreading.

See the previous patches, cpuset_mem_spread, for an explanation of cpuset
memory spreading.

This patch provides a slab cache SLAB_MEM_SPREAD flag.  If set in the
kmem_cache_create() call defining a slab cache, then any task marked with the
process state flag PF_SPREAD_SLAB will spread memory page allocations for that
cache over all the allowed nodes, instead of preferring the local (faulting)
node.

On systems not configured with CONFIG_NUMA, this results in no change to the
page allocation code path for slab caches.

On systems with cpusets configured in the kernel, but the "memory_spread"
cpuset option not enabled for the current task's cpuset, this adds a call to a
cpuset routine and a failed bit test of the process state flag PF_SPREAD_SLAB.

For tasks so marked, a second inline test is done for the slab cache flag
SLAB_MEM_SPREAD, and if that is set and if the allocation is not
in_interrupt(), this adds a call to a cpuset routine that computes which of
the task's mems_allowed nodes should be preferred for this allocation.
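
The order of the checks described above can be sketched roughly as follows.
This is a user-space illustration only: the two struct types, the helper
name, the in_irq parameter (standing in for in_interrupt()) and the bit
values are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define PF_SPREAD_SLAB  0x1u  /* per-task flag */
#define SLAB_MEM_SPREAD 0x1u  /* per-cache flag, a separate flags word */

struct task_sketch  { unsigned int flags; };
struct cache_sketch { unsigned int flags; };

/*
 * Returns true only if the allocation should ask the cpuset code for a
 * preferred node.  The cheap per-task test comes first, so unmarked
 * tasks pay for one failed bit test and nothing more.
 */
static bool should_spread_slab(const struct task_sketch *t,
			       const struct cache_sketch *c,
			       bool in_irq)
{
	if (!(t->flags & PF_SPREAD_SLAB))
		return false;	/* task not marked: keep local-node preference */
	if (!(c->flags & SLAB_MEM_SPREAD))
		return false;	/* this cache did not opt in to spreading */
	return !in_irq;		/* never spread from interrupt context */
}
```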

==> This patch adds another hook into the performance critical
    code path for allocating objects from the slab cache, in the
    ____cache_alloc() chunk, below.  The next patch optimizes this
    hook, reducing the impact of the combined mempolicy plus memory
    spreading hooks on this critical code path to a single check
    against the task's task_struct flags word.

This patch provides the generic slab flags and logic needed to apply memory
spreading to a particular slab.

A subsequent patch will mark a few specific slab caches for this placement
policy.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:23 -08:00
Paul Jackson
44110fe385 [PATCH] cpuset memory spread page cache implementation and hooks
Change the page cache allocation calls to support cpuset memory spreading.

See the previous patch, cpuset_mem_spread, for an explanation of cpuset memory
spreading.

On systems without cpusets configured in the kernel, this is no change.

On systems with cpusets configured in the kernel, but the "memory_spread"
cpuset option not enabled for the current task's cpuset, this adds a call to a
cpuset routine and failed bit test of the process state flag PF_SPREAD_PAGE.

On tasks in cpusets with "memory_spread" enabled, this adds a call to a cpuset
routine that computes which of the task's mems_allowed nodes should be
preferred for this allocation.

If memory spreading applies to a particular allocation, then any other NUMA
mempolicy does not apply.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:22 -08:00
Paul Jackson
825a46af5a [PATCH] cpuset memory spread basic implementation
This patch provides the implementation and cpuset interface for an alternative
memory allocation policy that can be applied to certain kinds of memory
allocations, such as the page cache (file system buffers) and some slab caches
(such as inode caches).

The policy is called "memory spreading." If enabled, it spreads out these
kinds of memory allocations over all the nodes allowed to a task, instead of
preferring to place them on the node where the task is executing.

All other kinds of allocations, including anonymous pages for a task's stack
and data regions, are not affected by this policy choice, and continue to be
allocated preferring the node local to execution, as modified by the NUMA
mempolicy.

There are two boolean flag files per cpuset that control where the kernel
allocates pages for the file system buffers and related in-kernel data
structures.  They are called 'memory_spread_page' and 'memory_spread_slab'.

If the per-cpuset boolean flag file 'memory_spread_page' is set, then the
kernel will spread the file system buffers (page cache) evenly over all the
nodes that the faulting task is allowed to use, instead of preferring to put
those pages on the node where the task is running.

If the per-cpuset boolean flag file 'memory_spread_slab' is set, then the
kernel will spread some file system related slab caches, such as for inodes
and dentries, evenly over all the nodes that the faulting task is allowed to
use, instead of preferring to put those pages on the node where the task is
running.

The implementation is simple.  Setting the cpuset flags 'memory_spread_page'
or 'memory_spread_slab' turns on the per-process flags PF_SPREAD_PAGE or
PF_SPREAD_SLAB, respectively, for each task that is in the cpuset or
subsequently joins that cpuset.  In subsequent patches, the page allocation
calls for the affected page cache and slab caches are modified to perform an
inline check for these flags, and if set, a call to a new routine
cpuset_mem_spread_node() returns the node to prefer for the allocation.

The cpuset_mem_spread_node() routine is also simple.  It uses the value of a
per-task rotor cpuset_mem_spread_rotor to select the next node in the current
task's mems_allowed to prefer for the allocation.
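
The rotor idea can be sketched in user space as below.  This is an
illustration of round-robin selection over an allowed-node mask, not the
kernel routine itself: the struct, the helper name mem_spread_node_sketch(),
the 32-node limit and the plain bitmask representation of mems_allowed are
all assumptions made for the example.

```c
#define MAX_NODES_SKETCH 32  /* hypothetical limit for this sketch */

struct task_sketch {
	unsigned int mems_allowed;  /* bit n set means node n is allowed */
	int rotor;                  /* last node handed out, 0..31 */
};

/*
 * Advance the per-task rotor to the next allowed node, wrapping around
 * at the end of the mask.  Returns the chosen node, or -1 if the mask
 * is empty.
 */
static int mem_spread_node_sketch(struct task_sketch *t)
{
	for (int step = 1; step <= MAX_NODES_SKETCH; step++) {
		int node = (t->rotor + step) % MAX_NODES_SKETCH;

		if (t->mems_allowed & (1u << node)) {
			t->rotor = node;
			return node;
		}
	}
	return -1;
}
```

With nodes 0, 1 and 3 allowed and the rotor starting at 0, successive calls
return 1, 3, 0, 1 and so on, so consecutive allocations land on different
nodes.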

This policy can provide substantial improvements for jobs that need to place
thread local data on the corresponding node, but that need to access large
file system data sets that need to be spread across the several nodes in the
job's cpuset in order to fit.  Without this patch, especially for jobs that
might have one thread reading in the data set, the memory allocation across
the nodes in the job's cpuset can become very uneven.

A couple of Copyright year ranges are updated as well.  And a couple of email
addresses that can be found in the MAINTAINERS file are removed.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:22 -08:00
Adrian Bunk
cdb0452789 [PATCH] kill include/linux/platform.h, default_idle() cleanup
include/linux/platform.h contained nothing that was actually used except
the default_idle() prototype, and is therefore removed by this patch.

This patch does the following with the platform specific default_idle()
functions on different architectures:
- remove the unused function:
  - parisc
  - sparc64
- make the needlessly global function static:
  - arm
  - h8300
  - m68k
  - m68knommu
  - s390
  - v850
  - x86_64
- add a prototype in asm/system.h:
  - cris
  - i386
  - ia64

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Patrick Mochel <mochel@digitalimplant.org>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:21 -08:00
Bart Samwel
f6ef943813 [PATCH] Represent dirty_*_centisecs as jiffies internally
Store the internal values for:

/proc/sys/vm/dirty_writeback_centisecs
/proc/sys/vm/dirty_expire_centisecs

as jiffies instead of centiseconds.  Let the sysctl interface do the
conversions with full precision using clock_t_to_jiffies, instead of
doing overflow-sensitive on-the-fly conversions every time the values are
used.

Cons: apparent precision loss if HZ is not a multiple of 100, because of
conversion back and forth.  This is a common problem for all sysctl values
that use proc_dointvec_userhz_jiffies.  (There is only one other in-tree
use, in net/core/neighbour.c.)
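
The apparent precision loss can be seen with a small user-space sketch.  It
assumes plain truncating integer division and a USER_HZ of 100; the helper
names are hypothetical, and the kernel's own conversion helpers may round
differently.

```c
#define USER_HZ_SKETCH 100L  /* centiseconds per second, assumed for this sketch */

/* Convert a sysctl value in centiseconds to jiffies for a given HZ. */
static long centisecs_to_jiffies_sketch(long cs, long hz)
{
	return cs * hz / USER_HZ_SKETCH;
}

/* Convert the stored jiffies value back to centiseconds for display. */
static long jiffies_to_centisecs_sketch(long j, long hz)
{
	return j * USER_HZ_SKETCH / hz;
}
```

With HZ = 1000 a value of 500 centiseconds is stored as 5000 jiffies and
reads back as 500.  With HZ = 250 a value of 499 centiseconds is stored as
1247 jiffies and reads back as 498, which is the round-trip effect noted
above.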

Signed-off-by: Bart Samwel <bart@samwel.tk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:20 -08:00