dect
/
linux-2.6
Archived
13
0
Fork 0
Commit Graph

6022 Commits

Author SHA1 Message Date
Andrew Morton 12127498c8 nfsd warning fix
gcc-4.3:

fs/nfsd/nfsctl.c: In function 'write_getfs':
fs/nfsd/nfsctl.c:248: warning: cast from pointer to integer of different size

Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Christoph Hellwig 019ab801cf knfsd: exportfs: split out reconnecting a dentry from find_exported_dentry
There's a clear subfunctionality of reconnecting a given dentry to the main
dentry tree in find_exported_dentry, that can be called both for the dentry
we're looking for or it's parent directory.

This patch splits the subfunctionality out into a separate helper to make the
code more readable and document it's intent.  As a nice side-optimization we
can avoid getting a superfluous dentry reference count in the case we need to
reconnect a directory on it's own.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Christoph Hellwig dd90b50906 knfsd: exportfs: add find_disconnected_root helper
Break the loop that finds the root of a disconnected subtree into a helper of
its own to make reading easier and document the intent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Christoph Hellwig fb66a1989c knfsd: exportfs: move acceptable check into find_acceptable_alias
All callers of find_acceptable_alias check if the current dentry is acceptable
before looking for other acceptable aliases using find_acceptable_alias.  Move
the check into find_acceptable_alias to make the code a little more dense and
add a comment to find_acceptable_alias that documents its intent.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Christoph Hellwig d7dd618a59 knfsd: exportfs: untangle ISDIR logic in find_exported_dentry
Rework some logic in find_exported_dentry so that we only have a single
S_ISDIR check and logic that makes clear to the reader what we're really doing
here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Christoph Hellwig 10f11c341d knfsd: exportfs: remove CALL macro
Currently exportfs uses a way to call methods very differently from the rest
of the kernel.  This patch changes it to the standard conventions for method
calls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Christoph Hellwig d37065cd6d knfsd: exportfs: add procedural interface for NFSD
Currently NFSD calls directly into filesystems through the export_operations
structure.  I plan to change this interface in various ways in later patches,
and want to avoid the export of the default operations to NFSD, so this patch
adds two simple exportfs_encode_fh/exportfs_decode_fh helpers for NFSD to call
instead of poking into exportfs guts.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Christoph Hellwig 5ca2960733 knfsd: exportfs: remove iget abuse
When the exportfs interface was added the expectation was that filesystems
provide an operation to convert from a file handle to an inode/dentry, but it
kept a backwards compat option that still calls into iget.

Calling into iget from non-filesystem code is very bad, because it gives too
little information to filesystem, and simply crashes if the filesystem doesn't
implement the ->read_inode routine.

Fortunately there are only two filesystems left using this fallback: efs and
jfs.  This patch moves a copy of export_iget to each of those to implement the
get_dentry method.

While this is a temporary increase of lines of code in the kernel it allows
for a much cleaner interface and important code restructuring in later
patches.

[akpm@linux-foundation.org: add jfs_get_inode_flags() declaration]
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Christoph Hellwig a569425512 knfsd: exportfs: add exportfs.h header
currently the export_operation structure and helpers related to it are in
fs.h.  fs.h is already far too large and there are very few places needing the
export bits, so split them off into a separate header.

[akpm@linux-foundation.org: fix cifs build]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Tejun Heo 9281acea6a kallsyms: make KSYM_NAME_LEN include space for trailing '\0'
KSYM_NAME_LEN is peculiar in that it does not include the space for the
trailing '\0', forcing all users to use KSYM_NAME_LEN + 1 when allocating
buffer.  This is nonsense and error-prone.  Moreover, when the caller
forgets that it's very likely to subtly bite back by corrupting the stack
because the last position of the buffer is always cleared to zero.

This patch increments KSYM_NAME_LEN by one and updates code accordingly.

* off-by-one bug in asm-powerpc/kprobes.h::kprobe_lookup_name() macro
  is fixed.

* Where MODULE_NAME_LEN and KSYM_NAME_LEN were used together,
  MODULE_NAME_LEN was treated as if it didn't include space for the
  trailing '\0'.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Paulo Marques <pmarques@grupopie.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:03 -07:00
Rafael J. Wysocki 8314418629 Freezer: make kernel threads nonfreezable by default
Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves.  This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (ie.  to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE.  It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear.  Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
Nick Piggin 787d2214c1 fs: introduce some page/buffer invariants
It is a bug to set a page dirty if it is not uptodate unless it has
buffers.  If the page has buffers, then the page may be dirty (some buffers
dirty) but not uptodate (some buffers not uptodate).  The exception to this
rule is if the set_page_dirty caller is racing with truncate or invalidate.

A buffer can not be set dirty if it is not uptodate.

If either of these situations occurs, it indicates there could be some data
loss problem.  Some of these warnings could be a harmless one where the
page or buffer is set uptodate immediately after it is dirtied, however we
should fix those up, and enforce this ordering.

Bring the order of operations for truncate into line with those of
invalidate.  This will prevent a page from being able to go !uptodate while
we're holding the tree_lock, which is probably a good thing anyway.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
Rusty Russell 8e1f936b73 mm: clean up and kernelify shrinker registration
I can never remember what the function to register to receive VM pressure
is called.  I have to trace down from __alloc_pages() to find it.

It's called "set_shrinker()", and it needs Your Help.

1) Don't hide struct shrinker.  It contains no magic.
2) Don't allocate "struct shrinker".  It's not helpful.
3) Call them "register_shrinker" and "unregister_shrinker".
4) Call the function "shrink" not "shrinker".
5) Reduce the 17 lines of waffly comments to 13, but document it properly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Chinner <dgc@sgi.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:00 -07:00
Andy Whitcroft 5ad333eb66 Lumpy Reclaim V4
When we are out of memory of a suitable size we enter reclaim.  The current
reclaim algorithm targets pages in LRU order, which is great for fairness at
order-0 but highly unsuitable if you desire pages at higher orders.  To get
pages of higher order we must shoot down a very high proportion of memory;
>95% in a lot of cases.

This patch set adds a lumpy reclaim algorithm to the allocator.  It targets
groups of pages at the specified order anchored at the end of the active and
inactive lists.  This encourages groups of pages at the requested orders to
move from active to inactive, and active to free lists.  This behaviour is
only triggered out of direct reclaim when higher order pages have been
requested.

This patch set is particularly effective when utilised with an
anti-fragmentation scheme which groups pages of similar reclaimability
together.

This patch set is based on Peter Zijlstra's lumpy reclaim V2 patch which forms
the foundation.  Credit to Mel Gorman for sanitity checking.

Mel said:

  The patches have an application with hugepage pool resizing.

  When lumpy-reclaim is used used with ZONE_MOVABLE, the hugepages pool can
  be resized with greater reliability.  Testing on a desktop machine with 2GB
  of RAM showed that growing the hugepage pool with ZONE_MOVABLE on it's own
  was very slow as the success rate was quite low.  Without lumpy-reclaim,
  each attempt to grow the pool by 100 pages would yield 1 or 2 hugepages.
  With lumpy-reclaim, getting 40 to 70 hugepages on each attempt was typical.

[akpm@osdl.org: ia64 pfn_to_nid fixes and loop cleanup]
[bunk@stusta.de: static declarations for internal functions]
[a.p.zijlstra@chello.nl: initial lumpy V2 implementation]
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Bob Picco <bob.picco@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:22:59 -07:00
Mel Gorman 769848c038 Add __GFP_MOVABLE for callers to flag allocations from high memory that may be migrated
It is often known at allocation time whether a page may be migrated or not.
This patch adds a flag called __GFP_MOVABLE and a new mask called
GFP_HIGH_MOVABLE.  Allocations using the __GFP_MOVABLE can be either migrated
using the page migration mechanism or reclaimed by syncing with backing
storage and discarding.

An API function very similar to alloc_zeroed_user_highpage() is added for
__GFP_MOVABLE allocations called alloc_zeroed_user_highpage_movable().  The
flags used by alloc_zeroed_user_highpage() are not changed because it would
change the semantics of an existing API.  After this patch is applied there
are no in-kernel users of alloc_zeroed_user_highpage() so it probably should
be marked deprecated if this patch is merged.

Note that this patch includes a minor cleanup to the use of __GFP_ZERO in
shmem.c to keep all flag modifications to inode->mapping in the
shmem_dir_alloc() helper function.  This clean-up suggestion is courtesy of
Hugh Dickens.

Additional credit goes to Christoph Lameter and Linus Torvalds for shaping the
concept.  Credit to Hugh Dickens for catching issues with shmem swap vector
and ramfs allocations.

[akpm@linux-foundation.org: build fix]
[hugh@veritas.com: __GFP_ZERO cleanup]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:22:59 -07:00
Eric Van Hensbergen 10fa16e75c 9p: fix debug compilation error
s/9p/v9fs.c: In function 'v9fs_parse_options':
fs/9p/v9fs.c:134: error: 'p9_debug_level' undeclared (first use in this function)

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-16 16:03:25 -05:00
Satyam Sharma 5b37696fda utime(s): Honour CAP_FOWNER when times==NULL
do_utimes() does not honour CAP_FOWNER when times==NULL.
Trivial and obvious one-line fix.

Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 12:14:08 -07:00
Anton Altaparmakov 959bc220df Fix LDM for new field in the VOL5 VBLK.
Teach LDM about a new field encountered with Windows Vista.

This fixes LDM for people using Vista who have disabled drive letter
assignment from one or more volumes.  Doing this introduces a so far
unknown field in the LDM database in the VOL5 VBLK structure which
causes the LDM driver to fail to parse the VBLK structure and hence LDM
fails to parse the disk altogether.  This patch teaches the driver about
this field.

Thanks got to Ashton Mills <amills@iinet.com.au> for reporting the
problem and working with me on getting it fixed.  It is now working for
him.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
CC: Richard Russon <ldm@flatcap.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 12:01:30 -07:00
Linus Torvalds 10b275ddfd Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  [PATCH] sched: fix up fs/proc/array.c whitespace problems
  [PATCH] sched: prettify prio_to_wmult[]
  [PATCH] sched: document prio_to_wmult[]
  [PATCH] sched: improve weight-array comments
  [PATCH] sched: remove dead code from task_stime()

Fixed up trivial conflict in fs/proc/array.c
2007-07-16 11:02:49 -07:00
Linus Torvalds add096909d Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits)
  [PATCH] ocfs2: zero_user_page conversion
  ocfs2: Support xfs style space reservation ioctls
  ocfs2: support for removing file regions
  ocfs2: update truncate handling of partial clusters
  ocfs2: btree support for removal of arbirtrary extents
  ocfs2: Support creation of unwritten extents
  ocfs2: support writing of unwritten extents
  ocfs2: small cleanup of ocfs2_write_begin_nolock()
  ocfs2: btree changes for unwritten extents
  ocfs2: abstract btree growing calls
  ocfs2: use all extent block suballocators
  ocfs2: plug truncate into cached dealloc routines
  ocfs2: simplify deallocation locking
  ocfs2: harden buffer check during mapping of page blocks
  ocfs2: shared writeable mmap
  ocfs2: factor out write aops into nolock variants
  ocfs2: rework ocfs2_buffered_write_cluster()
  ocfs2: take ip_alloc_sem during entire truncate
  ocfs2: Add "preferred slot" mount option
  [KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c
  ...
2007-07-16 10:52:55 -07:00
Linus Torvalds 14dc524972 Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
  splice: direct splicing updates ppos twice
  more ACSI removal
  umem: Fix match of pci_ids in umem driver
  umem: Remove references to dead CONFIG_MM_MAP_MEMORY variable
  remove the documentation for the legacy CDROM drivers
2007-07-16 10:48:20 -07:00
Linus Torvalds b91cba52e9 Merge master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (68 commits)
  sh: sh-rtc support for SH7709.
  sh: Revert __xdiv64_32 size change.
  sh: Update r7785rp defconfig.
  sh: Export div symbols for GCC 4.2 and ST GCC.
  sh: fix race in parallel out-of-tree build
  sh: Kill off dead mach.c for hp6xx.
  sh: hd64461.h cleanup and added comments.
  sh: Update the alignment when 4K stacks are used.
  sh: Add a .bss.page_aligned section for 4K stacks.
  sh: Don't let SH-4A clobber SH-4 CFLAGS.
  sh: Add parport stub for SuperIO ports.
  sh: Drop -Wa,-dsp for DSP tuning.
  sh: Update dreamcast defconfig.
  fb: pvr2fb: A few more __devinit annotations for PCI.
  fb: pvr2fb: Fix up section mismatch warnings.
  sh: Select IPR-IRQ for SH7091.
  sh: Correct __xdiv64_32/div64_32 return value size.
  sh: Fix timer-tmu build for SH-3.
  sh: Add cpu and mach links to CLEAN_FILES.
  sh: Preliminary support for the SH-X3 CPU.
  ...
2007-07-16 10:32:02 -07:00
OGAWA Hirofumi 98283bb49c fat: Fix the race of read/write the FAT12 entry
FAT12 entry is 12bits, so it needs 2 phase to update the value.  And
writer and reader access it without any lock, so reader can get the
half updated value.

This fixes the long standing race condition by adding a global
spinlock to only FAT12 for avoiding any impact against FAT16/32.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 10:31:01 -07:00
Geert Uytterhoeven 98701dc19e compat32: ignore the LOOP_CLR_FD ioctl
compat32: Ignore the LOOP_CLR_FD ioctl for the loop block device, to kill an
annoying kernel message when e.g. busybox umount is used.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00
Robert P. J. Day 29e3f34777 NLS: Remove obsolete Makefile entries
Since the corresponding source files no longer exist, remove the
irrelevant Makefile entries for them.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00
Badari Pulavarty 5e70030d4c ext4: statfs speed up
This is a patch that speeds up statfs.  It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes.  That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).

It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted.  While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00
Badari Pulavarty a71ce8c6c9 ext3: statfs speed up
This is a patch that speeds up statfs.  It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes.  That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).

It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted.  While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00
Badari Pulavarty 2235219b77 ext2: statfs speed up
This is a patch that speeds up statfs.  It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes.  That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).

It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted.  While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00
J. Bruce Fields 4b4e5a1411 Fix trivial typos in anon_inodes.c comments
Trivial typo and grammar fixes.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00
Borislav Petkov 6c675bd43c ext4: fix error handling in ext4_create_journal
Fix error handling in ext4_create_journal according to kernel conventions.

Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:51 -07:00
Borislav Petkov 952d9de116 ext3: fix error handling in ext3_create_journal()
Fix error handling in ext3_create_journal according to kernel conventions.

Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:51 -07:00
Cyrill Gorcunov d375b97037 UDF: fix function name from udf_crc16 to udf_crc
We have to change udf_crc16() name to udf_crc() to be able to play with CRC
test.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:51 -07:00
vignesh babu 9e8c4273ef is_power_of_2: ufs/super.c
Replace (n & (n-1)) with is_power_of_2

Signed-off-by: vignesh babu <vignesh.babu@wipro.com>
Acked-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:50 -07:00
Andrea Arcangeli 1d9d02feee move seccomp from /proc to a prctl
This reduces the memory footprint and it enforces that only the current
task can enable seccomp on itself (this is a requirement for a
strightforward [modulo preempt ;) ] TIF_NOTSC implementation).

Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:50 -07:00
Andrew Morton 4210df283c bd_claim_by_disk: fix warning
Fix this:

fs/block_dev.c: In function 'bd_claim_by_disk':
fs/block_dev.c:970: warning: 'found' may be used uninitialized in this function

and given that free_bd_holder() now needs free(NULL)-is-legal behaviour, we
can simplify bd_release_from_kobject().

Cc: Bjorn Steinbrink <B.Steinbrink@gmx.de>
Cc: Johannes Weiner <hannes-kernel@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:50 -07:00
Johannes Weiner 4e91672c76 Replace obscure constructs in fs/block_dev.c
Replace some funky codepaths in fs/block_dev.c with cleaner versions of the
affected places.

[akpm@linux-foundation.org: fix return value]
Signed-off-by: Johannes Weiner <hannes-kernel@saeurebad.de>
Cc: Bjorn Steinbrink <B.Steinbrink@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:50 -07:00
Adrian Bunk 948730b0e3 fs/namespace.c should #include "internal.h"
Every file should include the headers containing the prototypes for
its global functions.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:50 -07:00
Duane Griffin d45bce8faf HFS+: add custom dentry hash and comparison operations
Add custom dentry hash and comparison operations for HFS+ filesystems that are
case-insensitive and/or do automatic unicode decomposition.  The new
operations reuse the existing HFS+ ASCII to unicode conversion, unicode
decomposition and case folding functionality.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:49 -07:00
Duane Griffin 1e96b7ca1e HFS+: refactor ASCII to unicode conversion routine for later reuse
The HFS+ filesystem is case-insensitive and does automatic unicode
decomposition by default, but does not provide custom dentry operations.  This
can lead to multiple dentries being cached for lookups on a filename with
varying case and/or character (de)composition.

These patches add custom dentry hash and comparison operations for
case-sensitive and/or automatically decomposing HFS+ filesystems.  Unicode
decomposition and case-folding are performed as required to ensure equivalent
filenames are hashed to the same values and compare as equal.

This patch:

Refactor existing HFS+ ASCII to unicode string conversion routine to split out
character conversion functionality.  This will be reused by the custom dentry
hash and comparison routines.  This approach avoids unnecessary memory
allocation compared to using the string conversion routine directly in the new
functions.

[akpm@linux-foundation.org: avoid use-of-uninitialised]
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:49 -07:00
Toshiyuki Okajima 29bc5b4f73 mistaken ext4_inode_bitmap for ext4_block_bitmap
In ext4_new_blocks(), one of two ext4_block_bitmap() calls should be
ext4_inode_bitmap() call.  It is not harmful in normal processing, but it
should be fixed.

Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:49 -07:00
vignesh babu f482394ccb is_power_of_2(): jbd
Replace (n & (n-1)) in the context of power of 2 checks with
is_power_of_2().

Signed-off-by: vignesh babu <vignesh.babu@wipro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:48 -07:00
vignesh babu 3fc74269c8 is_power_of_2: ext3/super.c
Replace (n & (n-1)) in the context of power of 2 checks with is_power_of_2()

Signed-off-by: vignesh babu <vignesh.babu@wipro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:48 -07:00
Christoph Hellwig 681dcd9543 drop obsolete sys_ioctl export
sys_ioctl() was only exported for our first version of compat ioctl
handling.  Now that the whole compat ioctl handling mess is more or less
sorted out there are no more modular users left and we can kill it.

There's one exception and that's sparc64's solaris compat module, but
sparc64 has it's own export predating the generic one by years for that
which this patch leaves untouched.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:48 -07:00
Eric W. Biederman 213dd266d4 namespace: ensure clone_flags are always stored in an unsigned long
While working on unshare support for the network namespace I noticed we
were putting clone flags in an int.  Which is weird because the syscall
uses unsigned long and we at least need an unsigned to properly hold all of
the unshare flags.

So to make the code consistent, this patch updates the code to use
unsigned long instead of int for the clone flags in those places
where we get it wrong today.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:48 -07:00
Dave Hansen e3a68e30d2 ext3: remove extra IS_RDONLY() check
ext3_change_inode_journal_flag() is only called from one location:
ext3_ioctl(EXT3_IOC_SETFLAGS).  That ioctl case already has a IS_RDONLY()
call in it so this one is superfluous.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:48 -07:00
Vasily Tarasov b716395e2b diskquota: 32bit quota tools on 64bit architectures
OpenVZ Linux kernel team has discovered the problem with 32bit quota tools
working on 64bit architectures.  In 2.6.10 kernel sys32_quotactl() function
was replaced by sys_quotactl() with the comment "sys_quotactl seems to be
32/64bit clean, enable it for 32bit" However this isn't right.  Look at
if_dqblk structure:

struct if_dqblk {
        __u64 dqb_bhardlimit;
        __u64 dqb_bsoftlimit;
        __u64 dqb_curspace;
        __u64 dqb_ihardlimit;
        __u64 dqb_isoftlimit;
        __u64 dqb_curinodes;
        __u64 dqb_btime;
        __u64 dqb_itime;
        __u32 dqb_valid;
};

For 32 bit quota tools sizeof(if_dqblk) == 0x44.
But for 64 bit kernel its size is 0x48, 'cause of alignment!
Thus we got a problem. Attached patch reintroduce sys32_quotactl() function,
that handles this and related situations.

[michal.k.k.piotrowski@gmail.com: build fix]
[akpm@linux-foundation.org: Make it link with CONFIG_QUOTA=n]
Signed-off-by: Vasily Tarasov <vtaras@openvz.org>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:48 -07:00
Jan Kara 32c3773011 ext4: fix deadlock in ext4_remount() and orphan list handling
ext4_orphan_add() and ext4_orphan_del() functions lock sb->s_lock with a
transaction started with ext4_mark_recovery_complete() waits for a transaction
holding sb->s_lock, thus leading to a possible deadlock.  At the moment we
call ext4_mark_recovery_complete() from ext4_remount() we have done all the
work needed for remounting and thus we are safe to drop sb->s_lock before we
wait for transactions to commit.  Note that at this moment we are still
guarded by s_umount lock against other remounts/umounts.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Sandeen <sandeen@sandeen.net>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:48 -07:00
Jan Kara 030703e49d ext3: fix deadlock in ext3_remount() and orphan list handling
ext3_orphan_add() and ext3_orphan_del() functions lock sb->s_lock with a
transaction started with ext3_mark_recovery_complete() waits for a transaction
holding sb->s_lock, thus leading to a possible deadlock.  At the moment we
call ext3_mark_recovery_complete() from ext3_remount() we have done all the
work needed for remounting and thus we are safe to drop sb->s_lock before we
wait for transactions to commit.  Note that at this moment we are still
guarded by s_umount lock against other remounts/umounts.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Sandeen <sandeen@sandeen.net>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:47 -07:00
Cedric Le Goater 467e9f4b50 fix create_new_namespaces() return value
dup_mnt_ns() and clone_uts_ns() return NULL on failure.  This is wrong,
create_new_namespaces() uses ERR_PTR() to catch an error.  This means that the
subsequent create_new_namespaces() will hit BUG_ON() in copy_mnt_ns() or
copy_utsname().

Modify create_new_namespaces() to also use the errors returned by the
copy_*_ns routines and not to systematically return ENOMEM.

[oleg@tv-sign.ru: better changelog]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:47 -07:00
Andrew Morton 4d3b573ad9 binfmt_elf warning fix
fs/binfmt_elf.c: In function 'load_elf_binary':
fs/binfmt_elf.c:1002: warning: 'interp_map_addr' may be used uninitialized in this function

The compiler (gcc-4.1.0) is correct, but it failed to notice that we didn't
use the resulting value.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:47 -07:00
Andrew Morton 64d67d2177 revert "vanishing ioctl handler debugging"
Revert my do_ioctl() debugging patch: Paul fixed the bug.

Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:47 -07:00
Lee Schermerhorn b4c07bce79 hugetlbfs: handle empty options string
I was seeing a null pointer deref in fs/super.c:vfs_kern_mount().
Some file system get_sb() handler was returning NULL mnt_sb with
a non-negative return value.  I also noticed a "hugetlbfs: Bad
mount option:" message in the log.

Turns out that hugetlbfs_parse_options() was not checking for an
empty option string after call to strsep().  On failure,
hugetlbfs_parse_options() returns 1.  hugetlbfs_fill_super() just
passed this return code back up the call stack where
vfs_kern_mount() missed the error and proceeded with a NULL mnt_sb.

Apparently introduced by patch:
	hugetlbfs-use-lib-parser-fix-docs.patch

The problem was exposed by this line in my fstab:

none        /huge       hugetlbfs   defaults    0 0

It can also be demonstrated by invoking mount of hugetlbfs
directly with no options or a bogus option.

This patch:

1) adds the check for empty option to hugetlbfs_parse_options(),
2) enhances the error message to bracket any unrecognized
   option with quotes ,
3) modifies hugetlbfs_parse_options() to return -EINVAL on any
   unrecognized option,
4) adds a BUG_ON() to vfs_kern_mount() to catch any get_sb()
   handler that returns a NULL mnt->mnt_sb with a return value
   >= 0.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:46 -07:00
Randy Dunlap e73a75fa7f hugetlbfs: use lib/parser, fix docs
Use lib/parser.c to parse hugetlbfs mount options.  Correct docs in
hugetlbpage.txt.

old size of hugetlbfs_fill_super:  675 bytes
new size of hugetlbfs_fill_super:  686 bytes
(hugetlbfs_parse_options() is inlined)

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Adam Litke <agl@us.ibm.com>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:46 -07:00
Wyatt Banks a5001a2780 HFSPlus: change kmalloc/memset to kzalloc
Removed kmalloc and memset in favor of kzalloc.

To explain the HFSPLUS_SB() macro in the removed memset call:

hfsplus_fs.h:#define HFSPLUS_SB(super)  (*(struct hfsplus_sb_info *)(super)->s_fs_info)

Signed-off-by: Wyatt Banks <wyatt@banksresearch.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:46 -07:00
Maxim Uvarov b663a79c19 taskstats: add context-switch counters
Make available to the user the following task and process performance
statistics:

	* Involuntary Context Switches (task_struct->nivcsw)
	* Voluntary Context Switches (task_struct->nvcsw)

Statistics information is available from:
	1. taskstats interface (Documentation/accounting/)
	2. /proc/PID/status (task only).

This data is useful for detecting hyperactivity patterns between processes.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Maxim Uvarov <muvarov@ru.mvista.com>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Cc: Jonathan Lim <jlim@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:46 -07:00
Vasily Averin a6c15c2b0f ext3/ext4: orphan list corruption due bad inode
After ext3 orphan list check has been added into ext3_destroy_inode()
(please see my previous patch) the following situation has been detected:

 EXT3-fs warning (device sda6): ext3_unlink: Deleting nonexistent file (37901290), 0
 Inode 00000101a15b7840: orphan list check failed!
 00000773 6f665f00 74616d72 00000573 65725f00 06737270 66000000 616d726f
...
 Call Trace: [<ffffffff80211ea9>] ext3_destroy_inode+0x79/0x90
  [<ffffffff801a2b16>] sys_unlink+0x126/0x1a0
  [<ffffffff80111479>] error_exit+0x0/0x81
  [<ffffffff80110aba>] system_call+0x7e/0x83

First messages said that unlinked inode has i_nlink=0, then ext3_unlink()
adds this inode into orphan list.

Second message means that this inode has not been removed from orphan list.
 Inode dump has showed that i_fop = &bad_file_ops and it can be set in
make_bad_inode() only.  Then I've found that ext3_read_inode() can call
make_bad_inode() without any error/warning messages, for example in the
following case:

...
        if (inode->i_nlink == 0) {
                if (inode->i_mode == 0 ||
                    !(EXT3_SB(inode->i_sb)->s_mount_state & EXT3_ORPHAN_FS)) {
                        /* this inode is deleted */
                        brelse (bh);
                        goto bad_inode;
...

Bad inode can live some time, ext3_unlink can add it to orphan list, but
ext3_delete_inode() do not deleted this inode from orphan list.  As result
we can have orphan list corruption detected in ext3_destroy_inode().

However it is not clear for me how to fix this issue correctly.

As far as i see is_bad_inode() is called after iget() in all places
excluding ext3_lookup() and ext3_get_parent().  I believe it makes sense to
add bad inode check to these functions too and call iput if bad inode
detected.

Signed-off-by:	Vasily Averin <vvs@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:46 -07:00
Vasily Averin 9f7dd93de0 ext3/ext4: orphan list check on destroy_inode
Customers claims to ext3-related errors, investigation showed that ext3
orphan list has been corrupted and have the reference to non-ext3 inode.
The following debug helps to understand the reasons of this issue.

[akpm@linux-foundation.org: update for print_hex_dump() changes]
Signed-off-by: Vasily Averin <vvs@sw.ru>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:46 -07:00
Alexey Dobriyan aa0ac36518 Remove capability.h from mm.h
I forgot to remove capability.h from mm.h while removing sched.h!  This
patch remedies that, because the only inline function which was using
CAP_something was made out of line.

Cross-compile tested without regressions on:

	all powerpc defconfigs
	all mips defconfigs
	all m68k defconfigs
	all arm defconfigs
	all ia64 defconfigs

	alpha alpha-allnoconfig alpha-defconfig alpha-up
	arm
	i386 i386-allnoconfig i386-defconfig i386-up
	ia64 ia64-allnoconfig ia64-defconfig ia64-up
	m68k
	mips
	parisc parisc-allnoconfig parisc-defconfig parisc-up
	powerpc powerpc-up
	s390 s390-allnoconfig s390-defconfig s390-up
	sparc sparc-allnoconfig sparc-defconfig sparc-up
	sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up
	um-x86_64
	x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:45 -07:00
Alexey Dobriyan cb510b8172 seq_file: more atomicity in traverse()
Original problem: in some circumstances seq_file interface can present
infinite proc file to the following script when normally said proc file is
finite:

	while read line; do
		[do something with $line]
	done </proc/$FILE

bash, to implement such loop does essentially

	read(0, buf, 128);
	[find \n]
	lseek(0, -difference, SEEK_CUR);

Consider, proc file prints list of objects each of them consists of many
lines, each line is shorter than 128 bytes.

Two objects in list, with ->index'es being 0 and 1.  Current one is 1, as
bash prints second object line by line.

Imagine first object being removed right before lseek().
traverse() will be called, because there is negative offset.
traverse() will reset ->index to 0 (!).
traverse() will call ->next() and get NULL in any usual iterate-over-list
code using list_for_each_entry_continue() and such. There is one object in
list now after all...
traverse() will return 0, lseek() will update file position and pretend
everything is OK.

So, what we have now: ->f_pos points to place where second object will be
printed, but ->index is 0.  seq_read() instead of returning EOF, will start
printing first line of first object every time it's called, until enough
objects are added to ->f_pos return in bounds.

Fix is to update ->index only after we're sure we saw enough objects down
the road.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:45 -07:00
Ulrich Drepper 4a19542e5f O_CLOEXEC for SCM_RIGHTS
Part two in the O_CLOEXEC saga: adding support for file descriptors received
through Unix domain sockets.

The patch is once again pretty minimal, it introduces a new flag for recvmsg
and passes it just like the existing MSG_CMSG_COMPAT flag.  I think this bit
is not used otherwise but the networking people will know better.

This new flag is not recognized by recvfrom and recv.  These functions cannot
be used for that purpose and the asymmetry this introduces is not worse than
the already existing MSG_CMSG_COMPAT situations.

The patch must be applied on the patch which introduced O_CLOEXEC.  It has to
remove static from the new get_unused_fd_flags function but since scm.c cannot
live in a module the function still hasn't to be exported.

Here's a test program to make sure the code works.  It's so much longer than
the actual patch...

#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

#ifndef O_CLOEXEC
# define O_CLOEXEC 02000000
#endif
#ifndef MSG_CMSG_CLOEXEC
# define MSG_CMSG_CLOEXEC 0x40000000
#endif

int
main (int argc, char *argv[])
{
  if (argc > 1)
    {
      int fd = atol (argv[1]);
      printf ("child: fd = %d\n", fd);
      if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
        {
          puts ("file descriptor valid in child");
          return 1;
        }
      return 0;

    }

  struct sockaddr_un sun;
  strcpy (sun.sun_path, "./testsocket");
  sun.sun_family = AF_UNIX;

  char databuf[] = "hello";
  struct iovec iov[1];
  iov[0].iov_base = databuf;
  iov[0].iov_len = sizeof (databuf);

  union
  {
    struct cmsghdr hdr;
    char bytes[CMSG_SPACE (sizeof (int))];
  } buf;
  struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1,
                        .msg_control = buf.bytes,
                        .msg_controllen = sizeof (buf) };
  struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);

  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;
  cmsg->cmsg_len = CMSG_LEN (sizeof (int));

  msg.msg_controllen = cmsg->cmsg_len;

  pid_t child = fork ();
  if (child == -1)
    error (1, errno, "fork");
  if (child == 0)
    {
      int sock = socket (PF_UNIX, SOCK_STREAM, 0);
      if (sock < 0)
        error (1, errno, "socket");

      if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
        error (1, errno, "bind");
      if (listen (sock, SOMAXCONN) < 0)
        error (1, errno, "listen");

      int conn = accept (sock, NULL, NULL);
      if (conn == -1)
        error (1, errno, "accept");

      *(int *) CMSG_DATA (cmsg) = sock;
      if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0)
        error (1, errno, "sendmsg");

      return 0;
    }

  /* For a test suite this should be more robust like a
     barrier in shared memory.  */
  sleep (1);

  int sock = socket (PF_UNIX, SOCK_STREAM, 0);
  if (sock < 0)
    error (1, errno, "socket");

  if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
    error (1, errno, "connect");
  unlink (sun.sun_path);

  *(int *) CMSG_DATA (cmsg) = -1;

  if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0)
    error (1, errno, "recvmsg");

  int fd = *(int *) CMSG_DATA (cmsg);
  if (fd == -1)
    error (1, 0, "no descriptor received");

  char fdname[20];
  snprintf (fdname, sizeof (fdname), "%d", fd);
  execl ("/proc/self/exe", argv[0], fdname, NULL);
  puts ("execl failed");
  return 1;
}

[akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:45 -07:00
Ulrich Drepper f23513e8d9 Introduce O_CLOEXEC
The problem is as follows: in multi-threaded code (or more correctly: all
code using clone() with CLONE_FILES) we have a race when exec'ing.

   thread #1                       thread #2

   fd=open()

                                   fork + exec

  fcntl(fd,F_SETFD,FD_CLOEXEC)

In some applications this can happen frequently.  Take a web browser.  One
thread opens a file and another thread starts, say, an external PDF viewer.
 The result can even be a security issue if that open file descriptor
refers to a sensitive file and the external program can somehow be tricked
into using that descriptor.

Just adding O_CLOEXEC support to open() doesn't solve the whole set of
problems.  There are other ways to create file descriptors (socket,
epoll_create, Unix domain socket transfer, etc).  These can and should be
addressed separately though.  open() is such an easy case that it makes not
much sense putting the fix off.

The test program:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#ifndef O_CLOEXEC
# define O_CLOEXEC 02000000
#endif

int
main (int argc, char *argv[])
{
  int fd;
  if (argc > 1)
    {
      fd = atol (argv[1]);
      printf ("child: fd = %d\n", fd);
      if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
        {
          puts ("file descriptor valid in child");
          return 1;
        }
      return 0;
    }

  fd = open ("/proc/self/exe", O_RDONLY | O_CLOEXEC);
  printf ("in parent: new fd = %d\n", fd);
  char buf[20];
  snprintf (buf, sizeof (buf), "%d", fd);
  execl ("/proc/self/exe", argv[0], buf, NULL);
  puts ("execl failed");
  return 1;
}

[kyle@parisc-linux.org: parisc fix]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:45 -07:00
Eric W. Biederman 4a2d44590a buffer: kill old incorrect comment
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:45 -07:00
Matthias Kaehlcke 203a2935c7 fs/block_dev.c: use list_for_each_entry()
fs/block_dev.c: Use list_for_each_entry() instead of list_for_each()
in nr_blockdev_pages()

Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:45 -07:00
Alexey Dobriyan 00c5746da9 mutex_unlock() later in seq_lseek()
All manipulations with struct seq_file::version are done under
struct seq_file::lock except one introduced in commit
d6b7a781c51c91dd054e5c437885205592faac21
aka "[PATCH] Speed up /proc/pid/maps"

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:44 -07:00
Jan Kara a6739af8b9 ext2: fix a comment when ext2_release_file() is called
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:44 -07:00
Alexey Dobriyan da58a16173 /proc/*/environ: wrong placing of ptrace_may_attach() check
It's a bit dopey-looking and can permit a task to cause a pagefault in an mm
which it doesn't have permission to read from.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:44 -07:00
young dave f17e121fd0 remove useless tolower in isofs
Remove useless tolower in isofs

Signed-off-by: dave young <hidave.darkstar@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:43 -07:00
Randy Dunlap 03a9c30c23 AFS: drop explicit extern
Don't use explicit extern specifier and quieten sparse warning:
fs/afs/vnode.c:564:12: warning: function 'afs_vnode_link' with external linkage has definition

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:43 -07:00
David Howells e8d6c55412 AFS: implement file locking
Implement file locking for AFS.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:43 -07:00
Changli Gao 99fc06df72 procfs directory entry cleanup
Function proc_register() will assign proc_dir_operations and
proc_dir_inode_operations to ent's members proc_fops and proc_iops
correctly if ent is a directory. So the early assignment isn't
necessary.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:43 -07:00
Micah Cowan 17973f5af7 Only send SIGXFSZ when exceeding rlimits.
Some users have been having problems with utilities like cp or dd dumping
core when they try to copy a file that's too large for the destination
filesystem (typically, > 4gb).  Apparently, some defunct standards required
SIGXFSZ to be sent in such circumstances, but SUS only requires/allows it
for when a written file exceeds the process's resource limits.  I'd like to
limit SIGXFSZs to the bare minimum required by SUS.

Patch sent per http://lkml.org/lkml/2007/4/10/302

Signed-off-by: Micah Cowan <micahcowan@ubuntu.com>
Acked-by: Alan Cox <alan@redhat.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:43 -07:00
Jan Kratochvil 60bfba7e85 PIE randomization
This patch is using mmap()'s randomization functionality in such a way that
it maps the main executable of (specially compiled/linked -pie/-fpie)
ET_DYN binaries onto a random address (in cases in which mmap() is allowed
to perform a randomization).

Origin of this patch is in exec-shield
(http://people.redhat.com/mingo/exec-shield/)

[jkosina@suse.cz: pie randomization: fix BAD_ADDR macro]
Signed-off-by: Jan Kratochvil <honza@jikos.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
Dave Jones c3ed85a36f isofs: fix up CodingStyle
fs/isofs/* had a bunch of CodingStyle issues.
* Indentation was a mix of spaces and tabs
* "int * foo" instead of "int *foo"
* "while ( foo )" instead of "while (foo)"
* if (foo) blah; on one line instead of two
* Missing printk KERN_ levels
* lots of trailing whitespace
* lines >80 columns changed to wrap.
* Unnecessary prototype removed by shuffling code order in C file.

Should be no functional changes other than slight size increase due to
printk changes.  Further improvement possible, but this is a start..

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
Denver Gingerich d05e96fe46 fix compiler warnings in acorn.c
warning: 'adfs_partition' defined but not used
warning: 'riscix_partition' defined but not used
warning: 'linux_partition' defined but not used

Signed-off-by: Denver Gingerich <denver@ossguy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
OGAWA Hirofumi 9aacd59934 fat: gcc 4.3 warning fix
This patch fixes the following warnings.

fs/fat/dir.c: In function 'fat_parse_long':
include/linux/msdos_fs.h:294: warning: array subscript is above array bounds
include/linux/msdos_fs.h:295: warning: array subscript is above array bounds
include/linux/msdos_fs.h:295: warning: array subscript is above array bounds

The ->name is defined as "name[8], ext[3]", but fat_checksum() uses
those as name[11]. There is no actual problem, but it's not a good manner.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
Pavel Emelianov 259902ea95 Make NFS client use seq_list_xxx helpers
This includes /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes entries.

Both need to show the header and use the list_head.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
Pavel Emelianov b0765fb857 Make /proc/self/mounts(tats) use seq_list_xxx helpers
One more simple and stupid switching to the new API.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
Pavel Emelianov 25216b0039 Make /proc/tty/drivers use seq_list_xxx helpers
Simple and stupid like some previous ones.  Just use new API.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
Pavel Emelianov a6a8bd6d28 Make AFS use seq_list_xxx helpers
These proc files show some header before dumping the list, so the
seq_list_start_head() is used.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:41 -07:00
Andrew Morton 85420ccad1 vxfs warning fixes
gcc-4.3:

fs/freevxfs/vxfs_lookup.c: In function 'vxfs_find_entry':
fs/freevxfs/vxfs_lookup.c:139: warning: cast from pointer to integer of different size
fs/freevxfs/vxfs_lookup.c: In function 'vxfs_readdir':
fs/freevxfs/vxfs_lookup.c:294: warning: cast from pointer to integer of different size

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:41 -07:00
Cyrill Gorcunov 647bd61a5f UDF: check for allocated memory for inode data
This patch adds checking for granted memory while filling up inode data to
prevent possible NULL pointer usage.  If there is not enough memory to fill
inode data we just mark it as "bad".  Also some whitespace cleanup.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:41 -07:00
Cyrill Gorcunov c9c64155f5 UDF: check for allocated memory for data of new inodes
Add checking for granted memory for inode data at the moment of its
creation.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:41 -07:00
Tomas Janousek d62141414a Use boot based time for uptime in /proc
Commit 411187fb05 caused uptime not to increase
during suspend.  This may cause confusion so I restore the old behaviour by
using the boot based time instead of monotonic for uptime.

Signed-off-by: Tomas Janousek <tjanouse@redhat.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:41 -07:00
Tomas Janousek 924b42d5a2 Use boot based time for process start time and boot time in /proc
Commit 411187fb05 caused boot time to move and
process start times to become invalid after suspend.  Using boot based time
for those restores the old behaviour and fixes the issue.

[akpm@linux-foundation.org: little cleanup]
Signed-off-by: Tomas Janousek <tjanouse@redhat.com>
Cc: Tomas Smetana <tsmetana@redhat.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:41 -07:00
Alexey Dobriyan 786d7e1612 Fix rmmod/read/write races in /proc entries
Fix following races:
===========================================
1. Write via ->write_proc sleeps in copy_from_user(). Module disappears
   meanwhile. Or, more generically, system call done on /proc file, method
   supplied by module is called, module dissapeares meanwhile.

   pde = create_proc_entry()
   if (!pde)
	return -ENOMEM;
   pde->write_proc = ...
				open
				write
				copy_from_user
   pde = create_proc_entry();
   if (!pde) {
	remove_proc_entry();
	return -ENOMEM;
	/* module unloaded */
   }
				*boom*
==========================================
2. bogo-revoke aka proc_kill_inodes()

  remove_proc_entry		vfs_read
  proc_kill_inodes		[check ->f_op validness]
				[check ->f_op->read validness]
				[verify_area, security permissions checks]
	->f_op = NULL;
				if (file->f_op->read)
					/* ->f_op dereference, boom */

NOTE, NOTE, NOTE: file_operations are proxied for regular files only. Let's
see how this scheme behaves, then extend if needed for directories.
Directories creators in /proc only set ->owner for them, so proxying for
directories may be unneeded.

NOTE, NOTE, NOTE: methods being proxied are ->llseek, ->read, ->write,
->poll, ->unlocked_ioctl, ->ioctl, ->compat_ioctl, ->open, ->release.
If your in-tree module uses something else, yell on me. Full audit pending.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:39 -07:00
Andrew Morton fc9a07e7bf invalidate_mapping_pages(): add cond_resched
invalidate_mapping_pages() can sometimes take a long time (millions of pages
to free).  Long enough for the softlockup detector to trigger.

We used to have a cond_resched() in there but I took it out because the
drop_caches code calls invalidate_mapping_pages() under inode_lock.

The patch adds a nasty flag and puts the cond_resched() back.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Jan Kara f89b779508 jbd2 commit: fix transaction dropping
We have to check that also the second checkpoint list is non-empty before
dropping the transaction.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Chuck Ebbert <cebbert@redhat.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:34 -07:00
Jan Kara fe28e42b99 jbd commit: fix transaction dropping
We have to check that also the second checkpoint list is non-empty before
dropping the transaction.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Chuck Ebbert <cebbert@redhat.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:34 -07:00
Jens Axboe bcd4f3acba splice: direct splicing updates ppos twice
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> reported that he's noticed
nfsd read corruption in recent kernels, and did the hard work of
discovering that it's due to splice updating the file position twice.
This means that the next operation would start further ahead than it
should.

nfsd_vfs_read()
    splice_direct_to_actor()
        while(len) {
            do_splice_to()                     [update sd->pos]
                -> generic_file_splice_read()  [read from sd->pos]
            nfsd_direct_splice_actor()
                -> __splice_from_pipe()        [update sd->pos]

There's nothing wrong with the core splice code, but the direct
splicing is an addon that calls both input and output paths.
So it has to take care in locally caching offset so it remains correct.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 15:02:48 +02:00
Avi Kivity d6d2816849 KVM: Remove kvmfs in favor of the anonymous inodes source
kvm uses a pseudo filesystem, kvmfs, to generate inodes, a job that the
new anonymous inodes source does much better.

Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16 12:05:49 +03:00
Ingo Molnar 8ea0260668 [PATCH] sched: fix up fs/proc/array.c whitespace problems
while changing task_stime() i noticed a whitespace style problem in
array.c - fix it. While at it, fix all the other style problems too,
most of them in the scheduler-stats related portions of array.c.

There is no change in functionality:

   text    data     bss     dec     hex filename
   4356      28       0    4384    1120 array.o-before
   4356      28       0    4384    1120 array.o-after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-16 09:46:31 +02:00
Ingo Molnar 5926c50b83 [PATCH] sched: remove dead code from task_stime()
Alexey Dobriyan noticed that task_stime() contains a piece of dead code.
(which is a remnant of earlier versions of this code) Remove that code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-16 09:46:30 +02:00
Linus Torvalds d2a9a8ded4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: fix a race condition bug in umount which caused a segfault
  9p: re-enable mount time debug option
  9p: cache meta-data when cache=loose
  net/9p: set error to EREMOTEIO if trans->write returns zero
  net/9p: change net/9p module name to 9pnet
  9p: Reorganization of 9p file system code
2007-07-15 16:44:53 -07:00
Linus Torvalds 2d896c780d Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (37 commits)
  [XFS] Fix lockdep annotations for xfs_lock_inodes
  [LIB]: export radix_tree_preload()
  [XFS] Fix XFS_IOC_FSBULKSTAT{,_SINGLE} & XFS_IOC_FSINUMBERS in compat mode
  [XFS] Compat ioctl handler for handle operations
  [XFS] Compat ioctl handler for XFS_IOC_FSGEOMETRY_V1.
  [XFS] Clean up function name handling in tracing code
  [XFS] Quota inode has no parent.
  [XFS] Concurrent Multi-File Data Streams
  [XFS] Use uninitialized_var macro to stop warning about rtx
  [XFS] XFS should not be looking at filp reference counts
  [XFS] Use is_power_of_2 instead of open coding checks
  [XFS] Reduce shouting by removing unnecessary macros from dir2 code.
  [XFS] Simplify XFS min/max macros.
  [XFS] Kill off xfs_count_bits
  [XFS] Cancel transactions on xfs_itruncate_start error.
  [XFS] Use do_div() on 64 bit types.
  [XFS] Fix remount,readonly path to flush everything correctly.
  [XFS] Cleanup inode extent size hint extraction
  [XFS] Prevent ENOSPC from aborting transactions that need to succeed
  [XFS] Prevent deadlock when flushing inodes on unmount
  ...
2007-07-15 16:43:43 -07:00
Al Viro 7c9e3c2e3b wrong order of arguments of ->readdir()
Shows how many people are testing coda - the bug had been there for 5 years
and results of stepping on it are not subtle.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-15 16:40:51 -07:00
Eric Van Hensbergen 9e2f6688c0 9p: re-enable mount time debug option
During reorganization, the mount time debug option was removed in favor
of module-load-time parameters.  However, the mount time option is still
a useful for feature during debug and for user-fault isolation when the
module is compiled into the kernel.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:14:14 -05:00
Eric Van Hensbergen 9523a841b1 9p: cache meta-data when cache=loose
This patch expands the impact of the loose cache mode to allow for cached
metadata increasing the performance of directory listings and other metadata
read operations.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:14:08 -05:00
Latchesar Ionkov bd238fb431 9p: Reorganization of 9p file system code
This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p.
It moves the transport, packet marshalling and connection layers to net/9p
leaving only the VFS related files in fs/9p.  This work is being done in
preparation for in-kernel 9p servers as well as alternate 9p clients (other
than VFS).

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:13:40 -05:00
David Chinner 0f1145cc18 [XFS] Fix lockdep annotations for xfs_lock_inodes
SGI-PV: 967035
SGI-Modid: xfs-linux-melb:xfs-kern:29026a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 18:09:42 +10:00
Michal Marek faa63e9584 [XFS] Fix XFS_IOC_FSBULKSTAT{,_SINGLE} & XFS_IOC_FSINUMBERS in compat mode
* 32bit struct xfs_fsop_bulkreq has different size and layout of
members, no matter the alignment. Move the code out of the #else
branch (why was it there in the first place?). Define _32 variants of
the ioctl constants.
* 32bit struct xfs_bstat is different because of time_t and on
i386 because of different padding. Make xfs_bulkstat_one() accept a
custom "output formatter" in the private_data argument which takes care
of the xfs_bulkstat_one_compat() that takes care of the different
layout in the compat case.
* i386 struct xfs_inogrp has different padding.
Add a similar "output formatter" mecanism to xfs_inumbers().

SGI-PV: 967354
SGI-Modid: xfs-linux-melb:xfs-kern:29102a

Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 15:42:50 +10:00