dect
/
linux-2.6
Archived
13
0
Fork 0
Commit Graph

8644 Commits

Author SHA1 Message Date
Christoph Hellwig 126468b115 [XFS] kill xfs_rwlock/xfs_rwunlock
We can just use xfs_ilock/xfs_iunlock instead and get rid of the ugly
bhv_vrwlock_t.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30533a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:25 +10:00
Christoph Hellwig 43973964a3 [XFS] kill xfs_get_dir_entry
Instead of of xfs_get_dir_entry use a macro to get the xfs_inode from the
dentry in the callers and grab the reference manually.

Only grab the reference once as it's fine to keep it over the dmapi calls.
(And even that reference is actually superflous in Linux but I'll leave
that for another patch)

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30531a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:14 +10:00
Christoph Hellwig a8b3acd57e [XFS] vnode cleanup in xfs_fs_subr.c
Cleanup the unneeded intermediate vnode step in the flushing helpers and
go directly from the xfs_inode to the struct address_space.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30530a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:03 +10:00
Christoph Hellwig db0bb7baa1 [XFS] cleanup xfs_vn_mknod
- use proper goto based unwinding instead of the current mess of
  multiple conditionals
- rename ip to inode because that's the normal convention for Linux
  inodes while ip is the convention for xfs_inodes
- remove unlikely checks for the default_acl - branches marked unlikely
  might lead to extreme branch bredictor slowdons if taken and for some
  workloads a default acl is quite common
- properly indent the switch statements
- remove xfs_has_fs_struct as nfsd has a fs_struct in any semi-recent
  kernel

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30529a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:38:53 +10:00
David Chinner 155cc6b784 [XFS] Use atomics for iclog reference counting
Now that we update the log tail LSN less frequently on transaction
completion, we pass the contention straight to the global log state lock
(l_iclog_lock) during transaction completion.

We currently have to take this lock to decrement the iclog reference
count. there is a reference count on each iclog, so we need to take þhe
global lock for all refcount changes.

When large numbers of processes are all doing small trnasctions, the iclog
reference counts will be quite high, and the state change that absolutely
requires the l_iclog_lock is the except rather than the norm.

Change the reference counting on the iclogs to use atomic_inc/dec so that
we can use atomic_dec_and_lock during transaction completion and avoid the
need for grabbing the l_iclog_lock for every reference count decrement
except the one that matters - the last.

SGI-PV: 975671
SGI-Modid: xfs-linux-melb:xfs-kern:30505a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:38:10 +10:00
David Chinner b589334c7a [XFS] Prevent AIL lock contention during transaction completion
When hundreds of processors attempt to commit transactions at the same
time, they can contend on the AIL lock when updating the tail LSN held in
the in-core log structure.

At the moment, the tail LSN is only needed when actually writing out an
iclog, so it really does not need to be updated on every single
transaction completion - only those that result in switching iclogs and
flushing them to disk.

The result is that we reduce the number of times we need to grab the AIL
lock and the log grant lock by up to two orders of magnitude on large
processor count machines. The problem has previously been hidden by AIL
lock contention walking the AIL list which was recently solved and
uncovered this issue.

SGI-PV: 975671
SGI-Modid: xfs-linux-melb:xfs-kern:30504a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:38:01 +10:00
David Chinner 3354040897 [XFS] Use xfs_inode_clean() in more places
Remove open coded checks for the whether the inode is clean and replace
them with an inlined function.

SGI-PV: 977461
SGI-Modid: xfs-linux-melb:xfs-kern:30503a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:51 +10:00
David Chinner bad5584332 [XFS] Remove the xfs_icluster structure
Remove the xfs_icluster structure and replace with a radix tree lookup.

We don't need to keep a list of inodes in each cluster around anymore as
we can look them up quickly when we need to. The only time we need to do
this now is during inode writeback.

Factor the inode cluster writeback code out of xfs_iflush and convert it
to use radix_tree_gang_lookup() instead of walking a list of inodes built
when we first read in the inodes.

This remove 3 pointers from each xfs_inode structure and the xfs_icluster
structure per inode cluster. Hence we reduce the cache footprint of the
xfs_inodes by between 5-10% depending on cluster sparseness.

To be truly efficient we need a radix_tree_gang_lookup_range() call to
stop searching once we are past the end of the cluster instead of trying
to find a full cluster's worth of inodes.

Before (ia64):

$ cat /sys/slab/xfs_inode/object_size 536

After:

$ cat /sys/slab/xfs_inode/object_size 512

SGI-PV: 977460
SGI-Modid: xfs-linux-melb:xfs-kern:30502a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:41 +10:00
David Chinner a3f74ffb6d [XFS] Don't block pdflush when writing back inodes
When pdflush is writing back inodes, it can get stuck on inode cluster
buffers that are currently under I/O. This occurs when we write data to
multiple inodes in the same inode cluster at the same time.

Effectively, delayed allocation marks the inode dirty during the data
writeback. Hence if the inode cluster was flushed during the writeback of
the first inode, the writeback of the second inode will block waiting for
the inode cluster write to complete before writing it again for the newly
dirtied inode.

Basically, we want to avoid this from happening so we don't block pdflush
and slow down all of writeback. Hence we introduce a non-blocking async
inode flush flag that pdflush uses. If this flag is set, we use
non-blocking operations (e.g. try locks) whereever we can to avoid
blocking or extra I/O being issued.

SGI-PV: 970925
SGI-Modid: xfs-linux-melb:xfs-kern:30501a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:32 +10:00
David Chinner 4ae29b4321 [XFS] Factor xfs_itobp() and xfs_inotobp().
The only difference between the functions is one passes an inode for the
lookup, the other passes an inode number. However, they don't do the same
validity checking or set all the same state on the buffer that is returned
yet they should.

Factor the functions into a common implementation.

SGI-PV: 970925
SGI-Modid: xfs-linux-melb:xfs-kern:30500a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:19 +10:00
Lachlan McIlroy e9a56b7cda [XFS] Fix regression due to refcache removal
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30490a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
2008-04-18 11:37:06 +10:00
Donald Douwsma 163d3686bb [XFS] Remove the xfs_refcache
Remove the xfs_refcache, it was only needed while we were still
building for 2.4 kernels.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30472a

Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:36:55 +10:00
Lachlan McIlroy 461aa8a225 [XFS] make inode reclaim synchronise with xfs_iflush_done()
On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode.
If the inode flush lock is not already held and there is an outstanding
xfs_iflush_done() then we might free the inode prematurely. By acquiring
and releasing the flush lock we will synchronise with xfs_iflush_done().

SGI-PV: 909874
SGI-Modid: xfs-linux-melb:xfs-kern:30468a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-04-18 11:34:54 +10:00
Niv Sardi e12070a5dc [XFS] actually check error returned by xfs_flush_pages, clean up and
bailout if fails.

SGI-PV: 973041
SGI-Modid: xfs-linux-melb:xfs-kern:30462a

Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:34:47 +10:00
Steve French 675c46796d [CIFS] Add various missing flags and defintions
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-04-17 23:41:01 +00:00
Steve French 20e673810c Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-04-17 23:38:45 +00:00
Bob Copeland f845fced91 udf: use crc_itu_t from lib instead of udf_crc
As pointed out by Sergey Vlasov, UDF implements its own version of
the CRC ITU-T V.41.  Convert it to use the one in the library.

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Cc: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:29:56 +02:00
Sebastian Manciulea 706047a797 udf: Fix compilation warnings when UDF debug is on
Fix two compilation warnings (and actual bugs in message formatting)
when UDF debugging is turned on.

Signed-off-by: Sebastian Manciulea <manciuleas@yahoo.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:29:48 +02:00
Sebastian Manciulea 47c9358a01 udf: Fix bug in VAT mapping code
Fix mapping of blocks using VAT when it is stored in an inode.
UDF_I(inode)->i_data already points to the beginning of VAT header so there's
no need to add udf_ext0_offset(inode).

Signed-off-by: Sebastian Manciulea <manciuleas@yahoo.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:29:43 +02:00
Jan Kara bfb257a598 udf: Add read-only support for 2.50 UDF media
This patch implements parsing of metadata partitions and reading of Metadata
File thus allowing to read UDF 2.50 media. Error resilience is implemented
through accessing the Metadata Mirror File in case the data the Metadata File
cannot be read. The patch is based on the original patch by Sebastian Manciulea
<manciuleas@yahoo.com> and Mircea Fedoreanu <mirceaf_spl@yahoo.com>.

Signed-off-by: Sebastian Manciulea <manciuleas@yahoo.com>
Signed-off-by: Mircea Fedoreanu <mirceaf_spl@yahoo.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:29:36 +02:00
Sebastian Manciulea f4bcbbd92e udf: Fix handling of multisession media
According to OSTA UDF specification, only anchor blocks and primary volume
descriptors are placed on media relative to the last session. All other block
numbers are absolute (in the partition or the whole media). This seems to be
confirmed by multisession media created by other systems.

Signed-off-by: Sebastian Manciulea <manciuleas@yahoo.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:28:33 +02:00
Jan Kara 96200be307 udf: Mount filesystem read-only if it has pseudooverwrite partition
As we don't properly support writing to pseudooverwrite partition (we should
add entries to VAT and relocate blocks instead of just writing them), mount
filesystems with such partition as read-only.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:28:14 +02:00
Jan Kara fa5e081563 udf: Handle VAT packed inside inode properly
We didn't handle VAT packed inside the inode - we tried to call udf_block_map()
on such file which lead to strange results at best. Add proper handling of
packed VAT as we do it with other packed files.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:25:35 +02:00
Jan Kara 742e1795e2 udf: Allow loading of VAT inode
UDF media with VAT could have never worked because udf_fill_inode() didn't
know about case FILE_TYPE_VAT20. Fix this.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:41 +02:00
Jan Kara c82a127505 udf: Fix detection of VAT version
We incorrectly (way to strictly) checked version of VAT on loading and thus
refuse to mount correct media.  There are just two format versions - below 2.0
and above 2.0 and we understand both. So update the version check accordingly.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:39 +02:00
Jan Kara 4f7874c868 udf: Silence warning about accesses beyond end of device
Some of the computed positions of anchor block could be beyond the end of
device. Skip reading such blocks.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:17 +02:00
Jan Kara 5fb28aa25a udf: Improve anchor block detection
Add <last block>+1 and <last block>-1 to a list of blocks which can be the
real last recorded block on a UDF media. Sebastian Manciulea
<manciuleas@yahoo.com> claims this helps some drive + media combinations
he is able to test.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:13 +02:00
Jan Kara 423cf6dc04 udf: Cleanup anchor block detection.
UDF anchor block detection is complicated by several things - there are several
places where the anchor point can be, some of them relative to the last
recorded block which some devices report wrongly. Moreover some devices on some
media seem to have 7 spare blocks sectors for every 32 blocks (at least as far
as I understand the old code) so we have to count also with that possibility.

This patch splits anchor block detection into several functions so that it is
clearer what we actually try to do. We fix several bugs of the type "for such
and such media, we fail to check block blah" as a result of the cleanup.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:10 +02:00
Jan Kara 38b74a53e5 udf: Move processing of virtual partitions
This patch move processing of UDF virtual partitions close to the place
where other partition types are processed. As a result we now also
properly fill in partition access type.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:04 +02:00
Jan Kara 3fb38dfa0e udf: Move filling of partition descriptor info into a separate function
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:04 +02:00
Jan Kara 2e0838fd0c udf: Improve error recovery on mount
Report error when we fail to allocate memory for a bitmap and properly
release allocated memory and inodes for all the partitions in case of
mount failure and umount.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:04 +02:00
Jan Kara c0eb31ed13 udf: Cleanup volume descriptor sequence processing
Cleanup processing of volume descriptor sequence so that it is more readable,
make code handle errors (e.g. media problems) better.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:04 +02:00
Pavel Emelyanov d0db181c07 udf: fix anchor point detection
According to ECMA 167 rev.  3 (see 3/8.4.2.1), Anchor Volume Descriptor
Pointer should be recorded at two or more anchor points located at sectors
256, N, N - 256, where N - is a largest logical sector number at volume
space.

So we should always try to detect N on UDF volume before trying to find
Anchor Volume Descriptor (i.e.  calling to udf_find_anchor()).

That said, all this patch does is updates the s_last_block even if the
udf_vrs() returns positive value.

Originally written and tested by Yuri Per, ported on latest mainline by me.

Signed-off-by: Yuri Per <Yuri.Per@acronis.com>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Max Lyadvinsky <Max.Lyadvinsky@acronis.com>
Cc: Vladimir Simonov <Vladimir.Simonov@acronis.com>
Cc: Andrew Neporada <Andrew.Neporada@acronis.com>
Cc: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:04 +02:00
Jan Kara b80697c14d udf: Remove declarations of arrays of size UDF_NAME_LEN (256 bytes)
There are several places in UDF where we declared temporary arrays of
UDF_NAME_LEN bytes on stack. This is not nice to stack usage so this patch
changes those places to use kmalloc() instead. Also clean up bail-out paths
in those functions when we are changing them.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:04 +02:00
Jan Kara 9bf2c6b834 udf: Remove checking of existence of filename in udf_add_entry()
We don't have to check whether a directory entry already exists in a directory
when creating a new one since we've already checked that earlier by lookup and
we are holding directory i_mutex all the time.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:23:03 +02:00
Jan Kara 200a3592cd udf: Mark udf_process_sequence() as noinline
Mark udf_process_sequence() as noinline since stack usage is terrible
otherwise.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:49 +02:00
Marcin Slusarz 165923fa45 udf: super.c reorganization
reorganize few code blocks in super.c which
were needlessly indented (and hard to read):

so change from:
rettype fun()
{
	init;
	if (sth) {
		long block of code;
	}
}

to:
rettype fun()
{
	init;
	if (!sth)
		return;
	long block of code;
}

or

from:
rettype fun2()
{
	init;
	while (sth) {
		init2();
		if (sth2) {
			long block of code;
		}
	}
}

to:
rettype fun2()
{
	init;
	while (sth) {
		init2();
		if (!sth2)
			continue;
		long block of code;
	}
}

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:45 +02:00
Marcin Slusarz af15a298a4 udf: remove unneeded kernel_timestamp type
remove now unneeded kernel_timestamp type with conversion functions

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:42 +02:00
Marcin Slusarz 56774805d5 udf: convert udf_stamp_to_time and udf_time_to_stamp to use timestamps
* kernel_timestamp type was almost unused - only callers of udf_stamp_to_time
and udf_time_to_stamp used it, so let these functions handle endianness
internally and don't clutter code with conversions

* rename udf_stamp_to_time to udf_disk_stamp_to_time
  and udf_time_to_stamp to udf_time_to_disk_stamp

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:29 +02:00
marcin.slusarz@gmail.com cbf5676a0e udf: convert udf_stamp_to_time to return struct timespec
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:29 +02:00
marcin.slusarz@gmail.com c87e8e90d0 udf: create function for conversion from timestamp to timespec
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:29 +02:00
marcin.slusarz@gmail.com f18f17b033 udf: udf_get_block, inode_bmap - remove unneeded checks
block cannot be less than 0, because it's sector_t,
so remove unneeded checks

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:29 +02:00
Marcin Slusarz 01b954a36a udf: convert udf_count_free_bitmap to use bitmap_weight
replace handwritten bits counting with bitmap_weight

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:29 +02:00
marcin.slusarz@gmail.com d652eefb70 udf: replace udf_*_offset macros with functions
- translate udf_file_entry_alloc_offset macro into function
- translate udf_ext0_offset macro into function
- add comment about crypticly named fields in struct udf_inode_info

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:29 +02:00
marcin.slusarz@gmail.com 1ab9278570 udf: simplify __udf_read_inode
- move all brelse(ibh) after main if, because it's called
  on every path except one where ibh is null
- move variables to the most inner blocks

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:28 +02:00
marcin.slusarz@gmail.com c2104fda5e udf: replace all adds to little endians variables with le*_add_cpu
replace all:
	little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
	                                    expression_in_cpu_byteorder);
with:
	leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);
sparse didn't generate any new warning with this patch

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:28 +02:00
marcin.slusarz@gmail.com 456390de46 udf: truncate: create function for updating of Allocation Ext Descriptor
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:28 +02:00
marcin.slusarz@gmail.com 9de90b76eb udf: simple cleanup of truncate.c
- remove one indentation level by little code reorganization
- convert "if (smth) BUG();" to "BUG_ON(smth);"

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:28 +02:00
marcin.slusarz@gmail.com c8ed837d37 udf: constify crc
- constify internal crc table
- mark udf_crc "in" parameter as const

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:24 +02:00
marcin.slusarz@gmail.com 34f953ddfd udf: udf_CS0toNLS cleanup
- fix error handling - always zero output variable
- don't zero explicitely fields zeroed by memset
- mark "in" paramater as const

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:24 +02:00
Marcin Slusarz 6305a0a9d5 udf: fix udf_build_ustr
udf_build_ustr was broken:

- size == 1:
    dest->u_len = ptr[1 - 1], but at ptr[0] there's cmpID,
    so we created string with wrong length
    it should not happen, so we BUG() it
- size > 1 and size < UDF_NAME_LEN:
    we set u_len correctly, but memcpy copied one needless byte
- size == UDF_NAME_LEN - 1:
    memcpy overwrited u_len - with correct value, but...
- size >= UDF_NAME_LEN:
    we copied UDF_NAME_LEN - 1 bytes, but dest->u_name is array
    of UDF_NAME_LEN - 2 bytes, so we were overwriting u_len with
    character from input string

nobody noticed because all callers set size
to acceptable values (constants within range)

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:24 +02:00
marcin.slusarz@gmail.com 79cfe0ff5f udf: udf_CS0toUTF8 cleanup
- fix error handling - always zero output variable
- don't zero explicitely fields zeroed by memset
- mark "in" paramater as const
- remove outdated comment

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:23 +02:00
Adrian Bunk b8145a7697 make udf_error() static
This patch makes the needlessly global udf_error() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:23 +02:00
Julia Lawall 8dee00bb75 fs/udf: Use DIV_ROUND_UP
The kernel.h macro DIV_ROUND_UP performs the computation (((n) + (d) - 1) /
(d)) but is perhaps more readable.

An extract of the semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@haskernel@
@@

#include <linux/kernel.h>

@depends on haskernel@
expression n,d;
@@

(
- (n + d - 1) / d
+ DIV_ROUND_UP(n,d)
|
- (n + (d - 1)) / d
+ DIV_ROUND_UP(n,d)
)

@depends on haskernel@
expression n,d;
@@

- DIV_ROUND_UP((n),d)
+ DIV_ROUND_UP(n,d)

@depends on haskernel@
expression n,d;
@@

- DIV_ROUND_UP(n,(d))
+ DIV_ROUND_UP(n,d)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:23 +02:00
Christoph Hellwig 15aebd2866 udf: move headers out include/linux/
There's really no reason to keep udf headers in include/linux as they're
not used by anything but fs/udf/.

This patch merges most of include/linux/udf_fs_i.h into fs/udf/udf_i.h,
include/linux/udf_fs_sb.h into fs/udf/udf_sb.h and
include/linux/udf_fs.h into fs/udf/udfdecl.h.

The only thing remaining in include/linux/ is a stub of udf_fs_i.h
defining the four user-visible udf ioctls.  It's also moved from
unifdef-y to headers-y because it can be included unconditionally now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:23 +02:00
Christoph Hellwig b1e321266d udf: kill useless file header comments for vfs method implementations
There's not need to document vfs method invocation rules, we have
Documentation/filesystems/vfs.txt and Documentation/filesystems/Locking
for that.  Also a lot of these comments where either plain wrong or
horrible out of date.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:23 +02:00
Christoph Hellwig f1f73ba8e9 udf: kill udf_set_blocksize
This helper has been quite useless since sb_min_blocksize was introduced
and is misnamed while we're at it.  Just opencode the few lines in the
caller instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2008-04-17 14:22:23 +02:00
Paul Bolle 424b00e2c0 AFS: Do not describe debug parameters with their value
Describe debug parameters with their names (and not their values).

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-16 07:43:48 -07:00
Steve French 8d142137b4 [CIFS] make cifs_dfs_automount_list_static
This patch makes the needlessly global cifs_dfs_automount_list static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-04-16 03:56:51 +00:00
Jan Kara 335e92e8a5 vfs: fix possible deadlock in ext2, ext3, ext4 when using xattrs
mb_cache_entry_alloc() was allocating cache entries with GFP_KERNEL.  But
filesystems are calling this function while holding xattr_sem so possible
recursion into the fs violates locking ordering of xattr_sem and transaction
start / i_mutex for ext2-4.  Change mb_cache_entry_alloc() so that filesystems
can specify desired gfp mask and use GFP_NOFS from all of them.

Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Dave Jones <davej@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-15 19:35:41 -07:00
Steve French 5d941ca628 [CIFS] Fix oops when slow oplock process races with unmount
If a tcon is being freed in call tconInfoFree, clean up any entries that may
exist in global oplock queue as the tcon structure hanging off of those entries
will be invalid and can cause oops while accesing any elements in the
tcon structure.

Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-04-15 18:40:48 +00:00
Steve French e48d199ba1 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-04-15 18:38:29 +00:00
Alexey Korolev abe2f41430 JFFS2 Fix of panics caused by wrong condition for hole frag creation in write_begin
This fixes a regression introduced in commit
205c109a7a when switching to
write_begin/write_end operations in JFFS2.

The page offset is miscalculated, leading to corruption of the fragment
lists and subsequently to memory corruption and panics.

[ Side note: the bug is a fairly direct result of the naming.  Nick was
  likely misled by the use of "offs", since we tend to use the notion of
  "offset" not as an absolute position, but as an offset _within_ a page
  or allocation.

  Alternatively, a "pgoff_t" is a page index, but not a byte offset -
  our VM naming can be a bit confusing.

  So in this case, a VM person would likely have called this a "pos",
  not an "offs", or perhaps talked about byte offsets rather than page
  offsets (since it's counted in bytes, not pages).    - Linus ]

Signed-off-by: Alexey Korolev <akorolev@infradead.org>
Signed-off-by: Vasiliy Leonenko <vasiliy.leonenko@mail.ru>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-14 15:43:14 -07:00
J. Bruce Fields 19e729a928 locks: fix possible infinite loop in fcntl(F_SETLKW) over nfs
Miklos Szeredi found the bug:

	"Basically what happens is that on the server nlm_fopen() calls
	nfsd_open() which returns -EACCES, to which nlm_fopen() returns
	NLM_LCK_DENIED.

	"On the client this will turn into a -EAGAIN (nlm_stat_to_errno()),
	which in will cause fcntl_setlk() to retry forever."

So, for example, opening a file on an nfs filesystem, changing
permissions to forbid further access, then trying to lock the file,
could result in an infinite loop.

And Trond Myklebust identified the culprit, from Marc Eshel and I:

	7723ec9777 "locks: factor out
	generic/filesystem switch from setlock code"

That commit claimed to just be reshuffling code, but actually introduced
a behavioral change by calling the lock method repeatedly as long as it
returned -EAGAIN.

We assumed this would be safe, since we assumed a lock of type SETLKW
would only return with either success or an error other than -EAGAIN.
However, nfs does can in fact return -EAGAIN in this situation, and
independently of whether that behavior is correct or not, we don't
actually need this change, and it seems far safer not to depend on such
assumptions about the filesystem's ->lock method.

Therefore, revert the problematic part of the original commit.  This
leaves vfs_lock_file() and its other callers unchanged, while returning
fcntl_setlk and fcntl_setlk64 to their former behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-14 12:22:14 -07:00
Linus Torvalds 14897e35fd Merge branch 'docs' of git://git.lwn.net/linux-2.6
* 'docs' of git://git.lwn.net/linux-2.6:
  Add additional examples in Documentation/spinlocks.txt
  Move sched-rt-group.txt to scheduler/
  Documentation: move rpc-cache.txt to filesystems/
  Documentation: move nfsroot.txt to filesystems/
  Spell out behavior of atomic_dec_and_lock() in kerneldoc
  Fix a typo in highres.txt
  Fixes to the seq_file document
  Fill out information on patch tags in SubmittingPatches
  Add the seq_file documentation
2008-04-11 13:24:16 -07:00
J. Bruce Fields 6ded55da6b Documentation: move nfsroot.txt to filesystems/
Documentation/ is a little large, and filesystems/ seems an obvious
place for this file.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-04-11 13:18:01 -06:00
Davide Libenzi 0859ab59a8 signalfd: fix for incorrect SI_QUEUE user data reporting
Michael Kerrisk found out that signalfd was not reporting back user data
pushed using sigqueue:

  http://groups.google.com/group/linux.kernel/msg/9397cab8551e3123

The following patch makes signalfd report back the ssi_ptr and ssi_int members
of the signalfd_siginfo structure.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Acked-by: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:06:44 -07:00
Davide Libenzi 8d1c98b0b5 eventfd/kaio integration fix
Jeff Roberson discovered a race when using kaio eventfd based notifications.
When it occurs it can lead tomissed wakeups and hung userspace.

This patch fixes the race by moving the notification inside the spinlocked
section of kaio.  The operation is safe since eventfd spinlock and kaio one
are unrelated.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Jeff Roberson <jroberson@chesapeake.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:06:43 -07:00
Roland McGrath 598af051a7 asmlinkage_protect sys_io_getevents
Use asmlinkage_protect in sys_io_getevents, because GCC for i386 with
CONFIG_FRAME_POINTER=n can decide to clobber an argument word on the
stack, i.e. the user struct pt_regs.  Here the problem is not a tail
call, but just the compiler's use of the stack when it inlines and
optimizes the body of the called function.  This seems to avoid it.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 17:28:26 -07:00
Roland McGrath 54a0151041 asmlinkage_protect replaces prevent_tail_call
The prevent_tail_call() macro works around the problem of the compiler
clobbering argument words on the stack, which for asmlinkage functions
is the caller's (user's) struct pt_regs.  The tail/sibling-call
optimization is not the only way that the compiler can decide to use
stack argument words as scratch space, which we have to prevent.
Other optimizations can do it too.

Until we have new compiler support to make "asmlinkage" binding on the
compiler's own use of the stack argument frame, we have work around all
the manifestations of this issue that crop up.

More cases seem to be prevented by also keeping the incoming argument
variables live at the end of the function.  This makes their original
stack slots attractive places to leave those variables, so the compiler
tends not clobber them for something else.  It's still no guarantee, but
it handles some observed cases that prevent_tail_call() did not.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 17:28:26 -07:00
Linus Torvalds 5d69a029ab Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
  [XFS] Ensure "both" features2 slots are consistent
  [XFS] Fix superblock features2 field alignment problem
  [XFS] remove shouting-indirection macros from xfs_sb.h
2008-04-10 13:39:29 -07:00
Linus Torvalds 999646e3f9 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  cfq-iosched: do not leak ioc_data across iosched switches
  splice: fix infinite loop in generic_file_splice_read()
2008-04-10 13:39:07 -07:00
Roman Zippel 76b0c26af2 HFS+: fix unlink of links
Some time ago while attempting to handle invalid link counts, I botched
the unlink of links itself, so this patch fixes this now correctly, so
that only the link count of nodes that don't point to links is ignored.
Thanks to Vlado Plaga <rechner@vlado-do.de> to notify me of this
problem.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-10 13:37:51 -07:00
Josef Bacik 16c5f06f15 [GFS2] fix GFP_KERNEL misuses
There are several places where GFP_KERNEL allocations happen under a glock,
which will result in hangs if we're under memory pressure and go to re-enter the
fs in order to flush stuff out.  This patch changes the culprits to GFS_NOFS to
keep this problem from happening.  Thank you,

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-04-10 09:55:26 +01:00
Eric Sandeen e6957ea484 [XFS] Ensure "both" features2 slots are consistent
Since older kernels may look in the sb_bad_features2 slot for flags,
rather than zeroing it out on fixup, we should make it equal to the
sb_features2 value.

Also, if the ATTR2 flag was not found prior to features2 fixup, it was not
set in the mount flags, so re-check after the fixup so that the current
session will use the feature.

Also fix up the comments to reflect these changes.

SGI-PV: 980085
SGI-Modid: xfs-linux-melb:xfs-kern:30778a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-10 16:25:26 +10:00
David Chinner ee1c090825 [XFS] Fix superblock features2 field alignment problem
Due to the xfs_dsb_t structure not being 64 bit aligned, the last field of
the on-disk superblock can vary in location This causes problems when the
filesystem gets moved to a different platform, or there is a 32 bit
userspace and 64 bit kernel.

This patch detects the defect at mount time, logs a warning such as:

XFS: correcting sb_features alignment problem

in dmesg and corrects the problem so that everything is OK. it also
blacklists the bad field in the superblock so it does not get used for
something else later on.

SGI-PV: 977636
SGI-Modid: xfs-linux-melb:xfs-kern:30539a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-10 16:25:15 +10:00
Eric Sandeen 6211870992 [XFS] remove shouting-indirection macros from xfs_sb.h
Remove macro-to-small-function indirection from xfs_sb.h, and remove some
which are completely unused.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30528a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-10 16:24:45 +10:00
Jens Axboe 8191ecd1d1 splice: fix infinite loop in generic_file_splice_read()
There's a quirky loop in generic_file_splice_read() that could go
on indefinitely, if the file splice returns 0 permanently (and not
just as a temporary condition). Get rid of the loop and pass
back -EAGAIN correctly from __generic_file_splice_read(), so we
handle that condition properly as well.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-10 08:24:25 +02:00
Steve French cce246ee5f [CIFS] Fix acl length when very short ACL being modified by chmod
Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-04-09 20:55:31 +00:00
Steve French 35028d7111 [CIFS] Fix looping on reconnect to Samba when unexpected tree connect fail on reconnect
Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-04-09 20:32:42 +00:00
Bryan Wu 240ee83118 fix bug - executing FDPIC ELF on NFS mount triggers BUG() at mm/nommu.c:862:/do_mmap_private()
NFS needs a NOMMU version mmap function to support uClinux on NOMMU machine
http://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=3992

Signed-off-by: Bryan Wu <cooloney@kernel.org>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-08 21:06:56 -04:00
Jeff Layton 66d3aac041 NFS: initialize flags field in nfs_open_context
The nfs_open_context struct had a "flags" field added recently, but the
allocator isn't initializing it. It also looks like the allocator isn't
initializing the mode or list either, but they seem to be overwritten
by the caller, so that's less of an issue.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-08 21:06:53 -04:00
Steve French 932e2d23c8 [CIFS] minor update to change log
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-04-04 21:59:35 +00:00
Linus Torvalds 1be62dc190 Be more careful about marking buffers dirty
Mikulas Patocka noted that the optimization where we check if a buffer
was already dirty (and we avoid re-dirtying it) was not really SMP-safe.

Since the read of the old status was not synchronized with anything, an
aggressive CPU re-ordering of memory accesses might have moved that read
up to before the data was even written to the buffer, and another CPU
that cleaned it again, causing the newly dirty state to never actually
hit the disk.

Admittedly this would probably never trigger in practice, but it's still
wrong.

Mikulas sent a patch that fixed the problem, but I dislike the subtlety
of the whole optimization, so this is an alternate fix that is more
explicit about the particular SMP ordering for the optimization, and
separates out the speculative reads of the buffer state into its own
conditional (and makes the memory barrier only happen if we are likely
to actually hit the optimized case in the first place).

I considered removing the optimization entirely, but Andrew argued for
it's continued existence. I'm a push-over.

Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-04 14:38:17 -07:00
Sven Schnelle ad16df848d afs: remove smp_prcessor_id() from debug macro
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-03 15:40:53 -07:00
Hugh Dickins 4cd1350465 splice: use mapping_gfp_mask
The loop block driver is careful to mask __GFP_IO|__GFP_FS out of its
mapping_gfp_mask, to avoid hangs under memory pressure.  But nowadays
it uses splice, usually going through __generic_file_splice_read.  That
must use mapping_gfp_mask instead of GFP_KERNEL to avoid those hangs.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-03 15:39:49 -07:00
David S. Miller 3bb5da3837 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-04-03 14:33:42 -07:00
Robert P. J. Day 865965a66e efs: update error msg to not refer to deleted read_inode()
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 15:28:19 -07:00
Sven Schnelle a5f37c3252 afs: add missing up_write() on return
If afs_cell_alloc() fails, afs_cells_sem doesn't get unlocked, which
leads to a deadlock.  Unlock it before returning.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 07:40:54 -07:00
Denis V. Lunev 856f6ff7a3 [NETNS]: Remove ifdef CONFIG_NET braces in fs/proc/proc_net.c.
They are redundant as this file is linked in iff CONFIG_NET is turned
on.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-02 00:10:04 -07:00
Julia Lawall 773adff8e9 [GFS2] test for IS_ERR rather than 0
The function gfs2_inode_lookup always returns either a valid pointer or a
value made with ERR_PTR, so its result should be tested with IS_ERR, not
with a test for 0.

The problem was found using the following semantic match.
(http://www.emn.fr/x-info/coccinelle/)

//<smpl>
@a@
expression E, E1;
statement S,S1;
position p;
@@

E = gfs2_inode_lookup(...)
... when != E = E1
if@p (E) S else S1

@n@
position a.p;
expression E,E1;
statement S,S1;
@@

E = NULL
... when != E = E1
if@p (E) S else S1

@depends on !n@
expression E;
statement S,S1;
position a.p;
@@

* if@p (E)
  S else S1
//</smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:46 +01:00
Benjamin Marzinski 58e9fee13e [GFS2] Invalidate cache at correct point
GFS2 wasn't invalidating its cache before it called into the lock manager
with a request that could potentially drop a lock.  This was leaving a
window where the lock could be actually be held by another node, but the
file's page cache would still appear valid, causing coherency problems.
This patch moves the cache invalidation to before the lock manager call
when dropping a lock. It also adds the option to the lock_dlm lock
manager to not use conversion mode deadlock avoidance, which, on a
conversion from shared to exclusive, could internally drop the lock, and
then reacquire in. GFS2 now asks lock_dlm to not do this.  Instead, GFS2
manually drops the lock and reacquires it.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:44 +01:00
akpm@linux-foundation.org f5a8cd0201 [GFS2] fs/gfs2/recovery.c: suppress warnings
fs/gfs2/recovery.c: In function 'get_log_header':
fs/gfs2/recovery.c:152: warning: 'lh.lh_sequence' may be used uninitialized in this function
fs/gfs2/recovery.c:152: warning: 'lh.lh_flags' may be used uninitialized in this function
fs/gfs2/recovery.c:152: warning: 'lh.lh_tail' may be used uninitialized in this function
fs/gfs2/recovery.c:152: warning: 'lh.lh_blkno' may be used uninitialized in this function
fs/gfs2/recovery.c:152: warning: 'lh.lh_hash' may be used uninitialized in this function

Cc: David Teigland <teigland@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:41 +01:00
Bob Peterson 1f466a47e8 [GFS2] Faster gfs2_bitfit algorithm
This version of the gfs2_bitfit algorithm includes the latest
suggestions from Steve Whitehouse.  It is typically eight to
ten times faster than the version we're using today.  If there
is a lot of metadata mixed in (lots of small files) the
algorithm is often 15 times faster, and given the right
conditions, I've seen peaks of 20 times faster.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:39 +01:00
Steven Whitehouse d82661d969 [GFS2] Streamline quota lock/check for no-quota case
This patch streamlines the quota checking in the "no quota" case by
making the check inline in the calling function, thus reducing the
number of function calls. Eventually we might be able to remove the
checks from the gfs2_quota_lock() and gfs2_quota_check() functions, but
currently we can't as there are a very few places in the code which need
to call these functions directly still.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>
2008-03-31 10:41:36 +01:00
Steven Whitehouse 860b25d4a9 [GFS2] Remove drop of module ref where not needed
In an earlier patch "[GFS2] fix file_system_type leak on gfs2meta mount"
we removed the code to grab a ref to the module which was not needed
(since we know that the module cannot be unloaded at that time) so
this patch removes the code to drop that reference.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:33 +01:00
Abhijith Das 20b95bf2c4 [GFS2] gfs2_adjust_quota has broken unstuffing code
This patch combines the 2 patches in bug 434736 to correct the lock
ordering in the unstuffing of the quota inode in gfs2_adjust_quota and
adjusting the number of revokes in gfs2_write_jdata_pagevec

Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:30 +01:00
Cyrill Gorcunov 182fe5abd8 [GFS2] possible null pointer dereference fixup
gfs2_alloc_get may fail so we have to check it to prevent
NULL pointer dereference.

Signed-off-by: Cyrill Gorcunov <gorcunov@gamil.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:28 +01:00
Steven Whitehouse 105284970b [GFS2] Need to ensure that sector_t is 64bits for GFS2
We need to ensure that sector_t is 64bits for GFS2, so that we need to
depend on LBD as well as LSF.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:25 +01:00
Denis Cheng 43a33c53cc [GFS2] re-support special inode
a previous commit removed call to
init_special_inode from inode lookuping, this cause problems as:

 # mknod /mnt/gfs2/dev/null c 1 3
 # cat /mnt/gfs2/dev/null
 cat: /mnt/gfs2/dev/null: Invalid argument

without special inode, GFS2 cannot support char device file,
block device file, fifo pipe, and socket file, lose many important
features as a common file system.

this one line patch re add special inode support.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:22 +01:00
Denis Cheng d83225d45d [GFS2] remove gfs2_dev_iops
struct inode_operations gfs2_dev_iops is always the same as gfs2_file_iops,
since Jan 2006, when GFS2 merged into mainstream kernel.

So one of them could be removed.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:20 +01:00
Christoph Hellwig 7dc2cf1c8f [GFS2] fix file_system_type leak on gfs2meta mount
get_gfs2_sb does a get_fs_type without doing a put_filesystem and
thus leaking a file_system_type reference everytime it's called.

Just use gfs2_fs_type directly instead of doing the lookup and thus
fix the problem.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:17 +01:00
Steven Whitehouse 9b8c81d1de [GFS2] Allow bmap to allocate extents
We've supported mapping of extents when no block allocation is required
for some time. This patch extends that to mapping of extents when an
allocation has been requested. In that case we try to allocate as many
blocks as are requested, but we might return fewer in case there is
something preventing us from returning the complete amount (e.g. an
already allocated block is in the way).

Currently the only code path which can actually request multiple data
blocks in a single bmap call is the page_mkwrite path and even then it
only happens if there are multiple blocks per page. What this patch does
do however, is merge the allocation requests for metadata (growing the
metadata tree in either height or depth) with the allocation of the data
blocks in the case that both are needed. This results in lower overheads
even in the single block allocation case.

The one thing which we can't handle here at the moment is unstuffing. I
would like to be able to do that, but the problem which arises is that
in order to unstuff one has to get a locked page from the page cache
which results in locking problems in the (usual) case that the caller is
holding the page lock on the page it wishes to map. So that case will
have to be addressed in future patches.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:14 +01:00
Steven Whitehouse 7afd88d916 [GFS2] Fix a page lock / glock deadlock
We've previously been using a "try lock" in readpage on the basis that
it would prevent deadlocks due to the inverted lock ordering (our normal
lock ordering is glock first and then page lock). Unfortunately tests
have shown that this isn't enough. If the glock has a demote request
queued such that run_queue() in the glock code tries to do a demote when
its called under readpage then it will try and write out all the dirty
pages which requires locking them. This then deadlocks with the page
locked by readpage.

The solution is to always require two calls into readpage. The first
unlocks the page, gets the glock and returns AOP_TRUNCATED_PAGE, the
second does the actual readpage and unlocks the glock & page as
required.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:12 +01:00
Adrian Bunk 60b779cfc1 [GFS2] proper extern for gfs2/locking/dlm/mount.c:gdlm_ops
This patch adds a proper extern declaration for gdlm_ops in
fs/gfs2/locking/dlm/lock_dlm.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:09 +01:00
Adrian Bunk 8af4c72f7d [GFS2] gfs2/ops_file.c should #include "ops_inode.h"
Every file should include the headers containing the prototypes for
its global functions (in this case for gfs2_set_inode_flags()).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:06 +01:00
Marcin Slusarz bb16b342b2 [GFS2] be*_add_cpu conversion
replace all:
big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) +
					expression_in_cpu_byteorder);
with:
	beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder);
generated with semantic patch

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:03 +01:00
Steven Whitehouse 840ca0ec70 [GFS2] Fix bug where we called drop_bh incorrectly
As a result of an earlier patch, drop_bh was being called in cases
when it shouldn't have been. Since we never have a gh in the drop
case and we always have a gh in the promote case, we can use that
extra information to tell which case has been seen.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
2008-03-31 10:41:01 +01:00
Steven Whitehouse e23159d2a7 [GFS2] Get inode buffer only once per block map call
In the case that we needed to grow the height of the metadata tree
we were looking up the inode buffer and then brelse()ing it despite
the fact that it is needed later in the block map process.

This patch ensures that we look up the inode's buffer once and only
once during the block map process.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:58 +01:00
Steven Whitehouse 77658aad22 [GFS2] Eliminate (almost) duplicate field from gfs2_inode
The blocks counter is almost a duplicate of the i_blocks
field in the VFS inode. The only difference is that i_blocks
can be only 32bits long for 32bit arch without large single file
support. Since GFS2 doesn't handle the non-large single file
case (for 32 bit anyway) this adds a new config dependency on
64BIT || LSF. This has always been the case, however we've never
explicitly said so before.

Even if we do add support for the non-LSF case, we will still
not require this field to be duplicated since we will not be
able to access oversized files anyway.

So the net result of all this is that we shave 8 bytes from a gfs2_inode
and get our config deps correct.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:55 +01:00
Steven Whitehouse 30cbf189cd [GFS2] Add a function to interate over an extent
This adds a function (currently the only use is during mapping
of already allocated blocks, but watch this space) which iterates
over a number of pointers in a block and returns the extent length.

If the initial pointer is 0 (i.e. unallocated) it will return the
number of unallocated blocks in the extent. If the initial pointer
is allocated, then it returns the number of contiguously allocated
blocks in the extent.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:53 +01:00
Steven Whitehouse c85a665f06 [GFS2] The case of the missing asterisk
A dereference was forgotten. This adds it back correctly.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:50 +01:00
Steven Whitehouse b45e41d7d5 [GFS2] Add extent allocation to block allocator
Rather than having to allocate a single block at a time, this patch
allows the block allocator to allocate an extent. Since there is
no difference (so far as the block allocator is concerned) between
data blocks and indirect blocks, it is posible to allocate a single
extent and for the caller to unrevoke just the blocks required
for indirect blocks.

Currently the only bit of GFS2 to make use of this feature is the
build height function. The intention is that gfs2_block_map will
be changed to make use of this feature in future patches.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:47 +01:00
Steven Whitehouse 1639431a3f [GFS2] Merge gfs2_alloc_meta and gfs2_alloc_data
Thanks to the preceeding patches, the only difference between
these two functions is their name. We can thus merge them
and call the new function gfs2_alloc_block to reflect the
fact that it can allocate either kind of block.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:45 +01:00
Steven Whitehouse 5731be53e3 [GFS2] Update gfs2_trans_add_unrevoke to accept extents
By adding an extra argument to gfs2_trans_add_unrevoke we can now
specify an extent length of blocks to unrevoke. This means that
we only need to make one pass through the list for each extent
rather than each block. Currently the only extent length which
is used is 1, but that will change in the future.

Also gfs2_trans_add_unrevoke is removed from gfs2_alloc_meta
since its the only difference between this and gfs2_alloc_data
which is left. This will allow a future patch to merge these
two functions into one (i.e. one call to allocate both data
and metadata in a single extent in the future).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:42 +01:00
Steven Whitehouse ac576cc5be [GFS2] Merge the rd_last_alloc_meta and rd_last_alloc_data fields
We don't need to keep track of when we last allocated data
and metadata separately since the only thing thats important
when searching for a free block is whether its free or not,
which is independent from what type of block it is.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:39 +01:00
Steven Whitehouse ce276b06e8 [GFS2] Reduce inode size by merging fields
There were three fields being used to keep track of the location
of the most recently allocated block for each inode. These have
been merged into a single field in order to better keep the
data and metadata for an inode close on disk, and also to reduce
the space required for storage.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:37 +01:00
Bob Peterson 9feb7c889f [GFS2] Remove unused counters
This is kind of trivial in the greater scheme of things, but
this removes three counters that AFAICT are never used.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:34 +01:00
Steven Whitehouse 9a0045088d [GFS2] Shrink & rename di_depth
This patch forms a pair with the previous patch which shrunk
di_height. Like that patch di_depth is renamed i_depth and moved
into struct gfs2_inode directly. Also the field goes from 16 bits
to 8 bits since it is also limited to a max value which is rather
small (17 in this case). In addition we also now validate the field
against this maximum value when its read in.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:31 +01:00
Bob Peterson cf45b752c9 [GFS2] Remove rgrp and glock version numbers
This patch further reduces GFS2's memory requirements by
eliminating the 64-bit version number fields in lieu of
a couple bits.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:29 +01:00
Steven Whitehouse da755fdb41 [GFS2] Remove lm.[ch] and distribute content
The functions in lm.c were just wrappers which were mostly
only used in one other file. By moving the functions to
the files where they are being used, they can be marked
static and also this will usually result in them being inlined
since they are often only used from one point in the code.

A couple of really trivial functions have been inlined by hand
into the function which called them as it makes the code clearer
to do that.

We also gain from one fewer function call in the glock lock and
unlock paths.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:26 +01:00
Bob Peterson ab0d756681 [GFS2] Eliminate gl_req_bh
This patch further reduces the memory needs of GFS2 by
eliminating the gl_req_bh variable from struct gfs2_glock.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:23 +01:00
Steven Whitehouse 110acf3837 [GFS2] Add consts to various bits of rgrp.c
There are a couple of routines which scan bitmaps where we can
mark the bitmaps const, plus a couple of call sites that can
be updated too.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:21 +01:00
Steven Whitehouse dbac6710a6 [GFS2] Introduce array of buffers to struct metapath
The reason for doing this is to allow all the block mapping code
to share the same array. As a result we can remove two arguments
from lookup_metapath since they are now returned via the array.

We also add a function to drop all refs to buffer heads when we
are done with the metapath. The build_height function shares the
struct metapath, but currently still frees its own buffers, and
this will change in a future patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:18 +01:00
Steven Whitehouse 11707ea05e [GFS2] Move part of gfs2_block_map into a separate function
This is required to enable future changes to the block
mapping code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:15 +01:00
Bob Peterson 29d38cd163 [GFS2] Get rid of gl_waiters2
This patch reduces memory by replacing the int variable
gl_waiters2 by a single bit in the gl_flags.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:13 +01:00
Bob Peterson 42d52e3818 [GFS2] Combine rg_flags and rd_flags
This patch reduces the memory required by GFS2 by combining
the rd_flags and rg_flags (in core only).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:10 +01:00
Bob Peterson 6bdd9be628 [GFS2] Allocate gfs2_rgrpd from slab memory
This patch moves the gfs2_rgrpd structure to its own slab
memory.  This makes it easier to control and monitor, and
yields less memory fragmentation.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:07 +01:00
Bob Peterson 3ad62e87cd [GFS2] Plug an unlikely leak
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:05 +01:00
Adrian Bunk 048786f1e6 [GFS2] make gfs2_glock_hold() static
gfs2_glock_hold() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:02 +01:00
Bob Peterson ef8c441cb7 [GFS2] Only wake the reclaim daemon if we need to
This patch only wakes up the glock reclaim daemon if there is
actually something to be reclaimed.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:00 +01:00
Bob Peterson 7eabb77e65 [GFS2] Misc fixups
This patch contains two small fixups that didn't fit elsewhere.
They are: (1) get rid of temp variable in find_metapath.
(2) Remove vestigial "ret" variable from gfs2_writepage_common.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:39:57 +01:00
Bob Peterson d0109bfa84 [GFS2] Only do lo_incore_commit once
This patch is performance related.  When we're doing a log flush,
I noticed we were calling buf_lo_incore_commit twice: once for
data bufs and once for metadata bufs.  Since this is the same
function and does the same thing in both cases, there should be
no reason to call it twice.  Since we only need to call it once,
we can also make it faster by removing it from the generic "lops"
code and making it a stand-along static function.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:39:54 +01:00
Bob Peterson ca390601a8 [GFS2] Fix debug inode printing
I noticed that the latest change to i_height got rid of the
value from the inode dump.  This patch adds it back.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:39:52 +01:00
Bob Peterson fe6c991c52 [GFS2] Get rid of unneeded parameter in gfs2_rlist_alloc
This patch removed the unnecessary parameter from function
gfs2_rlist_alloc.  The parameter was always passed in as 0.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:39:49 +01:00
Steven Whitehouse ecc30c7915 [GFS2] Streamline indirect pointer tree height calculation
This patch improves the calculation of the tree height in order to reduce
the number of operations which are carried out on each call to gfs2_block_map.
In the common case, we now make a single comparison, rather than calculating
the required tree height from scratch each time. Also in the case that the
tree does need some extra height, we start from the current height rather from
zero when we work out what the new height ought to be.

In addition the di_height field is moved into the inode proper and reduced
in size to a u8 since the value must be between 0 and GFS2_MAX_META_HEIGHT (10).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:39:46 +01:00
Steven Whitehouse 941e6d7d09 [GFS2] Speed up gfs2_write_alloc_required, deprecate gfs2_extent_map
This patch removes the call to gfs2_extent_map from gfs2_write_alloc_required,
instead we call gfs2_block_map directly. This results in fewer overall calls
to gfs2_block_map in the multi-block case.

Also, gfs2_extent_map is marked as deprecated so that people know that its
going away as soon as all the callers have been converted.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:39:44 +01:00
Al Viro 2b210adcb0 cifs: fix misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30 14:20:23 -07:00
Al Viro 9dce07f1a4 NULL noise: fs/*, mm/*, kernel/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30 14:18:41 -07:00
Al Viro 1076d17ac7 jbd/jbd2 NULL noise
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-30 14:18:41 -07:00
Linus Torvalds af8be4e4b3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] mnt_expire is protected by namespace_sem, no need for vfsmount_lock
  [PATCH] do shrink_submounts() for all fs types
  [PATCH] sanitize locking in mark_mounts_for_expiry() and shrink_submounts()
  [PATCH] count ghost references to vfsmounts
  [PATCH] reduce stack footprint in namespace.c
2008-03-28 15:23:01 -07:00
Sven Schnelle 5214b729e1 afs: prevent double cell registration
kafs doesn't check if the cell already exists - so if you do an echo "add
newcell.org 1.2.3.4" >/proc/fs/afs/cells it will try to create this cell
again.  kobject will also complain about a double registration.  To prevent
such problems, return -EEXIST in that case.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:21 -07:00
Dmitri Monakhov 5b41e74ad1 vfs: fix data leak in nobh_write_end()
Current nobh_write_end() implementation ignore partial writes(copied < len)
case if page was fully mapped and simply mark page as Uptodate, which is
totally wrong because area [pos+copied, pos+len) wasn't updated explicitly in
previous write_begin call.  It simply contains garbage from pagecache and
result in data leakage.

#TEST_CASE_BEGIN:
~~~~~~~~~~~~~~~~
In fact issue triggered by classical testcase
	open("/mnt/test", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
	ftruncate(3, 409600)                    = 0
	writev(3, [{"a", 1}, {NULL, 4095}], 2)  = 1
##TESTCASE_SOURCE:
~~~~~~~~~~~~~~~~~
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <sys/mman.h>
#include <errno.h>
int main(int argc, char **argv)
{
	int fd,  ret;
	void* p;
	struct iovec iov[2];
	fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666);
	ftruncate(fd, 409600);
	iov[0].iov_base="a";
	iov[0].iov_len=1;
	iov[1].iov_base=NULL;
	iov[1].iov_len=4096;
	ret = writev(fd, iov, sizeof(iov)/sizeof(struct iovec));
	printf("writev  = %d, err = %d\n", ret, errno);
	return 0;
}
##TESTCASE RESULT:
~~~~~~~~~~~~~~~~~~
[root@ts63 ~]# mount | grep mnt2
/dev/mapper/test on /mnt2 type ext2 (rw,nobh)
[root@ts63 ~]#  /tmp/writev /mnt2/test
writev  = 1, err = 0
[root@ts63 ~]# hexdump -C /mnt2/test

00000000  61 65 62 6f 6f 74 00 00  f0 b9 b4 59 3a 00 00 00  |aeboot.....Y:...|
00000010  20 00 00 00 00 00 00 00  21 00 00 00 00 00 00 00  | .......!.......|
00000020  df df df df df df df df  df df df df df df df df  |................|
00000030  3a 00 00 00 2a 00 00 00  21 00 00 00 00 00 00 00  |:...*...!.......|
00000040  60 c0 8c 00 00 00 00 00  40 4a 8d 00 00 00 00 00  |`.......@J......|
00000050  00 00 00 00 00 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000060  74 69 6d 65 20 64 64 20  69 66 3d 2f 64 65 76 2f  |time dd if=/dev/|
00000070  6c 6f 6f 70 30 20 20 6f  66 3d 2f 64 65 76 2f 6e  |loop0  of=/dev/n|
skip..
00000f50  00 00 00 00 00 00 00 00  31 00 00 00 00 00 00 00  |........1.......|
00000f60  6d 6b 66 73 2e 65 78 74  33 20 2f 64 65 76 2f 76  |mkfs.ext3 /dev/v|
00000f70  7a 76 67 2f 74 65 73 74  20 2d 62 34 30 39 36 00  |zvg/test -b4096.|
00000f80  a0 fe 8c 00 00 00 00 00  21 00 00 00 00 00 00 00  |........!.......|
00000f90  23 31 32 30 35 39 35 30  34 30 34 00 3a 00 00 00  |#1205950404.:...|
00000fa0  20 00 8d 00 00 00 00 00  21 00 00 00 00 00 00 00  | .......!.......|
00000fb0  d0 cf 8c 00 00 00 00 00  10 d0 8c 00 00 00 00 00  |................|
00000fc0  00 00 00 00 00 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000fd0  6d 6f 75 6e 74 20 2f 64  65 76 2f 76 7a 76 67 2f  |mount /dev/vzvg/|
00000fe0  74 65 73 74 20 20 2f 76  7a 20 2d 6f 20 64 61 74  |test  /vz -o dat|
00000ff0  61 3d 77 72 69 74 65 62  61 63 6b 00 00 00 00 00  |a=writeback.....|
00001000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

As you can see file's page contains garbage from pagecache instead of zeros.
#TEST_CASE_END

Attached patch:
- Add sanity check BUG_ON in order to prevent incorrect usage by caller,
  This is function invariant because page can has buffers and in no zero
  *fadata pointer at the same time.
- Always attach buffers to page is it is partial write case.
- Always switch back to generic_write_end if page has buffers.
  This is reasonable because if page already has buffer then generic_write_begin
  was called previously.

Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
Reviewed-by: Nick Piggin <npiggin@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28 14:45:21 -07:00
Al Viro 6758f953d0 [PATCH] mnt_expire is protected by namespace_sem, no need for vfsmount_lock
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:48:04 -04:00
Al Viro c35038beca [PATCH] do shrink_submounts() for all fs types
... and take it out of ->umount_begin() instances.  Call with all locks
already taken (by do_umount()) and leave calling release_mounts() to
caller (it will do release_mounts() anyway, so we can just put into
the same list).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:47:58 -04:00
Al Viro bcc5c7d2b6 [PATCH] sanitize locking in mark_mounts_for_expiry() and shrink_submounts()
... and fix a race on access of ->mnt_share et.al. without namespace_sem
in the latter.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:47:52 -04:00
Al Viro 7c4b93d826 [PATCH] count ghost references to vfsmounts
make propagate_mount_busy() exclude references from the vfsmounts
that had been isolated by umount_tree() and are just waiting for
release_mounts() to dispose of their ->mnt_parent/->mnt_mountpoint.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:47:46 -04:00
Al Viro 1a39068954 [PATCH] reduce stack footprint in namespace.c
A lot of places misuse struct nameidata when they need struct path.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27 20:47:40 -04:00
Denis V. Lunev 0e5f8be138 [NETNS]: Compile NET /proc support only if CONFIG_NET is set.
This fix broken compilation for 'allnoconfig'. This was introduced by
Introduced by commit 1218854afa ("[NET]
NETNS: Omit seq_net_private->net without CONFIG_NET_NS.")

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 14:25:53 -07:00
YOSHIFUJI Hideaki 1218854afa [NET] NETNS: Omit seq_net_private->net without CONFIG_NET_NS.
Without CONFIG_NET_NS, no namespace other than &init_net exists,
no need to store net in seq_net_private.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:56 +09:00
Linus Torvalds 7ed7fe5e82 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] get stack footprint of pathname resolution back to relative sanity
  [PATCH] double iput() on failure exit in hugetlb
  [PATCH] double dput() on failure exit in tiny-shmem
  [PATCH] fix up new filp allocators
  [PATCH] check for null vfsmount in dentry_open()
  [PATCH] reiserfs: eliminate private use of struct file in xattr
  [PATCH] sanitize hppfs
  hppfs pass vfsmount to dentry_open()
  [PATCH] restore export of do_kern_mount()
2008-03-25 08:57:47 -07:00
Andrew Morton 815d2d50da driver core: debug for bad dev_attr_show() return value.
Try to find the culprit who caused
http://bugzilla.kernel.org/show_bug.cgi?id=10150

Cc: <balajirrao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-24 22:33:49 -07:00
Linus Torvalds 0d995b2b44 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Fix mem leak on dfs referral
  [CIFS] file create with acl support enabled is slow
  [CIFS] Fix mtime on cp -p when file data cached but written out too late
  [CIFS] Fix build problem
  [CIFS] cifs: replace remaining __FUNCTION__ occurrences
  [CIFS]  DFS patch that connects inode with dfs handling ops
2008-03-22 17:05:31 -07:00
Hans Rosenfeld f16278c679 Change pagemap output format to allow for future reporting of huge pages
Change pagemap output format to allow for future reporting of huge pages.

(Format comment and minor cleanups: mpm@selenic.com)

Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-22 17:03:10 -07:00
Steve French 04b6e6ec1a [CIFS] Fix mem leak on dfs referral
Signed-off-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-03-22 22:57:44 +00:00
Linus Torvalds 7d3628b230 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits)
  [NET] ifb: set separate lockdep classes for queue locks
  [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
  [TCP]: Fix shrinking windows with window scaling
  netpoll: zap_completion_queue: adjust skb->users counter
  bridge: use time_before() in br_fdb_cleanup()
  [TG3]: Fix build warning on sparc32.
  MAINTAINERS: bluez-devel is subscribers-only
  audit: netlink socket can be auto-bound to pid other than current->pid (v2)
  [NET]: Fix permissions of /proc/net
  [SCTP]: Fix a race between module load and protosw access
  [NETFILTER]: ipt_recent: sanity check hit count
  [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
  [RT2X00] drivers/net/wireless/rt2x00/rt2x00dev.c: remove dead code, fix warning
  [IPV4]: esp_output() misannotations
  [8021Q]: vlan_dev misannotations
  xfrm: ->eth_proto is __be16
  [IPV4]: ipv4_is_lbcast() misannotations
  [SUNRPC]: net/* NULL noise
  [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
  [PKT_SCHED]: annotate cls_u32
  ...
2008-03-21 07:57:45 -07:00
Andre Noll 4f42c288e6 [NET]: Fix permissions of /proc/net
commit e9720ac ([NET]: Make /proc/net a symlink on /proc/self/net (v3))
broke ganglia and probably other applications that read /proc/net/dev.

This is due to the change of permissions of /proc/net that was
introduced in that commit.

Before: dr-xr-xr-x 5 root root 0 Mar 19 11:30 /proc/net
After: dr-xr--r-- 5 root root 0 Mar 19 11:29 /proc/self/net

This patch restores the permissions to the old value which makes
ganglia happy again.

Pavel Emelyanov says:

	This also broke the postfix, as it was reported in bug #10286
	and described in detail by Benjamin.

Signed-off-by: Andre Noll <maan@systemlinux.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:27:28 -07:00
Linus Torvalds 49ccf74aaf Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  nfs: don't ignore return value from nfs_pageio_add_request
2008-03-20 11:59:34 -07:00
Andrew Morton 9df130392f fs/ufs/balloc.c: fix sparc64 printk warning
fs/ufs/balloc.c: In function `ufs_change_blocknr':
fs/ufs/balloc.c:317: warning: long long unsigned int format, long unsigned int arg (arg 2)
fs/ufs/balloc.c:317: warning: long long unsigned int format, long unsigned int arg (arg 3)

sector_t is u64 and we don't know what type the architecture uses to implement
u64.

Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:37 -07:00
Dave Young 08ca0db8aa zisofs: fix readpage() outside i_size
A read request outside i_size will be handled in do_generic_file_read().  So
we just return 0 to avoid getting -EIO as normal reading, let
do_generic_file_read do the rest.

At the same time we need unlock the page to avoid system stuck.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10227

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Report-by: Christian Perle <chris@linuxinfotag.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Randy Dunlap a6b91919e0 fs: fix kernel-doc notation warnings
Fix kernel-doc notation warnings in fs/.

Warning(mmotm-2008-0314-1449//fs/super.c:560): missing initial short description on line:
 *	mark_files_ro
Warning(mmotm-2008-0314-1449//fs/locks.c:1277): missing initial short description on line:
 *	lease_get_mtime
Warning(mmotm-2008-0314-1449//fs/locks.c:1277): missing initial short description on line:
 *	lease_get_mtime
Warning(mmotm-2008-0314-1449//fs/namei.c:1368): missing initial short description on line:
 * lookup_one_len:  filesystem helper to lookup single pathname component
Warning(mmotm-2008-0314-1449//fs/buffer.c:3221): missing initial short description on line:
 * bh_uptodate_or_lock: Test whether the buffer is uptodate
Warning(mmotm-2008-0314-1449//fs/buffer.c:3240): missing initial short description on line:
 * bh_submit_read: Submit a locked buffer for reading
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:30): missing initial short description on line:
 * writeback_acquire: attempt to get exclusive writeback access to a device
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:47): missing initial short description on line:
 * writeback_in_progress: determine whether there is writeback in progress
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:58): missing initial short description on line:
 * writeback_release: relinquish exclusive writeback access against a device.
Warning(mmotm-2008-0314-1449//include/linux/jbd.h:351): contents before sections
Warning(mmotm-2008-0314-1449//include/linux/jbd.h:561): contents before sections
Warning(mmotm-2008-0314-1449//fs/jbd/transaction.c:1935): missing initial short description on line:
 * void journal_invalidatepage()

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Michael Halcrow 5366dc9fd1 eCryptfs: Swap dput() and mntput()
ecryptfs_d_release() is doing a mntput before doing the dput.  This patch
moves the dput before the mntput.

Thanks to Rajouri Jammu for reporting this.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Rajouri Jammu <rajouri.jammu@gmail.com>
Cc: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Duane Griffin d00256766a jbd2: correctly unescape journal data blocks
Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JBD2_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Duane Griffin 439aeec639 jbd: correctly unescape journal data blocks
Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
David Howells 4ebf89845b ROMFS: Fix up an error in iget removal
Fix up an error in iget removal in which romfs_lookup() making a successful
call to romfs_iget() continues through the negative/error handling (previously
the successful case jumped around the negative/error handling case):

 (1) inode is initialised to NULL at the top of the function, eliminating the
     need for specific negative-inode handling.  This means the positive
     success handling now flows straight through.

 (2) Rename the labels to be clearer about what they mean.

Also make romfs_lookup()'s result variable of type long so as to avoid
32-bit/64-bit conversions with PTR_ERR() and friends.

Based upon a report and patch from Adam Richter.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: "Adam J. Richter" <adam@yggdrasil.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Josef Bacik c587f0c0a6 ext3: fix wrong gfp type under transaction
There are several places where we make allocations with GFP_KERNEL while under
a transaction, which could lead to an assertion panic or lockup if under
memory pressure.  This patch switches these problem areas to use GFP_NOFS to
keep these problems from happening.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Jan Kara 87cb055bc1 quota: add possibly missing iput() when quotaon and quotaoff races
We should always put inode we have reference to, even if quota was reenabled
in the mean time.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:35 -07:00
Randy Dunlap 0cf01f6685 jbd: fix jbd kernel-doc notation
Fix kernel-doc notation in jbd.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:35 -07:00
Quentin Barnes 6cb2a21049 aio: bad AIO race in aio_complete() leads to process hang
My group ran into a AIO process hang on a 2.6.24 kernel with the process
sleeping indefinitely in io_getevents(2) waiting for the last wakeup to come
and it never would.

We ran the tests on x86_64 SMP.  The hang only occurred on a Xeon box
("Clovertown") but not a Core2Duo ("Conroe").  On the Xeon, the L2 cache isn't
shared between all eight processors, but is L2 is shared between between all
two processors on the Core2Duo we use.

My analysis of the hang is if you go down to the second while-loop
in read_events(), what happens on processor #1:
	1) add_wait_queue_exclusive() adds thread to ctx->wait
	2) aio_read_evt() to check tail
	3) if aio_read_evt() returned 0, call [io_]schedule() and sleep

In aio_complete() with processor #2:
	A) info->tail = tail;
	B) waitqueue_active(&ctx->wait)
	C) if waitqueue_active() returned non-0, call wake_up()

The way the code is written, step 1 must be seen by all other processors
before processor 1 checks for pending events in step 2 (that were recorded by
step A) and step A by processor 2 must be seen by all other processors
(checked in step 2) before step B is done.

The race I believed I was seeing is that steps 1 and 2 were
effectively swapped due to the __list_add() being delayed by the L2
cache not shared by some of the other processors.  Imagine:
proc 2: just before step A
proc 1, step 1: adds to ctx->wait, but is not visible by other processors yet
proc 1, step 2: checks tail and sees no pending events
proc 2, step A: updates tail
proc 1, step 3: calls [io_]schedule() and sleeps
proc 2, step B: checks ctx->wait, but sees no one waiting, skips wakeup
                so proc 1 sleeps indefinitely

My patch adds a memory barrier between steps A and B.  It ensures that the
update in step 1 gets seen on processor 2 before continuing.  If processor 1
was just before step 1, the memory barrier makes sure that step A (update
tail) gets seen by the time processor 1 makes it to step 2 (check tail).

Before the patch our AIO process would hang virtually 100% of the time.  After
the patch, we have yet to see the process ever hang.

Signed-off-by: Quentin Barnes <qbarnes+linux@yahoo-inc.com>
Reviewed-by: Zach Brown <zach.brown@oracle.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: <stable@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ We should probably disallow that "if (waitqueue_active()) wake_up()"
  coding pattern, because it's so often buggy wrt memory ordering ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:35 -07:00
Chuck Lever 0490a54a00 lockd: introduce new function to encode private argument in SM_MON requests
Clean up: refactor the encoding of the opaque 16-byte private argument in
xdr_encode_mon().  This will be updated later to support IPv6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:01:10 -04:00
Chuck Lever 2ca7754d4c lockd: Fix up incorrect RPC buffer size calculations.
Switch to using the new mon_id encoder function.

Now that we've refactored the encoding of SM_MON requests, we've
discovered that the pre-computed buffer length maximums are
incorrect!

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:01:07 -04:00
Chuck Lever ea72a7f170 lockd: document use of mon_id argument in SM_MON requests
Clean up: document the argument type that xdr_encode_common() is
marshalling by introducing a new function.  The new function will replace
xdr_encode_common() in just a sec.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:01:04 -04:00
Chuck Lever 850c95fd07 lockd: refactor SM_MON my_id argument encoder
Clean up: introduce a new XDR encoder specifically for the my_id
argument of SM_MON requests.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:01:01 -04:00
Chuck Lever 4969517431 lockd: refactor SM_MON mon_name argument encoder
Clean up: introduce a new XDR encoder specifically for the mon_name
argument of SM_MON requests.  This will be updated later to support IPv6
addresses in addition to IPv4 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:57 -04:00
Chuck Lever 099bd05f27 lockd: Ensure NSM strings aren't longer than protocol allows
Introduce a special helper function to check the length of NSM strings
before they are placed on the wire.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:54 -04:00
Chuck Lever eb18860e13 NLM: NLM protocol version numbers are u32
Clean up: RPC protocol version numbers are u32.  Make sure we use an
appropriate type for NLM version numbers when calling nlm_lookup_host().

Eliminates a harmless mixed sign comparison in nlm_host_lookup().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:47 -04:00
Chuck Lever 90d5b18061 NLM: LOCKD fails to load if CONFIG_SYSCTL is not set
Bruce Fields says:
"By the way, we've got another config-related nit here:

	http://bugzilla.linux-nfs.org/show_bug.cgi?id=156

You can build lockd without CONFIG_SYSCTL set, but then the module will
fail to load."

For now, disable the sysctl registration calls in lockd if CONFIG_SYSCTL
is not enabled.  This allows the kernel to build properly if PROC_FS or
SYSCTL is not enabled, but an NFS client is desired.

In the long run, we would like to be able to build the kernel with an
NFS client but without lockd.  This makes sense, for example, if you want
an NFSv4-only NFS client, as NFSv4 doesn't use NLM at all.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:44 -04:00
Chuck Lever 1e40316bc4 SUNRPC: Add a default setting for CONFIG_SUNRPC_BIND34
Most distros will want support for rpcbind protocols 3 and 4 to default off
until they have integrated user-space support for the new rpcbind daemon
which supports IPv6 RPC services.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:41 -04:00
Chuck Lever 327a299d8a SUNRPC: Update help Kconfig text
Clean up: refresh the help text for Kconfig items related to the sunrpc
module.  Remove obsolete URLs, and make the language consistent among
the options.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:38 -04:00
Chuck Lever ecfc555a83 NFS: Always enable NFS direct I/O
Since O_DIRECT is a standard feature that is enabled in most distros,
eliminate the CONFIG_NFS_DIRECTIO build option, and change the
fs/nfs/Makefile to always build in the NFS direct I/O engine.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:34 -04:00
Chuck Lever 82d101d58a NFS: Show most mount options via nfs_show_options()
Display all mount options in /proc/mount which may be needed to reconstruct
a previous mount.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:29 -04:00
Chuck Lever 3f8400d1f1 NFS: Save the values of the "mount*=" mount options
Save the value of the mountproto= mountport= mountvers= and mountaddr=
options so that these values can be displayed later via
nfs_show_options().

This preserves the intent of the original mount options, should the file
system need to be remounted based on what's displayed in /proc/mounts.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:22 -04:00
Chuck Lever f22d6d79fe NFS: Save the value of the "port=" mount option
During a remount based on the mount options displayed in /proc/mounts, we
want to preserve the original behavior of the mount request.  Let's save
the original setting of the "port=" mount option in the mount's nfs_server
structure.

This allows us to simplify the default behavior of port setting for NFSv4
mounts: by default, NFSv2/3 mounts first try an RPC bind to determine the
NFS server's port, unless the user specified the "port=" mount option;
Users can force the client to skip the RPC bind by explicitly specifying
"port=<value>".

NFSv4, by contrast, assumes the NFS server port is 2049 and skips the RPC
bind, unless the user specifies "port=".  Users can force an RPC bind for
NFSv4 by explicitly specifying "port=0".

I added a couple of extra comments to clarify this behavior.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:19 -04:00
Chuck Lever 78fa701f34 NFS: Fix up data types of fields in nfs_parsed_mount_options
Clean up: make data types of fields in nfs_parsed_mount_options more
consistent with other uses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:16 -04:00
Chuck Lever 2d76743227 NFS: numeric mount parameters are unsigned
Clean up: use %u instead of %d when displaying NFS mount options.

Nit: Fix reporting of "namlen=" option in nfs_show_mount_stats.  The mount
option is called "namlen" without the "e".

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:13 -04:00
Jeff Layton 7bda2cdf48 NFS: clean up short packet handling for NFSv4 readdir
Currently, the NFS readdir decoders have a workaround for buggy servers
that send an empty readdir response with the EOF bit unset. If the
server sends a malformed response in some cases, this workaround kicks
in and just returns an empty response rather than returning a proper
error to the caller.

This patch does 3 things:

1) have malformed responses with no entries return error (-EIO)

2) preserve existing workaround for servers that send empty
   responses with the EOF marker unset.

3) Add some comments to clarify the logic in decode_readdir().

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:10 -04:00
Jeff Layton 643f81115b NFS: clean up short packet handling for NFSv3 readdir
Currently, the NFS readdir decoders have a workaround for buggy servers
that send an empty readdir response with the EOF bit unset. If the
server sends a malformed response in some cases, this workaround kicks
in and just returns an empty response rather than returning a proper
error to the caller.

This patch does 3 things:

1) have malformed responses with no entries return error (-EIO)

2) preserve existing workaround for servers that send empty
   responses with the EOF marker unset.

3) Add some comments to clarify the logic in nfs3_xdr_readdirres().

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:06 -04:00
Jeff Layton caa02bd540 NFS: clean up short packet handling for NFSv2 readdir
Currently, the NFS readdir decoders have a workaround for buggy servers
that send an empty readdir response with the EOF bit unset. If the
server sends a malformed response in some cases, this workaround kicks
in and just returns an empty response rather than returning a proper
error to the caller.

This patch does 3 things:

1) have malformed responses with no entries return error (-EIO)

2) preserve existing workaround for servers that send empty
   responses with the EOF marker unset.

3) Add some comments to clarify the logic in nfs_xdr_readdirres().

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 18:00:03 -04:00
Fred Isaman 4af68bffac nfs: remove duplicate initializations of nfs_read_data field
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 17:59:59 -04:00
Fred 6d884e8fc8 nfs: nfs_redirty_request
Both flush functions have the same error handling routine.  Pull
it out as a function.

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 17:59:56 -04:00
Trond Myklebust c7c350e92a Merge branch 'hotfixes' into devel 2008-03-19 17:59:44 -04:00
Fred Isaman f8512ad0da nfs: don't ignore return value from nfs_pageio_add_request
Ignoring the return value from nfs_pageio_add_request can cause deadlocks.

In read path:
  call nfs_pageio_add_request from readpage_async_filler
  assume at this point that there are requests already in desc, that
    can't be merged with the current request.
  so nfs_pageio_doio is fired up to clear out desc.
  assume something goes wrong in setting up the io, so desc->pg_error is set.
  This causes nfs_pageio_add_request to return 0, *WITHOUT* adding the original
    request.
  BUT, since return code is ignored, readpage_async_filler assumes it has
    been added, and does nothing further, leaving page locked.
  do_generic_mapping_read will eventually call lock_page, resulting in deadlock

In write path:
  page is marked dirty by generic_perform_write
  nfs_writepages is called
  call nfs_pageio_add_request from nfs_page_async_flush
  assume at this point that there are requests already in desc, that
    can't be merged with the current request.
  so nfs_pageio_doio is fired up to clear out desc.
  assume something goes wrong in setting up the io, so desc->pg_error is set.
  This causes nfs_page_async_flush to return 0, *WITHOUT* adding the original
    request, yet marking the request as locked (PG_BUSY) and in writeback,
    clearing dirty marks.
  The next time a write is done to the page, deadlock will result as
    nfs_write_end calls nfs_update_request

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-19 17:59:02 -04:00
Al Viro a02f76c34d [PATCH] get stack footprint of pathname resolution back to relative sanity
Somebody had put struct nameidata in stack frame of link_path_walk().
Unfortunately, there are certain realities to deal with:
	* It's in the middle of recursion.  Depth is equal to the nesting
depth of symlinks, i.e. up to 8.
	* struct namiedata is, even if one discards the intent junk,
at least 12 pointers + 5 ints.
	* moreover, adding a stack frame is not free in that situation.
	* there are fs methods called on top of that, and they also have
stack footprint.
	* kernel stack is not infinite.

The thing is, even if one chooses to deal with -ESTALE that way (and it's
one hell of an overkill), the only thing that needs to be preserved is
vfsmount + dentry, not the entire struct nameidata.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-19 06:55:46 -04:00
Al Viro b4d232e65f [PATCH] double iput() on failure exit in hugetlb
once we'd done d_instantiate(), we should only do dput().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-19 06:55:01 -04:00
Dave Hansen 430e285e08 [PATCH] fix up new filp allocators
Some new uses of get_empty_filp() have crept in; switched
to alloc_file() to make sure that pieces of initialization
won't be missing.

We really need to kill get_empty_filp().

[AV] fixed dentry leak on failure exit in anon_inode_getfd()

Cc: Erez Zadok <ezk@cs.sunysb.edu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J Bruce Fields" <bfields@fieldses.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-19 06:54:05 -04:00
Christoph Hellwig 322ee5b36e [PATCH] check for null vfsmount in dentry_open()
Make sure no-one calls dentry_open with a NULL vfsmount argument and crap
out with a stacktrace otherwise.  A NULL file->f_vfsmnt has always been
problematic, but with the per-mount r/o tracking we can't accept anymore
at all.

[AV] the last place that passed NULL had been eliminated by the previous
patch (reiserfs xattr stuff)

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-19 06:50:44 -04:00
Jeff Mahoney 3227e14c3c [PATCH] reiserfs: eliminate private use of struct file in xattr
After several posts and bug reports regarding interaction with the NULL
nameidata, here's a patch to clean up the mess with struct file in the
reiserfs xattr code.

As observed in several of the posts, there's really no need for struct file
to exist in the xattr code.  It was really only passed around due to the
f_op->readdir() and a_ops->{prepare,commit}_write prototypes requiring it.

reiserfs_prepare_write() and reiserfs_commit_write() don't actually use the
struct file passed to it, and the xattr code uses a private version of
reiserfs_readdir() to enumerate the xattr directories.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-19 06:49:36 -04:00
Al Viro f382d6e631 [PATCH] sanitize hppfs
* hppfs_iget() and its users are racy; there's no need to pollute icache
  anyway, new_inode() works fine and is safe, unlike the current kludges
  (these relied on overwriting ->i_ino before another iget_locked() gets
  to that one - and did it after unlocking).
* merge hppfs_iget()/init_inode()/hppfs_read_inode(), while we are
  at it.
* to pass proper vfsmount to dentry_open() store the reference
  in hppfs superblock.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 --
2008-03-19 06:42:18 -04:00
Dave Hansen 1dd0dd111f hppfs pass vfsmount to dentry_open()
Here's patch for hppfs that uses vfs_kern_mount to make sure it always has a
procfs instance and passed the vfsmount on through the inode private data.
Also fixes a procfs file_system_type leak for every attempted hppfs mount.

[ jdike - gave this file a style workover, plus deleted hppfs_dentry_ops ]

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-19 06:39:56 -04:00
Linus Torvalds f920bb6f5f Merge branch 'audit.b49' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b49' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] export sessionid alongside the loginuid in procfs
2008-03-18 08:43:59 -07:00
Eric Paris 1e0bd7550e [PATCH] export sessionid alongside the loginuid in procfs
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-18 10:51:22 -04:00
Linus Torvalds 92f53c6f1e Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Revert "unexport bio_{,un}map_user"
  relay: fix subbuf_splice_actor() adding too many pages
  The ps2esdi driver was marked as BROKEN more than two years ago due to being
2008-03-18 07:43:14 -07:00
David S. Miller 2f633928cb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-17 23:44:31 -07:00
Al Viro e6f1cebf71 [NET] endianness noise: INADDR_ANY
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:44:53 -07:00
Al Viro 8a4e98d9d7 [PATCH] restore export of do_kern_mount()
vfs_kern_mount() requires having a reference to fs type, which
makes it impossible for module to create procfs, etc. private
mount.  Open-coding is not an option, since e.g. put_filesystem()
is _not_ exported, and for a good reason.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-17 22:58:04 -04:00
Jens Axboe 40044ce0bf Revert "unexport bio_{,un}map_user"
Outside users like asmlib uses the mapping functions. API wise, the
export is definitely sane. It's a better idea to keep this export
than to require external users to open-code this piece of code instead.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-17 21:14:40 +01:00
Al Viro 3d10a15d69 hfs_bnode_find() can fail, resulting in hfs_bnode_split() breakage
oops and fs corruption; the latter can happen even on valid fs in case of oom.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-17 09:46:55 -07:00
J. Bruce Fields b663c6fd98 nfsd: fix oops on access from high-numbered ports
This bug was always here, but before my commit 6fa02839bf
("recheck for secure ports in fh_verify"), it could only be triggered by
failure of a kmalloc().  After that commit it could be triggered by a
client making a request from a non-reserved port for access to an export
marked "secure".  (Exports are "secure" by default.)

The result is a struct svc_export with a reference count one too low,
resulting in likely oopses next time the export is accessed.

The reference counting here is not straightforward; a later patch will
clean up fh_verify().

Thanks to Lukas Hejtmanek for the bug report and followup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-14 16:49:15 -07:00
Steve French 8b1327f6ed [CIFS] file create with acl support enabled is slow
Shirish Pargaonkar noted:
With cifsacl mount option, when a file is created on the Windows server,
exclusive oplock is broken right away because the get cifs acl code
again opens the file to obtain security descriptor.
The client does not have the newly created file handle or inode in any
of its lists yet so it does not respond to oplock break and server waits for
its duration and then responds to the second open. This slows down file
creation signficantly.  The fix is to pass the file descriptor to the get
cifsacl code wherever available so that get cifs acl code does not send
second open (NT Create ANDX) and oplock is not broken.

CC: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-03-14 22:37:16 +00:00
Steve French ebe8912be2 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-14 19:29:18 +00:00
Steve French 50531444fa [CIFS] Fix mtime on cp -p when file data cached but written out too late
Kukks noticed that cp -p can write out file data too late, after the timestamp
is already set.  This was introduced as an unintentional sideeffect of the change
in an earlier patch (see below) which fixed some delayed return code propagation.

cea218054a
Author: Jeff Layton <jlayton@redhat.com>
Date:   Tue Nov 20 23:19:03 2007 +0000

Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-03-14 19:21:31 +00:00
Fred Isaman 2f42b5d043 NFS: fix encode_fsinfo_maxsz
The previous value was not taking into account space for bitmap array size.

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:47:17 -04:00
Trond Myklebust 98a8e32394 SUNRPC: Add a helper rpcauth_lookup_generic_cred()
The NFSv4 protocol allows clients to negotiate security protocols on the
fly in the case where an administrator on the server changes the export
settings and/or in the case where we may have a filesystem migration event.

Instead of having the NFS client code cache credentials that are tied to a
particular AUTH method it is therefore preferable to have a generic credential
that can be converted into whatever AUTH is in use by the RPC client when
the read/write/sillyrename/... is put on the wire.

We do this by means of the new "generic" credential, which basically just
caches the minimal information that is needed to look up an RPCSEC_GSS,
AUTH_SYS, or AUTH_NULL credential.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:49 -04:00
Marcelo Tosatti fb39380b8d pagemap: proper read error handling
Fix pagemap_read() error handling by releasing acquired resources and checking
for get_user_pages() partial failure.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-13 13:11:43 -07:00
Linus Torvalds 609eb39c8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  [SCTP]: Fix local_addr deletions during list traversals.
  net: fix build with CONFIG_NET=n
  [TCP]: Prevent sending past receiver window with TSO (at last skb)
  rt2x00: Add new D-Link USB ID
  rt2x00: never disable multicast because it disables broadcast too
  libertas: fix the 'compare command with itself' properly
  drivers/net/Kconfig: fix whitespace for GELIC_WIRELESS entry
  [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
  [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
  [NETFILTER]: nf_conntrack: add \n to "expectation table full" message
  [NETFILTER]: xt_time: fix failure to match on Sundays
  [NETFILTER]: nfnetlink_log: fix computation of netlink skb size
  [NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
  [NETFILTER]: nfnetlink: fix ifdef in nfnetlink_compat.h
  [NET]: include <linux/types.h> into linux/ethtool.h for __u* typedef
  [NET]: Make /proc/net a symlink on /proc/self/net (v3)
  RxRPC: fix rxrpc_recvmsg()'s returning of msg_name
  net/enc28j60: oops fix
  ...
2008-03-12 13:08:09 -07:00
Andrew Morton b2211a361a net: fix build with CONFIG_NET=n
fs/built-in.o:(.rodata+0x1134): undefined reference to `proc_net_inode_operations'
fs/built-in.o:(.rodata+0x1138): undefined reference to `proc_net_operations'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 18:03:35 -07:00
Steve French bc5b6e24a1 [CIFS] Fix build problem
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-03-11 21:07:48 +00:00
Steve French 5b4d4771e2 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-11 19:15:41 +00:00
Tao Ma cdef59a94c ocfs2: Fix NULL pointer dereferences in o2net
In some situations, ocfs2_set_nn_state might get called with sc = NULL and
valid = 0. If sc = NULL, we can't dereference it to get the o2nm_node
member. Instead, do what o2net_initialize_handshake does and use NULL when
calling o2net_reconnect_delay and o2net_idle_timeout.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:14:19 -07:00
Sunil Mushran c824c3c723 ocfs2/dlm: dlm_thread should not sleep while holding the dlm_spinlock
This patch addresses the bug in which the dlm_thread could go to sleep
while holding the dlm_spinlock.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:14:17 -07:00
Sunil Mushran 535f7026fd ocfs2/dlm: Print message showing the recovery master
Knowing the dlm recovery master helps in debugging recovery
issues. This patch prints a message on the recovery master node.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:14:14 -07:00
Sunil Mushran b31cfc0237 ocfs2/dlm: Add missing dlm_lockres_put()s
dlm_master_request_handler() forgot to put a lockres when
dlm_assert_master_worker() failed or was skipped.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:14:12 -07:00
Sunil Mushran 52987e2ab4 ocfs2/dlm: Add missing dlm_lockres_put()s in migration path
During migration, the recovery master node may be asked to master a lockres
it may not know about. In that case, it would not only have to create a
lockres and add it to the hash, but also remember to to do the _put_
corresponding to the kref_init in dlm_init_lockres(), as soon as the migration
is completed. Yes, we don't wait for the dlm_purge_lockres() to do that
matching put. Note the ref added for it being in the hash protects the lockres
from being freed prematurely.

This patch adds that missing put, as described above, to plug a memleak.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:14:11 -07:00
Sunil Mushran 2c5c54aca9 ocfs2/dlm: Add missing dlm_lock_put()s
Normally locks for remote nodes are freed when that node sends an UNLOCK
message to the master. The master node tags an DLM_UNLOCK_FREE_LOCK action
to do an extra put on the lock at the end.

However, there are times when the master node has to free the locks for the
remote nodes forcibly.

Two cases when this happens are:
1. When the master has migrated the lockres plus all locks to another node.
2. When the master is clearing all the locks of a dead node.

It was in the above two conditions that the dlm was missing the extra put.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:14:09 -07:00
Tao Ma 4338ab6a75 ocfs2: Fix an endian bug in online resize.
In ocfs2_group_add, 'cr' is a disk field of type 'ocfs2_chain_rec', and we
were putting cpu byteorder values into it. Swap things to the right endian
before storing.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:14:07 -07:00
Jan Engelhardt 90d99779a4 [PATCH] [OCFS2]: constify function pointer tables
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:13:57 -07:00
Joel Becker 0f71b7b40f ocfs2: Fix endian bug in o2dlm protocol negotiation.
struct dlm_query_join_packet is made up of four one-byte fields.  They
are effectively in big-endian order already.  However, little-endian
machines swap them before putting the packet on the wire (because
query_join's response is a status, and that status is treated as a u32
on the wire).  Thus, a big-endian and little-endian machines will
treat this structure differently.

The solution is to have little-endian machines swap the structure when
converting from the structure to the u32 representation.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:13:54 -07:00
Tao Ma 2af37ce82d ocfs2: Use dlm_print_one_lock_resource for lock resource print
__dlm_print_one_lock_resource must be called with spin_lock
the res->spinlock. While in some cases, we use it without this
precondition and lead to the failure of assert_spin_locked.
So call dlm_print_one_lock_resource instead.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:13:50 -07:00
Andrew Morton 3a4780a85d [PATCH] fs/ocfs2/dlm/dlmdomain.c: fix printk warning
fs/ocfs2/dlm/dlmdomain.c: In function 'dlm_send_join_cancels':
fs/ocfs2/dlm/dlmdomain.c:983: warning: format '%u' expects type 'unsigned int', but argument 7 has type 'long unsigned int'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-10 15:13:39 -07:00
Harvey Harrison 55f78e1771 [CIFS] cifs: replace remaining __FUNCTION__ occurrences
__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-03-10 17:14:34 +00:00
Igor Mammedov 7962670e64 [CIFS] DFS patch that connects inode with dfs handling ops
if DFS junction point

Signed-off-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-03-09 03:44:18 +00:00
Trond Myklebust 9446389ef6 Merge commit 'origin' into devel 2008-03-08 11:49:24 -05:00
Linus Torvalds 4c1aa6f8b9 Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix dentry revalidation for NFSv4 referrals and mountpoint crossings
  NFS: Fix the fsid revalidation in nfs_update_inode()
  SUNRPC: Fix a nfs4 over rdma transport oops
  NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c
2008-03-07 12:08:07 -08:00
Trond Myklebust 4e99a1ff34 NFS: Fix dentry revalidation for NFSv4 referrals and mountpoint crossings
As long as the directory contents haven't changed, we should just let the
path walk proceed to cross the mountpoint. Apart from being an optimisation
in the case of 'nohide' mountpoint traversals, it also fixes an issue with
referrals: referral inodes don't have valid filehandles, so calling
nfs_revalidate_inode() on them is a bug.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-07 14:35:41 -05:00
Trond Myklebust c37dcd334c NFS: Fix the fsid revalidation in nfs_update_inode()
When we detect that we've crossed a mountpoint on the remote server, we
must take care not to use that inode to revalidate the fsid on our
current superblock. To do so, we label the inode as a remote mountpoint,
and check for that in nfs_update_inode().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-07 14:35:37 -05:00
Trond Myklebust af1b8c2ff7 NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c
O_SYNC is stored in filp->f_flags.
Thanks to Al Viro for pointing out the bug.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-07 14:33:40 -05:00
Pavel Emelyanov e9720acd72 [NET]: Make /proc/net a symlink on /proc/self/net (v3)
Current /proc/net is done with so called "shadows", but current
implementation is broken and has little chances to get fixed.

The problem is that dentries subtree of /proc/net directory has
fancy revalidation rules to make processes living in different
net namespaces see different entries in /proc/net subtree, but
currently, tasks see in the /proc/net subdir the contents of any
other namespace, depending on who opened the file first.

The proposed fix is to turn /proc/net into a symlink, which points
to /proc/self/net, which in turn shows what previously was in
/proc/net - the network-related info, from the net namespace the
appropriate task lives in.

# ls -l /proc/net
lrwxrwxrwx  1 root root 8 Mar  5 15:17 /proc/net -> self/net

In other words - this behaves like /proc/mounts, but unlike
"mounts", "net" is not a file, but a directory.

Changes from v2:
* Fixed discrepancy of /proc/net nlink count and selinux labeling
  screwup pointed out by Stephen.

  To get the correct nlink count the ->getattr callback for /proc/net
  is overridden to read one from the net->proc_net entry.

  To make selinux still work the net->proc_net entry is initialized
  properly, i.e. with the "net" name and the proc_net parent.

Selinux fixes are
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>

Changes from v1:
* Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:08:40 -08:00
Linus Torvalds 910da1a48e Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
  [XFS] fix inode leak in xfs_iget_core()
  [XFS] 977545 977545 977545 977545 977545 977545 xfsaild causing too many
2008-03-06 08:14:00 -08:00
David Chinner 72772a3b5b [XFS] fix inode leak in xfs_iget_core()
If the radix_tree_preload() fails, we need to destroy the inode we just
read in before trying again. This could leak xfs_vnode structures when
there is memory pressure. Noticed by Christoph Hellwig.

SGI-PV: 977823
SGI-Modid: xfs-linux-melb:xfs-kern:30606a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-03-06 16:38:50 +11:00
David Chinner 92d9cd1059 [XFS] 977545 977545 977545 977545 977545 977545 xfsaild causing too many
wakeups

Idle state is not being detected properly by the xfsaild push code. The
current idle state is detected by an empty list which may never happen
with mostly idle filesystem or one using lazy superblock counters. A
single dirty item in the list that exists beyond the push target can
result repeated looping attempting to push up to the target because it
fails to check if the push target has been acheived or not.

Fix by considering a dirty list with everything past the target as an idle
state and set the timeout appropriately.

SGI-PV: 977545
SGI-Modid: xfs-linux-melb:xfs-kern:30532a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-03-06 16:38:17 +11:00
Eric Paris f9c3a38021 NFS: use new LSM interfaces to explicitly set mount options
NFS and SELinux worked together previously because SELinux had NFS
specific knowledge built in.  This design was approved by both groups
back in 2004 but the recent NFS changes to use nfs_parsed_mount_data and
the usage of nfs_clone_mount_data showed this to be a poor fragile
solution.  This patch fixes the NFS functionality regression by making
use of the new LSM interfaces to allow an FS to explicitly set its own
mount options.

The explicit setting of mount options is done in the nfs get_sb
functions which are called before the generic vfs hooks try to set mount
options for filesystems which use text mount data.

This does not currently support NFSv4 as that functionality did not
exist in previous kernels and thus there is no regression.  I will be
adding the needed code, which I believe to be the exact same as the v3
code, in nfs4_get_sb for 2.6.26.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-03-06 08:40:59 +11:00
Eric Paris e000752989 LSM/SELinux: Interfaces to allow FS to control mount options
Introduce new LSM interfaces to allow an FS to deal with their own mount
options.  This includes a new string parsing function exported from the
LSM that an FS can use to get a security data blob and a new security
data blob.  This is particularly useful for an FS which uses binary
mount data, like NFS, which does not pass strings into the vfs to be
handled by the loaded LSM.  Also fix a BUG() in both SELinux and SMACK
when dealing with binary mount data.  If the binary mount data is less
than one page the copy_page() in security_sb_copy_data() can cause an
illegal page fault and boom.  Remove all NFSisms from the SELinux code
since they were broken by past NFS changes.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-03-06 08:40:53 +11:00
Harvey Harrison 15732a1cb5 jfs: replace __inline with inline
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
2008-03-05 14:38:22 -06:00
Linus Torvalds 2c6f2db13a Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  debugfs: fix sparse warnings
  Driver core: Fix cleanup when failing device_add().
  driver core: Remove dpm_sysfs_remove() from error path of device_add()
  PM: fix new mutex-locking bug in the PM core
  PM: Do not acquire device semaphores upfront during suspend
  kobject: properly initialize ksets
  sysfs: CONFIG_SYSFS_DEPRECATED fix
  driver core: fix up Kconfig text for CONFIG_SYSFS_DEPRECATED
2008-03-04 16:37:35 -08:00
Josef Bacik 92587216f8 ext3: fix mount option parsing
The "resize" option won't be noticed as it comes after the NULL option, so if
you try to mount (or in this case remount) with that option it won't be
recognized.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 16:35:18 -08:00
Michael Halcrow e4465fdaeb eCryptfs: make ecryptfs_prepare_write decrypt the page
When the page is not up to date, ecryptfs_prepare_write() should be
acting much like ecryptfs_readpage(). This includes the painfully
obvious step of actually decrypting the page contents read from the
lower encrypted file.

Note that this patch resolves a bug in eCryptfs in 2.6.24 that one can
produce with these steps:

# mount -t ecryptfs /secret /secret
# echo "abc" > /secret/file.txt
# umount /secret
# mount -t ecryptfs /secret /secret
# echo "def" >> /secret/file.txt
# cat /secret/file.txt

Without this patch, the resulting data returned from cat is likely to
be something other than "abc\ndef\n".

(Thanks to Benedikt Driessen for reporting this.)

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Benedikt Driessen <bdriessen@escrypt.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 16:35:16 -08:00
Julia Lawall acc1f3ede9 fs/reiserfs/super.c: correct use of ! and &
In commit e6bafba5b4 ("wmi: (!x & y)
strikes again"), a bug was fixed that involved converting !x & y to !(x
& y).  The code below shows the same pattern, and thus should perhaps be
fixed in the same way.

This is not tested and clearly changes the semantics, so it is only
something to consider.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@ expression E1,E2; @@
(
  !E1 & !E2
|
- !E1 & E2
+ !(E1 & E2)
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 16:35:16 -08:00
Jan Kara e3892296de vfs: fix NULL pointer dereference in fsync_buffers_list()
Fix NULL pointer dereference in fsync_buffers_list() introduced by recent fix
of races in private_list handling.  Since bh->b_assoc_map has been cleared in
__remove_assoc_queue() we should really use original value stored in the
'mapping' variable.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 16:35:10 -08:00
Roland McGrath d31472b6d4 core dump: user_regset writeback
This makes the user_regset-based core dump code call user_regset writeback
hooks when available.  This is necessary groundwork to allow IA64 to set
CORE_DUMP_USE_REGSET.

Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 16:35:10 -08:00
Harvey Harrison 3634634edd debugfs: fix sparse warnings
extern does not belong in C files, move declaration to linux/debugfs.h
fs/debugfs/file.c:42:30: warning: symbol 'debugfs_file_operations' was not declared. Should it be static?
fs/debugfs/file.c:54:31: warning: symbol 'debugfs_link_operations' was not declared. Should it be static?

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-03-04 14:47:06 -08:00
Linus Torvalds c5750ee69f Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  [PATCH] fs/ocfs2/aops.c: Correct use of ! and &
  [2.6 patch] ocfs2: make dlm_do_assert_master() static
  [2.6 patch] make ocfs2_downconvert_thread() static
  [2.6 patch] fs/ocfs2/: possible cleanups
  [PATCH] ocfs2: le*_add_cpu conversion
  ocfs2: Fix writeout in ocfs2_data_convert_worker()
  ocfs2: Enable localalloc for local mounts
2008-03-04 11:27:32 -08:00
Linus Torvalds ce932967b9 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix blkdev_issue_flush() not detecting and passing EOPNOTSUPP back
  block: fix shadowed variable warning in blk-map.c
  block: remove extern on function definition
  cciss: remove READ_AHEAD define and use block layer defaults
  make cdrom.c:check_for_audio_disc() static
  block/genhd.c: proper externs
  unexport blk_rq_map_user_iov
  unexport blk_{get,put}_queue
  block/genhd.c: cleanups
  proper prototype for blk_dev_init()
  block/blk-tag.c should #include "blk.h"
  Fix DMA access of block device in 64-bit kernel on some non-x86 systems with 4GB or upper 4GB memory
  block: separate out padding from alignment
  block: restore the meaning of rq->data_len to the true data length
  resubmit: cciss: procfs updates to display info about many
  splice: only return -EAGAIN if there's hope of more data
  block: fix kernel-docbook parameters and files
2008-03-04 08:08:05 -08:00
Adrian Bunk a0db701a6b block/genhd.c: proper externs
This patch adds proper externs for two structs in include/linux/genhd.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-04 11:28:36 +01:00
Jens Axboe 02cf01aea5 splice: only return -EAGAIN if there's hope of more data
sys_tee() currently is a bit eager in returning -EAGAIN, it may do so
even if we don't have a chance of anymore data becoming available. So
improve the logic and only return -EAGAIN if we have an attached writer
to the input pipe.

Reported by Johann Felix Soden <johfel@gmx.de> and
Patrick McManus <mcmanus@ducksong.com>.

Tested-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-04 11:14:39 +01:00
Steve French 966ea8c4b7 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-04 04:22:00 +00:00
Julia Lawall 86c838b03d [PATCH] fs/ocfs2/aops.c: Correct use of ! and &
In commit e6bafba5b4, a bug was fixed that
involved converting !x & y to !(x & y).  The code below shows the same
pattern, and thus should perhaps be fixed in the same way.

This is not tested and clearly changes the semantics, so it is only
something to consider.

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-03 15:50:21 -08:00
Adrian Bunk 05488bbebe [2.6 patch] ocfs2: make dlm_do_assert_master() static
This patch makes the needlessly global dlm_do_assert_master() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-03 15:50:21 -08:00
Adrian Bunk 200bfae37a [2.6 patch] make ocfs2_downconvert_thread() static
This patch makes the needlessly global ocfs2_downconvert_thread()
static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-03 15:50:21 -08:00
Adrian Bunk 006000566d [2.6 patch] fs/ocfs2/: possible cleanups
This patch contains the following cleanups that are now possible:
- make the following needlessly global functions static:
  - dlmglue.c:ocfs2_process_blocked_lock()
  - heartbeat.c:ocfs2_node_map_init()
- #if 0 the following unused global function plus support functions:
  - heartbeat.c:ocfs2_node_map_is_only()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-03 15:50:21 -08:00
Marcin Slusarz 0dd3256e04 [PATCH] ocfs2: le*_add_cpu conversion
replace all:
little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
					expression_in_cpu_byteorder);
with:
	leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);
generated with semantic patch

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-03 15:50:21 -08:00
Mark Fasheh 1044e401af ocfs2: Fix writeout in ocfs2_data_convert_worker()
Commit f1f540688e "optimized"
ocfs2_data_convert_worker() to "only do work for regular files".
Unfortunately, I left out a '!', which casued it to *skip* regular files.
This was hidden from testing until recently because the default data
journaling mode (data=ordered) doesn't exercise this code.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2008-03-03 15:50:21 -08:00
Sunil Mushran 7ad8b3d30e ocfs2: Enable localalloc for local mounts
Commit 2fbe8d1ebe disabled localalloc
for local mounts. This caused issues as ocfs2 uses localalloc to
provide write locality. This patch enables localalloc for local mounts.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2008-03-03 15:50:21 -08:00
Randy Dunlap 78a4a50a86 docbook: fix filesystems.tmpl source files
Fix docbook problems in filesystems.tmpl.
These cause the generated docbook to be incorrect.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-03 10:47:13 -08:00
Linus Torvalds a64e715fc7 Allow ARG_MAX execve string space even with a small stack limit
The new code that removed the limitation on the execve string size
(which was historically 32 pages) replaced it with a much softer limit
based on RLIMIT_STACK which is usually much larger than the traditional
limit.  See commit b6a2fea393 ("mm:
variable length argument support") for details.

However, if you have a small stack limit (perhaps because you need lots
of stacks in a threaded environment), the new heuristic of allowing up
to 1/4th of RLIMIT_STACK to be used for argument and environment strings
could actually be smaller than the old limit.

So just say that it's ok to have up to ARG_MAX strings regardless of the
value of RLIMIT_STACK, and check the rlimit only when going over that
traditional limit.

(Of course, if you actually have a *really* small stack limit, the whole
stack itself will be limited before you hit ARG_MAX, but that has always
been true and is clearly the right behaviour anyway).

Acked-by: Carlos O'Donell <carlos@codesourcery.com>
Cc: Michael Kerrisk <michael.kerrisk@googlemail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ollie Wild <aaw@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-03 10:12:14 -08:00
Steve French 0dbd888936 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-01 18:29:55 +00:00
Trond Myklebust cdd0972945 Merge branch 'cleanups' into next 2008-02-28 23:48:05 -08:00
Trond Myklebust 5e4424af9a SUNRPC: Remove now-redundant RCU-safe rpc_task free path
Now that we've tightened up the locking rules for RPC queue wakeups, we can
remove the RCU-safe kfree calls...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-28 23:26:28 -08:00
Trond Myklebust f6a1cc8930 SUNRPC: Add a (empty for the moment) destructor for rpc_wait_queues
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-28 23:17:27 -08:00
Josef Jeff Sipek 1bd960ee2b [XFS] If you mount an XFS filesystem with no mount options at all, then
the "ikeep" option is set rather than "noikeep".

This regression was introduced in 970451.

With no mount options specified, xfs_parseargs() does the following:

int ikeep = 0;

args->flags |= XFSMNT_BARRIER;

args->flags2 |= XFSMNT2_COMPAT_IOSIZE;

if (!options)

goto done;

It only sets the above two options by default and before, it also used to
set XFSMNT_IDELETE by default.

If options are specified, then

if (!(args->flags & XFSMNT_DMAPI) && !ikeep)

args->flags |= XFSMNT_IDELETE;

is executed later on which is skipped by the "goto done;" above.

The solution is to invert the logic.

SGI-PV: 977771
SGI-Modid: xfs-linux-melb:xfs-kern:30590a

Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-28 20:37:56 -08:00
Linus Torvalds 7704a8b6fc Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
  [XFS] Undo bit ops cleanup mod due to regression on 32-bit powermac
  [XFS] Undo bit ops cleanup mod due to regression on 32-bit powermac
  Remove empty file fs/xfs/Makefile-linux-2.6.
2008-02-26 07:55:29 -08:00
Linus Torvalds adefe11c53 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: add missing ext4_journal_stop()
  ext4: ext4_find_next_zero_bit needs an aligned address on some arch
  ext4: set EXT4_EXTENTS_FL only for directory and regular files
  ext4: Don't mark filesystem error if fallocate fails
  ext4: Fix BUG when writing to an unitialized extent
  ext4: Don't use ext4_dec_count() if not needed
  ext4: modify block allocation algorithm for the last group
  ext4: Don't claim block from group which has corrupt bitmap
  ext4: Get journal write access before modifying the extent tree
  ext4: Fix memory and buffer head leak in callers to ext4_ext_find_extent()
  ext4: Don't leave behind a half-created inode if ext4_mkdir() fails
  ext4: Fix kernel BUG at fs/ext4/mballoc.c:910!
  ext4: Fix locking hierarchy violation in ext4_fallocate()
  Remove incorrect BKL comments in ext4
2008-02-26 07:50:16 -08:00
Lachlan McIlroy ef8ece55d9 [XFS] Undo bit ops cleanup mod due to regression on 32-bit powermac
platform.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30559a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-26 17:05:44 +11:00
Lachlan McIlroy db69c915e6 [XFS] Undo bit ops cleanup mod due to regression on 32-bit powermac
platform.

SGI-PV: 974005
SGI-Modid: xfs-linux-melb:xfs-kern:30558a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-26 17:05:37 +11:00
Trond Myklebust 5d00837b90 SUNRPC: Run rpc timeout functions as callbacks instead of in softirqs
An audit of the current RPC timeout functions shows that they don't really
ever need to run in the softirq context. As long as the softirq is
able to signal that the wakeup is due to a timeout (which it can do by
setting task->tk_status to -ETIMEDOUT) then the callback functions can just
run as standard task->tk_callback functions (in the rpciod/process
context).

The only possible border-line case would be xprt_timer() for the case of
UDP, when the callback is used to reduce the size of the transport
congestion window. In testing, however, the effect of moving that update
to a callback would appear to be minor.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:44 -08:00
Trond Myklebust fda1393938 SUNRPC: Convert users of rpc_wake_up_task to use rpc_wake_up_queued_task
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:42 -08:00
Trond Myklebust 101070ca2f NFS: Ensure that the asynchronous RPC calls complete on nfsiod.
We want to ensure that rpc_call_ops that involve mntput() are run on nfsiod
rather than on rpciod, so that they don't deadlock when the resulting
umount calls rpc_shutdown_client(). Hence we specify that read, write and
commit calls must complete on nfsiod.
Ditto for NFSv4 open, lock, locku and close asynchronous calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:37 -08:00
Trond Myklebust 5746006f1d NFS: Add an nfsiod workqueue
NFS post-rpciod cleanups often involve tasks that cannot be safely
performed within the rpciod context (due to deadlock concerns). We
therefore add a dedicated NFS workqueue that can perform tasks like
cleaning up state after an interrupted NFSv4 open() call, or calling
put_nfs_open_context() after an asynchronous read or write call.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:36 -08:00
Trond Myklebust 383ba71938 NFS: Fix a deadlock with lazy umount
We can't allow rpc callback functions like task->tk_ops->rpc_call_prepare()
and task->tk_ops->rpc_call_done() to call mntput() in any way, since
that will cause a deadlock when the call to rpc_shutdown_client() attempts
to wait on 'task' to complete.

We can avoid the above deadlock by moving calls to mntput to
task->tk_ops->rpc_release() callback, since at that time the task will be
marked as completed, and so rpc_shutdown_client won't attempt to wait on
it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:33 -08:00
Steve French 0b442d2c28 [CIFS] remove unused variable
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-02-26 03:44:02 +00:00
Trond Myklebust 4b5621f6b1 NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c
O_SYNC is stored in filp->f_flags.
Thanks to Al Viro for pointing out the bug.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 15:56:29 -08:00
Akinobu Mita 5606bf5d0c ext4: add missing ext4_journal_stop()
Add missing ext4_journal_stop() in error handling.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: adilger@clusterfs.com
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mingming Cao <cmm@us.ibm.com>
2008-02-25 15:37:42 -05:00
Christoph Hellwig 75f12983d9 [CIFS] consolidate duplicate code in posix/unix inode handling
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-02-25 20:25:21 +00:00
Hiroshi Shimamoto 13d77c37ca latencytop: change /proc task_struct access method
Change getting task_struct by get_proc_task() at read or write time,
and returns -ESRCH if get_proc_task() returns NULL.
This is same behavior as other /proc files.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-25 16:34:18 +01:00
Hiroshi Shimamoto d6643d12cb latencytop: fix memory leak on latency proc file
At lstats_open(), calling get_proc_task() gets task struct, but it never put.
put_task_struct() should be called when releasing.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-25 16:34:17 +01:00
Hiroshi Shimamoto ae0027869d latencytop: fix kernel panic while reading latency proc file
Reading /proc/<pid>/latency or /proc/<pid>/task/<tid>/latency could cause
NULL pointer dereference.

In lstats_open(), get_proc_task() can return NULL, in which case the kernel
will oops at lstats_show_proc() because m->private is NULL.

When get_proc_task() returns NULL, the kernel should return -ENOENT.

This can be reproduced by the following script.
while :
do
        date
        bash -c 'ls > ls.$$' &
        pid=$!
        cat /proc/$pid/latency &
        cat /proc/$pid/latency &
        cat /proc/$pid/latency &
        cat /proc/$pid/latency
done

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-25 16:34:17 +01:00
David Woodhouse dd919660aa [JFFS2] Use ALLOC_DELETION priority for truncation to zero length
This is going to obsolete all previous nodes, so treat it as deletion.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2008-02-25 15:25:25 +00:00
David Woodhouse b28ba9fa01 [JFFS2] Set i_blocks when truncating files
Addresses OLPC trac #6480

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2008-02-25 15:20:50 +00:00
Eugene Teo 8808117ca5 proc: add RLIMIT_RTTIME to /proc/<pid>/limits
RLIMIT_RTTIME was introduced to allow the user to set a runtime timeout on
real-time tasks: http://lkml.org/lkml/2007/12/18/218. This patch updates
/proc/<pid>/limits with the new rlimit.

Signed-off-by: Eugene Teo <eugeneteo@kernel.sg>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 17:12:15 -08:00
Christoph Hellwig 45254b4fb2 efs: move headers out of include/linux/
Merge include/linux/efs_fs{_i,_dir}.h into fs/efs/efs.h.  efs_vh.h remains
there because this is the IRIX volume header and shouldn't really be
handled by efs but by the partitioning code.  efs_sb.h remains there for
now because it's exported to userspace.  Of course this wrong and aboot
should have a copy of it's own, but I'll leave that to a separate patch to
avoid any contention.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 17:12:15 -08:00
Hans Rosenfeld 745329c4a2 /proc/pid/pagemap: fix PM_SPECIAL macro
There seems to be a bug in the PM_SPECIAL macro for /proc/pid/pagemap.  I
think masking out those other bits makes more sense then setting all those
mask bits.

Signed-off-by: Hans Rosenfeld <Hans.Rosenfeld@amd.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 17:12:13 -08:00
Roel Kluin f81e8a4387 ufs: fix parenthesisation in ufs_set_fs_state()
This bug snuck in with

commit 252e211e90
Author: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Date:   Tue Oct 16 23:26:31 2007 -0700

    Add in SunOS 4.1.x compatible mode for UFS

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Acked-by: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 17:12:13 -08:00
Miklos Szeredi 1a823ac9ff fuse: fix permission checking
I added a nasty local variable shadowing bug to fuse in 2.6.24, with the
result, that the 'default_permissions' mount option is basically ignored.

How did this happen?

 - old err declaration in inner scope
 - new err getting declared in outer scope
 - 'return err' from inner scope getting removed
 - old declaration not being noticed

-Wshadow would have saved us, but it doesn't seem practical for
the kernel :(

More testing would have also saved us :((

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 17:12:13 -08:00
Aneesh Kumar K.V ffad0a44b7 ext4: ext4_find_next_zero_bit needs an aligned address on some arch
ext4_find_next_zero_bit and ext4_find_next_bit needs a long aligned
address on x8_64. Add mb_find_next_zero_bit and mb_find_next_bit
and use them in the mballoc.

Fix: https://bugzilla.redhat.com/show_bug.cgi?id=433286

Eric Sandeen debugged the problem and suggested the fix.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by:      Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-23 01:38:34 -05:00
Aneesh Kumar K.V 42bf0383d1 ext4: set EXT4_EXTENTS_FL only for directory and regular files
In addition, don't inherit EXT4_EXTENTS_FL from parent directory.
If we have a directory with extent flag set and later mount the file
system with -o noextents, the files created in that directory will also
have extent flag set but we would not have called ext4_ext_tree_init for
them. This will cause error later when we are verifying the extent header

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-25 16:38:03 -05:00
Aneesh Kumar K.V 2c98615d3b ext4: Don't mark filesystem error if fallocate fails
If we fail to allocate blocks don't call ext4_error. Also don't hide
errors from ext4_get_blocks_wrap

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-25 15:41:35 -05:00
Mingming Cao f5ab0d1f8f ext4: Fix BUG when writing to an unitialized extent
This patch fixes a bug when writing to preallocated but uninitialized
blocks, which resulted in a BUG in fs/buffer.c saying that the buffer
is not mapped.

When writing to a file, ext4_get_block_wrap() is called with create=1 in
order to request that blocks be allocated if necessary.  It currently
calls ext4_get_blocks() with create=0 in order to do a lookup first.  If
the inode contains an unitialized data block, the buffer head is left
unampped, which ext4_get_blocks_wrap() returns, causing the BUG.

We fix this by checking to see if the buffer head is unmapped, and if
so, we make sure the the buffer head is mapped by calling
ext4_ext_get_blocks with create=1.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-25 15:29:55 -05:00
Linus Torvalds 1a4c6be4ac Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  Wrap buffers used for rpc debug printks into RPC_IFDEBUG
  nfs: fix sparse warnings
  NFS: flush signals before taking down callback thread
2008-02-21 17:19:48 -08:00
Pavel Emelyanov 5216a8e70e Wrap buffers used for rpc debug printks into RPC_IFDEBUG
Sorry for the noise, but here's the v3 of this compilation fix :)

There are some places, which declare the char buf[...] on the stack
to push it later into dprintk(). Since the dprintk sometimes (if the
CONFIG_SYSCTL=n) becomes an empty do { } while (0) stub, these buffers
cause gcc to produce appropriate warnings.

Wrap these buffers with RPC_IFDEBUG macro, as Trond proposed, to
compile them out when not needed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-21 18:42:29 -05:00
David Teigland 599e0f584d dlm: fix rcom_names message to self
The recent patch to validate data lengths in rcom_names messages
failed to account for fake messages a node directs to itself before
ever sending it.  In this case we need to fill in the message length
in the header for the validation code to use.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-02-21 15:19:54 -06:00
Linus Torvalds 1803f3389b Remove empty file remnants that were left in the tree by mistake
Noted by various people (Sam, Jeff, Roland..)

Commit 58b7983d15 intended to remove the
xfs "Makefile-linux-2.6" file, but it was mistakenly still left in the
tree as a empty file, and would cause git to correctly complain about a
tracked file being removed after a "make distclean" (which removes empty
files as garbage).

And the asm-x86/desc_64.h file was supposed to be removed by commit
c81c6ca45a, but instead stayed around
containing just a single newline.

Get rid of them both properly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-20 19:56:01 -08:00
Harvey Harrison 90dc7d2796 nfs: fix sparse warnings
fs/nfs/nfs4state.c:788:34: warning: Using plain integer as NULL pointer
fs/nfs/delegation.c:52:34: warning: Using plain integer as NULL pointer
fs/nfs/idmap.c:312:12: warning: Using plain integer as NULL pointer
fs/nfs/callback_xdr.c:257:6: warning: Using plain integer as NULL pointer
fs/nfs/callback_xdr.c:270:6: warning: Using plain integer as NULL pointer
fs/nfs/callback_xdr.c:281:6: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-20 16:15:44 -05:00
Jeff Layton 1227a74e2e NFS: flush signals before taking down callback thread
Now that the reference counting on the callback thread is working as
expected, it uncovers another problem.  Peter Staubach noticed while
testing that patch on an older kernel that he would occasionally see
this printk in rpc_register fire:

    "RPC: failed to contact portmap (errno -512).

The NFSv4 callback thread is signaled by nfs_callback_down(), but never
flushes that signal. All of the shutdown processing is done with that
signal pending. This makes it fail the call to unregister the port with
the portmapper.

In actuality, this rpc_register call isn't necessary at all since the
port isn't actually registered with the portmapper anymore. Regardless,
there doesn't seem to be any reason to leave the signal pending while
the thread is being shut down and flushing it should generally silence
that printk.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-20 13:32:43 -05:00
Adrian Bunk 86b6c7a7f7 fs/block_dev.c: remove #if 0'ed code
Commit b2e895dbd8 #if 0'ed this code stating:

<--  snip  -->

    [PATCH] revert blockdev direct io back to 2.6.19 version

    Andrew Vasquez is reporting as-iosched oopses and a 65% throughput
    slowdown due to the recent special-casing of direct-io against
    blockdevs.  We don't know why either of these things are occurring.

    The patch minimally reverts us back to the 2.6.19 code for a 2.6.20
    release.

<--  snip  -->

It has since been dead code, and unless someone wants to revive it now
it's time to remove it.

This patch also makes bio_release_pages() static again and removes the
ki_bio_count member from struct kiocb, reverting changes that had been
done for this dead code.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2008-02-19 10:04:00 +01:00
Adrian Bunk 4c54ac62dc make struct def_blk_aops static
This patch makes the needlessly global struct def_blk_aops static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2008-02-19 10:04:00 +01:00
Steve French e086fcea86 [CIFS] fix build break when proc disabled
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-02-18 04:03:58 +00:00
Lachlan McIlroy c58310bf49 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-linus 2008-02-18 13:51:42 +11:00
Lachlan McIlroy 269cdfaf76 [XFS] Added quota targets and removed dmapi directory
Fixes build failures introduced by bad merge to mainline.
2008-02-18 13:06:17 +11:00
Eric Sandeen 794f744b22 [XFS] Fix up xfs out-of-tree builds. (a.k.a. external modules)
Change -I include directives to find headers in the out-of-tree spot. This
allows a directory containing only xfs files to be built as:

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29878a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-18 12:59:11 +11:00
Andi Kleen 58b7983d15 [XFS] Remove Makefile wrappers in XFS
Makefile (and Kbuild) would include Makefile-linux-26 I doubt XFS will
really still compile on 2.4; so drop that. This moves Makefile-linux-26
into Makefile and drops Kbuild. Also having wrappers as both Kbuild and
Makefile seemed redundant anyways.

The patch is relatively large because it renames a file, but no functional
changes.

SGI-PV: 971050
SGI-Modid: xfs-linux-melb:xfs-kern:29781a

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-18 12:48:03 +11:00
Steve French 0a3abcf75b Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-02-15 21:06:08 +00:00
Christoph Hellwig 70eff55d2d [CIFS] factoring out common code in get_inode_info functions
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-02-15 20:55:05 +00:00
Theodore Ts'o 825f1481ea ext4: Don't use ext4_dec_count() if not needed
The ext4_dec_count() function is only needed when dropping the i_nlink
count on inodes which are (or which could be) directories.  If we
*know* that the inode in question can't possibly be a directory, use
drop_nlink or clear_nlink() if we know i_nlink is 1.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-15 15:00:38 -05:00
Steve French c2d68ea65b [CIFS] fix prepath conversion when server supports posix paths
Jeff Layton that we were converting \ to / in the posix path case which is
not always right (depends on what the old delim was).

CC: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-02-15 19:20:18 +00:00
Igor Mammedov 11b6d6450c [CIFS] Only convert / when server does not support posix paths
Also add warning if posix path setting changes on reconnect

Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-02-15 19:06:04 +00:00
Valerie Clement 74d3487fc8 ext4: modify block allocation algorithm for the last group
When a directory inode is allocated in the last group and the last group
contains less than s_blocks_per_group blocks, the initial block allocated
for the directory is not always allocated in the same group as the
directory inode, but in one of the first groups of the filesystem (group 1
for example).
Depending on the current process's pid, ext4_find_near() and 
ext4_ext_find_goal() can return a block number greater than the maximum
blocks count in the filesystem and in that case the block will be not
allocated in the same group as the inode.

The following patch fixes the problem.

Should the modification also be done in ext2/3 code?

Signed-off-by: Valerie Clement <valerie.clement@bull.net>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-15 13:43:07 -05:00
Aneesh Kumar K.V e56eb65906 ext4: Don't claim block from group which has corrupt bitmap
In ext4_mb_complex_scan_group, if the extent length of the newly
found extentet is greater than than the total free blocks counted
in group info, break without claiming the block.

Document different ext4_error usage, explaining the state with which we
continue if we mount with errors=continue

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-15 13:48:21 -05:00
Aneesh Kumar K.V 9df5643ad1 ext4: Get journal write access before modifying the extent tree
When the user was writing into an unitialized extent,
ext4_ext_convert_to_initialize() was not requesting journal write access
before it started to modify the extent tree.   Fix this oversight.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-22 06:17:31 -05:00
Aneesh Kumar K.V b35905c16a ext4: Fix memory and buffer head leak in callers to ext4_ext_find_extent()
The path variable returned via ext4_ext_find_extent is a kmalloc
variable and needs to be freeded.  It also contains a reference to
buffer_head which needs to be dropped.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-25 16:54:37 -05:00
Aneesh Kumar K.V 4cdeed861b ext4: Don't leave behind a half-created inode if ext4_mkdir() fails
If ext4_mkdir() fails to allocate the initial block for the directory,
don't leave behind a half-created directory inode with the link count
left at one.  This was caused by an inappropriate call to ext4_dec_count().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-22 06:17:31 -05:00
Valerie Clement b73fce69ec ext4: Fix kernel BUG at fs/ext4/mballoc.c:910!
With the flex_bg feature enabled, a large file creation oopses the
kernel.   The BUG_ON is:
	BUG_ON(len >= EXT4_BLOCKS_PER_GROUP(sb));

As the allocation of the bitmaps and the inode table can be done
outside the block group with flex_bg, this allows to allocate up to
EXT4_BLOCKS_PER_GROUP blocks in a group.

This patch fixes the oops.

Signed-off-by: Valerie Clement <valerie.clement@bull.net>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-15 13:48:51 -05:00
Trond Myklebust 52833e897f Merge branch 'linus_origin' into hotfixes 2008-02-15 13:36:30 -05:00
Igor Mammedov 8aad018b6c [CIFS] Fix mixed case name in structure dfs_info3_param
Signed-off-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-02-15 18:21:49 +00:00
Aneesh Kumar K.V 55bd725aa3 ext4: Fix locking hierarchy violation in ext4_fallocate()
ext4_fallocate() was trying to acquire i_data_sem outside of
jbd2_start_transaction/jbd2_journal_stop, which violates ext4's locking
hierarchy.  So we take i_mutex to prevent writes and truncates during
the complete fallocate operation, and use ext4_get_block_wrap() which
acquires and releases i_data_sem for each block allocation.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-15 12:47:21 -05:00
Andi Kleen 642be6ec21 Remove incorrect BKL comments in ext4
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-02-25 17:20:46 -05:00
Christoph Lameter 4a0962abd1 dentries: Extract common code to remove dentry from lru
Extract the common code to remove a dentry from the lru into a new function
dentry_lru_remove().

Two call sites used list_del() instead of list_del_init().  AFAIK the
performance of both is the same.  dentry_lru_remove() does a list_del_init().

As a result dentry->d_lru is now always empty when a dentry is freed.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:09 -08:00
Jan Blunck cf28b4863f d_path: Make d_path() use a struct path
d_path() is used on a <dentry,vfsmount> pair.  Lets use a struct path to
reflect this.

[akpm@linux-foundation.org: fix build in mm/memory.c]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Bryan Wu <bryan.wu@analog.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:09 -08:00
Jan Blunck c32c2f63a9 d_path: Make seq_path() use a struct path argument
seq_path() is always called with a dentry and a vfsmount from a struct path.
Make seq_path() take it directly as an argument.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:08 -08:00
Jan Blunck e83aece3af Use struct path in struct svc_expkey
I'm embedding struct path into struct svc_expkey.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:08 -08:00
Jan Blunck 5477549161 Use struct path in struct svc_export
I'm embedding struct path into struct svc_export.

[akpm@linux-foundation.org: coding-style fixes]
[ezk@cs.sunysb.edu: NFSD: fix wrong mnt_writer count in rename]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:08 -08:00
Jan Blunck 448678a0f3 d_path: Make get_dcookie() use a struct path argument
get_dcookie() is always called with a dentry and a vfsmount from a struct
path.  Make get_dcookie() take it directly as an argument.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:08 -08:00
Jan Blunck 3dcd25f37c d_path: Make proc_get_link() use a struct path argument
proc_get_link() is always called with a dentry and a vfsmount from a struct
path.  Make proc_get_link() take it directly as an argument.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:08 -08:00
Jan Blunck a03a8a709a d_path: kerneldoc cleanup
Move and update d_path() kernel API documentation.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:08 -08:00
Jan Blunck 329c97f0af One less parameter to __d_path
All callers to __d_path pass the dentry and vfsmount of a struct path to
__d_path.  Pass the struct path directly, instead.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:17:08 -08:00
Jan Blunck ac748a09fc Make set_fs_{root,pwd} take a struct path
In nearly all cases the set_fs_{root,pwd}() calls work on a struct
path. Change the function to reflect this and use path_get() here.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck 6ac08c39a1 Use struct path in fs_struct
* Use struct path in fs_struct.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck 5dd784d049 Introduce path_get()
This introduces the symmetric function to path_put() for getting a reference
to the dentry and vfsmount of a struct path in the right order.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck 09da5916ba Use path_put() in a few places instead of {mnt,d}put()
Use path_put() in a few places instead of {mnt,d}put()

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck 1d957f9bf8 Introduce path_put()
* Add path_put() functions for releasing a reference to the dentry and
  vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()

[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck 4ac9137858 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
  <dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
  struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck 429731b155 Remove path_release_on_umount()
path_release_on_umount() should only be called from sys_umount(). I merged the
function into sys_umount() instead of having in in namei.c.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:32 -08:00
Mike Frysinger e2a366dc5c FLAT binaries: drop BINFMT_FLAT bad header magic warning
The warning issued by fs/binfmt_flat.c when the format handler is given a
non-FLAT and non-script executable is annoying to say the least when working
with FDPIC ELF objects.  If you build a kernel that supports both FLAT and
FDPIC ELFs on no-mmu, every time you execute an FDPIC ELF, the kernel spits
out this message.  While I understand a lot of newcomers to the no-mmu world
screw up generation of FLAT binaries, this warning is not usable for systems
that support more than just FLAT.

Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Bernd Schmidt <bernds_cb1@t-online.de>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 20:58:05 -08:00
Harvey Harrison 3c828e4945 inotify: make variables static in inotify_user.c
inotify_max_user_instances, inotify_max_user_watches,
inotify_max_queued_events can all be made static.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 20:58:04 -08:00
Steve French 03a143c909 [CIFS] fixup prefixpaths which contain multiple path components
Currently, when we get a prefixpath as part of mount, the kernel only
changes the first character to be a '/' or '\' depending on whether
posix extensions are enabled. This is problematic as it expects
mount.cifs to pass in the correct delimiter in the rest of the
prefixpath. But, mount.cifs may not know *what* the correct delimiter
is. It's a chicken and egg problem.

Note that mount.cifs should not do conversion of the
prefixpath - if we want posix behavior then '\' is legal in a path
(and we have had bugs in the distant path to prove to me that
customers sometimes have apps that require '\').  The kernel code
assumes that the path passed in is posix (and current code will handle
the first path component fine but was broken for Windows mounts
for "deep" prefixpaths unless the user specified a prefixpath with '\'
deep in it.   So e.g. with current kernel code:

1) mount to //server/share/dir1 will work to all server types
2) mount to //server/share/dir1/subdir1 will work to Samba
3) mount to //server/share/dir1\\subdir1 will work to Windows

But case two would fail to Windows without the fix.
With the kernel cifs module fix case two now works.

First analyzed by Jeff Layton and Simo Sorce

CC: Jeff Layton <jlayton@redhat.com>
CC: Simo Sorce <simo@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-02-14 06:38:30 +00:00
Nick Piggin 2785259631 nfs: use GFP_NOFS preloads for radix-tree insertion
NFS should use GFP_NOFS mode radix tree preloads rather than GFP_ATOMIC
allocations at radix-tree insertion-time.  This is important to reduce the
atomic memory requirement.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-13 23:24:09 -05:00
Olga Kornievskaia 8d042218b0 NFS: add missing spkm3 strings to mount option parser
This patch adds previous missing spkm3 string values that are needed
to parse mount options in the kernel.
2008-02-13 23:24:08 -05:00
Jeff Layton 25606656b1 NFS: remove error field from nfs_readdir_descriptor_t
The error field in nfs_readdir_descriptor_t is never used outside of the
function in which it is set. Remove the field and change the place that
does use it to use an existing local variable.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-13 23:24:07 -05:00
Dan Muntz 497799e7c0 NFS: missing spaces in KERN_WARNING
The warning message for a v4 server returning various bad sequence-ids is
missing spaces.

Signed-off-by: Dan Muntz <dmuntz@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-13 23:24:06 -05:00
Chuck Lever 4267c9561d NFS: Allow text-based mounts via compat_sys_mount
The compat_sys_mount() system call throws EINVAL for text-based NFSv4
mounts.

The text-based mount interface assumes that any mount option blob that
doesn't set the version field to "1" is a C string (ie not a legacy
mount request).  The compat_sys_mount() call treats blobs that don't
set the version field to "1" as an error.  We just relax the check in
compat_sys_mount() a bit to allow C strings to be passed down to the NFSv4
client.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-13 23:24:05 -05:00
Jeff Layton 8e60029f40 NFS: fix reference counting for NFSv4 callback thread
The reference counting for the NFSv4 callback thread stays artificially
high. When this thread comes down, it doesn't properly tear down the
svc_serv, causing a memory leak. In my testing on an older kernel on
x86_64, memory would leak out of the 8k kmalloc slab. So, we're leaking
at least a page of memory every time the thread comes down.

svc_create() creates the svc_serv with a sv_nrthreads count of 1, and
then svc_create_thread() increments that count. Whenever the callback
thread is started it has a sv_nrthreads count of 2. When coming down, it
calls svc_exit_thread() which decrements that count and if it hits 0, it
tears everything down. That never happens here since the count is always
at 2 when the thread exits.

The problem is that nfs_callback_up() should be calling svc_destroy() on
the svc_serv on both success and failure. This is how lockd_up_proto()
handles the reference counting, and doing that here fixes the leak.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-13 23:24:04 -05:00
Marcin Slusarz cba44359d1 udf: fix udf_add_free_space
In commit 742ba02a51 (udf: create common
function for changing free space counter) by accident I reversed safety
condition which lead to null pointer dereference in case of media error and
wrong counting of free space in normal situation

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13 16:21:20 -08:00