linux-2.6

dect

Archived

Author	SHA1	Message	Date
tao.ma@oracle.com	30b8548f2c	[PATCH] ocfs2: Fix a wrong cluster calculation. In ocfs2_alloc_write_ctxt, the written clusters length is calculated by the byte length only. This may cause some problems if we start to write at some position near the end of one cluster and continue into a second cluster while the "len" is smaller than a cluster size. In that case, we actually have to write 2 clusters. So we have to take the start position into consideration also. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:39:05 -07:00
Tiger Yang	c0123adef6	[PATCH] ocfs2: fix mount option parsing For some mount option types, ocfs2_parse_options() will try to access sb->s_fs_info to get at the ocfs2 private superblock. Unfortunately, that hasn't been allocated yet and will cause a kernel crash. Fix this by storing options in a struct which can then get pushed into the ocfs2_super once it's been allocated later. If we need more options which store to the ocfs2_super in the future, we can just add fields to this struct. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:38:48 -07:00
Mark Fasheh	e0dceaf0a4	ocfs2: set non-default s_time_gran during mount We need to manually set this to '1' during mount, otherwise inode_setattr() will chop off the nanosecond portion of our timestamps. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:58 -07:00
Sunil Mushran	ce17204ae6	ocfs2: Retry sendpage() if it returns EAGAIN Instead of treating EAGAIN, returned from sendpage(), as an error, this patch retries the operation. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:38 -07:00
Sunil Mushran	480214d71f	ocfs2: Fix rename/extend race If one process is extending a file while another is renaming it, there exists a window when rename could flush the old inode's stale i_size to disk. This patch recognizes the fact that rename is only updating the old inode's ctime, so it ensures only that value is flushed to disk. Signed-off-by: Sunil Mushran <sunil.musran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:10 -07:00
Adrian Bunk	6a18380e7d	[2.6 patch] ocfs2_insert_extent(): remove dead code This patch removes some now dead code. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:26:03 -07:00
Mark Fasheh	5a25403175	ocfs2: Fix max offset calculations ocfs2_max_file_offset() was over-estimating the largest file size for several cases. This wasn't really a problem before, but now that we support sparse files, it needs to be more accurate. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:49 -07:00
Mark Fasheh	ce76fd30ce	ocfs2: check ia_size limits in setattr We have to manually check the requested truncate size as the check in vmtruncate() comes too late for Ocfs2. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:38 -07:00
Mark Fasheh	7c08d70c69	ocfs2: Fix some casting errors related to file writes ocfs2_align_clusters_to_page_index() needs to cast the clusters shift to pgoff_t and ocfs2_file_buffered_write() needs loff_t when calculating destination start for memcpy. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:27 -07:00
Mark Fasheh	a00cce356b	ocfs2: use s_maxbytes directly in ocfs2_change_file_space() There's no need to recalculate things via ocfs2_max_file_offset() as we've already done that to fill s_maxbytes, so use that instead. We can also un-export ocfs2_max_file_offset() then. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:07 -07:00
Mark Fasheh	c11e9fafb3	ocfs2: Restrict inode changes in ocfs2_update_inode_atime() ocfs2_update_inode_atime() calls ocfs2_mark_inode_dirty() to push changes from the struct inode into the ocfs2 disk inode. The problem is, ocfs2_mark_inode_dirty() might change other fields, depending on what happened to the struct inode. Since we don't always have locking to serialize changes to other fields (like i_size, etc), just fix things up to only touch the atime field. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:23:50 -07:00
Jens Axboe	3836df6b52	ocfs2: bad kunmap_atomic() kunmap_atomic() takes the virtual address, not the mapped page as argument. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-24 16:02:55 -07:00
Nick Piggin	1833633803	fix some conversion overflows Fix page index to offset conversion overflows in buffer layer, ecryptfs, and ocfs2. It would be nice to convert the whole tree to page_offset, but for now just fix the bugs. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-20 08:44:19 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Linus Torvalds	f745bb1c73	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: ->fallocate() support	2007-07-19 14:16:44 -07:00
Nick Piggin	d0217ac04c	mm: fault feedback #1 Change ->fault prototype. We now return an int, which contains VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte. FAULT_RET_ code tells the VM whether a page was found, whether it has been locked, and potentially other things. This is not quite the way he wanted it yet, but that's changed in the next patch (which requires changes to arch code). This means we no longer set VM_CAN_INVALIDATE in the vma in order to say that a page is locked which requires filemap_nopage to go away (because we can no longer remain backward compatible without that flag), but we were going to do that anyway. struct fault_data is renamed to struct vm_fault as Linus asked. address is now a void __user * that we should firmly encourage drivers not to use without really good reason. The page is now returned via a page pointer in the vm_fault struct. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Nick Piggin	54cb8821de	mm: merge populate and nopage into fault (fixes nonlinear) Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes the virtual address -> file offset differently from linear mappings. ->populate is a layering violation because the filesystem/pagecache code should not need to know anything about the virtual memory mapping. The hitch here is that the ->nopage handler didn't pass down enough information (ie. pgoff). But it is more logical to pass pgoff rather than have the ->nopage function calculate it itself anyway (because that's a similar layering violation). Having the populate handler install the pte itself is likewise a nasty thing to be doing. This patch introduces a new fault handler that replaces ->nopage and ->populate and (later) ->nopfn. Most of the old mechanism is still in place so there is a lot of duplication and nice cleanups that can be removed if everyone switches over. The rationale for doing this in the first place is that nonlinear mappings are subject to the pagefault vs invalidate/truncate race too, and it seemed stupid to duplicate the synchronisation logic rather than just consolidate the two. After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in pagecache. Seems like a fringe functionality anyway. NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no users have hit mainline yet. [akpm@linux-foundation.org: cleanup] [randy.dunlap@oracle.com: doc. fixes for readahead] [akpm@linux-foundation.org: build fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Nick Piggin	d00806b183	mm: fix fault vs invalidate race for linear mappings Fix the race between invalidate_inode_pages and do_no_page. Andrea Arcangeli identified a subtle race between invalidation of pages from pagecache with userspace mappings, and do_no_page. The issue is that invalidation has to shoot down all mappings to the page, before it can be discarded from the pagecache. Between shooting down ptes to a particular page, and actually dropping the struct page from the pagecache, do_no_page from any process might fault on that page and establish a new mapping to the page just before it gets discarded from the pagecache. The most common case where such invalidation is used is in file truncation. This case was catered for by doing a sort of open-coded seqlock between the file's i_size, and its truncate_count. Truncation will decrease i_size, then increment truncate_count before unmapping userspace pages; do_no_page will read truncate_count, then find the page if it is within i_size, and then check truncate_count under the page table lock and back out and retry if it had subsequently been changed (ptl will serialise against unmapping, and ensure a potentially updated truncate_count is actually visible). Complexity and documentation issues aside, the locking protocol fails in the case where we would like to invalidate pagecache inside i_size. do_no_page can come in anytime and filemap_nopage is not aware of the invalidation in progress (as it is when it is outside i_size). The end result is that dangling (->mapping == NULL) pages that appear to be from a particular file may be mapped into userspace with nonsense data. Valid mappings to the same place will see a different page. Andrea implemented two working fixes, one using a real seqlock, another using a page->flags bit. He also proposed using the page lock in do_no_page, but that was initially considered too heavyweight. However, it is not a global or per-file lock, and the page cacheline is modified in do_no_page to increment _count and _mapcount anyway, so a further modification should not be a large performance hit. Scalability is not an issue. This patch implements this latter approach. ->nopage implementations return with the page locked if it is possible for their underlying file to be invalidated (in that case, they must set a special vm_flags bit to indicate so). do_no_page only unlocks the page after setting up the mapping completely. invalidation is excluded because it holds the page lock during invalidation of each page (and ensures that the page is not mapped while holding the lock). This also allows significant simplifications in do_no_page, because we have the page locked in the right place in the pagecache from the start. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Mark Fasheh	385820a38d	ocfs2: ->fallocate() support Plug ocfs2 into the ->fallocate() callback. This just re-uses the existing preallocation code. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-19 00:23:55 -07:00
Jeremy Fitzhardinge	86313c488a	usermodehelper: Tidy up waiting Rather than using a tri-state integer for the wait flag in call_usermodehelper_exec, define a proper enum, and use that. I've preserved the integer values so that any callers I've missed should still work OK. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Andi Kleen <ak@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: David Howells <dhowells@redhat.com>	2007-07-18 08:47:40 -07:00
Linus Torvalds	b8c638acac	Merge branch 'uninit-var' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 * 'uninit-var' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6: arch/i386/* fs/* ipc/: mark variables with uninitialized_var() drivers/: mark variables with uninitialized_var()	2007-07-17 15:19:06 -07:00
Jeff Garzik	8e1c091ccc	arch/i386/* fs/* ipc/*: mark variables with uninitialized_var() Mark variables with uninitialized_var() if such a warning appears, and analysis proves that the var is initialized properly on all paths it is used. Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-07-17 16:23:19 -04:00
Satyam Sharma	3bd858ab1c	Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check Introduce is_owner_or_cap() macro in fs.h, and convert over relevant users to it. This is done because we want to avoid bugs in the future where we check for only effective fsuid of the current task against a file's owning uid, without simultaneously checking for CAP_FOWNER as well, thus violating its semantics. [ XFS uses special macros and structures, and in general looked ... untouchable, so we leave it alone -- but it has been looked over. ] The (current->fsuid != inode->i_uid) check in generic_permission() and exec_permission_lite() is left alone, because those operations are covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Cc: Al Viro <viro@ftp.linux.org.uk> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 12:00:03 -07:00
Christoph Hellwig	a569425512	knfsd: exportfs: add exportfs.h header currently the export_operation structure and helpers related to it are in fs.h. fs.h is already far too large and there are very few places needing the export bits, so split them off into a separate header. [akpm@linux-foundation.org: fix cifs build] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@suse.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 10:23:06 -07:00
Linus Torvalds	add096909d	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits) [PATCH] ocfs2: zero_user_page conversion ocfs2: Support xfs style space reservation ioctls ocfs2: support for removing file regions ocfs2: update truncate handling of partial clusters ocfs2: btree support for removal of arbitrary extents ocfs2: Support creation of unwritten extents ocfs2: support writing of unwritten extents ocfs2: small cleanup of ocfs2_write_begin_nolock() ocfs2: btree changes for unwritten extents ocfs2: abstract btree growing calls ocfs2: use all extent block suballocators ocfs2: plug truncate into cached dealloc routines ocfs2: simplify deallocation locking ocfs2: harden buffer check during mapping of page blocks ocfs2: shared writeable mmap ocfs2: factor out write aops into nolock variants ocfs2: rework ocfs2_buffered_write_cluster() ocfs2: take ip_alloc_sem during entire truncate ocfs2: Add "preferred slot" mount option [KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c ...	2007-07-16 10:52:55 -07:00
Tejun Heo	7b595756ec	sysfs: kill unnecessary attribute->owner sysfs is now completely out of driver/module lifetime game. After deletion, a sysfs node doesn't access anything outside sysfs proper, so there's no reason to hold onto the attribute owners. Note that often the wrong modules were accounted for as owners leading to accessing removed modules. This patch kills now unnecessary attribute->owner. Note that with this change, userland holding a sysfs node does not prevent the backing module from being unloaded. For more info regarding lifetime rule cleanup, please read the following message. http://article.gmane.org/gmane.linux.kernel/510293 (tweaked by Greg to not delete the field just yet, to make it easier to merge things properly.) Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:06 -07:00
Eric Sandeen	54c57dc3b6	[PATCH] ocfs2: zero_user_page conversion Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:10 -07:00
Mark Fasheh	b25801038d	ocfs2: Support xfs style space reservation ioctls We re-use the RESVSP/UNRESVSP ioctls from xfs which allow the user to allocate and deallocate regions to a file without zeroing data or changing i_size. Though renamed, the structure passed in from user is identical to struct xfs_flock64. The three fields that are actually used right now are l_whence, l_start and l_len. This should get ocfs2 immediate compatibility with userspace software using the pre-existing xfs ioctls. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:09 -07:00
Mark Fasheh	063c4561f5	ocfs2: support for removing file regions Provide an internal interface for the removal of arbitrary file regions. ocfs2_remove_inode_range() takes a byte range within a file and will remove existing extents within that range. Partial clusters will be zeroed so that any read from within the region will return zeros. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:08 -07:00
Mark Fasheh	35edec1d52	ocfs2: update truncate handling of partial clusters The partial cluster zeroing code used during truncate usually assumes that the rightmost byte in the range to be zeroed lies on a cluster boundary. This makes sense for truncate, but punching holes might require zeroing on non-aligned rightmost boundaries. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:07 -07:00
Mark Fasheh	d0c7d7082e	ocfs2: btree support for removal of arbitrary extents Add code to the btree paths to support the removal of arbitrary regions within an existing extent. With proper higher level support this can be used to "punch holes" in a file. Truncate (a special case of hole punching) could also be converted to use these methods. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:05 -07:00
Mark Fasheh	2ae99a6037	ocfs2: Support creation of unwritten extents This can now be trivially supported with re-use of our existing extend code. ocfs2_allocate_unwritten_extents() takes a start offset and a byte length and iterates over the inode, adding extents (marked as unwritten) until len is reached. Existing extents are skipped over. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:04 -07:00
Mark Fasheh	b27b7cbcf1	ocfs2: support writing of unwritten extents Update the write code to detect when the user is asking to write to an unwritten extent. Like writing to a hole, we must zero the region between the write and the cluster boundaries. Most of the existing cluster zeroing logic can be re-used with some additional checks for the unwritten flag on extent records. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:03 -07:00
Mark Fasheh	0d172baa55	ocfs2: small cleanup of ocfs2_write_begin_nolock() We can easily separate out the write descriptor setup and manipulation into helper functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:01 -07:00
Mark Fasheh	328d5752e1	ocfs2: btree changes for unwritten extents Writes to a region marked as unwritten might result in a record split or merge. We can support splits by making minor changes to the existing insert code. Merges require left rotations which mostly re-use right rotation support functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:32:00 -07:00
Mark Fasheh	c3afcbb344	ocfs2: abstract btree growing calls The top level calls and logic for growing a tree can easily be abstracted out of ocfs2_insert_extent() into a separate function - ocfs2_grow_tree(). This allows future code to easily grow btrees when needed. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:58 -07:00
Mark Fasheh	1f6697d072	ocfs2: use all extent block suballocators Now that we have a method to deallocate blocks from them, each node should allocate extent blocks from their local suballocator file. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:56 -07:00
Mark Fasheh	59a5e416d1	ocfs2: plug truncate into cached dealloc routines Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:55 -07:00
Mark Fasheh	2b604351bc	ocfs2: simplify deallocation locking Deallocation of suballocator blocks, most notably extent blocks, might involve multiple suballocator inodes. The locking for this can get extremely complicated, especially when the suballocator inodes to delete from aren't known until deep within an unrelated codepath. Implement a simple scheme for recording the blocks to be unlinked so that the actual deallocation can be done in a context which won't deadlock. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:54 -07:00
Mark Fasheh	bce997682f	ocfs2: harden buffer check during mapping of page blocks We don't want to submit buffer_new blocks for read i/o. This actually won't happen right now because those requests during an allocating write are all nicely aligned. It's probably a good idea to provide an explicit check though. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:52 -07:00
Mark Fasheh	7307de8051	ocfs2: shared writeable mmap Implement cluster consistent shared writeable mappings using the ->page_mkwrite() callback. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:51 -07:00
Mark Fasheh	607d44aa3f	ocfs2: factor out write aops into nolock variants ocfs2_mkwrite() will want this so that it can add some mmap specific checks before asking for a write. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:49 -07:00
Mark Fasheh	3a307ffc27	ocfs2: rework ocfs2_buffered_write_cluster() Use some ideas from the new-aops patch series and turn ocfs2_buffered_write_cluster() into a 2 stage operation with the caller copying data in between. The code now understands multiple cluster writes as a result of having to deal with a full page write for greater than 4k pages. This sets us up to easily call into the write path during ->page_mkwrite(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:31:46 -07:00
Mark Fasheh	2e89b2e48e	ocfs2: take ip_alloc_sem during entire truncate Use of the alloc sem during truncate was too narrow - we want to protect the i_size change and page truncation against mmap now. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:19:57 -07:00
Sunil Mushran	baf4661a82	ocfs2: Add "preferred slot" mount option ocfs2 will attempt to assign the node the slot# provided in the mount option. Failure to assign the preferred slot is not an error. This small feature can be useful for automated testing. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:19:54 -07:00
Shani Moideen	5fb0f7f010	[KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c Signed-off-by: Shani Moideen <shani.moideen@wipro.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:19:52 -07:00
Christoph Hellwig	800deef3f6	[PATCH] ocfs2: use list_for_each_entry where beneficial Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:19:49 -07:00
Joel Becker	e6df3a663a	ocfs2: Wake up a starting region if it gets killed in the background. Tell o2hb_region_dev_write() to wake up if rmdir(2) happens on the heartbeat region while it is starting up. Then o2hb_region_dev_write() can check to see if it is alive and act accordingly. This prevents a hang (not being woken) and a crash (if it's woken by a signal). Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:19:46 -07:00
Joel Becker	16c6a4f24d	ocfs2: live heartbeat depends on the local node configuration Removing the local node configuration out from underneath a running heartbeat is "bad". Provide an API in the ocfs2 nodemanager to request a configfs dependency on the local node, then use it in heartbeat. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:19:43 -07:00
Joel Becker	14829422be	ocfs2: Depend on configfs heartbeat items. ocfs2 mounts require a heartbeat region. Use the new configfs_depend_item() facility to actually depend on them so they can't go away from under us. First, teach cluster/nodemanager.c to depend an item on the o2cb subsystem. Then teach o2hb_register_callbacks to take a UUID and depend on the appropriate region. Finally, teach all users of o2hb to pass a UUID or NULL if they don't require a pin. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:19:40 -07:00
Joel Becker	e6bd07aee7	configfs: Convert subsystem semaphore to mutex Convert the su_sem member of struct configfs_subsystem to a struct mutex, as that's what it is. Also convert all the users and update Documentation/configfs.txt and Documentation/configfs_example.c accordingly. [ Conflict in fs/dlm/config.c with commit `3168b0780d` manually resolved. --Mark ] Inspired-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-07-10 17:10:56 -07:00
Jens Axboe	cac36bb06e	pipe: change the ->pin() operation to ->confirm() The name 'pin' was badly chosen, it doesn't pin a pipe buffer in the most commonly used sense in the kernel. So change the name to 'confirm', after debating this issue with Hugh Dickins a bit. A good return from ->confirm() means that the buffer is really there, and that the contents are good. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:15 +02:00
Jens Axboe	d6b29d7cee	splice: divorce the splice structure/function definitions from the pipe header We need to move even more stuff into the header so that folks can use the splice_to_pipe() implementation instead of open-coding a lot of pipe knowledge (see relay implementation), so move to our own header file finally. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:14 +02:00
Jens Axboe	5ffc4ef45b	sendfile: remove .sendfile from filesystems that use generic_file_sendfile() They can use generic_file_splice_read() instead. Since sys_sendfile() now prefers that, there should be no change in behaviour. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:13 +02:00
Jens Axboe	6a14b90bb6	vmsplice: add vmsplice-to-user support A bit of a cheat, it actually just copies the data to userspace. But this makes the interface nice and symmetric and enables people to build on splice, with room for future improvement in performance. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:12 +02:00
Jens Axboe	c66ab6fa70	splice: abstract out actor data For direct splicing (or private splicing), the output may not be a file. So abstract out the handling into a specified actor function and put the data in the splice_desc structure earlier, so we can build on top of that. This is the first step in better splice handling for drivers, and also for implementing vmsplice _to_ user memory. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:12 +02:00
Mark Fasheh	eeb47d1234	ocfs2: Fix invalid assertion during write on 64k pages The write path code intends to bug if a math error (or unhandled case) results in a write outside of the current cluster boundaries. The actual BUG_ON() statements however are incorrect, leading to a crash on kernels with 64k page size. Fix those by checking against the right variables. Also, move the assertions higher up within the functions so that they trip before the code starts to mark buffers. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-06-06 16:42:03 -07:00
Tiger Yang	59be7dc97b	ocfs2: Fix masklog breakage Some of the sysfs changes inadvertently broke the simple runtime debug log filtering employed in ocfs2. Fix this by properly exporting the masklog category filter names. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-06-06 16:41:08 -07:00
Christoph Hellwig	d9b08b9efe	[PATCH] ocfs2: use generic_segment_checks Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-25 11:06:37 -07:00
Mark Fasheh	8fccfc829a	ocfs2: fix inode leak We weren't cleaning up our inode reference on error in ocfs2_reserve_local_alloc_bits(). Add a check for error return and iput() if need be. Move the code to set the alloc context inode info to the end of the function so we don't have any possibility of passing back a bad pointer. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-25 11:00:46 -07:00
Nate Diller	5c3c6bb770	[PATCH] ocfs2: use zero_user_page Use zero_user_page() instead of open-coding it. Signed-off-by: Nate Diller <nate.diller@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-25 11:00:39 -07:00
Mark Fasheh	1024c902ab	ocfs2: unmap_mapping_range() in ocfs2_truncate() We weren't calling this before, but since ocfs2 handles the entire truncate operation, we should. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-25 11:00:31 -07:00
Mark Fasheh	e9dfc0b2bc	ocfs2: trylock in ocfs2_readpage() Similarly to the page lock / cluster lock inversion in ocfs2_readpage, we can deadlock on ip_alloc_sem. We can down_read_trylock() instead and just return AOP_TRUNCATED_PAGE if the operation fails. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-25 11:00:23 -07:00
Christoph Lameter	a35afb830f	Remove SLAB_CTOR_CONSTRUCTOR SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-17 05:23:04 -07:00
Randy Dunlap	c4a7f5eb5f	ocfs2: kobject/kset foobar Fix gcc warning and Oops that it causes: fs/ocfs2/cluster/masklog.c:161: warning: assignment from incompatible pointer type [ 2776.204120] OCFS2 Node Manager 1.3.3 [ 2776.211729] BUG: spinlock bad magic on CPU#0, modprobe/4424 [ 2776.214269] lock: ffff810021c8fe18, .magic: ffffffff, .owner: /6394416, .owner_cpu: 0 [ 2776.217864] [ 2776.217865] Call Trace: [ 2776.219662] [<ffffffff803426c8>] spin_bug+0x9e/0xe9 [ 2776.221921] [<ffffffff803427bf>] _raw_spin_lock+0x23/0xf9 [ 2776.224417] [<ffffffff8051acf4>] _spin_lock+0x9/0xb [ 2776.226676] [<ffffffff8033c3b1>] kobject_shadow_add+0x98/0x1ac [ 2776.229367] [<ffffffff8033c4d0>] kobject_add+0xb/0xd [ 2776.231665] [<ffffffff8033c4df>] kset_add+0xd/0xf [ 2776.233845] [<ffffffff8033c5a6>] kset_register+0x23/0x28 [ 2776.236309] [<ffffffff8808ccb7>] :ocfs2_nodemanager:mlog_sys_init+0x68/0x6d [ 2776.239518] [<ffffffff8808ccee>] :ocfs2_nodemanager:o2cb_sys_init+0x32/0x4a [ 2776.242726] [<ffffffff880b80a6>] :ocfs2_nodemanager:init_o2nm+0xa6/0xd5 [ 2776.245772] [<ffffffff8025266c>] sys_init_module+0x1471/0x15d2 [ 2776.248465] [<ffffffff8033f250>] simple_strtoull+0x0/0xdc [ 2776.250959] [<ffffffff8020948e>] system_call+0x7e/0x83 Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-10 09:26:52 -07:00
Randy Dunlap	e63340ae6b	header cleaning: don't include smp_lock.h when not used Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:07 -07:00
Christoph Lameter	50953fe9e0	slab allocators: Remove SLAB_DEBUG_INITIAL flag I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code that manipulates the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLAB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:57 -07:00
Nick Piggin	6fe6900e1e	mm: make read_cache_page synchronous Ensure pages are uptodate after returning from read_cache_page, which allows us to cut out most of the filesystem-internal PageUptodate calls. I didn't have a great look down the call chains, but this appears to fix 7 possible use-before uptodate in hfs, 2 in hfsplus, 1 in jfs, a few in ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in block2mtd. All depending on whether the filler is async and/or can return with a !uptodate page. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:51 -07:00
Linus Torvalds	fa24aa561a	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Force use of GFP_NOFS in ocfs2_write() ocfs2: fix sparse warnings in fs/ocfs2/cluster ocfs2: fix sparse warnings in fs/ocfs2/dlm ocfs2: fix sparse warnings in fs/ocfs2 [PATCH] Copy i_flags to ocfs2 inode flags on write [PATCH] ocfs2: use __set_current_state() ocfs2: Wrap access of directory allocations with ip_alloc_sem. [PATCH] fs/ocfs2/: make 3 functions static ocfs2: Implement compat_ioctl()	2007-05-04 20:44:54 -07:00
Greg Kroah-Hartman	823bccfc40	remove "struct subsystem" as it is no longer needed We need to work on cleaning up the relationship between kobjects, ksets and ktypes. The removal of 'struct subsystem' is the first step of this, especially as it is not really needed at all. Thanks to Kay for fixing the bugs in this patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-05-02 18:57:59 -07:00
Mark Fasheh	9315f130e1	ocfs2: Force use of GFP_NOFS in ocfs2_write() We can otherwise recurse into the file system. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:08:34 -07:00
Mark Fasheh	5fdf1e6771	ocfs2: fix sparse warnings in fs/ocfs2/cluster Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:08:23 -07:00
Mark Fasheh	a7d25539fd	ocfs2: fix sparse warnings in fs/ocfs2/dlm Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:08:15 -07:00
Mark Fasheh	1ca1a111b1	ocfs2: fix sparse warnings in fs/ocfs2 None of these are actually harmful, but the noise makes looking for real problems difficult. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:08:08 -07:00
Jan Kara	6e4b0d5692	[PATCH] Copy i_flags to ocfs2 inode flags on write Propagate flags such as S_APPEND, S_IMMUTABLE, etc. from i_flags into ocfs2-specific ip_attr. Hence, when someone sets these flags via a different interface than ioctl, they are stored correctly. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:07:58 -07:00
Milind Arun Choudhary	5c2c9d383e	[PATCH] ocfs2: use __set_current_state() use __set_current_state(TASK_) instead of current->state = TASK_, in fs/ocfs2 Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:07:50 -07:00
Joel Becker	ee19a77956	ocfs2: Wrap access of directory allocations with ip_alloc_sem. OCFS2_I(inode)->ip_alloc_sem is a read-write semaphore protecting local concurrent access of ocfs2 inodes. However, ocfs2 directories were not taking the semaphore while they accessed or modified the allocation tree. ocfs2_extend_dir() needs to take the semaphore in a write mode when it adds to the allocation. All other directory users get there via ocfs2_bread(), which takes the semaphore in read mode. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:07:42 -07:00
Adrian Bunk	6cb129f567	[PATCH] fs/ocfs2/: make 3 functions static This patch makes the following needlessly global functions static: - aops.c: ocfs2_write_data_page() - dlmglue.c: ocfs2_dump_meta_lvb_info() - file.c: ocfs2_set_inode_size() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:07:27 -07:00
Mark Fasheh	586d232b19	ocfs2: Implement compat_ioctl() We need this to support 32 bit system calls on 64 bit kernels. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-05-02 15:07:16 -07:00
Mark Fasheh	8341897882	ocfs2: Cache extent records The extent map code was ripped out earlier because of an inability to deal with holes. This patch adds back a simpler caching scheme requiring far less code. Our old extent map caching was designed back when meta data block caching in Ocfs2 didn't work very well, resulting in many disk reads. These days our metadata caching is much better, resulting in no un-necessary disk reads. As a result, extent caching doesn't have to be as fancy, nor does it have to cache as many extents. Keeping the last 3 extents seen should be sufficient to give us a small performance boost on some streaming workloads. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:10:40 -07:00
Mark Fasheh	7cdfc3a1c3	ocfs2: Remember rw lock level during direct io Cluster locking might have been redone because a direct write won't complete, so this needs to be reflected in the iocb. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:07:45 -07:00
Mark Fasheh	8110b073a9	ocfs2: Fix up i_blocks calculation to know about holes Older file systems which didn't support holes did a dumb calculation of i_blocks based on i_size. This is no longer accurate, so fix things up to take actual allocation into account. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:07:40 -07:00
Mark Fasheh	4f902c3772	ocfs2: Fix extent lookup to return true size of holes Initially, we had wired things to return a size '1' of holes. Cook up a small amount of code to find the next extent and calculate the number of clusters between the virtual offset and the next allocated extent. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:45 -07:00
Mark Fasheh	49cb8d2d49	ocfs2: Read from an unwritten extent returns zeros Return an optional extent flags field from our lookup functions and wire up callers to treat unwritten regions as holes for the purpose of returning zeros to the user. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:41 -07:00
Mark Fasheh	e48edee2d8	ocfs2: make room for unwritten extents flag Due to the size of our group bitmaps, we'll never have a leaf node extent record with more than 16 bits worth of clusters. Split e_clusters up so that leaf nodes can get a flags field where we can mark unwritten extents. Interior nodes whose length references all the child nodes beneath it can't split their e_clusters field, so we use a union to preserve sizing there. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:37 -07:00
Mark Fasheh	6af67d8205	ocfs2: Use own splice write actor We need to fill holes during a splice write. Provide our own splice write actor which can call ocfs2_file_buffered_write() with a splice-specific callback. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:34 -07:00
Mark Fasheh	fa41045fcb	ocfs2: Use do_sync_mapping_range() in ocfs2_zero_tail_for_truncate() Do this instead of filemap_fdatawrite() - this way we sync only the range between i_size and the cluster boundary. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:30 -07:00
Mark Fasheh	60b11392f1	ocfs2: zero tail of sparse files on truncate Since we don't zero on extend anymore, truncate needs to be fixed up to zero the part of a file between i_size and the end of its cluster. Otherwise a subsequent extend could expose bad data. This introduced a new helper, which can be used in ocfs2_write(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:20 -07:00
Mark Fasheh	25baf2da14	ocfs2: Teach ocfs2_get_block() about holes ocfs2_get_block() didn't understand sparse files, fix that. Also remove some code that isn't really useful anymore. We can fix up ocfs2_direct_IO_get_blocks() at the same time. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:16 -07:00
Mark Fasheh	5069120b72	ocfs2: remove ocfs2_prepare_write() and ocfs2_commit_write() These are no longer used, and can't handle file systems with sparse file allocation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:12 -07:00
Mark Fasheh	9517bac6cc	ocfs2: teach ocfs2_file_aio_write() about sparse files Unfortunately, ocfs2 can no longer make use of generic_file_aio_write_nolock() because allocating writes will require zeroing of pages adjacent to the I/O for cluster sizes greater than page size. Implement a custom file write here, which can order page locks for zeroing. This also has the advantage that cluster locks can easily be ordered outside of the page locks. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:08 -07:00
Mark Fasheh	89488984ac	ocfs2: Turn off shared writeable mmap for local file systems with holes. This will be turned back on once we can do allocation in ->page_mkwrite(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:02:01 -07:00
Mark Fasheh	abf8b15694	ocfs2: abstract out allocation locking Right now, file allocation for ocfs2 is done within ocfs2_extend_file(), which is either called from ->setattr() (for an i_size change), or at the top of ocfs2_file_aio_write(). Inodes on file systems with sparse file support will want to do their allocation during the actual write call. In either case the cluster locking decisions are the same. We abstract out that code into a new function, ocfs2_lock_allocators() which will be used by a later patch to enable writing to sparse files. This also provides a nice cleanup of ocfs2_extend_allocation(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:01:58 -07:00
Mark Fasheh	3a0782d09c	ocfs2: teach extend/truncate about sparse files For ocfs2_truncate_file(), we eliminate the "simple" truncate case which no longer exists since i_size is not tied to i_clusters. In ocfs2_extend_file(), we skip the allocation / page zeroing code for file systems which understand sparse files. The core truncate code is changed to do a bottom up tree traversal. This gets abstracted out into its own function. To make things more readable, most of the special case handling for in-inode extents from ocfs2_do_truncate() is also removed. Though write support for sparse files comes in a later patch, we at least update ocfs2_prepare_inode_for_write() to skip allocation for sparse files. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:01:56 -07:00
Mark Fasheh	363041a5f7	ocfs2: temporarily remove extent map caching The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instead of a lengthy patch to update the code (which would likely lose the benefit of caching subtree roots), we remove most of the algorithms and implement a simple path based lookup. A less ambitious extent caching scheme will be added in a later patch. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 15:01:31 -07:00
Mark Fasheh	dcd0538ff4	ocfs2: sparse b-tree support Introduce tree rotations into the b-tree code. This will allow ocfs2 to support sparse files. Much of the added code is designed to be generic (in the ocfs2 sense) so that it can later be re-used to implement large extended attributes. This patch only adds the rotation code and does minimal updates to callers of the extent api. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 14:44:03 -07:00
Mark Fasheh	6f16bf655c	ocfs2: small cleanup of ocfs2_request_delete() There are two checks in there (one for inode newness, one for other mounted nodes) which are unnecessary, so remove them. The DLM will allow the trylock in either case without any messaging overhead. Removing these makes ocfs2_request_delete() a one liner function, so just move the trylock out one level into ocfs2_query_inode_wipe(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 14:40:55 -07:00
Tiger Yang	68e2b740c4	ocfs2: remove unused code Remove node messaging code that becomes unused with the delete inode vote removal. [Removed even more cruft which I spotted during review --Mark] Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 14:40:16 -07:00
Tiger Yang	500086300e	ocfs2: Remove delete inode vote Ocfs2 currently does cluster-wide node messaging to check the open state of an inode during delete. This patch removes that mechanism in favor of an inode cluster lock which is taken at shared read when an inode is first read and dropped in clear_inode(). This allows a deleting node to test the liveness of an inode by attempting to take an exclusive lock. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 14:39:48 -07:00
Mark Fasheh	a9f5f70739	ocfs2: filter more error prints We don't want to print anything at all in ocfs2_lookup() when getting an error from ocfs2_iget() - it could be something as innocuous as a signal being detected in the dlm. ocfs2_permission() should filter on -ENOENT which ocfs2_meta_lock() can return if the inode was deleted on another node. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-04-26 13:39:08 -07:00

431 Commits