dect/linux-2.6

Merge Btrfs into fs/btrfs

Chris Mason 2008-09-25 15:32:36 -04:00
commit aef8755711
56 changed files with 34936 additions and 0 deletions

356
fs/btrfs/COPYING Normal file

@@ -0,0 +1,356 @@
NOTE! This copyright does *not* cover user programs that use kernel
services by normal system calls - this is merely considered normal use
of the kernel, and does *not* fall under the heading of "derived work".
Also note that the GPL below is copyrighted by the Free Software
Foundation, but the instance of code that it refers to (the Linux
kernel) is copyrighted by me and others who actually wrote it.
Also note that the only valid version of the GPL as far as the kernel
is concerned is _this_ particular version of the license (ie v2, not
v2.2 or v3.x or whatever), unless explicitly otherwise stated.
Linus Torvalds
----------------------------------------
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.

48
fs/btrfs/INSTALL Normal file

@@ -0,0 +1,48 @@
Install Instructions
Btrfs puts snapshots and subvolumes into the root directory of the FS. This
directory can only be changed by btrfsctl right now, and normal filesystem
operations do not work on it. The default subvolume is called 'default',
and you can create files and directories in mount_point/default
Btrfs uses libcrc32c in the kernel for file and metadata checksums. You need
to compile the kernel with:
CONFIG_LIBCRC32C=m
libcrc32c can be static as well. Once your kernel is set up, typing make in the
btrfs module sources will build against the running kernel. When the build is
complete:
modprobe libcrc32c
insmod btrfs.ko
The Btrfs utility programs require libuuid to build. This can be found
in the e2fsprogs sources, and is usually available as libuuid or
e2fsprogs-devel from various distros.
Building the utilities is just make ; make install. The programs go
into /usr/local/bin. The commands available are:
mkfs.btrfs: create a filesystem
btrfsctl: control program to create snapshots and subvolumes:
mount /dev/sda2 /mnt
btrfsctl -s new_subvol_name /mnt
btrfsctl -s snapshot_of_default /mnt/default
btrfsctl -s snapshot_of_new_subvol /mnt/new_subvol_name
btrfsctl -s snapshot_of_a_snapshot /mnt/snapshot_of_new_subvol
ls /mnt
default snapshot_of_a_snapshot snapshot_of_new_subvol
new_subvol_name snapshot_of_default
Snapshots and subvolumes cannot be deleted right now, but you can
rm -rf all the files and directories inside them.
btrfsck: do a limited check of the FS extent trees.
debug-tree: print all of the FS metadata in text form. Example:
debug-tree /dev/sda2 >& big_output_file

29
fs/btrfs/Makefile Normal file

@@ -0,0 +1,29 @@
ifneq ($(KERNELRELEASE),)
# kbuild part of makefile
obj-m := btrfs.o
btrfs-y := super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
file-item.o inode-item.o inode-map.o disk-io.o \
transaction.o bit-radix.o inode.o file.o tree-defrag.o \
extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
ref-cache.o export.o tree-log.o acl.o free-space-cache.o
else
# Normal Makefile
KERNELDIR := /lib/modules/`uname -r`/build
all: version
	$(MAKE) -C $(KERNELDIR) M=`pwd` modules
version:
	bash version.sh
modules_install:
	$(MAKE) -C $(KERNELDIR) M=`pwd` modules_install
clean:
	$(MAKE) -C $(KERNELDIR) M=`pwd` clean
tester:
	$(MAKE) -C $(KERNELDIR) M=`pwd` tree-defrag.o transaction.o sysfs.o super.o root-tree.o inode-map.o inode-item.o inode.o file-item.o file.o extent_map.o disk-io.o ctree.o dir-item.o extent-tree.o
endif

20
fs/btrfs/TODO Normal file

@@ -0,0 +1,20 @@
* cleanup, add more error checking, get rid of BUG_ONs
* Fix ENOSPC handling
* Make allocator smarter
* add a block group to struct inode
* Do actual block accounting
* Check compat and incompat flags on the inode
* Get rid of struct ctree_path, limiting tree levels held at one time
* Add generation number to key pointer in nodes
* Add generation number to inode
* forbid cross subvolume renames and hardlinks
* Release
* Do real tree locking
* Add extent mirroring (backup copies of blocks)
* Add fancy interface to get access to incremental backups
* Add fancy striped extents to make big reads faster
* Use relocation to try and fix write errors
* Make allocator much smarter
* xattrs (directory streams for regular files)
* Scrub & defrag

352
fs/btrfs/acl.c Normal file

@@ -0,0 +1,352 @@
/*
* Copyright (C) 2007 Red Hat. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/fs.h>
#include <linux/string.h>
#include <linux/xattr.h>
#include <linux/posix_acl_xattr.h>
#include <linux/posix_acl.h>
#include <linux/sched.h>
#include "ctree.h"
#include "btrfs_inode.h"
#include "xattr.h"
#ifdef CONFIG_FS_POSIX_ACL
static void btrfs_update_cached_acl(struct inode *inode,
struct posix_acl **p_acl,
struct posix_acl *acl)
{
spin_lock(&inode->i_lock);
if (*p_acl && *p_acl != BTRFS_ACL_NOT_CACHED)
posix_acl_release(*p_acl);
*p_acl = posix_acl_dup(acl);
spin_unlock(&inode->i_lock);
}
static struct posix_acl *btrfs_get_acl(struct inode *inode, int type)
{
int size;
const char *name;
char *value = NULL;
struct posix_acl *acl = NULL, **p_acl;
switch (type) {
case ACL_TYPE_ACCESS:
name = POSIX_ACL_XATTR_ACCESS;
p_acl = &BTRFS_I(inode)->i_acl;
break;
case ACL_TYPE_DEFAULT:
name = POSIX_ACL_XATTR_DEFAULT;
p_acl = &BTRFS_I(inode)->i_default_acl;
break;
default:
return ERR_PTR(-EINVAL);
}
spin_lock(&inode->i_lock);
if (*p_acl != BTRFS_ACL_NOT_CACHED)
acl = posix_acl_dup(*p_acl);
spin_unlock(&inode->i_lock);
if (acl)
return acl;
size = __btrfs_getxattr(inode, name, "", 0);
if (size > 0) {
value = kzalloc(size, GFP_NOFS);
if (!value)
return ERR_PTR(-ENOMEM);
size = __btrfs_getxattr(inode, name, value, size);
if (size > 0) {
acl = posix_acl_from_xattr(value, size);
btrfs_update_cached_acl(inode, p_acl, acl);
}
kfree(value);
} else if (size == -ENOENT) {
acl = NULL;
btrfs_update_cached_acl(inode, p_acl, acl);
}
return acl;
}
static int btrfs_xattr_get_acl(struct inode *inode, int type,
void *value, size_t size)
{
struct posix_acl *acl;
int ret = 0;
acl = btrfs_get_acl(inode, type);
if (IS_ERR(acl))
return PTR_ERR(acl);
if (acl == NULL)
return -ENODATA;
ret = posix_acl_to_xattr(acl, value, size);
posix_acl_release(acl);
return ret;
}
/*
* Needs to be called with fs_mutex held
*/
static int btrfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
{
int ret, size = 0;
const char *name;
struct posix_acl **p_acl;
char *value = NULL;
mode_t mode;
if (acl) {
ret = posix_acl_valid(acl);
if (ret < 0)
return ret;
ret = 0;
}
switch (type) {
case ACL_TYPE_ACCESS:
mode = inode->i_mode;
ret = posix_acl_equiv_mode(acl, &mode);
if (ret < 0)
return ret;
ret = 0;
inode->i_mode = mode;
name = POSIX_ACL_XATTR_ACCESS;
p_acl = &BTRFS_I(inode)->i_acl;
break;
case ACL_TYPE_DEFAULT:
if (!S_ISDIR(inode->i_mode))
return acl ? -EINVAL : 0;
name = POSIX_ACL_XATTR_DEFAULT;
p_acl = &BTRFS_I(inode)->i_default_acl;
break;
default:
return -EINVAL;
}
if (acl) {
size = posix_acl_xattr_size(acl->a_count);
value = kmalloc(size, GFP_NOFS);
if (!value) {
ret = -ENOMEM;
goto out;
}
ret = posix_acl_to_xattr(acl, value, size);
if (ret < 0)
goto out;
}
ret = __btrfs_setxattr(inode, name, value, size, 0);
out:
/* kfree(NULL) is a no-op, so no NULL check is needed */
kfree(value);
if (!ret)
btrfs_update_cached_acl(inode, p_acl, acl);
return ret;
}
static int btrfs_xattr_set_acl(struct inode *inode, int type,
const void *value, size_t size)
{
int ret = 0;
struct posix_acl *acl = NULL;
if (value) {
acl = posix_acl_from_xattr(value, size);
if (acl == NULL) {
value = NULL;
size = 0;
} else if (IS_ERR(acl)) {
return PTR_ERR(acl);
}
}
ret = btrfs_set_acl(inode, acl, type);
posix_acl_release(acl);
return ret;
}
static int btrfs_xattr_acl_access_get(struct inode *inode, const char *name,
void *value, size_t size)
{
return btrfs_xattr_get_acl(inode, ACL_TYPE_ACCESS, value, size);
}
static int btrfs_xattr_acl_access_set(struct inode *inode, const char *name,
const void *value, size_t size, int flags)
{
return btrfs_xattr_set_acl(inode, ACL_TYPE_ACCESS, value, size);
}
static int btrfs_xattr_acl_default_get(struct inode *inode, const char *name,
void *value, size_t size)
{
return btrfs_xattr_get_acl(inode, ACL_TYPE_DEFAULT, value, size);
}
static int btrfs_xattr_acl_default_set(struct inode *inode, const char *name,
const void *value, size_t size, int flags)
{
return btrfs_xattr_set_acl(inode, ACL_TYPE_DEFAULT, value, size);
}
int btrfs_check_acl(struct inode *inode, int mask)
{
struct posix_acl *acl;
int error = -EAGAIN;
acl = btrfs_get_acl(inode, ACL_TYPE_ACCESS);
if (IS_ERR(acl))
return PTR_ERR(acl);
if (acl) {
error = posix_acl_permission(inode, acl, mask);
posix_acl_release(acl);
}
return error;
}
/*
* btrfs_init_acl is already generally called under fs_mutex, so the locking
* stuff has been fixed to work with that. If the locking stuff changes, we
* need to re-evaluate the acl locking stuff.
*/
int btrfs_init_acl(struct inode *inode, struct inode *dir)
{
struct posix_acl *acl = NULL;
int ret = 0;
/* this happens with subvols */
if (!dir)
return 0;
if (!S_ISLNK(inode->i_mode)) {
if (IS_POSIXACL(dir)) {
acl = btrfs_get_acl(dir, ACL_TYPE_DEFAULT);
if (IS_ERR(acl))
return PTR_ERR(acl);
}
if (!acl)
inode->i_mode &= ~current->fs->umask;
}
if (IS_POSIXACL(dir) && acl) {
struct posix_acl *clone;
mode_t mode;
if (S_ISDIR(inode->i_mode)) {
ret = btrfs_set_acl(inode, acl, ACL_TYPE_DEFAULT);
if (ret)
goto failed;
}
clone = posix_acl_clone(acl, GFP_NOFS);
ret = -ENOMEM;
if (!clone)
goto failed;
mode = inode->i_mode;
ret = posix_acl_create_masq(clone, &mode);
if (ret >= 0) {
inode->i_mode = mode;
if (ret > 0) {
/* we need an acl */
ret = btrfs_set_acl(inode, clone,
ACL_TYPE_ACCESS);
}
}
/* drop our reference; btrfs_set_acl took its own when caching */
posix_acl_release(clone);
}
failed:
posix_acl_release(acl);
return ret;
}
int btrfs_acl_chmod(struct inode *inode)
{
struct posix_acl *acl, *clone;
int ret = 0;
if (S_ISLNK(inode->i_mode))
return -EOPNOTSUPP;
if (!IS_POSIXACL(inode))
return 0;
acl = btrfs_get_acl(inode, ACL_TYPE_ACCESS);
if (IS_ERR(acl) || !acl)
return PTR_ERR(acl);
clone = posix_acl_clone(acl, GFP_KERNEL);
posix_acl_release(acl);
if (!clone)
return -ENOMEM;
ret = posix_acl_chmod_masq(clone, inode->i_mode);
if (!ret)
ret = btrfs_set_acl(inode, clone, ACL_TYPE_ACCESS);
posix_acl_release(clone);
return ret;
}
struct xattr_handler btrfs_xattr_acl_default_handler = {
.prefix = POSIX_ACL_XATTR_DEFAULT,
.get = btrfs_xattr_acl_default_get,
.set = btrfs_xattr_acl_default_set,
};
struct xattr_handler btrfs_xattr_acl_access_handler = {
.prefix = POSIX_ACL_XATTR_ACCESS,
.get = btrfs_xattr_acl_access_get,
.set = btrfs_xattr_acl_access_set,
};
#else /* CONFIG_FS_POSIX_ACL */
int btrfs_acl_chmod(struct inode *inode)
{
return 0;
}
int btrfs_init_acl(struct inode *inode, struct inode *dir)
{
return 0;
}
int btrfs_check_acl(struct inode *inode, int mask)
{
return 0;
}
#endif /* CONFIG_FS_POSIX_ACL */
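
Note on the size-probe pattern in btrfs_get_acl() above: the function calls __btrfs_getxattr() once with a zero-length buffer to learn the value size, allocates that many bytes, then calls it again to fetch the value. The following is a minimal user-space sketch of that probe, allocate, fetch sequence, not the kernel code itself. The names demo_getxattr(), demo_read_attr() and demo_value are hypothetical stand-ins invented for illustration, and the sample value is a plain string rather than a real binary ACL.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical attribute value; a real ACL xattr is binary, not text. */
static const char demo_value[] = "user::rw-,group::r--,other::r--";

/*
 * Hypothetical stand-in for an xattr getter. With size == 0 it only
 * reports how many bytes the value needs. Otherwise it copies the value,
 * or returns -ERANGE when the buffer is too small.
 */
static long demo_getxattr(const char *name, void *buf, size_t size)
{
	size_t need = sizeof(demo_value);

	if (strcmp(name, "system.posix_acl_access") != 0)
		return -ENOENT;
	if (size == 0)
		return (long)need;
	if (size < need)
		return -ERANGE;
	assert(buf != NULL);
	memcpy(buf, demo_value, need);
	return (long)need;
}

/*
 * Probe for the size, allocate, then fetch. Returns a heap buffer the
 * caller must free(), or NULL with *err set to 0 (empty value) or a
 * negative errno.
 */
static char *demo_read_attr(const char *name, long *err)
{
	long size = demo_getxattr(name, NULL, 0);	/* first pass: size only */
	char *value;

	*err = 0;
	if (size <= 0) {
		*err = size;
		return NULL;
	}
	value = calloc(1, (size_t)size);
	if (!value) {
		*err = -ENOMEM;
		return NULL;
	}
	size = demo_getxattr(name, value, (size_t)size);	/* second pass: data */
	if (size <= 0) {
		free(value);
		*err = size;
		return NULL;
	}
	return value;
}
```

A caller would invoke demo_read_attr("system.posix_acl_access", &err) and free() the result. Unlike this sketch, the kernel function also caches the parsed ACL and treats -ENOENT as "no ACL" rather than as an error.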

343
fs/btrfs/async-thread.c Normal file

@@ -0,0 +1,343 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/version.h>
#include <linux/kthread.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,20)
# include <linux/freezer.h>
#else
# include <linux/sched.h>
#endif
#include "async-thread.h"
/*
* container for the kthread task pointer and the list of pending work
* One of these is allocated per thread.
*/
struct btrfs_worker_thread {
/* pool we belong to */
struct btrfs_workers *workers;
/* list of struct btrfs_work that are waiting for service */
struct list_head pending;
/* list of worker threads from struct btrfs_workers */
struct list_head worker_list;
/* kthread */
struct task_struct *task;
/* number of things on the pending list */
atomic_t num_pending;
unsigned long sequence;
/* protects the pending list. */
spinlock_t lock;
/* set to non-zero when this thread is already awake and kicking */
int working;
/* are we currently idle */
int idle;
};
/*
* helper function to move a thread onto the idle list after it
* has finished some requests.
*/
static void check_idle_worker(struct btrfs_worker_thread *worker)
{
if (!worker->idle && atomic_read(&worker->num_pending) <
worker->workers->idle_thresh / 2) {
unsigned long flags;
spin_lock_irqsave(&worker->workers->lock, flags);
worker->idle = 1;
list_move(&worker->worker_list, &worker->workers->idle_list);
spin_unlock_irqrestore(&worker->workers->lock, flags);
}
}
/*
* helper function to move a thread off the idle list after new
* pending work is added.
*/
static void check_busy_worker(struct btrfs_worker_thread *worker)
{
if (worker->idle && atomic_read(&worker->num_pending) >=
worker->workers->idle_thresh) {
unsigned long flags;
spin_lock_irqsave(&worker->workers->lock, flags);
worker->idle = 0;
list_move_tail(&worker->worker_list,
&worker->workers->worker_list);
spin_unlock_irqrestore(&worker->workers->lock, flags);
}
}
/*
* main loop for servicing work items
*/
static int worker_loop(void *arg)
{
struct btrfs_worker_thread *worker = arg;
struct list_head *cur;
struct btrfs_work *work;
do {
spin_lock_irq(&worker->lock);
while(!list_empty(&worker->pending)) {
cur = worker->pending.next;
work = list_entry(cur, struct btrfs_work, list);
list_del(&work->list);
clear_bit(0, &work->flags);
work->worker = worker;
spin_unlock_irq(&worker->lock);
work->func(work);
atomic_dec(&worker->num_pending);
spin_lock_irq(&worker->lock);
check_idle_worker(worker);
}
worker->working = 0;
if (freezing(current)) {
/* drop the lock before sleeping in the refrigerator */
spin_unlock_irq(&worker->lock);
refrigerator();
} else {
set_current_state(TASK_INTERRUPTIBLE);
spin_unlock_irq(&worker->lock);
schedule();
__set_current_state(TASK_RUNNING);
}
} while (!kthread_should_stop());
return 0;
}
/*
* this will wait for all the worker threads to shutdown
*/
int btrfs_stop_workers(struct btrfs_workers *workers)
{
struct list_head *cur;
struct btrfs_worker_thread *worker;
list_splice_init(&workers->idle_list, &workers->worker_list);
while(!list_empty(&workers->worker_list)) {
cur = workers->worker_list.next;
worker = list_entry(cur, struct btrfs_worker_thread,
worker_list);
kthread_stop(worker->task);
list_del(&worker->worker_list);
kfree(worker);
}
return 0;
}
/*
* simple init on struct btrfs_workers
*/
void btrfs_init_workers(struct btrfs_workers *workers, char *name, int max)
{
workers->num_workers = 0;
INIT_LIST_HEAD(&workers->worker_list);
INIT_LIST_HEAD(&workers->idle_list);
spin_lock_init(&workers->lock);
workers->max_workers = max;
workers->idle_thresh = 32;
workers->name = name;
}
/*
* starts new worker threads. This does not enforce the max worker
* count in case you need to temporarily go past it.
*/
int btrfs_start_workers(struct btrfs_workers *workers, int num_workers)
{
struct btrfs_worker_thread *worker;
int ret = 0;
int i;
for (i = 0; i < num_workers; i++) {
worker = kzalloc(sizeof(*worker), GFP_NOFS);
if (!worker) {
ret = -ENOMEM;
goto fail;
}
INIT_LIST_HEAD(&worker->pending);
INIT_LIST_HEAD(&worker->worker_list);
spin_lock_init(&worker->lock);
atomic_set(&worker->num_pending, 0);
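/* set the pool pointer before the thread can start running */
worker->workers = workers;
worker->task = kthread_run(worker_loop, worker,
"btrfs-%s-%d", workers->name,
workers->num_workers + i);
if (IS_ERR(worker->task)) {
/* read the error code before freeing the worker */
ret = PTR_ERR(worker->task);
kfree(worker);
goto fail;
}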
spin_lock_irq(&workers->lock);
list_add_tail(&worker->worker_list, &workers->idle_list);
worker->idle = 1;
workers->num_workers++;
spin_unlock_irq(&workers->lock);
}
return 0;
fail:
btrfs_stop_workers(workers);
return ret;
}
/*
* run through the list and find a worker thread that doesn't have a lot
* to do right now. This can return null if we aren't yet at the thread
* count limit and all of the threads are busy.
*/
static struct btrfs_worker_thread *next_worker(struct btrfs_workers *workers)
{
struct btrfs_worker_thread *worker;
struct list_head *next;
int enforce_min = workers->num_workers < workers->max_workers;
/*
* if we find an idle thread, don't move it to the end of the
* idle list. This improves the chance that the next submission
* will reuse the same thread, and maybe catch it while it is still
* working
*/
if (!list_empty(&workers->idle_list)) {
next = workers->idle_list.next;
worker = list_entry(next, struct btrfs_worker_thread,
worker_list);
return worker;
}
if (enforce_min || list_empty(&workers->worker_list))
return NULL;
/*
* if we pick a busy task, move the task to the end of the list.
* hopefully this will keep things somewhat evenly balanced
*/
next = workers->worker_list.next;
worker = list_entry(next, struct btrfs_worker_thread, worker_list);
atomic_inc(&worker->num_pending);
worker->sequence++;
if (worker->sequence % workers->idle_thresh == 0)
list_move_tail(next, &workers->worker_list);
return worker;
}
static struct btrfs_worker_thread *find_worker(struct btrfs_workers *workers)
{
struct btrfs_worker_thread *worker;
unsigned long flags;
again:
spin_lock_irqsave(&workers->lock, flags);
worker = next_worker(workers);
spin_unlock_irqrestore(&workers->lock, flags);
if (!worker) {
spin_lock_irqsave(&workers->lock, flags);
if (workers->num_workers >= workers->max_workers) {
struct list_head *fallback = NULL;
/*
* we are at the thread limit and failed to find a
* lightly loaded worker, so fall back to the first
* worker on either list
*/
if (!list_empty(&workers->worker_list))
fallback = workers->worker_list.next;
if (!list_empty(&workers->idle_list))
fallback = workers->idle_list.next;
BUG_ON(!fallback);
worker = list_entry(fallback,
struct btrfs_worker_thread, worker_list);
spin_unlock_irqrestore(&workers->lock, flags);
} else {
spin_unlock_irqrestore(&workers->lock, flags);
/* we're below the limit, start another worker */
btrfs_start_workers(workers, 1);
goto again;
}
}
return worker;
}
/*
* btrfs_requeue_work just puts the work item back on the tail of the list
* it was taken from. It is intended for use with long running work functions
* that make some progress and want to give the cpu up for others.
*/
int btrfs_requeue_work(struct btrfs_work *work)
{
struct btrfs_worker_thread *worker = work->worker;
unsigned long flags;
if (test_and_set_bit(0, &work->flags))
goto out;
spin_lock_irqsave(&worker->lock, flags);
atomic_inc(&worker->num_pending);
list_add_tail(&work->list, &worker->pending);
check_busy_worker(worker);
spin_unlock_irqrestore(&worker->lock, flags);
out:
return 0;
}
/*
* places a struct btrfs_work into the pending queue of one of the kthreads
*/
int btrfs_queue_worker(struct btrfs_workers *workers, struct btrfs_work *work)
{
struct btrfs_worker_thread *worker;
unsigned long flags;
int wake = 0;
/* don't requeue something already on a list */
if (test_and_set_bit(0, &work->flags))
goto out;
worker = find_worker(workers);
spin_lock_irqsave(&worker->lock, flags);
atomic_inc(&worker->num_pending);
check_busy_worker(worker);
list_add_tail(&work->list, &worker->pending);
/*
* avoid calling into wake_up_process if this thread has already
* been kicked
*/
if (!worker->working)
wake = 1;
worker->working = 1;
spin_unlock_irqrestore(&worker->lock, flags);
if (wake)
wake_up_process(worker->task);
out:
return 0;
}
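
The idle and busy checks above use two different thresholds: a worker is marked idle only once its pending count drops below idle_thresh / 2, and is marked busy again only once the count reaches idle_thresh. A minimal userspace sketch of that hysteresis follows. The struct worker_state type and update_idle() helper are hypothetical names for illustration, and the locking and list moves of the real code are left out.

```c
#include <assert.h>

/* Hypothetical, lock-free stand-in for the fields the checks look at. */
struct worker_state {
	int idle;		/* non-zero while the worker counts as idle */
	int num_pending;	/* queued work items */
	int idle_thresh;	/* btrfs_init_workers() sets this to 32 */
};

/*
 * Apply both transitions in one place. Between idle_thresh / 2 and
 * idle_thresh the state is left alone, so a worker hovering near one
 * threshold does not bounce between the two lists.
 */
static void update_idle(struct worker_state *w)
{
	assert(w->idle_thresh > 0);
	if (!w->idle && w->num_pending < w->idle_thresh / 2)
		w->idle = 1;	/* mirrors check_idle_worker() */
	else if (w->idle && w->num_pending >= w->idle_thresh)
		w->idle = 0;	/* mirrors check_busy_worker() */
}
```

With idle_thresh at 32, an idle worker stays idle at 31 pending items, turns busy at 32, and returns to idle only after dropping to 15 or fewer.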

82
fs/btrfs/async-thread.h Normal file

@ -0,0 +1,82 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __BTRFS_ASYNC_THREAD_
#define __BTRFS_ASYNC_THREAD_
struct btrfs_worker_thread;
/*
* This is similar to a workqueue, but it is meant to spread the operations
* across all available cpus instead of just the CPU that was used to
* queue the work. There is also some batching introduced to try and
* cut down on context switches.
*
* By default threads are added on demand up to 2 * the number of cpus.
* Changing struct btrfs_workers->max_workers is one way to prevent
* demand creation of kthreads.
*
* the basic model of these worker threads is to embed a btrfs_work
* structure in your own data struct, and use container_of in a
* work function to get back to your data struct.
*/
struct btrfs_work {
/*
* only func should be set to the function you want called
* your work struct is passed as the only arg
*/
void (*func)(struct btrfs_work *work);
/*
* flags should be set to zero. It is used to make sure the
* struct is only inserted once into the list.
*/
unsigned long flags;
/* don't touch these */
struct btrfs_worker_thread *worker;
struct list_head list;
};
struct btrfs_workers {
/* current number of running workers */
int num_workers;
/* max number of workers allowed. set by btrfs_init_workers */
int max_workers;
/*
* a worker becomes idle once it has fewer than idle_thresh / 2
* pending requests, and busy again once it reaches idle_thresh
*/
int idle_thresh;
/* list with all the work threads */
struct list_head worker_list;
struct list_head idle_list;
/* lock for finding the next worker thread to queue on */
spinlock_t lock;
/* extra name for this worker */
char *name;
};
int btrfs_queue_worker(struct btrfs_workers *workers, struct btrfs_work *work);
int btrfs_start_workers(struct btrfs_workers *workers, int num_workers);
int btrfs_stop_workers(struct btrfs_workers *workers);
void btrfs_init_workers(struct btrfs_workers *workers, char *name, int max);
int btrfs_requeue_work(struct btrfs_work *work);
#endif
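
The header comment describes the usage model: embed a struct btrfs_work in your own structure and use container_of in the work function to get back to it. The following is a minimal userspace sketch of that pattern, not the kernel implementation. The container_of macro is redefined locally, struct btrfs_work is reduced to two fields, and struct csum_request, csum_work_func() and run_one() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); an assumption made
 * so the sketch builds outside the kernel tree. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced copy of struct btrfs_work: only the callback and the flags. */
struct btrfs_work {
	void (*func)(struct btrfs_work *work);
	unsigned long flags;
};

/* Hypothetical caller-owned structure that embeds the work item. */
struct csum_request {
	int bytes_done;
	struct btrfs_work work;
};

/* Work function: recover the enclosing request from the embedded member. */
static void csum_work_func(struct btrfs_work *work)
{
	struct csum_request *req;

	assert(work != NULL);
	req = container_of(work, struct csum_request, work);
	req->bytes_done += 4096;
}

/* Stand-in for a worker thread servicing one item. It only ever sees the
 * struct btrfs_work pointer, never the enclosing type. */
static void run_one(struct btrfs_work *work)
{
	work->func(work);
}
```

In the kernel the caller would set func, zero flags, and hand the embedded member to btrfs_queue_worker() instead of calling run_one() directly.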

130
fs/btrfs/bit-radix.c Normal file

@ -0,0 +1,130 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "bit-radix.h"
#define BIT_ARRAY_BYTES 256
#define BIT_RADIX_BITS_PER_ARRAY ((BIT_ARRAY_BYTES - sizeof(unsigned long)) * 8)
extern struct kmem_cache *btrfs_bit_radix_cachep;
int set_radix_bit(struct radix_tree_root *radix, unsigned long bit)
{
unsigned long *bits;
unsigned long slot;
int bit_slot;
int ret;
slot = bit / BIT_RADIX_BITS_PER_ARRAY;
bit_slot = bit % BIT_RADIX_BITS_PER_ARRAY;
bits = radix_tree_lookup(radix, slot);
if (!bits) {
bits = kmem_cache_alloc(btrfs_bit_radix_cachep, GFP_NOFS);
if (!bits)
return -ENOMEM;
memset(bits + 1, 0, BIT_ARRAY_BYTES - sizeof(unsigned long));
bits[0] = slot;
ret = radix_tree_insert(radix, slot, bits);
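if (ret) {
/* don't leak the new array if the insert fails */
kmem_cache_free(btrfs_bit_radix_cachep, bits);
return ret;
}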
}
ret = test_and_set_bit(bit_slot, bits + 1);
if (ret < 0)
ret = 1;
return ret;
}
int test_radix_bit(struct radix_tree_root *radix, unsigned long bit)
{
unsigned long *bits;
unsigned long slot;
int bit_slot;
slot = bit / BIT_RADIX_BITS_PER_ARRAY;
bit_slot = bit % BIT_RADIX_BITS_PER_ARRAY;
bits = radix_tree_lookup(radix, slot);
if (!bits)
return 0;
return test_bit(bit_slot, bits + 1);
}
int clear_radix_bit(struct radix_tree_root *radix, unsigned long bit)
{
unsigned long *bits;
unsigned long slot;
int bit_slot;
int i;
int empty = 1;
slot = bit / BIT_RADIX_BITS_PER_ARRAY;
bit_slot = bit % BIT_RADIX_BITS_PER_ARRAY;
bits = radix_tree_lookup(radix, slot);
if (!bits)
return 0;
clear_bit(bit_slot, bits + 1);
for (i = 1; i < BIT_ARRAY_BYTES / sizeof(unsigned long); i++) {
if (bits[i]) {
empty = 0;
break;
}
}
if (empty) {
bits = radix_tree_delete(radix, slot);
BUG_ON(!bits);
kmem_cache_free(btrfs_bit_radix_cachep, bits);
}
return 0;
}
int find_first_radix_bit(struct radix_tree_root *radix, unsigned long *retbits,
unsigned long start, int nr)
{
unsigned long *bits;
unsigned long *gang[4];
int found;
int ret;
int i;
int total_found = 0;
unsigned long slot;
slot = start / BIT_RADIX_BITS_PER_ARRAY;
ret = radix_tree_gang_lookup(radix, (void **)gang, slot,
ARRAY_SIZE(gang));
found = start % BIT_RADIX_BITS_PER_ARRAY;
for (i = 0; i < ret && nr > 0; i++) {
bits = gang[i];
while(nr > 0) {
found = find_next_bit(bits + 1,
BIT_RADIX_BITS_PER_ARRAY,
found);
if (found < BIT_RADIX_BITS_PER_ARRAY) {
*retbits = bits[0] *
BIT_RADIX_BITS_PER_ARRAY + found;
retbits++;
nr--;
total_found++;
found++;
} else
break;
}
found = 0;
}
return total_found;
}
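
Every function in this file starts by splitting a global bit number into a radix tree slot and a bit offset inside that slot's array. The first word of each 256-byte array stores the slot number, so on a 64-bit build each array holds (256 - 8) * 8 = 1984 bits. The sketch below shows only that arithmetic in userspace. The struct bit_pos type and the split_bit() and join_bit() helpers are hypothetical names, not part of the file.

```c
#include <assert.h>
#include <stddef.h>

#define BIT_ARRAY_BYTES 256
/* the first word of each array stores the slot number, the rest hold bits */
#define BIT_RADIX_BITS_PER_ARRAY \
	((unsigned long)(BIT_ARRAY_BYTES - sizeof(unsigned long)) * 8)

/* Hypothetical pair: which array, and which bit inside that array. */
struct bit_pos {
	unsigned long slot;
	unsigned long bit_slot;
};

/* Same division and remainder that set_radix_bit() performs. */
static struct bit_pos split_bit(unsigned long bit)
{
	struct bit_pos pos;

	pos.slot = bit / BIT_RADIX_BITS_PER_ARRAY;
	pos.bit_slot = bit % BIT_RADIX_BITS_PER_ARRAY;
	return pos;
}

/* Inverse mapping, as used by find_first_radix_bit() to fill retbits. */
static unsigned long join_bit(struct bit_pos pos)
{
	assert(pos.bit_slot < BIT_RADIX_BITS_PER_ARRAY);
	return pos.slot * BIT_RADIX_BITS_PER_ARRAY + pos.bit_slot;
}
```

Because the slot number is stored in bits[0], the lookup code can rebuild the global bit number from an array returned by a gang lookup without knowing which key it was stored under.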

33
fs/btrfs/bit-radix.h Normal file

@ -0,0 +1,33 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __BIT_RADIX__
#define __BIT_RADIX__
#include <linux/radix-tree.h>
int set_radix_bit(struct radix_tree_root *radix, unsigned long bit);
int test_radix_bit(struct radix_tree_root *radix, unsigned long bit);
int clear_radix_bit(struct radix_tree_root *radix, unsigned long bit);
int find_first_radix_bit(struct radix_tree_root *radix, unsigned long *retbits,
unsigned long start, int nr);
static inline void init_bit_radix(struct radix_tree_root *radix)
{
INIT_RADIX_TREE(radix, GFP_NOFS);
}
#endif

85
fs/btrfs/btrfs_inode.h Normal file

@ -0,0 +1,85 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __BTRFS_I__
#define __BTRFS_I__
#include "extent_map.h"
#include "extent_io.h"
#include "ordered-data.h"
/* in memory btrfs inode */
struct btrfs_inode {
struct btrfs_root *root;
struct btrfs_block_group_cache *block_group;
struct btrfs_key location;
struct extent_map_tree extent_tree;
struct extent_io_tree io_tree;
struct extent_io_tree io_failure_tree;
struct mutex csum_mutex;
struct mutex extent_mutex;
struct mutex log_mutex;
struct inode vfs_inode;
struct btrfs_ordered_inode_tree ordered_tree;
struct posix_acl *i_acl;
struct posix_acl *i_default_acl;
/* for keeping track of orphaned inodes */
struct list_head i_orphan;
struct list_head delalloc_inodes;
/* full 64 bit generation number */
u64 generation;
/*
* transid of the trans_handle that last modified this inode
*/
u64 last_trans;
/*
* transid that last logged this inode
*/
u64 logged_trans;
/* trans that last made a change that should be fully fsync'd */
u64 log_dirty_trans;
u64 delalloc_bytes;
u64 disk_i_size;
u32 flags;
/*
* if this is a directory then index_cnt is the counter for the index
* number for new files that are created
*/
u64 index_cnt;
};
static inline struct btrfs_inode *BTRFS_I(struct inode *inode)
{
return container_of(inode, struct btrfs_inode, vfs_inode);
}
static inline void btrfs_i_size_write(struct inode *inode, u64 size)
{
inode->i_size = size;
BTRFS_I(inode)->disk_i_size = size;
}
#endif

60
fs/btrfs/compat.h Normal file

@ -0,0 +1,60 @@
#ifndef _COMPAT_H_
#define _COMPAT_H_
#if LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,26)
#define trylock_page(page) (!TestSetPageLocked(page))
#endif
#if LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,27)
static inline struct dentry *d_obtain_alias(struct inode *inode)
{
struct dentry *d;
if (!inode)
return NULL;
if (IS_ERR(inode))
return ERR_CAST(inode);
d = d_alloc_anon(inode);
if (!d)
iput(inode);
return d;
}
#endif
#if LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,18)
static inline void btrfs_drop_nlink(struct inode *inode)
{
inode->i_nlink--;
}
static inline void btrfs_inc_nlink(struct inode *inode)
{
inode->i_nlink++;
}
#else
# define btrfs_drop_nlink(inode) drop_nlink(inode)
# define btrfs_inc_nlink(inode) inc_nlink(inode)
#endif
/*
* Even if AppArmor isn't enabled, it still has different prototypes.
* Add more distro/version pairs here to declare which has AppArmor applied.
*/
#if defined(CONFIG_SUSE_KERNEL)
# if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,22)
# define REMOVE_SUID_PATH 1
# endif
#endif
/*
* catch any other distros that have patched in apparmor. This isn't
* 100% reliable because it won't catch people that hand compile their
* own distro kernels without apparmor compiled in. But, it is better
* than nothing.
*/
#ifdef CONFIG_SECURITY_APPARMOR
# define REMOVE_SUID_PATH 1
#endif
#endif /* _COMPAT_H_ */

108
fs/btrfs/crc32c.h Normal file

@ -0,0 +1,108 @@
#ifndef __BTRFS_CRC32C__
#define __BTRFS_CRC32C__
#include <asm/byteorder.h>
#include <linux/crc32c.h>
#include <linux/version.h>
/* #define CONFIG_BTRFS_HW_SUM 1 */
#ifdef CONFIG_BTRFS_HW_SUM
#ifdef CONFIG_X86
/*
* Use the hardware-provided CRC32 instruction to accelerate CRC32C computation.
* CRC32C polynomial:0x1EDC6F41(BE)/0x82F63B78(LE)
* CRC32 is a new instruction in Intel SSE4.2, the reference can be found at:
* http://www.intel.com/products/processor/manuals/
* Intel(R) 64 and IA-32 Architectures Software Developer's Manual
* Volume 2A: Instruction Set Reference, A-M
*/
#include <asm/cpufeature.h>
#include <asm/processor.h>
#define X86_FEATURE_XMM4_2 (4*32+20) /* Streaming SIMD Extensions-4.2 */
#define cpu_has_xmm4_2 boot_cpu_has(X86_FEATURE_XMM4_2)
#ifdef CONFIG_X86_64
#define REX_PRE "0x48, "
#define SCALE_F 8
#else
#define REX_PRE
#define SCALE_F 4
#endif
static inline u32 btrfs_crc32c_le_hw_byte(u32 crc, unsigned char const *data,
size_t length)
{
while (length--) {
__asm__ __volatile__(
".byte 0xf2, 0xf, 0x38, 0xf0, 0xf1"
:"=S"(crc)
:"0"(crc), "c"(*data)
);
data++;
}
return crc;
}
static inline u32 __pure btrfs_crc32c_le_hw(u32 crc, unsigned char const *p,
size_t len)
{
unsigned int iquotient = len / SCALE_F;
unsigned int iremainder = len % SCALE_F;
#ifdef CONFIG_X86_64
u64 *ptmp = (u64 *)p;
#else
u32 *ptmp = (u32 *)p;
#endif
while (iquotient--) {
__asm__ __volatile__(
".byte 0xf2, " REX_PRE "0xf, 0x38, 0xf1, 0xf1;"
:"=S"(crc)
:"0"(crc), "c"(*ptmp)
);
ptmp++;
}
if (iremainder)
crc = btrfs_crc32c_le_hw_byte(crc, (unsigned char *)ptmp,
iremainder);
return crc;
}
#endif /* CONFIG_X86 */
static inline u32 __btrfs_crc32c(u32 crc, unsigned char const *address,
size_t len)
{
#ifdef CONFIG_BTRFS_HW_SUM
if (cpu_has_xmm4_2)
return btrfs_crc32c_le_hw(crc, address, len);
#endif
return crc32c_le(crc, address, len);
}
#else
#define __btrfs_crc32c(seed, data, length) crc32c(seed, data, length)
#endif /* CONFIG_BTRFS_HW_SUM */
/*
* The implementation of crc32c_le() changed in linux-2.6.23, and
* as of v0.13 btrfs-progs uses the latest version.
* We must work around the older implementations of crc32c_le()
* found on older kernel versions.
*/
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,23)
#define btrfs_crc32c(seed, data, length) \
__cpu_to_le32( __btrfs_crc32c( __le32_to_cpu(seed), \
(unsigned char const *)data, length) )
#else
#define btrfs_crc32c(seed, data, length) \
__btrfs_crc32c(seed, (unsigned char const *)data, length)
#endif
#endif
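
The hardware path above uses the SSE4.2 CRC32 instruction, and the comment names the little-endian (reflected) CRC32C polynomial 0x82F63B78. As a reference for what that instruction computes, here is a bit-at-a-time software sketch of the same update step. It is a plain textbook implementation rather than the kernel's crc32c_le(), and crc32c_sw() is a hypothetical name. The function does no seed or final inversion itself; callers apply those if their format needs them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC32C (Castagnoli) polynomial, as named in the header comment. */
#define CRC32C_POLY_LE 0x82F63B78u

/*
 * Fold len bytes into a running CRC, one bit at a time. The result can
 * be passed back in as crc to continue over a following buffer.
 */
static uint32_t crc32c_sw(uint32_t crc, const unsigned char *data, size_t len)
{
	int bit;

	assert(data != NULL || len == 0);
	while (len--) {
		crc ^= *data++;
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY_LE : 0);
	}
	return crc;
}
```

With a seed of 0xFFFFFFFF and a final inversion, the nine ASCII bytes "123456789" give the standard CRC-32C check value 0xE3069283, which is a convenient sanity test for any faster variant.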

3450
fs/btrfs/ctree.c Normal file

File diff suppressed because it is too large

1875
fs/btrfs/ctree.h Normal file

File diff suppressed because it is too large

345
fs/btrfs/dir-item.c Normal file
View File

@ -0,0 +1,345 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "ctree.h"
#include "disk-io.h"
#include "hash.h"
#include "transaction.h"
static struct btrfs_dir_item *insert_with_overflow(struct btrfs_trans_handle
*trans,
struct btrfs_root *root,
struct btrfs_path *path,
struct btrfs_key *cpu_key,
u32 data_size,
const char *name,
int name_len)
{
int ret;
char *ptr;
struct btrfs_item *item;
struct extent_buffer *leaf;
ret = btrfs_insert_empty_item(trans, root, path, cpu_key, data_size);
if (ret == -EEXIST) {
struct btrfs_dir_item *di;
di = btrfs_match_dir_item_name(root, path, name, name_len);
if (di)
return ERR_PTR(-EEXIST);
ret = btrfs_extend_item(trans, root, path, data_size);
WARN_ON(ret > 0);
}
if (ret < 0)
return ERR_PTR(ret);
WARN_ON(ret > 0);
leaf = path->nodes[0];
item = btrfs_item_nr(leaf, path->slots[0]);
ptr = btrfs_item_ptr(leaf, path->slots[0], char);
BUG_ON(data_size > btrfs_item_size(leaf, item));
ptr += btrfs_item_size(leaf, item) - data_size;
return (struct btrfs_dir_item *)ptr;
}
int btrfs_insert_xattr_item(struct btrfs_trans_handle *trans,
struct btrfs_root *root, const char *name,
u16 name_len, const void *data, u16 data_len,
u64 dir)
{
int ret = 0;
struct btrfs_path *path;
struct btrfs_dir_item *dir_item;
unsigned long name_ptr, data_ptr;
struct btrfs_key key, location;
struct btrfs_disk_key disk_key;
struct extent_buffer *leaf;
u32 data_size;
key.objectid = dir;
btrfs_set_key_type(&key, BTRFS_XATTR_ITEM_KEY);
key.offset = btrfs_name_hash(name, name_len);
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
if (name_len + data_len + sizeof(struct btrfs_dir_item) >
BTRFS_LEAF_DATA_SIZE(root) - sizeof(struct btrfs_item)) {
/* don't leak the path allocated above */
btrfs_free_path(path);
return -ENOSPC;
}
data_size = sizeof(*dir_item) + name_len + data_len;
dir_item = insert_with_overflow(trans, root, path, &key, data_size,
name, name_len);
/*
* FIXME: at some point we should handle xattrs that are larger than
* what we can fit in our leaf. We zero location because we aren't
* pointing at anything else; that will change if we store the xattr
* data in a separate inode.
*/
BUG_ON(IS_ERR(dir_item));
memset(&location, 0, sizeof(location));
leaf = path->nodes[0];
btrfs_cpu_key_to_disk(&disk_key, &location);
btrfs_set_dir_item_key(leaf, dir_item, &disk_key);
btrfs_set_dir_type(leaf, dir_item, BTRFS_FT_XATTR);
btrfs_set_dir_name_len(leaf, dir_item, name_len);
btrfs_set_dir_transid(leaf, dir_item, trans->transid);
btrfs_set_dir_data_len(leaf, dir_item, data_len);
name_ptr = (unsigned long)(dir_item + 1);
data_ptr = (unsigned long)((char *)name_ptr + name_len);
write_extent_buffer(leaf, name, name_ptr, name_len);
write_extent_buffer(leaf, data, data_ptr, data_len);
btrfs_mark_buffer_dirty(path->nodes[0]);
btrfs_free_path(path);
return ret;
}
int btrfs_insert_dir_item(struct btrfs_trans_handle *trans, struct btrfs_root
*root, const char *name, int name_len, u64 dir,
struct btrfs_key *location, u8 type, u64 index)
{
int ret = 0;
int ret2 = 0;
struct btrfs_path *path;
struct btrfs_dir_item *dir_item;
struct extent_buffer *leaf;
unsigned long name_ptr;
struct btrfs_key key;
struct btrfs_disk_key disk_key;
u32 data_size;
key.objectid = dir;
btrfs_set_key_type(&key, BTRFS_DIR_ITEM_KEY);
key.offset = btrfs_name_hash(name, name_len);
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
data_size = sizeof(*dir_item) + name_len;
dir_item = insert_with_overflow(trans, root, path, &key, data_size,
name, name_len);
if (IS_ERR(dir_item)) {
ret = PTR_ERR(dir_item);
if (ret == -EEXIST)
goto second_insert;
goto out;
}
leaf = path->nodes[0];
btrfs_cpu_key_to_disk(&disk_key, location);
btrfs_set_dir_item_key(leaf, dir_item, &disk_key);
btrfs_set_dir_type(leaf, dir_item, type);
btrfs_set_dir_data_len(leaf, dir_item, 0);
btrfs_set_dir_name_len(leaf, dir_item, name_len);
btrfs_set_dir_transid(leaf, dir_item, trans->transid);
name_ptr = (unsigned long)(dir_item + 1);
write_extent_buffer(leaf, name, name_ptr, name_len);
btrfs_mark_buffer_dirty(leaf);
second_insert:
/* FIXME, use some real flag for selecting the extra index */
if (root == root->fs_info->tree_root) {
ret = 0;
goto out;
}
btrfs_release_path(root, path);
btrfs_set_key_type(&key, BTRFS_DIR_INDEX_KEY);
key.offset = index;
dir_item = insert_with_overflow(trans, root, path, &key, data_size,
name, name_len);
if (IS_ERR(dir_item)) {
ret2 = PTR_ERR(dir_item);
goto out;
}
leaf = path->nodes[0];
btrfs_cpu_key_to_disk(&disk_key, location);
btrfs_set_dir_item_key(leaf, dir_item, &disk_key);
btrfs_set_dir_type(leaf, dir_item, type);
btrfs_set_dir_data_len(leaf, dir_item, 0);
btrfs_set_dir_name_len(leaf, dir_item, name_len);
btrfs_set_dir_transid(leaf, dir_item, trans->transid);
name_ptr = (unsigned long)(dir_item + 1);
write_extent_buffer(leaf, name, name_ptr, name_len);
btrfs_mark_buffer_dirty(leaf);
out:
btrfs_free_path(path);
if (ret)
return ret;
if (ret2)
return ret2;
return 0;
}
struct btrfs_dir_item *btrfs_lookup_dir_item(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_path *path, u64 dir,
const char *name, int name_len,
int mod)
{
int ret;
struct btrfs_key key;
int ins_len = mod < 0 ? -1 : 0;
int cow = mod != 0;
struct btrfs_key found_key;
struct extent_buffer *leaf;
key.objectid = dir;
btrfs_set_key_type(&key, BTRFS_DIR_ITEM_KEY);
key.offset = btrfs_name_hash(name, name_len);
ret = btrfs_search_slot(trans, root, &key, path, ins_len, cow);
if (ret < 0)
return ERR_PTR(ret);
if (ret > 0) {
if (path->slots[0] == 0)
return NULL;
path->slots[0]--;
}
leaf = path->nodes[0];
btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
if (found_key.objectid != dir ||
btrfs_key_type(&found_key) != BTRFS_DIR_ITEM_KEY ||
found_key.offset != key.offset)
return NULL;
return btrfs_match_dir_item_name(root, path, name, name_len);
}
struct btrfs_dir_item *
btrfs_lookup_dir_index_item(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_path *path, u64 dir,
u64 objectid, const char *name, int name_len,
int mod)
{
int ret;
struct btrfs_key key;
int ins_len = mod < 0 ? -1 : 0;
int cow = mod != 0;
key.objectid = dir;
btrfs_set_key_type(&key, BTRFS_DIR_INDEX_KEY);
key.offset = objectid;
ret = btrfs_search_slot(trans, root, &key, path, ins_len, cow);
if (ret < 0)
return ERR_PTR(ret);
if (ret > 0)
return ERR_PTR(-ENOENT);
return btrfs_match_dir_item_name(root, path, name, name_len);
}
struct btrfs_dir_item *btrfs_lookup_xattr(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_path *path, u64 dir,
const char *name, u16 name_len,
int mod)
{
int ret;
struct btrfs_key key;
int ins_len = mod < 0 ? -1 : 0;
int cow = mod != 0;
struct btrfs_key found_key;
struct extent_buffer *leaf;
key.objectid = dir;
btrfs_set_key_type(&key, BTRFS_XATTR_ITEM_KEY);
key.offset = btrfs_name_hash(name, name_len);
ret = btrfs_search_slot(trans, root, &key, path, ins_len, cow);
if (ret < 0)
return ERR_PTR(ret);
if (ret > 0) {
if (path->slots[0] == 0)
return NULL;
path->slots[0]--;
}
leaf = path->nodes[0];
btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
if (found_key.objectid != dir ||
btrfs_key_type(&found_key) != BTRFS_XATTR_ITEM_KEY ||
found_key.offset != key.offset)
return NULL;
return btrfs_match_dir_item_name(root, path, name, name_len);
}
struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
struct btrfs_path *path,
const char *name, int name_len)
{
struct btrfs_dir_item *dir_item;
unsigned long name_ptr;
u32 total_len;
u32 cur = 0;
u32 this_len;
struct extent_buffer *leaf;
leaf = path->nodes[0];
dir_item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_dir_item);
total_len = btrfs_item_size_nr(leaf, path->slots[0]);
while(cur < total_len) {
this_len = sizeof(*dir_item) +
btrfs_dir_name_len(leaf, dir_item) +
btrfs_dir_data_len(leaf, dir_item);
name_ptr = (unsigned long)(dir_item + 1);
if (btrfs_dir_name_len(leaf, dir_item) == name_len &&
memcmp_extent_buffer(leaf, name, name_ptr, name_len) == 0)
return dir_item;
cur += this_len;
dir_item = (struct btrfs_dir_item *)((char *)dir_item +
this_len);
}
return NULL;
}
int btrfs_delete_one_dir_name(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_path *path,
struct btrfs_dir_item *di)
{
struct extent_buffer *leaf;
u32 sub_item_len;
u32 item_len;
int ret = 0;
leaf = path->nodes[0];
sub_item_len = sizeof(*di) + btrfs_dir_name_len(leaf, di) +
btrfs_dir_data_len(leaf, di);
item_len = btrfs_item_size_nr(leaf, path->slots[0]);
if (sub_item_len == item_len) {
ret = btrfs_del_item(trans, root, path);
} else {
/* MARKER */
unsigned long ptr = (unsigned long)di;
unsigned long start;
start = btrfs_item_ptr_offset(leaf, path->slots[0]);
memmove_extent_buffer(leaf, ptr, ptr + sub_item_len,
item_len - (ptr + sub_item_len - start));
ret = btrfs_truncate_item(trans, root, path,
item_len - sub_item_len, 1);
}
return ret;
}
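
Several names can hash to the same key offset, so one item may hold multiple directory entries packed back to back, each a fixed header followed by the name and then the data. btrfs_match_dir_item_name() walks them by adding each record's total length to a cursor. The userspace sketch below shows the same walk over a plain byte buffer. struct rec_hdr is a hypothetical two-field header (the real struct btrfs_dir_item carries more), and find_record() and append_record() are illustrative helpers. Unlike the kernel loop, the sketch also stops on a truncated record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed header; the real struct btrfs_dir_item has more fields. */
struct rec_hdr {
	uint16_t name_len;
	uint16_t data_len;
};

/* Return the byte offset of the record whose name matches, or -1. */
static long find_record(const unsigned char *item, size_t total_len,
			const char *name, size_t name_len)
{
	size_t cur = 0;

	assert(item != NULL && name != NULL);
	while (cur + sizeof(struct rec_hdr) <= total_len) {
		struct rec_hdr hdr;
		size_t this_len;

		/* copy the header out to avoid unaligned access */
		memcpy(&hdr, item + cur, sizeof(hdr));
		this_len = sizeof(hdr) + hdr.name_len + hdr.data_len;
		if (this_len > total_len - cur)
			break;	/* truncated record, stop walking */
		if (hdr.name_len == name_len &&
		    memcmp(item + cur + sizeof(hdr), name, name_len) == 0)
			return (long)cur;
		cur += this_len;	/* step to the next packed record */
	}
	return -1;
}

/* Append one record at offset used; returns the new used length.
 * The caller must make sure the buffer is large enough. */
static size_t append_record(unsigned char *item, size_t used,
			    const char *name, const void *data,
			    uint16_t data_len)
{
	struct rec_hdr hdr;

	hdr.name_len = (uint16_t)strlen(name);
	hdr.data_len = data_len;
	memcpy(item + used, &hdr, sizeof(hdr));
	memcpy(item + used + sizeof(hdr), name, hdr.name_len);
	memcpy(item + used + sizeof(hdr) + hdr.name_len, data, data_len);
	return used + sizeof(hdr) + hdr.name_len + data_len;
}
```

Deleting one entry from such an item is the reverse operation: move the tail of the buffer down over the removed record and shrink the item, which is what btrfs_delete_one_dir_name() does with memmove_extent_buffer() and btrfs_truncate_item().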

2056
fs/btrfs/disk-io.c Normal file

File diff suppressed because it is too large

84
fs/btrfs/disk-io.h Normal file

@ -0,0 +1,84 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __DISKIO__
#define __DISKIO__
#define BTRFS_SUPER_INFO_OFFSET (16 * 1024)
#define BTRFS_SUPER_INFO_SIZE 4096
struct btrfs_device;
struct btrfs_fs_devices;
struct extent_buffer *read_tree_block(struct btrfs_root *root, u64 bytenr,
u32 blocksize, u64 parent_transid);
int readahead_tree_block(struct btrfs_root *root, u64 bytenr, u32 blocksize,
u64 parent_transid);
struct extent_buffer *btrfs_find_create_tree_block(struct btrfs_root *root,
u64 bytenr, u32 blocksize);
int clean_tree_block(struct btrfs_trans_handle *trans,
struct btrfs_root *root, struct extent_buffer *buf);
struct btrfs_root *open_ctree(struct super_block *sb,
struct btrfs_fs_devices *fs_devices,
char *options);
int close_ctree(struct btrfs_root *root);
int write_ctree_super(struct btrfs_trans_handle *trans,
struct btrfs_root *root);
struct extent_buffer *btrfs_find_tree_block(struct btrfs_root *root,
u64 bytenr, u32 blocksize);
struct btrfs_root *btrfs_lookup_fs_root(struct btrfs_fs_info *fs_info,
u64 root_objectid);
struct btrfs_root *btrfs_read_fs_root(struct btrfs_fs_info *fs_info,
struct btrfs_key *location,
const char *name, int namelen);
struct btrfs_root *btrfs_read_fs_root_no_radix(struct btrfs_root *tree_root,
struct btrfs_key *location);
struct btrfs_root *btrfs_read_fs_root_no_name(struct btrfs_fs_info *fs_info,
struct btrfs_key *location);
int btrfs_insert_dev_radix(struct btrfs_root *root,
struct block_device *bdev,
u64 device_id,
u64 block_start,
u64 num_blocks);
void btrfs_btree_balance_dirty(struct btrfs_root *root, unsigned long nr);
int btrfs_free_fs_root(struct btrfs_fs_info *fs_info, struct btrfs_root *root);
void btrfs_mark_buffer_dirty(struct extent_buffer *buf);
int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid);
int btrfs_set_buffer_uptodate(struct extent_buffer *buf);
int wait_on_tree_block_writeback(struct btrfs_root *root,
struct extent_buffer *buf);
int btrfs_read_buffer(struct extent_buffer *buf, u64 parent_transid);
u32 btrfs_csum_data(struct btrfs_root *root, char *data, u32 seed, size_t len);
void btrfs_csum_final(u32 crc, char *result);
int btrfs_open_device(struct btrfs_device *dev);
int btrfs_verify_block_csum(struct btrfs_root *root,
struct extent_buffer *buf);
int btrfs_bio_wq_end_io(struct btrfs_fs_info *info, struct bio *bio,
int metadata);
int btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct inode *inode,
int rw, struct bio *bio, int mirror_num,
extent_submit_bio_hook_t *submit_bio_hook);
int btrfs_congested_async(struct btrfs_fs_info *info, int iodone);
unsigned long btrfs_async_submit_limit(struct btrfs_fs_info *info);
int btrfs_write_tree_block(struct extent_buffer *buf);
int btrfs_wait_tree_block_writeback(struct extent_buffer *buf);
int btrfs_free_log_root_tree(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info);
int btrfs_init_log_root_tree(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info);
int btree_lock_page_hook(struct page *page);
#endif

207
fs/btrfs/export.c Normal file

@ -0,0 +1,207 @@
#include <linux/fs.h>
#include <linux/types.h>
#include "ctree.h"
#include "disk-io.h"
#include "btrfs_inode.h"
#include "print-tree.h"
#include "export.h"
#include "compat.h"
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28)
#define FILEID_BTRFS_WITHOUT_PARENT 0x4d
#define FILEID_BTRFS_WITH_PARENT 0x4e
#define FILEID_BTRFS_WITH_PARENT_ROOT 0x4f
#endif
#define BTRFS_FID_SIZE_NON_CONNECTABLE (offsetof(struct btrfs_fid, parent_objectid)/4)
#define BTRFS_FID_SIZE_CONNECTABLE (offsetof(struct btrfs_fid, parent_root_objectid)/4)
#define BTRFS_FID_SIZE_CONNECTABLE_ROOT (sizeof(struct btrfs_fid)/4)
static int btrfs_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
int connectable)
{
struct btrfs_fid *fid = (struct btrfs_fid *)fh;
struct inode *inode = dentry->d_inode;
int len = *max_len;
int type;
if ((len < BTRFS_FID_SIZE_NON_CONNECTABLE) ||
(connectable && len < BTRFS_FID_SIZE_CONNECTABLE))
return 255;
len = BTRFS_FID_SIZE_NON_CONNECTABLE;
type = FILEID_BTRFS_WITHOUT_PARENT;
fid->objectid = BTRFS_I(inode)->location.objectid;
fid->root_objectid = BTRFS_I(inode)->root->objectid;
fid->gen = inode->i_generation;
if (connectable && !S_ISDIR(inode->i_mode)) {
struct inode *parent;
u64 parent_root_id;
spin_lock(&dentry->d_lock);
parent = dentry->d_parent->d_inode;
fid->parent_objectid = BTRFS_I(parent)->location.objectid;
fid->parent_gen = parent->i_generation;
parent_root_id = BTRFS_I(parent)->root->objectid;
spin_unlock(&dentry->d_lock);
if (parent_root_id != fid->root_objectid) {
fid->parent_root_objectid = parent_root_id;
len = BTRFS_FID_SIZE_CONNECTABLE_ROOT;
type = FILEID_BTRFS_WITH_PARENT_ROOT;
} else {
len = BTRFS_FID_SIZE_CONNECTABLE;
type = FILEID_BTRFS_WITH_PARENT;
}
}
*max_len = len;
return type;
}
static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
u64 root_objectid, u32 generation)
{
struct btrfs_root *root;
struct inode *inode;
struct btrfs_key key;
key.objectid = root_objectid;
btrfs_set_key_type(&key, BTRFS_ROOT_ITEM_KEY);
key.offset = (u64)-1;
root = btrfs_read_fs_root_no_name(btrfs_sb(sb)->fs_info, &key);
if (IS_ERR(root))
return ERR_CAST(root);
key.objectid = objectid;
btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
key.offset = 0;
inode = btrfs_iget(sb, &key, root, NULL);
if (IS_ERR(inode))
return ERR_CAST(inode);
if (generation != inode->i_generation) {
iput(inode);
return ERR_PTR(-ESTALE);
}
return d_obtain_alias(inode);
}
static struct dentry *btrfs_fh_to_parent(struct super_block *sb, struct fid *fh,
int fh_len, int fh_type)
{
struct btrfs_fid *fid = (struct btrfs_fid *) fh;
u64 objectid, root_objectid;
u32 generation;
if (fh_type == FILEID_BTRFS_WITH_PARENT) {
if (fh_len != BTRFS_FID_SIZE_CONNECTABLE)
return NULL;
root_objectid = fid->root_objectid;
} else if (fh_type == FILEID_BTRFS_WITH_PARENT_ROOT) {
if (fh_len != BTRFS_FID_SIZE_CONNECTABLE_ROOT)
return NULL;
root_objectid = fid->parent_root_objectid;
} else
return NULL;
objectid = fid->parent_objectid;
generation = fid->parent_gen;
return btrfs_get_dentry(sb, objectid, root_objectid, generation);
}
static struct dentry *btrfs_fh_to_dentry(struct super_block *sb, struct fid *fh,
int fh_len, int fh_type)
{
struct btrfs_fid *fid = (struct btrfs_fid *) fh;
u64 objectid, root_objectid;
u32 generation;
if ((fh_type != FILEID_BTRFS_WITH_PARENT ||
fh_len != BTRFS_FID_SIZE_CONNECTABLE) &&
(fh_type != FILEID_BTRFS_WITH_PARENT_ROOT ||
fh_len != BTRFS_FID_SIZE_CONNECTABLE_ROOT) &&
(fh_type != FILEID_BTRFS_WITHOUT_PARENT ||
fh_len != BTRFS_FID_SIZE_NON_CONNECTABLE))
return NULL;
objectid = fid->objectid;
root_objectid = fid->root_objectid;
generation = fid->gen;
return btrfs_get_dentry(sb, objectid, root_objectid, generation);
}
static struct dentry *btrfs_get_parent(struct dentry *child)
{
struct inode *dir = child->d_inode;
struct btrfs_root *root = BTRFS_I(dir)->root;
struct btrfs_key key;
struct btrfs_path *path;
struct extent_buffer *leaf;
int slot;
u64 objectid;
int ret;
path = btrfs_alloc_path();
if (!path)
return ERR_PTR(-ENOMEM);
key.objectid = dir->i_ino;
btrfs_set_key_type(&key, BTRFS_INODE_REF_KEY);
key.offset = (u64)-1;
ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
if (ret < 0) {
/* Error */
btrfs_free_path(path);
return ERR_PTR(ret);
}
leaf = path->nodes[0];
slot = path->slots[0];
if (ret) {
/* btrfs_search_slot() returns the slot where we'd want to
insert a backref for parent inode #0xFFFFFFFFFFFFFFFF.
The _real_ backref, telling us what the parent inode
_actually_ is, will be in the slot _before_ the one
that btrfs_search_slot() returns. */
if (!slot) {
/* Unless there is _no_ key in the tree before... */
btrfs_free_path(path);
return ERR_PTR(-EIO);
}
slot--;
}
btrfs_item_key_to_cpu(leaf, &key, slot);
btrfs_free_path(path);
if (key.objectid != dir->i_ino || key.type != BTRFS_INODE_REF_KEY)
return ERR_PTR(-EINVAL);
objectid = key.offset;
/* If we are already at the root of a subvol, return the real root */
if (objectid == dir->i_ino)
return dget(dir->i_sb->s_root);
/* Build a new key for the inode item */
key.objectid = objectid;
btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
key.offset = 0;
return d_obtain_alias(btrfs_iget(root->fs_info->sb, &key, root, NULL));
}
const struct export_operations btrfs_export_ops = {
.encode_fh = btrfs_encode_fh,
.fh_to_dentry = btrfs_fh_to_dentry,
.fh_to_parent = btrfs_fh_to_parent,
.get_parent = btrfs_get_parent,
};
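
btrfs_get_parent() above searches for a key whose offset is (u64)-1 and then steps back one slot to reach the real back reference. A minimal userspace sketch of that "probe with the maximum offset, then step back" idea follows. It assumes a sorted array standing in for a single leaf; the names key_sketch, key_cmp, lower_bound and find_parent are hypothetical, and the type values in the usage are placeholders rather than the real BTRFS_*_KEY constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a btrfs key: ordered by objectid, type, offset. */
struct key_sketch {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

static int key_cmp(const struct key_sketch *a, const struct key_sketch *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Returns the first slot whose key is >= *key, i.e. the insertion point. */
static size_t lower_bound(const struct key_sketch *keys, size_t n,
			  const struct key_sketch *key)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (key_cmp(&keys[mid], key) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Looks up the back reference of inode `ino`. The probe uses the largest
 * possible offset, so the wanted key sits in the slot just before the
 * insertion point. Returns 0 and stores the parent in *parent, or -1.
 */
static int find_parent(const struct key_sketch *keys, size_t n,
		       uint64_t ino, uint8_t ref_type, uint64_t *parent)
{
	struct key_sketch probe = { ino, ref_type, UINT64_MAX };
	size_t slot = lower_bound(keys, n, &probe);
	int exact = slot < n && key_cmp(&keys[slot], &probe) == 0;

	if (!exact) {
		if (slot == 0)
			return -1;	/* no key before the insertion point */
		slot--;
	}
	if (keys[slot].objectid != ino || keys[slot].type != ref_type)
		return -1;	/* the previous key belongs to something else */
	*parent = keys[slot].offset;
	return 0;
}
```

In this sketch a key such as {256, 2, 256} refers to itself, which corresponds to the "already at the root of a subvol" case that btrfs_get_parent() handles separately.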

19
fs/btrfs/export.h Normal file

@ -0,0 +1,19 @@
#ifndef BTRFS_EXPORT_H
#define BTRFS_EXPORT_H
#include <linux/exportfs.h>
extern const struct export_operations btrfs_export_ops;
struct btrfs_fid {
u64 objectid;
u64 root_objectid;
u32 gen;
u64 parent_objectid;
u32 parent_gen;
u64 parent_root_objectid;
} __attribute__ ((packed));
#endif
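
The BTRFS_FID_SIZE_* macros in export.c derive file handle lengths, in 4-byte words, from the packed layout of struct btrfs_fid. The sketch below mirrors that layout in userspace to show which lengths result. It assumes a GCC or Clang compiler for `__attribute__((packed))`; the names fid_sketch, FID_WORDS_* and fid_words are hypothetical, and fid_words only reproduces the length selection of btrfs_encode_fh(), not its buffer-size check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct btrfs_fid; uint64_t/uint32_t stand in for u64/u32. */
struct fid_sketch {
	uint64_t objectid;
	uint64_t root_objectid;
	uint32_t gen;
	uint64_t parent_objectid;
	uint32_t parent_gen;
	uint64_t parent_root_objectid;
} __attribute__((packed));

/* Handle lengths in 4-byte words, computed the way the macros compute them. */
enum {
	FID_WORDS_NON_CONNECTABLE =
		offsetof(struct fid_sketch, parent_objectid) / 4,
	FID_WORDS_CONNECTABLE =
		offsetof(struct fid_sketch, parent_root_objectid) / 4,
	FID_WORDS_CONNECTABLE_ROOT = sizeof(struct fid_sketch) / 4,
};

/*
 * Hypothetical helper: picks the handle length the way btrfs_encode_fh()
 * does. Directories and non-connectable handles carry no parent fields;
 * the parent root is only stored when it differs from the inode's root.
 */
static int fid_words(int connectable, int is_dir, int parent_in_other_root)
{
	if (!connectable || is_dir)
		return FID_WORDS_NON_CONNECTABLE;
	if (parent_in_other_root)
		return FID_WORDS_CONNECTABLE_ROOT;
	return FID_WORDS_CONNECTABLE;
}
```

Because the struct is packed, the u32 generation fields add no padding, which is why every boundary falls on a multiple of four bytes.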

4034
fs/btrfs/extent-tree.c Normal file

File diff suppressed because it is too large

3441
fs/btrfs/extent_io.c Normal file

File diff suppressed because it is too large

247
fs/btrfs/extent_io.h Normal file

@ -0,0 +1,247 @@
#ifndef __EXTENTIO__
#define __EXTENTIO__
#include <linux/rbtree.h>
/* bits for the extent state */
#define EXTENT_DIRTY 1
#define EXTENT_WRITEBACK (1 << 1)
#define EXTENT_UPTODATE (1 << 2)
#define EXTENT_LOCKED (1 << 3)
#define EXTENT_NEW (1 << 4)
#define EXTENT_DELALLOC (1 << 5)
#define EXTENT_DEFRAG (1 << 6)
#define EXTENT_DEFRAG_DONE (1 << 7)
#define EXTENT_BUFFER_FILLED (1 << 8)
#define EXTENT_ORDERED (1 << 9)
#define EXTENT_ORDERED_METADATA (1 << 10)
#define EXTENT_IOBITS (EXTENT_LOCKED | EXTENT_WRITEBACK)
/*
* page->private values. Every page that is controlled by the extent
* map has page->private set to one.
*/
#define EXTENT_PAGE_PRIVATE 1
#define EXTENT_PAGE_PRIVATE_FIRST_PAGE 3
struct extent_state;
typedef int (extent_submit_bio_hook_t)(struct inode *inode, int rw,
struct bio *bio, int mirror_num);
struct extent_io_ops {
int (*fill_delalloc)(struct inode *inode, u64 start, u64 end);
int (*writepage_start_hook)(struct page *page, u64 start, u64 end);
int (*writepage_io_hook)(struct page *page, u64 start, u64 end);
extent_submit_bio_hook_t *submit_bio_hook;
int (*merge_bio_hook)(struct page *page, unsigned long offset,
size_t size, struct bio *bio);
int (*readpage_io_hook)(struct page *page, u64 start, u64 end);
int (*readpage_io_failed_hook)(struct bio *bio, struct page *page,
u64 start, u64 end,
struct extent_state *state);
int (*writepage_io_failed_hook)(struct bio *bio, struct page *page,
u64 start, u64 end,
struct extent_state *state);
int (*readpage_end_io_hook)(struct page *page, u64 start, u64 end,
struct extent_state *state);
int (*writepage_end_io_hook)(struct page *page, u64 start, u64 end,
struct extent_state *state, int uptodate);
int (*set_bit_hook)(struct inode *inode, u64 start, u64 end,
unsigned long old, unsigned long bits);
int (*clear_bit_hook)(struct inode *inode, u64 start, u64 end,
unsigned long old, unsigned long bits);
int (*write_cache_pages_lock_hook)(struct page *page);
};
struct extent_io_tree {
struct rb_root state;
struct rb_root buffer;
struct address_space *mapping;
u64 dirty_bytes;
spinlock_t lock;
spinlock_t buffer_lock;
struct extent_io_ops *ops;
};
struct extent_state {
u64 start;
u64 end; /* inclusive */
struct rb_node rb_node;
struct extent_io_tree *tree;
wait_queue_head_t wq;
atomic_t refs;
unsigned long state;
/* for use by the FS */
u64 private;
struct list_head leak_list;
};
struct extent_buffer {
u64 start;
unsigned long len;
char *map_token;
char *kaddr;
unsigned long map_start;
unsigned long map_len;
struct page *first_page;
atomic_t refs;
int flags;
struct list_head leak_list;
struct rb_node rb_node;
struct mutex mutex;
};
struct extent_map_tree;
static inline struct extent_state *extent_state_next(struct extent_state *state)
{
struct rb_node *node;
node = rb_next(&state->rb_node);
if (!node)
return NULL;
return rb_entry(node, struct extent_state, rb_node);
}
typedef struct extent_map *(get_extent_t)(struct inode *inode,
struct page *page,
size_t page_offset,
u64 start, u64 len,
int create);
void extent_io_tree_init(struct extent_io_tree *tree,
struct address_space *mapping, gfp_t mask);
int try_release_extent_mapping(struct extent_map_tree *map,
struct extent_io_tree *tree, struct page *page,
gfp_t mask);
int try_release_extent_buffer(struct extent_io_tree *tree, struct page *page);
int try_release_extent_state(struct extent_map_tree *map,
struct extent_io_tree *tree, struct page *page,
gfp_t mask);
int lock_extent(struct extent_io_tree *tree, u64 start, u64 end, gfp_t mask);
int unlock_extent(struct extent_io_tree *tree, u64 start, u64 end, gfp_t mask);
int extent_read_full_page(struct extent_io_tree *tree, struct page *page,
get_extent_t *get_extent);
int __init extent_io_init(void);
void extent_io_exit(void);
u64 count_range_bits(struct extent_io_tree *tree,
u64 *start, u64 search_end,
u64 max_bytes, unsigned long bits);
int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end,
int bits, int filled);
int clear_extent_bits(struct extent_io_tree *tree, u64 start, u64 end,
int bits, gfp_t mask);
int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
int bits, int wake, int delete, gfp_t mask);
int set_extent_bits(struct extent_io_tree *tree, u64 start, u64 end,
int bits, gfp_t mask);
int set_extent_uptodate(struct extent_io_tree *tree, u64 start, u64 end,
gfp_t mask);
int set_extent_new(struct extent_io_tree *tree, u64 start, u64 end,
gfp_t mask);
int set_extent_dirty(struct extent_io_tree *tree, u64 start, u64 end,
gfp_t mask);
int clear_extent_dirty(struct extent_io_tree *tree, u64 start, u64 end,
gfp_t mask);
int clear_extent_ordered(struct extent_io_tree *tree, u64 start, u64 end,
gfp_t mask);
int clear_extent_ordered_metadata(struct extent_io_tree *tree, u64 start,
u64 end, gfp_t mask);
int set_extent_delalloc(struct extent_io_tree *tree, u64 start, u64 end,
gfp_t mask);
int set_extent_ordered(struct extent_io_tree *tree, u64 start, u64 end,
gfp_t mask);
int find_first_extent_bit(struct extent_io_tree *tree, u64 start,
u64 *start_ret, u64 *end_ret, int bits);
struct extent_state *find_first_extent_bit_state(struct extent_io_tree *tree,
u64 start, int bits);
int extent_invalidatepage(struct extent_io_tree *tree,
struct page *page, unsigned long offset);
int extent_write_full_page(struct extent_io_tree *tree, struct page *page,
get_extent_t *get_extent,
struct writeback_control *wbc);
int extent_writepages(struct extent_io_tree *tree,
struct address_space *mapping,
get_extent_t *get_extent,
struct writeback_control *wbc);
int extent_readpages(struct extent_io_tree *tree,
struct address_space *mapping,
struct list_head *pages, unsigned nr_pages,
get_extent_t *get_extent);
int extent_prepare_write(struct extent_io_tree *tree,
struct inode *inode, struct page *page,
unsigned from, unsigned to, get_extent_t *get_extent);
int extent_commit_write(struct extent_io_tree *tree,
struct inode *inode, struct page *page,
unsigned from, unsigned to);
sector_t extent_bmap(struct address_space *mapping, sector_t iblock,
get_extent_t *get_extent);
int set_range_dirty(struct extent_io_tree *tree, u64 start, u64 end);
int set_state_private(struct extent_io_tree *tree, u64 start, u64 private);
int get_state_private(struct extent_io_tree *tree, u64 start, u64 *private);
void set_page_extent_mapped(struct page *page);
struct extent_buffer *alloc_extent_buffer(struct extent_io_tree *tree,
u64 start, unsigned long len,
struct page *page0,
gfp_t mask);
struct extent_buffer *find_extent_buffer(struct extent_io_tree *tree,
u64 start, unsigned long len,
gfp_t mask);
void free_extent_buffer(struct extent_buffer *eb);
int read_extent_buffer_pages(struct extent_io_tree *tree,
struct extent_buffer *eb, u64 start, int wait,
get_extent_t *get_extent, int mirror_num);
static inline void extent_buffer_get(struct extent_buffer *eb)
{
atomic_inc(&eb->refs);
}
int memcmp_extent_buffer(struct extent_buffer *eb, const void *ptrv,
unsigned long start,
unsigned long len);
void read_extent_buffer(struct extent_buffer *eb, void *dst,
unsigned long start,
unsigned long len);
void write_extent_buffer(struct extent_buffer *eb, const void *src,
unsigned long start, unsigned long len);
void copy_extent_buffer(struct extent_buffer *dst, struct extent_buffer *src,
unsigned long dst_offset, unsigned long src_offset,
unsigned long len);
void memcpy_extent_buffer(struct extent_buffer *dst, unsigned long dst_offset,
unsigned long src_offset, unsigned long len);
void memmove_extent_buffer(struct extent_buffer *dst, unsigned long dst_offset,
unsigned long src_offset, unsigned long len);
void memset_extent_buffer(struct extent_buffer *eb, char c,
unsigned long start, unsigned long len);
int wait_on_extent_buffer_writeback(struct extent_io_tree *tree,
struct extent_buffer *eb);
int wait_on_extent_writeback(struct extent_io_tree *tree, u64 start, u64 end);
int wait_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, int bits);
int clear_extent_buffer_dirty(struct extent_io_tree *tree,
struct extent_buffer *eb);
int set_extent_buffer_dirty(struct extent_io_tree *tree,
struct extent_buffer *eb);
int set_extent_buffer_uptodate(struct extent_io_tree *tree,
struct extent_buffer *eb);
int clear_extent_buffer_uptodate(struct extent_io_tree *tree,
struct extent_buffer *eb);
int extent_buffer_uptodate(struct extent_io_tree *tree,
struct extent_buffer *eb);
int map_extent_buffer(struct extent_buffer *eb, unsigned long offset,
unsigned long min_len, char **token, char **map,
unsigned long *map_start,
unsigned long *map_len, int km);
int map_private_extent_buffer(struct extent_buffer *eb, unsigned long offset,
unsigned long min_len, char **token, char **map,
unsigned long *map_start,
unsigned long *map_len, int km);
void unmap_extent_buffer(struct extent_buffer *eb, char *token, int km);
int release_extent_buffer_tail_pages(struct extent_buffer *eb);
int extent_range_uptodate(struct extent_io_tree *tree,
u64 start, u64 end);
#endif
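
The extent state bits in extent_io.h are plain bit flags, and struct extent_state describes a byte range whose end is inclusive. A small userspace sketch of both conventions follows. The SK_* macros copy a few of the values defined above; state_sketch, state_len and state_has_iobits are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A subset of the state bits, with the same values as in extent_io.h. */
#define SK_EXTENT_DIRTY 1
#define SK_EXTENT_WRITEBACK (1 << 1)
#define SK_EXTENT_UPTODATE (1 << 2)
#define SK_EXTENT_LOCKED (1 << 3)
#define SK_EXTENT_IOBITS (SK_EXTENT_LOCKED | SK_EXTENT_WRITEBACK)

/* Reduced stand-in for struct extent_state: range plus bit mask. */
struct state_sketch {
	uint64_t start;
	uint64_t end;	/* inclusive, as in struct extent_state */
	unsigned long state;
};

/* Length in bytes of an inclusive range: end - start + 1. */
static uint64_t state_len(const struct state_sketch *s)
{
	return s->end - s->start + 1;
}

/* Returns 1 if either of the bits grouped as EXTENT_IOBITS is set. */
static int state_has_iobits(const struct state_sketch *s)
{
	return (s->state & SK_EXTENT_IOBITS) != 0;
}
```

With an inclusive end, a single 4096-byte block starting at offset 0 is described as start 0 and end 4095.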

332
fs/btrfs/extent_map.c Normal file

@ -0,0 +1,332 @@
#include <linux/err.h>
#include <linux/gfp.h>
#include <linux/slab.h>
#include <linux/module.h>
#include <linux/spinlock.h>
#include <linux/version.h>
#include <linux/hardirq.h>
#include "extent_map.h"
/* temporary define until extent_map moves out of btrfs */
struct kmem_cache *btrfs_cache_create(const char *name, size_t size,
unsigned long extra_flags,
void (*ctor)(void *, struct kmem_cache *,
unsigned long));
static struct kmem_cache *extent_map_cache;
int __init extent_map_init(void)
{
extent_map_cache = btrfs_cache_create("extent_map",
sizeof(struct extent_map), 0,
NULL);
if (!extent_map_cache)
return -ENOMEM;
return 0;
}
void extent_map_exit(void)
{
if (extent_map_cache)
kmem_cache_destroy(extent_map_cache);
}
/**
* extent_map_tree_init - initialize extent map tree
* @tree: tree to initialize
* @mask: flags for memory allocations during tree operations
*
* Initialize the extent tree @tree. Should be called for each new inode
* or other user of the extent_map interface.
*/
void extent_map_tree_init(struct extent_map_tree *tree, gfp_t mask)
{
tree->map.rb_node = NULL;
spin_lock_init(&tree->lock);
}
EXPORT_SYMBOL(extent_map_tree_init);
/**
* alloc_extent_map - allocate new extent map structure
* @mask: memory allocation flags
*
* Allocate a new extent_map structure. The new structure is
* returned with a reference count of one and needs to be
* freed using free_extent_map()
*/
struct extent_map *alloc_extent_map(gfp_t mask)
{
struct extent_map *em;
em = kmem_cache_alloc(extent_map_cache, mask);
if (!em || IS_ERR(em))
return em;
em->in_tree = 0;
em->flags = 0;
atomic_set(&em->refs, 1);
return em;
}
EXPORT_SYMBOL(alloc_extent_map);
/**
* free_extent_map - drop reference count of an extent_map
* @em: extent map being released
*
* Drops the reference count on @em by one and frees the structure
* if the reference count hits zero.
*/
void free_extent_map(struct extent_map *em)
{
if (!em)
return;
WARN_ON(atomic_read(&em->refs) == 0);
if (atomic_dec_and_test(&em->refs)) {
WARN_ON(em->in_tree);
kmem_cache_free(extent_map_cache, em);
}
}
EXPORT_SYMBOL(free_extent_map);
static struct rb_node *tree_insert(struct rb_root *root, u64 offset,
struct rb_node *node)
{
struct rb_node **p = &root->rb_node;
struct rb_node *parent = NULL;
struct extent_map *entry;
while (*p) {
parent = *p;
entry = rb_entry(parent, struct extent_map, rb_node);
WARN_ON(!entry->in_tree);
if (offset < entry->start)
p = &(*p)->rb_left;
else if (offset >= extent_map_end(entry))
p = &(*p)->rb_right;
else
return parent;
}
entry = rb_entry(node, struct extent_map, rb_node);
entry->in_tree = 1;
rb_link_node(node, parent, p);
rb_insert_color(node, root);
return NULL;
}
static struct rb_node *__tree_search(struct rb_root *root, u64 offset,
struct rb_node **prev_ret,
struct rb_node **next_ret)
{
struct rb_node *n = root->rb_node;
struct rb_node *prev = NULL;
struct rb_node *orig_prev = NULL;
struct extent_map *entry;
struct extent_map *prev_entry = NULL;
while (n) {
entry = rb_entry(n, struct extent_map, rb_node);
prev = n;
prev_entry = entry;
WARN_ON(!entry->in_tree);
if (offset < entry->start)
n = n->rb_left;
else if (offset >= extent_map_end(entry))
n = n->rb_right;
else
return n;
}
if (prev_ret) {
orig_prev = prev;
while (prev && offset >= extent_map_end(prev_entry)) {
prev = rb_next(prev);
prev_entry = rb_entry(prev, struct extent_map, rb_node);
}
*prev_ret = prev;
prev = orig_prev;
}
if (next_ret) {
prev_entry = rb_entry(prev, struct extent_map, rb_node);
while (prev && offset < prev_entry->start) {
prev = rb_prev(prev);
prev_entry = rb_entry(prev, struct extent_map, rb_node);
}
*next_ret = prev;
}
return NULL;
}
static inline struct rb_node *tree_search(struct rb_root *root, u64 offset)
{
struct rb_node *prev;
struct rb_node *ret;
ret = __tree_search(root, offset, &prev, NULL);
if (!ret)
return prev;
return ret;
}
static int mergable_maps(struct extent_map *prev, struct extent_map *next)
{
if (test_bit(EXTENT_FLAG_PINNED, &prev->flags))
return 0;
if (extent_map_end(prev) == next->start &&
prev->flags == next->flags &&
prev->bdev == next->bdev &&
((next->block_start == EXTENT_MAP_HOLE &&
prev->block_start == EXTENT_MAP_HOLE) ||
(next->block_start == EXTENT_MAP_INLINE &&
prev->block_start == EXTENT_MAP_INLINE) ||
(next->block_start == EXTENT_MAP_DELALLOC &&
prev->block_start == EXTENT_MAP_DELALLOC) ||
(next->block_start < EXTENT_MAP_LAST_BYTE - 1 &&
next->block_start == extent_map_block_end(prev)))) {
return 1;
}
return 0;
}
/**
* add_extent_mapping - add new extent map to the extent tree
* @tree: tree to insert new map in
* @em: map to insert
*
* Insert @em into @tree or perform a simple forward/backward merge with
* existing mappings. The extent_map struct passed in will be inserted
* into the tree directly, with an additional reference taken, or a
* reference dropped if the merge attempt was successful.
*/
int add_extent_mapping(struct extent_map_tree *tree,
struct extent_map *em)
{
int ret = 0;
struct extent_map *merge = NULL;
struct rb_node *rb;
struct extent_map *exist;
exist = lookup_extent_mapping(tree, em->start, em->len);
if (exist) {
free_extent_map(exist);
ret = -EEXIST;
goto out;
}
assert_spin_locked(&tree->lock);
rb = tree_insert(&tree->map, em->start, &em->rb_node);
if (rb) {
ret = -EEXIST;
free_extent_map(merge);
goto out;
}
atomic_inc(&em->refs);
if (em->start != 0) {
rb = rb_prev(&em->rb_node);
if (rb)
merge = rb_entry(rb, struct extent_map, rb_node);
if (rb && mergable_maps(merge, em)) {
em->start = merge->start;
em->len += merge->len;
em->block_start = merge->block_start;
merge->in_tree = 0;
rb_erase(&merge->rb_node, &tree->map);
free_extent_map(merge);
}
}
rb = rb_next(&em->rb_node);
if (rb)
merge = rb_entry(rb, struct extent_map, rb_node);
if (rb && mergable_maps(em, merge)) {
em->len += merge->len;
rb_erase(&merge->rb_node, &tree->map);
merge->in_tree = 0;
free_extent_map(merge);
}
out:
return ret;
}
EXPORT_SYMBOL(add_extent_mapping);
static u64 range_end(u64 start, u64 len)
{
if (start + len < start)
return (u64)-1;
return start + len;
}
/**
* lookup_extent_mapping - lookup extent_map
* @tree: tree to lookup in
* @start: byte offset to start the search
* @len: length of the lookup range
*
* Find and return the first extent_map struct in @tree that intersects the
* byte range [@start, @start + @len). There may be additional objects in the
* tree that intersect, so check the object returned carefully to make sure
* that no additional lookups are needed. The returned map has its reference
* count raised by one; drop it with free_extent_map().
*/
struct extent_map *lookup_extent_mapping(struct extent_map_tree *tree,
u64 start, u64 len)
{
struct extent_map *em;
struct rb_node *rb_node;
struct rb_node *prev = NULL;
struct rb_node *next = NULL;
u64 end = range_end(start, len);
assert_spin_locked(&tree->lock);
rb_node = __tree_search(&tree->map, start, &prev, &next);
if (!rb_node && prev) {
em = rb_entry(prev, struct extent_map, rb_node);
if (end > em->start && start < extent_map_end(em))
goto found;
}
if (!rb_node && next) {
em = rb_entry(next, struct extent_map, rb_node);
if (end > em->start && start < extent_map_end(em))
goto found;
}
if (!rb_node) {
em = NULL;
goto out;
}
if (IS_ERR(rb_node)) {
em = ERR_PTR(PTR_ERR(rb_node));
goto out;
}
em = rb_entry(rb_node, struct extent_map, rb_node);
if (end > em->start && start < extent_map_end(em))
goto found;
em = NULL;
goto out;
found:
atomic_inc(&em->refs);
out:
return em;
}
EXPORT_SYMBOL(lookup_extent_mapping);
/**
* remove_extent_mapping - removes an extent_map from the extent tree
* @tree: extent tree to remove from
* @em: extent map being removed
*
* Removes @em from @tree. No reference counts are dropped, and no checks
* are done to see if the range is in use.
*/
int remove_extent_mapping(struct extent_map_tree *tree, struct extent_map *em)
{
int ret = 0;
WARN_ON(test_bit(EXTENT_FLAG_PINNED, &em->flags));
assert_spin_locked(&tree->lock);
rb_erase(&em->rb_node, &tree->map);
em->in_tree = 0;
return ret;
}
EXPORT_SYMBOL(remove_extent_mapping);
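
lookup_extent_mapping() decides whether a map intersects the requested range by comparing half-open byte ranges whose ends saturate at (u64)-1 instead of wrapping. The following userspace sketch isolates that test. The names sat_end and ranges_intersect are hypothetical; sat_end follows the logic of range_end() and extent_map_end() in the source.

```c
#include <assert.h>
#include <stdint.h>

/* start + len, clamped to the largest u64 value if the sum would wrap. */
static uint64_t sat_end(uint64_t start, uint64_t len)
{
	if (start + len < start)
		return UINT64_MAX;
	return start + len;
}

/*
 * Returns 1 if [start, start + len) and [em_start, em_start + em_len)
 * share at least one byte, using the same comparison as the lookup:
 * end > em->start && start < extent_map_end(em).
 */
static int ranges_intersect(uint64_t start, uint64_t len,
			    uint64_t em_start, uint64_t em_len)
{
	uint64_t end = sat_end(start, len);
	uint64_t em_end = sat_end(em_start, em_len);

	return end > em_start && start < em_end;
}
```

The clamping matters for callers that pass a very large length to mean "to the end of the file": without it the computed end would wrap around below start and the comparison would miss every map.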

57
fs/btrfs/extent_map.h Normal file

@ -0,0 +1,57 @@
#ifndef __EXTENTMAP__
#define __EXTENTMAP__
#include <linux/rbtree.h>
#define EXTENT_MAP_LAST_BYTE ((u64)-4)
#define EXTENT_MAP_HOLE ((u64)-3)
#define EXTENT_MAP_INLINE ((u64)-2)
#define EXTENT_MAP_DELALLOC ((u64)-1)
/* bits for the flags field */
#define EXTENT_FLAG_PINNED 0 /* this entry not yet on disk, don't free it */
struct extent_map {
struct rb_node rb_node;
/* all of these are in bytes */
u64 start;
u64 len;
u64 block_start;
unsigned long flags;
struct block_device *bdev;
atomic_t refs;
int in_tree;
};
struct extent_map_tree {
struct rb_root map;
spinlock_t lock;
};
static inline u64 extent_map_end(struct extent_map *em)
{
if (em->start + em->len < em->start)
return (u64)-1;
return em->start + em->len;
}
static inline u64 extent_map_block_end(struct extent_map *em)
{
if (em->block_start + em->len < em->block_start)
return (u64)-1;
return em->block_start + em->len;
}
void extent_map_tree_init(struct extent_map_tree *tree, gfp_t mask);
struct extent_map *lookup_extent_mapping(struct extent_map_tree *tree,
u64 start, u64 len);
int add_extent_mapping(struct extent_map_tree *tree,
struct extent_map *em);
int remove_extent_mapping(struct extent_map_tree *tree, struct extent_map *em);
struct extent_map *alloc_extent_map(gfp_t mask);
void free_extent_map(struct extent_map *em);
int __init extent_map_init(void);
void extent_map_exit(void);
#endif
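
The sentinel block_start values defined in extent_map.h drive the merge decision in mergable_maps(): two maps merge when they are adjacent in the file and either share the same sentinel or are also adjacent on disk. The sketch below reproduces that rule in userspace. It is a simplification under stated assumptions: the flags, bdev and EXTENT_FLAG_PINNED checks of the original are left out, and map_sketch, sk_end and sk_mergeable are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Same sentinel values as EXTENT_MAP_* in extent_map.h. */
#define SK_MAP_LAST_BYTE ((uint64_t)-4)
#define SK_MAP_HOLE ((uint64_t)-3)
#define SK_MAP_INLINE ((uint64_t)-2)
#define SK_MAP_DELALLOC ((uint64_t)-1)

/* Reduced stand-in for struct extent_map; all fields are in bytes. */
struct map_sketch {
	uint64_t start;
	uint64_t len;
	uint64_t block_start;
};

/* start + len, clamped to the largest u64 value if the sum would wrap. */
static uint64_t sk_end(uint64_t start, uint64_t len)
{
	if (start + len < start)
		return UINT64_MAX;
	return start + len;
}

/* Returns 1 if next can be folded into prev, 0 otherwise. */
static int sk_mergeable(const struct map_sketch *prev,
			const struct map_sketch *next)
{
	/* The maps must touch in the file's logical address space. */
	if (sk_end(prev->start, prev->len) != next->start)
		return 0;
	/* Two holes, two inline extents or two delalloc ranges merge. */
	if (next->block_start == SK_MAP_HOLE &&
	    prev->block_start == SK_MAP_HOLE)
		return 1;
	if (next->block_start == SK_MAP_INLINE &&
	    prev->block_start == SK_MAP_INLINE)
		return 1;
	if (next->block_start == SK_MAP_DELALLOC &&
	    prev->block_start == SK_MAP_DELALLOC)
		return 1;
	/* Real extents must also be contiguous on disk. */
	return next->block_start < SK_MAP_LAST_BYTE - 1 &&
	       next->block_start == sk_end(prev->block_start, prev->len);
}
```

A hole followed by a real extent does not merge here, because the saturated end of the hole's block range never equals a real disk offset.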

512
fs/btrfs/file-item.c Normal file

@ -0,0 +1,512 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/bio.h>
#include <linux/pagemap.h>
#include <linux/highmem.h>
#include "ctree.h"
#include "disk-io.h"
#include "transaction.h"
#include "print-tree.h"
#define MAX_CSUM_ITEMS(r) ((((BTRFS_LEAF_DATA_SIZE(r) - \
sizeof(struct btrfs_item) * 2) / \
BTRFS_CRC32_SIZE) - 1))
int btrfs_insert_file_extent(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
u64 objectid, u64 pos,
u64 disk_offset, u64 disk_num_bytes,
u64 num_bytes, u64 offset)
{
int ret = 0;
struct btrfs_file_extent_item *item;
struct btrfs_key file_key;
struct btrfs_path *path;
struct extent_buffer *leaf;
path = btrfs_alloc_path();
BUG_ON(!path);
file_key.objectid = objectid;
file_key.offset = pos;
btrfs_set_key_type(&file_key, BTRFS_EXTENT_DATA_KEY);
ret = btrfs_insert_empty_item(trans, root, path, &file_key,
sizeof(*item));
if (ret < 0)
goto out;
BUG_ON(ret);
leaf = path->nodes[0];
item = btrfs_item_ptr(leaf, path->slots[0],
struct btrfs_file_extent_item);
btrfs_set_file_extent_disk_bytenr(leaf, item, disk_offset);
btrfs_set_file_extent_disk_num_bytes(leaf, item, disk_num_bytes);
btrfs_set_file_extent_offset(leaf, item, offset);
btrfs_set_file_extent_num_bytes(leaf, item, num_bytes);
btrfs_set_file_extent_generation(leaf, item, trans->transid);
btrfs_set_file_extent_type(leaf, item, BTRFS_FILE_EXTENT_REG);
btrfs_mark_buffer_dirty(leaf);
out:
btrfs_free_path(path);
return ret;
}
struct btrfs_csum_item *btrfs_lookup_csum(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_path *path,
u64 objectid, u64 offset,
int cow)
{
int ret;
struct btrfs_key file_key;
struct btrfs_key found_key;
struct btrfs_csum_item *item;
struct extent_buffer *leaf;
u64 csum_offset = 0;
int csums_in_item;
file_key.objectid = objectid;
file_key.offset = offset;
btrfs_set_key_type(&file_key, BTRFS_CSUM_ITEM_KEY);
ret = btrfs_search_slot(trans, root, &file_key, path, 0, cow);
if (ret < 0)
goto fail;
leaf = path->nodes[0];
if (ret > 0) {
ret = 1;
if (path->slots[0] == 0)
goto fail;
path->slots[0]--;
btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
if (btrfs_key_type(&found_key) != BTRFS_CSUM_ITEM_KEY ||
found_key.objectid != objectid) {
goto fail;
}
csum_offset = (offset - found_key.offset) >>
root->fs_info->sb->s_blocksize_bits;
csums_in_item = btrfs_item_size_nr(leaf, path->slots[0]);
csums_in_item /= BTRFS_CRC32_SIZE;
if (csum_offset >= csums_in_item) {
ret = -EFBIG;
goto fail;
}
}
item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_csum_item);
item = (struct btrfs_csum_item *)((unsigned char *)item +
csum_offset * BTRFS_CRC32_SIZE);
return item;
fail:
if (ret > 0)
ret = -ENOENT;
return ERR_PTR(ret);
}
int btrfs_lookup_file_extent(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_path *path, u64 objectid,
u64 offset, int mod)
{
int ret;
struct btrfs_key file_key;
int ins_len = mod < 0 ? -1 : 0;
int cow = mod != 0;
file_key.objectid = objectid;
file_key.offset = offset;
btrfs_set_key_type(&file_key, BTRFS_EXTENT_DATA_KEY);
ret = btrfs_search_slot(trans, root, &file_key, path, ins_len, cow);
return ret;
}
int btrfs_lookup_bio_sums(struct btrfs_root *root, struct inode *inode,
struct bio *bio)
{
u32 sum;
struct bio_vec *bvec = bio->bi_io_vec;
int bio_index = 0;
u64 offset;
u64 item_start_offset = 0;
u64 item_last_offset = 0;
u32 diff;
int ret;
struct btrfs_path *path;
struct btrfs_csum_item *item = NULL;
struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
if (bio->bi_size > PAGE_CACHE_SIZE * 8)
path->reada = 2;
WARN_ON(bio->bi_vcnt <= 0);
while (bio_index < bio->bi_vcnt) {
offset = page_offset(bvec->bv_page) + bvec->bv_offset;
ret = btrfs_find_ordered_sum(inode, offset, &sum);
if (ret == 0)
goto found;
if (!item || offset < item_start_offset ||
offset >= item_last_offset) {
struct btrfs_key found_key;
u32 item_size;
if (item)
btrfs_release_path(root, path);
item = btrfs_lookup_csum(NULL, root, path,
inode->i_ino, offset, 0);
if (IS_ERR(item)) {
ret = PTR_ERR(item);
if (ret == -ENOENT || ret == -EFBIG)
ret = 0;
sum = 0;
printk(KERN_INFO "no csum found for inode %lu start "
"%llu\n", inode->i_ino,
(unsigned long long)offset);
item = NULL;
goto found;
}
btrfs_item_key_to_cpu(path->nodes[0], &found_key,
path->slots[0]);
item_start_offset = found_key.offset;
item_size = btrfs_item_size_nr(path->nodes[0],
path->slots[0]);
item_last_offset = item_start_offset +
(item_size / BTRFS_CRC32_SIZE) *
root->sectorsize;
item = btrfs_item_ptr(path->nodes[0], path->slots[0],
struct btrfs_csum_item);
}
/*
* this byte range must be able to fit inside
* a single leaf so it will also fit inside a u32
*/
diff = offset - item_start_offset;
diff = diff / root->sectorsize;
diff = diff * BTRFS_CRC32_SIZE;
read_extent_buffer(path->nodes[0], &sum,
((unsigned long)item) + diff,
BTRFS_CRC32_SIZE);
found:
set_state_private(io_tree, offset, sum);
bio_index++;
bvec++;
}
btrfs_free_path(path);
return 0;
}
int btrfs_csum_one_bio(struct btrfs_root *root, struct inode *inode,
struct bio *bio)
{
struct btrfs_ordered_sum *sums;
struct btrfs_sector_sum *sector_sum;
struct btrfs_ordered_extent *ordered;
char *data;
struct bio_vec *bvec = bio->bi_io_vec;
int bio_index = 0;
unsigned long total_bytes = 0;
unsigned long this_sum_bytes = 0;
u64 offset;
WARN_ON(bio->bi_vcnt <= 0);
sums = kzalloc(btrfs_ordered_sum_size(root, bio->bi_size), GFP_NOFS);
if (!sums)
return -ENOMEM;
sector_sum = sums->sums;
sums->file_offset = page_offset(bvec->bv_page) + bvec->bv_offset;
sums->len = bio->bi_size;
INIT_LIST_HEAD(&sums->list);
ordered = btrfs_lookup_ordered_extent(inode, sums->file_offset);
BUG_ON(!ordered);
while (bio_index < bio->bi_vcnt) {
offset = page_offset(bvec->bv_page) + bvec->bv_offset;
if (offset >= ordered->file_offset + ordered->len ||
offset < ordered->file_offset) {
unsigned long bytes_left;
sums->len = this_sum_bytes;
this_sum_bytes = 0;
btrfs_add_ordered_sum(inode, ordered, sums);
btrfs_put_ordered_extent(ordered);
bytes_left = bio->bi_size - total_bytes;
sums = kzalloc(btrfs_ordered_sum_size(root, bytes_left),
GFP_NOFS);
BUG_ON(!sums);
sector_sum = sums->sums;
sums->len = bytes_left;
sums->file_offset = offset;
ordered = btrfs_lookup_ordered_extent(inode,
sums->file_offset);
BUG_ON(!ordered);
}
data = kmap_atomic(bvec->bv_page, KM_USER0);
sector_sum->sum = ~(u32)0;
sector_sum->sum = btrfs_csum_data(root,
data + bvec->bv_offset,
sector_sum->sum,
bvec->bv_len);
kunmap_atomic(data, KM_USER0);
btrfs_csum_final(sector_sum->sum,
(char *)&sector_sum->sum);
sector_sum->offset = page_offset(bvec->bv_page) +
bvec->bv_offset;
sector_sum++;
bio_index++;
total_bytes += bvec->bv_len;
this_sum_bytes += bvec->bv_len;
bvec++;
}
this_sum_bytes = 0;
btrfs_add_ordered_sum(inode, ordered, sums);
btrfs_put_ordered_extent(ordered);
return 0;
}
int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans,
struct btrfs_root *root, struct inode *inode,
struct btrfs_ordered_sum *sums)
{
u64 objectid = inode->i_ino;
u64 offset;
int ret;
struct btrfs_key file_key;
struct btrfs_key found_key;
u64 next_offset;
u64 total_bytes = 0;
int found_next;
struct btrfs_path *path;
struct btrfs_csum_item *item;
struct btrfs_csum_item *item_end;
struct extent_buffer *leaf = NULL;
u64 csum_offset;
struct btrfs_sector_sum *sector_sum;
u32 nritems;
u32 ins_size;
char *eb_map;
char *eb_token;
unsigned long map_len;
unsigned long map_start;
path = btrfs_alloc_path();
BUG_ON(!path);
sector_sum = sums->sums;
again:
next_offset = (u64)-1;
found_next = 0;
offset = sector_sum->offset;
file_key.objectid = objectid;
file_key.offset = offset;
btrfs_set_key_type(&file_key, BTRFS_CSUM_ITEM_KEY);
mutex_lock(&BTRFS_I(inode)->csum_mutex);
item = btrfs_lookup_csum(trans, root, path, objectid, offset, 1);
if (!IS_ERR(item)) {
leaf = path->nodes[0];
ret = 0;
goto found;
}
ret = PTR_ERR(item);
if (ret == -EFBIG) {
u32 item_size;
/* we found one, but it isn't big enough yet */
leaf = path->nodes[0];
item_size = btrfs_item_size_nr(leaf, path->slots[0]);
if ((item_size / BTRFS_CRC32_SIZE) >= MAX_CSUM_ITEMS(root)) {
/* already at max size, make a new one */
goto insert;
}
} else {
int slot = path->slots[0] + 1;
/* we didn't find a csum item, insert one */
nritems = btrfs_header_nritems(path->nodes[0]);
if (path->slots[0] >= nritems - 1) {
ret = btrfs_next_leaf(root, path);
if (ret == 1)
found_next = 1;
if (ret != 0)
goto insert;
slot = 0;
}
btrfs_item_key_to_cpu(path->nodes[0], &found_key, slot);
if (found_key.objectid != objectid ||
found_key.type != BTRFS_CSUM_ITEM_KEY) {
found_next = 1;
goto insert;
}
next_offset = found_key.offset;
found_next = 1;
goto insert;
}
/*
* at this point, we know the tree has an item, but it isn't big
* enough yet to put our csum in. Grow it
*/
btrfs_release_path(root, path);
ret = btrfs_search_slot(trans, root, &file_key, path,
BTRFS_CRC32_SIZE, 1);
if (ret < 0)
goto fail_unlock;
if (ret == 0) {
BUG();
}
if (path->slots[0] == 0) {
goto insert;
}
path->slots[0]--;
leaf = path->nodes[0];
btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
csum_offset = (offset - found_key.offset) >>
root->fs_info->sb->s_blocksize_bits;
if (btrfs_key_type(&found_key) != BTRFS_CSUM_ITEM_KEY ||
found_key.objectid != objectid ||
csum_offset >= MAX_CSUM_ITEMS(root)) {
goto insert;
}
if (csum_offset >= btrfs_item_size_nr(leaf, path->slots[0]) /
BTRFS_CRC32_SIZE) {
u32 diff = (csum_offset + 1) * BTRFS_CRC32_SIZE;
diff = diff - btrfs_item_size_nr(leaf, path->slots[0]);
if (diff != BTRFS_CRC32_SIZE)
goto insert;
ret = btrfs_extend_item(trans, root, path, diff);
BUG_ON(ret);
goto csum;
}
insert:
btrfs_release_path(root, path);
csum_offset = 0;
if (found_next) {
u64 tmp = min((u64)i_size_read(inode), next_offset);
tmp -= offset & ~((u64)root->sectorsize - 1);
tmp >>= root->fs_info->sb->s_blocksize_bits;
tmp = max((u64)1, tmp);
tmp = min(tmp, (u64)MAX_CSUM_ITEMS(root));
ins_size = BTRFS_CRC32_SIZE * tmp;
} else {
ins_size = BTRFS_CRC32_SIZE;
}
ret = btrfs_insert_empty_item(trans, root, path, &file_key,
ins_size);
if (ret < 0)
goto fail_unlock;
if (ret != 0) {
WARN_ON(1);
goto fail_unlock;
}
csum:
leaf = path->nodes[0];
item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_csum_item);
ret = 0;
item = (struct btrfs_csum_item *)((unsigned char *)item +
csum_offset * BTRFS_CRC32_SIZE);
found:
item_end = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_csum_item);
item_end = (struct btrfs_csum_item *)((unsigned char *)item_end +
btrfs_item_size_nr(leaf, path->slots[0]));
eb_token = NULL;
mutex_unlock(&BTRFS_I(inode)->csum_mutex);
cond_resched();
next_sector:
if (!eb_token ||
(unsigned long)item + BTRFS_CRC32_SIZE >= map_start + map_len) {
int err;
if (eb_token)
unmap_extent_buffer(leaf, eb_token, KM_USER1);
eb_token = NULL;
err = map_private_extent_buffer(leaf, (unsigned long)item,
BTRFS_CRC32_SIZE,
&eb_token, &eb_map,
&map_start, &map_len, KM_USER1);
if (err)
eb_token = NULL;
}
if (eb_token) {
memcpy(eb_token + ((unsigned long)item & (PAGE_CACHE_SIZE - 1)),
&sector_sum->sum, BTRFS_CRC32_SIZE);
} else {
write_extent_buffer(leaf, &sector_sum->sum,
(unsigned long)item, BTRFS_CRC32_SIZE);
}
total_bytes += root->sectorsize;
sector_sum++;
if (total_bytes < sums->len) {
item = (struct btrfs_csum_item *)((char *)item +
BTRFS_CRC32_SIZE);
if (item < item_end && offset + PAGE_CACHE_SIZE ==
sector_sum->offset) {
offset = sector_sum->offset;
goto next_sector;
}
}
if (eb_token) {
unmap_extent_buffer(leaf, eb_token, KM_USER1);
eb_token = NULL;
}
btrfs_mark_buffer_dirty(path->nodes[0]);
cond_resched();
if (total_bytes < sums->len) {
btrfs_release_path(root, path);
goto again;
}
out:
btrfs_free_path(path);
return ret;
fail_unlock:
mutex_unlock(&BTRFS_I(inode)->csum_mutex);
goto out;
}
int btrfs_csum_truncate(struct btrfs_trans_handle *trans,
struct btrfs_root *root, struct btrfs_path *path,
u64 isize)
{
struct btrfs_key key;
struct extent_buffer *leaf = path->nodes[0];
int slot = path->slots[0];
int ret;
u32 new_item_size;
u64 new_item_span;
u64 blocks;
btrfs_item_key_to_cpu(leaf, &key, slot);
if (isize <= key.offset)
return 0;
new_item_span = isize - key.offset;
blocks = (new_item_span + root->sectorsize - 1) >>
root->fs_info->sb->s_blocksize_bits;
new_item_size = blocks * BTRFS_CRC32_SIZE;
if (new_item_size >= btrfs_item_size_nr(leaf, slot))
return 0;
ret = btrfs_truncate_item(trans, root, path, new_item_size, 1);
BUG_ON(ret);
return ret;
}
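
The truncation arithmetic in btrfs_csum_truncate() can be tried outside the kernel. The following is a minimal user-space sketch, not the kernel code: the helper name csum_item_size_for_isize() is hypothetical, CRC32_SIZE is assumed to mirror BTRFS_CRC32_SIZE at 4 bytes, and a division replaces the s_blocksize_bits shift so the sketch needs no superblock.

```c
#include <assert.h>
#include <stdint.h>

#define CRC32_SIZE 4u	/* assumed to mirror BTRFS_CRC32_SIZE */

/*
 * Hypothetical helper: return the size in bytes a checksum item should
 * have once the file is cut to isize. key_offset is the file offset of
 * the first sector the item covers, cur_size is its current size.
 */
static uint32_t csum_item_size_for_isize(uint64_t isize, uint64_t key_offset,
					 uint32_t sectorsize, uint32_t cur_size)
{
	uint64_t span, blocks, new_size;

	if (isize <= key_offset)
		return cur_size;	/* item starts at or past EOF: not shrunk here */

	span = isize - key_offset;
	blocks = (span + sectorsize - 1) / sectorsize;	/* round up to whole sectors */
	new_size = blocks * CRC32_SIZE;

	/* never grow the item, only shrink it */
	return new_size >= cur_size ? cur_size : (uint32_t)new_size;
}
```

With 4096-byte sectors, an item keyed at offset 4096 and a new file size of 10000 covers two sectors, so a 40-byte item shrinks to 8 bytes.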

1133
fs/btrfs/file.c Normal file

File diff suppressed because it is too large

415
fs/btrfs/free-space-cache.c Normal file

@ -0,0 +1,415 @@
/*
* Copyright (C) 2008 Red Hat. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/sched.h>
#include "ctree.h"
static int tree_insert_offset(struct rb_root *root, u64 offset,
struct rb_node *node)
{
struct rb_node **p = &root->rb_node;
struct rb_node *parent = NULL;
struct btrfs_free_space *info;
while (*p) {
parent = *p;
info = rb_entry(parent, struct btrfs_free_space, offset_index);
if (offset < info->offset)
p = &(*p)->rb_left;
else if (offset > info->offset)
p = &(*p)->rb_right;
else
return -EEXIST;
}
rb_link_node(node, parent, p);
rb_insert_color(node, root);
return 0;
}
static int tree_insert_bytes(struct rb_root *root, u64 bytes,
struct rb_node *node)
{
struct rb_node **p = &root->rb_node;
struct rb_node *parent = NULL;
struct btrfs_free_space *info;
while (*p) {
parent = *p;
info = rb_entry(parent, struct btrfs_free_space, bytes_index);
if (bytes < info->bytes)
p = &(*p)->rb_left;
else
p = &(*p)->rb_right;
}
rb_link_node(node, parent, p);
rb_insert_color(node, root);
return 0;
}
/*
* searches the tree for the given offset. If contains is set we will return
* the free space that contains the given offset. If contains is not set we
* will return the free space that starts at or after the given offset and is
* at least bytes long.
*/
static struct btrfs_free_space *tree_search_offset(struct rb_root *root,
u64 offset, u64 bytes,
int contains)
{
struct rb_node *n = root->rb_node;
struct btrfs_free_space *entry, *ret = NULL;
while (n) {
entry = rb_entry(n, struct btrfs_free_space, offset_index);
if (offset < entry->offset) {
if (!contains &&
(!ret || entry->offset < ret->offset) &&
(bytes <= entry->bytes))
ret = entry;
n = n->rb_left;
} else if (offset > entry->offset) {
if (contains &&
(entry->offset + entry->bytes - 1) >= offset) {
ret = entry;
break;
}
n = n->rb_right;
} else {
if (bytes > entry->bytes) {
n = n->rb_right;
continue;
}
ret = entry;
break;
}
}
return ret;
}
/*
* return a chunk of at least bytes size, as close to offset as we can get.
*/
static struct btrfs_free_space *tree_search_bytes(struct rb_root *root,
u64 offset, u64 bytes)
{
struct rb_node *n = root->rb_node;
struct btrfs_free_space *entry, *ret = NULL;
while (n) {
entry = rb_entry(n, struct btrfs_free_space, bytes_index);
if (bytes < entry->bytes) {
/*
* We prefer to get a hole size as close to the size we
* are asking for so we don't take small slivers out of
* huge holes, but we also want to get as close to the
* offset as possible so we don't have a whole lot of
* fragmentation.
*/
if (offset <= entry->offset) {
if (!ret)
ret = entry;
else if (entry->bytes < ret->bytes)
ret = entry;
else if (entry->offset < ret->offset)
ret = entry;
}
n = n->rb_left;
} else if (bytes > entry->bytes) {
n = n->rb_right;
} else {
/*
* Ok we may have multiple chunks of the wanted size,
* so we don't want to take the first one we find, we
* want to take the one closest to our given offset, so
* keep searching just in case there's a better match.
*/
n = n->rb_right;
if (offset > entry->offset)
continue;
else if (!ret || entry->offset < ret->offset)
ret = entry;
}
}
return ret;
}
static void unlink_free_space(struct btrfs_block_group_cache *block_group,
struct btrfs_free_space *info)
{
rb_erase(&info->offset_index, &block_group->free_space_offset);
rb_erase(&info->bytes_index, &block_group->free_space_bytes);
}
static int link_free_space(struct btrfs_block_group_cache *block_group,
struct btrfs_free_space *info)
{
int ret = 0;
ret = tree_insert_offset(&block_group->free_space_offset, info->offset,
&info->offset_index);
if (ret)
return ret;
ret = tree_insert_bytes(&block_group->free_space_bytes, info->bytes,
&info->bytes_index);
return ret;
}
int btrfs_add_free_space(struct btrfs_block_group_cache *block_group,
u64 offset, u64 bytes)
{
struct btrfs_free_space *right_info;
struct btrfs_free_space *left_info;
struct btrfs_free_space *info = NULL;
struct btrfs_free_space *alloc_info;
int ret = 0;
alloc_info = kzalloc(sizeof(struct btrfs_free_space), GFP_NOFS);
if (!alloc_info)
return -ENOMEM;
/*
* first we want to see if there is free space adjacent to the range we
* are adding. If there is, remove that struct and add a new one to
* cover the entire range
*/
spin_lock(&block_group->lock);
right_info = tree_search_offset(&block_group->free_space_offset,
offset+bytes, 0, 1);
left_info = tree_search_offset(&block_group->free_space_offset,
offset-1, 0, 1);
if (right_info && right_info->offset == offset+bytes) {
unlink_free_space(block_group, right_info);
info = right_info;
info->offset = offset;
info->bytes += bytes;
} else if (right_info && right_info->offset != offset+bytes) {
printk(KERN_ERR "adding space in the middle of an existing "
"free space area. existing: offset=%Lu, bytes=%Lu. "
"new: offset=%Lu, bytes=%Lu\n", right_info->offset,
right_info->bytes, offset, bytes);
BUG();
}
if (left_info) {
unlink_free_space(block_group, left_info);
if (unlikely((left_info->offset + left_info->bytes) !=
offset)) {
printk(KERN_ERR "free space to the left of new free "
"space isn't quite right. existing: offset=%Lu,"
" bytes=%Lu. new: offset=%Lu, bytes=%Lu\n",
left_info->offset, left_info->bytes, offset,
bytes);
BUG();
}
if (info) {
info->offset = left_info->offset;
info->bytes += left_info->bytes;
kfree(left_info);
} else {
info = left_info;
info->bytes += bytes;
}
}
if (info) {
ret = link_free_space(block_group, info);
if (!ret)
info = NULL;
goto out;
}
info = alloc_info;
alloc_info = NULL;
info->offset = offset;
info->bytes = bytes;
ret = link_free_space(block_group, info);
if (ret)
kfree(info);
out:
spin_unlock(&block_group->lock);
if (ret) {
printk(KERN_ERR "btrfs: unable to add free space :%d\n", ret);
if (ret == -EEXIST)
BUG();
}
if (alloc_info)
kfree(alloc_info);
return ret;
}
int btrfs_remove_free_space(struct btrfs_block_group_cache *block_group,
u64 offset, u64 bytes)
{
struct btrfs_free_space *info;
int ret = 0;
spin_lock(&block_group->lock);
info = tree_search_offset(&block_group->free_space_offset, offset, 0,
1);
if (info && info->offset == offset) {
if (info->bytes < bytes) {
printk(KERN_ERR "Found free space at %Lu, size %Lu, "
"trying to use %Lu\n",
info->offset, info->bytes, bytes);
WARN_ON(1);
ret = -EINVAL;
goto out;
}
unlink_free_space(block_group, info);
if (info->bytes == bytes) {
kfree(info);
goto out;
}
info->offset += bytes;
info->bytes -= bytes;
ret = link_free_space(block_group, info);
BUG_ON(ret);
} else {
WARN_ON(1);
}
out:
spin_unlock(&block_group->lock);
return ret;
}
void btrfs_dump_free_space(struct btrfs_block_group_cache *block_group,
u64 bytes)
{
struct btrfs_free_space *info;
struct rb_node *n;
int count = 0;
for (n = rb_first(&block_group->free_space_offset); n; n = rb_next(n)) {
info = rb_entry(n, struct btrfs_free_space, offset_index);
if (info->bytes >= bytes)
count++;
//printk(KERN_INFO "offset=%Lu, bytes=%Lu\n", info->offset,
// info->bytes);
}
printk(KERN_INFO "%d blocks of free space at or bigger than %Lu bytes\n",
count, bytes);
}
u64 btrfs_block_group_free_space(struct btrfs_block_group_cache *block_group)
{
struct btrfs_free_space *info;
struct rb_node *n;
u64 ret = 0;
for (n = rb_first(&block_group->free_space_offset); n;
n = rb_next(n)) {
info = rb_entry(n, struct btrfs_free_space, offset_index);
ret += info->bytes;
}
return ret;
}
void btrfs_remove_free_space_cache(struct btrfs_block_group_cache *block_group)
{
struct btrfs_free_space *info;
struct rb_node *node;
spin_lock(&block_group->lock);
while ((node = rb_last(&block_group->free_space_bytes)) != NULL) {
info = rb_entry(node, struct btrfs_free_space, bytes_index);
unlink_free_space(block_group, info);
kfree(info);
if (need_resched()) {
spin_unlock(&block_group->lock);
cond_resched();
spin_lock(&block_group->lock);
}
}
spin_unlock(&block_group->lock);
}
struct btrfs_free_space *btrfs_find_free_space_offset(struct
btrfs_block_group_cache
*block_group, u64 offset,
u64 bytes)
{
struct btrfs_free_space *ret;
spin_lock(&block_group->lock);
ret = tree_search_offset(&block_group->free_space_offset, offset,
bytes, 0);
spin_unlock(&block_group->lock);
return ret;
}
struct btrfs_free_space *btrfs_find_free_space_bytes(struct
btrfs_block_group_cache
*block_group, u64 offset,
u64 bytes)
{
struct btrfs_free_space *ret;
spin_lock(&block_group->lock);
ret = tree_search_bytes(&block_group->free_space_bytes, offset, bytes);
spin_unlock(&block_group->lock);
return ret;
}
struct btrfs_free_space *btrfs_find_free_space(struct btrfs_block_group_cache
*block_group, u64 offset,
u64 bytes)
{
struct btrfs_free_space *ret;
spin_lock(&block_group->lock);
ret = tree_search_offset(&block_group->free_space_offset, offset,
bytes, 0);
if (!ret)
ret = tree_search_bytes(&block_group->free_space_bytes,
offset, bytes);
spin_unlock(&block_group->lock);
return ret;
}
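
The coalescing done by btrfs_add_free_space() (merge the new range with a free extent that ends where it starts, with one that starts where it ends, or with both) can be sketched in user space. This is a simplified illustration under stated assumptions: a small sorted array stands in for the two rb-trees, there is no locking, and struct free_map and free_map_add() are hypothetical names that do not exist in the kernel. Where the kernel calls BUG() on an overlapping range, the sketch returns -1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_EXTENTS 16

struct free_extent {
	uint64_t offset;
	uint64_t bytes;
};

/* Hypothetical container: extents sorted by offset, never overlapping. */
struct free_map {
	struct free_extent ext[MAX_EXTENTS];
	int nr;
};

/* Add [offset, offset + bytes) and merge with adjacent free extents. */
static int free_map_add(struct free_map *m, uint64_t offset, uint64_t bytes)
{
	int i = 0;
	int merge_left, merge_right;

	/* find the first extent that starts at or after the new range */
	while (i < m->nr && m->ext[i].offset < offset)
		i++;

	/* reject ranges that overlap an existing extent */
	if (i > 0 && m->ext[i - 1].offset + m->ext[i - 1].bytes > offset)
		return -1;
	if (i < m->nr && m->ext[i].offset < offset + bytes)
		return -1;

	merge_left = i > 0 &&
		m->ext[i - 1].offset + m->ext[i - 1].bytes == offset;
	merge_right = i < m->nr && m->ext[i].offset == offset + bytes;

	if (merge_left && merge_right) {
		/* the new range bridges two extents: fold all three into one */
		m->ext[i - 1].bytes += bytes + m->ext[i].bytes;
		memmove(&m->ext[i], &m->ext[i + 1],
			(m->nr - i - 1) * sizeof(m->ext[0]));
		m->nr--;
	} else if (merge_left) {
		m->ext[i - 1].bytes += bytes;
	} else if (merge_right) {
		m->ext[i].offset = offset;
		m->ext[i].bytes += bytes;
	} else {
		if (m->nr == MAX_EXTENTS)
			return -1;
		memmove(&m->ext[i + 1], &m->ext[i],
			(m->nr - i) * sizeof(m->ext[0]));
		m->ext[i].offset = offset;
		m->ext[i].bytes = bytes;
		m->nr++;
	}
	return 0;
}
```

Adding [0, 4096) and [8192, 12288) leaves two extents; adding [4096, 8192) afterwards bridges them into a single extent of 12288 bytes at offset 0.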

27
fs/btrfs/hash.h Normal file

@ -0,0 +1,27 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __HASH__
#define __HASH__
#include "crc32c.h"
static inline u64 btrfs_name_hash(const char *name, int len)
{
return btrfs_crc32c((u32)~1, name, len);
}
#endif
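
For experimenting with the name hash outside the kernel, here is a bitwise CRC32C sketch. It assumes that btrfs_crc32c() behaves like a raw CRC32C update (Castagnoli polynomial, reflected form 0x82F63B78) with no inversion before or after; that is an assumption about crc32c.h, which is not shown in this part of the commit. The names crc32c_update() and name_hash() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw CRC32C update: no pre- or post-inversion of the running value. */
static uint32_t crc32c_update(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			/* shift out one bit, xor in the polynomial if it was set */
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return crc;
}

/* Same seed as btrfs_name_hash(): (u32)~1, i.e. 0xFFFFFFFE. */
static uint64_t name_hash(const char *name, int len)
{
	return crc32c_update((uint32_t)~1, name, (size_t)len);
}
```

As a sanity check on the polynomial, seeding with 0xFFFFFFFF and inverting the result for the string "123456789" gives the standard CRC-32C check value 0xE3069283. An empty name hashes to the seed itself.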

206
fs/btrfs/inode-item.c Normal file

@ -0,0 +1,206 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "ctree.h"
#include "disk-io.h"
#include "transaction.h"
int find_name_in_backref(struct btrfs_path *path, const char *name,
int name_len, struct btrfs_inode_ref **ref_ret)
{
struct extent_buffer *leaf;
struct btrfs_inode_ref *ref;
unsigned long ptr;
unsigned long name_ptr;
u32 item_size;
u32 cur_offset = 0;
int len;
leaf = path->nodes[0];
item_size = btrfs_item_size_nr(leaf, path->slots[0]);
ptr = btrfs_item_ptr_offset(leaf, path->slots[0]);
while (cur_offset < item_size) {
ref = (struct btrfs_inode_ref *)(ptr + cur_offset);
len = btrfs_inode_ref_name_len(leaf, ref);
name_ptr = (unsigned long)(ref + 1);
cur_offset += len + sizeof(*ref);
if (len != name_len)
continue;
if (memcmp_extent_buffer(leaf, name, name_ptr, name_len) == 0) {
*ref_ret = ref;
return 1;
}
}
return 0;
}
int btrfs_del_inode_ref(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
const char *name, int name_len,
u64 inode_objectid, u64 ref_objectid, u64 *index)
{
struct btrfs_path *path;
struct btrfs_key key;
struct btrfs_inode_ref *ref;
struct extent_buffer *leaf;
unsigned long ptr;
unsigned long item_start;
u32 item_size;
u32 sub_item_len;
int ret;
int del_len = name_len + sizeof(*ref);
key.objectid = inode_objectid;
key.offset = ref_objectid;
btrfs_set_key_type(&key, BTRFS_INODE_REF_KEY);
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
if (ret > 0) {
ret = -ENOENT;
goto out;
} else if (ret < 0) {
goto out;
}
if (!find_name_in_backref(path, name, name_len, &ref)) {
ret = -ENOENT;
goto out;
}
leaf = path->nodes[0];
item_size = btrfs_item_size_nr(leaf, path->slots[0]);
if (index)
*index = btrfs_inode_ref_index(leaf, ref);
if (del_len == item_size) {
ret = btrfs_del_item(trans, root, path);
goto out;
}
ptr = (unsigned long)ref;
sub_item_len = name_len + sizeof(*ref);
item_start = btrfs_item_ptr_offset(leaf, path->slots[0]);
memmove_extent_buffer(leaf, ptr, ptr + sub_item_len,
item_size - (ptr + sub_item_len - item_start));
ret = btrfs_truncate_item(trans, root, path,
item_size - sub_item_len, 1);
BUG_ON(ret);
out:
btrfs_free_path(path);
return ret;
}
int btrfs_insert_inode_ref(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
const char *name, int name_len,
u64 inode_objectid, u64 ref_objectid, u64 index)
{
struct btrfs_path *path;
struct btrfs_key key;
struct btrfs_inode_ref *ref;
unsigned long ptr;
int ret;
int ins_len = name_len + sizeof(*ref);
key.objectid = inode_objectid;
key.offset = ref_objectid;
btrfs_set_key_type(&key, BTRFS_INODE_REF_KEY);
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
ret = btrfs_insert_empty_item(trans, root, path, &key,
ins_len);
if (ret == -EEXIST) {
u32 old_size;
if (find_name_in_backref(path, name, name_len, &ref))
goto out;
old_size = btrfs_item_size_nr(path->nodes[0], path->slots[0]);
ret = btrfs_extend_item(trans, root, path, ins_len);
BUG_ON(ret);
ref = btrfs_item_ptr(path->nodes[0], path->slots[0],
struct btrfs_inode_ref);
ref = (struct btrfs_inode_ref *)((unsigned long)ref + old_size);
btrfs_set_inode_ref_name_len(path->nodes[0], ref, name_len);
btrfs_set_inode_ref_index(path->nodes[0], ref, index);
ptr = (unsigned long)(ref + 1);
ret = 0;
} else if (ret < 0) {
goto out;
} else {
ref = btrfs_item_ptr(path->nodes[0], path->slots[0],
struct btrfs_inode_ref);
btrfs_set_inode_ref_name_len(path->nodes[0], ref, name_len);
btrfs_set_inode_ref_index(path->nodes[0], ref, index);
ptr = (unsigned long)(ref + 1);
}
write_extent_buffer(path->nodes[0], name, ptr, name_len);
btrfs_mark_buffer_dirty(path->nodes[0]);
out:
btrfs_free_path(path);
return ret;
}
int btrfs_insert_empty_inode(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_path *path, u64 objectid)
{
struct btrfs_key key;
int ret;
key.objectid = objectid;
btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
key.offset = 0;
ret = btrfs_insert_empty_item(trans, root, path, &key,
sizeof(struct btrfs_inode_item));
if (ret == 0 && objectid > root->highest_inode)
root->highest_inode = objectid;
return ret;
}
int btrfs_lookup_inode(struct btrfs_trans_handle *trans, struct btrfs_root
*root, struct btrfs_path *path,
struct btrfs_key *location, int mod)
{
int ins_len = mod < 0 ? -1 : 0;
int cow = mod != 0;
int ret;
int slot;
struct extent_buffer *leaf;
struct btrfs_key found_key;
ret = btrfs_search_slot(trans, root, location, path, ins_len, cow);
if (ret > 0 && btrfs_key_type(location) == BTRFS_ROOT_ITEM_KEY &&
location->offset == (u64)-1 && path->slots[0] != 0) {
slot = path->slots[0] - 1;
leaf = path->nodes[0];
btrfs_item_key_to_cpu(leaf, &found_key, slot);
if (found_key.objectid == location->objectid &&
btrfs_key_type(&found_key) == btrfs_key_type(location)) {
path->slots[0]--;
return 0;
}
}
return ret;
}
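
find_name_in_backref() and btrfs_insert_inode_ref() treat one item as a sequence of variable-length records, each a fixed header followed by the name bytes. The sketch below shows that layout on a plain byte buffer. It is an illustration only: the 10-byte header (8-byte index plus 2-byte name length, host byte order) is an assumption of the sketch, not the on-disk struct btrfs_inode_ref, and ref_append() and ref_find() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REF_HDR_SIZE 10u	/* sketch layout: 8-byte index + 2-byte name_len */

/* Append one [header][name] record at the end of the item. */
static int ref_append(unsigned char *item, uint32_t *item_size, uint32_t cap,
		      uint64_t index, const char *name, uint16_t name_len)
{
	uint32_t need = REF_HDR_SIZE + name_len;

	if (cap - *item_size < need)
		return -1;
	memcpy(item + *item_size, &index, 8);
	memcpy(item + *item_size + 8, &name_len, 2);
	memcpy(item + *item_size + REF_HDR_SIZE, name, name_len);
	*item_size += need;
	return 0;
}

/* Walk the records; return the byte offset of the match, or -1. */
static long ref_find(const unsigned char *item, uint32_t item_size,
		     const char *name, uint16_t name_len)
{
	uint32_t cur = 0;

	while (cur + REF_HDR_SIZE <= item_size) {
		uint16_t len;

		memcpy(&len, item + cur + 8, 2);
		/* compare lengths first so memcmp only runs on candidates */
		if (len == name_len &&
		    cur + REF_HDR_SIZE + len <= item_size &&
		    memcmp(item + cur + REF_HDR_SIZE, name, len) == 0)
			return (long)cur;
		cur += REF_HDR_SIZE + len;
	}
	return -1;
}
```

After appending the names "a" and "bc" to an empty item, the item is 23 bytes long, "a" is found at offset 0 and "bc" at offset 11.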

141
fs/btrfs/inode-map.c Normal file

@ -0,0 +1,141 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "ctree.h"
#include "disk-io.h"
#include "transaction.h"
int btrfs_find_highest_inode(struct btrfs_root *root, u64 *objectid)
{
struct btrfs_path *path;
int ret;
struct extent_buffer *l;
struct btrfs_key search_key;
struct btrfs_key found_key;
int slot;
path = btrfs_alloc_path();
BUG_ON(!path);
search_key.objectid = BTRFS_LAST_FREE_OBJECTID;
search_key.type = -1;
search_key.offset = (u64)-1;
ret = btrfs_search_slot(NULL, root, &search_key, path, 0, 0);
if (ret < 0)
goto error;
BUG_ON(ret == 0);
if (path->slots[0] > 0) {
slot = path->slots[0] - 1;
l = path->nodes[0];
btrfs_item_key_to_cpu(l, &found_key, slot);
*objectid = found_key.objectid;
} else {
*objectid = BTRFS_FIRST_FREE_OBJECTID;
}
ret = 0;
error:
btrfs_free_path(path);
return ret;
}
/*
* walks the btree of allocated inodes and finds a hole.
*/
int btrfs_find_free_objectid(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
u64 dirid, u64 *objectid)
{
struct btrfs_path *path;
struct btrfs_key key;
int ret;
int slot = 0;
u64 last_ino = 0;
int start_found;
struct extent_buffer *l;
struct btrfs_key search_key;
u64 search_start = dirid;
mutex_lock(&root->objectid_mutex);
if (root->last_inode_alloc >= BTRFS_FIRST_FREE_OBJECTID &&
root->last_inode_alloc < BTRFS_LAST_FREE_OBJECTID) {
*objectid = ++root->last_inode_alloc;
mutex_unlock(&root->objectid_mutex);
return 0;
}
path = btrfs_alloc_path();
BUG_ON(!path);
search_start = max(search_start, BTRFS_FIRST_FREE_OBJECTID);
search_key.objectid = search_start;
search_key.type = 0;
search_key.offset = 0;
btrfs_init_path(path);
start_found = 0;
ret = btrfs_search_slot(trans, root, &search_key, path, 0, 0);
if (ret < 0)
goto error;
while (1) {
l = path->nodes[0];
slot = path->slots[0];
if (slot >= btrfs_header_nritems(l)) {
ret = btrfs_next_leaf(root, path);
if (ret == 0)
continue;
if (ret < 0)
goto error;
if (!start_found) {
*objectid = search_start;
start_found = 1;
goto found;
}
*objectid = last_ino > search_start ?
last_ino : search_start;
goto found;
}
btrfs_item_key_to_cpu(l, &key, slot);
if (key.objectid >= search_start) {
if (start_found) {
if (last_ino < search_start)
last_ino = search_start;
if (key.objectid > last_ino) {
*objectid = last_ino;
goto found;
}
}
}
if (key.objectid >= BTRFS_LAST_FREE_OBJECTID)
break;
start_found = 1;
last_ino = key.objectid + 1;
path->slots[0]++;
}
// FIXME -ENOSPC
BUG_ON(1);
found:
btrfs_release_path(root, path);
btrfs_free_path(path);
BUG_ON(*objectid < search_start);
mutex_unlock(&root->objectid_mutex);
return 0;
error:
btrfs_release_path(root, path);
btrfs_free_path(path);
mutex_unlock(&root->objectid_mutex);
return ret;
}
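
The hole search in btrfs_find_free_objectid() amounts to scanning the allocated object IDs in key order and returning the first value at or above the search start that is not taken. The sketch below shows that idea on a sorted array. It is a simplification under stated assumptions: find_free_id() is a hypothetical name, the array replaces the btree walk, and the BTRFS_LAST_FREE_OBJECTID upper bound is not checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * used[] must be sorted in ascending order. Returns the smallest id
 * that is >= search_start and does not appear in used[].
 */
static uint64_t find_free_id(const uint64_t *used, size_t nr,
			     uint64_t search_start)
{
	uint64_t candidate = search_start;

	for (size_t i = 0; i < nr; i++) {
		if (used[i] < candidate)
			continue;	/* below the search window */
		if (used[i] > candidate)
			break;		/* gap found before this id */
		candidate = used[i] + 1;	/* taken, try the next id */
	}
	return candidate;
}
```

With IDs 256, 257 and 259 allocated and a search start of 256, the first hole is 258.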

3774
fs/btrfs/inode.c Normal file

File diff suppressed because it is too large

790
fs/btrfs/ioctl.c Normal file

@ -0,0 +1,790 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/kernel.h>
#include <linux/bio.h>
#include <linux/buffer_head.h>
#include <linux/file.h>
#include <linux/fs.h>
#include <linux/pagemap.h>
#include <linux/highmem.h>
#include <linux/time.h>
#include <linux/init.h>
#include <linux/string.h>
#include <linux/smp_lock.h>
#include <linux/backing-dev.h>
#include <linux/mpage.h>
#include <linux/swap.h>
#include <linux/writeback.h>
#include <linux/statfs.h>
#include <linux/compat.h>
#include <linux/bit_spinlock.h>
#include <linux/version.h>
#include <linux/xattr.h>
#include <linux/vmalloc.h>
#include "ctree.h"
#include "disk-io.h"
#include "transaction.h"
#include "btrfs_inode.h"
#include "ioctl.h"
#include "print-tree.h"
#include "volumes.h"
#include "locking.h"
static noinline int create_subvol(struct btrfs_root *root, char *name,
int namelen)
{
struct btrfs_trans_handle *trans;
struct btrfs_key key;
struct btrfs_root_item root_item;
struct btrfs_inode_item *inode_item;
struct extent_buffer *leaf;
struct btrfs_root *new_root = root;
struct inode *dir;
int ret;
int err;
u64 objectid;
u64 new_dirid = BTRFS_FIRST_FREE_OBJECTID;
unsigned long nr = 1;
ret = btrfs_check_free_space(root, 1, 0);
if (ret)
goto fail_commit;
trans = btrfs_start_transaction(root, 1);
BUG_ON(!trans);
ret = btrfs_find_free_objectid(trans, root->fs_info->tree_root,
0, &objectid);
if (ret)
goto fail;
leaf = btrfs_alloc_free_block(trans, root, root->leafsize, 0,
objectid, trans->transid, 0, 0, 0);
if (IS_ERR(leaf)) {
ret = PTR_ERR(leaf);
goto fail;
}
btrfs_set_header_nritems(leaf, 0);
btrfs_set_header_level(leaf, 0);
btrfs_set_header_bytenr(leaf, leaf->start);
btrfs_set_header_generation(leaf, trans->transid);
btrfs_set_header_owner(leaf, objectid);
write_extent_buffer(leaf, root->fs_info->fsid,
(unsigned long)btrfs_header_fsid(leaf),
BTRFS_FSID_SIZE);
btrfs_mark_buffer_dirty(leaf);
inode_item = &root_item.inode;
memset(inode_item, 0, sizeof(*inode_item));
inode_item->generation = cpu_to_le64(1);
inode_item->size = cpu_to_le64(3);
inode_item->nlink = cpu_to_le32(1);
inode_item->nblocks = cpu_to_le64(1);
inode_item->mode = cpu_to_le32(S_IFDIR | 0755);
btrfs_set_root_bytenr(&root_item, leaf->start);
btrfs_set_root_level(&root_item, 0);
btrfs_set_root_refs(&root_item, 1);
btrfs_set_root_used(&root_item, 0);
memset(&root_item.drop_progress, 0, sizeof(root_item.drop_progress));
root_item.drop_level = 0;
btrfs_tree_unlock(leaf);
free_extent_buffer(leaf);
leaf = NULL;
btrfs_set_root_dirid(&root_item, new_dirid);
key.objectid = objectid;
key.offset = 1;
btrfs_set_key_type(&key, BTRFS_ROOT_ITEM_KEY);
ret = btrfs_insert_root(trans, root->fs_info->tree_root, &key,
&root_item);
if (ret)
goto fail;
/*
* insert the directory item
*/
key.offset = (u64)-1;
dir = root->fs_info->sb->s_root->d_inode;
ret = btrfs_insert_dir_item(trans, root->fs_info->tree_root,
name, namelen, dir->i_ino, &key,
BTRFS_FT_DIR, 0);
if (ret)
goto fail;
ret = btrfs_insert_inode_ref(trans, root->fs_info->tree_root,
name, namelen, objectid,
root->fs_info->sb->s_root->d_inode->i_ino, 0);
if (ret)
goto fail;
ret = btrfs_commit_transaction(trans, root);
if (ret)
goto fail_commit;
new_root = btrfs_read_fs_root(root->fs_info, &key, name, namelen);
BUG_ON(!new_root);
trans = btrfs_start_transaction(new_root, 1);
BUG_ON(!trans);
ret = btrfs_create_subvol_root(new_root, trans, new_dirid,
BTRFS_I(dir)->block_group);
if (ret)
goto fail;
/* Invalidate existing dcache entry for new subvolume. */
btrfs_invalidate_dcache_root(root, name, namelen);
fail:
nr = trans->blocks_used;
err = btrfs_commit_transaction(trans, new_root);
if (err && !ret)
ret = err;
fail_commit:
btrfs_btree_balance_dirty(root, nr);
return ret;
}
static int create_snapshot(struct btrfs_root *root, char *name, int namelen)
{
struct btrfs_pending_snapshot *pending_snapshot;
struct btrfs_trans_handle *trans;
int ret;
int err;
unsigned long nr = 0;
if (!root->ref_cows)
return -EINVAL;
ret = btrfs_check_free_space(root, 1, 0);
if (ret)
goto fail_unlock;
pending_snapshot = kmalloc(sizeof(*pending_snapshot), GFP_NOFS);
if (!pending_snapshot) {
ret = -ENOMEM;
goto fail_unlock;
}
pending_snapshot->name = kmalloc(namelen + 1, GFP_NOFS);
if (!pending_snapshot->name) {
ret = -ENOMEM;
kfree(pending_snapshot);
goto fail_unlock;
}
memcpy(pending_snapshot->name, name, namelen);
pending_snapshot->name[namelen] = '\0';
trans = btrfs_start_transaction(root, 1);
BUG_ON(!trans);
pending_snapshot->root = root;
list_add(&pending_snapshot->list,
&trans->transaction->pending_snapshots);
ret = btrfs_update_inode(trans, root, root->inode);
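err = btrfs_commit_transaction(trans, root);
/* report a failed commit unless an earlier error is already pending */
if (err && !ret)
ret = err;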
fail_unlock:
btrfs_btree_balance_dirty(root, nr);
return ret;
}
int btrfs_defrag_file(struct file *file)
{
struct inode *inode = fdentry(file)->d_inode;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
struct btrfs_ordered_extent *ordered;
struct page *page;
unsigned long last_index;
unsigned long ra_pages = root->fs_info->bdi.ra_pages;
unsigned long total_read = 0;
u64 page_start;
u64 page_end;
unsigned long i;
int ret;
ret = btrfs_check_free_space(root, inode->i_size, 0);
if (ret)
return -ENOSPC;
mutex_lock(&inode->i_mutex);
last_index = inode->i_size >> PAGE_CACHE_SHIFT;
for (i = 0; i <= last_index; i++) {
if (total_read % ra_pages == 0) {
btrfs_force_ra(inode->i_mapping, &file->f_ra, file, i,
min(last_index, i + ra_pages - 1));
}
total_read++;
again:
page = grab_cache_page(inode->i_mapping, i);
if (!page)
goto out_unlock;
if (!PageUptodate(page)) {
btrfs_readpage(NULL, page);
lock_page(page);
if (!PageUptodate(page)) {
unlock_page(page);
page_cache_release(page);
goto out_unlock;
}
}
wait_on_page_writeback(page);
page_start = (u64)page->index << PAGE_CACHE_SHIFT;
page_end = page_start + PAGE_CACHE_SIZE - 1;
lock_extent(io_tree, page_start, page_end, GFP_NOFS);
ordered = btrfs_lookup_ordered_extent(inode, page_start);
if (ordered) {
unlock_extent(io_tree, page_start, page_end, GFP_NOFS);
unlock_page(page);
page_cache_release(page);
btrfs_start_ordered_extent(inode, ordered, 1);
btrfs_put_ordered_extent(ordered);
goto again;
}
set_page_extent_mapped(page);
/*
* this makes sure page_mkwrite is called on the
* page if it is dirtied again later
*/
clear_page_dirty_for_io(page);
btrfs_set_extent_delalloc(inode, page_start, page_end);
unlock_extent(io_tree, page_start, page_end, GFP_NOFS);
set_page_dirty(page);
unlock_page(page);
page_cache_release(page);
balance_dirty_pages_ratelimited_nr(inode->i_mapping, 1);
}
out_unlock:
mutex_unlock(&inode->i_mutex);
return 0;
}
/*
* Called inside transaction, so use GFP_NOFS
*/
static int btrfs_ioctl_resize(struct btrfs_root *root, void __user *arg)
{
u64 new_size;
u64 old_size;
u64 devid = 1;
struct btrfs_ioctl_vol_args *vol_args;
struct btrfs_trans_handle *trans;
struct btrfs_device *device = NULL;
char *sizestr;
char *devstr = NULL;
int ret = 0;
int namelen;
int mod = 0;
vol_args = kmalloc(sizeof(*vol_args), GFP_NOFS);
if (!vol_args)
return -ENOMEM;
if (copy_from_user(vol_args, arg, sizeof(*vol_args))) {
ret = -EFAULT;
goto out;
}
vol_args->name[BTRFS_PATH_NAME_MAX] = '\0';
namelen = strlen(vol_args->name);
mutex_lock(&root->fs_info->volume_mutex);
sizestr = vol_args->name;
devstr = strchr(sizestr, ':');
if (devstr) {
char *end;
sizestr = devstr + 1;
*devstr = '\0';
devstr = vol_args->name;
devid = simple_strtoull(devstr, &end, 10);
printk(KERN_INFO "resizing devid %llu\n", devid);
}
device = btrfs_find_device(root, devid, NULL);
if (!device) {
printk(KERN_INFO "resizer unable to find device %llu\n", devid);
ret = -EINVAL;
goto out_unlock;
}
if (!strcmp(sizestr, "max"))
new_size = device->bdev->bd_inode->i_size;
else {
if (sizestr[0] == '-') {
mod = -1;
sizestr++;
} else if (sizestr[0] == '+') {
mod = 1;
sizestr++;
}
new_size = btrfs_parse_size(sizestr);
if (new_size == 0) {
ret = -EINVAL;
goto out_unlock;
}
}
old_size = device->total_bytes;
if (mod < 0) {
if (new_size > old_size) {
ret = -EINVAL;
goto out_unlock;
}
new_size = old_size - new_size;
} else if (mod > 0) {
new_size = old_size + new_size;
}
if (new_size < 256 * 1024 * 1024) {
ret = -EINVAL;
goto out_unlock;
}
if (new_size > device->bdev->bd_inode->i_size) {
ret = -EFBIG;
goto out_unlock;
}
do_div(new_size, root->sectorsize);
new_size *= root->sectorsize;
printk(KERN_INFO "new size for %s is %llu\n",
device->name, (unsigned long long)new_size);
if (new_size > old_size) {
trans = btrfs_start_transaction(root, 1);
ret = btrfs_grow_device(trans, device, new_size);
btrfs_commit_transaction(trans, root);
} else {
ret = btrfs_shrink_device(device, new_size);
}
out_unlock:
mutex_unlock(&root->fs_info->volume_mutex);
out:
kfree(vol_args);
return ret;
}
static noinline int btrfs_ioctl_snap_create(struct btrfs_root *root,
void __user *arg)
{
struct btrfs_ioctl_vol_args *vol_args;
struct btrfs_dir_item *di;
struct btrfs_path *path;
u64 root_dirid;
int namelen;
int ret;
vol_args = kmalloc(sizeof(*vol_args), GFP_NOFS);
if (!vol_args)
return -ENOMEM;
if (copy_from_user(vol_args, arg, sizeof(*vol_args))) {
ret = -EFAULT;
goto out;
}
vol_args->name[BTRFS_PATH_NAME_MAX] = '\0';
namelen = strlen(vol_args->name);
if (strchr(vol_args->name, '/')) {
ret = -EINVAL;
goto out;
}
path = btrfs_alloc_path();
if (!path) {
ret = -ENOMEM;
goto out;
}
root_dirid = root->fs_info->sb->s_root->d_inode->i_ino;
di = btrfs_lookup_dir_item(NULL, root->fs_info->tree_root,
path, root_dirid,
vol_args->name, namelen, 0);
btrfs_free_path(path);
if (di && !IS_ERR(di)) {
ret = -EEXIST;
goto out;
}
if (IS_ERR(di)) {
ret = PTR_ERR(di);
goto out;
}
mutex_lock(&root->fs_info->drop_mutex);
if (root == root->fs_info->tree_root)
ret = create_subvol(root, vol_args->name, namelen);
else
ret = create_snapshot(root, vol_args->name, namelen);
mutex_unlock(&root->fs_info->drop_mutex);
out:
kfree(vol_args);
return ret;
}
static int btrfs_ioctl_defrag(struct file *file)
{
struct inode *inode = fdentry(file)->d_inode;
struct btrfs_root *root = BTRFS_I(inode)->root;
switch (inode->i_mode & S_IFMT) {
case S_IFDIR:
btrfs_defrag_root(root, 0);
btrfs_defrag_root(root->fs_info->extent_root, 0);
break;
case S_IFREG:
btrfs_defrag_file(file);
break;
}
return 0;
}
long btrfs_ioctl_add_dev(struct btrfs_root *root, void __user *arg)
{
struct btrfs_ioctl_vol_args *vol_args;
int ret;
vol_args = kmalloc(sizeof(*vol_args), GFP_NOFS);
if (!vol_args)
return -ENOMEM;
if (copy_from_user(vol_args, arg, sizeof(*vol_args))) {
ret = -EFAULT;
goto out;
}
vol_args->name[BTRFS_PATH_NAME_MAX] = '\0';
ret = btrfs_init_new_device(root, vol_args->name);
out:
kfree(vol_args);
return ret;
}
long btrfs_ioctl_rm_dev(struct btrfs_root *root, void __user *arg)
{
struct btrfs_ioctl_vol_args *vol_args;
int ret;
vol_args = kmalloc(sizeof(*vol_args), GFP_NOFS);
if (!vol_args)
return -ENOMEM;
if (copy_from_user(vol_args, arg, sizeof(*vol_args))) {
ret = -EFAULT;
goto out;
}
vol_args->name[BTRFS_PATH_NAME_MAX] = '\0';
ret = btrfs_rm_device(root, vol_args->name);
out:
kfree(vol_args);
return ret;
}
long btrfs_ioctl_clone(struct file *file, unsigned long src_fd)
{
struct inode *inode = fdentry(file)->d_inode;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct file *src_file;
struct inode *src;
struct btrfs_trans_handle *trans;
struct btrfs_path *path;
struct extent_buffer *leaf;
char *buf;
struct btrfs_key key;
u32 nritems;
int slot;
int ret;
src_file = fget(src_fd);
if (!src_file)
return -EBADF;
src = src_file->f_dentry->d_inode;
ret = -EISDIR;
if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode))
goto out_fput;
ret = -EXDEV;
if (src->i_sb != inode->i_sb || BTRFS_I(src)->root != root)
goto out_fput;
ret = -ENOMEM;
buf = vmalloc(btrfs_level_size(root, 0));
if (!buf)
goto out_fput;
path = btrfs_alloc_path();
if (!path) {
vfree(buf);
goto out_fput;
}
path->reada = 2;
if (inode < src) {
mutex_lock(&inode->i_mutex);
mutex_lock(&src->i_mutex);
} else {
mutex_lock(&src->i_mutex);
mutex_lock(&inode->i_mutex);
}
ret = -ENOTEMPTY;
if (inode->i_size)
goto out_unlock;
/* do any pending delalloc/csum calc on src, one way or
another, and lock file content */
while (1) {
struct btrfs_ordered_extent *ordered;
lock_extent(&BTRFS_I(src)->io_tree, 0, (u64)-1, GFP_NOFS);
ordered = btrfs_lookup_first_ordered_extent(src, (u64)-1);
if (BTRFS_I(src)->delalloc_bytes == 0 && !ordered)
break;
unlock_extent(&BTRFS_I(src)->io_tree, 0, (u64)-1, GFP_NOFS);
if (ordered)
btrfs_put_ordered_extent(ordered);
btrfs_wait_ordered_range(src, 0, (u64)-1);
}
trans = btrfs_start_transaction(root, 1);
BUG_ON(!trans);
key.objectid = src->i_ino;
key.type = BTRFS_EXTENT_DATA_KEY;
key.offset = 0;
while (1) {
/*
* note the key will change type as we walk through the
* tree.
*/
ret = btrfs_search_slot(trans, root, &key, path, 0, 0);
if (ret < 0)
goto out;
nritems = btrfs_header_nritems(path->nodes[0]);
if (path->slots[0] >= nritems) {
ret = btrfs_next_leaf(root, path);
if (ret < 0)
goto out;
if (ret > 0)
break;
nritems = btrfs_header_nritems(path->nodes[0]);
}
leaf = path->nodes[0];
slot = path->slots[0];
btrfs_item_key_to_cpu(leaf, &key, slot);
if (btrfs_key_type(&key) > BTRFS_CSUM_ITEM_KEY ||
key.objectid != src->i_ino)
break;
if (btrfs_key_type(&key) == BTRFS_EXTENT_DATA_KEY ||
btrfs_key_type(&key) == BTRFS_CSUM_ITEM_KEY) {
u32 size;
struct btrfs_key new_key;
size = btrfs_item_size_nr(leaf, slot);
read_extent_buffer(leaf, buf,
btrfs_item_ptr_offset(leaf, slot),
size);
btrfs_release_path(root, path);
memcpy(&new_key, &key, sizeof(new_key));
new_key.objectid = inode->i_ino;
ret = btrfs_insert_empty_item(trans, root, path,
&new_key, size);
if (ret)
goto out;
leaf = path->nodes[0];
slot = path->slots[0];
write_extent_buffer(leaf, buf,
btrfs_item_ptr_offset(leaf, slot),
size);
btrfs_mark_buffer_dirty(leaf);
}
if (btrfs_key_type(&key) == BTRFS_EXTENT_DATA_KEY) {
struct btrfs_file_extent_item *extent;
int found_type;
extent = btrfs_item_ptr(leaf, slot,
struct btrfs_file_extent_item);
found_type = btrfs_file_extent_type(leaf, extent);
if (found_type == BTRFS_FILE_EXTENT_REG) {
u64 ds = btrfs_file_extent_disk_bytenr(leaf,
extent);
u64 dl = btrfs_file_extent_disk_num_bytes(leaf,
extent);
/* ds == 0 means there's a hole */
if (ds != 0) {
ret = btrfs_inc_extent_ref(trans, root,
ds, dl, leaf->start,
root->root_key.objectid,
trans->transid,
inode->i_ino, key.offset);
BUG_ON(ret);
}
}
}
btrfs_release_path(root, path);
key.offset++;
}
ret = 0;
out:
btrfs_release_path(root, path);
if (ret == 0) {
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
inode->i_blocks = src->i_blocks;
btrfs_i_size_write(inode, src->i_size);
BTRFS_I(inode)->flags = BTRFS_I(src)->flags;
ret = btrfs_update_inode(trans, root, inode);
}
btrfs_end_transaction(trans, root);
unlock_extent(&BTRFS_I(src)->io_tree, 0, (u64)-1, GFP_NOFS);
if (ret)
vmtruncate(inode, 0);
out_unlock:
mutex_unlock(&src->i_mutex);
mutex_unlock(&inode->i_mutex);
vfree(buf);
btrfs_free_path(path);
out_fput:
fput(src_file);
return ret;
}
/*
* there are many ways the trans_start and trans_end ioctls can lead
* to deadlocks. They should only be used by applications that
* basically own the machine, and have a very in depth understanding
* of all the possible deadlocks and enospc problems.
*/
long btrfs_ioctl_trans_start(struct file *file)
{
struct inode *inode = fdentry(file)->d_inode;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_trans_handle *trans;
int ret = 0;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
if (file->private_data) {
ret = -EINPROGRESS;
goto out;
}
mutex_lock(&root->fs_info->trans_mutex);
root->fs_info->open_ioctl_trans++;
mutex_unlock(&root->fs_info->trans_mutex);
trans = btrfs_start_ioctl_transaction(root, 0);
if (trans)
file->private_data = trans;
else
ret = -ENOMEM;
/*printk(KERN_INFO "btrfs_ioctl_trans_start on %p\n", file);*/
out:
return ret;
}
/*
* there are many ways the trans_start and trans_end ioctls can lead
* to deadlocks. They should only be used by applications that
* basically own the machine, and have a very in depth understanding
* of all the possible deadlocks and enospc problems.
*/
long btrfs_ioctl_trans_end(struct file *file)
{
struct inode *inode = fdentry(file)->d_inode;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_trans_handle *trans;
int ret = 0;
trans = file->private_data;
if (!trans) {
ret = -EINVAL;
goto out;
}
btrfs_end_transaction(trans, root);
file->private_data = NULL;
mutex_lock(&root->fs_info->trans_mutex);
root->fs_info->open_ioctl_trans--;
mutex_unlock(&root->fs_info->trans_mutex);
out:
return ret;
}
long btrfs_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
struct btrfs_root *root = BTRFS_I(fdentry(file)->d_inode)->root;
switch (cmd) {
case BTRFS_IOC_SNAP_CREATE:
return btrfs_ioctl_snap_create(root, (void __user *)arg);
case BTRFS_IOC_DEFRAG:
return btrfs_ioctl_defrag(file);
case BTRFS_IOC_RESIZE:
return btrfs_ioctl_resize(root, (void __user *)arg);
case BTRFS_IOC_ADD_DEV:
return btrfs_ioctl_add_dev(root, (void __user *)arg);
case BTRFS_IOC_RM_DEV:
return btrfs_ioctl_rm_dev(root, (void __user *)arg);
case BTRFS_IOC_BALANCE:
return btrfs_balance(root->fs_info->dev_root);
case BTRFS_IOC_CLONE:
return btrfs_ioctl_clone(file, arg);
case BTRFS_IOC_TRANS_START:
return btrfs_ioctl_trans_start(file);
case BTRFS_IOC_TRANS_END:
return btrfs_ioctl_trans_end(file);
case BTRFS_IOC_SYNC:
btrfs_start_delalloc_inodes(root);
btrfs_sync_fs(file->f_dentry->d_sb, 1);
return 0;
}
return -ENOTTY;
}
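
Editor's note: btrfs_ioctl_resize() above accepts a string of the form "[devid:][+|-]size" or "[devid:]max". The sketch below restates that resolution logic as a user-space function so the rules can be checked in isolation. It is an illustration under assumptions, not the kernel code: resolve_resize() and parse_size() are hypothetical names, and parse_size() is only a stand-in for btrfs_parse_size(), whose real behavior is not shown in this part of the commit.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Lower bound enforced by btrfs_ioctl_resize(): 256 MiB. */
#define MIN_DEVICE_SIZE (256ULL * 1024 * 1024)

/*
 * Hypothetical stand-in for btrfs_parse_size(): decimal digits followed by
 * an optional K, M or G suffix. The real helper may differ.
 */
static uint64_t parse_size(const char *s)
{
	char *end;
	uint64_t v = strtoull(s, &end, 10);

	switch (*end) {
	case 'g':
	case 'G':
		v <<= 10;
		/* fall through */
	case 'm':
	case 'M':
		v <<= 10;
		/* fall through */
	case 'k':
	case 'K':
		v <<= 10;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Resolve "[devid:][+|-]size" or "[devid:]max" into a device id and an
 * absolute size. arg is modified in place, as vol_args->name is in the
 * ioctl. Returns 0 on success or a negative errno.
 */
static int resolve_resize(char *arg, uint64_t old_size, uint64_t dev_size,
			  uint32_t sectorsize, uint64_t *devid,
			  uint64_t *new_size)
{
	char *sizestr = arg;
	char *colon = strchr(arg, ':');
	uint64_t size;
	int mod = 0;

	*devid = 1;			/* default device id */
	if (colon) {
		*colon = '\0';
		*devid = strtoull(arg, NULL, 10);
		sizestr = colon + 1;
	}

	if (!strcmp(sizestr, "max")) {
		size = dev_size;
	} else {
		if (*sizestr == '-') {
			mod = -1;
			sizestr++;
		} else if (*sizestr == '+') {
			mod = 1;
			sizestr++;
		}
		size = parse_size(sizestr);
		if (size == 0)
			return -EINVAL;
	}

	if (mod < 0) {
		if (size > old_size)
			return -EINVAL;
		size = old_size - size;
	} else if (mod > 0) {
		size = old_size + size;
	}

	if (size < MIN_DEVICE_SIZE)
		return -EINVAL;
	if (size > dev_size)
		return -EFBIG;

	/* round down to a multiple of the sector size */
	*new_size = size - size % sectorsize;
	return 0;
}
```

With a 1 GiB device backed by a 4 GiB block device, "2:+1G" would resolve to devid 2 and 2 GiB, while "-900M" is rejected because the result falls below the 256 MiB minimum.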

55
fs/btrfs/ioctl.h Normal file

@ -0,0 +1,55 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __IOCTL_
#define __IOCTL_
#include <linux/ioctl.h>
#define BTRFS_IOCTL_MAGIC 0x94
#define BTRFS_VOL_NAME_MAX 255
#define BTRFS_PATH_NAME_MAX 4095
struct btrfs_ioctl_vol_args {
char name[BTRFS_PATH_NAME_MAX + 1];
};
#define BTRFS_IOC_SNAP_CREATE _IOW(BTRFS_IOCTL_MAGIC, 1, \
struct btrfs_ioctl_vol_args)
#define BTRFS_IOC_DEFRAG _IOW(BTRFS_IOCTL_MAGIC, 2, \
struct btrfs_ioctl_vol_args)
#define BTRFS_IOC_RESIZE _IOW(BTRFS_IOCTL_MAGIC, 3, \
struct btrfs_ioctl_vol_args)
#define BTRFS_IOC_SCAN_DEV _IOW(BTRFS_IOCTL_MAGIC, 4, \
struct btrfs_ioctl_vol_args)
/* trans start and trans end are dangerous, and only for
* use by applications that know how to avoid the
* resulting deadlocks
*/
#define BTRFS_IOC_TRANS_START _IO(BTRFS_IOCTL_MAGIC, 6)
#define BTRFS_IOC_TRANS_END _IO(BTRFS_IOCTL_MAGIC, 7)
#define BTRFS_IOC_SYNC _IO(BTRFS_IOCTL_MAGIC, 8)
#define BTRFS_IOC_CLONE _IOW(BTRFS_IOCTL_MAGIC, 9, int)
#define BTRFS_IOC_ADD_DEV _IOW(BTRFS_IOCTL_MAGIC, 10, \
struct btrfs_ioctl_vol_args)
#define BTRFS_IOC_RM_DEV _IOW(BTRFS_IOCTL_MAGIC, 11, \
struct btrfs_ioctl_vol_args)
#define BTRFS_IOC_BALANCE _IOW(BTRFS_IOCTL_MAGIC, 12, \
struct btrfs_ioctl_vol_args)
#endif
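
Editor's note: the request codes above pack a direction, the magic byte 0x94, a command number and the argument size into a single integer. The sketch below shows how user space could rebuild a few of them, inspect the encoding, and prepare a name argument. It assumes a Linux system with kernel headers installed. The definitions are copied from the header above; fill_vol_args() is a hypothetical helper and not part of btrfs.

```c
#include <linux/ioctl.h>
#include <string.h>

#define BTRFS_IOCTL_MAGIC 0x94
#define BTRFS_PATH_NAME_MAX 4095

struct btrfs_ioctl_vol_args {
	char name[BTRFS_PATH_NAME_MAX + 1];
};

#define BTRFS_IOC_SNAP_CREATE _IOW(BTRFS_IOCTL_MAGIC, 1, \
				   struct btrfs_ioctl_vol_args)
#define BTRFS_IOC_SYNC _IO(BTRFS_IOCTL_MAGIC, 8)
#define BTRFS_IOC_CLONE _IOW(BTRFS_IOCTL_MAGIC, 9, int)

/*
 * Hypothetical helper: copy a name into the argument struct, refusing
 * names that would not fit together with the terminating NUL.
 * Returns 0 on success, -1 if the name is too long.
 */
static int fill_vol_args(struct btrfs_ioctl_vol_args *args, const char *name)
{
	size_t len = strlen(name);

	if (len > BTRFS_PATH_NAME_MAX)
		return -1;
	memset(args, 0, sizeof(*args));
	memcpy(args->name, name, len);
	return 0;
}
```

A caller would then pass the filled struct to ioctl(2) on a file descriptor opened on the filesystem, for example ioctl(fd, BTRFS_IOC_SNAP_CREATE, &args). The _IOC_TYPE(), _IOC_NR() and _IOC_SIZE() macros from the same kernel header decode a request code back into its parts.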

74
fs/btrfs/locking.c Normal file

@ -0,0 +1,74 @@
/*
* Copyright (C) 2008 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/sched.h>
#include <linux/gfp.h>
#include <linux/pagemap.h>
#include <linux/spinlock.h>
#include <linux/page-flags.h>
#include <asm/bug.h>
#include "ctree.h"
#include "extent_io.h"
#include "locking.h"
int btrfs_tree_lock(struct extent_buffer *eb)
{
int i;
if (mutex_trylock(&eb->mutex))
return 0;
for (i = 0; i < 512; i++) {
cpu_relax();
if (mutex_trylock(&eb->mutex))
return 0;
}
cpu_relax();
mutex_lock_nested(&eb->mutex, BTRFS_MAX_LEVEL - btrfs_header_level(eb));
return 0;
}
int btrfs_try_tree_lock(struct extent_buffer *eb)
{
return mutex_trylock(&eb->mutex);
}
int btrfs_tree_unlock(struct extent_buffer *eb)
{
mutex_unlock(&eb->mutex);
return 0;
}
int btrfs_tree_locked(struct extent_buffer *eb)
{
return mutex_is_locked(&eb->mutex);
}
int btrfs_path_lock_waiting(struct btrfs_path *path, int level)
{
int i;
struct extent_buffer *eb;
for (i = level; i <= level + 1 && i < BTRFS_MAX_LEVEL; i++) {
eb = path->nodes[i];
if (!eb)
break;
smp_mb();
if (!list_empty(&eb->mutex.wait_list))
return 1;
}
return 0;
}
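
Editor's note: btrfs_tree_lock() above tries the mutex once, retries up to 512 times, and only then takes the blocking path. The sketch below shows the same bounded-spin-then-wait shape in portable user-space C11. It is an analogy under assumptions, not the kernel mechanism: the stw_* names are hypothetical, an atomic_flag stands in for the kernel mutex, and sched_yield() stands in for sleeping in mutex_lock_nested().

```c
#include <sched.h>
#include <stdatomic.h>

#define SPIN_TRIES 512	/* same retry bound as btrfs_tree_lock() */

struct stw_lock {
	atomic_flag held;
};

/* Returns 1 if the lock was taken, 0 otherwise (mutex_trylock() convention). */
static int stw_trylock(struct stw_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

/* Always returns 0 once the lock is held, like btrfs_tree_lock(). */
static int stw_lock(struct stw_lock *l)
{
	int i;

	/* fast path: uncontended */
	if (stw_trylock(l))
		return 0;

	/* bounded optimistic spinning: the holder may release very soon */
	for (i = 0; i < SPIN_TRIES; i++) {
		if (stw_trylock(l))
			return 0;
	}

	/* slow path: give up the CPU between attempts instead of spinning */
	while (!stw_trylock(l))
		sched_yield();
	return 0;
}

static void stw_unlock(struct stw_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The bound matters because spinning only pays off when the lock is held for a short time. Once the retry budget is spent, waiting without burning CPU is the cheaper option.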

27
fs/btrfs/locking.h Normal file

@ -0,0 +1,27 @@
/*
* Copyright (C) 2008 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __BTRFS_LOCKING_
#define __BTRFS_LOCKING_
int btrfs_tree_lock(struct extent_buffer *eb);
int btrfs_tree_unlock(struct extent_buffer *eb);
int btrfs_tree_locked(struct extent_buffer *eb);
int btrfs_try_tree_lock(struct extent_buffer *eb);
int btrfs_path_lock_waiting(struct btrfs_path *path, int level);
#endif

709
fs/btrfs/ordered-data.c Normal file

@ -0,0 +1,709 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/gfp.h>
#include <linux/slab.h>
#include <linux/blkdev.h>
#include <linux/writeback.h>
#include <linux/pagevec.h>
#include "ctree.h"
#include "transaction.h"
#include "btrfs_inode.h"
#include "extent_io.h"
static u64 entry_end(struct btrfs_ordered_extent *entry)
{
if (entry->file_offset + entry->len < entry->file_offset)
return (u64)-1;
return entry->file_offset + entry->len;
}
static struct rb_node *tree_insert(struct rb_root *root, u64 file_offset,
struct rb_node *node)
{
struct rb_node **p = &root->rb_node;
struct rb_node *parent = NULL;
struct btrfs_ordered_extent *entry;
while (*p) {
parent = *p;
entry = rb_entry(parent, struct btrfs_ordered_extent, rb_node);
if (file_offset < entry->file_offset)
p = &(*p)->rb_left;
else if (file_offset >= entry_end(entry))
p = &(*p)->rb_right;
else
return parent;
}
rb_link_node(node, parent, p);
rb_insert_color(node, root);
return NULL;
}
static struct rb_node *__tree_search(struct rb_root *root, u64 file_offset,
struct rb_node **prev_ret)
{
struct rb_node *n = root->rb_node;
struct rb_node *prev = NULL;
struct rb_node *test;
struct btrfs_ordered_extent *entry;
struct btrfs_ordered_extent *prev_entry = NULL;
while (n) {
entry = rb_entry(n, struct btrfs_ordered_extent, rb_node);
prev = n;
prev_entry = entry;
if (file_offset < entry->file_offset)
n = n->rb_left;
else if (file_offset >= entry_end(entry))
n = n->rb_right;
else
return n;
}
if (!prev_ret)
return NULL;
while(prev && file_offset >= entry_end(prev_entry)) {
test = rb_next(prev);
if (!test)
break;
prev_entry = rb_entry(test, struct btrfs_ordered_extent,
rb_node);
if (file_offset < entry_end(prev_entry))
break;
prev = test;
}
if (prev)
prev_entry = rb_entry(prev, struct btrfs_ordered_extent,
rb_node);
while(prev && file_offset < entry_end(prev_entry)) {
test = rb_prev(prev);
if (!test)
break;
prev_entry = rb_entry(test, struct btrfs_ordered_extent,
rb_node);
prev = test;
}
*prev_ret = prev;
return NULL;
}
static int offset_in_entry(struct btrfs_ordered_extent *entry, u64 file_offset)
{
if (file_offset < entry->file_offset ||
entry->file_offset + entry->len <= file_offset)
return 0;
return 1;
}
static inline struct rb_node *tree_search(struct btrfs_ordered_inode_tree *tree,
u64 file_offset)
{
struct rb_root *root = &tree->tree;
struct rb_node *prev;
struct rb_node *ret;
struct btrfs_ordered_extent *entry;
if (tree->last) {
entry = rb_entry(tree->last, struct btrfs_ordered_extent,
rb_node);
if (offset_in_entry(entry, file_offset))
return tree->last;
}
ret = __tree_search(root, file_offset, &prev);
if (!ret)
ret = prev;
if (ret)
tree->last = ret;
return ret;
}
/* allocate and add a new ordered_extent into the per-inode tree.
* file_offset is the logical offset in the file
*
* start is the disk block number of an extent already reserved in the
* extent allocation tree
*
* len is the length of the extent
*
* This also sets the EXTENT_ORDERED bit on the range in the inode.
*
* The tree is given a single reference on the ordered extent that was
* inserted.
*/
int btrfs_add_ordered_extent(struct inode *inode, u64 file_offset,
u64 start, u64 len, int nocow)
{
struct btrfs_ordered_inode_tree *tree;
struct rb_node *node;
struct btrfs_ordered_extent *entry;
tree = &BTRFS_I(inode)->ordered_tree;
entry = kzalloc(sizeof(*entry), GFP_NOFS);
if (!entry)
return -ENOMEM;
mutex_lock(&tree->mutex);
entry->file_offset = file_offset;
entry->start = start;
entry->len = len;
entry->inode = inode;
if (nocow)
set_bit(BTRFS_ORDERED_NOCOW, &entry->flags);
/* one ref for the tree */
atomic_set(&entry->refs, 1);
init_waitqueue_head(&entry->wait);
INIT_LIST_HEAD(&entry->list);
INIT_LIST_HEAD(&entry->root_extent_list);
node = tree_insert(&tree->tree, file_offset,
&entry->rb_node);
if (node) {
printk("warning dup entry from add_ordered_extent\n");
BUG();
}
set_extent_ordered(&BTRFS_I(inode)->io_tree, file_offset,
entry_end(entry) - 1, GFP_NOFS);
spin_lock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
list_add_tail(&entry->root_extent_list,
&BTRFS_I(inode)->root->fs_info->ordered_extents);
spin_unlock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
mutex_unlock(&tree->mutex);
BUG_ON(node);
return 0;
}
/*
* Add a struct btrfs_ordered_sum into the list of checksums to be inserted
* when an ordered extent is finished. If the list covers more than one
* ordered extent, it is split across multiples.
*/
int btrfs_add_ordered_sum(struct inode *inode,
struct btrfs_ordered_extent *entry,
struct btrfs_ordered_sum *sum)
{
struct btrfs_ordered_inode_tree *tree;
tree = &BTRFS_I(inode)->ordered_tree;
mutex_lock(&tree->mutex);
list_add_tail(&sum->list, &entry->list);
mutex_unlock(&tree->mutex);
return 0;
}
/*
* this is used to account for finished IO across a given range
* of the file. The IO should not span ordered extents. If
* a given ordered_extent is completely done, 1 is returned, otherwise
* 0.
*
* test_and_set_bit on a flag in the struct btrfs_ordered_extent is used
* to make sure this function only returns 1 once for a given ordered extent.
*/
int btrfs_dec_test_ordered_pending(struct inode *inode,
u64 file_offset, u64 io_size)
{
struct btrfs_ordered_inode_tree *tree;
struct rb_node *node;
struct btrfs_ordered_extent *entry;
struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
int ret;
tree = &BTRFS_I(inode)->ordered_tree;
mutex_lock(&tree->mutex);
clear_extent_ordered(io_tree, file_offset, file_offset + io_size - 1,
GFP_NOFS);
node = tree_search(tree, file_offset);
if (!node) {
ret = 1;
goto out;
}
entry = rb_entry(node, struct btrfs_ordered_extent, rb_node);
if (!offset_in_entry(entry, file_offset)) {
ret = 1;
goto out;
}
ret = test_range_bit(io_tree, entry->file_offset,
entry->file_offset + entry->len - 1,
EXTENT_ORDERED, 0);
if (ret == 0)
ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
out:
mutex_unlock(&tree->mutex);
return ret == 0;
}
/*
* used to drop a reference on an ordered extent. This will free
* the extent if the last reference is dropped
*/
int btrfs_put_ordered_extent(struct btrfs_ordered_extent *entry)
{
struct list_head *cur;
struct btrfs_ordered_sum *sum;
if (atomic_dec_and_test(&entry->refs)) {
while(!list_empty(&entry->list)) {
cur = entry->list.next;
sum = list_entry(cur, struct btrfs_ordered_sum, list);
list_del(&sum->list);
kfree(sum);
}
kfree(entry);
}
return 0;
}
/*
* remove an ordered extent from the tree. No references are dropped,
* but anyone waiting on this extent is woken up.
*/
int btrfs_remove_ordered_extent(struct inode *inode,
struct btrfs_ordered_extent *entry)
{
struct btrfs_ordered_inode_tree *tree;
struct rb_node *node;
tree = &BTRFS_I(inode)->ordered_tree;
mutex_lock(&tree->mutex);
node = &entry->rb_node;
rb_erase(node, &tree->tree);
tree->last = NULL;
set_bit(BTRFS_ORDERED_COMPLETE, &entry->flags);
spin_lock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
list_del_init(&entry->root_extent_list);
spin_unlock(&BTRFS_I(inode)->root->fs_info->ordered_extent_lock);
mutex_unlock(&tree->mutex);
wake_up(&entry->wait);
return 0;
}
int btrfs_wait_ordered_extents(struct btrfs_root *root, int nocow_only)
{
struct list_head splice;
struct list_head *cur;
struct btrfs_ordered_extent *ordered;
struct inode *inode;
INIT_LIST_HEAD(&splice);
spin_lock(&root->fs_info->ordered_extent_lock);
list_splice_init(&root->fs_info->ordered_extents, &splice);
/*
* re-read the list head on every pass: the lock is dropped below,
* so a saved next pointer could go stale
*/
while (!list_empty(&splice)) {
cur = splice.next;
ordered = list_entry(cur, struct btrfs_ordered_extent,
root_extent_list);
if (nocow_only &&
!test_bit(BTRFS_ORDERED_NOCOW, &ordered->flags)) {
/* put skipped entries back so the loop makes progress */
list_move(&ordered->root_extent_list,
&root->fs_info->ordered_extents);
cond_resched_lock(&root->fs_info->ordered_extent_lock);
continue;
}
list_del_init(&ordered->root_extent_list);
atomic_inc(&ordered->refs);
inode = ordered->inode;
/*
* the inode can't go away until all the pages are gone
* and the pages won't go away while there is still
* an ordered extent and the ordered extent won't go
* away until it is off this list. So, we can safely
* increment i_count here and call iput later
*/
atomic_inc(&inode->i_count);
spin_unlock(&root->fs_info->ordered_extent_lock);
btrfs_start_ordered_extent(inode, ordered, 1);
btrfs_put_ordered_extent(ordered);
iput(inode);
spin_lock(&root->fs_info->ordered_extent_lock);
}
list_splice_init(&splice, &root->fs_info->ordered_extents);
spin_unlock(&root->fs_info->ordered_extent_lock);
return 0;
}
/*
* Used to start IO or wait for a given ordered extent to finish.
*
* If wait is one, this effectively waits on page writeback for all the pages
* in the extent, and it waits on the io completion code to insert
* metadata into the btree corresponding to the extent
*/
void btrfs_start_ordered_extent(struct inode *inode,
struct btrfs_ordered_extent *entry,
int wait)
{
u64 start = entry->file_offset;
u64 end = start + entry->len - 1;
/*
* pages in the range can be dirty, clean or writeback. We
* start IO on any dirty ones so the wait doesn't stall waiting
* for pdflush to find them
*/
btrfs_fdatawrite_range(inode->i_mapping, start, end, WB_SYNC_NONE);
if (wait)
wait_event(entry->wait, test_bit(BTRFS_ORDERED_COMPLETE,
&entry->flags));
}
/*
* Used to wait on ordered extents across a large range of bytes.
*/
void btrfs_wait_ordered_range(struct inode *inode, u64 start, u64 len)
{
u64 end;
u64 orig_end;
u64 wait_end;
struct btrfs_ordered_extent *ordered;
if (start + len < start) {
orig_end = INT_LIMIT(loff_t);
} else {
orig_end = start + len - 1;
if (orig_end > INT_LIMIT(loff_t))
orig_end = INT_LIMIT(loff_t);
}
wait_end = orig_end;
again:
/* start IO across the range first to instantiate any delalloc
* extents
*/
btrfs_fdatawrite_range(inode->i_mapping, start, orig_end, WB_SYNC_NONE);
btrfs_wait_on_page_writeback_range(inode->i_mapping,
start >> PAGE_CACHE_SHIFT,
orig_end >> PAGE_CACHE_SHIFT);
end = orig_end;
while(1) {
ordered = btrfs_lookup_first_ordered_extent(inode, end);
if (!ordered) {
break;
}
if (ordered->file_offset > orig_end) {
btrfs_put_ordered_extent(ordered);
break;
}
if (ordered->file_offset + ordered->len < start) {
btrfs_put_ordered_extent(ordered);
break;
}
btrfs_start_ordered_extent(inode, ordered, 1);
end = ordered->file_offset;
btrfs_put_ordered_extent(ordered);
if (end == 0 || end == start)
break;
end--;
}
if (test_range_bit(&BTRFS_I(inode)->io_tree, start, orig_end,
EXTENT_ORDERED | EXTENT_DELALLOC, 0)) {
printk("inode %lu still ordered or delalloc after wait "
"%llu %llu\n", inode->i_ino,
(unsigned long long)start,
(unsigned long long)orig_end);
goto again;
}
}
/*
* find an ordered extent corresponding to file_offset. return NULL if
* nothing is found, otherwise take a reference on the extent and return it
*/
struct btrfs_ordered_extent *btrfs_lookup_ordered_extent(struct inode *inode,
u64 file_offset)
{
struct btrfs_ordered_inode_tree *tree;
struct rb_node *node;
struct btrfs_ordered_extent *entry = NULL;
tree = &BTRFS_I(inode)->ordered_tree;
mutex_lock(&tree->mutex);
node = tree_search(tree, file_offset);
if (!node)
goto out;
entry = rb_entry(node, struct btrfs_ordered_extent, rb_node);
if (!offset_in_entry(entry, file_offset))
entry = NULL;
if (entry)
atomic_inc(&entry->refs);
out:
mutex_unlock(&tree->mutex);
return entry;
}
/*
* lookup and return any extent before 'file_offset'. NULL is returned
* if none is found
*/
struct btrfs_ordered_extent *
btrfs_lookup_first_ordered_extent(struct inode *inode, u64 file_offset)
{
struct btrfs_ordered_inode_tree *tree;
struct rb_node *node;
struct btrfs_ordered_extent *entry = NULL;
tree = &BTRFS_I(inode)->ordered_tree;
mutex_lock(&tree->mutex);
node = tree_search(tree, file_offset);
if (!node)
goto out;
entry = rb_entry(node, struct btrfs_ordered_extent, rb_node);
atomic_inc(&entry->refs);
out:
mutex_unlock(&tree->mutex);
return entry;
}
/*
* After an extent is done, call this to conditionally update the on disk
* i_size. i_size is updated to cover any fully written part of the file.
*/
int btrfs_ordered_update_i_size(struct inode *inode,
struct btrfs_ordered_extent *ordered)
{
struct btrfs_ordered_inode_tree *tree = &BTRFS_I(inode)->ordered_tree;
struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
u64 disk_i_size;
u64 new_i_size;
u64 i_size_test;
struct rb_node *node;
struct btrfs_ordered_extent *test;
mutex_lock(&tree->mutex);
disk_i_size = BTRFS_I(inode)->disk_i_size;
/*
* if the disk i_size is already at the inode->i_size, or
* this ordered extent is inside the disk i_size, we're done
*/
if (disk_i_size >= inode->i_size ||
ordered->file_offset + ordered->len <= disk_i_size) {
goto out;
}
/*
* we can't update the disk_i_size if there are delalloc bytes
* between disk_i_size and this ordered extent
*/
if (test_range_bit(io_tree, disk_i_size,
ordered->file_offset + ordered->len - 1,
EXTENT_DELALLOC, 0)) {
goto out;
}
/*
* walk backward from this ordered extent to disk_i_size.
* if we find an ordered extent then we can't update disk i_size
* yet
*/
node = &ordered->rb_node;
while(1) {
node = rb_prev(node);
if (!node)
break;
test = rb_entry(node, struct btrfs_ordered_extent, rb_node);
if (test->file_offset + test->len <= disk_i_size)
break;
if (test->file_offset >= inode->i_size)
break;
if (test->file_offset >= disk_i_size)
goto out;
}
new_i_size = min_t(u64, entry_end(ordered), i_size_read(inode));
/*
* at this point, we know we can safely update i_size to at least
* the offset from this ordered extent. But, we need to
* walk forward and see if ios from higher up in the file have
* finished.
*/
node = rb_next(&ordered->rb_node);
i_size_test = 0;
if (node) {
/*
* do we have an area where IO might have finished
* between our ordered extent and the next one.
*/
test = rb_entry(node, struct btrfs_ordered_extent, rb_node);
if (test->file_offset > entry_end(ordered)) {
i_size_test = test->file_offset;
}
} else {
i_size_test = i_size_read(inode);
}
/*
* i_size_test is the end of a region after this ordered
* extent where there are no ordered extents. As long as there
* are no delalloc bytes in this area, it is safe to update
* disk_i_size to the end of the region.
*/
if (i_size_test > entry_end(ordered) &&
!test_range_bit(io_tree, entry_end(ordered), i_size_test - 1,
EXTENT_DELALLOC, 0)) {
new_i_size = min_t(u64, i_size_test, i_size_read(inode));
}
BTRFS_I(inode)->disk_i_size = new_i_size;
out:
mutex_unlock(&tree->mutex);
return 0;
}
/*
* search the ordered extents for one corresponding to 'offset' and
* try to find a checksum. This is used because we allow pages to
* be reclaimed before their checksum is actually put into the btree
*/
int btrfs_find_ordered_sum(struct inode *inode, u64 offset, u32 *sum)
{
struct btrfs_ordered_sum *ordered_sum;
struct btrfs_sector_sum *sector_sums;
struct btrfs_ordered_extent *ordered;
struct btrfs_ordered_inode_tree *tree = &BTRFS_I(inode)->ordered_tree;
struct list_head *cur;
unsigned long num_sectors;
unsigned long i;
u32 sectorsize = BTRFS_I(inode)->root->sectorsize;
int ret = 1;
ordered = btrfs_lookup_ordered_extent(inode, offset);
if (!ordered)
return 1;
mutex_lock(&tree->mutex);
list_for_each_prev(cur, &ordered->list) {
ordered_sum = list_entry(cur, struct btrfs_ordered_sum, list);
if (offset >= ordered_sum->file_offset) {
num_sectors = ordered_sum->len / sectorsize;
sector_sums = ordered_sum->sums;
for (i = 0; i < num_sectors; i++) {
if (sector_sums[i].offset == offset) {
*sum = sector_sums[i].sum;
ret = 0;
goto out;
}
}
}
}
out:
mutex_unlock(&tree->mutex);
btrfs_put_ordered_extent(ordered);
return ret;
}
/**
* btrfs_fdatawrite_range - start writeback on mapping dirty pages in range
* @mapping: address space structure to write
* @start: offset in bytes where the range starts
* @end: offset in bytes where the range ends (inclusive)
* @sync_mode: enable synchronous operation
*
* Adapted from __filemap_fdatawrite_range() in mm/filemap.c, which isn't
* exported.
*
* Start writeback against all of a mapping's dirty pages that lie
* within the byte offsets <start, end> inclusive.
*
* If sync_mode is WB_SYNC_ALL then this is a "data integrity" operation, as
* opposed to a regular memory cleansing writeback. The difference between
* these two operations is that if a dirty page/buffer is encountered, it must
* be waited upon, and not just skipped over.
*/
int btrfs_fdatawrite_range(struct address_space *mapping, loff_t start,
loff_t end, int sync_mode)
{
struct writeback_control wbc = {
.sync_mode = sync_mode,
.nr_to_write = mapping->nrpages * 2,
.range_start = start,
.range_end = end,
.for_writepages = 1,
};
return btrfs_writepages(mapping, &wbc);
}
/**
* btrfs_wait_on_page_writeback_range - wait for writeback to complete
* @mapping: target address_space
* @start: beginning page index
* @end: ending page index
*
* Adapted from wait_on_page_writeback_range() in mm/filemap.c, which isn't
* exported.
*
* Wait for writeback to complete against pages indexed by start->end
* inclusive.
*/
int btrfs_wait_on_page_writeback_range(struct address_space *mapping,
pgoff_t start, pgoff_t end)
{
struct pagevec pvec;
int nr_pages;
int ret = 0;
pgoff_t index;
if (end < start)
return 0;
pagevec_init(&pvec, 0);
index = start;
while ((index <= end) &&
(nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
PAGECACHE_TAG_WRITEBACK,
min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1)) != 0) {
unsigned i;
for (i = 0; i < nr_pages; i++) {
struct page *page = pvec.pages[i];
/* until radix tree lookup accepts end_index */
if (page->index > end)
continue;
wait_on_page_writeback(page);
if (PageError(page))
ret = -EIO;
}
pagevec_release(&pvec);
cond_resched();
}
/* Check for outstanding write errors */
if (test_and_clear_bit(AS_ENOSPC, &mapping->flags))
ret = -ENOSPC;
if (test_and_clear_bit(AS_EIO, &mapping->flags))
ret = -EIO;
return ret;
}
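
The lookup in btrfs_find_ordered_sum() can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: the struct and function names (sector_sum, ordered_sum, find_sum) are hypothetical stand-ins, a plain array replaces the kernel list, and no locking or reference counting is modelled. It only shows the search order: batches are visited newest first, batches that start past the offset are skipped, and each remaining batch is scanned linearly for an exact offset match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct btrfs_sector_sum. */
struct sector_sum {
	uint64_t offset;	/* file offset this checksum covers */
	uint32_t sum;		/* checksum value */
};

/* Hypothetical stand-in for struct btrfs_ordered_sum (array, not list). */
struct ordered_sum {
	uint64_t file_offset;		/* first file offset in this batch */
	size_t nr;			/* number of entries in sums */
	const struct sector_sum *sums;
};

/*
 * Search the batches from last to first, as list_for_each_prev() does in
 * the kernel function. Returns 0 and stores the checksum on a hit,
 * 1 if no entry matches the offset.
 */
static int find_sum(const struct ordered_sum *batches, size_t nr_batches,
		    uint64_t offset, uint32_t *sum)
{
	size_t b, i;

	assert(sum != NULL);
	for (b = nr_batches; b-- > 0; ) {
		const struct ordered_sum *os = &batches[b];

		if (offset < os->file_offset)
			continue;	/* batch starts past the offset */
		for (i = 0; i < os->nr; i++) {
			if (os->sums[i].offset == offset) {
				*sum = os->sums[i].sum;
				return 0;
			}
		}
	}
	return 1;
}
```

The return convention (0 on a hit, 1 on a miss) follows the kernel function, so a caller can treat a non-zero result as "fall back to the checksum stored in the btree".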

149
fs/btrfs/ordered-data.h Normal file

@ -0,0 +1,149 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __BTRFS_ORDERED_DATA__
#define __BTRFS_ORDERED_DATA__
/* one of these per inode */
struct btrfs_ordered_inode_tree {
struct mutex mutex;
struct rb_root tree;
struct rb_node *last;
};
/*
* these are used to collect checksums computed just before bio submission.
* They are attached via a list to the ordered extent, and
* checksum items are inserted into the tree after all the blocks in
* the ordered extent are on disk
*/
struct btrfs_sector_sum {
u64 offset;
u32 sum;
};
struct btrfs_ordered_sum {
u64 file_offset;
/*
* this is the length in bytes covered by the sums array below.
* But, the sums array may not be contiguous in the file.
*/
unsigned long len;
struct list_head list;
/* last field is a variable length array of btrfs_sector_sums */
struct btrfs_sector_sum sums[];
};
/*
* bits for the flags field:
*
* BTRFS_ORDERED_IO_DONE is set when all of the blocks are written.
* It is used to make sure metadata is inserted into the tree only once
* per extent.
*
* BTRFS_ORDERED_COMPLETE is set when the extent is removed from the
* rbtree, just before waking any waiters. It is used to indicate the
* IO is done and any metadata is inserted into the tree.
*/
#define BTRFS_ORDERED_IO_DONE 0 /* set when all the pages are written */
#define BTRFS_ORDERED_COMPLETE 1 /* set when removed from the tree */
#define BTRFS_ORDERED_NOCOW 2 /* set when we want to write in place */
struct btrfs_ordered_extent {
/* logical offset in the file */
u64 file_offset;
/* disk byte number */
u64 start;
/* length of the extent in bytes */
u64 len;
/* flags (described above) */
unsigned long flags;
/* reference count */
atomic_t refs;
/* the inode we belong to */
struct inode *inode;
/* list of checksums for insertion when the extent io is done */
struct list_head list;
/* used to wait for the BTRFS_ORDERED_COMPLETE bit */
wait_queue_head_t wait;
/* our friendly rbtree entry */
struct rb_node rb_node;
/* a per root list of all the pending ordered extents */
struct list_head root_extent_list;
};
/*
* calculates the total size you need to allocate for an ordered sum
* structure spanning 'bytes' in the file
*/
static inline int btrfs_ordered_sum_size(struct btrfs_root *root,
unsigned long bytes)
{
unsigned long num_sectors = (bytes + root->sectorsize - 1) /
root->sectorsize;
num_sectors++;
return sizeof(struct btrfs_ordered_sum) +
num_sectors * sizeof(struct btrfs_sector_sum);
}
static inline void
btrfs_ordered_inode_tree_init(struct btrfs_ordered_inode_tree *t)
{
mutex_init(&t->mutex);
t->tree.rb_node = NULL;
t->last = NULL;
}
int btrfs_put_ordered_extent(struct btrfs_ordered_extent *entry);
int btrfs_remove_ordered_extent(struct inode *inode,
struct btrfs_ordered_extent *entry);
int btrfs_dec_test_ordered_pending(struct inode *inode,
u64 file_offset, u64 io_size);
int btrfs_add_ordered_extent(struct inode *inode, u64 file_offset,
u64 start, u64 len, int nocow);
int btrfs_add_ordered_sum(struct inode *inode,
struct btrfs_ordered_extent *entry,
struct btrfs_ordered_sum *sum);
struct btrfs_ordered_extent *btrfs_lookup_ordered_extent(struct inode *inode,
u64 file_offset);
void btrfs_start_ordered_extent(struct inode *inode,
struct btrfs_ordered_extent *entry, int wait);
void btrfs_wait_ordered_range(struct inode *inode, u64 start, u64 len);
struct btrfs_ordered_extent *
btrfs_lookup_first_ordered_extent(struct inode *inode, u64 file_offset);
int btrfs_ordered_update_i_size(struct inode *inode,
struct btrfs_ordered_extent *ordered);
int btrfs_find_ordered_sum(struct inode *inode, u64 offset, u32 *sum);
int btrfs_wait_on_page_writeback_range(struct address_space *mapping,
pgoff_t start, pgoff_t end);
int btrfs_fdatawrite_range(struct address_space *mapping, loff_t start,
loff_t end, int sync_mode);
int btrfs_wait_ordered_extents(struct btrfs_root *root, int nocow_only);
#endif
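
The size calculation in btrfs_ordered_sum_size() pairs with the flexible array member at the end of struct btrfs_ordered_sum. The sketch below reproduces that arithmetic in userspace under stated assumptions: the types and names (sector_sum, ordered_sum, ordered_sum_size, alloc_ordered_sum) are hypothetical, calloc() stands in for a kernel allocator, and the list linkage is left out. The extra slot added after rounding up mirrors the num_sectors++ in the header; the header does not say why it is there, so the sketch simply keeps it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sector_sum {
	uint64_t offset;
	uint32_t sum;
};

struct ordered_sum {
	uint64_t file_offset;
	unsigned long len;		/* bytes covered by sums[] */
	struct sector_sum sums[];	/* C99 flexible array member */
};

/* Bytes needed for a header plus one checksum slot per sector, plus one. */
static size_t ordered_sum_size(unsigned long sectorsize, unsigned long bytes)
{
	unsigned long num_sectors;

	assert(sectorsize != 0);
	num_sectors = (bytes + sectorsize - 1) / sectorsize;	/* round up */
	num_sectors++;		/* spare slot, as in the header */
	return sizeof(struct ordered_sum) +
		num_sectors * sizeof(struct sector_sum);
}

/* Allocate a zeroed structure sized for 'bytes' of file data. */
static struct ordered_sum *alloc_ordered_sum(unsigned long sectorsize,
					     uint64_t file_offset,
					     unsigned long bytes)
{
	struct ordered_sum *os = calloc(1, ordered_sum_size(sectorsize, bytes));

	if (os) {
		os->file_offset = file_offset;
		os->len = bytes;
	}
	return os;
}
```

With a 4096-byte sector size, a request for 8192 bytes yields three slots (two sectors plus the spare), so indices 0 through 2 of sums are valid.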

67
fs/btrfs/orphan.c Normal file

@ -0,0 +1,67 @@
/*
* Copyright (C) 2008 Red Hat. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "ctree.h"
#include "disk-io.h"
int btrfs_insert_orphan_item(struct btrfs_trans_handle *trans,
struct btrfs_root *root, u64 offset)
{
struct btrfs_path *path;
struct btrfs_key key;
int ret = 0;
key.objectid = BTRFS_ORPHAN_OBJECTID;
btrfs_set_key_type(&key, BTRFS_ORPHAN_ITEM_KEY);
key.offset = offset;
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
ret = btrfs_insert_empty_item(trans, root, path, &key, 0);
btrfs_free_path(path);
return ret;
}
int btrfs_del_orphan_item(struct btrfs_trans_handle *trans,
struct btrfs_root *root, u64 offset)
{
struct btrfs_path *path;
struct btrfs_key key;
int ret = 0;
key.objectid = BTRFS_ORPHAN_OBJECTID;
btrfs_set_key_type(&key, BTRFS_ORPHAN_ITEM_KEY);
key.offset = offset;
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
if (ret)
goto out;
ret = btrfs_del_item(trans, root, path);
out:
btrfs_free_path(path);
return ret;
}

201
fs/btrfs/print-tree.c Normal file

@ -0,0 +1,201 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "ctree.h"
#include "disk-io.h"
#include "print-tree.h"
static void print_chunk(struct extent_buffer *eb, struct btrfs_chunk *chunk)
{
int num_stripes = btrfs_chunk_num_stripes(eb, chunk);
int i;
printk("\t\tchunk length %llu owner %llu type %llu num_stripes %d\n",
(unsigned long long)btrfs_chunk_length(eb, chunk),
(unsigned long long)btrfs_chunk_owner(eb, chunk),
(unsigned long long)btrfs_chunk_type(eb, chunk),
num_stripes);
for (i = 0 ; i < num_stripes ; i++) {
printk("\t\t\tstripe %d devid %llu offset %llu\n", i,
(unsigned long long)btrfs_stripe_devid_nr(eb, chunk, i),
(unsigned long long)btrfs_stripe_offset_nr(eb, chunk, i));
}
}
static void print_dev_item(struct extent_buffer *eb,
struct btrfs_dev_item *dev_item)
{
printk("\t\tdev item devid %llu "
"total_bytes %llu bytes used %llu\n",
(unsigned long long)btrfs_device_id(eb, dev_item),
(unsigned long long)btrfs_device_total_bytes(eb, dev_item),
(unsigned long long)btrfs_device_bytes_used(eb, dev_item));
}
void btrfs_print_leaf(struct btrfs_root *root, struct extent_buffer *l)
{
int i;
u32 nr = btrfs_header_nritems(l);
struct btrfs_item *item;
struct btrfs_extent_item *ei;
struct btrfs_root_item *ri;
struct btrfs_dir_item *di;
struct btrfs_inode_item *ii;
struct btrfs_block_group_item *bi;
struct btrfs_file_extent_item *fi;
struct btrfs_key key;
struct btrfs_key found_key;
struct btrfs_extent_ref *ref;
struct btrfs_dev_extent *dev_extent;
u32 type;
printk("leaf %llu total ptrs %d free space %d\n",
(unsigned long long)btrfs_header_bytenr(l), nr,
btrfs_leaf_free_space(root, l));
for (i = 0 ; i < nr ; i++) {
item = btrfs_item_nr(l, i);
btrfs_item_key_to_cpu(l, &key, i);
type = btrfs_key_type(&key);
printk("\titem %d key (%llu %x %llu) itemoff %d itemsize %d\n",
i,
(unsigned long long)key.objectid, type,
(unsigned long long)key.offset,
btrfs_item_offset(l, item), btrfs_item_size(l, item));
switch (type) {
case BTRFS_INODE_ITEM_KEY:
ii = btrfs_item_ptr(l, i, struct btrfs_inode_item);
printk("\t\tinode generation %llu size %llu mode %o\n",
(unsigned long long)btrfs_inode_generation(l, ii),
(unsigned long long)btrfs_inode_size(l, ii),
btrfs_inode_mode(l, ii));
break;
case BTRFS_DIR_ITEM_KEY:
di = btrfs_item_ptr(l, i, struct btrfs_dir_item);
btrfs_dir_item_key_to_cpu(l, di, &found_key);
printk("\t\tdir oid %llu type %u\n",
(unsigned long long)found_key.objectid,
btrfs_dir_type(l, di));
break;
case BTRFS_ROOT_ITEM_KEY:
ri = btrfs_item_ptr(l, i, struct btrfs_root_item);
printk("\t\troot data bytenr %llu refs %u\n",
(unsigned long long)btrfs_disk_root_bytenr(l, ri),
btrfs_disk_root_refs(l, ri));
break;
case BTRFS_EXTENT_ITEM_KEY:
ei = btrfs_item_ptr(l, i, struct btrfs_extent_item);
printk("\t\textent data refs %u\n",
btrfs_extent_refs(l, ei));
break;
case BTRFS_EXTENT_REF_KEY:
ref = btrfs_item_ptr(l, i, struct btrfs_extent_ref);
printk("\t\textent back ref root %llu gen %llu "
"owner %llu offset %llu num_refs %lu\n",
(unsigned long long)btrfs_ref_root(l, ref),
(unsigned long long)btrfs_ref_generation(l, ref),
(unsigned long long)btrfs_ref_objectid(l, ref),
(unsigned long long)btrfs_ref_offset(l, ref),
(unsigned long)btrfs_ref_num_refs(l, ref));
break;
case BTRFS_EXTENT_DATA_KEY:
fi = btrfs_item_ptr(l, i,
struct btrfs_file_extent_item);
if (btrfs_file_extent_type(l, fi) ==
BTRFS_FILE_EXTENT_INLINE) {
printk("\t\tinline extent data size %u\n",
btrfs_file_extent_inline_len(l, item));
break;
}
printk("\t\textent data disk bytenr %llu nr %llu\n",
(unsigned long long)btrfs_file_extent_disk_bytenr(l, fi),
(unsigned long long)btrfs_file_extent_disk_num_bytes(l, fi));
printk("\t\textent data offset %llu nr %llu\n",
(unsigned long long)btrfs_file_extent_offset(l, fi),
(unsigned long long)btrfs_file_extent_num_bytes(l, fi));
break;
case BTRFS_BLOCK_GROUP_ITEM_KEY:
bi = btrfs_item_ptr(l, i,
struct btrfs_block_group_item);
printk("\t\tblock group used %llu\n",
(unsigned long long)btrfs_disk_block_group_used(l, bi));
break;
case BTRFS_CHUNK_ITEM_KEY:
print_chunk(l, btrfs_item_ptr(l, i, struct btrfs_chunk));
break;
case BTRFS_DEV_ITEM_KEY:
print_dev_item(l, btrfs_item_ptr(l, i,
struct btrfs_dev_item));
break;
case BTRFS_DEV_EXTENT_KEY:
dev_extent = btrfs_item_ptr(l, i,
struct btrfs_dev_extent);
printk("\t\tdev extent chunk_tree %llu\n"
"\t\tchunk objectid %llu chunk offset %llu "
"length %llu\n",
(unsigned long long)
btrfs_dev_extent_chunk_tree(l, dev_extent),
(unsigned long long)
btrfs_dev_extent_chunk_objectid(l, dev_extent),
(unsigned long long)
btrfs_dev_extent_chunk_offset(l, dev_extent),
(unsigned long long)
btrfs_dev_extent_length(l, dev_extent));
break;
}
}
}
void btrfs_print_tree(struct btrfs_root *root, struct extent_buffer *c)
{
int i;
u32 nr;
struct btrfs_key key;
int level;
if (!c)
return;
nr = btrfs_header_nritems(c);
level = btrfs_header_level(c);
if (level == 0) {
btrfs_print_leaf(root, c);
return;
}
printk("node %llu level %d total ptrs %d free spc %u\n",
(unsigned long long)btrfs_header_bytenr(c),
btrfs_header_level(c), nr,
(u32)BTRFS_NODEPTRS_PER_BLOCK(root) - nr);
for (i = 0; i < nr; i++) {
btrfs_node_key_to_cpu(c, &key, i);
printk("\tkey %d (%llu %u %llu) block %llu\n",
i,
(unsigned long long)key.objectid,
key.type,
(unsigned long long)key.offset,
(unsigned long long)btrfs_node_blockptr(c, i));
}
for (i = 0; i < nr; i++) {
struct extent_buffer *next = read_tree_block(root,
btrfs_node_blockptr(c, i),
btrfs_level_size(root, level - 1),
btrfs_node_ptr_generation(c, i));
if (btrfs_is_leaf(next) &&
btrfs_header_level(c) != 1)
BUG();
if (btrfs_header_level(next) !=
btrfs_header_level(c) - 1)
BUG();
btrfs_print_tree(root, next);
free_extent_buffer(next);
}
}
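
The recursion in btrfs_print_tree() relies on one invariant: every child sits exactly one level below its parent, and level 0 is a leaf. A small userspace sketch of that walk follows. It is an illustration only: struct node and print_tree are hypothetical names, children are plain pointers rather than blocks read from disk, and a level mismatch is reported with a -1 return instead of BUG().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_PTRS 4	/* hypothetical fan-out, stands in for the per-block limit */

struct node {
	int level;				/* 0 for a leaf */
	int nr;					/* child pointers in use */
	const struct node *child[MAX_PTRS];
};

/*
 * Print the tree rooted at n, counting leaves in *leaves.
 * Returns 0 on success, -1 if a child is missing or is not exactly
 * one level below its parent.
 */
static int print_tree(const struct node *n, int *leaves)
{
	int i;

	if (!n)
		return 0;
	if (n->level == 0) {
		printf("leaf\n");
		(*leaves)++;
		return 0;
	}
	printf("node level %d total ptrs %d free spc %d\n",
	       n->level, n->nr, MAX_PTRS - n->nr);
	for (i = 0; i < n->nr; i++) {
		const struct node *next = n->child[i];

		if (!next || next->level != n->level - 1)
			return -1;
		if (print_tree(next, leaves))
			return -1;
	}
	return 0;
}
```

Returning an error rather than crashing is a choice made for the sketch so that the invariant can be exercised from a test; the kernel function treats a violation as fatal.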

23
fs/btrfs/print-tree.h Normal file

@ -0,0 +1,23 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __PRINT_TREE_
#define __PRINT_TREE_
void btrfs_print_leaf(struct btrfs_root *root, struct extent_buffer *l);
void btrfs_print_tree(struct btrfs_root *root, struct extent_buffer *t);
#endif

187
fs/btrfs/ref-cache.c Normal file

@ -0,0 +1,187 @@
/*
* Copyright (C) 2008 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/sched.h>
#include "ctree.h"
#include "ref-cache.h"
#include "transaction.h"
struct btrfs_leaf_ref *btrfs_alloc_leaf_ref(struct btrfs_root *root,
int nr_extents)
{
struct btrfs_leaf_ref *ref;
size_t size = btrfs_leaf_ref_size(nr_extents);
ref = kmalloc(size, GFP_NOFS);
if (ref) {
spin_lock(&root->fs_info->ref_cache_lock);
root->fs_info->total_ref_cache_size += size;
spin_unlock(&root->fs_info->ref_cache_lock);
memset(ref, 0, sizeof(*ref));
atomic_set(&ref->usage, 1);
INIT_LIST_HEAD(&ref->list);
}
return ref;
}
void btrfs_free_leaf_ref(struct btrfs_root *root, struct btrfs_leaf_ref *ref)
{
if (!ref)
return;
WARN_ON(atomic_read(&ref->usage) == 0);
if (atomic_dec_and_test(&ref->usage)) {
size_t size = btrfs_leaf_ref_size(ref->nritems);
BUG_ON(ref->in_tree);
kfree(ref);
spin_lock(&root->fs_info->ref_cache_lock);
root->fs_info->total_ref_cache_size -= size;
spin_unlock(&root->fs_info->ref_cache_lock);
}
}
static struct rb_node *tree_insert(struct rb_root *root, u64 bytenr,
struct rb_node *node)
{
struct rb_node **p = &root->rb_node;
struct rb_node *parent = NULL;
struct btrfs_leaf_ref *entry;
while (*p) {
parent = *p;
entry = rb_entry(parent, struct btrfs_leaf_ref, rb_node);
WARN_ON(!entry->in_tree);
if (bytenr < entry->bytenr)
p = &(*p)->rb_left;
else if (bytenr > entry->bytenr)
p = &(*p)->rb_right;
else
return parent;
}
entry = rb_entry(node, struct btrfs_leaf_ref, rb_node);
entry->in_tree = 1;
rb_link_node(node, parent, p);
rb_insert_color(node, root);
return NULL;
}
static struct rb_node *tree_search(struct rb_root *root, u64 bytenr)
{
struct rb_node *n = root->rb_node;
struct btrfs_leaf_ref *entry;
while (n) {
entry = rb_entry(n, struct btrfs_leaf_ref, rb_node);
WARN_ON(!entry->in_tree);
if (bytenr < entry->bytenr)
n = n->rb_left;
else if (bytenr > entry->bytenr)
n = n->rb_right;
else
return n;
}
return NULL;
}
int btrfs_remove_leaf_refs(struct btrfs_root *root, u64 max_root_gen)
{
struct btrfs_leaf_ref *ref = NULL;
struct btrfs_leaf_ref_tree *tree = root->ref_tree;
if (!tree)
return 0;
spin_lock(&tree->lock);
while (!list_empty(&tree->list)) {
ref = list_entry(tree->list.next, struct btrfs_leaf_ref, list);
BUG_ON(!ref->in_tree);
if (ref->root_gen > max_root_gen)
break;
rb_erase(&ref->rb_node, &tree->root);
ref->in_tree = 0;
list_del_init(&ref->list);
spin_unlock(&tree->lock);
btrfs_free_leaf_ref(root, ref);
cond_resched();
spin_lock(&tree->lock);
}
spin_unlock(&tree->lock);
return 0;
}
struct btrfs_leaf_ref *btrfs_lookup_leaf_ref(struct btrfs_root *root,
u64 bytenr)
{
struct rb_node *rb;
struct btrfs_leaf_ref *ref = NULL;
struct btrfs_leaf_ref_tree *tree = root->ref_tree;
if (!tree)
return NULL;
spin_lock(&tree->lock);
rb = tree_search(&tree->root, bytenr);
if (rb)
ref = rb_entry(rb, struct btrfs_leaf_ref, rb_node);
if (ref)
atomic_inc(&ref->usage);
spin_unlock(&tree->lock);
return ref;
}
int btrfs_add_leaf_ref(struct btrfs_root *root, struct btrfs_leaf_ref *ref)
{
int ret = 0;
struct rb_node *rb;
struct btrfs_leaf_ref_tree *tree = root->ref_tree;
spin_lock(&tree->lock);
rb = tree_insert(&tree->root, ref->bytenr, &ref->rb_node);
if (rb) {
ret = -EEXIST;
} else {
atomic_inc(&ref->usage);
list_add_tail(&ref->list, &tree->list);
}
spin_unlock(&tree->lock);
return ret;
}
int btrfs_remove_leaf_ref(struct btrfs_root *root, struct btrfs_leaf_ref *ref)
{
struct btrfs_leaf_ref_tree *tree = root->ref_tree;
BUG_ON(!ref->in_tree);
spin_lock(&tree->lock);
rb_erase(&ref->rb_node, &tree->root);
ref->in_tree = 0;
list_del_init(&ref->list);
spin_unlock(&tree->lock);
btrfs_free_leaf_ref(root, ref);
return 0;
}
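
The insert, lookup, and reference-count rules used by the leaf ref cache can be sketched in userspace. The code below is an assumption-laden illustration, not the kernel implementation: an unbalanced binary search tree replaces the kernel rbtree, a plain int replaces atomic_t, there is no spinlock, and all names (leaf_ref, tree_insert, add_leaf_ref, lookup_leaf_ref) are local to the sketch. What it keeps from the source is the contract: insert returns the existing node on a duplicate key, a successful add takes one reference on behalf of the tree, and a successful lookup takes one reference on behalf of the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct leaf_ref {
	uint64_t bytenr;		/* search key */
	int usage;			/* reference count, 1 after allocation */
	int in_tree;
	struct leaf_ref *left, *right;
};

/* Link node into the tree. Returns NULL on success, the clashing node otherwise. */
static struct leaf_ref *tree_insert(struct leaf_ref **root, struct leaf_ref *node)
{
	struct leaf_ref **p = root;

	while (*p) {
		if (node->bytenr < (*p)->bytenr)
			p = &(*p)->left;
		else if (node->bytenr > (*p)->bytenr)
			p = &(*p)->right;
		else
			return *p;
	}
	node->left = node->right = NULL;
	node->in_tree = 1;
	*p = node;
	return NULL;
}

static struct leaf_ref *tree_search(struct leaf_ref *root, uint64_t bytenr)
{
	while (root) {
		if (bytenr < root->bytenr)
			root = root->left;
		else if (bytenr > root->bytenr)
			root = root->right;
		else
			return root;
	}
	return NULL;
}

/* The tree holds its own reference on every node it contains. */
static int add_leaf_ref(struct leaf_ref **root, struct leaf_ref *ref)
{
	if (tree_insert(root, ref))
		return -EEXIST;
	ref->usage++;
	return 0;
}

/* A successful lookup hands the caller a reference it must drop later. */
static struct leaf_ref *lookup_leaf_ref(struct leaf_ref *root, uint64_t bytenr)
{
	struct leaf_ref *ref = tree_search(root, bytenr);

	if (ref)
		ref->usage++;
	return ref;
}
```

Because a failed add leaves the count untouched, the caller still owns exactly the one reference it got at allocation time and can release the object normally.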

71
fs/btrfs/ref-cache.h Normal file

@ -0,0 +1,71 @@
/*
* Copyright (C) 2008 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __REFCACHE__
#define __REFCACHE__
struct btrfs_extent_info {
u64 bytenr;
u64 num_bytes;
u64 objectid;
u64 offset;
};
struct btrfs_leaf_ref {
struct rb_node rb_node;
int in_tree;
atomic_t usage;
u64 root_gen;
u64 bytenr;
u64 owner;
u64 generation;
int nritems;
struct list_head list;
struct btrfs_extent_info extents[];
};
static inline size_t btrfs_leaf_ref_size(int nr_extents)
{
return sizeof(struct btrfs_leaf_ref) +
sizeof(struct btrfs_extent_info) * nr_extents;
}
static inline void btrfs_leaf_ref_tree_init(struct btrfs_leaf_ref_tree *tree)
{
tree->root.rb_node = NULL;
INIT_LIST_HEAD(&tree->list);
spin_lock_init(&tree->lock);
}
static inline int btrfs_leaf_ref_tree_empty(struct btrfs_leaf_ref_tree *tree)
{
return RB_EMPTY_ROOT(&tree->root);
}
struct btrfs_leaf_ref *btrfs_alloc_leaf_ref(struct btrfs_root *root,
int nr_extents);
void btrfs_free_leaf_ref(struct btrfs_root *root, struct btrfs_leaf_ref *ref);
struct btrfs_leaf_ref *btrfs_lookup_leaf_ref(struct btrfs_root *root,
u64 bytenr);
int btrfs_add_leaf_ref(struct btrfs_root *root, struct btrfs_leaf_ref *ref);
int btrfs_remove_leaf_refs(struct btrfs_root *root, u64 max_root_gen);
int btrfs_remove_leaf_ref(struct btrfs_root *root, struct btrfs_leaf_ref *ref);
#endif

257
fs/btrfs/root-tree.c Normal file

@ -0,0 +1,257 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include "ctree.h"
#include "transaction.h"
#include "disk-io.h"
#include "print-tree.h"
/*
* returns 0 on finding something, 1 if no more roots are there
* and < 0 on error
*/
int btrfs_search_root(struct btrfs_root *root, u64 search_start,
u64 *found_objectid)
{
struct btrfs_path *path;
struct btrfs_key search_key;
int ret;
root = root->fs_info->tree_root;
search_key.objectid = search_start;
search_key.type = (u8)-1;
search_key.offset = (u64)-1;
path = btrfs_alloc_path();
BUG_ON(!path);
again:
ret = btrfs_search_slot(NULL, root, &search_key, path, 0, 0);
if (ret < 0)
goto out;
if (ret == 0) {
ret = 1;
goto out;
}
if (path->slots[0] >= btrfs_header_nritems(path->nodes[0])) {
ret = btrfs_next_leaf(root, path);
if (ret)
goto out;
}
btrfs_item_key_to_cpu(path->nodes[0], &search_key, path->slots[0]);
if (search_key.type != BTRFS_ROOT_ITEM_KEY) {
search_key.offset++;
btrfs_release_path(root, path);
goto again;
}
ret = 0;
*found_objectid = search_key.objectid;
out:
btrfs_free_path(path);
return ret;
}
int btrfs_find_last_root(struct btrfs_root *root, u64 objectid,
struct btrfs_root_item *item, struct btrfs_key *key)
{
struct btrfs_path *path;
struct btrfs_key search_key;
struct btrfs_key found_key;
struct extent_buffer *l;
int ret;
int slot;
search_key.objectid = objectid;
search_key.type = (u8)-1;
search_key.offset = (u64)-1;
path = btrfs_alloc_path();
BUG_ON(!path);
ret = btrfs_search_slot(NULL, root, &search_key, path, 0, 0);
if (ret < 0)
goto out;
BUG_ON(ret == 0);
l = path->nodes[0];
BUG_ON(path->slots[0] == 0);
slot = path->slots[0] - 1;
btrfs_item_key_to_cpu(l, &found_key, slot);
if (found_key.objectid != objectid) {
ret = 1;
goto out;
}
read_extent_buffer(l, item, btrfs_item_ptr_offset(l, slot),
sizeof(*item));
memcpy(key, &found_key, sizeof(found_key));
ret = 0;
out:
btrfs_free_path(path);
return ret;
}
int btrfs_update_root(struct btrfs_trans_handle *trans, struct btrfs_root
*root, struct btrfs_key *key, struct btrfs_root_item
*item)
{
struct btrfs_path *path;
struct extent_buffer *l;
int ret;
int slot;
unsigned long ptr;
path = btrfs_alloc_path();
BUG_ON(!path);
ret = btrfs_search_slot(trans, root, key, path, 0, 1);
if (ret < 0)
goto out;
if (ret != 0) {
btrfs_print_leaf(root, path->nodes[0]);
printk("unable to update root key %llu %u %llu\n",
(unsigned long long)key->objectid, key->type,
(unsigned long long)key->offset);
BUG();
}
l = path->nodes[0];
slot = path->slots[0];
ptr = btrfs_item_ptr_offset(l, slot);
write_extent_buffer(l, item, ptr, sizeof(*item));
btrfs_mark_buffer_dirty(path->nodes[0]);
out:
btrfs_release_path(root, path);
btrfs_free_path(path);
return ret;
}
int btrfs_insert_root(struct btrfs_trans_handle *trans, struct btrfs_root
*root, struct btrfs_key *key, struct btrfs_root_item
*item)
{
int ret;
ret = btrfs_insert_item(trans, root, key, item, sizeof(*item));
return ret;
}
int btrfs_find_dead_roots(struct btrfs_root *root, u64 objectid,
struct btrfs_root *latest)
{
struct btrfs_root *dead_root;
struct btrfs_item *item;
struct btrfs_root_item *ri;
struct btrfs_key key;
struct btrfs_key found_key;
struct btrfs_path *path;
int ret;
u32 nritems;
struct extent_buffer *leaf;
int slot;
key.objectid = objectid;
btrfs_set_key_type(&key, BTRFS_ROOT_ITEM_KEY);
key.offset = 0;
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
again:
ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
if (ret < 0)
goto err;
while (1) {
leaf = path->nodes[0];
nritems = btrfs_header_nritems(leaf);
slot = path->slots[0];
if (slot >= nritems) {
ret = btrfs_next_leaf(root, path);
if (ret)
break;
leaf = path->nodes[0];
nritems = btrfs_header_nritems(leaf);
slot = path->slots[0];
}
item = btrfs_item_nr(leaf, slot);
btrfs_item_key_to_cpu(leaf, &key, slot);
if (btrfs_key_type(&key) != BTRFS_ROOT_ITEM_KEY)
goto next;
if (key.objectid < objectid)
goto next;
if (key.objectid > objectid)
break;
ri = btrfs_item_ptr(leaf, slot, struct btrfs_root_item);
if (btrfs_disk_root_refs(leaf, ri) != 0)
goto next;
memcpy(&found_key, &key, sizeof(key));
key.offset++;
btrfs_release_path(root, path);
dead_root =
btrfs_read_fs_root_no_radix(root->fs_info->tree_root,
&found_key);
if (IS_ERR(dead_root)) {
ret = PTR_ERR(dead_root);
goto err;
}
ret = btrfs_add_dead_root(dead_root, latest);
if (ret)
goto err;
goto again;
next:
slot++;
path->slots[0]++;
}
ret = 0;
err:
btrfs_free_path(path);
return ret;
}
int btrfs_del_root(struct btrfs_trans_handle *trans, struct btrfs_root *root,
struct btrfs_key *key)
{
struct btrfs_path *path;
int ret;
u32 refs;
struct btrfs_root_item *ri;
struct extent_buffer *leaf;
path = btrfs_alloc_path();
BUG_ON(!path);
ret = btrfs_search_slot(trans, root, key, path, -1, 1);
if (ret < 0)
goto out;
if (ret) {
btrfs_print_leaf(root, path->nodes[0]);
printk("failed to del %llu %u %llu\n",
(unsigned long long)key->objectid, key->type,
(unsigned long long)key->offset);
}
BUG_ON(ret != 0);
leaf = path->nodes[0];
ri = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_root_item);
refs = btrfs_disk_root_refs(leaf, ri);
BUG_ON(refs != 0);
ret = btrfs_del_item(trans, root, path);
out:
btrfs_release_path(root, path);
btrfs_free_path(path);
return ret;
}
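
btrfs_find_last_root() uses a common search idiom: look up a key that sorts after every real key for the object (type and offset set to their maximum values), then step back one slot to land on the last item for that object. The sketch below shows the idiom on a sorted array. It is a simplified model under assumptions: struct key, key_cmp, search_slot, and find_last are hypothetical names, a binary search over an array replaces btrfs_search_slot(), and the maximal key itself is assumed never to be stored (the kernel function asserts the same with BUG_ON(ret == 0)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Order keys by objectid, then type, then offset. */
static int key_cmp(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Index of the first key >= target, i.e. where target would be inserted. */
static size_t search_slot(const struct key *keys, size_t nr,
			  const struct key *target)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (key_cmp(&keys[mid], target) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Returns 0 and fills *found with the last key for objectid, 1 if none. */
static int find_last(const struct key *keys, size_t nr, uint64_t objectid,
		     struct key *found)
{
	struct key target = { objectid, UINT8_MAX, UINT64_MAX };
	size_t slot = search_slot(keys, nr, &target);

	if (slot == 0)
		return 1;	/* nothing sorts before the target */
	slot--;			/* step back to the previous item */
	if (keys[slot].objectid != objectid)
		return 1;
	*found = keys[slot];
	return 0;
}
```

The slot == 0 case is handled here with a "not found" return; the kernel function instead asserts that it cannot happen.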

111
fs/btrfs/struct-funcs.c Normal file

@ -0,0 +1,111 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/highmem.h>
#define BTRFS_SETGET_FUNCS(name, type, member, bits) \
u##bits btrfs_##name(struct extent_buffer *eb, \
type *s) \
{ \
unsigned long part_offset = (unsigned long)s; \
unsigned long offset = part_offset + offsetof(type, member); \
type *p; \
/* ugly, but we want the fast path here */ \
if (eb->map_token && offset >= eb->map_start && \
offset + sizeof(((type *)0)->member) <= eb->map_start + \
eb->map_len) { \
p = (type *)(eb->kaddr + part_offset - eb->map_start); \
return le##bits##_to_cpu(p->member); \
} \
{ \
int err; \
char *map_token; \
char *kaddr; \
int unmap_on_exit = (eb->map_token == NULL); \
unsigned long map_start; \
unsigned long map_len; \
__le##bits res; \
err = map_extent_buffer(eb, offset, \
sizeof(((type *)0)->member), \
&map_token, &kaddr, \
&map_start, &map_len, KM_USER1); \
if (err) { \
read_eb_member(eb, s, type, member, &res); \
return le##bits##_to_cpu(res); \
} \
p = (type *)(kaddr + part_offset - map_start); \
res = p->member; \
if (unmap_on_exit) \
unmap_extent_buffer(eb, map_token, KM_USER1); \
return le##bits##_to_cpu(res); \
} \
} \
void btrfs_set_##name(struct extent_buffer *eb, \
type *s, u##bits val) \
{ \
unsigned long part_offset = (unsigned long)s; \
unsigned long offset = part_offset + offsetof(type, member); \
type *p; \
/* ugly, but we want the fast path here */ \
if (eb->map_token && offset >= eb->map_start && \
offset + sizeof(((type *)0)->member) <= eb->map_start + \
eb->map_len) { \
p = (type *)(eb->kaddr + part_offset - eb->map_start); \
p->member = cpu_to_le##bits(val); \
return; \
} \
{ \
int err; \
char *map_token; \
char *kaddr; \
int unmap_on_exit = (eb->map_token == NULL); \
unsigned long map_start; \
unsigned long map_len; \
err = map_extent_buffer(eb, offset, \
sizeof(((type *)0)->member), \
&map_token, &kaddr, \
&map_start, &map_len, KM_USER1); \
if (err) { \
val = cpu_to_le##bits(val); \
write_eb_member(eb, s, type, member, &val); \
return; \
} \
p = (type *)(kaddr + part_offset - map_start); \
p->member = cpu_to_le##bits(val); \
if (unmap_on_exit) \
unmap_extent_buffer(eb, map_token, KM_USER1); \
} \
}
#include "ctree.h"
void btrfs_node_key(struct extent_buffer *eb,
struct btrfs_disk_key *disk_key, int nr)
{
unsigned long ptr = btrfs_node_key_ptr_offset(nr);
if (eb->map_token && ptr >= eb->map_start &&
ptr + sizeof(*disk_key) <= eb->map_start + eb->map_len) {
memcpy(disk_key, eb->kaddr + ptr - eb->map_start,
sizeof(*disk_key));
return;
} else if (eb->map_token) {
unmap_extent_buffer(eb, eb->map_token, KM_USER1);
eb->map_token = NULL;
}
read_eb_member(eb, (struct btrfs_key_ptr *)ptr,
struct btrfs_key_ptr, key, disk_key);
}
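
BTRFS_SETGET_FUNCS generates one getter and one setter per on-disk field, each converting between CPU byte order and little-endian storage. The userspace sketch below shows only that part of the technique. It is not the kernel macro: the mapping fast path and map_extent_buffer() fallback are omitted, a flat byte buffer stands in for the extent buffer, byte order is handled with explicit shifts instead of the kernel's le*_to_cpu helpers, and every name (disk_item, load_le, store_le, SETGET_FUNCS) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout; only the member offsets matter here. */
struct disk_item {
	uint64_t offset;
	uint32_t size;
	uint16_t flags;
};

/* Read 'bytes' little-endian bytes starting at p. */
static uint64_t load_le(const unsigned char *p, unsigned int bytes)
{
	uint64_t v = 0;
	unsigned int i;

	for (i = 0; i < bytes; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Write the low 'bytes' bytes of v at p, least significant byte first. */
static void store_le(unsigned char *p, unsigned int bytes, uint64_t v)
{
	unsigned int i;

	for (i = 0; i < bytes; i++)
		p[i] = (unsigned char)(v >> (8 * i));
}

/* Generate get_<name>() and set_<name>() for one member of 'type'. */
#define SETGET_FUNCS(name, type, member, bits)				\
static uint##bits##_t get_##name(const unsigned char *buf, size_t base)	\
{									\
	return (uint##bits##_t)load_le(					\
		buf + base + offsetof(type, member), (bits) / 8);	\
}									\
static void set_##name(unsigned char *buf, size_t base,			\
		       uint##bits##_t val)				\
{									\
	store_le(buf + base + offsetof(type, member), (bits) / 8, val);	\
}

SETGET_FUNCS(item_offset, struct disk_item, offset, 64)
SETGET_FUNCS(item_size, struct disk_item, size, 32)
SETGET_FUNCS(item_flags, struct disk_item, flags, 16)
```

As in the kernel macro, the structure pointer is never dereferenced directly; it only contributes a base offset, and the member is located with offsetof().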

663
fs/btrfs/super.c Normal file

@ -0,0 +1,663 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/blkdev.h>
#include <linux/module.h>
#include <linux/buffer_head.h>
#include <linux/fs.h>
#include <linux/pagemap.h>
#include <linux/highmem.h>
#include <linux/time.h>
#include <linux/init.h>
#include <linux/string.h>
#include <linux/smp_lock.h>
#include <linux/backing-dev.h>
#include <linux/mount.h>
#include <linux/mpage.h>
#include <linux/swap.h>
#include <linux/writeback.h>
#include <linux/statfs.h>
#include <linux/compat.h>
#include <linux/parser.h>
#include <linux/ctype.h>
#include <linux/namei.h>
#include <linux/miscdevice.h>
#include "ctree.h"
#include "disk-io.h"
#include "transaction.h"
#include "btrfs_inode.h"
#include "ioctl.h"
#include "print-tree.h"
#include "xattr.h"
#include "volumes.h"
#include "version.h"
#include "export.h"
#define BTRFS_SUPER_MAGIC 0x9123683E
static struct super_operations btrfs_super_ops;
static void btrfs_put_super (struct super_block * sb)
{
struct btrfs_root *root = btrfs_sb(sb);
struct btrfs_fs_info *fs = root->fs_info;
int ret;
ret = close_ctree(root);
if (ret) {
printk("close ctree returns %d\n", ret);
}
btrfs_sysfs_del_super(fs);
sb->s_fs_info = NULL;
}
enum {
Opt_degraded, Opt_subvol, Opt_device, Opt_nodatasum, Opt_nodatacow,
Opt_max_extent, Opt_max_inline, Opt_alloc_start, Opt_nobarrier,
Opt_ssd, Opt_thread_pool, Opt_noacl, Opt_err,
};
static match_table_t tokens = {
{Opt_degraded, "degraded"},
{Opt_subvol, "subvol=%s"},
{Opt_device, "device=%s"},
{Opt_nodatasum, "nodatasum"},
{Opt_nodatacow, "nodatacow"},
{Opt_nobarrier, "nobarrier"},
{Opt_max_extent, "max_extent=%s"},
{Opt_max_inline, "max_inline=%s"},
{Opt_alloc_start, "alloc_start=%s"},
{Opt_thread_pool, "thread_pool=%d"},
{Opt_ssd, "ssd"},
{Opt_noacl, "noacl"},
{Opt_err, NULL},
};
u64 btrfs_parse_size(char *str)
{
u64 res;
u64 mult = 1;
char *end;
char last;
/* simple_strtoull keeps the full 64 bits, even on 32-bit hosts */
res = simple_strtoull(str, &end, 10);
last = end[0];
if (isalpha(last)) {
last = tolower(last);
switch (last) {
case 'g':
mult *= 1024;
/* fall through */
case 'm':
mult *= 1024;
/* fall through */
case 'k':
mult *= 1024;
}
res = res * mult;
}
return res;
}
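/*
 * Illustrative sketch, not part of the commit: a userspace analogue of the
 * suffix handling in btrfs_parse_size(). The name parse_size_sketch is
 * hypothetical, and strtoull() stands in for the kernel's string-to-integer
 * helper. The switch cases fall through on purpose, so 'g' multiplies by
 * 1024 three times, 'm' twice and 'k' once.
 */

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal number with an optional k/m/g suffix (powers of 1024). */
static uint64_t parse_size_sketch(const char *str)
{
    char *end;
    uint64_t res = strtoull(str, &end, 10);
    uint64_t mult = 1;

    switch (tolower((unsigned char)*end)) {
    case 'g':
        mult *= 1024;
        /* fall through */
    case 'm':
        mult *= 1024;
        /* fall through */
    case 'k':
        mult *= 1024;
        break;
    default:
        break;
    }
    return res * mult;
}
```

/*
 * With this sketch, "4k" parses to 4096 and a bare "512" stays 512. Any
 * other trailing character is ignored, as in the function above.
 */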
/*
* Regular mount options parser. Everything that is needed only when
* reading in a new superblock is parsed here.
*/
int btrfs_parse_options(struct btrfs_root *root, char *options)
{
struct btrfs_fs_info *info = root->fs_info;
substring_t args[MAX_OPT_ARGS];
char *p, *num, *orig;
int intarg;
if (!options)
return 0;
/*
* strsep changes the string, duplicate it because parse_options
* gets called twice. strsep also advances the pointer it is given
* and leaves it NULL at the end, so keep the start of the copy in
* orig for the final kfree.
*/
options = kstrdup(options, GFP_NOFS);
if (!options)
return -ENOMEM;
orig = options;
while ((p = strsep(&options, ",")) != NULL) {
int token;
if (!*p)
continue;
token = match_token(p, tokens, args);
switch (token) {
case Opt_degraded:
printk(KERN_INFO "btrfs: allowing degraded mounts\n");
btrfs_set_opt(info->mount_opt, DEGRADED);
break;
case Opt_subvol:
case Opt_device:
/*
* These are parsed by btrfs_parse_early_options
* and can be happily ignored here.
*/
break;
case Opt_nodatasum:
printk(KERN_INFO "btrfs: setting nodatasum\n");
btrfs_set_opt(info->mount_opt, NODATASUM);
break;
case Opt_nodatacow:
/* nodatacow implies nodatasum */
printk(KERN_INFO "btrfs: setting nodatacow\n");
btrfs_set_opt(info->mount_opt, NODATACOW);
btrfs_set_opt(info->mount_opt, NODATASUM);
break;
case Opt_ssd:
printk(KERN_INFO "btrfs: use ssd allocation scheme\n");
btrfs_set_opt(info->mount_opt, SSD);
break;
case Opt_nobarrier:
printk(KERN_INFO "btrfs: turning off barriers\n");
btrfs_set_opt(info->mount_opt, NOBARRIER);
break;
case Opt_thread_pool:
intarg = 0;
match_int(&args[0], &intarg);
if (intarg) {
info->thread_pool_size = intarg;
printk(KERN_INFO "btrfs: thread pool %d\n",
info->thread_pool_size);
}
break;
case Opt_max_extent:
num = match_strdup(&args[0]);
if (num) {
info->max_extent = btrfs_parse_size(num);
kfree(num);
info->max_extent = max_t(u64,
info->max_extent, root->sectorsize);
printk(KERN_INFO "btrfs: max_extent at %llu\n",
info->max_extent);
}
break;
case Opt_max_inline:
num = match_strdup(&args[0]);
if (num) {
info->max_inline = btrfs_parse_size(num);
kfree(num);
if (info->max_inline) {
info->max_inline = max_t(u64,
info->max_inline,
root->sectorsize);
}
printk(KERN_INFO "btrfs: max_inline at %llu\n",
info->max_inline);
}
break;
case Opt_alloc_start:
num = match_strdup(&args[0]);
if (num) {
info->alloc_start = btrfs_parse_size(num);
kfree(num);
printk(KERN_INFO
"btrfs: allocations start at %llu\n",
info->alloc_start);
}
break;
case Opt_noacl:
root->fs_info->sb->s_flags &= ~MS_POSIXACL;
break;
default:
break;
}
}
kfree(orig);
return 0;
}
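/*
 * Illustrative sketch, not part of the commit: strsep() advances the
 * pointer it is handed and leaves it NULL once the string is exhausted, so
 * a duplicated option string has to be released through a second pointer
 * that still refers to the start of the allocation. The helper name
 * count_options_sketch is hypothetical; strdup() and free() stand in for
 * kstrdup() and kfree().
 */

```c
#define _DEFAULT_SOURCE /* exposes strsep() and strdup() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count the non-empty comma-separated tokens; returns -1 if out of memory. */
static int count_options_sketch(const char *options)
{
    char *orig, *cursor, *p;
    int n = 0;

    if (!options)
        return 0;
    orig = strdup(options);
    if (!orig)
        return -1;
    cursor = orig;
    while ((p = strsep(&cursor, ",")) != NULL) {
        if (!*p)
            continue; /* skip empty tokens such as ",," */
        n++;
    }
    /* cursor is NULL here; only orig still points at the allocation */
    assert(cursor == NULL);
    free(orig);
    return n;
}
```

/*
 * Empty tokens are skipped in the same way as the "if (!*p) continue;"
 * test in the kernel loop.
 */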
/*
* Parse mount options that are required early in the mount process.
*
* All other options will be parsed on much later in the mount process and
* only when we need to allocate a new super block.
*/
static int btrfs_parse_early_options(const char *options, int flags,
void *holder, char **subvol_name,
struct btrfs_fs_devices **fs_devices)
{
substring_t args[MAX_OPT_ARGS];
char *opts, *orig, *p, *dev;
int error = 0;
if (!options)
goto out;
/*
* strsep changes the string, duplicate it because parse_options
* gets called twice. strsep also advances opts, so remember the
* start of the allocation in orig for the final kfree.
*/
opts = kstrdup(options, GFP_KERNEL);
if (!opts)
return -ENOMEM;
orig = opts;
while ((p = strsep(&opts, ",")) != NULL) {
int token;
if (!*p)
continue;
token = match_token(p, tokens, args);
switch (token) {
case Opt_subvol:
*subvol_name = match_strdup(&args[0]);
break;
case Opt_device:
/* free the duplicated device path once it has been scanned */
dev = match_strdup(&args[0]);
if (!dev) {
error = -ENOMEM;
goto out_free_opts;
}
error = btrfs_scan_one_device(dev, flags, holder, fs_devices);
kfree(dev);
if (error)
goto out_free_opts;
break;
default:
break;
}
}
out_free_opts:
kfree(orig);
out:
/*
* If no subvolume name is specified we use the default one. Allocate
* a copy of the string "default" here so that code later in the
* mount path doesn't care if it's the default volume or another one.
*/
if (!*subvol_name) {
*subvol_name = kstrdup("default", GFP_KERNEL);
if (!*subvol_name)
return -ENOMEM;
}
return error;
}
static int btrfs_fill_super(struct super_block * sb,
struct btrfs_fs_devices *fs_devices,
void * data, int silent)
{
struct inode * inode;
struct dentry * root_dentry;
struct btrfs_super_block *disk_super;
struct btrfs_root *tree_root;
struct btrfs_inode *bi;
int err;
sb->s_maxbytes = MAX_LFS_FILESIZE;
sb->s_magic = BTRFS_SUPER_MAGIC;
sb->s_op = &btrfs_super_ops;
sb->s_export_op = &btrfs_export_ops;
sb->s_xattr = btrfs_xattr_handlers;
sb->s_time_gran = 1;
sb->s_flags |= MS_POSIXACL;
tree_root = open_ctree(sb, fs_devices, (char *)data);
if (IS_ERR(tree_root)) {
printk("btrfs: open_ctree failed\n");
return PTR_ERR(tree_root);
}
sb->s_fs_info = tree_root;
disk_super = &tree_root->fs_info->super_copy;
inode = btrfs_iget_locked(sb, btrfs_super_root_dir(disk_super),
tree_root);
/* check the allocation before touching the btrfs inode fields */
if (!inode) {
err = -ENOMEM;
goto fail_close;
}
bi = BTRFS_I(inode);
bi->location.objectid = inode->i_ino;
bi->location.offset = 0;
bi->root = tree_root;
btrfs_set_key_type(&bi->location, BTRFS_INODE_ITEM_KEY);
if (inode->i_state & I_NEW) {
btrfs_read_locked_inode(inode);
unlock_new_inode(inode);
}
root_dentry = d_alloc_root(inode);
if (!root_dentry) {
iput(inode);
err = -ENOMEM;
goto fail_close;
}
/* this does the super kobj at the same time */
err = btrfs_sysfs_add_super(tree_root->fs_info);
if (err) {
/* drop the root dentry, which also releases its inode reference */
dput(root_dentry);
goto fail_close;
}
sb->s_root = root_dentry;
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,25)
save_mount_options(sb, data);
#endif
return 0;
fail_close:
close_ctree(tree_root);
return err;
}
int btrfs_sync_fs(struct super_block *sb, int wait)
{
struct btrfs_trans_handle *trans;
struct btrfs_root *root;
int ret;
root = btrfs_sb(sb);
sb->s_dirt = 0;
if (!wait) {
filemap_flush(root->fs_info->btree_inode->i_mapping);
return 0;
}
btrfs_clean_old_snapshots(root);
trans = btrfs_start_transaction(root, 1);
ret = btrfs_commit_transaction(trans, root);
sb->s_dirt = 0;
return ret;
}
static void btrfs_write_super(struct super_block *sb)
{
sb->s_dirt = 0;
}
static int btrfs_test_super(struct super_block *s, void *data)
{
struct btrfs_fs_devices *test_fs_devices = data;
struct btrfs_root *root = btrfs_sb(s);
return root->fs_info->fs_devices == test_fs_devices;
}
/*
* Find a superblock for the given device / mount point.
*
* Note: This is based on get_sb_bdev from fs/super.c with a few additions
* for multiple device setup. Make sure to keep it in sync.
*/
static int btrfs_get_sb(struct file_system_type *fs_type, int flags,
const char *dev_name, void *data, struct vfsmount *mnt)
{
char *subvol_name = NULL;
struct block_device *bdev = NULL;
struct super_block *s;
struct dentry *root;
struct btrfs_fs_devices *fs_devices = NULL;
int error = 0;
error = btrfs_parse_early_options(data, flags, fs_type,
&subvol_name, &fs_devices);
if (error)
goto error;
error = btrfs_scan_one_device(dev_name, flags, fs_type, &fs_devices);
if (error)
goto error_free_subvol_name;
error = btrfs_open_devices(fs_devices, flags, fs_type);
if (error)
goto error_free_subvol_name;
bdev = fs_devices->latest_bdev;
s = sget(fs_type, btrfs_test_super, set_anon_super, fs_devices);
if (IS_ERR(s))
goto error_s;
if (s->s_root) {
if ((flags ^ s->s_flags) & MS_RDONLY) {
up_write(&s->s_umount);
deactivate_super(s);
error = -EBUSY;
goto error_bdev;
}
} else {
char b[BDEVNAME_SIZE];
s->s_flags = flags;
strlcpy(s->s_id, bdevname(bdev, b), sizeof(s->s_id));
error = btrfs_fill_super(s, fs_devices, data,
flags & MS_SILENT ? 1 : 0);
if (error) {
up_write(&s->s_umount);
deactivate_super(s);
goto error_free_subvol_name;
}
btrfs_sb(s)->fs_info->bdev_holder = fs_type;
s->s_flags |= MS_ACTIVE;
}
if (!strcmp(subvol_name, "."))
root = dget(s->s_root);
else {
mutex_lock(&s->s_root->d_inode->i_mutex);
root = lookup_one_len(subvol_name, s->s_root, strlen(subvol_name));
mutex_unlock(&s->s_root->d_inode->i_mutex);
if (IS_ERR(root)) {
up_write(&s->s_umount);
deactivate_super(s);
error = PTR_ERR(root);
goto error_free_subvol_name;
}
if (!root->d_inode) {
dput(root);
up_write(&s->s_umount);
deactivate_super(s);
error = -ENXIO;
goto error_free_subvol_name;
}
}
mnt->mnt_sb = s;
mnt->mnt_root = root;
kfree(subvol_name);
return 0;
error_s:
error = PTR_ERR(s);
error_bdev:
btrfs_close_devices(fs_devices);
error_free_subvol_name:
kfree(subvol_name);
error:
return error;
}
static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
{
struct btrfs_root *root = btrfs_sb(dentry->d_sb);
struct btrfs_super_block *disk_super = &root->fs_info->super_copy;
int bits = dentry->d_sb->s_blocksize_bits;
__be32 *fsid = (__be32 *)root->fs_info->fsid;
buf->f_namelen = BTRFS_NAME_LEN;
buf->f_blocks = btrfs_super_total_bytes(disk_super) >> bits;
buf->f_bfree = buf->f_blocks -
(btrfs_super_bytes_used(disk_super) >> bits);
buf->f_bavail = buf->f_bfree;
buf->f_bsize = dentry->d_sb->s_blocksize;
buf->f_type = BTRFS_SUPER_MAGIC;
/* We treat it as constant endianness (it doesn't matter _which_)
because we want the fsid to come out the same whether mounted
on a big-endian or little-endian host */
buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]);
buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]);
/* Mask in the root object ID too, to disambiguate subvols */
buf->f_fsid.val[0] ^= BTRFS_I(dentry->d_inode)->root->objectid >> 32;
buf->f_fsid.val[1] ^= BTRFS_I(dentry->d_inode)->root->objectid;
return 0;
}
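/*
 * Illustrative sketch, not part of the commit: how btrfs_statfs() folds the
 * 16-byte fsid and the 64-bit root object ID into the two 32-bit f_fsid
 * words. The names load_be32_sketch and fold_fsid_sketch are hypothetical.
 * Reading each word byte by byte as big-endian gives the same result on
 * any host, which is the point of the be32_to_cpu() calls above.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Read 4 bytes as a big-endian 32-bit value, independent of host order. */
static uint32_t load_be32_sketch(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* XOR words 0 and 2 into val[0], words 1 and 3 into val[1], then mix in
 * the high and low halves of the root object ID. */
static void fold_fsid_sketch(const uint8_t fsid[16], uint64_t root_objectid,
                             uint32_t val[2])
{
    val[0] = load_be32_sketch(fsid) ^ load_be32_sketch(fsid + 8);
    val[1] = load_be32_sketch(fsid + 4) ^ load_be32_sketch(fsid + 12);
    val[0] ^= (uint32_t)(root_objectid >> 32);
    val[1] ^= (uint32_t)root_objectid;
}
```

/*
 * Two subvolumes of the same filesystem share the fsid but differ in the
 * root object ID, so they report different f_fsid values.
 */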
static struct file_system_type btrfs_fs_type = {
.owner = THIS_MODULE,
.name = "btrfs",
.get_sb = btrfs_get_sb,
.kill_sb = kill_anon_super,
.fs_flags = FS_REQUIRES_DEV,
};
static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
struct btrfs_ioctl_vol_args *vol;
struct btrfs_fs_devices *fs_devices;
int ret = 0;
int len;
vol = kmalloc(sizeof(*vol), GFP_KERNEL);
if (!vol)
return -ENOMEM;
if (copy_from_user(vol, (void __user *)arg, sizeof(*vol))) {
ret = -EFAULT;
goto out;
}
len = strnlen(vol->name, BTRFS_PATH_NAME_MAX);
switch (cmd) {
case BTRFS_IOC_SCAN_DEV:
ret = btrfs_scan_one_device(vol->name, MS_RDONLY,
&btrfs_fs_type, &fs_devices);
break;
}
out:
kfree(vol);
return ret;
}
static void btrfs_write_super_lockfs(struct super_block *sb)
{
struct btrfs_root *root = btrfs_sb(sb);
mutex_lock(&root->fs_info->transaction_kthread_mutex);
mutex_lock(&root->fs_info->cleaner_mutex);
}
static void btrfs_unlockfs(struct super_block *sb)
{
struct btrfs_root *root = btrfs_sb(sb);
mutex_unlock(&root->fs_info->cleaner_mutex);
mutex_unlock(&root->fs_info->transaction_kthread_mutex);
}
static struct super_operations btrfs_super_ops = {
.delete_inode = btrfs_delete_inode,
.put_super = btrfs_put_super,
.write_super = btrfs_write_super,
.sync_fs = btrfs_sync_fs,
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,25)
.read_inode = btrfs_read_locked_inode,
#else
.show_options = generic_show_options,
#endif
.write_inode = btrfs_write_inode,
.dirty_inode = btrfs_dirty_inode,
.alloc_inode = btrfs_alloc_inode,
.destroy_inode = btrfs_destroy_inode,
.statfs = btrfs_statfs,
.write_super_lockfs = btrfs_write_super_lockfs,
.unlockfs = btrfs_unlockfs,
};
static const struct file_operations btrfs_ctl_fops = {
.unlocked_ioctl = btrfs_control_ioctl,
.compat_ioctl = btrfs_control_ioctl,
.owner = THIS_MODULE,
};
static struct miscdevice btrfs_misc = {
.minor = MISC_DYNAMIC_MINOR,
.name = "btrfs-control",
.fops = &btrfs_ctl_fops
};
static int btrfs_interface_init(void)
{
return misc_register(&btrfs_misc);
}
void btrfs_interface_exit(void)
{
if (misc_deregister(&btrfs_misc) < 0)
printk("misc_deregister failed for control device\n");
}
static int __init init_btrfs_fs(void)
{
int err;
err = btrfs_init_sysfs();
if (err)
return err;
err = btrfs_init_cachep();
if (err)
goto free_sysfs;
err = extent_io_init();
if (err)
goto free_cachep;
err = extent_map_init();
if (err)
goto free_extent_io;
err = btrfs_interface_init();
if (err)
goto free_extent_map;
err = register_filesystem(&btrfs_fs_type);
if (err)
goto unregister_ioctl;
printk(KERN_INFO "%s loaded\n", BTRFS_BUILD_VERSION);
return 0;
unregister_ioctl:
btrfs_interface_exit();
free_extent_map:
extent_map_exit();
free_extent_io:
extent_io_exit();
free_cachep:
btrfs_destroy_cachep();
free_sysfs:
btrfs_exit_sysfs();
return err;
}
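/*
 * Illustrative sketch, not part of the commit: the goto-based unwind used
 * by init_btrfs_fs(), reduced to three steps. All names here (unwind_log,
 * init_step, exit_step, init_all_sketch) are hypothetical. Each label
 * undoes the steps that had already succeeded, in reverse order, and falls
 * through into the next label.
 */

```c
#include <assert.h>
#include <string.h>

static char unwind_log[16];

/* Append one character to the log, keeping it NUL-terminated. */
static void log_step(char c)
{
    size_t n = strlen(unwind_log);

    if (n + 1 < sizeof(unwind_log)) {
        unwind_log[n] = c;
        unwind_log[n + 1] = '\0';
    }
}

/* Record a successful init step, or fail without recording anything. */
static int init_step(char tag, int fail)
{
    if (fail)
        return -1;
    log_step(tag);
    return 0;
}

static void exit_step(char tag)
{
    log_step(tag);
}

/* Run steps A, B, C; fail_at selects which step (1..3) fails, 0 for none. */
static int init_all_sketch(int fail_at)
{
    int err;

    unwind_log[0] = '\0';
    err = init_step('A', fail_at == 1);
    if (err)
        return err;
    err = init_step('B', fail_at == 2);
    if (err)
        goto undo_a;
    err = init_step('C', fail_at == 3);
    if (err)
        goto undo_b;
    return 0;
undo_b:
    exit_step('b');
undo_a:
    exit_step('a');
    return err;
}
```

/*
 * If the third step fails, the log reads "ABba": B is undone before A,
 * mirroring the label order in the function above.
 */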
static void __exit exit_btrfs_fs(void)
{
btrfs_destroy_cachep();
extent_map_exit();
extent_io_exit();
btrfs_interface_exit();
unregister_filesystem(&btrfs_fs_type);
btrfs_exit_sysfs();
btrfs_cleanup_fs_uuids();
}
module_init(init_btrfs_fs)
module_exit(exit_btrfs_fs)
MODULE_LICENSE("GPL");

301
fs/btrfs/sysfs.c Normal file

@ -0,0 +1,301 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/completion.h>
#include <linux/buffer_head.h>
#include <linux/module.h>
#include <linux/kobject.h>
#include "ctree.h"
#include "disk-io.h"
#include "transaction.h"
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,25)
static ssize_t root_blocks_used_show(struct btrfs_root *root, char *buf)
{
return snprintf(buf, PAGE_SIZE, "%llu\n",
(unsigned long long)btrfs_root_used(&root->root_item));
}
static ssize_t root_block_limit_show(struct btrfs_root *root, char *buf)
{
return snprintf(buf, PAGE_SIZE, "%llu\n",
(unsigned long long)btrfs_root_limit(&root->root_item));
}
static ssize_t super_blocks_used_show(struct btrfs_fs_info *fs, char *buf)
{
return snprintf(buf, PAGE_SIZE, "%llu\n",
(unsigned long long)btrfs_super_bytes_used(&fs->super_copy));
}
static ssize_t super_total_blocks_show(struct btrfs_fs_info *fs, char *buf)
{
return snprintf(buf, PAGE_SIZE, "%llu\n",
(unsigned long long)btrfs_super_total_bytes(&fs->super_copy));
}
static ssize_t super_blocksize_show(struct btrfs_fs_info *fs, char *buf)
{
return snprintf(buf, PAGE_SIZE, "%llu\n",
(unsigned long long)btrfs_super_sectorsize(&fs->super_copy));
}
/* this is for root attrs (subvols/snapshots) */
struct btrfs_root_attr {
struct attribute attr;
ssize_t (*show)(struct btrfs_root *, char *);
ssize_t (*store)(struct btrfs_root *, const char *, size_t);
};
#define ROOT_ATTR(name, mode, show, store) \
static struct btrfs_root_attr btrfs_root_attr_##name = __ATTR(name, mode, show, store)
ROOT_ATTR(blocks_used, 0444, root_blocks_used_show, NULL);
ROOT_ATTR(block_limit, 0644, root_block_limit_show, NULL);
static struct attribute *btrfs_root_attrs[] = {
&btrfs_root_attr_blocks_used.attr,
&btrfs_root_attr_block_limit.attr,
NULL,
};
/* this is for super attrs (actual full fs) */
struct btrfs_super_attr {
struct attribute attr;
ssize_t (*show)(struct btrfs_fs_info *, char *);
ssize_t (*store)(struct btrfs_fs_info *, const char *, size_t);
};
#define SUPER_ATTR(name, mode, show, store) \
static struct btrfs_super_attr btrfs_super_attr_##name = __ATTR(name, mode, show, store)
SUPER_ATTR(blocks_used, 0444, super_blocks_used_show, NULL);
SUPER_ATTR(total_blocks, 0444, super_total_blocks_show, NULL);
SUPER_ATTR(blocksize, 0444, super_blocksize_show, NULL);
static struct attribute *btrfs_super_attrs[] = {
&btrfs_super_attr_blocks_used.attr,
&btrfs_super_attr_total_blocks.attr,
&btrfs_super_attr_blocksize.attr,
NULL,
};
static ssize_t btrfs_super_attr_show(struct kobject *kobj,
struct attribute *attr, char *buf)
{
struct btrfs_fs_info *fs = container_of(kobj, struct btrfs_fs_info,
super_kobj);
struct btrfs_super_attr *a = container_of(attr,
struct btrfs_super_attr,
attr);
return a->show ? a->show(fs, buf) : 0;
}
static ssize_t btrfs_super_attr_store(struct kobject *kobj,
struct attribute *attr,
const char *buf, size_t len)
{
struct btrfs_fs_info *fs = container_of(kobj, struct btrfs_fs_info,
super_kobj);
struct btrfs_super_attr *a = container_of(attr,
struct btrfs_super_attr,
attr);
return a->store ? a->store(fs, buf, len) : 0;
}
static ssize_t btrfs_root_attr_show(struct kobject *kobj,
struct attribute *attr, char *buf)
{
struct btrfs_root *root = container_of(kobj, struct btrfs_root,
root_kobj);
struct btrfs_root_attr *a = container_of(attr,
struct btrfs_root_attr,
attr);
return a->show ? a->show(root, buf) : 0;
}
static ssize_t btrfs_root_attr_store(struct kobject *kobj,
struct attribute *attr,
const char *buf, size_t len)
{
struct btrfs_root *root = container_of(kobj, struct btrfs_root,
root_kobj);
struct btrfs_root_attr *a = container_of(attr,
struct btrfs_root_attr,
attr);
return a->store ? a->store(root, buf, len) : 0;
}
static void btrfs_super_release(struct kobject *kobj)
{
struct btrfs_fs_info *fs = container_of(kobj, struct btrfs_fs_info,
super_kobj);
complete(&fs->kobj_unregister);
}
static void btrfs_root_release(struct kobject *kobj)
{
struct btrfs_root *root = container_of(kobj, struct btrfs_root,
root_kobj);
complete(&root->kobj_unregister);
}
static struct sysfs_ops btrfs_super_attr_ops = {
.show = btrfs_super_attr_show,
.store = btrfs_super_attr_store,
};
static struct sysfs_ops btrfs_root_attr_ops = {
.show = btrfs_root_attr_show,
.store = btrfs_root_attr_store,
};
static struct kobj_type btrfs_root_ktype = {
.default_attrs = btrfs_root_attrs,
.sysfs_ops = &btrfs_root_attr_ops,
.release = btrfs_root_release,
};
static struct kobj_type btrfs_super_ktype = {
.default_attrs = btrfs_super_attrs,
.sysfs_ops = &btrfs_super_attr_ops,
.release = btrfs_super_release,
};
/* /sys/fs/btrfs/ entry */
static struct kset *btrfs_kset;
int btrfs_sysfs_add_super(struct btrfs_fs_info *fs)
{
int error;
char *name;
char c;
int len = strlen(fs->sb->s_id) + 1;
int i;
name = kmalloc(len, GFP_NOFS);
if (!name) {
error = -ENOMEM;
goto fail;
}
for (i = 0; i < len; i++) {
c = fs->sb->s_id[i];
if (c == '/' || c == '\\')
c = '!';
name[i] = c;
}
name[len - 1] = '\0';
fs->super_kobj.kset = btrfs_kset;
error = kobject_init_and_add(&fs->super_kobj, &btrfs_super_ktype,
NULL, "%s", name);
if (error)
goto fail;
kfree(name);
return 0;
fail:
kfree(name);
printk(KERN_ERR "btrfs: sysfs creation for super failed\n");
return error;
}
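/*
 * Illustrative sketch, not part of the commit: building a sysfs-safe name
 * from a device id by replacing path separators with '!'. The helper name
 * sanitize_kobj_name_sketch is hypothetical; malloc() stands in for
 * kmalloc(). The buffer holds strlen(id) + 1 bytes, so the last valid
 * index is len - 1 and the copy loop already carries the terminating NUL.
 */

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of id with '/' and '\\' replaced by '!', or NULL. */
static char *sanitize_kobj_name_sketch(const char *id)
{
    size_t len = strlen(id) + 1; /* include the terminating NUL */
    char *name = malloc(len);
    size_t i;

    if (!name)
        return NULL;
    for (i = 0; i < len; i++) {
        char c = id[i];

        if (c == '/' || c == '\\')
            c = '!';
        name[i] = c;
    }
    return name;
}
```

/*
 * A device id such as "cciss/c0d0" becomes "cciss!c0d0". The caller owns
 * the returned buffer and frees it.
 */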
int btrfs_sysfs_add_root(struct btrfs_root *root)
{
int error;
error = kobject_init_and_add(&root->root_kobj, &btrfs_root_ktype,
&root->fs_info->super_kobj,
"%s", root->name);
if (error)
goto fail;
return 0;
fail:
printk(KERN_ERR "btrfs: sysfs creation for root failed\n");
return error;
}
void btrfs_sysfs_del_root(struct btrfs_root *root)
{
kobject_put(&root->root_kobj);
wait_for_completion(&root->kobj_unregister);
}
void btrfs_sysfs_del_super(struct btrfs_fs_info *fs)
{
kobject_put(&fs->super_kobj);
wait_for_completion(&fs->kobj_unregister);
}
int btrfs_init_sysfs(void)
{
btrfs_kset = kset_create_and_add("btrfs", NULL, fs_kobj);
if (!btrfs_kset)
return -ENOMEM;
return 0;
}
void btrfs_exit_sysfs(void)
{
kset_unregister(btrfs_kset);
}
#else
int btrfs_sysfs_add_super(struct btrfs_fs_info *fs)
{
return 0;
}
int btrfs_sysfs_add_root(struct btrfs_root *root)
{
return 0;
}
void btrfs_sysfs_del_root(struct btrfs_root *root)
{
return;
}
void btrfs_sysfs_del_super(struct btrfs_fs_info *fs)
{
return;
}
int btrfs_init_sysfs(void)
{
return 0;
}
void btrfs_exit_sysfs(void)
{
return;
}
#endif

950
fs/btrfs/transaction.c Normal file

@ -0,0 +1,950 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/fs.h>
#include <linux/sched.h>
#include <linux/writeback.h>
#include <linux/pagemap.h>
#include "ctree.h"
#include "disk-io.h"
#include "transaction.h"
#include "locking.h"
#include "ref-cache.h"
#include "tree-log.h"
static int total_trans = 0;
extern struct kmem_cache *btrfs_trans_handle_cachep;
extern struct kmem_cache *btrfs_transaction_cachep;
#define BTRFS_ROOT_TRANS_TAG 0
static noinline void put_transaction(struct btrfs_transaction *transaction)
{
WARN_ON(transaction->use_count == 0);
transaction->use_count--;
if (transaction->use_count == 0) {
WARN_ON(total_trans == 0);
total_trans--;
list_del_init(&transaction->list);
memset(transaction, 0, sizeof(*transaction));
kmem_cache_free(btrfs_transaction_cachep, transaction);
}
}
static noinline int join_transaction(struct btrfs_root *root)
{
struct btrfs_transaction *cur_trans;
cur_trans = root->fs_info->running_transaction;
if (!cur_trans) {
cur_trans = kmem_cache_alloc(btrfs_transaction_cachep,
GFP_NOFS);
total_trans++;
BUG_ON(!cur_trans);
root->fs_info->generation++;
root->fs_info->last_alloc = 0;
root->fs_info->last_data_alloc = 0;
root->fs_info->last_log_alloc = 0;
cur_trans->num_writers = 1;
cur_trans->num_joined = 0;
cur_trans->transid = root->fs_info->generation;
init_waitqueue_head(&cur_trans->writer_wait);
init_waitqueue_head(&cur_trans->commit_wait);
cur_trans->in_commit = 0;
cur_trans->blocked = 0;
cur_trans->use_count = 1;
cur_trans->commit_done = 0;
cur_trans->start_time = get_seconds();
INIT_LIST_HEAD(&cur_trans->pending_snapshots);
list_add_tail(&cur_trans->list, &root->fs_info->trans_list);
extent_io_tree_init(&cur_trans->dirty_pages,
root->fs_info->btree_inode->i_mapping,
GFP_NOFS);
spin_lock(&root->fs_info->new_trans_lock);
root->fs_info->running_transaction = cur_trans;
spin_unlock(&root->fs_info->new_trans_lock);
} else {
cur_trans->num_writers++;
cur_trans->num_joined++;
}
return 0;
}
noinline int btrfs_record_root_in_trans(struct btrfs_root *root)
{
struct btrfs_dirty_root *dirty;
u64 running_trans_id = root->fs_info->running_transaction->transid;
if (root->ref_cows && root->last_trans < running_trans_id) {
WARN_ON(root == root->fs_info->extent_root);
if (root->root_item.refs != 0) {
radix_tree_tag_set(&root->fs_info->fs_roots_radix,
(unsigned long)root->root_key.objectid,
BTRFS_ROOT_TRANS_TAG);
dirty = kmalloc(sizeof(*dirty), GFP_NOFS);
BUG_ON(!dirty);
dirty->root = kmalloc(sizeof(*dirty->root), GFP_NOFS);
BUG_ON(!dirty->root);
dirty->latest_root = root;
INIT_LIST_HEAD(&dirty->list);
root->commit_root = btrfs_root_node(root);
memcpy(dirty->root, root, sizeof(*root));
spin_lock_init(&dirty->root->node_lock);
spin_lock_init(&dirty->root->list_lock);
mutex_init(&dirty->root->objectid_mutex);
INIT_LIST_HEAD(&dirty->root->dead_list);
dirty->root->node = root->commit_root;
dirty->root->commit_root = NULL;
spin_lock(&root->list_lock);
list_add(&dirty->root->dead_list, &root->dead_list);
spin_unlock(&root->list_lock);
root->dirty_root = dirty;
} else {
WARN_ON(1);
}
root->last_trans = running_trans_id;
}
return 0;
}
static void wait_current_trans(struct btrfs_root *root)
{
struct btrfs_transaction *cur_trans;
cur_trans = root->fs_info->running_transaction;
if (cur_trans && cur_trans->blocked) {
DEFINE_WAIT(wait);
cur_trans->use_count++;
while(1) {
prepare_to_wait(&root->fs_info->transaction_wait, &wait,
TASK_UNINTERRUPTIBLE);
if (cur_trans->blocked) {
mutex_unlock(&root->fs_info->trans_mutex);
schedule();
mutex_lock(&root->fs_info->trans_mutex);
finish_wait(&root->fs_info->transaction_wait,
&wait);
} else {
finish_wait(&root->fs_info->transaction_wait,
&wait);
break;
}
}
put_transaction(cur_trans);
}
}
static struct btrfs_trans_handle *start_transaction(struct btrfs_root *root,
int num_blocks, int wait)
{
struct btrfs_trans_handle *h =
kmem_cache_alloc(btrfs_trans_handle_cachep, GFP_NOFS);
int ret;
mutex_lock(&root->fs_info->trans_mutex);
if (!root->fs_info->log_root_recovering &&
((wait == 1 && !root->fs_info->open_ioctl_trans) || wait == 2))
wait_current_trans(root);
ret = join_transaction(root);
BUG_ON(ret);
btrfs_record_root_in_trans(root);
h->transid = root->fs_info->running_transaction->transid;
h->transaction = root->fs_info->running_transaction;
h->blocks_reserved = num_blocks;
h->blocks_used = 0;
h->block_group = NULL;
h->alloc_exclude_nr = 0;
h->alloc_exclude_start = 0;
root->fs_info->running_transaction->use_count++;
mutex_unlock(&root->fs_info->trans_mutex);
return h;
}
struct btrfs_trans_handle *btrfs_start_transaction(struct btrfs_root *root,
int num_blocks)
{
return start_transaction(root, num_blocks, 1);
}
struct btrfs_trans_handle *btrfs_join_transaction(struct btrfs_root *root,
int num_blocks)
{
return start_transaction(root, num_blocks, 0);
}
struct btrfs_trans_handle *btrfs_start_ioctl_transaction(struct btrfs_root *r,
int num_blocks)
{
return start_transaction(r, num_blocks, 2);
}
static noinline int wait_for_commit(struct btrfs_root *root,
struct btrfs_transaction *commit)
{
DEFINE_WAIT(wait);
mutex_lock(&root->fs_info->trans_mutex);
while(!commit->commit_done) {
prepare_to_wait(&commit->commit_wait, &wait,
TASK_UNINTERRUPTIBLE);
if (commit->commit_done)
break;
mutex_unlock(&root->fs_info->trans_mutex);
schedule();
mutex_lock(&root->fs_info->trans_mutex);
}
mutex_unlock(&root->fs_info->trans_mutex);
finish_wait(&commit->commit_wait, &wait);
return 0;
}
static void throttle_on_drops(struct btrfs_root *root)
{
struct btrfs_fs_info *info = root->fs_info;
int harder_count = 0;
harder:
if (atomic_read(&info->throttles)) {
DEFINE_WAIT(wait);
int thr;
thr = atomic_read(&info->throttle_gen);
do {
prepare_to_wait(&info->transaction_throttle,
&wait, TASK_UNINTERRUPTIBLE);
if (!atomic_read(&info->throttles)) {
finish_wait(&info->transaction_throttle, &wait);
break;
}
schedule();
finish_wait(&info->transaction_throttle, &wait);
} while (thr == atomic_read(&info->throttle_gen));
harder_count++;
if (root->fs_info->total_ref_cache_size > 1 * 1024 * 1024 &&
harder_count < 2)
goto harder;
if (root->fs_info->total_ref_cache_size > 5 * 1024 * 1024 &&
harder_count < 10)
goto harder;
if (root->fs_info->total_ref_cache_size > 10 * 1024 * 1024 &&
harder_count < 20)
goto harder;
}
}
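/*
 * Illustrative sketch, not part of the commit: the retry ladder at the end
 * of throttle_on_drops(), isolated from the wait queue logic. The name
 * max_throttle_passes_sketch is hypothetical. It assumes the throttle
 * stays active and the reference cache size does not change between
 * passes, and returns how many wait passes the ladder would then allow.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Number of passes through the "harder" loop for a fixed cache size. */
static int max_throttle_passes_sketch(uint64_t ref_cache_bytes)
{
    const uint64_t mb = 1024 * 1024;
    int passes = 0;

again:
    passes++;
    if (ref_cache_bytes > 1 * mb && passes < 2)
        goto again;
    if (ref_cache_bytes > 5 * mb && passes < 10)
        goto again;
    if (ref_cache_bytes > 10 * mb && passes < 20)
        goto again;
    return passes;
}
```

/*
 * Under those assumptions a cache of at most 1 MiB waits once, up to 5 MiB
 * twice, up to 10 MiB ten times, and anything larger twenty times.
 */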
void btrfs_throttle(struct btrfs_root *root)
{
mutex_lock(&root->fs_info->trans_mutex);
if (!root->fs_info->open_ioctl_trans)
wait_current_trans(root);
mutex_unlock(&root->fs_info->trans_mutex);
throttle_on_drops(root);
}
static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
struct btrfs_root *root, int throttle)
{
struct btrfs_transaction *cur_trans;
struct btrfs_fs_info *info = root->fs_info;
mutex_lock(&info->trans_mutex);
cur_trans = info->running_transaction;
WARN_ON(cur_trans != trans->transaction);
WARN_ON(cur_trans->num_writers < 1);
cur_trans->num_writers--;
if (waitqueue_active(&cur_trans->writer_wait))
wake_up(&cur_trans->writer_wait);
put_transaction(cur_trans);
mutex_unlock(&info->trans_mutex);
memset(trans, 0, sizeof(*trans));
kmem_cache_free(btrfs_trans_handle_cachep, trans);
if (throttle)
throttle_on_drops(root);
return 0;
}
int btrfs_end_transaction(struct btrfs_trans_handle *trans,
struct btrfs_root *root)
{
return __btrfs_end_transaction(trans, root, 0);
}
int btrfs_end_transaction_throttle(struct btrfs_trans_handle *trans,
struct btrfs_root *root)
{
return __btrfs_end_transaction(trans, root, 1);
}
int btrfs_write_and_wait_marked_extents(struct btrfs_root *root,
struct extent_io_tree *dirty_pages)
{
int ret;
int err = 0;
int werr = 0;
struct page *page;
struct inode *btree_inode = root->fs_info->btree_inode;
u64 start = 0;
u64 end;
unsigned long index;
while(1) {
ret = find_first_extent_bit(dirty_pages, start, &start, &end,
EXTENT_DIRTY);
if (ret)
break;
while(start <= end) {
cond_resched();
index = start >> PAGE_CACHE_SHIFT;
start = (u64)(index + 1) << PAGE_CACHE_SHIFT;
page = find_get_page(btree_inode->i_mapping, index);
if (!page)
continue;
btree_lock_page_hook(page);
if (!page->mapping) {
unlock_page(page);
page_cache_release(page);
continue;
}
if (PageWriteback(page)) {
if (PageDirty(page))
wait_on_page_writeback(page);
else {
unlock_page(page);
page_cache_release(page);
continue;
}
}
err = write_one_page(page, 0);
if (err)
werr = err;
page_cache_release(page);
}
}
while(1) {
ret = find_first_extent_bit(dirty_pages, 0, &start, &end,
EXTENT_DIRTY);
if (ret)
break;
clear_extent_dirty(dirty_pages, start, end, GFP_NOFS);
while(start <= end) {
index = start >> PAGE_CACHE_SHIFT;
start = (u64)(index + 1) << PAGE_CACHE_SHIFT;
page = find_get_page(btree_inode->i_mapping, index);
if (!page)
continue;
if (PageDirty(page)) {
btree_lock_page_hook(page);
wait_on_page_writeback(page);
err = write_one_page(page, 0);
if (err)
werr = err;
}
wait_on_page_writeback(page);
page_cache_release(page);
cond_resched();
}
}
if (err)
werr = err;
return werr;
}
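/*
 * Illustrative sketch, not part of the commit: the inner loops above walk
 * a dirty byte range [start, end] one page at a time by converting start
 * to a page index and then jumping to the first byte of the next page.
 * The names SKETCH_PAGE_SHIFT and count_pages_sketch are hypothetical, and
 * a 4 KiB page is assumed. The sketch also assumes end is well below the
 * 64-bit limit, since the advance would wrap otherwise.
 */

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/* Count the pages touched by the inclusive byte range [start, end]. */
static unsigned long count_pages_sketch(uint64_t start, uint64_t end)
{
    unsigned long pages = 0;

    while (start <= end) {
        uint64_t index = start >> SKETCH_PAGE_SHIFT;

        /* advance to the first byte of the following page */
        start = (index + 1) << SKETCH_PAGE_SHIFT;
        pages++;
    }
    return pages;
}
```

/*
 * A range that starts in the middle of a page still visits that page once,
 * because the first step rounds start up to the next page boundary.
 */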
int btrfs_write_and_wait_transaction(struct btrfs_trans_handle *trans,
struct btrfs_root *root)
{
if (!trans || !trans->transaction) {
struct inode *btree_inode;
btree_inode = root->fs_info->btree_inode;
return filemap_write_and_wait(btree_inode->i_mapping);
}
return btrfs_write_and_wait_marked_extents(root,
&trans->transaction->dirty_pages);
}
static int update_cowonly_root(struct btrfs_trans_handle *trans,
struct btrfs_root *root)
{
int ret;
u64 old_root_bytenr;
struct btrfs_root *tree_root = root->fs_info->tree_root;
btrfs_write_dirty_block_groups(trans, root);
while(1) {
old_root_bytenr = btrfs_root_bytenr(&root->root_item);
if (old_root_bytenr == root->node->start)
break;
btrfs_set_root_bytenr(&root->root_item,
root->node->start);
btrfs_set_root_level(&root->root_item,
btrfs_header_level(root->node));
ret = btrfs_update_root(trans, tree_root,
&root->root_key,
&root->root_item);
BUG_ON(ret);
btrfs_write_dirty_block_groups(trans, root);
}
return 0;
}
int btrfs_commit_tree_roots(struct btrfs_trans_handle *trans,
struct btrfs_root *root)
{
struct btrfs_fs_info *fs_info = root->fs_info;
struct list_head *next;
while(!list_empty(&fs_info->dirty_cowonly_roots)) {
next = fs_info->dirty_cowonly_roots.next;
list_del_init(next);
root = list_entry(next, struct btrfs_root, dirty_list);
update_cowonly_root(trans, root);
}
return 0;
}
int btrfs_add_dead_root(struct btrfs_root *root, struct btrfs_root *latest)
{
struct btrfs_dirty_root *dirty;
dirty = kmalloc(sizeof(*dirty), GFP_NOFS);
if (!dirty)
return -ENOMEM;
dirty->root = root;
dirty->latest_root = latest;
mutex_lock(&root->fs_info->trans_mutex);
list_add(&dirty->list, &latest->fs_info->dead_roots);
mutex_unlock(&root->fs_info->trans_mutex);
return 0;
}
static noinline int add_dirty_roots(struct btrfs_trans_handle *trans,
struct radix_tree_root *radix,
struct list_head *list)
{
struct btrfs_dirty_root *dirty;
struct btrfs_root *gang[8];
struct btrfs_root *root;
int i;
int ret;
int err = 0;
u32 refs;
while(1) {
ret = radix_tree_gang_lookup_tag(radix, (void **)gang, 0,
ARRAY_SIZE(gang),
BTRFS_ROOT_TRANS_TAG);
if (ret == 0)
break;
for (i = 0; i < ret; i++) {
root = gang[i];
radix_tree_tag_clear(radix,
(unsigned long)root->root_key.objectid,
BTRFS_ROOT_TRANS_TAG);
BUG_ON(!root->ref_tree);
dirty = root->dirty_root;
btrfs_free_log(trans, root);
if (root->commit_root == root->node) {
WARN_ON(root->node->start !=
btrfs_root_bytenr(&root->root_item));
free_extent_buffer(root->commit_root);
root->commit_root = NULL;
root->dirty_root = NULL;
spin_lock(&root->list_lock);
list_del_init(&dirty->root->dead_list);
spin_unlock(&root->list_lock);
kfree(dirty->root);
kfree(dirty);
/* make sure to update the root on disk
* so we get any updates to the block used
* counts
*/
err = btrfs_update_root(trans,
root->fs_info->tree_root,
&root->root_key,
&root->root_item);
continue;
}
memset(&root->root_item.drop_progress, 0,
sizeof(struct btrfs_disk_key));
root->root_item.drop_level = 0;
root->commit_root = NULL;
root->dirty_root = NULL;
root->root_key.offset = root->fs_info->generation;
btrfs_set_root_bytenr(&root->root_item,
root->node->start);
btrfs_set_root_level(&root->root_item,
btrfs_header_level(root->node));
err = btrfs_insert_root(trans, root->fs_info->tree_root,
&root->root_key,
&root->root_item);
if (err)
break;
refs = btrfs_root_refs(&dirty->root->root_item);
btrfs_set_root_refs(&dirty->root->root_item, refs - 1);
err = btrfs_update_root(trans, root->fs_info->tree_root,
&dirty->root->root_key,
&dirty->root->root_item);
BUG_ON(err);
if (refs == 1) {
list_add(&dirty->list, list);
} else {
WARN_ON(1);
free_extent_buffer(dirty->root->node);
kfree(dirty->root);
kfree(dirty);
}
}
}
return err;
}
int btrfs_defrag_root(struct btrfs_root *root, int cacheonly)
{
struct btrfs_fs_info *info = root->fs_info;
int ret;
struct btrfs_trans_handle *trans;
unsigned long nr;
smp_mb();
if (root->defrag_running)
return 0;
trans = btrfs_start_transaction(root, 1);
while (1) {
root->defrag_running = 1;
ret = btrfs_defrag_leaves(trans, root, cacheonly);
nr = trans->blocks_used;
btrfs_end_transaction(trans, root);
btrfs_btree_balance_dirty(info->tree_root, nr);
cond_resched();
trans = btrfs_start_transaction(root, 1);
if (root->fs_info->closing || ret != -EAGAIN)
break;
}
root->defrag_running = 0;
smp_mb();
btrfs_end_transaction(trans, root);
return 0;
}
static noinline int drop_dirty_roots(struct btrfs_root *tree_root,
struct list_head *list)
{
struct btrfs_dirty_root *dirty;
struct btrfs_trans_handle *trans;
unsigned long nr;
u64 num_bytes;
u64 bytes_used;
u64 max_useless;
int ret = 0;
int err;
while(!list_empty(list)) {
struct btrfs_root *root;
dirty = list_entry(list->prev, struct btrfs_dirty_root, list);
list_del_init(&dirty->list);
num_bytes = btrfs_root_used(&dirty->root->root_item);
root = dirty->latest_root;
atomic_inc(&root->fs_info->throttles);
mutex_lock(&root->fs_info->drop_mutex);
while(1) {
trans = btrfs_start_transaction(tree_root, 1);
ret = btrfs_drop_snapshot(trans, dirty->root);
if (ret != -EAGAIN) {
break;
}
err = btrfs_update_root(trans,
tree_root,
&dirty->root->root_key,
&dirty->root->root_item);
if (err)
ret = err;
nr = trans->blocks_used;
ret = btrfs_end_transaction(trans, tree_root);
BUG_ON(ret);
mutex_unlock(&root->fs_info->drop_mutex);
btrfs_btree_balance_dirty(tree_root, nr);
cond_resched();
mutex_lock(&root->fs_info->drop_mutex);
}
BUG_ON(ret);
atomic_dec(&root->fs_info->throttles);
wake_up(&root->fs_info->transaction_throttle);
mutex_lock(&root->fs_info->alloc_mutex);
num_bytes -= btrfs_root_used(&dirty->root->root_item);
bytes_used = btrfs_root_used(&root->root_item);
if (num_bytes) {
btrfs_record_root_in_trans(root);
btrfs_set_root_used(&root->root_item,
bytes_used - num_bytes);
}
mutex_unlock(&root->fs_info->alloc_mutex);
ret = btrfs_del_root(trans, tree_root, &dirty->root->root_key);
if (ret) {
BUG();
break;
}
mutex_unlock(&root->fs_info->drop_mutex);
spin_lock(&root->list_lock);
list_del_init(&dirty->root->dead_list);
if (!list_empty(&root->dead_list)) {
struct btrfs_root *oldest;
oldest = list_entry(root->dead_list.prev,
struct btrfs_root, dead_list);
max_useless = oldest->root_key.offset - 1;
} else {
max_useless = root->root_key.offset - 1;
}
spin_unlock(&root->list_lock);
nr = trans->blocks_used;
ret = btrfs_end_transaction(trans, tree_root);
BUG_ON(ret);
ret = btrfs_remove_leaf_refs(root, max_useless);
BUG_ON(ret);
free_extent_buffer(dirty->root->node);
kfree(dirty->root);
kfree(dirty);
btrfs_btree_balance_dirty(tree_root, nr);
cond_resched();
}
return ret;
}
static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info,
struct btrfs_pending_snapshot *pending)
{
struct btrfs_key key;
struct btrfs_root_item *new_root_item;
struct btrfs_root *tree_root = fs_info->tree_root;
struct btrfs_root *root = pending->root;
struct extent_buffer *tmp;
struct extent_buffer *old;
int ret;
int namelen;
u64 objectid;
new_root_item = kmalloc(sizeof(*new_root_item), GFP_NOFS);
if (!new_root_item) {
ret = -ENOMEM;
goto fail;
}
ret = btrfs_find_free_objectid(trans, tree_root, 0, &objectid);
if (ret)
goto fail;
memcpy(new_root_item, &root->root_item, sizeof(*new_root_item));
key.objectid = objectid;
key.offset = 1;
btrfs_set_key_type(&key, BTRFS_ROOT_ITEM_KEY);
old = btrfs_lock_root_node(root);
btrfs_cow_block(trans, root, old, NULL, 0, &old, 0);
btrfs_copy_root(trans, root, old, &tmp, objectid);
btrfs_tree_unlock(old);
free_extent_buffer(old);
btrfs_set_root_bytenr(new_root_item, tmp->start);
btrfs_set_root_level(new_root_item, btrfs_header_level(tmp));
ret = btrfs_insert_root(trans, root->fs_info->tree_root, &key,
new_root_item);
btrfs_tree_unlock(tmp);
free_extent_buffer(tmp);
if (ret)
goto fail;
/*
* insert the directory item
*/
key.offset = (u64)-1;
namelen = strlen(pending->name);
ret = btrfs_insert_dir_item(trans, root->fs_info->tree_root,
pending->name, namelen,
root->fs_info->sb->s_root->d_inode->i_ino,
&key, BTRFS_FT_DIR, 0);
if (ret)
goto fail;
ret = btrfs_insert_inode_ref(trans, root->fs_info->tree_root,
pending->name, strlen(pending->name), objectid,
root->fs_info->sb->s_root->d_inode->i_ino, 0);
/* Invalidate existing dcache entry for new snapshot. */
btrfs_invalidate_dcache_root(root, pending->name, namelen);
fail:
kfree(new_root_item);
return ret;
}
static noinline int create_pending_snapshots(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info)
{
struct btrfs_pending_snapshot *pending;
struct list_head *head = &trans->transaction->pending_snapshots;
int ret;
while(!list_empty(head)) {
pending = list_entry(head->next,
struct btrfs_pending_snapshot, list);
ret = create_pending_snapshot(trans, fs_info, pending);
BUG_ON(ret);
list_del(&pending->list);
kfree(pending->name);
kfree(pending);
}
return 0;
}
int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
struct btrfs_root *root)
{
unsigned long joined = 0;
unsigned long timeout = 1;
struct btrfs_transaction *cur_trans;
struct btrfs_transaction *prev_trans = NULL;
struct btrfs_root *chunk_root = root->fs_info->chunk_root;
struct list_head dirty_fs_roots;
struct extent_io_tree *pinned_copy;
DEFINE_WAIT(wait);
int ret;
INIT_LIST_HEAD(&dirty_fs_roots);
mutex_lock(&root->fs_info->trans_mutex);
if (trans->transaction->in_commit) {
cur_trans = trans->transaction;
trans->transaction->use_count++;
mutex_unlock(&root->fs_info->trans_mutex);
btrfs_end_transaction(trans, root);
ret = wait_for_commit(root, cur_trans);
BUG_ON(ret);
mutex_lock(&root->fs_info->trans_mutex);
put_transaction(cur_trans);
mutex_unlock(&root->fs_info->trans_mutex);
return 0;
}
pinned_copy = kmalloc(sizeof(*pinned_copy), GFP_NOFS);
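if (!pinned_copy) {
/* trans_mutex is still held here, drop it before bailing out */
mutex_unlock(&root->fs_info->trans_mutex);
return -ENOMEM;
}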
extent_io_tree_init(pinned_copy,
root->fs_info->btree_inode->i_mapping, GFP_NOFS);
trans->transaction->in_commit = 1;
trans->transaction->blocked = 1;
cur_trans = trans->transaction;
if (cur_trans->list.prev != &root->fs_info->trans_list) {
prev_trans = list_entry(cur_trans->list.prev,
struct btrfs_transaction, list);
if (!prev_trans->commit_done) {
prev_trans->use_count++;
mutex_unlock(&root->fs_info->trans_mutex);
wait_for_commit(root, prev_trans);
mutex_lock(&root->fs_info->trans_mutex);
put_transaction(prev_trans);
}
}
do {
int snap_pending = 0;
joined = cur_trans->num_joined;
if (!list_empty(&trans->transaction->pending_snapshots))
snap_pending = 1;
WARN_ON(cur_trans != trans->transaction);
prepare_to_wait(&cur_trans->writer_wait, &wait,
TASK_UNINTERRUPTIBLE);
if (cur_trans->num_writers > 1)
timeout = MAX_SCHEDULE_TIMEOUT;
else
timeout = 1;
mutex_unlock(&root->fs_info->trans_mutex);
if (snap_pending) {
ret = btrfs_wait_ordered_extents(root, 1);
BUG_ON(ret);
}
schedule_timeout(timeout);
mutex_lock(&root->fs_info->trans_mutex);
finish_wait(&cur_trans->writer_wait, &wait);
} while (cur_trans->num_writers > 1 ||
(cur_trans->num_joined != joined));
ret = create_pending_snapshots(trans, root->fs_info);
BUG_ON(ret);
WARN_ON(cur_trans != trans->transaction);
/* btrfs_commit_tree_roots is responsible for getting the
* various roots consistent with each other. Every pointer
* in the tree of tree roots has to point to the most up to date
* root for every subvolume and other tree. So, we have to keep
* the tree logging code from jumping in and changing any
* of the trees.
*
* At this point in the commit, there can't be any tree-log
* writers, but a little lower down we drop the trans mutex
* and let new people in. By holding the tree_log_mutex
* from now until after the super is written, we avoid races
* with the tree-log code.
*/
mutex_lock(&root->fs_info->tree_log_mutex);
ret = add_dirty_roots(trans, &root->fs_info->fs_roots_radix,
&dirty_fs_roots);
BUG_ON(ret);
/* add_dirty_roots gets rid of all the tree log roots, so it is now
* safe to free the root of tree log roots
*/
btrfs_free_log_root_tree(trans, root->fs_info);
ret = btrfs_commit_tree_roots(trans, root);
BUG_ON(ret);
cur_trans = root->fs_info->running_transaction;
spin_lock(&root->fs_info->new_trans_lock);
root->fs_info->running_transaction = NULL;
spin_unlock(&root->fs_info->new_trans_lock);
btrfs_set_super_generation(&root->fs_info->super_copy,
cur_trans->transid);
btrfs_set_super_root(&root->fs_info->super_copy,
root->fs_info->tree_root->node->start);
btrfs_set_super_root_level(&root->fs_info->super_copy,
btrfs_header_level(root->fs_info->tree_root->node));
btrfs_set_super_chunk_root(&root->fs_info->super_copy,
chunk_root->node->start);
btrfs_set_super_chunk_root_level(&root->fs_info->super_copy,
btrfs_header_level(chunk_root->node));
if (!root->fs_info->log_root_recovering) {
btrfs_set_super_log_root(&root->fs_info->super_copy, 0);
btrfs_set_super_log_root_level(&root->fs_info->super_copy, 0);
}
memcpy(&root->fs_info->super_for_commit, &root->fs_info->super_copy,
sizeof(root->fs_info->super_copy));
btrfs_copy_pinned(root, pinned_copy);
trans->transaction->blocked = 0;
wake_up(&root->fs_info->transaction_throttle);
wake_up(&root->fs_info->transaction_wait);
mutex_unlock(&root->fs_info->trans_mutex);
ret = btrfs_write_and_wait_transaction(trans, root);
BUG_ON(ret);
write_ctree_super(trans, root);
/*
* the super is written, we can safely allow the tree-loggers
* to go about their business
*/
mutex_unlock(&root->fs_info->tree_log_mutex);
btrfs_finish_extent_commit(trans, root, pinned_copy);
mutex_lock(&root->fs_info->trans_mutex);
kfree(pinned_copy);
cur_trans->commit_done = 1;
root->fs_info->last_trans_committed = cur_trans->transid;
wake_up(&cur_trans->commit_wait);
put_transaction(cur_trans);
put_transaction(cur_trans);
list_splice_init(&dirty_fs_roots, &root->fs_info->dead_roots);
if (root->fs_info->closing)
list_splice_init(&root->fs_info->dead_roots, &dirty_fs_roots);
mutex_unlock(&root->fs_info->trans_mutex);
kmem_cache_free(btrfs_trans_handle_cachep, trans);
if (root->fs_info->closing) {
drop_dirty_roots(root->fs_info->tree_root, &dirty_fs_roots);
}
return ret;
}
int btrfs_clean_old_snapshots(struct btrfs_root *root)
{
struct list_head dirty_roots;
INIT_LIST_HEAD(&dirty_roots);
again:
mutex_lock(&root->fs_info->trans_mutex);
list_splice_init(&root->fs_info->dead_roots, &dirty_roots);
mutex_unlock(&root->fs_info->trans_mutex);
if (!list_empty(&dirty_roots)) {
drop_dirty_roots(root, &dirty_roots);
goto again;
}
return 0;
}
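
Note on the page walk in the marked-extents writeback loop near the top of this file: it turns an inclusive byte range into page-cache indexes by shifting, then jumps `start` to the first byte of the next page. A minimal userspace sketch of that arithmetic follows. The names `SKETCH_PAGE_SHIFT` and `pages_in_range` are hypothetical, 4 KiB pages are assumed, and the wrap-around guard is an addition that is not in the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel code uses PAGE_CACHE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Record the page index of every page that overlaps the inclusive byte
 * range [start, end]. At most max indexes are stored in out[]; the
 * return value is the total number of pages the range touches.
 */
static size_t pages_in_range(uint64_t start, uint64_t end,
			     uint64_t *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL || max == 0);
	while (start <= end) {
		uint64_t index = start >> SKETCH_PAGE_SHIFT;

		if (n < max)
			out[n] = index;
		n++;
		/* advance to the first byte of the following page */
		start = (index + 1) << SKETCH_PAGE_SHIFT;
		if (start == 0)	/* wrapped past the end of the u64 space */
			break;
	}
	return n;
}
```

Advancing `start` before the page lookup, as the kernel loop does, is what makes its `continue` on a missing page safe: the loop variable has already moved on.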

104
fs/btrfs/transaction.h Normal file

@ -0,0 +1,104 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __BTRFS_TRANSACTION__
#define __BTRFS_TRANSACTION__
#include "btrfs_inode.h"
struct btrfs_transaction {
u64 transid;
unsigned long num_writers;
unsigned long num_joined;
int in_commit;
int use_count;
int commit_done;
int blocked;
struct list_head list;
struct extent_io_tree dirty_pages;
unsigned long start_time;
wait_queue_head_t writer_wait;
wait_queue_head_t commit_wait;
struct list_head pending_snapshots;
};
struct btrfs_trans_handle {
u64 transid;
unsigned long blocks_reserved;
unsigned long blocks_used;
struct btrfs_transaction *transaction;
struct btrfs_block_group_cache *block_group;
u64 alloc_exclude_start;
u64 alloc_exclude_nr;
};
struct btrfs_pending_snapshot {
struct btrfs_root *root;
char *name;
struct list_head list;
};
struct btrfs_dirty_root {
struct list_head list;
struct btrfs_root *root;
struct btrfs_root *latest_root;
};
static inline void btrfs_set_trans_block_group(struct btrfs_trans_handle *trans,
struct inode *inode)
{
trans->block_group = BTRFS_I(inode)->block_group;
}
static inline void btrfs_update_inode_block_group(struct
btrfs_trans_handle *trans,
struct inode *inode)
{
BTRFS_I(inode)->block_group = trans->block_group;
}
static inline void btrfs_set_inode_last_trans(struct btrfs_trans_handle *trans,
struct inode *inode)
{
BTRFS_I(inode)->last_trans = trans->transaction->transid;
}
int btrfs_end_transaction(struct btrfs_trans_handle *trans,
struct btrfs_root *root);
struct btrfs_trans_handle *btrfs_start_transaction(struct btrfs_root *root,
int num_blocks);
struct btrfs_trans_handle *btrfs_join_transaction(struct btrfs_root *root,
int num_blocks);
struct btrfs_trans_handle *btrfs_start_ioctl_transaction(struct btrfs_root *r,
int num_blocks);
int btrfs_write_and_wait_transaction(struct btrfs_trans_handle *trans,
struct btrfs_root *root);
int btrfs_commit_tree_roots(struct btrfs_trans_handle *trans,
struct btrfs_root *root);
int btrfs_add_dead_root(struct btrfs_root *root, struct btrfs_root *latest);
int btrfs_defrag_root(struct btrfs_root *root, int cacheonly);
int btrfs_clean_old_snapshots(struct btrfs_root *root);
int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
struct btrfs_root *root);
int btrfs_end_transaction_throttle(struct btrfs_trans_handle *trans,
struct btrfs_root *root);
void btrfs_throttle(struct btrfs_root *root);
int btrfs_record_root_in_trans(struct btrfs_root *root);
int btrfs_write_and_wait_marked_extents(struct btrfs_root *root,
struct extent_io_tree *dirty_pages);
#endif

145
fs/btrfs/tree-defrag.c Normal file

@ -0,0 +1,145 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/sched.h>
#include "ctree.h"
#include "disk-io.h"
#include "print-tree.h"
#include "transaction.h"
#include "locking.h"
int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
struct btrfs_root *root, int cache_only)
{
struct btrfs_path *path = NULL;
struct btrfs_key key;
int ret = 0;
int wret;
int level;
int orig_level;
int is_extent = 0;
int next_key_ret = 0;
u64 last_ret = 0;
u64 min_trans = 0;
if (cache_only)
goto out;
if (root->fs_info->extent_root == root) {
/*
* there's recursion here right now in the tree locking,
* we can't defrag the extent root without deadlock
*/
goto out;
}
if (root->ref_cows == 0 && !is_extent)
goto out;
if (btrfs_test_opt(root, SSD))
goto out;
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
level = btrfs_header_level(root->node);
orig_level = level;
if (level == 0) {
goto out;
}
if (root->defrag_progress.objectid == 0) {
struct extent_buffer *root_node;
u32 nritems;
root_node = btrfs_lock_root_node(root);
nritems = btrfs_header_nritems(root_node);
root->defrag_max.objectid = 0;
/* from above we know this is not a leaf */
btrfs_node_key_to_cpu(root_node, &root->defrag_max,
nritems - 1);
btrfs_tree_unlock(root_node);
free_extent_buffer(root_node);
memset(&key, 0, sizeof(key));
} else {
memcpy(&key, &root->defrag_progress, sizeof(key));
}
path->keep_locks = 1;
if (cache_only)
min_trans = root->defrag_trans_start;
ret = btrfs_search_forward(root, &key, NULL, path,
cache_only, min_trans);
if (ret < 0)
goto out;
if (ret > 0) {
ret = 0;
goto out;
}
btrfs_release_path(root, path);
wret = btrfs_search_slot(trans, root, &key, path, 0, 1);
if (wret < 0) {
ret = wret;
goto out;
}
if (!path->nodes[1]) {
ret = 0;
goto out;
}
path->slots[1] = btrfs_header_nritems(path->nodes[1]);
next_key_ret = btrfs_find_next_key(root, path, &key, 1, cache_only,
min_trans);
ret = btrfs_realloc_node(trans, root,
path->nodes[1], 0,
cache_only, &last_ret,
&root->defrag_progress);
WARN_ON(ret && ret != -EAGAIN);
if (next_key_ret == 0) {
memcpy(&root->defrag_progress, &key, sizeof(key));
ret = -EAGAIN;
}
btrfs_release_path(root, path);
if (is_extent)
btrfs_extent_post_op(trans, root);
out:
if (is_extent)
mutex_unlock(&root->fs_info->alloc_mutex);
if (path)
btrfs_free_path(path);
if (ret == -EAGAIN) {
if (root->defrag_max.objectid > root->defrag_progress.objectid)
goto done;
if (root->defrag_max.type > root->defrag_progress.type)
goto done;
if (root->defrag_max.offset > root->defrag_progress.offset)
goto done;
ret = 0;
}
done:
if (ret != -EAGAIN) {
memset(&root->defrag_progress, 0,
sizeof(root->defrag_progress));
root->defrag_trans_start = trans->transid;
}
return ret;
}

2892
fs/btrfs/tree-log.c Normal file

File diff suppressed because it is too large

41
fs/btrfs/tree-log.h Normal file

@ -0,0 +1,41 @@
/*
* Copyright (C) 2008 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __TREE_LOG_
#define __TREE_LOG_
int btrfs_sync_log(struct btrfs_trans_handle *trans,
struct btrfs_root *root);
int btrfs_free_log(struct btrfs_trans_handle *trans, struct btrfs_root *root);
int btrfs_log_dentry(struct btrfs_trans_handle *trans,
struct btrfs_root *root, struct dentry *dentry);
int btrfs_recover_log_trees(struct btrfs_root *tree_root);
int btrfs_log_dentry_safe(struct btrfs_trans_handle *trans,
struct btrfs_root *root, struct dentry *dentry);
int btrfs_log_inode(struct btrfs_trans_handle *trans,
struct btrfs_root *root, struct inode *inode,
int inode_only);
int btrfs_del_dir_entries_in_log(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
const char *name, int name_len,
struct inode *dir, u64 index);
int btrfs_del_inode_ref_in_log(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
const char *name, int name_len,
struct inode *inode, u64 dirid);
#endif

43
fs/btrfs/version.sh Normal file

@ -0,0 +1,43 @@
#!/bin/bash
#
# determine-version -- report a useful version for releases
#
# Copyright 2008, Aron Griffis <agriffis@n01se.net>
# Copyright 2008, Oracle
# Released under the GNU GPLv2
v="v0.16"
# test for hg directly: a later "$?" would report the status of the
# [ -d .hg ] test, not of "which hg"
if which hg > /dev/null 2>&1 && [ -d .hg ]; then
last=$(hg tags | grep -m1 -o '^v[0-9.]\+')
# now check if the repo has commits since then...
if [[ $(hg id -t) == $last || \
$(hg di -r "$last:." | awk '/^diff/{print $NF}' | sort -u) == .hgtags ]]
then
# check if it's dirty
if [[ $(hg id | cut -d' ' -f1) == *+ ]]; then
v=$last+
else
v=$last
fi
else
# includes dirty flag
v=$last+$(hg id -i)
fi
fi
echo "#ifndef __BUILD_VERSION" > .build-version.h
echo "#define __BUILD_VERSION" >> .build-version.h
echo "#define BTRFS_BUILD_VERSION \"Btrfs $v\"" >> .build-version.h
echo "#endif" >> .build-version.h
diff -q version.h .build-version.h >& /dev/null
if [ $? == 0 ]; then
rm .build-version.h
exit 0
fi
mv .build-version.h version.h

2565
fs/btrfs/volumes.c Normal file

File diff suppressed because it is too large

150
fs/btrfs/volumes.h Normal file

@ -0,0 +1,150 @@
/*
* Copyright (C) 2007 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __BTRFS_VOLUMES_
#define __BTRFS_VOLUMES_
#include <linux/bio.h>
#include "async-thread.h"
struct buffer_head;
struct btrfs_device {
struct list_head dev_list;
struct list_head dev_alloc_list;
struct btrfs_root *dev_root;
struct buffer_head *pending_io;
struct bio *pending_bios;
struct bio *pending_bio_tail;
int running_pending;
u64 generation;
int barriers;
int in_fs_metadata;
spinlock_t io_lock;
struct block_device *bdev;
char *name;
/* the internal btrfs device id */
u64 devid;
/* size of the device */
u64 total_bytes;
/* bytes used */
u64 bytes_used;
/* optimal io alignment for this device */
u32 io_align;
/* optimal io width for this device */
u32 io_width;
/* minimal io size for this device */
u32 sector_size;
/* type and info about this device */
u64 type;
/* physical drive uuid (or lvm uuid) */
u8 uuid[BTRFS_UUID_SIZE];
struct btrfs_work work;
};
struct btrfs_fs_devices {
u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
/* the device with this id has the most recent copy of the super */
u64 latest_devid;
u64 latest_trans;
u64 num_devices;
u64 open_devices;
struct block_device *latest_bdev;
/* all of the devices in the FS */
struct list_head devices;
/* devices not currently being allocated */
struct list_head alloc_list;
struct list_head list;
int mounted;
};
struct btrfs_bio_stripe {
struct btrfs_device *dev;
u64 physical;
};
struct btrfs_multi_bio {
atomic_t stripes_pending;
bio_end_io_t *end_io;
struct bio *orig_bio;
void *private;
atomic_t error;
int max_errors;
int num_stripes;
struct btrfs_bio_stripe stripes[];
};
#define btrfs_multi_bio_size(n) (sizeof(struct btrfs_multi_bio) + \
(sizeof(struct btrfs_bio_stripe) * (n)))
int btrfs_alloc_dev_extent(struct btrfs_trans_handle *trans,
struct btrfs_device *device,
u64 chunk_tree, u64 chunk_objectid,
u64 chunk_offset,
u64 num_bytes, u64 *start);
int btrfs_map_block(struct btrfs_mapping_tree *map_tree, int rw,
u64 logical, u64 *length,
struct btrfs_multi_bio **multi_ret, int mirror_num);
int btrfs_read_sys_array(struct btrfs_root *root);
int btrfs_read_chunk_tree(struct btrfs_root *root);
int btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
struct btrfs_root *extent_root, u64 *start,
u64 *num_bytes, u64 type);
void btrfs_mapping_init(struct btrfs_mapping_tree *tree);
void btrfs_mapping_tree_free(struct btrfs_mapping_tree *tree);
int btrfs_map_bio(struct btrfs_root *root, int rw, struct bio *bio,
int mirror_num, int async_submit);
int btrfs_read_super_device(struct btrfs_root *root, struct extent_buffer *buf);
int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
int flags, void *holder);
int btrfs_scan_one_device(const char *path, int flags, void *holder,
struct btrfs_fs_devices **fs_devices_ret);
int btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
int btrfs_close_extra_devices(struct btrfs_fs_devices *fs_devices);
int btrfs_add_device(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_device *device);
int btrfs_rm_device(struct btrfs_root *root, char *device_path);
int btrfs_cleanup_fs_uuids(void);
int btrfs_num_copies(struct btrfs_mapping_tree *map_tree, u64 logical, u64 len);
int btrfs_unplug_page(struct btrfs_mapping_tree *map_tree,
u64 logical, struct page *page);
int btrfs_grow_device(struct btrfs_trans_handle *trans,
struct btrfs_device *device, u64 new_size);
struct btrfs_device *btrfs_find_device(struct btrfs_root *root, u64 devid,
u8 *uuid);
int btrfs_shrink_device(struct btrfs_device *device, u64 new_size);
int btrfs_init_new_device(struct btrfs_root *root, char *path);
int btrfs_balance(struct btrfs_root *dev_root);
void btrfs_unlock_volumes(void);
void btrfs_lock_volumes(void);
#endif
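
Note on `btrfs_multi_bio_size(n)`: the macro sizes a structure that ends in a flexible array member, so the header and its `n` stripes can live in one allocation. A minimal userspace sketch of the same pattern follows. The `sketch_*` names are hypothetical, the struct is trimmed to a few fields, and `calloc` stands in for whatever allocator the kernel callers use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down analogues of the structs in volumes.h. */
struct sketch_stripe {
	void *dev;
	uint64_t physical;
};

struct sketch_multi {
	int max_errors;
	int num_stripes;
	struct sketch_stripe stripes[];	/* flexible array member */
};

/* Bytes needed for the header plus n trailing stripes. */
#define sketch_multi_size(n) (sizeof(struct sketch_multi) + \
			      (sizeof(struct sketch_stripe) * (n)))

/* Allocate a zeroed header with room for n stripes; NULL on failure. */
static struct sketch_multi *sketch_multi_alloc(int n)
{
	struct sketch_multi *m;

	if (n < 0)
		return NULL;
	m = calloc(1, sketch_multi_size((size_t)n));
	if (!m)
		return NULL;
	m->num_stripes = n;
	return m;
}

/* Set the physical offset of stripe i, checking the index first. */
static void sketch_multi_set(struct sketch_multi *m, int i, uint64_t physical)
{
	assert(m != NULL && i >= 0 && i < m->num_stripes);
	m->stripes[i].physical = physical;
}
```

A single allocation means a single `free` releases both the header and the stripes, which keeps cleanup on error paths simple.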

321
fs/btrfs/xattr.c Normal file

@ -0,0 +1,321 @@
/*
* Copyright (C) 2007 Red Hat. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#include <linux/init.h>
#include <linux/fs.h>
#include <linux/slab.h>
#include <linux/rwsem.h>
#include <linux/xattr.h>
#include "ctree.h"
#include "btrfs_inode.h"
#include "transaction.h"
#include "xattr.h"
#include "disk-io.h"
ssize_t __btrfs_getxattr(struct inode *inode, const char *name,
void *buffer, size_t size)
{
struct btrfs_dir_item *di;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_path *path;
struct extent_buffer *leaf;
int ret = 0;
unsigned long data_ptr;
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
/* lookup the xattr by name */
di = btrfs_lookup_xattr(NULL, root, path, inode->i_ino, name,
strlen(name), 0);
if (!di || IS_ERR(di)) {
ret = -ENODATA;
goto out;
}
leaf = path->nodes[0];
/* if size is 0, that means we want the size of the attr */
if (!size) {
ret = btrfs_dir_data_len(leaf, di);
goto out;
}
/* now get the data out of our dir_item */
if (btrfs_dir_data_len(leaf, di) > size) {
ret = -ERANGE;
goto out;
}
data_ptr = (unsigned long)((char *)(di + 1) +
btrfs_dir_name_len(leaf, di));
read_extent_buffer(leaf, buffer, data_ptr,
btrfs_dir_data_len(leaf, di));
ret = btrfs_dir_data_len(leaf, di);
out:
btrfs_free_path(path);
return ret;
}
int __btrfs_setxattr(struct inode *inode, const char *name,
const void *value, size_t size, int flags)
{
struct btrfs_dir_item *di;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_trans_handle *trans;
struct btrfs_path *path;
int ret = 0, mod = 0;
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
trans = btrfs_start_transaction(root, 1);
btrfs_set_trans_block_group(trans, inode);
/* first let's see if we already have this xattr */
di = btrfs_lookup_xattr(trans, root, path, inode->i_ino, name,
strlen(name), -1);
if (IS_ERR(di)) {
ret = PTR_ERR(di);
goto out;
}
/* ok we already have this xattr, let's remove it */
if (di) {
/* if the caller asked for create-only, bail out */
if (flags & XATTR_CREATE) {
ret = -EEXIST;
goto out;
}
ret = btrfs_delete_one_dir_name(trans, root, path, di);
if (ret)
goto out;
btrfs_release_path(root, path);
/* if we don't have a value then we are removing the xattr */
if (!value) {
mod = 1;
goto out;
}
} else {
btrfs_release_path(root, path);
if (flags & XATTR_REPLACE) {
/* we couldn't find the attr to replace */
ret = -ENODATA;
goto out;
}
}
/* ok we have to create a completely new xattr */
ret = btrfs_insert_xattr_item(trans, root, name, strlen(name),
value, size, inode->i_ino);
if (ret)
goto out;
mod = 1;
out:
if (mod) {
inode->i_ctime = CURRENT_TIME;
ret = btrfs_update_inode(trans, root, inode);
}
btrfs_end_transaction(trans, root);
btrfs_free_path(path);
return ret;
}
ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
{
struct btrfs_key key, found_key;
struct inode *inode = dentry->d_inode;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_path *path;
struct btrfs_item *item;
struct extent_buffer *leaf;
struct btrfs_dir_item *di;
int ret = 0, slot, advance;
size_t total_size = 0, size_left = size;
unsigned long name_ptr;
size_t name_len;
u32 nritems;
/*
* ok we want all objects associated with this id.
* NOTE: we set key.offset = 0; because we want to start with the
* first xattr that we find and walk forward
*/
key.objectid = inode->i_ino;
btrfs_set_key_type(&key, BTRFS_XATTR_ITEM_KEY);
key.offset = 0;
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
path->reada = 2;
/* search for our xattrs */
ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
if (ret < 0)
goto err;
ret = 0;
advance = 0;
while (1) {
leaf = path->nodes[0];
nritems = btrfs_header_nritems(leaf);
slot = path->slots[0];
/* this is where we start walking through the path */
if (advance || slot >= nritems) {
/*
* if we've reached the last slot in this leaf we need
* to go to the next leaf and reset everything
*/
if (slot >= nritems-1) {
ret = btrfs_next_leaf(root, path);
if (ret)
break;
leaf = path->nodes[0];
nritems = btrfs_header_nritems(leaf);
slot = path->slots[0];
} else {
/*
* just walking through the slots on this leaf
*/
slot++;
path->slots[0]++;
}
}
advance = 1;
item = btrfs_item_nr(leaf, slot);
btrfs_item_key_to_cpu(leaf, &found_key, slot);
/* check to make sure this item is what we want */
if (found_key.objectid != key.objectid)
break;
if (btrfs_key_type(&found_key) != BTRFS_XATTR_ITEM_KEY)
break;
di = btrfs_item_ptr(leaf, slot, struct btrfs_dir_item);
name_len = btrfs_dir_name_len(leaf, di);
total_size += name_len + 1;
/* we are just looking for how big our buffer needs to be */
if (!size)
continue;
if (!buffer || (name_len + 1) > size_left) {
/* skip the "ret = total_size" below so -ERANGE is returned */
ret = -ERANGE;
goto err;
}
name_ptr = (unsigned long)(di + 1);
read_extent_buffer(leaf, buffer, name_ptr, name_len);
buffer[name_len] = '\0';
size_left -= name_len + 1;
buffer += name_len + 1;
}
ret = total_size;
err:
btrfs_free_path(path);
return ret;
}
/*
* List of handlers for synthetic system.* attributes. All real on-disk
* attributes are handled directly.
*/
struct xattr_handler *btrfs_xattr_handlers[] = {
#ifdef CONFIG_FS_POSIX_ACL
&btrfs_xattr_acl_access_handler,
&btrfs_xattr_acl_default_handler,
#endif
NULL,
};
/*
* Check if the attribute is in a supported namespace.
*
* This is applied after the check for the synthetic attributes in the system
* namespace.
*/
static bool btrfs_is_valid_xattr(const char *name)
{
return !strncmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN) ||
!strncmp(name, XATTR_SYSTEM_PREFIX, XATTR_SYSTEM_PREFIX_LEN) ||
!strncmp(name, XATTR_TRUSTED_PREFIX, XATTR_TRUSTED_PREFIX_LEN) ||
!strncmp(name, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN);
}
ssize_t btrfs_getxattr(struct dentry *dentry, const char *name,
void *buffer, size_t size)
{
/*
* If this is a request for a synthetic attribute in the system.*
* namespace use the generic infrastructure to resolve a handler
* for it via sb->s_xattr.
*/
if (!strncmp(name, XATTR_SYSTEM_PREFIX, XATTR_SYSTEM_PREFIX_LEN))
return generic_getxattr(dentry, name, buffer, size);
if (!btrfs_is_valid_xattr(name))
return -EOPNOTSUPP;
return __btrfs_getxattr(dentry->d_inode, name, buffer, size);
}
int btrfs_setxattr(struct dentry *dentry, const char *name, const void *value,
size_t size, int flags)
{
/*
* If this is a request for a synthetic attribute in the system.*
* namespace use the generic infrastructure to resolve a handler
* for it via sb->s_xattr.
*/
if (!strncmp(name, XATTR_SYSTEM_PREFIX, XATTR_SYSTEM_PREFIX_LEN))
return generic_setxattr(dentry, name, value, size, flags);
if (!btrfs_is_valid_xattr(name))
return -EOPNOTSUPP;
if (size == 0)
value = ""; /* empty EA, do not remove */
return __btrfs_setxattr(dentry->d_inode, name, value, size, flags);
}
int btrfs_removexattr(struct dentry *dentry, const char *name)
{
/*
* If this is a request for a synthetic attribute in the system.*
* namespace use the generic infrastructure to resolve a handler
* for it via sb->s_xattr.
*/
if (!strncmp(name, XATTR_SYSTEM_PREFIX, XATTR_SYSTEM_PREFIX_LEN))
return generic_removexattr(dentry, name);
if (!btrfs_is_valid_xattr(name))
return -EOPNOTSUPP;
return __btrfs_setxattr(dentry->d_inode, name, NULL, 0, XATTR_REPLACE);
}
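
The buffer contract that btrfs_listxattr() follows in the file above can be tried outside the kernel. The following is a minimal userspace sketch, not kernel code: pack_names() is a hypothetical helper, and a plain array of strings stands in for the xattr items found in the tree. It shows the three cases: a size of 0 only reports the space required, a buffer that is too small yields -ERANGE, and otherwise the names are packed back to back, each followed by a NUL byte.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical userspace model of the listxattr buffer contract.
 * "names" stands in for the xattr items found in the tree.
 */
static ssize_t pack_names(const char *const *names, size_t count,
			  char *buffer, size_t size)
{
	size_t total_size = 0, size_left = size;
	size_t i;

	for (i = 0; i < count; i++) {
		size_t name_len = strlen(names[i]);

		/* every name costs its length plus the trailing NUL */
		total_size += name_len + 1;

		/* size == 0: the caller only asks how much space is needed */
		if (!size)
			continue;

		/* report a too-small buffer instead of truncating a name */
		if (!buffer || name_len + 1 > size_left)
			return -ERANGE;

		memcpy(buffer, names[i], name_len);
		buffer[name_len] = '\0';
		size_left -= name_len + 1;
		buffer += name_len + 1;
	}
	return (ssize_t)total_size;
}
```

A caller would typically call once with size 0, allocate the returned number of bytes, and call again with the real buffer.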

39
fs/btrfs/xattr.h Normal file

@ -0,0 +1,39 @@
/*
* Copyright (C) 2007 Red Hat. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 02111-1307, USA.
*/
#ifndef __XATTR__
#define __XATTR__
#include <linux/xattr.h>
extern struct xattr_handler btrfs_xattr_acl_access_handler;
extern struct xattr_handler btrfs_xattr_acl_default_handler;
extern struct xattr_handler *btrfs_xattr_handlers[];
extern ssize_t __btrfs_getxattr(struct inode *inode, const char *name,
void *buffer, size_t size);
extern int __btrfs_setxattr(struct inode *inode, const char *name,
const void *value, size_t size, int flags);
extern ssize_t btrfs_getxattr(struct dentry *dentry, const char *name,
void *buffer, size_t size);
extern int btrfs_setxattr(struct dentry *dentry, const char *name,
const void *value, size_t size, int flags);
extern int btrfs_removexattr(struct dentry *dentry, const char *name);
#endif /* __XATTR__ */
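
The btrfs_getxattr(), btrfs_setxattr() and btrfs_removexattr() entry points declared above all route a request by the namespace prefix of the attribute name. A minimal userspace sketch of that routing follows, under stated assumptions: route_xattr(), has_prefix() and the xattr_route enum are hypothetical names used only for illustration, and the prefix strings are written out literally (they match the XATTR_*_PREFIX values in <linux/xattr.h>) so the sketch builds outside the kernel tree.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type: where a request for "name" would be sent. */
enum xattr_route {
	ROUTE_GENERIC,		/* system.*: resolved through the handler table */
	ROUTE_BTRFS,		/* security.*, trusted.*, user.*: stored on disk */
	ROUTE_UNSUPPORTED	/* anything else: -EOPNOTSUPP in the kernel code */
};

/* True when "name" begins with "prefix", including the trailing dot. */
static bool has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

static enum xattr_route route_xattr(const char *name)
{
	/* synthetic system.* attributes are checked first */
	if (has_prefix(name, "system."))
		return ROUTE_GENERIC;

	if (has_prefix(name, "security.") ||
	    has_prefix(name, "trusted.") ||
	    has_prefix(name, "user."))
		return ROUTE_BTRFS;

	return ROUTE_UNSUPPORTED;
}
```

Because the trailing dot is part of each prefix, a bare name such as "user" is rejected rather than being treated as a member of the user namespace.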