dect
/
linux-2.6
Archived
13
0
Fork 0
Commit Graph

709 Commits

Author SHA1 Message Date
Eric Paris 5224ee0863 SELinux: do not destroy the avc_cache_nodep
The security_ops reset done when SELinux is disabled at run time is done
after the avc cache is freed and after the kmem_cache for the avc is also
freed.  This means that between the time the selinux disable code destroys
the avc_node_cachep another process could make a security request and could
try to allocate from the cache.  We are just going to leave the cachep around,
like we always have.

SELinux:  Disabled at runtime.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81122537>] kmem_cache_alloc+0x9a/0x185
PGD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file:
CPU 1
Modules linked in:
Pid: 12, comm: khelper Not tainted 2.6.31-tip-05525-g0eeacc6-dirty #14819
System Product Name
RIP: 0010:[<ffffffff81122537>]  [<ffffffff81122537>]
kmem_cache_alloc+0x9a/0x185
RSP: 0018:ffff88003f9258b0  EFLAGS: 00010086
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000078c0129e
RDX: 0000000000000000 RSI: ffffffff8130b626 RDI: ffffffff81122528
RBP: ffff88003f925900 R08: 0000000078c0129e R09: 0000000000000001
R10: 0000000000000000 R11: 0000000078c0129e R12: 0000000000000246
R13: 0000000000008020 R14: ffff88003f8586d8 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff880002b00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: ffffffff827bd420 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khelper (pid: 12, threadinfo ffff88003f924000, task
ffff88003f928000)
Stack:
 0000000000000246 0000802000000246 ffffffff8130b626 0000000000000001
<0> 0000000078c0129e 0000000000000000 ffff88003f925a70 0000000000000002
<0> 0000000000000001 0000000000000001 ffff88003f925960 ffffffff8130b626
Call Trace:
 [<ffffffff8130b626>] ? avc_alloc_node+0x36/0x273
 [<ffffffff8130b626>] avc_alloc_node+0x36/0x273
 [<ffffffff8130b545>] ? avc_latest_notif_update+0x7d/0x9e
 [<ffffffff8130b8b4>] avc_insert+0x51/0x18d
 [<ffffffff8130bcce>] avc_has_perm_noaudit+0x9d/0x128
 [<ffffffff8130bf20>] avc_has_perm+0x45/0x88
 [<ffffffff8130f99d>] current_has_perm+0x52/0x6d
 [<ffffffff8130fbb2>] selinux_task_create+0x2f/0x45
 [<ffffffff81303bf7>] security_task_create+0x29/0x3f
 [<ffffffff8105c6ba>] copy_process+0x82/0xdf0
 [<ffffffff81091578>] ? register_lock_class+0x2f/0x36c
 [<ffffffff81091a13>] ? mark_lock+0x2e/0x1e1
 [<ffffffff8105d596>] do_fork+0x16e/0x382
 [<ffffffff81091578>] ? register_lock_class+0x2f/0x36c
 [<ffffffff810d9166>] ? probe_workqueue_execution+0x57/0xf9
 [<ffffffff81091a13>] ? mark_lock+0x2e/0x1e1
 [<ffffffff810d9166>] ? probe_workqueue_execution+0x57/0xf9
 [<ffffffff8100cdb2>] kernel_thread+0x82/0xe0
 [<ffffffff81078b1f>] ? ____call_usermodehelper+0x0/0x139
 [<ffffffff8100ce10>] ? child_rip+0x0/0x20
 [<ffffffff81078aea>] ? __call_usermodehelper+0x65/0x9a
 [<ffffffff8107a5c7>] run_workqueue+0x171/0x27e
 [<ffffffff8107a573>] ? run_workqueue+0x11d/0x27e
 [<ffffffff81078a85>] ? __call_usermodehelper+0x0/0x9a
 [<ffffffff8107a7bc>] worker_thread+0xe8/0x10f
 [<ffffffff810808e2>] ? autoremove_wake_function+0x0/0x63
 [<ffffffff8107a6d4>] ? worker_thread+0x0/0x10f
 [<ffffffff8108042e>] kthread+0x91/0x99
 [<ffffffff8100ce1a>] child_rip+0xa/0x20
 [<ffffffff8100c754>] ? restore_args+0x0/0x30
 [<ffffffff8108039d>] ? kthread+0x0/0x99
 [<ffffffff8100ce10>] ? child_rip+0x0/0x20
Code: 0f 85 99 00 00 00 9c 58 66 66 90 66 90 49 89 c4 fa 66 66 90 66 66 90
e8 83 34 fb ff e8 d7 e9 26 00 48 98 49 8b 94 c6 10 01 00 00 <48> 8b 1a 44
8b 7a 18 48 85 db 74 0f 8b 42 14 48 8b 04 c3 ff 42
RIP  [<ffffffff81122537>] kmem_cache_alloc+0x9a/0x185
 RSP <ffff88003f9258b0>
CR2: 0000000000000000
---[ end trace 42f41a982344e606 ]---

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-23 11:16:20 -07:00
Eric Paris 4e6d0bffd3 SELinux: flush the avc before disabling SELinux
Before SELinux is disabled at boot it can create AVC entries.  This patch
will flush those entries before disabling SELinux.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:11 +10:00
Eric Paris 008574b111 SELinux: seperate avc_cache flushing
Move the avc_cache flushing into it's own function so it can be reused when
disabling SELinux.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:09 +10:00
Eric Paris ed868a5698 Creds: creds->security can be NULL is selinux is disabled
__validate_process_creds should check if selinux is actually enabled before
running tests on the selinux portion of the credentials struct.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:07 +10:00
David P. Quigley ddd29ec659 sysfs: Add labeling support for sysfs
This patch adds a setxattr handler to the file, directory, and symlink
inode_operations structures for sysfs. The patch uses hooks introduced in the
previous patch to handle the getting and setting of security information for
the sysfs inodes. As was suggested by Eric Biederman the struct iattr in the
sysfs_dirent structure has been replaced by a structure which contains the
iattr, secdata and secdata length to allow the changes to persist in the event
that the inode representing the sysfs_dirent is evicted. Because sysfs only
stores this information when a change is made all the optional data is moved
into one dynamically allocated field.

This patch addresses an issue where SELinux was denying virtd access to the PCI
configuration entries in sysfs. The lack of setxattr handlers for sysfs
required that a single label be assigned to all entries in sysfs. Granting virtd
access to every entry in sysfs is not an acceptable solution so fine grained
labeling of sysfs is required such that individual entries can be labeled
appropriately.

[sds:  Fixed compile-time warnings, coding style, and setting of inode security init flags.]

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-10 10:11:29 +10:00
David P. Quigley 1ee65e37e9 LSM/SELinux: inode_{get,set,notify}secctx hooks to access LSM security context information.
This patch introduces three new hooks. The inode_getsecctx hook is used to get
all relevant information from an LSM about an inode. The inode_setsecctx is
used to set both the in-core and on-disk state for the inode based on a context
derived from inode_getsecctx.The final hook inode_notifysecctx will notify the
LSM of a change for the in-core state of the inode in question. These hooks are
for use in the labeled NFS code and addresses concerns of how to set security
on an inode in a multi-xattr LSM. For historical reasons Stephen Smalley's
explanation of the reason for these hooks is pasted below.

Quote Stephen Smalley

inode_setsecctx:  Change the security context of an inode.  Updates the
in core security context managed by the security module and invokes the
fs code as needed (via __vfs_setxattr_noperm) to update any backing
xattrs that represent the context.  Example usage:  NFS server invokes
this hook to change the security context in its incore inode and on the
backing file system to a value provided by the client on a SETATTR
operation.

inode_notifysecctx:  Notify the security module of what the security
context of an inode should be.  Initializes the incore security context
managed by the security module for this inode.  Example usage:  NFS
client invokes this hook to initialize the security context in its
incore inode to the value provided by the server for the file when the
server returned the file's attributes to the client.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-10 10:11:24 +10:00
David Howells ee18d64c1f KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
Add a keyctl to install a process's session keyring onto its parent.  This
replaces the parent's session keyring.  Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again.  Normally this
will be after a wait*() syscall.

To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.

The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.

Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME.  This allows the
replacement to be performed at the point the parent process resumes userspace
execution.

This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership.  However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.

This can be tested with the following program:

	#include <stdio.h>
	#include <stdlib.h>
	#include <keyutils.h>

	#define KEYCTL_SESSION_TO_PARENT	18

	#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)

	int main(int argc, char **argv)
	{
		key_serial_t keyring, key;
		long ret;

		keyring = keyctl_join_session_keyring(argv[1]);
		OSERROR(keyring, "keyctl_join_session_keyring");

		key = add_key("user", "a", "b", 1, keyring);
		OSERROR(key, "add_key");

		ret = keyctl(KEYCTL_SESSION_TO_PARENT);
		OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");

		return 0;
	}

Compiled and linked with -lkeyutils, you should see something like:

	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: _ses
	355907932 --alswrv   4043    -1   \_ keyring: _uid.4043
	[dhowells@andromeda ~]$ /tmp/newpag
	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: _ses
	1055658746 --alswrv   4043  4043   \_ user: a
	[dhowells@andromeda ~]$ /tmp/newpag hello
	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: hello
	340417692 --alswrv   4043  4043   \_ user: a

Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:22 +10:00
David Howells e0e817392b CRED: Add some configurable debugging [try #6]
Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management.  The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).

Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.

This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):

	http://www.kerneloops.org/oops.php?number=252883

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:01 +10:00
Paul Moore ed6d76e4c3 selinux: Support for the new TUN LSM hooks
Add support for the new TUN LSM hooks: security_tun_dev_create(),
security_tun_dev_post_create() and security_tun_dev_attach().  This includes
the addition of a new object class, tun_socket, which represents the socks
associated with TUN devices.  The _tun_dev_create() and _tun_dev_post_create()
hooks are fairly similar to the standard socket functions but _tun_dev_attach()
is a bit special.  The _tun_dev_attach() is unique because it involves a
domain attaching to an existing TUN device and its associated tun_socket
object, an operation which does not exist with standard sockets and most
closely resembles a relabel operation.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-01 08:29:52 +10:00
Amerigo Wang bc6a6008e5 selinux: adjust rules for ATTR_FORCE
As suggested by OGAWA Hirofumi in thread:
http://lkml.org/lkml/2009/8/7/132, we should let selinux_inode_setattr()
to match our ATTR_* rules.  ATTR_FORCE should not force things like
ATTR_SIZE.

[hirofumi@mail.parknet.co.jp: tweaks]
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-21 14:25:30 +10:00
James Morris ece13879e7 Merge branch 'master' into next
Conflicts:
	security/Kconfig

Manual fix.

Signed-off-by: James Morris <jmorris@namei.org>
2009-08-20 09:18:42 +10:00
Eric Paris 788084aba2 Security/SELinux: seperate lsm specific mmap_min_addr
Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable.  This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 15:09:11 +10:00
Eric Paris 8cf948e744 SELinux: call cap_file_mmap in selinux_file_mmap
Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook.  This
means there is no DAC check on the ability to mmap low addresses in the
memory space.  This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero.  This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 15:08:48 +10:00
Thomas Liu 2bf4969032 SELinux: Convert avc_audit to use lsm_audit.h
Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
for better maintainability.

 - changed selinux to use common_audit_data instead of
    avc_audit_data
 - eliminated code in avc.c and used code from lsm_audit.h instead.

Had to add a LSM_AUDIT_NO_AUDIT to lsm_audit.h so that avc_audit
can call common_lsm_audit and do the pre and post callbacks without
doing the actual dump.  This makes it so that the patched version
behaves the same way as the unpatched version.

Also added a denied field to the selinux_audit_data private space,
once again to make it so that the patched version behaves like the
unpatched.

I've tested and confirmed that AVCs look the same before and after
this patch.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-17 08:37:18 +10:00
Eric Paris 25354c4fee SELinux: add selinux_kernel_module_request
This patch adds a new selinux hook so SELinux can arbitrate if a given
process should be allowed to trigger a request for the kernel to try to
load a module.  This is a different operation than a process trying to load
a module itself, which is already protected by CAP_SYS_MODULE.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-14 11:18:40 +10:00
James Morris 314dabb83a SELinux: fix memory leakage in /security/selinux/hooks.c
Fix memory leakage in /security/selinux/hooks.c

The buffer always needs to be freed here; we either error
out or allocate more memory.

Reported-by: iceberg <strakh@ispras.ru>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
2009-08-11 08:37:13 +10:00
Eric Paris a2551df7ec Security/SELinux: seperate lsm specific mmap_min_addr
Currently SELinux enforcement of controls on the ability to map low memory
is determined by the mmap_min_addr tunable.  This patch causes SELinux to
ignore the tunable and instead use a seperate Kconfig option specific to how
much space the LSM should protect.

The tunable will now only control the need for CAP_SYS_RAWIO and SELinux
permissions will always protect the amount of low memory designated by
CONFIG_LSM_MMAP_MIN_ADDR.

This allows users who need to disable the mmap_min_addr controls (usual reason
being they run WINE as a non-root user) to do so and still have SELinux
controls preventing confined domains (like a web server) from being able to
map some area of low memory.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-06 09:02:23 +10:00
Eric Paris 84336d1a77 SELinux: call cap_file_mmap in selinux_file_mmap
Currently SELinux does not check CAP_SYS_RAWIO in the file_mmap hook.  This
means there is no DAC check on the ability to mmap low addresses in the
memory space.  This function adds the DAC check for CAP_SYS_RAWIO while
maintaining the selinux check on mmap_zero.  This means that processes
which need to mmap low memory will need CAP_SYS_RAWIO and mmap_zero but will
NOT need the SELinux sys_rawio capability.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-06 09:02:21 +10:00
Oleg Nesterov 5bb459bb45 kernel: rename is_single_threaded(task) to current_is_single_threaded(void)
- is_single_threaded(task) is not safe unless task == current,
  we can't use task->signal or task->mm.

- it doesn't make sense unless task == current, the task can
  fork right after the check.

Rename it to current_is_single_threaded() and kill the argument.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-07-17 09:10:42 +10:00
James Morris be940d6279 Revert "SELinux: Convert avc_audit to use lsm_audit.h"
This reverts commit 8113a8d80f.

The patch causes a stack overflow on my system during boot.

Signed-off-by: James Morris <jmorris@namei.org>
2009-07-13 10:39:36 +10:00
Thomas Liu 8113a8d80f SELinux: Convert avc_audit to use lsm_audit.h
Convert avc_audit in security/selinux/avc.c to use lsm_audit.h,
for better maintainability and for less code duplication.

 - changed selinux to use common_audit_data instead of
   avc_audit_data
 - eliminated code in avc.c and used code from lsm_audit.h instead.

I have tested to make sure that the avcs look the same before and
after this patch.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-07-13 07:54:48 +10:00
Thomas Liu 89c86576ec selinux: clean up avc node cache when disabling selinux
Added a call to free the avc_node_cache when inside selinux_disable because
it should not waste resources allocated during avc_init if SELinux is disabled
and the cache will never be used.

Signed-off-by: Thomas Liu <tliu@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-25 08:29:16 +10:00
Ingo Molnar 9e48858f7d security: rename ptrace_may_access => ptrace_access_check
The ->ptrace_may_access() methods are named confusingly - the real
ptrace_may_access() returns a bool, while these security checks have
a retval convention.

Rename it to ptrace_access_check, to reduce the confusion factor.

[ Impact: cleanup, no code changed ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-25 00:18:05 +10:00
Stephen Smalley 20dda18be9 selinux: restore optimization to selinux_file_permission
Restore the optimization to skip revalidation in selinux_file_permission
if nothing has changed since the dentry_open checks, accidentally removed by
389fb800.  Also remove redundant test from selinux_revalidate_file_permission.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-23 08:19:58 +10:00
James Morris d905163c5b Merge branch 'master' into next 2009-06-19 08:20:55 +10:00
KaiGai Kohei 44c2d9bdd7 Add audit messages on type boundary violations
The attached patch adds support to generate audit messages on two cases.

The first one is a case when a multi-thread process tries to switch its
performing security context using setcon(3), but new security context is
not bounded by the old one.

  type=SELINUX_ERR msg=audit(1245311998.599:17):        \
      op=security_bounded_transition result=denied      \
      oldcontext=system_u:system_r:httpd_t:s0           \
      newcontext=system_u:system_r:guest_webapp_t:s0

The other one is a case when security_compute_av() masked any permissions
due to the type boundary violation.

  type=SELINUX_ERR msg=audit(1245312836.035:32):	\
      op=security_compute_av reason=bounds              \
      scontext=system_u:object_r:user_webapp_t:s0       \
      tcontext=system_u:object_r:shadow_t:s0:c0         \
      tclass=file perms=getattr,open

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-19 00:12:28 +10:00
KaiGai Kohei caabbdc07d cleanup in ss/services.c
It is a cleanup patch to cut down a line within 80 columns.

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
--
 security/selinux/ss/services.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-18 21:53:44 +10:00
David S. Miller 9cbc1cb8cd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
	drivers/scsi/fcoe/fcoe.c
	net/core/drop_monitor.c
	net/core/net-traces.c
2009-06-15 03:02:23 -07:00
Eric Dumazet adf30907d6 net: skb->dst accessors
Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-03 02:51:04 -07:00
Eric Paris 850b0cee16 SELinux: define audit permissions for audit tree netlink messages
Audit trees defined 2 new netlink messages but the netlink mapping tables for
selinux permissions were not set up.  This patch maps these 2 new operations
to AUDIT_WRITE.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-06-03 07:44:53 +10:00
Stephen Smalley c5642f4bba selinux: remove obsolete read buffer limit from sel_read_bool
On Tue, 2009-05-19 at 00:05 -0400, Eamon Walsh wrote:
> Recent versions of coreutils have bumped the read buffer size from 4K to
> 32K in several of the utilities.
>
> This means that "cat /selinux/booleans/xserver_object_manager" no longer
> works, it returns "Invalid argument" on F11.  getsebool works fine.
>
> sel_read_bool has a check for "count > PAGE_SIZE" that doesn't seem to
> be present in the other read functions.  Maybe it could be removed?

Yes, that check is obsoleted by the conversion of those functions to
using simple_read_from_buffer(), which will reduce count if necessary to
what is available in the buffer.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-19 23:56:11 +10:00
Eric Paris 75834fc3b6 SELinux: move SELINUX_MAGIC into magic.h
The selinuxfs superblock magic is used inside the IMA code, but is being
defined in two places and could someday get out of sync.  This patch moves the
declaration into magic.h so it is only done once.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-19 08:19:00 +10:00
James Morris d254117099 Merge branch 'master' into next 2009-05-08 17:56:47 +10:00
Stephen Smalley 65c90bca0d selinux: Fix send_sigiotask hook
The CRED patch incorrectly converted the SELinux send_sigiotask hook to
use the current task SID rather than the target task SID in its
permission check, yielding the wrong permission check.  This fixes the
hook function.  Detected by the ltp selinux testsuite and confirmed to
correct the test failure.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-05-05 08:31:03 +10:00
Oleg Nesterov ecd6de3c88 selinux: selinux_bprm_committed_creds() should wake up ->real_parent, not ->parent.
We shouldn't worry about the tracer if current is ptraced, exec() must not
succeed if the tracer has no rights to trace this task after cred changing.
But we should notify ->real_parent which is, well, real parent.

Also, we don't need _irq to take tasklist, and we don't need parent's
->siglock to wake_up_interruptible(real_parent->signal->wait_chldexit).
Since we hold tasklist, real_parent->signal must be stable. Otherwise
spin_lock(siglock) is not safe too and can't help anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-04-30 09:08:48 +10:00
David Howells 3bcac0263f SELinux: Don't flush inherited SIGKILL during execve()
Don't flush inherited SIGKILL during execve() in SELinux's post cred commit
hook.  This isn't really a security problem: if the SIGKILL came before the
credentials were changed, then we were right to receive it at the time, and
should honour it; if it came after the creds were changed, then we definitely
should honour it; and in any case, all that will happen is that the process
will be scrapped before it ever returns to userspace.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-04-30 09:07:13 +10:00
Eric Paris 88c48db978 SELinux: drop secondary_ops->sysctl
We are still calling secondary_ops->sysctl even though the capabilities
module does not define a sysctl operation.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-04-30 08:45:56 +10:00
KaiGai Kohei 8a6f83afd0 Permissive domain in userspace object manager
This patch enables applications to handle permissive domain correctly.

Since the v2.6.26 kernel, SELinux has supported an idea of permissive
domain which allows certain processes to work as if permissive mode,
even if the global setting is enforcing mode.
However, we don't have an application program interface to inform
what domains are permissive one, and what domains are not.
It means applications focuses on SELinux (XACE/SELinux, SE-PostgreSQL
and so on) cannot handle permissive domain correctly.

This patch add the sixth field (flags) on the reply of the /selinux/access
interface which is used to make an access control decision from userspace.
If the first bit of the flags field is positive, it means the required
access control decision is on permissive domain, so application should
allow any required actions, as the kernel doing.

This patch also has a side benefit. The av_decision.flags is set at
context_struct_compute_av(). It enables to check required permissions
without read_lock(&policy_rwlock).

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
--
 security/selinux/avc.c              |    2 +-
 security/selinux/include/security.h |    4 +++-
 security/selinux/selinuxfs.c        |    4 ++--
 security/selinux/ss/services.c      |   30 +++++-------------------------
 4 files changed, 11 insertions(+), 29 deletions(-)
Signed-off-by: James Morris <jmorris@namei.org>
2009-04-02 09:23:45 +11:00
Paul Moore 58bfbb51ff selinux: Remove the "compat_net" compatibility code
The SELinux "compat_net" is marked as deprecated, the time has come to
finally remove it from the kernel.  Further code simplifications are
likely in the future, but this patch was intended to be a simple,
straight-up removal of the compat_net code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-28 15:01:37 +11:00
Paul Moore 389fb800ac netlabel: Label incoming TCP connections correctly in SELinux
The current NetLabel/SELinux behavior for incoming TCP connections works but
only through a series of happy coincidences that rely on the limited nature of
standard CIPSO (only able to convey MLS attributes) and the write equality
imposed by the SELinux MLS constraints.  The problem is that network sockets
created as the result of an incoming TCP connection were not on-the-wire
labeled based on the security attributes of the parent socket but rather based
on the wire label of the remote peer.  The issue had to do with how IP options
were managed as part of the network stack and where the LSM hooks were in
relation to the code which set the IP options on these newly created child
sockets.  While NetLabel/SELinux did correctly set the socket's on-the-wire
label it was promptly cleared by the network stack and reset based on the IP
options of the remote peer.

This patch, in conjunction with a prior patch that adjusted the LSM hook
locations, works to set the correct on-the-wire label format for new incoming
connections through the security_inet_conn_request() hook.  Besides the
correct behavior there are many advantages to this change, the most significant
is that all of the NetLabel socket labeling code in SELinux now lives in hooks
which can return error codes to the core stack which allows us to finally get
ride of the selinux_netlbl_inode_permission() logic which greatly simplfies
the NetLabel/SELinux glue code.  In the process of developing this patch I
also ran into a small handful of AF_INET6 cleanliness issues that have been
fixed which should make the code safer and easier to extend in the future.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-28 15:01:36 +11:00
James Morris 703a3cd728 Merge branch 'master' into next 2009-03-24 10:52:46 +11:00
Eric Paris df7f54c012 SELinux: inode_doinit_with_dentry drop no dentry printk
Drop the printk message when an inode is found without an associated
dentry.  This should only happen when userspace can't be accessing those
inodes and those labels will get set correctly on the next d_instantiate.
Thus there is no reason to send this message.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-10 08:40:02 +11:00
Eric Paris dd34b5d75a SELinux: new permission between tty audit and audit socket
New selinux permission to separate the ability to turn on tty auditing from
the ability to set audit rules.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-06 08:50:21 +11:00
Eric Paris 6a25b27d60 SELinux: open perm for sock files
When I did open permissions I didn't think any sockets would have an open.
Turns out AF_UNIX sockets can have an open when they are bound to the
filesystem namespace.  This patch adds a new SOCK_FILE__OPEN permission.
It's safe to add this as the open perms are already predicated on
capabilities and capabilities means we have unknown perm handling so
systems should be as backwards compatible as the policy wants them to
be.

https://bugzilla.redhat.com/show_bug.cgi?id=475224

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-06 08:50:18 +11:00
Paul Moore d7f59dc464 selinux: Fix a panic in selinux_netlbl_inode_permission()
Rick McNeal from LSI identified a panic in selinux_netlbl_inode_permission()
caused by a certain sequence of SUNRPC operations.  The problem appears to be
due to the lack of NULL pointer checking in the function; this patch adds the
pointer checks so the function will exit safely in the cases where the socket
is not completely initialized.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-03-02 09:30:04 +11:00
Paul Moore 09c50b4a52 selinux: Fix the NetLabel glue code for setsockopt()
At some point we (okay, I) managed to break the ability for users to use the
setsockopt() syscall to set IPv4 options when NetLabel was not active on the
socket in question.  The problem was noticed by someone trying to use the
"-R" (record route) option of ping:

 # ping -R 10.0.0.1
 ping: record route: No message of desired type

The solution is relatively simple, we catch the unlabeled socket case and
clear the error code, allowing the operation to succeed.  Please note that we
still deny users the ability to override IPv4 options on socket's which have
NetLabel labeling active; this is done to ensure the labeling remains intact.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-23 10:05:55 +11:00
Eric Paris 26036651c5 SELinux: convert the avc cache hash list to an hlist
We do not need O(1) access to the tail of the avc cache lists and so we are
wasting lots of space using struct list_head instead of struct hlist_head.
This patch converts the avc cache to use hlists in which there is a single
pointer from the head which saves us about 4k of global memory.

Resulted in about a 1.5% decrease in time spent in avc_has_perm_noaudit based
on oprofile sampling of tbench.  Although likely within the noise....

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:23:48 +11:00
Eric Paris edf3d1aecd SELinux: code readability with avc_cache
The code making use of struct avc_cache was not easy to read thanks to liberal
use of &avc_cache.{slots_lock,slots}[hvalue] throughout.  This patch simply
creates local pointers and uses those instead of the long global names.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:23:45 +11:00
Eric Paris f1c6381a6e SELinux: remove unused av.decided field
It appears there was an intention to have the security server only decide
certain permissions and leave other for later as some sort of a portential
performance win.  We are currently always deciding all 32 bits of
permissions and this is a useless couple of branches and wasted space.
This patch completely drops the av.decided concept.

This in a 17% reduction in the time spent in avc_has_perm_noaudit
based on oprofile sampling of a tbench benchmark.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:23:08 +11:00
Eric Paris 21193dcd1f SELinux: more careful use of avd in avc_has_perm_noaudit
we are often needlessly jumping through hoops when it comes to avd
entries in avc_has_perm_noaudit and we have extra initialization and memcpy
which are just wasting performance.  Try to clean the function up a bit.

This patch resulted in a 13% drop in time spent in avc_has_perm_noaudit in my
oprofile sampling of a tbench benchmark.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:23:04 +11:00
Eric Paris 906d27d9d2 SELinux: remove the unused ae.used
Currently SELinux code has an atomic which was intended to track how many
times an avc entry was used and to evict entries when they haven't been
used recently.  Instead we never let this atomic get above 1 and evict when
it is first checked for eviction since it hits zero.  This is a total waste
of time so I'm completely dropping ae.used.

This change resulted in about a 3% faster avc_has_perm_noaudit when running
oprofile against a tbench benchmark.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:37 +11:00
Eric Paris a5dda68332 SELinux: check seqno when updating an avc_node
The avc update node callbacks do not check the seqno of the caller with the
seqno of the node found.  It is possible that a policy change could happen
(although almost impossibly unlikely) in which a permissive or
permissive_domain decision is not valid for the entry found.  Simply pass
and check that the seqno of the caller and the seqno of the node found
match.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:34 +11:00
Eric Paris 4cb912f1d1 SELinux: NULL terminate al contexts from disk
When a context is pulled in from disk we don't know that it is null
terminated.  This patch forecebly null terminates contexts when we pull
them from disk.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:30 +11:00
Eric Paris 4ba0a8ad63 SELinux: better printk when file with invalid label found
Currently when an inode is read into the kernel with an invalid label
string (can often happen with removable media) we output a string like:

SELinux: inode_doinit_with_dentry:  context_to_sid([SOME INVALID LABEL])
returned -22 dor dev=[blah] ino=[blah]

Which is all but incomprehensible to all but a couple of us.  Instead, on
EINVAL only, I plan to output a much more user friendly string and I plan to
ratelimit the printk since many of these could be generated very rapidly.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:27 +11:00
Eric Paris 200ac532a4 SELinux: call capabilities code directory
For cleanliness and efficiency remove all calls to secondary-> and instead
call capabilities code directly.  capabilities are the only module that
selinux stacks with and so the code should not indicate that other stacking
might be possible.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2009-02-14 09:22:24 +11:00
James Morris cb5629b10d Merge branch 'master' into next
Conflicts:
	fs/namei.c

Manually merged per:

diff --cc fs/namei.c
index 734f2b5,bbc15c2..0000000
--- a/fs/namei.c
+++ b/fs/namei.c
@@@ -860,9 -848,8 +849,10 @@@ static int __link_path_walk(const char
  		nd->flags |= LOOKUP_CONTINUE;
  		err = exec_permission_lite(inode);
  		if (err == -EAGAIN)
- 			err = vfs_permission(nd, MAY_EXEC);
+ 			err = inode_permission(nd->path.dentry->d_inode,
+ 					       MAY_EXEC);
 +		if (!err)
 +			err = ima_path_check(&nd->path, MAY_EXEC);
   		if (err)
  			break;

@@@ -1525,14 -1506,9 +1509,14 @@@ int may_open(struct path *path, int acc
  		flag &= ~O_TRUNC;
  	}

- 	error = vfs_permission(nd, acc_mode);
+ 	error = inode_permission(inode, acc_mode);
  	if (error)
  		return error;
 +
- 	error = ima_path_check(&nd->path,
++	error = ima_path_check(path,
 +			       acc_mode & (MAY_READ | MAY_WRITE | MAY_EXEC));
 +	if (error)
 +		return error;
  	/*
  	 * An append-only file must be opened in append mode for writing.
  	 */

Signed-off-by: James Morris <jmorris@namei.org>
2009-02-06 11:01:45 +11:00
James Morris 5626d3e861 selinux: remove hooks which simply defer to capabilities
Remove SELinux hooks which do nothing except defer to the capabilites
hooks (or in one case, replicates the function).

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
2009-02-02 09:20:34 +11:00
James Morris 95c14904b6 selinux: remove secondary ops call to shm_shmat
Remove secondary ops call to shm_shmat, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:16 +11:00
James Morris 5c4054ccfa selinux: remove secondary ops call to unix_stream_connect
Remove secondary ops call to unix_stream_connect, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:15 +11:00
James Morris 2cbbd19812 selinux: remove secondary ops call to task_kill
Remove secondary ops call to task_kill, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:14 +11:00
James Morris ef76e748fa selinux: remove secondary ops call to task_setrlimit
Remove secondary ops call to task_setrlimit, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:13 +11:00
James Morris ca5143d3ff selinux: remove unused cred_commit hook
Remove unused cred_commit hook from SELinux.   This
currently calls into the capabilities hook, which is a noop.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:12 +11:00
James Morris af294e41d0 selinux: remove secondary ops call to task_create
Remove secondary ops call to task_create, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:11 +11:00
James Morris d541bbee69 selinux: remove secondary ops call to file_mprotect
Remove secondary ops call to file_mprotect, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:11 +11:00
James Morris 438add6b32 selinux: remove secondary ops call to inode_setattr
Remove secondary ops call to inode_setattr, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:10 +11:00
James Morris 188fbcca9d selinux: remove secondary ops call to inode_permission
Remove secondary ops call to inode_permission, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:09 +11:00
James Morris f51115b9ab selinux: remove secondary ops call to inode_follow_link
Remove secondary ops call to inode_follow_link, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:08 +11:00
James Morris dd4907a6d4 selinux: remove secondary ops call to inode_mknod
Remove secondary ops call to inode_mknod, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:07 +11:00
James Morris e4737250b7 selinux: remove secondary ops call to inode_unlink
Remove secondary ops call to inode_unlink, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:06 +11:00
James Morris efdfac4376 selinux: remove secondary ops call to inode_link
Remove secondary ops call to inode_link, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:06 +11:00
James Morris 97422ab9ef selinux: remove secondary ops call to sb_umount
Remove secondary ops call to sb_umount, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:05 +11:00
James Morris ef935b9136 selinux: remove secondary ops call to sb_mount
Remove secondary ops call to sb_mount, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:04 +11:00
James Morris 5565b0b865 selinux: remove secondary ops call to bprm_committed_creds
Remove secondary ops call to bprm_committed_creds, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:03 +11:00
James Morris 2ec5dbe23d selinux: remove secondary ops call to bprm_committing_creds
Remove secondary ops call to bprm_committing_creds, which is
a noop in capabilities.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:02 +11:00
James Morris bc05595845 selinux: remove unused bprm_check_security hook
Remove unused bprm_check_security hook from SELinux.   This
currently calls into the capabilities hook, which is a noop.

Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-30 08:55:01 +11:00
David P. Quigley cd89596f0c SELinux: Unify context mount and genfs behavior
Context mounts and genfs labeled file systems behave differently with respect to
setting file system labels. This patch brings genfs labeled file systems in line
with context mounts in that setxattr calls to them should return EOPNOTSUPP and
fscreate calls will be ignored.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
2009-01-19 09:47:14 +11:00
David P. Quigley 11689d47f0 SELinux: Add new security mount option to indicate security label support.
There is no easy way to tell if a file system supports SELinux security labeling.
Because of this a new flag is being added to the super block security structure
to indicate that the particular super block supports labeling. This flag is set
for file systems using the xattr, task, and transition labeling methods unless
that behavior is overridden by context mounts.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
2009-01-19 09:47:06 +11:00
David P. Quigley 0d90a7ec48 SELinux: Condense super block security structure flags and cleanup necessary code.
The super block security structure currently has three fields for what are
essentially flags.  The flags field is used for mount options while two other
char fields are used for initialization and proc flags. These latter two fields are
essentially bit fields since the only used values are 0 and 1.  These fields
have been collapsed into the flags field and new bit masks have been added for
them. The code is also fixed to work with these new flags.

Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@macbook.localdomain>
2009-01-19 09:46:40 +11:00
James Morris ac8cc0fa53 Merge branch 'next' into for-linus 2009-01-07 09:58:22 +11:00
David Howells 3699c53c48 CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #3]
Fix a regression in cap_capable() due to:

	commit 3b11a1dece
	Author: David Howells <dhowells@redhat.com>
	Date:   Fri Nov 14 10:39:26 2008 +1100

	    CRED: Differentiate objective and effective subjective credentials on a task

The problem is that the above patch allows a process to have two sets of
credentials, and for the most part uses the subjective credentials when
accessing current's creds.

There is, however, one exception: cap_capable(), and thus capable(), uses the
real/objective credentials of the target task, whether or not it is the current
task.

Ordinarily this doesn't matter, since usually the two cred pointers in current
point to the same set of creds.  However, sys_faccessat() makes use of this
facility to override the credentials of the calling process to make its test,
without affecting the creds as seen from other processes.

One of the things sys_faccessat() does is to make an adjustment to the
effective capabilities mask, which cap_capable(), as it stands, then ignores.

The affected capability check is in generic_permission():

	if (!(mask & MAY_EXEC) || execute_ok(inode))
		if (capable(CAP_DAC_OVERRIDE))
			return 0;

This change passes the set of credentials to be tested down into the commoncap
and SELinux code.  The security functions called by capable() and
has_capability() select the appropriate set of credentials from the process
being checked.

This can be tested by compiling the following program from the XFS testsuite:

/*
 *  t_access_root.c - trivial test program to show permission bug.
 *
 *  Written by Michael Kerrisk - copyright ownership not pursued.
 *  Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
 */
#include <limits.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>

#define UID 500
#define GID 100
#define PERM 0
#define TESTPATH "/tmp/t_access"

static void
errExit(char *msg)
{
    perror(msg);
    exit(EXIT_FAILURE);
} /* errExit */

static void
accessTest(char *file, int mask, char *mstr)
{
    printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
} /* accessTest */

int
main(int argc, char *argv[])
{
    int fd, perm, uid, gid;
    char *testpath;
    char cmd[PATH_MAX + 20];

    testpath = (argc > 1) ? argv[1] : TESTPATH;
    perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
    uid = (argc > 3) ? atoi(argv[3]) : UID;
    gid = (argc > 4) ? atoi(argv[4]) : GID;

    unlink(testpath);

    fd = open(testpath, O_RDWR | O_CREAT, 0);
    if (fd == -1) errExit("open");

    if (fchown(fd, uid, gid) == -1) errExit("fchown");
    if (fchmod(fd, perm) == -1) errExit("fchmod");
    close(fd);

    snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
    system(cmd);

    if (seteuid(uid) == -1) errExit("seteuid");

    accessTest(testpath, 0, "0");
    accessTest(testpath, R_OK, "R_OK");
    accessTest(testpath, W_OK, "W_OK");
    accessTest(testpath, X_OK, "X_OK");
    accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
    accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
    accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
    accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");

    exit(EXIT_SUCCESS);
} /* main */

This can be run against an Ext3 filesystem as well as against an XFS
filesystem.  If successful, it will show:

	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
	---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
	access(/tmp/xxx, 0) returns 0
	access(/tmp/xxx, R_OK) returns 0
	access(/tmp/xxx, W_OK) returns 0
	access(/tmp/xxx, X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK) returns 0
	access(/tmp/xxx, R_OK | X_OK) returns -1
	access(/tmp/xxx, W_OK | X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

If unsuccessful, it will show:

	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
	---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
	access(/tmp/xxx, 0) returns 0
	access(/tmp/xxx, R_OK) returns -1
	access(/tmp/xxx, W_OK) returns -1
	access(/tmp/xxx, X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK) returns -1
	access(/tmp/xxx, R_OK | X_OK) returns -1
	access(/tmp/xxx, W_OK | X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

I've also tested the fix with the SELinux and syscalls LTP testsuites.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-07 09:38:48 +11:00
James Morris 29881c4502 Revert "CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]"
This reverts commit 14eaddc967.

David has a better version to come.
2009-01-07 09:21:54 +11:00
Linus Torvalds 520c853466 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  inotify: fix type errors in interfaces
  fix breakage in reiserfs_new_inode()
  fix the treatment of jfs special inodes
  vfs: remove duplicate code in get_fs_type()
  add a vfs_fsync helper
  sys_execve and sys_uselib do not call into fsnotify
  zero i_uid/i_gid on inode allocation
  inode->i_op is never NULL
  ntfs: don't NULL i_op
  isofs check for NULL ->i_op in root directory is dead code
  affs: do not zero ->i_op
  kill suid bit only for regular files
  vfs: lseek(fd, 0, SEEK_CUR) race condition
2009-01-05 18:32:06 -08:00
Al Viro 56ff5efad9 zero i_uid/i_gid on inode allocation
... and don't bother in callers.  Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.

i_mode is not worth the effort; it has no common default value.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-05 11:54:28 -05:00
Eric Paris 76f7ba35d4 SELinux: shrink sizeof av_inhert selinux_class_perm and context
I started playing with pahole today and decided to put it against the
selinux structures.  Found we could save a little bit of space on x86_64
(and no harm on i686) just reorganizing some structs.

Object size changes:
av_inherit: 24 -> 16
selinux_class_perm: 48 -> 40
context: 80 -> 72

Admittedly there aren't many of av_inherit or selinux_class_perm's in
the kernel (33 and 1 respectively) But the change to the size of struct
context reverberate out a bit.  I can get some hard number if they are
needed, but I don't see why they would be.  We do change which cacheline
context->len and context->str would be on, but I don't see that as a
problem since we are clearly going to have to load both if the context
is to be of any value.  I've run with the patch and don't seem to be
having any problems.

An example of what's going on using struct av_inherit would be:

form: to:
struct av_inherit {			struct av_inherit {
	u16 tclass;				const char **common_pts;
	const char **common_pts;		u32 common_base;
	u32 common_base;			u16 tclass;
};

(notice all I did was move u16 tclass to the end of the struct instead
of the beginning)

Memory layout before the change:
struct av_inherit {
	u16 tclass; /* 2 */
	/* 6 bytes hole */
	const char** common_pts; /* 8 */
	u32 common_base; /* 4 */
	/* 4 byes padding */

	/* size: 24, cachelines: 1 */
	/* sum members: 14, holes: 1, sum holes: 6 */
	/* padding: 4 */
};

Memory layout after the change:
struct av_inherit {
	const char ** common_pts; /* 8 */
	u32 common_base; /* 4 */
	u16 tclass; /* 2 */
	/* 2 bytes padding */

	/* size: 16, cachelines: 1 */
	/* sum members: 14, holes: 0, sum holes: 0 */
	/* padding: 2 */
};

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-05 19:19:55 +11:00
David Howells 14eaddc967 CRED: Fix regression in cap_capable() as shown up by sys_faccessat() [ver #2]
Fix a regression in cap_capable() due to:

	commit 5ff7711e635b32f0a1e558227d030c7e45b4a465
	Author: David Howells <dhowells@redhat.com>
	Date:   Wed Dec 31 02:52:28 2008 +0000

	    CRED: Differentiate objective and effective subjective credentials on a task

The problem is that the above patch allows a process to have two sets of
credentials, and for the most part uses the subjective credentials when
accessing current's creds.

There is, however, one exception: cap_capable(), and thus capable(), uses the
real/objective credentials of the target task, whether or not it is the current
task.

Ordinarily this doesn't matter, since usually the two cred pointers in current
point to the same set of creds.  However, sys_faccessat() makes use of this
facility to override the credentials of the calling process to make its test,
without affecting the creds as seen from other processes.

One of the things sys_faccessat() does is to make an adjustment to the
effective capabilities mask, which cap_capable(), as it stands, then ignores.

The affected capability check is in generic_permission():

	if (!(mask & MAY_EXEC) || execute_ok(inode))
		if (capable(CAP_DAC_OVERRIDE))
			return 0;

This change splits capable() from has_capability() down into the commoncap and
SELinux code.  The capable() security op now only deals with the current
process, and uses the current process's subjective creds.  A new security op -
task_capable() - is introduced that can check any task's objective creds.

strictly the capable() security op is superfluous with the presence of the
task_capable() op, however it should be faster to call the capable() op since
two fewer arguments need be passed down through the various layers.

This can be tested by compiling the following program from the XFS testsuite:

/*
 *  t_access_root.c - trivial test program to show permission bug.
 *
 *  Written by Michael Kerrisk - copyright ownership not pursued.
 *  Sourced from: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-10/6030.html
 */
#include <limits.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>

#define UID 500
#define GID 100
#define PERM 0
#define TESTPATH "/tmp/t_access"

static void
errExit(char *msg)
{
    perror(msg);
    exit(EXIT_FAILURE);
} /* errExit */

static void
accessTest(char *file, int mask, char *mstr)
{
    printf("access(%s, %s) returns %d\n", file, mstr, access(file, mask));
} /* accessTest */

int
main(int argc, char *argv[])
{
    int fd, perm, uid, gid;
    char *testpath;
    char cmd[PATH_MAX + 20];

    testpath = (argc > 1) ? argv[1] : TESTPATH;
    perm = (argc > 2) ? strtoul(argv[2], NULL, 8) : PERM;
    uid = (argc > 3) ? atoi(argv[3]) : UID;
    gid = (argc > 4) ? atoi(argv[4]) : GID;

    unlink(testpath);

    fd = open(testpath, O_RDWR | O_CREAT, 0);
    if (fd == -1) errExit("open");

    if (fchown(fd, uid, gid) == -1) errExit("fchown");
    if (fchmod(fd, perm) == -1) errExit("fchmod");
    close(fd);

    snprintf(cmd, sizeof(cmd), "ls -l %s", testpath);
    system(cmd);

    if (seteuid(uid) == -1) errExit("seteuid");

    accessTest(testpath, 0, "0");
    accessTest(testpath, R_OK, "R_OK");
    accessTest(testpath, W_OK, "W_OK");
    accessTest(testpath, X_OK, "X_OK");
    accessTest(testpath, R_OK | W_OK, "R_OK | W_OK");
    accessTest(testpath, R_OK | X_OK, "R_OK | X_OK");
    accessTest(testpath, W_OK | X_OK, "W_OK | X_OK");
    accessTest(testpath, R_OK | W_OK | X_OK, "R_OK | W_OK | X_OK");

    exit(EXIT_SUCCESS);
} /* main */

This can be run against an Ext3 filesystem as well as against an XFS
filesystem.  If successful, it will show:

	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
	---------- 1 dhowells dhowells 0 2008-12-31 03:00 /tmp/xxx
	access(/tmp/xxx, 0) returns 0
	access(/tmp/xxx, R_OK) returns 0
	access(/tmp/xxx, W_OK) returns 0
	access(/tmp/xxx, X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK) returns 0
	access(/tmp/xxx, R_OK | X_OK) returns -1
	access(/tmp/xxx, W_OK | X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

If unsuccessful, it will show:

	[root@andromeda src]# ./t_access_root /tmp/xxx 0 4043 4043
	---------- 1 dhowells dhowells 0 2008-12-31 02:56 /tmp/xxx
	access(/tmp/xxx, 0) returns 0
	access(/tmp/xxx, R_OK) returns -1
	access(/tmp/xxx, W_OK) returns -1
	access(/tmp/xxx, X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK) returns -1
	access(/tmp/xxx, R_OK | X_OK) returns -1
	access(/tmp/xxx, W_OK | X_OK) returns -1
	access(/tmp/xxx, R_OK | W_OK | X_OK) returns -1

I've also tested the fix with the SELinux and syscalls LTP testsuites.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-01-05 11:17:04 +11:00
Al Viro 5af75d8d58 audit: validate comparison operations, store them in sane form
Don't store the field->op in the messy (and very inconvenient for e.g.
audit_comparator()) form; translate to dense set of values and do full
validation of userland-submitted value while we are at it.

->audit_init_rule() and ->audit_match_rule() get new values now; in-tree
instances updated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-04 15:14:42 -05:00
Rusty Russell 4f4b6c1a94 cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits.: core
Impact: cleanup

In future, all cpumask ops will only be valid (in general) for bit
numbers < nr_cpu_ids.  So use that instead of NR_CPUS in iterators
and other comparisons.

This is always safe: no cpu number can be >= nr_cpu_ids, and
nr_cpu_ids is initialized to NR_CPUS at boot.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: James Morris <jmorris@namei.org>
Cc: Eric Biederman <ebiederm@xmission.com>
2009-01-01 10:12:15 +10:30
Paul Moore 277d342fc4 selinux: Deprecate and schedule the removal of the the compat_net functionality
This patch is the first step towards removing the old "compat_net" code from
the kernel.  Secmark, the "compat_net" replacement was first introduced in
2.6.18 (September 2006) and the major Linux distributions with SELinux support
have transitioned to Secmark so it is time to start deprecating the "compat_net"
mechanism.  Testing a patched version of 2.6.28-rc6 with the initial release of
Fedora Core 5 did not show any problems when running in enforcing mode.

This patch adds an entry to the feature-removal-schedule.txt file and removes
the SECURITY_SELINUX_ENABLE_SECMARK_DEFAULT configuration option, forcing
Secmark on by default although it can still be disabled at runtime.  The patch
also makes the Secmark permission checks "dynamic" in the sense that they are
only executed when Secmark is configured; this should help prevent problems
with older distributions that have not yet migrated to Secmark.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-12-31 12:54:11 -05:00
Linus Torvalds 0191b625ca Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
  net: Allow dependancies of FDDI & Tokenring to be modular.
  igb: Fix build warning when DCA is disabled.
  net: Fix warning fallout from recent NAPI interface changes.
  gro: Fix potential use after free
  sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
  sfc: When disabling the NIC, close the device rather than unregistering it
  sfc: SFT9001: Add cable diagnostics
  sfc: Add support for multiple PHY self-tests
  sfc: Merge top-level functions for self-tests
  sfc: Clean up PHY mode management in loopback self-test
  sfc: Fix unreliable link detection in some loopback modes
  sfc: Generate unique names for per-NIC workqueues
  802.3ad: use standard ethhdr instead of ad_header
  802.3ad: generalize out mac address initializer
  802.3ad: initialize ports LACPDU from const initializer
  802.3ad: remove typedef around ad_system
  802.3ad: turn ports is_individual into a bool
  802.3ad: turn ports is_enabled into a bool
  802.3ad: make ntt bool
  ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
  ...

Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
2008-12-28 12:49:40 -08:00
James Morris 7419224691 SELinux: don't check permissions for kernel mounts
Don't bother checking permissions when the kernel performs an
internal mount, as this should always be allowed.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2008-12-20 09:03:39 +11:00
James Morris 12204e24b1 security: pass mount flags to security_sb_kern_mount()
Pass mount flags to security_sb_kern_mount(), so security modules
can determine if a mount operation is being performed by the kernel.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2008-12-20 09:02:39 +11:00
Stephen Smalley 459c19f524 SELinux: correctly detect proc filesystems of the form "proc/foo"
Map all of these proc/ filesystem types to "proc" for the policy lookup at
filesystem mount time.

Signed-off-by: James Morris <jmorris@namei.org>
2008-12-20 09:01:03 +11:00
David Howells 3a3b7ce933 CRED: Allow kernel services to override LSM settings for task actions
Allow kernel services to override LSM settings appropriate to the actions
performed by a task by duplicating a set of credentials, modifying it and then
using task_struct::cred to point to it when performing operations on behalf of
a task.

This is used, for example, by CacheFiles which has to transparently access the
cache on behalf of a process that thinks it is doing, say, NFS accesses with a
potentially inappropriate (with respect to accessing the cache) set of
credentials.

This patch provides two LSM hooks for modifying a task security record:

 (*) security_kernel_act_as() which allows modification of the security datum
     with which a task acts on other objects (most notably files).

 (*) security_kernel_create_files_as() which allows modification of the
     security datum that is used to initialise the security data on a file that
     a task creates.

The patch also provides four new credentials handling functions, which wrap the
LSM functions:

 (1) prepare_kernel_cred()

     Prepare a set of credentials for a kernel service to use, based either on
     a daemon's credentials or on init_cred.  All the keyrings are cleared.

 (2) set_security_override()

     Set the LSM security ID in a set of credentials to a specific security
     context, assuming permission from the LSM policy.

 (3) set_security_override_from_ctx()

     As (2), but takes the security context as a string.

 (4) set_create_files_as()

     Set the file creation LSM security ID in a set of credentials to be the
     same as that on a particular inode.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> [Smack changes]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:28 +11:00
David Howells 1bfdc75ae0 CRED: Add a kernel_service object class to SELinux
Add a 'kernel_service' object class to SELinux and give this object class two
access vectors: 'use_as_override' and 'create_files_as'.

The first vector is used to grant a process the right to nominate an alternate
process security ID for the kernel to use as an override for the SELinux
subjective security when accessing stuff on behalf of another process.

For example, CacheFiles when accessing the cache on behalf on a process
accessing an NFS file needs to use a subjective security ID appropriate to the
cache rather then the one the calling process is using.  The cachefilesd
daemon will nominate the security ID to be used.

The second vector is used to grant a process the right to nominate a file
creation label for a kernel service to use.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:27 +11:00
David Howells 3b11a1dece CRED: Differentiate objective and effective subjective credentials on a task
Differentiate the objective and real subjective credentials from the effective
subjective credentials on a task by introducing a second credentials pointer
into the task_struct.

task_struct::real_cred then refers to the objective and apparent real
subjective credentials of a task, as perceived by the other tasks in the
system.

task_struct::cred then refers to the effective subjective credentials of a
task, as used by that task when it's actually running.  These are not visible
to the other tasks in the system.

__task_cred(task) then refers to the objective/real credentials of the task in
question.

current_cred() refers to the effective subjective credentials of the current
task.

prepare_creds() uses the objective creds as a base and commit_creds() changes
both pointers in the task_struct (indeed commit_creds() requires them to be the
same).

override_creds() and revert_creds() change the subjective creds pointer only,
and the former returns the old subjective creds.  These are used by NFSD,
faccessat() and do_coredump(), and will by used by CacheFiles.

In SELinux, current_has_perm() is provided as an alternative to
task_has_perm().  This uses the effective subjective context of current,
whereas task_has_perm() uses the objective/real context of the subject.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:26 +11:00
David Howells a6f76f23d2 CRED: Make execve() take advantage of copy-on-write credentials
Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

 (1) execve().

     The credential bits from struct linux_binprm are, for the most part,
     replaced with a single credentials pointer (bprm->cred).  This means that
     all the creds can be calculated in advance and then applied at the point
     of no return with no possibility of failure.

     I would like to replace bprm->cap_effective with:

	cap_isclear(bprm->cap_effective)

     but this seems impossible due to special behaviour for processes of pid 1
     (they always retain their parent's capability masks where normally they'd
     be changed - see cap_bprm_set_creds()).

     The following sequence of events now happens:

     (a) At the start of do_execve, the current task's cred_exec_mutex is
     	 locked to prevent PTRACE_ATTACH from obsoleting the calculation of
     	 creds that we make.

     (a) prepare_exec_creds() is then called to make a copy of the current
     	 task's credentials and prepare it.  This copy is then assigned to
     	 bprm->cred.

  	 This renders security_bprm_alloc() and security_bprm_free()
     	 unnecessary, and so they've been removed.

     (b) The determination of unsafe execution is now performed immediately
     	 after (a) rather than later on in the code.  The result is stored in
     	 bprm->unsafe for future reference.

     (c) prepare_binprm() is called, possibly multiple times.

     	 (i) This applies the result of set[ug]id binaries to the new creds
     	     attached to bprm->cred.  Personality bit clearance is recorded,
     	     but now deferred on the basis that the exec procedure may yet
     	     fail.

         (ii) This then calls the new security_bprm_set_creds().  This should
	     calculate the new LSM and capability credentials into *bprm->cred.

	     This folds together security_bprm_set() and parts of
	     security_bprm_apply_creds() (these two have been removed).
	     Anything that might fail must be done at this point.

         (iii) bprm->cred_prepared is set to 1.

	     bprm->cred_prepared is 0 on the first pass of the security
	     calculations, and 1 on all subsequent passes.  This allows SELinux
	     in (ii) to base its calculations only on the initial script and
	     not on the interpreter.

     (d) flush_old_exec() is called to commit the task to execution.  This
     	 performs the following steps with regard to credentials:

	 (i) Clear pdeath_signal and set dumpable on certain circumstances that
	     may not be covered by commit_creds().

         (ii) Clear any bits in current->personality that were deferred from
             (c.i).

     (e) install_exec_creds() [compute_creds() as was] is called to install the
     	 new credentials.  This performs the following steps with regard to
     	 credentials:

         (i) Calls security_bprm_committing_creds() to apply any security
             requirements, such as flushing unauthorised files in SELinux, that
             must be done before the credentials are changed.

	     This is made up of bits of security_bprm_apply_creds() and
	     security_bprm_post_apply_creds(), both of which have been removed.
	     This function is not allowed to fail; anything that might fail
	     must have been done in (c.ii).

         (ii) Calls commit_creds() to apply the new credentials in a single
             assignment (more or less).  Possibly pdeath_signal and dumpable
             should be part of struct creds.

	 (iii) Unlocks the task's cred_replace_mutex, thus allowing
	     PTRACE_ATTACH to take place.

         (iv) Clears The bprm->cred pointer as the credentials it was holding
             are now immutable.

         (v) Calls security_bprm_committed_creds() to apply any security
             alterations that must be done after the creds have been changed.
             SELinux uses this to flush signals and signal handlers.

     (f) If an error occurs before (d.i), bprm_free() will call abort_creds()
     	 to destroy the proposed new credentials and will then unlock
     	 cred_replace_mutex.  No changes to the credentials will have been
     	 made.

 (2) LSM interface.

     A number of functions have been changed, added or removed:

     (*) security_bprm_alloc(), ->bprm_alloc_security()
     (*) security_bprm_free(), ->bprm_free_security()

     	 Removed in favour of preparing new credentials and modifying those.

     (*) security_bprm_apply_creds(), ->bprm_apply_creds()
     (*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()

     	 Removed; split between security_bprm_set_creds(),
     	 security_bprm_committing_creds() and security_bprm_committed_creds().

     (*) security_bprm_set(), ->bprm_set_security()

     	 Removed; folded into security_bprm_set_creds().

     (*) security_bprm_set_creds(), ->bprm_set_creds()

     	 New.  The new credentials in bprm->creds should be checked and set up
     	 as appropriate.  bprm->cred_prepared is 0 on the first call, 1 on the
     	 second and subsequent calls.

     (*) security_bprm_committing_creds(), ->bprm_committing_creds()
     (*) security_bprm_committed_creds(), ->bprm_committed_creds()

     	 New.  Apply the security effects of the new credentials.  This
     	 includes closing unauthorised files in SELinux.  This function may not
     	 fail.  When the former is called, the creds haven't yet been applied
     	 to the process; when the latter is called, they have.

 	 The former may access bprm->cred, the latter may not.

 (3) SELinux.

     SELinux has a number of changes, in addition to those to support the LSM
     interface changes mentioned above:

     (a) The bprm_security_struct struct has been removed in favour of using
     	 the credentials-under-construction approach.

     (c) flush_unauthorized_files() now takes a cred pointer and passes it on
     	 to inode_has_perm(), file_has_perm() and dentry_open().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:24 +11:00
David Howells d84f4f992c CRED: Inaugurate COW credentials
Inaugurate copy-on-write credentials management.  This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.

A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().

With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:

	struct cred *new = prepare_creds();
	int ret = blah(new);
	if (ret < 0) {
		abort_creds(new);
		return ret;
	}
	return commit_creds(new);

There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.

To help enforce this, various pointers to sets of credentials, such as those in
the task_struct, are declared const.  The purpose of this is compile-time
discouragement of altering credentials through those pointers.  Once a set of
credentials has been made public through one of these pointers, it may not be
modified, except under special circumstances:

  (1) Its reference count may incremented and decremented.

  (2) The keyrings to which it points may be modified, but not replaced.

The only safe way to modify anything else is to create a replacement and commit
using the functions described in Documentation/credentials.txt (which will be
added by a later patch).

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

 (1) execve().

     This now prepares and commits credentials in various places in the
     security code rather than altering the current creds directly.

 (2) Temporary credential overrides.

     do_coredump() and sys_faccessat() now prepare their own credentials and
     temporarily override the ones currently on the acting thread, whilst
     preventing interference from other threads by holding cred_replace_mutex
     on the thread being dumped.

     This will be replaced in a future patch by something that hands down the
     credentials directly to the functions being called, rather than altering
     the task's objective credentials.

 (3) LSM interface.

     A number of functions have been changed, added or removed:

     (*) security_capset_check(), ->capset_check()
     (*) security_capset_set(), ->capset_set()

     	 Removed in favour of security_capset().

     (*) security_capset(), ->capset()

     	 New.  This is passed a pointer to the new creds, a pointer to the old
     	 creds and the proposed capability sets.  It should fill in the new
     	 creds or return an error.  All pointers, barring the pointer to the
     	 new creds, are now const.

     (*) security_bprm_apply_creds(), ->bprm_apply_creds()

     	 Changed; now returns a value, which will cause the process to be
     	 killed if it's an error.

     (*) security_task_alloc(), ->task_alloc_security()

     	 Removed in favour of security_prepare_creds().

     (*) security_cred_free(), ->cred_free()

     	 New.  Free security data attached to cred->security.

     (*) security_prepare_creds(), ->cred_prepare()

     	 New. Duplicate any security data attached to cred->security.

     (*) security_commit_creds(), ->cred_commit()

     	 New. Apply any security effects for the upcoming installation of new
     	 security by commit_creds().

     (*) security_task_post_setuid(), ->task_post_setuid()

     	 Removed in favour of security_task_fix_setuid().

     (*) security_task_fix_setuid(), ->task_fix_setuid()

     	 Fix up the proposed new credentials for setuid().  This is used by
     	 cap_set_fix_setuid() to implicitly adjust capabilities in line with
     	 setuid() changes.  Changes are made to the new credentials, rather
     	 than the task itself as in security_task_post_setuid().

     (*) security_task_reparent_to_init(), ->task_reparent_to_init()

     	 Removed.  Instead the task being reparented to init is referred
     	 directly to init's credentials.

	 NOTE!  This results in the loss of some state: SELinux's osid no
	 longer records the sid of the thread that forked it.

     (*) security_key_alloc(), ->key_alloc()
     (*) security_key_permission(), ->key_permission()

     	 Changed.  These now take cred pointers rather than task pointers to
     	 refer to the security context.

 (4) sys_capset().

     This has been simplified and uses less locking.  The LSM functions it
     calls have been merged.

 (5) reparent_to_kthreadd().

     This gives the current thread the same credentials as init by simply using
     commit_thread() to point that way.

 (6) __sigqueue_alloc() and switch_uid()

     __sigqueue_alloc() can't stop the target task from changing its creds
     beneath it, so this function gets a reference to the currently applicable
     user_struct which it then passes into the sigqueue struct it returns if
     successful.

     switch_uid() is now called from commit_creds(), and possibly should be
     folded into that.  commit_creds() should take care of protecting
     __sigqueue_alloc().

 (7) [sg]et[ug]id() and co and [sg]et_current_groups.

     The set functions now all use prepare_creds(), commit_creds() and
     abort_creds() to build and check a new set of credentials before applying
     it.

     security_task_set[ug]id() is called inside the prepared section.  This
     guarantees that nothing else will affect the creds until we've finished.

     The calling of set_dumpable() has been moved into commit_creds().

     Much of the functionality of set_user() has been moved into
     commit_creds().

     The get functions all simply access the data directly.

 (8) security_task_prctl() and cap_task_prctl().

     security_task_prctl() has been modified to return -ENOSYS if it doesn't
     want to handle a function, or otherwise return the return value directly
     rather than through an argument.

     Additionally, cap_task_prctl() now prepares a new set of credentials, even
     if it doesn't end up using it.

 (9) Keyrings.

     A number of changes have been made to the keyrings code:

     (a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
     	 all been dropped and built in to the credentials functions directly.
     	 They may want separating out again later.

     (b) key_alloc() and search_process_keyrings() now take a cred pointer
     	 rather than a task pointer to specify the security context.

     (c) copy_creds() gives a new thread within the same thread group a new
     	 thread keyring if its parent had one, otherwise it discards the thread
     	 keyring.

     (d) The authorisation key now points directly to the credentials to extend
     	 the search into rather pointing to the task that carries them.

     (e) Installing thread, process or session keyrings causes a new set of
     	 credentials to be created, even though it's not strictly necessary for
     	 process or session keyrings (they're shared).

(10) Usermode helper.

     The usermode helper code now carries a cred struct pointer in its
     subprocess_info struct instead of a new session keyring pointer.  This set
     of credentials is derived from init_cred and installed on the new process
     after it has been cloned.

     call_usermodehelper_setup() allocates the new credentials and
     call_usermodehelper_freeinfo() discards them if they haven't been used.  A
     special cred function (prepare_usermodeinfo_creds()) is provided
     specifically for call_usermodehelper_setup() to call.

     call_usermodehelper_setkeys() adjusts the credentials to sport the
     supplied keyring as the new session keyring.

(11) SELinux.

     SELinux has a number of changes, in addition to those to support the LSM
     interface changes mentioned above:

     (a) selinux_setprocattr() no longer does its check for whether the
     	 current ptracer can access processes with the new SID inside the lock
     	 that covers getting the ptracer's SID.  Whilst this lock ensures that
     	 the check is done with the ptracer pinned, the result is only valid
     	 until the lock is released, so there's no point doing it inside the
     	 lock.

(12) is_single_threaded().

     This function has been extracted from selinux_setprocattr() and put into
     a file of its own in the lib/ directory as join_session_keyring() now
     wants to use it too.

     The code in SELinux just checked to see whether a task shared mm_structs
     with other tasks (CLONE_VM), but that isn't good enough.  We really want
     to know if they're part of the same thread group (CLONE_THREAD).

(13) nfsd.

     The NFS server daemon now has to use the COW credentials to set the
     credentials it is going to use.  It really needs to pass the credentials
     down to the functions it calls, but it can't do that until other patches
     in this series have been applied.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:23 +11:00
David Howells 745ca2475a CRED: Pass credentials through dentry_open()
Pass credentials through dentry_open() so that the COW creds patch can have
SELinux's flush_unauthorized_files() pass the appropriate creds back to itself
when it opens its null chardev.

The security_dentry_open() call also now takes a creds pointer, as does the
dentry_open hook in struct security_operations.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:22 +11:00
David Howells 88e67f3b88 CRED: Make inode_has_perm() and file_has_perm() take a cred pointer
Make inode_has_perm() and file_has_perm() take a cred pointer rather than a
task pointer.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:21 +11:00
David Howells 275bb41e9d CRED: Wrap access to SELinux's task SID
Wrap access to SELinux's task SID, using task_sid() and current_sid() as
appropriate.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:19 +11:00
David Howells c69e8d9c01 CRED: Use RCU to access another task's creds and to release a task's own creds
Use RCU to access another task's creds and to release a task's own creds.
This means that it will be possible for the credentials of a task to be
replaced without another task (a) requiring a full lock to read them, and (b)
seeing deallocated memory.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:19 +11:00
David Howells 86a264abe5 CRED: Wrap current->cred and a few other accessors
Wrap current->cred and a few other accessors to hide their actual
implementation.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:18 +11:00
David Howells f1752eec61 CRED: Detach the credentials from task_struct
Detach the credentials from task_struct, duplicating them in copy_process()
and releasing them in __put_task_struct().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:17 +11:00
David Howells b6dff3ec5e CRED: Separate task security context from task_struct
Separate the task security context from task_struct.  At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:16 +11:00
David Howells 15a2460ed0 CRED: Constify the kernel_cap_t arguments to the capset LSM hooks
Constify the kernel_cap_t arguments to the capset LSM hooks.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:15 +11:00
David Howells 1cdcbec1a3 CRED: Neuter sys_capset()
Take away the ability for sys_capset() to affect processes other than current.

This means that current will not need to lock its own credentials when reading
them against interference by other processes.

This has effectively been the case for a while anyway, since:

 (1) Without LSM enabled, sys_capset() is disallowed.

 (2) With file-based capabilities, sys_capset() is neutered.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:14 +11:00
Eric Paris 066746796b Currently SELinux jumps through some ugly hoops to not audit a capbility
check when determining if a process has additional powers to override
memory limits or when trying to read/write illegal file labels.  Use
the new noaudit call instead.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-11 22:02:57 +11:00
Eric Paris 06112163f5 Add a new capable interface that will be used by systems that use audit to
make an A or B type decision instead of a security decision.  Currently
this is the case at least for filesystems when deciding if a process can use
the reserved 'root' blocks and for the case of things like the oom
algorithm determining if processes are root processes and should be less
likely to be killed.  These types of security system requests should not be
audited or logged since they are not really security decisions.  It would be
possible to solve this problem like the vm_enough_memory security check did
by creating a new LSM interface and moving all of the policy into that
interface but proves the needlessly bloat the LSM and provide complex
indirection.

This merely allows those decisions to be made where they belong and to not
flood logs or printk with denials for thing that are not security decisions.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-11 22:02:50 +11:00
Eric Paris 39c9aede2b SELinux: Use unknown perm handling to handle unknown netlink msg types
Currently when SELinux has not been updated to handle a netlink message
type the operation is denied with EINVAL.  This patch will leave the
audit/warning message so things get fixed but if policy chose to allow
unknowns this will allow the netlink operation.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-09 07:33:18 +08:00
David S. Miller 9eeda9abd1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/ath5k/base.c
	net/8021q/vlan_core.c
2008-11-06 22:43:03 -08:00
James Morris e21e696edb Merge branch 'master' into next 2008-11-06 07:12:34 +08:00
Michal Schmidt 2f99db28af selinux: recognize netlink messages for 'ip addrlabel'
In enforcing mode '/sbin/ip addrlabel' results in a SELinux error:
type=SELINUX_ERR msg=audit(1225698822.073:42): SELinux:  unrecognized
netlink message type=74 for sclass=43

The problem is missing RTM_*ADDRLABEL entries in SELinux's netlink
message types table.

Reported in https://bugzilla.redhat.com/show_bug.cgi?id=469423

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-06 07:08:36 +08:00
Eric Paris 41d9f9c524 SELinux: hold tasklist_lock and siglock while waking wait_chldexit
SELinux has long been calling wake_up_interruptible() on
current->parent->signal->wait_chldexit without holding any locks.  It
appears that this operation should hold the tasklist_lock to dereference
current->parent and we should hold the siglock when waking up the
signal->wait_chldexit.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-05 08:44:11 +11:00
Eric Paris 37dd0bd04a SELinux: properly handle empty tty_files list
SELinux has wrongly (since 2004) had an incorrect test for an empty
tty->tty_files list.  With an empty list selinux would be pointing to part
of the tty struct itself and would then proceed to dereference that value
and again dereference that result.  An F10 change to plymouth on a ppc64
system is actually currently triggering this bug.  This patch uses
list_empty() to handle empty lists rather than looking at a meaningless
location.

[note, this fixes the oops reported in
https://bugzilla.redhat.com/show_bug.cgi?id=469079]

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-01 09:38:48 +11:00
Harvey Harrison 3685f25de1 misc: replace NIPQUAD()
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:56:49 -07:00
Eric Paris 8b6a5a37f8 SELinux: check open perms in dentry_open not inode_permission
Some operations, like searching a directory path or connecting a unix domain
socket, make explicit calls into inode_permission.  Our choices are to
either try to come up with a signature for all of the explicit calls to
inode_permission and do not check open on those, or to move the open checks to
dentry_open where we know this is always an open operation.  This patch moves
the checks to dentry_open.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-10-31 02:00:52 +11:00
Harvey Harrison 5b095d9892 net: replace %p6 with %pI6
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 12:52:50 -07:00
Harvey Harrison 1afa67f5e7 misc: replace NIP6_FMT with %p6 format specifier
The iscsi_ibft.c changes are almost certainly a bugfix as the
pointer 'ip' is a u8 *, so they never print the last 8 bytes
of the IPv6 address, and the eight bytes they do print have
a zero byte with them in each 16-bit word.

Other than that, this should cause no difference in functionality.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 16:06:44 -07:00
Alexey Dobriyan def8b4faff net: reduce structures when XFRM=n
ifdef out
* struct sk_buff::sp		(pointer)
* struct dst_entry::xfrm	(pointer)
* struct sock::sk_policy	(2 pointers)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 13:24:06 -07:00
Thomas Gleixner c465a76af6 Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus 2008-10-20 13:14:06 +02:00
Steven Whitehouse a447c09324 vfs: Use const for kernel parser table
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 10:10:37 -07:00
Linus Torvalds 8d71ff0bef Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (24 commits)
  integrity: special fs magic
  As pointed out by Jonathan Corbet, the timer must be deleted before
  ERROR: code indent should use tabs where possible
  The tpm_dev_release function is only called for platform devices, not pnp
  Protect tpm_chip_list when transversing it.
  Renames num_open to is_open, as only one process can open the file at a time.
  Remove the BKL calls from the TPM driver, which were added in the overall
  netlabel: Add configuration support for local labeling
  cipso: Add support for native local labeling and fixup mapping names
  netlabel: Changes to the NetLabel security attributes to allow LSMs to pass full contexts
  selinux: Cache NetLabel secattrs in the socket's security struct
  selinux: Set socket NetLabel based on connection endpoint
  netlabel: Add functionality to set the security attributes of a packet
  netlabel: Add network address selectors to the NetLabel/LSM domain mapping
  netlabel: Add a generic way to create ordered linked lists of network addrs
  netlabel: Replace protocol/NetLabel linking with refrerence counts
  smack: Fix missing calls to netlbl_skbuff_err()
  selinux: Fix missing calls to netlbl_skbuff_err()
  selinux: Fix a problem in security_netlbl_sid_to_secattr()
  selinux: Better local/forward check in selinux_ip_postroute()
  ...
2008-10-13 10:00:44 -07:00
Alan Cox 934e6ebf96 tty: Redo current tty locking
Currently it is sometimes locked by the tty mutex and sometimes by the
sighand lock. The latter is in fact correct and now we can hand back referenced
objects we can fix this up without problems around sleeping functions.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 09:51:41 -07:00
Alan Cox 452a00d2ee tty: Make get_current_tty use a kref
We now return a kref covered tty reference. That ensures the tty structure
doesn't go away when you have a return from get_current_tty. This is not
enough to protect you from most of the resources being freed behind your
back - yet.

[Updated to include fixes for SELinux problems found by Andrew Morton and
 an s390 leak found while debugging the former]

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 09:51:41 -07:00
James Morris 0da939b005 Merge branch 'master' of git://git.infradead.org/users/pcmoore/lblnet-2.6_next into next 2008-10-11 09:26:14 +11:00
Paul Moore 8d75899d03 netlabel: Changes to the NetLabel security attributes to allow LSMs to pass full contexts
This patch provides support for including the LSM's secid in addition to
the LSM's MLS information in the NetLabel security attributes structure.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:33 -04:00
Paul Moore 6c5b3fc014 selinux: Cache NetLabel secattrs in the socket's security struct
Previous work enabled the use of address based NetLabel selectors, which
while highly useful, brought the potential for additional per-packet overhead
when used.  This patch attempts to mitigate some of that overhead by caching
the NetLabel security attribute struct within the SELinux socket security
structure.  This should help eliminate the need to recreate the NetLabel
secattr structure for each packet resulting in less overhead.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:33 -04:00
Paul Moore 014ab19a69 selinux: Set socket NetLabel based on connection endpoint
Previous work enabled the use of address based NetLabel selectors, which while
highly useful, brought the potential for additional per-packet overhead when
used.  This patch attempts to solve that by applying NetLabel socket labels
when sockets are connect()'d.  This should alleviate the per-packet NetLabel
labeling for all connected sockets (yes, it even works for connected DGRAM
sockets).

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:33 -04:00
Paul Moore 948bf85c1b netlabel: Add functionality to set the security attributes of a packet
This patch builds upon the new NetLabel address selector functionality by
providing the NetLabel KAPI and CIPSO engine support needed to enable the
new packet-based labeling.  The only new addition to the NetLabel KAPI at
this point is shown below:

 * int netlbl_skbuff_setattr(skb, family, secattr)

... and is designed to be called from a Netfilter hook after the packet's
IP header has been populated such as in the FORWARD or LOCAL_OUT hooks.

This patch also provides the necessary SELinux hooks to support this new
functionality.  Smack support is not currently included due to uncertainty
regarding the permissions needed to expand the Smack network access controls.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:32 -04:00
Paul Moore dfaebe9825 selinux: Fix missing calls to netlbl_skbuff_err()
At some point I think I messed up and dropped the calls to netlbl_skbuff_err()
which are necessary for CIPSO to send error notifications to remote systems.
This patch re-introduces the error handling calls into the SELinux code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:31 -04:00
Paul Moore 99d854d231 selinux: Fix a problem in security_netlbl_sid_to_secattr()
Currently when SELinux fails to allocate memory in
security_netlbl_sid_to_secattr() the NetLabel LSM domain field is set to
NULL which triggers the default NetLabel LSM domain mapping which may not
always be the desired mapping.  This patch fixes this by returning an error
when the kernel is unable to allocate memory.  This could result in more
failures on a system with heavy memory pressure but it is the "correct"
thing to do.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:30 -04:00
Paul Moore d8395c876b selinux: Better local/forward check in selinux_ip_postroute()
It turns out that checking to see if skb->sk is NULL is not a very good
indicator of a forwarded packet as some locally generated packets also have
skb->sk set to NULL.  Fix this by not only checking the skb->sk field but also
the IP[6]CB(skb)->flags field for the IP[6]SKB_FORWARDED flag.  While we are
at it, we are calling selinux_parse_skb() much earlier than we really should
resulting in potentially wasted cycles parsing packets for information we
might no use; so shuffle the code around a bit to fix this.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:30 -04:00
Paul Moore aa86290089 selinux: Correctly handle IPv4 packets on IPv6 sockets in all cases
We did the right thing in a few cases but there were several areas where we
determined a packet's address family based on the socket's address family which
is not the right thing to do since we can get IPv4 packets on IPv6 sockets.
This patch fixes these problems by either taking the address family directly
from the packet.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:29 -04:00
Paul Moore accc609322 selinux: Cleanup the NetLabel glue code
We were doing a lot of extra work in selinux_netlbl_sock_graft() what wasn't
necessary so this patch removes that code.  It also removes the redundant
second argument to selinux_netlbl_sock_setsid() which allows us to simplify a
few other functions.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:29 -04:00
Paul Moore 3040a6d5a2 selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()
At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field.  The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used.  This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.

Please apply this to the 2.6.27-rcX release stream.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-10-04 08:25:18 +10:00
Paul Moore 81990fbdd1 selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()
At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field.  The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used.  This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.

Please apply this to the 2.6.27-rcX release stream.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-10-04 08:18:18 +10:00
Stephen Smalley ea6b184f7d selinux: use default proc sid on symlinks
As we are not concerned with fine-grained control over reading of
symlinks in proc, always use the default proc SID for all proc symlinks.
This should help avoid permission issues upon changes to the proc tree
as in the /proc/net -> /proc/self/net example.
This does not alter labeling of symlinks within /proc/pid directories.
ls -Zd /proc/net output before and after the patch should show the difference.

Signed-off-by:  Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-09-30 00:26:53 +10:00
James Morris ab2b49518e Merge branch 'master' into next
Conflicts:

	MAINTAINERS

Thanks for breaking my tree :-)

Signed-off-by: James Morris <jmorris@namei.org>
2008-09-21 17:41:56 -07:00
Frank Mayhar f06febc96b timers: fix itimer/many thread hang
Overview

This patch reworks the handling of POSIX CPU timers, including the
ITIMER_PROF, ITIMER_VIRT timers and rlimit handling.  It was put together
with the help of Roland McGrath, the owner and original writer of this code.

The problem we ran into, and the reason for this rework, has to do with using
a profiling timer in a process with a large number of threads.  It appears
that the performance of the old implementation of run_posix_cpu_timers() was
at least O(n*3) (where "n" is the number of threads in a process) or worse.
Everything is fine with an increasing number of threads until the time taken
for that routine to run becomes the same as or greater than the tick time, at
which point things degrade rather quickly.

This patch fixes bug 9906, "Weird hang with NPTL and SIGPROF."

Code Changes

This rework corrects the implementation of run_posix_cpu_timers() to make it
run in constant time for a particular machine.  (Performance may vary between
one machine and another depending upon whether the kernel is built as single-
or multiprocessor and, in the latter case, depending upon the number of
running processors.)  To do this, at each tick we now update fields in
signal_struct as well as task_struct.  The run_posix_cpu_timers() function
uses those fields to make its decisions.

We define a new structure, "task_cputime," to contain user, system and
scheduler times and use these in appropriate places:

struct task_cputime {
	cputime_t utime;
	cputime_t stime;
	unsigned long long sum_exec_runtime;
};

This is included in the structure "thread_group_cputime," which is a new
substructure of signal_struct and which varies for uniprocessor versus
multiprocessor kernels.  For uniprocessor kernels, it uses "task_cputime" as
a simple substructure, while for multiprocessor kernels it is a pointer:

struct thread_group_cputime {
	struct task_cputime totals;
};

struct thread_group_cputime {
	struct task_cputime *totals;
};

We also add a new task_cputime substructure directly to signal_struct, to
cache the earliest expiration of process-wide timers, and task_cputime also
replaces the it_*_expires fields of task_struct (used for earliest expiration
of thread timers).  The "thread_group_cputime" structure contains process-wide
timers that are updated via account_user_time() and friends.  In the non-SMP
case the structure is a simple aggregator; unfortunately in the SMP case that
simplicity was not achievable due to cache-line contention between CPUs (in
one measured case performance was actually _worse_ on a 16-cpu system than
the same test on a 4-cpu system, due to this contention).  For SMP, the
thread_group_cputime counters are maintained as a per-cpu structure allocated
using alloc_percpu().  The timer functions update only the timer field in
the structure corresponding to the running CPU, obtained using per_cpu_ptr().

We define a set of inline functions in sched.h that we use to maintain the
thread_group_cputime structure and hide the differences between UP and SMP
implementations from the rest of the kernel.  The thread_group_cputime_init()
function initializes the thread_group_cputime structure for the given task.
The thread_group_cputime_alloc() is a no-op for UP; for SMP it calls the
out-of-line function thread_group_cputime_alloc_smp() to allocate and fill
in the per-cpu structures and fields.  The thread_group_cputime_free()
function, also a no-op for UP, in SMP frees the per-cpu structures.  The
thread_group_cputime_clone_thread() function (also a UP no-op) for SMP calls
thread_group_cputime_alloc() if the per-cpu structures haven't yet been
allocated.  The thread_group_cputime() function fills the task_cputime
structure it is passed with the contents of the thread_group_cputime fields;
in UP it's that simple but in SMP it must also safely check that tsk->signal
is non-NULL (if it is it just uses the appropriate fields of task_struct) and,
if so, sums the per-cpu values for each online CPU.  Finally, the three
functions account_group_user_time(), account_group_system_time() and
account_group_exec_runtime() are used by timer functions to update the
respective fields of the thread_group_cputime structure.

Non-SMP operation is trivial and will not be mentioned further.

The per-cpu structure is always allocated when a task creates its first new
thread, via a call to thread_group_cputime_clone_thread() from copy_signal().
It is freed at process exit via a call to thread_group_cputime_free() from
cleanup_signal().

All functions that formerly summed utime/stime/sum_sched_runtime values from
from all threads in the thread group now use thread_group_cputime() to
snapshot the values in the thread_group_cputime structure or the values in
the task structure itself if the per-cpu structure hasn't been allocated.

Finally, the code in kernel/posix-cpu-timers.c has changed quite a bit.
The run_posix_cpu_timers() function has been split into a fast path and a
slow path; the former safely checks whether there are any expired thread
timers and, if not, just returns, while the slow path does the heavy lifting.
With the dedicated thread group fields, timers are no longer "rebalanced" and
the process_timer_rebalance() function and related code has gone away.  All
summing loops are gone and all code that used them now uses the
thread_group_cputime() inline.  When process-wide timers are set, the new
task_cputime structure in signal_struct is used to cache the earliest
expiration; this is checked in the fast path.

Performance

The fix appears not to add significant overhead to existing operations.  It
generally performs the same as the current code except in two cases, one in
which it performs slightly worse (Case 5 below) and one in which it performs
very significantly better (Case 2 below).  Overall it's a wash except in those
two cases.

I've since done somewhat more involved testing on a dual-core Opteron system.

Case 1: With no itimer running, for a test with 100,000 threads, the fixed
	kernel took 1428.5 seconds, 513 seconds more than the unfixed system,
	all of which was spent in the system.  There were twice as many
	voluntary context switches with the fix as without it.

Case 2: With an itimer running at .01 second ticks and 4000 threads (the most
	an unmodified kernel can handle), the fixed kernel ran the test in
	eight percent of the time (5.8 seconds as opposed to 70 seconds) and
	had better tick accuracy (.012 seconds per tick as opposed to .023
	seconds per tick).

Case 3: A 4000-thread test with an initial timer tick of .01 second and an
	interval of 10,000 seconds (i.e. a timer that ticks only once) had
	very nearly the same performance in both cases:  6.3 seconds elapsed
	for the fixed kernel versus 5.5 seconds for the unfixed kernel.

With fewer threads (eight in these tests), the Case 1 test ran in essentially
the same time on both the modified and unmodified kernels (5.2 seconds versus
5.8 seconds).  The Case 2 test ran in about the same time as well, 5.9 seconds
versus 5.4 seconds but again with much better tick accuracy, .013 seconds per
tick versus .025 seconds per tick for the unmodified kernel.

Since the fix affected the rlimit code, I also tested soft and hard CPU limits.

Case 4: With a hard CPU limit of 20 seconds and eight threads (and an itimer
	running), the modified kernel was very slightly favored in that while
	it killed the process in 19.997 seconds of CPU time (5.002 seconds of
	wall time), only .003 seconds of that was system time, the rest was
	user time.  The unmodified kernel killed the process in 20.001 seconds
	of CPU (5.014 seconds of wall time) of which .016 seconds was system
	time.  Really, though, the results were too close to call.  The results
	were essentially the same with no itimer running.

Case 5: With a soft limit of 20 seconds and a hard limit of 2000 seconds
	(where the hard limit would never be reached) and an itimer running,
	the modified kernel exhibited worse tick accuracy than the unmodified
	kernel: .050 seconds/tick versus .028 seconds/tick.  Otherwise,
	performance was almost indistinguishable.  With no itimer running this
	test exhibited virtually identical behavior and times in both cases.

In times past I did some limited performance testing.  those results are below.

On a four-cpu Opteron system without this fix, a sixteen-thread test executed
in 3569.991 seconds, of which user was 3568.435s and system was 1.556s.  On
the same system with the fix, user and elapsed time were about the same, but
system time dropped to 0.007 seconds.  Performance with eight, four and one
thread were comparable.  Interestingly, the timer ticks with the fix seemed
more accurate:  The sixteen-thread test with the fix received 149543 ticks
for 0.024 seconds per tick, while the same test without the fix received 58720
for 0.061 seconds per tick.  Both cases were configured for an interval of
0.01 seconds.  Again, the other tests were comparable.  Each thread in this
test computed the primes up to 25,000,000.

I also did a test with a large number of threads, 100,000 threads, which is
impossible without the fix.  In this case each thread computed the primes only
up to 10,000 (to make the runtime manageable).  System time dominated, at
1546.968 seconds out of a total 2176.906 seconds (giving a user time of
629.938s).  It received 147651 ticks for 0.015 seconds per tick, still quite
accurate.  There is obviously no comparable test without the fix.

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-14 16:25:35 +02:00
Stephen Smalley f058925b20 Update selinux info in MAINTAINERS and Kconfig help text
Update the SELinux entry in MAINTAINERS and drop the obsolete information
from the selinux Kconfig help text.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-09-12 00:44:08 +10:00
Eric Paris 8e531af90f SELinux: memory leak in security_context_to_sid_core
Fix a bug and a philosophical decision about who handles errors.

security_context_to_sid_core() was leaking a context in the common case.
This was causing problems on fedora systems which recently have started
making extensive use of this function.

In discussion it was decided that if string_to_context_struct() had an
error it was its own responsibility to clean up any mess it created
along the way.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-09-04 08:35:13 +10:00
KaiGai Kohei d9250dea3f SELinux: add boundary support and thread context assignment
The purpose of this patch is to assign per-thread security context
under a constraint. It enables multi-threaded server application
to kick a request handler with its fair security context, and
helps some of userspace object managers to handle user's request.

When we assign a per-thread security context, it must not have wider
permissions than the original one. Because a multi-threaded process
shares a single local memory, an arbitary per-thread security context
also means another thread can easily refer violated information.

The constraint on a per-thread security context requires a new domain
has to be equal or weaker than its original one, when it tries to assign
a per-thread security context.

Bounds relationship between two types is a way to ensure a domain can
never have wider permission than its bounds. We can define it in two
explicit or implicit ways.

The first way is using new TYPEBOUNDS statement. It enables to define
a boundary of types explicitly. The other one expand the concept of
existing named based hierarchy. If we defines a type with "." separated
name like "httpd_t.php", toolchain implicitly set its bounds on "httpd_t".

This feature requires a new policy version.
The 24th version (POLICYDB_VERSION_BOUNDARY) enables to ship them into
kernel space, and the following patch enables to handle it.

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-29 00:33:33 +10:00
James Morris 86d688984d Merge branch 'master' into next 2008-08-28 10:47:34 +10:00
Vesa-Matti Kari dbc74c65b3 selinux: Unify for- and while-loop style
Replace "thing != NULL" comparisons with just "thing" to make
the code look more uniform (mixed styles were used even in the
same source file).

Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-15 08:40:47 +10:00
David Howells 5cd9c58fbe security: Fix setting of PF_SUPERPRIV by __capable()
Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
the target process if that is not the current process and it is trying to
change its own flags in a different way at the same time.

__capable() is using neither atomic ops nor locking to protect t->flags.  This
patch removes __capable() and introduces has_capability() that doesn't set
PF_SUPERPRIV on the process being queried.

This patch further splits security_ptrace() in two:

 (1) security_ptrace_may_access().  This passes judgement on whether one
     process may access another only (PTRACE_MODE_ATTACH for ptrace() and
     PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
     current is the parent.

 (2) security_ptrace_traceme().  This passes judgement on PTRACE_TRACEME only,
     and takes only a pointer to the parent process.  current is the child.

     In Smack and commoncap, this uses has_capability() to determine whether
     the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
     This does not set PF_SUPERPRIV.

Two of the instances of __capable() actually only act on current, and so have
been changed to calls to capable().

Of the places that were using __capable():

 (1) The OOM killer calls __capable() thrice when weighing the killability of a
     process.  All of these now use has_capability().

 (2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
     whether the parent was allowed to trace any process.  As mentioned above,
     these have been split.  For PTRACE_ATTACH and /proc, capable() is now
     used, and for PTRACE_TRACEME, has_capability() is used.

 (3) cap_safe_nice() only ever saw current, so now uses capable().

 (4) smack_setprocattr() rejected accesses to tasks other than current just
     after calling __capable(), so the order of these two tests have been
     switched and capable() is used instead.

 (5) In smack_file_send_sigiotask(), we need to allow privileged processes to
     receive SIGIO on files they're manipulating.

 (6) In smack_task_wait(), we let a process wait for a privileged process,
     whether or not the process doing the waiting is privileged.

I've tested this with the LTP SELinux and syscalls testscripts.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-14 22:59:43 +10:00
Vesa-Matti Kari 421fae06be selinux: conditional expression type validation was off-by-one
expr_isvalid() in conditional.c was off-by-one and allowed
invalid expression type COND_LAST. However, it is this header file
that needs to be fixed. That way the if-statement's disjunction's
second component reads more naturally, "if expr type is greater than
the last allowed value" ( rather than using ">=" in conditional.c):

  if (expr->expr_type <= 0 || expr->expr_type > COND_LAST)

Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-07 08:56:16 +10:00
David Howells cf9481e289 SELinux: Fix a potentially uninitialised variable in SELinux hooks
Fix a potentially uninitialised variable in SELinux hooks that's given a
pointer to the network address by selinux_parse_skb() passing a pointer back
through its argument list.  By restructuring selinux_parse_skb(), the compiler
can see that the error case need not set it as the caller will return
immediately.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-05 10:55:47 +10:00
Vesa-Matti J Kari 0c0e186f81 SELinux: trivial, remove unneeded local variable
Hello,

Remove unneeded local variable:

    struct avtab_node *newnode

Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-05 10:55:38 +10:00
Vesa-Matti J Kari df4ea865f0 SELinux: Trivial minor fixes that change C null character style
Trivial minor fixes that change C null character style.

Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-05 10:55:30 +10:00
Adrian Bunk 3583a71183 make selinux_write_opts() static
This patch makes the needlessly global selinux_write_opts() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-05 10:55:24 +10:00
Eric Paris 383795c206 SELinux: /proc/mounts should show what it can
Given a hosed SELinux config in which a system never loads policy or
disables SELinux we currently just return -EINVAL for anyone trying to
read /proc/mounts.  This is a configuration problem but we can certainly
be more graceful.  This patch just ignores -EINVAL when displaying LSM
options and causes /proc/mounts display everything else it can.  If
policy isn't loaded the obviously there are no options, so we aren't
really loosing any information here.

This is safe as the only other return of EINVAL comes from
security_sid_to_context_core() in the case of an invalid sid.  Even if a
FS was mounted with a now invalidated context that sid should have been
remapped to unlabeled and so we won't hit the EINVAL and will work like
we should.  (yes, I tested to make sure it worked like I thought)

Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-30 08:31:28 +10:00
Linus Torvalds 4836e30078 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (39 commits)
  [PATCH] fix RLIM_NOFILE handling
  [PATCH] get rid of corner case in dup3() entirely
  [PATCH] remove remaining namei_{32,64}.h crap
  [PATCH] get rid of indirect users of namei.h
  [PATCH] get rid of __user_path_lookup_open
  [PATCH] f_count may wrap around
  [PATCH] dup3 fix
  [PATCH] don't pass nameidata to __ncp_lookup_validate()
  [PATCH] don't pass nameidata to gfs2_lookupi()
  [PATCH] new (local) helper: user_path_parent()
  [PATCH] sanitize __user_walk_fd() et.al.
  [PATCH] preparation to __user_walk_fd cleanup
  [PATCH] kill nameidata passing to permission(), rename to inode_permission()
  [PATCH] take noexec checks to very few callers that care
  Re: [PATCH 3/6] vfs: open_exec cleanup
  [patch 4/4] vfs: immutable inode checking cleanup
  [patch 3/4] fat: dont call notify_change
  [patch 2/4] vfs: utimes cleanup
  [patch 1/4] vfs: utimes: move owner check into inode_change_ok()
  [PATCH] vfs: use kstrdup() and check failing allocation
  ...
2008-07-26 20:23:44 -07:00
Linus Torvalds 2284284281 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  netns: fix ip_rt_frag_needed rt_is_expired
  netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences
  netfilter: fix double-free and use-after free
  netfilter: arptables in netns for real
  netfilter: ip{,6}tables_security: fix future section mismatch
  selinux: use nf_register_hooks()
  netfilter: ebtables: use nf_register_hooks()
  Revert "pkt_sched: sch_sfq: dump a real number of flows"
  qeth: use dev->ml_priv instead of dev->priv
  syncookies: Make sure ECN is disabled
  net: drop unused BUG_TRAP()
  net: convert BUG_TRAP to generic WARN_ON
  drivers/net: convert BUG_TRAP to generic WARN_ON
2008-07-26 20:17:56 -07:00
Al Viro b77b0646ef [PATCH] pass MAY_OPEN to vfs_permission() explicitly
... and get rid of the last "let's deduce mask from nameidata->flags"
bit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:22 -04:00
Alexey Dobriyan 6c5a9d2e15 selinux: use nf_register_hooks()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-26 17:48:15 -07:00
Roland McGrath 0d094efeb1 tracehook: tracehook_tracer_task
This adds the tracehook_tracer_task() hook to consolidate all forms of
"Who is using ptrace on me?" logic.  This is used for "TracerPid:" in
/proc and for permission checks.  We also clean up the selinux code the
called an identical accessor.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:08 -07:00
James Morris 089be43e40 Revert "SELinux: allow fstype unknown to policy to use xattrs if present"
This reverts commit 811f379927.

From Eric Paris:

"Please drop this patch for now.  It deadlocks on ntfs-3g.  I need to
rework it to handle fuse filesystems better.  (casey was right)"
2008-07-15 18:32:49 +10:00
James Morris 6f0f0fd496 security: remove register_security hook
The register security hook is no longer required, as the capability
module is always registered.  LSMs wishing to stack capability as
a secondary module should do so explicitly.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-14 15:04:06 +10:00
Miklos Szeredi b478a9f988 security: remove unused sb_get_mnt_opts hook
The sb_get_mnt_opts() hook is unused, and is superseded by the
sb_show_options() hook.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: James Morris <jmorris@namei.org>
2008-07-14 15:02:05 +10:00
Eric Paris 2069f45784 LSM/SELinux: show LSM mount options in /proc/mounts
This patch causes SELinux mount options to show up in /proc/mounts.  As
with other code in the area seq_put errors are ignored.  Other LSM's
will not have their mount options displayed until they fill in their own
security_sb_show_options() function.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:02:05 +10:00
Eric Paris 811f379927 SELinux: allow fstype unknown to policy to use xattrs if present
Currently if a FS is mounted for which SELinux policy does not define an
fs_use_* that FS will either be genfs labeled or not labeled at all.
This decision is based on the existence of a genfscon rule in policy and
is irrespective of the capabilities of the filesystem itself.  This
patch allows the kernel to check if the filesystem supports security
xattrs and if so will use those if there is no fs_use_* rule in policy.
An fstype with a no fs_use_* rule but with a genfs rule will use xattrs
if available and will follow the genfs rule.

This can be particularly interesting for things like ecryptfs which
actually overlays a real underlying FS.  If we define excryptfs in
policy to use xattrs we will likely get this wrong at times, so with
this path we just don't need to define it!

Overlay ecryptfs on top of NFS with no xattr support:
SELinux: initialized (dev ecryptfs, type ecryptfs), uses genfs_contexts
Overlay ecryptfs on top of ext4 with xattr support:
SELinux: initialized (dev ecryptfs, type ecryptfs), uses xattr

It is also useful as the kernel adds new FS we don't need to add them in
policy if they support xattrs and that is how we want to handle them.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:02:04 +10:00
James Morris 2baf06df85 SELinux: use do_each_thread as a proper do/while block
Use do_each_thread as a proper do/while block.  Sparse complained.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2008-07-14 15:02:02 +10:00
James Morris e399f98224 SELinux: remove unused and shadowed addrlen variable
Remove unused and shadowed addrlen variable.  Picked up by sparse.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Paul Moore <paul.moore@hp.com>
2008-07-14 15:02:01 +10:00
Eric Paris 6cbe27061a SELinux: more user friendly unknown handling printk
I've gotten complaints and reports about people not understanding the
meaning of the current unknown class/perm handling the kernel emits on
every policy load.  Hopefully this will make make it clear to everyone
the meaning of the message and won't waste a printk the user won't care
about anyway on systems where the kernel and the policy agree on
everything.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:02:00 +10:00
Stephen Smalley 22df4adb04 selinux: change handling of invalid classes (Was: Re: 2.6.26-rc5-mm1 selinux whine)
On Mon, 2008-06-09 at 01:24 -0700, Andrew Morton wrote:
> Getting a few of these with FC5:
>
> SELinux: context_struct_compute_av:  unrecognized class 69
> SELinux: context_struct_compute_av:  unrecognized class 69
>
> one came out when I logged in.
>
> No other symptoms, yet.

Change handling of invalid classes by SELinux, reporting class values
unknown to the kernel as errors (w/ ratelimit applied) and handling
class values unknown to policy as normal denials.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:59 +10:00
Eric Paris 89abd0acf0 SELinux: drop load_mutex in security_load_policy
We used to protect against races of policy load in security_load_policy
by using the load_mutex.  Since then we have added a new mutex,
sel_mutex, in sel_write_load() which is always held across all calls to
security_load_policy we are covered and can safely just drop this one.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:58 +10:00
Eric Paris cea78dc4ca SELinux: fix off by 1 reference of class_to_string in context_struct_compute_av
The class_to_string array is referenced by tclass.  My code mistakenly
was using tclass - 1.  If the proceeding class is a userspace class
rather than kernel class this may cause a denial/EINVAL even if unknown
handling is set to allow.  The bug shouldn't be allowing excess
privileges since those are given based on the contents of another array
which should be correctly referenced.

At this point in time its pretty unlikely this is going to cause
problems.  The most recently added kernel classes which could be
affected are association, dccp_socket, and peer.  Its pretty unlikely
any policy with handle_unknown=allow doesn't have association and
dccp_socket undefined (they've been around longer than unknown handling)
and peer is conditionalized on a policy cap which should only be defined
if that class exists in policy.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:58 +10:00
James Morris bdd581c143 SELinux: open code sidtab lock
Open code sidtab lock to make Andrew Morton happy.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2008-07-14 15:01:57 +10:00
James Morris 972ccac2b2 SELinux: open code load_mutex
Open code load_mutex as suggested by Andrew Morton.

Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:56 +10:00
James Morris 0804d1133c SELinux: open code policy_rwlock
Open code policy_rwlock, as suggested by Andrew Morton.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
2008-07-14 15:01:55 +10:00
Stephen Smalley 59dbd1ba98 selinux: fix endianness bug in network node address handling
Fix an endianness bug in the handling of network node addresses by
SELinux.  This yields no change on little endian hardware but fixes
the incorrect handling on big endian hardware.  The network node
addresses are stored in network order in memory by checkpolicy, not in
cpu/host order, and thus should not have cpu_to_le32/le32_to_cpu
conversions applied upon policy write/read unlike other data in the
policy.

Bug reported by John Weeks of Sun, who noticed that binary policy
files built from the same policy source on x86 and sparc differed and
tracked it down to the ipv4 address handling in checkpolicy.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:54 +10:00
Stephen Smalley 242631c49d selinux: simplify ioctl checking
Simplify and improve the robustness of the SELinux ioctl checking by
using the "access mode" bits of the ioctl command to determine the
permission check rather than dealing with individual command values.
This removes any knowledge of specific ioctl commands from SELinux
and follows the same guidance we gave to Smack earlier.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:53 +10:00
Stephen Smalley abc69bb633 SELinux: enable processes with mac_admin to get the raw inode contexts
Enable processes with CAP_MAC_ADMIN + mac_admin permission in policy
to get undefined contexts on inodes.  This extends the support for
deferred mapping of security contexts in order to permit restorecon
and similar programs to see the raw file contexts unknown to the
system policy in order to check them.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:52 +10:00
Stephen Smalley 006ebb40d3 Security: split proc ptrace checking into read vs. attach
Enable security modules to distinguish reading of process state via
proc from full ptrace access by renaming ptrace_may_attach to
ptrace_may_access and adding a mode argument indicating whether only
read access or full attach access is requested.  This allows security
modules to permit access to reading process state without granting
full ptrace access.  The base DAC/capability checking remains unchanged.

Read access to /proc/pid/mem continues to apply a full ptrace attach
check since check_mem_permission() already requires the current task
to already be ptracing the target.  The other ptrace checks within
proc for elements like environ, maps, and fds are changed to pass the
read mode instead of attach.

In the SELinux case, we model such reading of process state as a
reading of a proc file labeled with the target process' label.  This
enables SELinux policy to permit such reading of process state without
permitting control or manipulation of the target process, as there are
a number of cases where programs probe for such information via proc
but do not need to be able to control the target (e.g. procps,
lsof, PolicyKit, ConsoleKit).  At present we have to choose between
allowing full ptrace in policy (more permissive than required/desired)
or breaking functionality (or in some cases just silencing the denials
via dontaudit rules but this can hide genuine attacks).

This version of the patch incorporates comments from Casey Schaufler
(change/replace existing ptrace_may_attach interface, pass access
mode), and Chris Wright (provide greater consistency in the checking).

Note that like their predecessors __ptrace_may_attach and
ptrace_may_attach, the __ptrace_may_access and ptrace_may_access
interfaces use different return value conventions from each other (0
or -errno vs. 1 or 0).  I retained this difference to avoid any
changes to the caller logic but made the difference clearer by
changing the latter interface to return a bool rather than an int and
by adding a comment about it to ptrace.h for any future callers.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:47 +10:00
James Morris feb2a5b82d SELinux: remove inherit field from inode_security_struct
Remove inherit field from inode_security_struct, per Stephen Smalley:
"Let's just drop inherit altogether - dead field."

Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:38 +10:00
Richard Kennedy fdeb05184b SELinux: reorder inode_security_struct to increase objs/slab on 64bit
reorder inode_security_struct to remove padding on 64 bit builds

size reduced from 72 to 64 bytes increasing objects per slab to 64.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:37 +10:00
Eric Paris f526971078 SELinux: keep the code clean formating and syntax
Formatting and syntax changes

whitespace, tabs to spaces, trailing space
put open { on same line as struct def
remove unneeded {} after if statements
change printk("Lu") to printk("llu")
convert asm/uaccess.h to linux/uaacess.h includes
remove unnecessary asm/bug.h includes
convert all users of simple_strtol to strict_strtol

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:36 +10:00
Stephen Smalley 9a59daa03d SELinux: fix sleeping allocation in security_context_to_sid
Fix a sleeping function called from invalid context bug by moving allocation
to the callers prior to taking the policy rdlock.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:35 +10:00
Stephen Smalley 12b29f3455 selinux: support deferred mapping of contexts
Introduce SELinux support for deferred mapping of security contexts in
the SID table upon policy reload, and use this support for inode
security contexts when the context is not yet valid under the current
policy.  Only processes with CAP_MAC_ADMIN + mac_admin permission in
policy can set undefined security contexts on inodes.  Inodes with
such undefined contexts are treated as having the unlabeled context
until the context becomes valid upon a policy reload that defines the
context.  Context invalidation upon policy reload also uses this
support to save the context information in the SID table and later
recover it upon a subsequent policy reload that defines the context
again.

This support is to enable package managers and similar programs to set
down file contexts unknown to the system policy at the time the file
is created in order to better support placing loadable policy modules
in packages and to support build systems that need to create images of
different distro releases with different policies w/o requiring all of
the contexts to be defined or legal in the build host policy.

With this patch applied, the following sequence is possible, although
in practice it is recommended that this permission only be allowed to
specific program domains such as the package manager.

# rmdir baz
# rm bar
# touch bar
# chcon -t foo_exec_t bar # foo_exec_t is not yet defined
chcon: failed to change context of `bar' to `system_u:object_r:foo_exec_t': Invalid argument
# mkdir -Z system_u:object_r:foo_exec_t baz
mkdir: failed to set default file creation context to `system_u:object_r:foo_exec_t': Invalid argument
# cat setundefined.te
policy_module(setundefined, 1.0)
require {
	type unconfined_t;
	type unlabeled_t;
}
files_type(unlabeled_t)
allow unconfined_t self:capability2 mac_admin;
# make -f /usr/share/selinux/devel/Makefile setundefined.pp
# semodule -i setundefined.pp
# chcon -t foo_exec_t bar # foo_exec_t is not yet defined
# mkdir -Z system_u:object_r:foo_exec_t baz
# ls -Zd bar baz
-rw-r--r--  root root system_u:object_r:unlabeled_t    bar
drwxr-xr-x  root root system_u:object_r:unlabeled_t    baz
# cat foo.te
policy_module(foo, 1.0)
type foo_exec_t;
files_type(foo_exec_t)
# make -f /usr/share/selinux/devel/Makefile foo.pp
# semodule -i foo.pp # defines foo_exec_t
# ls -Zd bar baz
-rw-r--r--  root root user_u:object_r:foo_exec_t       bar
drwxr-xr-x  root root system_u:object_r:foo_exec_t    baz
# semodule -r foo
# ls -Zd bar baz
-rw-r--r--  root root system_u:object_r:unlabeled_t    bar
drwxr-xr-x  root root system_u:object_r:unlabeled_t    baz
# semodule -i foo.pp
# ls -Zd bar baz
-rw-r--r--  root root user_u:object_r:foo_exec_t       bar
drwxr-xr-x  root root system_u:object_r:foo_exec_t    baz
# semodule -r setundefined foo
# chcon -t foo_exec_t bar # no longer defined and not allowed
chcon: failed to change context of `bar' to `system_u:object_r:foo_exec_t': Invalid argument
# rmdir baz
# mkdir -Z system_u:object_r:foo_exec_t baz
mkdir: failed to set default file creation context to `system_u:object_r:foo_exec_t': Invalid argument

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-07-14 15:01:34 +10:00
Al Viro 9f3acc3140 [PATCH] split linux/file.h
Initial splitoff of the low-level stuff; taken to fdtable.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-05-01 13:08:16 -04:00
Oleg Nesterov 3b5e9e53c6 signals: cleanup security_task_kill() usage/implementation
Every implementation of ->task_kill() does nothing when the signal comes from
the kernel.  This is correct, but means that check_kill_permission() should
call security_task_kill() only for SI_FROMUSER() case, and we can remove the
same check from ->task_kill() implementations.

(sadly, check_kill_permission() is the last user of signal->session/__session
 but we can't s/task_session_nr/task_session/ here).

NOTE: Eric W.  Biederman pointed out cap_task_kill() should die, and I think
he is very right.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: David Quigley <dpquigl@tycho.nsa.gov>
Cc: Eric Paris <eparis@redhat.com>
Cc: Harald Welte <laforge@gnumonks.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30 08:29:34 -07:00
David Howells 7bf570dc8d Security: Make secctx_to_secid() take const secdata
Make secctx_to_secid() take constant secdata.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 13:22:56 -07:00
Linus Torvalds 9781db7b34 Merge branch 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] new predicate - AUDIT_FILETYPE
  [patch 2/2] Use find_task_by_vpid in audit code
  [patch 1/2] audit: let userspace fully control TTY input auditing
  [PATCH 2/2] audit: fix sparse shadowed variable warnings
  [PATCH 1/2] audit: move extern declarations to audit.h
  Audit: MAINTAINERS update
  Audit: increase the maximum length of the key field
  Audit: standardize string audit interfaces
  Audit: stop deadlock from signals under load
  Audit: save audit_backlog_limit audit messages in case auditd comes back
  Audit: collect sessionid in netlink messages
  Audit: end printk with newline
2008-04-29 11:41:22 -07:00
David Howells 69664cf16a keys: don't generate user and user session keyrings unless they're accessed
Don't generate the per-UID user and user session keyrings unless they're
explicitly accessed.  This solves a problem during a login process whereby
set*uid() is called before the SELinux PAM module, resulting in the per-UID
keyrings having the wrong security labels.

This also cures the problem of multiple per-UID keyrings sometimes appearing
due to PAM modules (including pam_keyinit) setuiding and causing user_structs
to come into and go out of existence whilst the session keyring pins the user
keyring.  This is achieved by first searching for extant per-UID keyrings
before inventing new ones.

The serial bound argument is also dropped from find_keyring_by_name() as it's
not currently made use of (setting it to 0 disables the feature).

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <kwc@citi.umich.edu>
Cc: <arunsr@cse.iitk.ac.in>
Cc: <dwalsh@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:17 -07:00
David Howells 70a5bb72b5 keys: add keyctl function to get a security label
Add a keyctl() function to get the security label of a key.

The following is added to Documentation/keys.txt:

 (*) Get the LSM security context attached to a key.

	long keyctl(KEYCTL_GET_SECURITY, key_serial_t key, char *buffer,
		    size_t buflen)

     This function returns a string that represents the LSM security context
     attached to a key in the buffer provided.

     Unless there's an error, it always returns the amount of data it could
     produce, even if that's too big for the buffer, but it won't copy more
     than requested to userspace. If the buffer pointer is NULL then no copy
     will take place.

     A NUL character is included at the end of the string if the buffer is
     sufficiently big.  This is included in the returned count.  If no LSM is
     in force then an empty string will be returned.

     A process must have view permission on the key for this function to be
     successful.

[akpm@linux-foundation.org: declare keyctl_get_security()]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:16 -07:00
David Howells 8f0cfa52a1 xattr: add missing consts to function arguments
Add missing consts to xattr function arguments.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:06 -07:00
Linus Torvalds cfd299dffe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  SELinux: Fix a RCU free problem with the netport cache
  SELinux: Made netnode cache adds faster
  SELinux: include/security.h whitespace, syntax, and other cleanups
  SELinux: policydb.h whitespace, syntax, and other cleanups
  SELinux: mls_types.h whitespace, syntax, and other cleanups
  SELinux: mls.h whitespace, syntax, and other cleanups
  SELinux: hashtab.h whitespace, syntax, and other cleanups
  SELinux: context.h whitespace, syntax, and other cleanups
  SELinux: ss/conditional.h whitespace, syntax, and other cleanups
  SELinux: selinux/include/security.h whitespace, syntax, and other cleanups
  SELinux: objsec.h whitespace, syntax, and other cleanups
  SELinux: netlabel.h whitespace, syntax, and other cleanups
  SELinux: avc_ss.h whitespace, syntax, and other cleanups

Fixed up conflict in include/linux/security.h manually
2008-04-28 10:08:49 -07:00
Andrew G. Morgan 3898b1b4eb capabilities: implement per-process securebits
Filesystem capability support makes it possible to do away with (set)uid-0
based privilege and use capabilities instead.  That is, with filesystem
support for capabilities but without this present patch, it is (conceptually)
possible to manage a system with capabilities alone and never need to obtain
privilege via (set)uid-0.

Of course, conceptually isn't quite the same as currently possible since few
user applications, certainly not enough to run a viable system, are currently
prepared to leverage capabilities to exercise privilege.  Further, many
applications exist that may never get upgraded in this way, and the kernel
will continue to want to support their setuid-0 base privilege needs.

Where pure-capability applications evolve and replace setuid-0 binaries, it is
desirable that there be a mechanisms by which they can contain their
privilege.  In addition to leveraging the per-process bounding and inheritable
sets, this should include suppressing the privilege of the uid-0 superuser
from the process' tree of children.

The feature added by this patch can be leveraged to suppress the privilege
associated with (set)uid-0.  This suppression requires CAP_SETPCAP to
initiate, and only immediately affects the 'current' process (it is inherited
through fork()/exec()).  This reimplementation differs significantly from the
historical support for securebits which was system-wide, unwieldy and which
has ultimately withered to a dead relic in the source of the modern kernel.

With this patch applied a process, that is capable(CAP_SETPCAP), can now drop
all legacy privilege (through uid=0) for itself and all subsequently
fork()'d/exec()'d children with:

  prctl(PR_SET_SECUREBITS, 0x2f);

This patch represents a no-op unless CONFIG_SECURITY_FILE_CAPABILITIES is
enabled at configure time.

[akpm@linux-foundation.org: fix uninitialised var warning]
[serue@us.ibm.com: capabilities: use cap_task_prctl when !CONFIG_SECURITY]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Paul Moore <paul.moore@hp.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28 08:58:26 -07:00
Eric Paris b556f8ad58 Audit: standardize string audit interfaces
This patch standardized the string auditing interfaces.  No userspace
changes will be visible and this is all just cleanup and consistancy
work.  We have the following string audit interfaces to use:

void audit_log_n_hex(struct audit_buffer *ab, const unsigned char *buf, size_t len);

void audit_log_n_string(struct audit_buffer *ab, const char *buf, size_t n);
void audit_log_string(struct audit_buffer *ab, const char *buf);

void audit_log_n_untrustedstring(struct audit_buffer *ab, const char *string, size_t n);
void audit_log_untrustedstring(struct audit_buffer *ab, const char *string);

This may be the first step to possibly fixing some of the issues that
people have with the string output from the kernel audit system.  But we
still don't have an agreed upon solution to that problem.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-28 06:19:22 -04:00
Paul Moore c9b7b97937 SELinux: Fix a RCU free problem with the netport cache
The netport cache doesn't free resources in a manner which is safe or orderly.
This patch fixes this by adding in a missing call to rcu_dereference() in
sel_netport_insert() as well as some general cleanup throughout the file.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:36:27 +10:00
Paul Moore a639e7ca8e SELinux: Made netnode cache adds faster
When adding new entries to the network node cache we would walk the entire
hash bucket to make sure we didn't cross a threshold (done to bound the
cache size).  This isn't a very quick or elegant solution for something
which is supposed to be quick-ish so add a counter to each hash bucket to
track the size of the bucket and eliminate the need to walk the entire
bucket list on each add.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:36:23 +10:00
Eric Paris 489a5fd719 SELinux: policydb.h whitespace, syntax, and other cleanups
This patch changes policydb.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

spaces followed by tabs
spaces used instead of tabs
location of * in pointer declarations

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:29:07 +10:00
Eric Paris 8bf1f3a6c0 SELinux: mls_types.h whitespace, syntax, and other cleanups
This patch changes mls_types.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

spaces used instead of tabs

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:29:06 +10:00
Eric Paris d497fc87c0 SELinux: mls.h whitespace, syntax, and other cleanups
This patch changes mls.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

spaces used instead of tabs

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:29:05 +10:00
Eric Paris faff786ce2 SELinux: hashtab.h whitespace, syntax, and other cleanups
This patch changes hashtab.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

spaces used instead of tabs

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:29:04 +10:00
Eric Paris 81fa42df78 SELinux: context.h whitespace, syntax, and other cleanups
This patch changes context.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

include spaces around , in function calls

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:29:03 +10:00
Eric Paris ccb3cbeb4f SELinux: ss/conditional.h whitespace, syntax, and other cleanups
This patch changes ss/conditional.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

location of * in pointer declarations

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:29:02 +10:00
Eric Paris b19d8eae99 SELinux: selinux/include/security.h whitespace, syntax, and other cleanups
This patch changes selinux/include/security.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

whitespace at end of lines
spaces followed by tabs
spaces used instead of tabs
spacing around parenthesis
location of { around structs and else clauses
location of * in pointer declarations
removal of initialization of static data to keep it in the right section
useless {} in if statemetns
useless checking for NULL before kfree
fixing of the indentation depth of switch statements
no assignments in if statements
and any number of other things I forgot to mention

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:29:01 +10:00
Eric Paris a936b79bdf SELinux: objsec.h whitespace, syntax, and other cleanups
This patch changes objsec.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

whitespace at end of lines
spaces followed by tabs
spaces used instead of tabs
spacing around parenthesis
location of { around structs and else clauses
location of * in pointer declarations
removal of initialization of static data to keep it in the right section
useless {} in if statemetns
useless checking for NULL before kfree
fixing of the indentation depth of switch statements
no assignments in if statements
and any number of other things I forgot to mention

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:29:00 +10:00
Eric Paris cc03766aaf SELinux: netlabel.h whitespace, syntax, and other cleanups
This patch changes netlabel.h to fix whitespace and syntax issues.  Things that
are fixed may include (does not not have to include)

spaces used instead of tabs

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-04-28 09:28:59 +10:00