Archived
14
0
Fork 0
This repository has been archived on 2022-02-17. You can view files and clone it, but cannot push or open issues or pull requests.
linux-2.6/kernel
Davide Libenzi 4ede816ac3 epoll keyed wakeups: add __wake_up_locked_key() and __wake_up_sync_key()
This patchset introduces wakeup hints for some of the most popular (from
epoll POV) devices, so that epoll code can avoid spurious wakeups on its
waiters.

The problem with epoll is that the callback-based wakeups do not, ATM,
carry any information about the events the wakeup is related to.  So the
only choice epoll has (not being able to call f_op->poll() from inside the
callback), is to add the file* to a ready-list and resolve the real events
later on, at epoll_wait() (or its own f_op->poll()) time.  This can cause
spurious wakeups, since the wake_up() itself might be for an event the
caller is not interested into.

The rate of these spurious wakeup can be pretty high in case of many
network sockets being monitored.

By allowing devices to report the events the wakeups refer to (at least
the two major classes - POLLIN/POLLOUT), we are able to spare useless
wakeups by proper handling inside the epoll's poll callback.

Epoll will have in any case to call f_op->poll() on the file* later on,
since the change to be done in order to have the full event set sent via
wakeup, is too invasive for the way our f_op->poll() system works (the
full event set is calculated inside the poll function - there are too many
of them to even start thinking the change - also poll/select would need
change too).

Epoll is changed in a way that both devices which send event hints, and
the ones that don't, are correctly handled.  The former will gain some
efficiency though.

As a general rule for devices, would be to add an event mask by using
key-aware wakeup macros, when making up poll wait queues.  I tested it
(together with the epoll's poll fix patch Andrew has in -mm) and wakeups
for the supported devices are correctly filtered.

Test program available here:

http://www.xmailserver.org/epoll_test.c

This patch:

Nothing revolutionary here.  Just using the available "key" that our
wakeup core already support.  The __wake_up_locked_key() was no brainer,
since both __wake_up_locked() and __wake_up_locked_key() are thin wrappers
around __wake_up_common().

The __wake_up_sync() function had a body, so the choice was between
borrowing the body for __wake_up_sync_key() and calling it from
__wake_up_sync(), or make an inline and calling it from both.  I chose the
former since in most archs it all resolves to "mov $0, REG; jmp ADDR".

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: William Lee Irwin III <wli@movementarian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:20 -07:00
..
irq PM: Introduce functions for suspending and resuming device interrupts 2009-03-30 21:46:54 +02:00
power pm: rework includes, remove arch ifdefs 2009-04-01 08:59:16 -07:00
time Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-03-26 16:05:42 -07:00
trace Merge commit 'origin/master' into next 2009-03-11 17:10:07 +11:00
.gitignore
acct.c
async.c async: remove the temporary (2.6.29) "async is off by default" code 2009-03-28 13:05:30 -07:00
audit.c
audit.h
audit_tree.c
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup.c vfs: simple_set_mnt() should return void 2009-03-27 14:44:03 -04:00
cgroup_debug.c
cgroup_freezer.c
compat.c
configs.c
cpu.c cpumask: use set_cpu_active in init/main.c 2009-03-30 22:05:12 +10:30
cpuset.c
cred-internals.h
cred.c
delayacct.c
dma-coherent.c
dma.c
exec_domain.c
exit.c Merge branch 'linus' into x86/apic 2009-02-13 09:44:22 +01:00
extable.c
fork.c cpumask: use mm_cpumask() wrapper: kernel/fork.c 2009-03-30 22:05:13 +10:30
freezer.c
futex.c futex: remove the pointer math from double_unlock_hb, fix 2009-03-13 10:32:07 +01:00
futex_compat.c
hrtimer.c
itimer.c timers: split process wide cpu clocks/timers 2009-02-05 13:04:33 +01:00
kallsyms.c
Kconfig.freezer
Kconfig.hz
Kconfig.preempt
kexec.c kexec: Change kexec jump code ordering 2009-03-30 21:46:55 +02:00
kfifo.c
kgdb.c
kmod.c cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL 2009-03-30 22:05:11 +10:30
kprobes.c
ksysfs.c
kthread.c cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL 2009-03-30 22:05:11 +10:30
latencytop.c sched, latencytop: incorporate review feedback from Andrew Morton 2009-02-11 10:18:04 +01:00
lockdep.c lockdep: fix deadlock in lockdep_trace_alloc 2009-03-30 23:19:24 +02:00
lockdep_internals.h lockdep: get_user_chars() redo 2009-02-14 23:28:22 +01:00
lockdep_proc.c lockstat: warn about disabled lock debugging 2009-02-14 23:28:28 +01:00
lockdep_states.h lockdep: move state bit definitions around 2009-02-14 23:27:59 +01:00
Makefile PM: fix build for CONFIG_PM unset 2009-02-21 14:17:17 -08:00
marker.c
module.c Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2 2009-03-27 17:28:43 +01:00
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
panic.c stackprotector: update make rules 2009-02-10 00:41:54 +01:00
params.c
pid.c
pid_namespace.c
pm_qos_params.c
posix-cpu-timers.c posix timers: fix RLIMIT_CPU && fork() 2009-03-23 20:43:35 +01:00
posix-timers.c
printk.c PM: Fix suspend_console and resume_console to use only one semaphore 2009-02-21 14:17:18 -08:00
profile.c profiling: fix broken profiling regression 2009-02-10 00:50:37 +01:00
ptrace.c
rcuclassic.c rcu: Teach RCU that idle task is not quiscent state at boot 2009-02-26 04:08:14 +01:00
rcupdate.c rcu: Teach RCU that idle task is not quiscent state at boot 2009-02-26 04:08:14 +01:00
rcupreempt.c rcu: Teach RCU that idle task is not quiscent state at boot 2009-02-26 04:08:14 +01:00
rcupreempt_trace.c
rcutorture.c cpumask: convert rcutorture.c 2009-03-30 22:05:16 +10:30
rcutree.c rcu: Teach RCU that idle task is not quiscent state at boot 2009-02-26 04:08:14 +01:00
rcutree_trace.c
relay.c timers: add mod_timer_pending() 2009-02-18 19:26:33 +01:00
res_counter.c
resource.c
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c
sched.c epoll keyed wakeups: add __wake_up_locked_key() and __wake_up_sync_key() 2009-04-01 08:59:20 -07:00
sched_clock.c sched_clock: cleanups 2009-02-26 21:56:07 +01:00
sched_cpupri.c
sched_cpupri.h cpumask: remove cpumask_t from core 2009-03-30 22:05:17 +10:30
sched_debug.c sched: remove unused fields from struct rq 2009-03-24 23:16:51 +01:00
sched_fair.c Merge branch 'sched/urgent'; commit 'v2.6.29-rc5' into sched/core 2009-02-15 21:15:16 +01:00
sched_features.h Merge branch 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-03-30 17:17:35 -07:00
sched_idletask.c
sched_rt.c Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2 2009-03-27 17:28:43 +01:00
sched_stats.h sched: remove unused fields from struct rq 2009-03-24 23:16:51 +01:00
seccomp.c x86-64: seccomp: fix 32/64 syscall hole 2009-03-02 15:41:30 -08:00
semaphore.c
signal.c fix ptrace slowness 2009-03-23 09:22:31 -07:00
smp.c
softirq.c Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2 2009-03-27 17:28:43 +01:00
softlockup.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c cpumask: remove cpumask_t from core 2009-03-30 22:05:17 +10:30
sys.c sched: don't allow setuid to succeed if the user does not have rt bandwidth 2009-02-27 11:11:53 +01:00
sys_ni.c
sysctl.c mm: fix proc_dointvec_userhz_jiffies "breakage" 2009-04-01 08:59:13 -07:00
sysctl_check.c
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c Merge branch 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-03-30 17:17:35 -07:00
tracepoint.c
tsacct.c Fix fixpoint divide exception in acct_update_integrals 2009-03-09 08:13:35 -07:00
uid16.c
up.c
user.c Merge branch 'master' into next 2009-03-24 10:52:46 +11:00
user_namespace.c Fix recursive lock in free_uid()/free_user_ns() 2009-02-27 16:26:21 -08:00
utsname.c
utsname_sysctl.c
wait.c wait: prevent exclusive waiter starvation 2009-02-05 12:56:48 -08:00
workqueue.c cpumask: use new cpumask_ functions in core code. 2009-03-30 22:05:16 +10:30