Commit aacedf2

Authored by Peter Zijlstra, committed by Ingo Molnar
sched/core: Optimize try_to_wake_up() for local wakeups
Jens reported that significant performance can be had on some block
workloads by special casing local wakeups. That is, wakeups on the
current task before it schedules out.

Given something like the normal wait pattern:

	for (;;) {
		set_current_state(TASK_UNINTERRUPTIBLE);

		if (cond)
			break;

		schedule();
	}
	__set_current_state(TASK_RUNNING);

Any wakeup (on this CPU) after set_current_state() and before
schedule() would benefit from this.

Normal wakeups take p->pi_lock, which serializes wakeups to the same
task. By eliding that we gain concurrency on:

 - ttwu_stat(); we already had concurrency on rq stats, this now also
   brings it to task stats. -ENOCARE

 - tracepoints; it is now possible to get multiple instances of
   trace_sched_waking() (and possibly trace_sched_wakeup()) for the
   same task. Tracers will have to learn to cope.

Furthermore, p->pi_lock is used by set_special_state() to order
against TASK_RUNNING stores from other CPUs. But since this wakeup is
strictly CPU local, we don't need the lock, and set_special_state()'s
disabling of IRQs is sufficient.

After the normal wakeup takes p->pi_lock, it issues
smp_mb__after_spinlock() to ensure the woken task observes prior
stores before we observe p->state. If this is CPU local, a compiler
barrier satisfies that requirement, and we rely on try_to_wake_up()
being a function call, which implies such a barrier.

Since, when 'p == current', 'p->on_rq' must be true, the normal wakeup
would continue into the ttwu_remote() branch, which normally is
concerned with exactly this wakeup scenario, except from a remote CPU.
IOW we're waking a task that is still running. In this case, we can
trivially avoid taking rq->lock; all that's left is to set p->state.

This then yields an extremely simple and fast path for 'p == current'.
Reported-by: Jens Axboe <[email protected]>
Tested-by: Jens Axboe <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Parent: 509466b

1 file changed: +29 -5 lines

kernel/sched/core.c

@@ -1991,6 +1991,29 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	unsigned long flags;
 	int cpu, success = 0;
 
+	if (p == current) {
+		/*
+		 * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
+		 * == smp_processor_id()'. Together this means we can special
+		 * case the whole 'p->on_rq && ttwu_remote()' case below
+		 * without taking any locks.
+		 *
+		 * In particular:
+		 *  - we rely on Program-Order guarantees for all the ordering,
+		 *  - we're serialized against set_special_state() by virtue of
+		 *    it disabling IRQs (this allows not taking ->pi_lock).
+		 */
+		if (!(p->state & state))
+			return false;
+
+		success = 1;
+		cpu = task_cpu(p);
+		trace_sched_waking(p);
+		p->state = TASK_RUNNING;
+		trace_sched_wakeup(p);
+		goto out;
+	}
+
 	/*
 	 * If we are going to wake up a thread waiting for CONDITION we
 	 * need to ensure that CONDITION=1 done by the caller can not be
@@ -2000,7 +2023,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	raw_spin_lock_irqsave(&p->pi_lock, flags);
 	smp_mb__after_spinlock();
 	if (!(p->state & state))
-		goto out;
+		goto unlock;
 
 	trace_sched_waking(p);
 
@@ -2030,7 +2053,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	 */
 	smp_rmb();
 	if (p->on_rq && ttwu_remote(p, wake_flags))
-		goto stat;
+		goto unlock;
 
 #ifdef CONFIG_SMP
 	/*
@@ -2090,10 +2113,11 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 #endif /* CONFIG_SMP */
 
 	ttwu_queue(p, cpu, wake_flags);
-stat:
-	ttwu_stat(p, cpu, wake_flags);
-out:
+unlock:
 	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+out:
+	if (success)
+		ttwu_stat(p, cpu, wake_flags);
 
 	return success;
 }
