
Commit 91dabf3
Author: Peter Zijlstra
sched: Fix race in task_call_func()
There is a very narrow race between schedule() and task_call_func().

  CPU0                                          CPU1

  __schedule()
    rq_lock();
    prev_state = READ_ONCE(prev->__state);
    if (... && prev_state) {
      deactivate_task(rq, prev, ...)
        prev->on_rq = 0;

                                                task_call_func()
                                                  raw_spin_lock_irqsave(p->pi_lock);
                                                  state = READ_ONCE(p->__state);
                                                  smp_rmb();
                                                  if (... || p->on_rq) // false!!!
                                                  rq = __task_rq_lock()

                                                  ret = func();

    next = pick_next_task();
    rq = context_switch(prev, next)
      prepare_lock_switch()
        spin_release(&__rq_lockp(rq)->dep_map...)

So while the task is on its way out, it still holds rq->lock for a
little while, and right then task_call_func() comes in and figures it
doesn't need rq->lock anymore (because the task is already dequeued --
but still running there) and then the __set_task_frozen() thing observes
it's holding rq->lock and yells murder.

Avoid this by waiting for p->on_cpu to get cleared, which guarantees
the task is fully finished on the old CPU.

( While arguably the fixes tag is 'wrong' -- none of the previous
  task_call_func() users appears to care for this case. )

Fixes: f5d39b0 ("freezer,sched: Rewrite core freezer logic")
Reported-by: Ville Syrjälä <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Ville Syrjälä <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
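To make the fix concrete, here is a minimal userspace sketch of the idea (illustrative only: on_rq, on_cpu, scratch and the two thread bodies below are stand-ins, not kernel code). It shows why observing on_rq == 0 is not enough -- the departing side may still be mid context switch -- and how an acquire-ordered wait on on_cpu, the analogue of smp_cond_load_acquire(&p->on_cpu, !VAL), closes the window:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/*
 * Userspace analogue of the race and the fix -- NOT kernel code.
 * "Dequeued" (on_rq == 0) is observable while the task is still
 * "running" (on_cpu == 1); an inspector must also wait for on_cpu
 * to drop, with acquire ordering, before touching anything the
 * departing side still uses.
 */
static atomic_int on_rq  = 1;
static atomic_int on_cpu = 1;
static int scratch;             /* only safe to read once on_cpu == 0 */

static void *schedule_side(void *arg)
{
        atomic_store_explicit(&on_rq, 0, memory_order_relaxed);
        scratch = 42;           /* still "on the CPU" here */
        /* release: everything above is visible to the acquire spin below */
        atomic_store_explicit(&on_cpu, 0, memory_order_release);
        return NULL;
}

static void *inspector_side(void *arg)
{
        if (!atomic_load_explicit(&on_rq, memory_order_acquire)) {
                /* analogue of smp_cond_load_acquire(&p->on_cpu, !VAL) */
                while (atomic_load_explicit(&on_cpu, memory_order_acquire))
                        ;       /* spin until fully off the "CPU" */
                printf("task fully off CPU, scratch = %d\n", scratch);
        }
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, schedule_side, NULL);
        pthread_create(&b, NULL, inspector_side, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
}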
1 parent 448dca8 commit 91dabf3

1 file changed (+35, -17)


kernel/sched/core.c

Lines changed: 35 additions & 17 deletions
@@ -4200,6 +4200,40 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	return success;
 }
 
+static bool __task_needs_rq_lock(struct task_struct *p)
+{
+	unsigned int state = READ_ONCE(p->__state);
+
+	/*
+	 * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
+	 * the task is blocked. Make sure to check @state since ttwu() can drop
+	 * locks at the end, see ttwu_queue_wakelist().
+	 */
+	if (state == TASK_RUNNING || state == TASK_WAKING)
+		return true;
+
+	/*
+	 * Ensure we load p->on_rq after p->__state, otherwise it would be
+	 * possible to, falsely, observe p->on_rq == 0.
+	 *
+	 * See try_to_wake_up() for a longer comment.
+	 */
+	smp_rmb();
+	if (p->on_rq)
+		return true;
+
+#ifdef CONFIG_SMP
+	/*
+	 * Ensure the task has finished __schedule() and will not be referenced
+	 * anymore. Again, see try_to_wake_up() for a longer comment.
+	 */
+	smp_rmb();
+	smp_cond_load_acquire(&p->on_cpu, !VAL);
+#endif
+
+	return false;
+}
+
 /**
  * task_call_func - Invoke a function on task in fixed state
  * @p: Process for which the function is to be invoked, can be @current.
@@ -4217,28 +4251,12 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 int task_call_func(struct task_struct *p, task_call_f func, void *arg)
 {
 	struct rq *rq = NULL;
-	unsigned int state;
 	struct rq_flags rf;
 	int ret;
 
 	raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
 
-	state = READ_ONCE(p->__state);
-
-	/*
-	 * Ensure we load p->on_rq after p->__state, otherwise it would be
-	 * possible to, falsely, observe p->on_rq == 0.
-	 *
-	 * See try_to_wake_up() for a longer comment.
-	 */
-	smp_rmb();
-
-	/*
-	 * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
-	 * the task is blocked. Make sure to check @state since ttwu() can drop
-	 * locks at the end, see ttwu_queue_wakelist().
-	 */
-	if (state == TASK_RUNNING || state == TASK_WAKING || p->on_rq)
+	if (__task_needs_rq_lock(p))
 		rq = __task_rq_lock(p, &rf);
 
 	/*
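For reference, this is roughly how a task_call_func() caller looks; read_task_flags() and sample_task_flags() below are hypothetical examples, not code from this commit. The guarantee the fix restores is that by the time func() runs, @p is either fully blocked and off-CPU or pinned under rq->lock, so the callback can safely dereference it:

/* Hypothetical callback: runs with @p's state pinned by
 * task_call_func() (p->pi_lock held, rq->lock taken when needed),
 * so @p cannot change state or vanish underneath it. */
static int read_task_flags(struct task_struct *p, void *arg)
{
	*(unsigned int *)arg = p->flags;
	return 0;
}

static unsigned int sample_task_flags(struct task_struct *p)
{
	unsigned int flags = 0;

	task_call_func(p, read_task_flags, &flags);
	return flags;
}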
