Skip to content

Commit 724a868

Browse files
Peter ZijlstraKAGA-KOKO
authored andcommitted
smp/hotplug: Callback vs state-machine consistency
While the generic callback functions have an 'int' return and thus appear to be allowed to return error, this is not true for all states. Specifically, what used to be STARTING/DYING are ran with IRQs disabled from critical parts of CPU bringup/teardown and are not allowed to fail. Add WARNs to enforce this rule. But since some callbacks are indeed allowed to fail, we have the situation where a state-machine rollback encounters a failure, in this case we're stuck, we can't go forward and we can't go back. Also add a WARN for that case. AFAICT this is a fundamental 'problem' with no real obvious solution. We want the 'prepare' callbacks to allow failure on either up or down. Typically on prepare-up this would be things like -ENOMEM from resource allocations, and the typical usage in prepare-down would be something like -EBUSY to avoid CPUs being taken away. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
1 parent 4dddfb5 commit 724a868

File tree

1 file changed

+22
-4
lines changed

1 file changed

+22
-4
lines changed

kernel/cpu.c

Lines changed: 22 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -202,7 +202,14 @@ static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
202202
hlist_for_each(node, &step->list) {
203203
if (!cnt--)
204204
break;
205-
cbm(cpu, node);
205+
206+
trace_cpuhp_multi_enter(cpu, st->target, state, cbm, node);
207+
ret = cbm(cpu, node);
208+
trace_cpuhp_exit(cpu, st->state, state, ret);
209+
/*
210+
* Rollback must not fail,
211+
*/
212+
WARN_ON_ONCE(ret);
206213
}
207214
return ret;
208215
}
@@ -659,6 +666,7 @@ static int take_cpu_down(void *_param)
659666
struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
660667
enum cpuhp_state target = max((int)st->target, CPUHP_AP_OFFLINE);
661668
int err, cpu = smp_processor_id();
669+
int ret;
662670

663671
/* Ensure this CPU doesn't handle any more interrupts. */
664672
err = __cpu_disable();
@@ -672,8 +680,13 @@ static int take_cpu_down(void *_param)
672680
WARN_ON(st->state != CPUHP_TEARDOWN_CPU);
673681
st->state--;
674682
/* Invoke the former CPU_DYING callbacks */
675-
for (; st->state > target; st->state--)
676-
cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
683+
for (; st->state > target; st->state--) {
684+
ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
685+
/*
686+
* DYING must not fail!
687+
*/
688+
WARN_ON_ONCE(ret);
689+
}
677690

678691
/* Give up timekeeping duties */
679692
tick_handover_do_timer();
@@ -876,11 +889,16 @@ void notify_cpu_starting(unsigned int cpu)
876889
{
877890
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
878891
enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);
892+
int ret;
879893

880894
rcu_cpu_starting(cpu); /* Enables RCU usage on this CPU. */
881895
while (st->state < target) {
882896
st->state++;
883-
cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
897+
ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
898+
/*
899+
* STARTING must not fail!
900+
*/
901+
WARN_ON_ONCE(ret);
884902
}
885903
}
886904

0 commit comments

Comments
 (0)