Commit f99a993

dzickusrh authored and torvalds committed
kernel/watchdog.c: always return NOTIFY_OK during cpu up/down events
This patch addresses a couple of problems. One was the case when the hardlockup failed to start, it also failed to start the softlockup. There were valid cases when the hardlockup shouldn't start and that shouldn't block the softlockup (no lapic, bios controls perf counters).

The second problem was when the hardlockup failed to start on boxes (from a no lapic or bios controlled perf counter case), it reported failure to the cpu notifier chain. This blocked the notifier from continuing to start other more critical pieces of cpu bring-up (in our case based on a 2.6.32 fork, it was the mce). As a result, during soft cpu online/offline testing, the system would panic when a cpu was offlined because the cpu notifier would succeed in processing a watchdog disable cpu event and would panic in the mce case as a result of un-initialized variables from a never executed cpu up event.

I realized the hardlockup/softlockup cases are really just debugging aids and should never impede the progress of a cpu up/down event. Therefore I modified the code to always return NOTIFY_OK and instead rely on printks to inform the user of problems.

Signed-off-by: Don Zickus <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Reviewed-by: WANG Cong <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
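The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `model_*` names, the `struct model_cpu` fields, the numeric return codes, and the ordering of the watchdog callback ahead of the MCE callback are all hypothetical stand-ins, not kernel definitions. It shows how a callback that reports failure stops the chain before a later subscriber has initialized its state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for notifier return codes (values are illustrative). */
enum { MODEL_NOTIFY_OK = 0x1, MODEL_NOTIFY_STOP = 0x8000 };

/* Hypothetical per-CPU state touched by the two subscribers. */
struct model_cpu {
	bool watchdog_fails;  /* e.g. no lapic, so hardlockup cannot start */
	bool report_failure;  /* true: pre-patch behaviour, false: post-patch */
	bool mce_initialized; /* set only if the MCE callback ran on cpu-up */
};

typedef int (*model_notifier_fn)(struct model_cpu *cpu);

static int model_watchdog_up(struct model_cpu *cpu)
{
	if (cpu->watchdog_fails && cpu->report_failure)
		return MODEL_NOTIFY_STOP; /* pre-patch: error reaches the chain */
	return MODEL_NOTIFY_OK;           /* post-patch: always succeed */
}

static int model_mce_up(struct model_cpu *cpu)
{
	cpu->mce_initialized = true;
	return MODEL_NOTIFY_OK;
}

/* Walk the chain in order; stop at the first callback that sets STOP. */
int model_cpu_up(struct model_cpu *cpu)
{
	static const model_notifier_fn chain[] = {
		model_watchdog_up, /* assumed to run before the MCE callback */
		model_mce_up,
	};
	int ret = MODEL_NOTIFY_OK;

	assert(cpu != NULL);
	for (size_t i = 0; i < sizeof(chain) / sizeof(chain[0]); i++) {
		ret = chain[i](cpu);
		if (ret & MODEL_NOTIFY_STOP)
			break;
	}
	return ret;
}
```

With `report_failure` set, `model_cpu_up()` returns with `mce_initialized` still false, which corresponds to the never-executed cpu up event that a later offline would trip over. With it cleared, the chain runs to completion even though the watchdog could not start.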
1 parent fef2c9b commit f99a993

1 file changed, 16 additions and 6 deletions


kernel/watchdog.c

Lines changed: 16 additions & 6 deletions
@@ -418,27 +418,31 @@ static int watchdog_prepare_cpu(int cpu)
 static int watchdog_enable(int cpu)
 {
 	struct task_struct *p = per_cpu(softlockup_watchdog, cpu);
-	int err;
+	int err = 0;
 
 	/* enable the perf event */
 	err = watchdog_nmi_enable(cpu);
-	if (err)
-		return err;
+
+	/* Regardless of err above, fall through and start softlockup */
 
 	/* create the watchdog thread */
 	if (!p) {
 		p = kthread_create(watchdog, (void *)(unsigned long)cpu, "watchdog/%d", cpu);
 		if (IS_ERR(p)) {
 			printk(KERN_ERR "softlockup watchdog for %i failed\n", cpu);
-			return PTR_ERR(p);
+			if (!err)
+				/* if hardlockup hasn't already set this */
+				err = PTR_ERR(p);
+			goto out;
 		}
 		kthread_bind(p, cpu);
 		per_cpu(watchdog_touch_ts, cpu) = 0;
 		per_cpu(softlockup_watchdog, cpu) = p;
 		wake_up_process(p);
 	}
 
-	return 0;
+out:
+	return err;
 }
 
 static void watchdog_disable(int cpu)
@@ -550,7 +554,13 @@ cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		break;
 #endif /* CONFIG_HOTPLUG_CPU */
 	}
-	return notifier_from_errno(err);
+
+	/*
+	 * hardlockup and softlockup are not important enough
+	 * to block cpu bring up. Just always succeed and
+	 * rely on printk output to flag problems.
+	 */
+	return NOTIFY_OK;
 }
 
 static struct notifier_block __cpuinitdata cpu_nfb = {
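
The control flow that the patched watchdog_enable() follows (record the hardlockup error, start the softlockup anyway, let the first error win) can be sketched in plain userspace C. The `struct model_setup` type and the `model_*` functions are hypothetical illustrations rather than kernel code, the literal `1` is only a stand-in for NOTIFY_OK, and the `if (!p)` check for an already-created thread is left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs: the result each detector's setup step would give. */
struct model_setup {
	int hard_err;      /* 0 or negative errno from hardlockup (NMI) setup */
	int soft_err;      /* 0 or negative errno from creating the thread */
	bool soft_running; /* output: was the softlockup thread started? */
};

/* Mirrors the post-patch flow: no early return after the hardlockup step. */
int model_watchdog_enable(struct model_setup *s)
{
	int err;

	assert(s != NULL);
	err = s->hard_err; /* remembered, but softlockup is still attempted */

	if (s->soft_err) {
		if (!err)
			err = s->soft_err; /* only if hardlockup left err unset */
		goto out;
	}
	s->soft_running = true;
out:
	return err;
}

/* Post-patch callback: the error is dropped and success is always reported. */
int model_cpu_callback(struct model_setup *s)
{
	(void)model_watchdog_enable(s);
	return 1; /* stand-in for NOTIFY_OK */
}
```

For example, a setup with `hard_err = -ENODEV` and `soft_err = 0` still ends with `soft_running` set, while `model_watchdog_enable()` returns `-ENODEV` so the caller could log it.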
