Commit d5421ea

hrtimer: Reset hrtimer cpu base proper on CPU hotplug
The hrtimer interrupt code contains a hang detection and mitigation mechanism, which prevents a long delayed hrtimer interrupt from causing a continuous retriggering of interrupts which prevent the system from making progress. If a hang is detected then the timer hardware is programmed with a certain delay into the future and a flag is set in the hrtimer cpu base which prevents newly enqueued timers from reprogramming the timer hardware prior to the chosen delay. The subsequent hrtimer interrupt after the delay clears the flag and resumes normal operation.

If such a hang happens in the last hrtimer interrupt before a CPU is unplugged then the hang_detected flag is set and stays that way when the CPU is plugged in again. At that point the timer hardware is not armed and it cannot be armed because the hang_detected flag is still active, so nothing clears that flag. As a consequence the CPU does not receive hrtimer interrupts and no timers expire on that CPU, which results in RCU stalls and other malfunctions.

Clear the flag along with some other less critical members of the hrtimer cpu base to ensure starting from a clean state when a CPU is plugged in.

Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the root cause of that hard to reproduce heisenbug. Once understood it's trivial and certainly justifies a brown paperbag.

Fixes: 41d2e49 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sebastian Sewior <[email protected]>
Cc: Anna-Maria Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
1 parent 993ca20 commit d5421ea

kernel/time/hrtimer.c

Lines changed: 3 additions & 0 deletions
@@ -655,7 +655,9 @@ static void hrtimer_reprogram(struct hrtimer *timer,
 static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
 {
 	base->expires_next = KTIME_MAX;
+	base->hang_detected = 0;
 	base->hres_active = 0;
+	base->next_timer = NULL;
 }
 
 /*
@@ -1589,6 +1591,7 @@ int hrtimers_prepare_cpu(unsigned int cpu)
 		timerqueue_init_head(&cpu_base->clock_base[i].active);
 	}
 
+	cpu_base->active_bases = 0;
 	cpu_base->cpu = cpu;
 	hrtimer_init_hres(cpu_base);
 	return 0;
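
The failure mode described in the commit message can be illustrated with a small user-space model. This is only a sketch and not kernel code: `struct toy_cpu_base`, `toy_init_hres()`, `toy_reprogram()`, `toy_replug_then_enqueue()` and the `hw_armed` field are hypothetical names invented for this illustration, and the reprogramming logic is reduced to the single check that matters here. It assumes, as the message states, that enqueue-side reprogramming is refused while `hang_detected` is set and that only a later timer interrupt would clear the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_KTIME_MAX INT64_MAX

/* Hypothetical, heavily reduced stand-in for struct hrtimer_cpu_base. */
struct toy_cpu_base {
	int64_t expires_next;	/* next programmed expiry */
	bool hang_detected;	/* set by the interrupt path on a detected hang */
	bool hres_active;
	bool hw_armed;		/* models whether the timer hardware will fire */
};

/*
 * Models the per-CPU reset done when a CPU is plugged in.
 * with_fix == false mimics the behaviour before this commit,
 * where hang_detected was left untouched.
 */
static void toy_init_hres(struct toy_cpu_base *base, bool with_fix)
{
	assert(base != NULL);
	base->expires_next = TOY_KTIME_MAX;
	if (with_fix)
		base->hang_detected = false;
	base->hres_active = false;
}

/*
 * Models enqueue-side reprogramming: while hang_detected is set the
 * hardware is not touched, on the assumption that the delayed
 * interrupt will clear the flag. Returns true if hardware was armed.
 */
static bool toy_reprogram(struct toy_cpu_base *base, int64_t expires)
{
	assert(base != NULL);
	if (base->hang_detected)
		return false;
	base->expires_next = expires;
	base->hw_armed = true;
	return true;
}

/*
 * A hang was detected in the last interrupt before unplug, so the flag
 * is set and the hardware is not armed when the CPU comes back.
 * Returns whether the first enqueued timer manages to arm the hardware.
 */
static bool toy_replug_then_enqueue(bool with_fix)
{
	struct toy_cpu_base base = {
		.expires_next = TOY_KTIME_MAX,
		.hang_detected = true,
		.hres_active = true,
		.hw_armed = false,
	};

	toy_init_hres(&base, with_fix);
	return toy_reprogram(&base, 1000);
}
```

In this model `toy_replug_then_enqueue(false)` returns false: the stale flag blocks reprogramming, and since no interrupt is pending nothing will ever clear it. `toy_replug_then_enqueue(true)` returns true because the reset clears the flag first, which mirrors the `base->hang_detected = 0;` line added above.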
