Skip to content

Commit 7a96a84

Browse files
anna-marialxKAGA-KOKO
authored andcommitted
timers/migration: Return early on deactivation
Commit 4b6f4c5 ("timer/migration: Remove buggy early return on deactivation") removed the logic to return early in tmigr_update_events() on deactivation. With this the problem with a not properly updated first global event in a hierarchy containing only a single group was fixed. But when having a look at this code path with a hierarchy with more than a single level, now unnecessary work is done (example is partially copied from the message of the commit mentioned above): [GRP1:0] migrator = GRP0:0 active = GRP0:0 nextevt = T0:0i, T0:1 / \ [GRP0:0] [GRP0:1] migrator = 0 migrator = NONE active = 0 active = NONE nextevt = T0i, T1 nextevt = T2 / \ / \ 0 (T0i) 1 (T1) 2 (T2) 3 active idle idle idle 0) CPU 0 is active thus its event is ignored (the letter 'i') and so are upper levels' events. CPU 1 is idle and has the timer T1 enqueued. CPU 2 also has a timer. The expiry order is T0 (ignored) < T1 < T2 [GRP1:0] migrator = GRP0:0 active = GRP0:0 nextevt = T0:0i, T0:1 / \ [GRP0:0] [GRP0:1] migrator = NONE migrator = NONE active = NONE active = NONE nextevt = T1 nextevt = T2 / \ / \ 0 (T0i) 1 (T1) 2 (T2) 3 idle idle idle idle 1) CPU 0 goes idle without global event queued. Therefore KTIME_MAX is pushed as its next expiry and its own event kept as "ignore". Without this early return the following steps happen in tmigr_update_events() when child = null and group = GRP0:0 : lock(GRP0:0->lock); timerqueue_del(GRP0:0, T0i); unlock(GRP0:0->lock); [GRP1:0] migrator = NONE active = NONE nextevt = T0:0, T0:1 / \ [GRP0:0] [GRP0:1] migrator = NONE migrator = NONE active = NONE active = NONE nextevt = T1 nextevt = T2 / \ / \ 0 (T0i) 1 (T1) 2 (T2) 3 idle idle idle idle 2) The change now propagates up to the top. Then tmigr_update_events() updates the group event of GRP0:0 and executes the following steps (child = GRP0:0 and group = GRP0:0): lock(GRP0:0->lock); lock(GRP1:0->lock); evt = tmigr_next_groupevt(GRP0:0); -> this removes the ignored events in GRP0:0 ... update GRP1:0 group event and timerqueue ... unlock(GRP1:0->lock); unlock(GRP0:0->lock); So the dance in 1) with locking the GRP0:0->lock and removing the T0i from the timerqueue is redundand as this is done nevertheless in 2) when tmigr_next_groupevt(GRP0:0) is executed. Revert commit 4b6f4c5 ("timer/migration: Remove buggy early return on deactivation") and add a condition into return path to skip the return only, when hierarchy contains a single group. Adapt comments accordingly. Fixes: 4b6f4c5 ("timer/migration: Remove buggy early return on deactivation") Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/r/87cyr49on2.fsf@somnus
1 parent 61f7fdf commit 7a96a84

File tree

1 file changed

+27
-0
lines changed

1 file changed

+27
-0
lines changed

kernel/time/timer_migration.c

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -751,6 +751,33 @@ bool tmigr_update_events(struct tmigr_group *group, struct tmigr_group *child,
751751

752752
first_childevt = evt = data->evt;
753753

754+
/*
755+
* Walking the hierarchy is required in any case when a
756+
* remote expiry was done before. This ensures to not lose
757+
* already queued events in non active groups (see section
758+
* "Required event and timerqueue update after a remote
759+
* expiry" in the documentation at the top).
760+
*
761+
* The two call sites which are executed without a remote expiry
762+
* before, are not prevented from propagating changes through
763+
* the hierarchy by the return:
764+
* - When entering this path by tmigr_new_timer(), @evt->ignore
765+
* is never set.
766+
* - tmigr_inactive_up() takes care of the propagation by
767+
* itself and ignores the return value. But an immediate
768+
* return is possible if there is a parent, sparing group
769+
* locking at this level, because the upper walking call to
770+
* the parent will take care about removing this event from
771+
* within the group and update next_expiry accordingly.
772+
*
773+
* However if there is no parent, ie: the hierarchy has only a
774+
* single level so @group is the top level group, make sure the
775+
* first event information of the group is updated properly and
776+
* also handled properly, so skip this fast return path.
777+
*/
778+
if (evt->ignore && !remote && group->parent)
779+
return true;
780+
754781
raw_spin_lock(&group->lock);
755782

756783
childstate.state = 0;

0 commit comments

Comments
 (0)