Commit f207934

Peter Zijlstra authored and Ingo Molnar committed
sched/fair: Align PELT windows between cfs_rq and its se
The PELT _sum values are a saw-tooth function, dropping on the decay
edge and then growing back up again during the window.

When these window-edges are not aligned between cfs_rq and se, we can
have the situation where, for example, on dequeue, the se decays
first.

Its _sum values will be small(er), while the cfs_rq _sum values will
still be on their way up. Because of this, the subtraction:
cfs_rq->avg._sum -= se->avg._sum will result in a positive value. This
will then, once the cfs_rq reaches an edge, translate into its _avg
value jumping up.

This is especially visible with the runnable_load bits, since they get
added/subtracted a lot.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
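
The saw-tooth argument above can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not the kernel's fixed-point PELT code: time advances in 1us ticks, the entity is always running, `_sum` is a double, and the decay factor is an approximation of y where y^32 = 0.5. The names `toy_avg`, `toy_tick` and `toy_worst_residue` are hypothetical. The cfs_rq is assumed to hold exactly one se, so a perfectly tracked `cfs_rq _sum - se _sum` would be zero at all times.

```c
#include <assert.h>

/* Approximation of y, where y^32 = 0.5 (assumed value for this sketch). */
#define TOY_Y 0.978572062

/* Toy stand-in for the _sum / period_contrib pair in struct sched_avg. */
struct toy_avg {
	double sum;
	unsigned int period_contrib;	/* position inside the 1024us window */
};

/* Advance one microsecond of running time; decay on the window edge. */
static void toy_tick(struct toy_avg *a)
{
	a->sum += 1.0;
	if (++a->period_contrib == 1024) {
		a->sum *= TOY_Y;	/* the drop of the saw-tooth */
		a->period_contrib = 0;
	}
}

/*
 * Run a cfs_rq and its single se side by side, with the se's window
 * starting se_phase microseconds into its period, and return the largest
 * |cfs_rq sum - se sum| observed. That difference is what would be left
 * behind in the cfs_rq if the se were dequeued at that instant.
 */
static double toy_worst_residue(unsigned int se_phase)
{
	struct toy_avg rq = { 0.0, 0 };
	struct toy_avg se = { 0.0, se_phase };
	double worst = 0.0;
	long i;

	assert(se_phase < 1024);

	for (i = 0; i < 1024L * 400; i++) {
		double d;

		toy_tick(&rq);
		toy_tick(&se);
		d = rq.sum - se.sum;
		if (d < 0)
			d = -d;
		if (d > worst)
			worst = d;
	}
	return worst;
}
```

With aligned windows (`toy_worst_residue(0)`) both sums follow the same saw-tooth and the residue stays at zero. With a half-window offset (`toy_worst_residue(512)`) the two sums decay at different times, so the subtraction leaves a non-zero remainder.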
1 parent 144d848 commit f207934

1 file changed: +31 -14 lines changed


kernel/sched/fair.c

Lines changed: 31 additions & 14 deletions
@@ -716,13 +716,8 @@ void init_entity_runnable_average(struct sched_entity *se)
 {
 	struct sched_avg *sa = &se->avg;
 
-	sa->last_update_time = 0;
-	/*
-	 * sched_avg's period_contrib should be strictly less then 1024, so
-	 * we give it 1023 to make sure it is almost a period (1024us), and
-	 * will definitely be update (after enqueue).
-	 */
-	sa->period_contrib = 1023;
+	memset(sa, 0, sizeof(*sa));
+
 	/*
 	 * Tasks are intialized with full load to be seen as heavy tasks until
 	 * they get a chance to stabilize to their real load level.
@@ -731,13 +726,9 @@ void init_entity_runnable_average(struct sched_entity *se)
 	 */
 	if (entity_is_task(se))
 		sa->runnable_load_avg = sa->load_avg = scale_load_down(se->load.weight);
-	sa->runnable_load_sum = sa->load_sum = LOAD_AVG_MAX;
 
-	/*
-	 * At this point, util_avg won't be used in select_task_rq_fair anyway
-	 */
-	sa->util_avg = 0;
-	sa->util_sum = 0;
+	se->runnable_weight = se->load.weight;
+
 	/* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
 }
 
@@ -785,7 +776,6 @@ void post_init_entity_util_avg(struct sched_entity *se)
 		} else {
 			sa->util_avg = cap;
 		}
-		sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
 	}
 
 	if (entity_is_task(se)) {
@@ -3632,7 +3622,34 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
  */
 static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
+	u32 divider = LOAD_AVG_MAX - 1024 + cfs_rq->avg.period_contrib;
+
+	/*
+	 * When we attach the @se to the @cfs_rq, we must align the decay
+	 * window because without that, really weird and wonderful things can
+	 * happen.
+	 *
+	 * XXX illustrate
+	 */
 	se->avg.last_update_time = cfs_rq->avg.last_update_time;
+	se->avg.period_contrib = cfs_rq->avg.period_contrib;
+
+	/*
+	 * Hell(o) Nasty stuff.. we need to recompute _sum based on the new
+	 * period_contrib. This isn't strictly correct, but since we're
+	 * entirely outside of the PELT hierarchy, nobody cares if we truncate
+	 * _sum a little.
+	 */
+	se->avg.util_sum = se->avg.util_avg * divider;
+
+	se->avg.load_sum = divider;
+	if (se_weight(se)) {
+		se->avg.load_sum =
+			div_u64(se->avg.load_avg * se->avg.load_sum, se_weight(se));
+	}
+
+	se->avg.runnable_load_sum = se->avg.load_sum;
+
 	enqueue_load_avg(cfs_rq, se);
 	cfs_rq->avg.util_avg += se->avg.util_avg;
 	cfs_rq->avg.util_sum += se->avg.util_sum;
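
The _sum recomputation in attach_entity_load_avg() can be tried out in userspace with the sketch below. It is a simplified stand-in, not the kernel code: `toy_sched_avg`, `toy_divider` and `toy_attach` are hypothetical names, the `weight` parameter stands in for `se_weight(se)`, plain 64-bit division replaces `div_u64()`, and `TOY_LOAD_AVG_MAX` assumes the kernel's LOAD_AVG_MAX value of 47742.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LOAD_AVG_MAX 47742u	/* assumed to match the kernel's LOAD_AVG_MAX */

/* Reduced stand-in for the fields of struct sched_avg used on attach. */
struct toy_sched_avg {
	uint64_t load_sum;
	uint32_t util_sum;
	uint32_t period_contrib;
	unsigned long load_avg;
	unsigned long util_avg;
};

/*
 * Largest value a _sum can reach at this position in the window:
 * the full geometric series minus the unfinished part of the
 * current 1024us period.
 */
static uint32_t toy_divider(uint32_t period_contrib)
{
	assert(period_contrib < 1024);
	return TOY_LOAD_AVG_MAX - 1024 + period_contrib;
}

/* Align the se's window with the cfs_rq's and rebuild _sum from _avg. */
static void toy_attach(struct toy_sched_avg *se,
		       const struct toy_sched_avg *rq,
		       unsigned long weight)
{
	uint32_t divider = toy_divider(rq->period_contrib);

	se->period_contrib = rq->period_contrib;

	se->util_sum = (uint32_t)(se->util_avg * divider);

	se->load_sum = divider;
	if (weight)
		se->load_sum = (uint64_t)se->load_avg * se->load_sum / weight;
}
```

For example, attaching an se with load_avg = 1024, util_avg = 100 and weight 1024 to a cfs_rq that is 512us into its window gives a divider of 47230, so load_sum becomes 47230 and util_sum becomes 4723000. Both values are consistent with the cfs_rq's current window position rather than with a full LOAD_AVG_MAX.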
