Skip to content

Commit fe1f071

Browse files
Peter Newmanbp3tk0v
authored andcommitted
x86/resctrl: Fix task CLOSID/RMID update race
When the user moves a running task to a new rdtgroup using the task's file interface or by deleting its rdtgroup, the resulting change in CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the task(s) CPUs. x86 allows reordering loads with prior stores, so if the task starts running between a task_curr() check that the CPU hoisted before the stores in the CLOSID/RMID update then it can start running with the old CLOSID/RMID until it is switched again because __rdtgroup_move_task() failed to determine that it needs to be interrupted to obtain the new CLOSID/RMID. Refer to the diagram below: CPU 0 CPU 1 ----- ----- __rdtgroup_move_task(): curr <- t1->cpu->rq->curr __schedule(): rq->curr <- t1 resctrl_sched_in(): t1->{closid,rmid} -> {1,1} t1->{closid,rmid} <- {2,2} if (curr == t1) // false IPI(t1->cpu) A similar race impacts rdt_move_group_tasks(), which updates tasks in a deleted rdtgroup. In both cases, use smp_mb() to order the task_struct::{closid,rmid} stores before the loads in task_curr(). In particular, in the rdt_move_group_tasks() case, simply execute an smp_mb() on every iteration with a matching task. It is possible to use a single smp_mb() in rdt_move_group_tasks(), but this would require two passes and a means of remembering which task_structs were updated in the first loop. However, benchmarking results below showed too little performance impact in the simple approach to justify implementing the two-pass approach. Times below were collected using `perf stat` to measure the time to remove a group containing a 1600-task, parallel workload. CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads) # mkdir /sys/fs/resctrl/test # echo $$ > /sys/fs/resctrl/test/tasks # perf bench sched messaging -g 40 -l 100000 task-clock time ranges collected using: # perf stat rmdir /sys/fs/resctrl/test Baseline: 1.54 - 1.60 ms smp_mb() every matching task: 1.57 - 1.67 ms [ bp: Massage commit message. ] Fixes: ae28d1a ("x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR") Fixes: 0efc89b ("x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount") Signed-off-by: Peter Newman <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Reviewed-by: Babu Moger <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected]
1 parent 90b926e commit fe1f071

File tree

1 file changed

+11
-1
lines changed

1 file changed

+11
-1
lines changed

arch/x86/kernel/cpu/resctrl/rdtgroup.c

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -580,8 +580,10 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
580580
/*
581581
* Ensure the task's closid and rmid are written before determining if
582582
* the task is current that will decide if it will be interrupted.
583+
* This pairs with the full barrier between the rq->curr update and
584+
* resctrl_sched_in() during context switch.
583585
*/
584-
barrier();
586+
smp_mb();
585587

586588
/*
587589
* By now, the task's closid and rmid are set. If the task is current
@@ -2401,6 +2403,14 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to,
24012403
WRITE_ONCE(t->closid, to->closid);
24022404
WRITE_ONCE(t->rmid, to->mon.rmid);
24032405

2406+
/*
2407+
* Order the closid/rmid stores above before the loads
2408+
* in task_curr(). This pairs with the full barrier
2409+
* between the rq->curr update and resctrl_sched_in()
2410+
* during context switch.
2411+
*/
2412+
smp_mb();
2413+
24042414
/*
24052415
* If the task is on a CPU, set the CPU in the mask.
24062416
* The detection is inaccurate as tasks might move or

0 commit comments

Comments
 (0)