Commit a1150c2

liu-song-6 authored and Ingo Molnar committed
perf/core: Fix group scheduling with mixed hw and sw events

When hw and sw events are mixed in the same group, they are all attached
to the hw perf_event_context. This sometimes requires moving group of
perf_event to a different context.

We found a bug in how the kernel handles this, for example if we do:

   perf stat -e '{faults,ref-cycles,faults}' -I 1000

     1.005591180              1,297      faults
     1.005591180        457,476,576      ref-cycles
     1.005591180    <not supported>      faults

First, sw event "faults" is attached to the sw context, and becomes the
group leader. Then, hw event "ref-cycles" is attached, so both events
are moved to the hw context. Last, another sw "faults" tries to attach,
but it fails because of mismatch between the new target ctx (from sw
pmu) and the group_leader's ctx (hw context, same as ref-cycles).

The broken condition is:
   group_leader is sw event;
   group_leader is on hw context;
   add a sw event to the group.

Fix this scenario by checking group_leader's context (instead of just
event type). If group_leader is on hw context, use the ->pmu of this
context to look up context for the new event.

Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Fixes: b04243e ("perf: Complete software pmu grouping")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>

1 parent bd9c67a commit a1150c2
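
The scenario in the commit message can probably also be driven without the perf tool. The following is a minimal user-space sketch, not part of the commit: it opens the same three events as one group through the perf_event_open(2) system call. The helper names perf_open, init_attr and open_mixed_group are hypothetical, and the sketch counts only the calling thread with kernel-mode counting excluded, whereas the perf stat invocation above does not name a target, so that choice is an assumption. Going by the commit message, an affected kernel would be expected to refuse the third event; the exact error code is not stated there.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc has no wrapper for perf_event_open, so go through syscall(2). */
int perf_open(struct perf_event_attr *attr, int group_fd)
{
	/* pid = 0, cpu = -1: count the calling thread on any CPU. */
	return (int)syscall(SYS_perf_event_open, attr, 0, -1, group_fd, 0UL);
}

/* Hypothetical helper: fill in a minimal counting-mode attr. */
void init_attr(struct perf_event_attr *attr, __u32 type, __u64 config)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = type;
	attr->config = config;
	/* Assumption: user-space only, so unprivileged runs are more likely to work. */
	attr->exclude_kernel = 1;
	attr->exclude_hv = 1;
}

/*
 * Hypothetical helper: open {faults, ref-cycles, faults} as one group.
 * Returns 0 and fills fds[] on success. Otherwise closes what was
 * opened and returns the 1-based position of the event that failed.
 */
int open_mixed_group(int fds[3])
{
	static const struct { __u32 type; __u64 config; } evs[3] = {
		{ PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS },
		{ PERF_TYPE_HARDWARE, PERF_COUNT_HW_REF_CPU_CYCLES },
		{ PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS },
	};
	struct perf_event_attr attr;
	int i;

	for (i = 0; i < 3; i++) {
		init_attr(&attr, evs[i].type, evs[i].config);
		/* The first event is the group leader (group_fd = -1). */
		fds[i] = perf_open(&attr, i == 0 ? -1 : fds[0]);
		if (fds[i] < 0) {
			int failed = i + 1;

			while (i-- > 0)
				close(fds[i]);
			return failed;
		}
	}
	return 0;
}
```

A return value of 3 means the second sw event could not join the group, which is the symptom described above. A return value of 1 or 2 points at something else, such as missing permissions or a machine without a usable hardware PMU.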

2 files changed: +19 -10 lines changed

include/linux/perf_event.h

Lines changed: 8 additions & 0 deletions
@@ -1016,6 +1016,14 @@ static inline int is_software_event(struct perf_event *event)
 	return event->event_caps & PERF_EV_CAP_SOFTWARE;
 }
 
+/*
+ * Return 1 for event in sw context, 0 for event in hw context
+ */
+static inline int in_software_context(struct perf_event *event)
+{
+	return event->ctx->pmu->task_ctx_nr == perf_sw_context;
+}
+
 extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64);
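
The new helper answers a different question than is_software_event(): the latter reports what kind of event this is, the former reports which context it currently sits on. A rough user-space model, with simplified stand-in types that are not the kernel's (every model_ name and the ctx_nr enum are hypothetical), shows the state in which the two disagree:

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures; only the fields used here. */
enum ctx_nr { HW_CONTEXT, SW_CONTEXT };

struct model_pmu { enum ctx_nr task_ctx_nr; };
struct model_ctx { const struct model_pmu *pmu; };

struct model_event {
	bool sw_cap;                 /* stands in for PERF_EV_CAP_SOFTWARE */
	const struct model_ctx *ctx; /* context the event currently lives on */
};

/* What kind of event is this? */
bool model_is_software_event(const struct model_event *e)
{
	return e->sw_cap;
}

/* Which context is this event on right now? */
bool model_in_software_context(const struct model_event *e)
{
	return e->ctx->pmu->task_ctx_nr == SW_CONTEXT;
}

/*
 * The state from the commit message: a sw "faults" leader whose group
 * was moved to the hw context once "ref-cycles" joined.
 */
struct model_event model_moved_sw_leader(void)
{
	static const struct model_pmu hw_pmu = { HW_CONTEXT };
	static const struct model_ctx hw_ctx = { &hw_pmu };
	struct model_event leader = { .sw_cap = true, .ctx = &hw_ctx };

	return leader;
}
```

For the leader built by model_moved_sw_leader(), the type check says "software" while the context check says "not on the sw context". A test based on event type alone cannot see that second fact.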

kernel/events/core.c

Lines changed: 11 additions & 10 deletions
@@ -10521,19 +10521,20 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (pmu->task_ctx_nr == perf_sw_context)
 		event->event_caps |= PERF_EV_CAP_SOFTWARE;
 
-	if (group_leader &&
-	    (is_software_event(event) != is_software_event(group_leader))) {
-		if (is_software_event(event)) {
+	if (group_leader) {
+		if (is_software_event(event) &&
+		    !in_software_context(group_leader)) {
 			/*
-			 * If event and group_leader are not both a software
-			 * event, and event is, then group leader is not.
+			 * If the event is a sw event, but the group_leader
+			 * is on hw context.
 			 *
-			 * Allow the addition of software events to !software
-			 * groups, this is safe because software events never
-			 * fail to schedule.
+			 * Allow the addition of software events to hw
+			 * groups, this is safe because software events
+			 * never fail to schedule.
 			 */
-			pmu = group_leader->pmu;
-		} else if (is_software_event(group_leader) &&
+			pmu = group_leader->ctx->pmu;
+		} else if (!is_software_event(event) &&
+			   is_software_event(group_leader) &&
 			   (group_leader->group_caps & PERF_EV_CAP_SOFTWARE)) {
 			/*
 			 * In case the group is a pure software group, and we
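
To see why the old condition missed the case from the commit message, the first branch of the hunk above can be modelled in user space. This is a sketch under simplifying assumptions, not kernel code: the sim_ types and functions are hypothetical, each event carries only its own pmu and the pmu of the context it is on, and the second branch (truncated in the hunk) is left out.

```c
#include <stdbool.h>
#include <stddef.h>

enum sim_ctx_nr { SIM_HW_CONTEXT, SIM_SW_CONTEXT };

struct sim_pmu { enum sim_ctx_nr task_ctx_nr; };

struct sim_event {
	const struct sim_pmu *pmu;     /* the event's own pmu */
	const struct sim_pmu *ctx_pmu; /* stands in for event->ctx->pmu */
};

static bool sim_is_sw(const struct sim_event *e)
{
	return e->pmu->task_ctx_nr == SIM_SW_CONTEXT;
}

static bool sim_in_sw_ctx(const struct sim_event *e)
{
	return e->ctx_pmu->task_ctx_nr == SIM_SW_CONTEXT;
}

/* Pre-patch first branch: compares event types only. */
const struct sim_pmu *sim_pick_pmu_old(const struct sim_event *event,
				       const struct sim_event *leader)
{
	if (leader && sim_is_sw(event) != sim_is_sw(leader) && sim_is_sw(event))
		return leader->pmu;
	return event->pmu;
}

/* Post-patch first branch: looks at the context the leader is on. */
const struct sim_pmu *sim_pick_pmu_new(const struct sim_event *event,
				       const struct sim_event *leader)
{
	if (leader && sim_is_sw(event) && !sim_in_sw_ctx(leader))
		return leader->ctx_pmu;
	return event->pmu;
}
```

With a sw leader already sitting on the hw context and a new sw event, the old version sees two events of the same type, skips the branch, and keeps the sw pmu, so the context looked up for the new event does not match the leader's. The new version returns the pmu of the leader's context, which is the fix the commit message describes.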
