Commit 0f4cd76

kimphillamd authored and Peter Zijlstra committed
perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops

When counting dispatched micro-ops with cnt_ctl=1, in order to prevent
sample bias, IBS hardware preloads the least significant 7 bits of
current count (IbsOpCurCnt) with random values, such that, after the
interrupt is handled and counting resumes, the next sample taken
will be slightly perturbed.

The current count bitfield is in the IBS execution control h/w register,
alongside the maximum count field.

Currently, the IBS driver writes that register with the maximum count,
leaving zeroes to fill the current count field, thereby overwriting
the random bits the hardware preloaded for itself.

Fix the driver to actually retain and carry those random bits from the
read of the IBS control register, through to its write, instead of
overwriting the lower current count bits with zeroes.

Tested with:

    perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0 <workload>

'perf annotate' output before:

     15.70  65:   addsd     %xmm0,%xmm1
     17.30        add       $0x1,%rax
     15.88        cmp       %rdx,%rax
                  je        82
     17.32  72:   test      $0x1,%al
                  jne       7c
      7.52        movapd    %xmm1,%xmm0
      5.90        jmp       65
      8.23  7c:   sqrtsd    %xmm1,%xmm0
     12.15        jmp       65

'perf annotate' output after:

     16.63  65:   addsd     %xmm0,%xmm1
     16.82        add       $0x1,%rax
     16.81        cmp       %rdx,%rax
                  je        82
     16.69  72:   test      $0x1,%al
                  jne       7c
      8.30        movapd    %xmm1,%xmm0
      8.13        jmp       65
      8.24  7c:   sqrtsd    %xmm1,%xmm0
      8.39        jmp       65

Tested on Family 15h and 17h machines.

Machines prior to family 10h Rev. C don't have the RDWROPCNT capability,
and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't
affect their operation.

It is unknown why commit db98c5f ("perf/x86: Implement 64-bit
counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt
field; the number of preloaded random bits has always been 7, AFAICT.

Signed-off-by: Kim Phillips <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: "Arnaldo Carvalho de Melo" <[email protected]>
Cc: <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "Borislav Petkov" <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: "Namhyung Kim" <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

1 parent 44d3bbb commit 0f4cd76

2 files changed: +18 -7 lines changed

arch/x86/events/amd/ibs.c

Lines changed: 10 additions & 3 deletions
@@ -661,10 +661,17 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
 
 	throttle = perf_event_overflow(event, &data, &regs);
 out:
-	if (throttle)
+	if (throttle) {
 		perf_ibs_stop(event, 0);
-	else
-		perf_ibs_enable_event(perf_ibs, hwc, period >> 4);
+	} else {
+		period >>= 4;
+
+		if ((ibs_caps & IBS_CAPS_RDWROPCNT) &&
+		    (*config & IBS_OP_CNT_CTL))
+			period |= *config & IBS_OP_CUR_CNT_RAND;
+
+		perf_ibs_enable_event(perf_ibs, hwc, period);
+	}
 
 	perf_event_update_userpage(event);
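
For readers who want to experiment with the re-arm logic outside the kernel, the following is a minimal user-space sketch of the computation this hunk introduces. The helper names `ibs_op_rearm_value` and `ibs_op_rearm_example`, the `has_rdwropcnt` parameter (standing in for the `ibs_caps & IBS_CAPS_RDWROPCNT` test), and the sample register contents are assumptions made for illustration; only the two mask values come from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask values copied from the perf_event.h hunk of this commit. */
#define IBS_OP_CUR_CNT_RAND	(0x0007FULL << 32)
#define IBS_OP_CNT_CTL		(1ULL << 19)

/*
 * Hypothetical helper (not a kernel function): model the value the
 * patched handler passes to perf_ibs_enable_event().
 *   config        - IBS op control value read in the interrupt handler
 *   period        - next sampling period
 *   has_rdwropcnt - stands in for (ibs_caps & IBS_CAPS_RDWROPCNT)
 */
static uint64_t ibs_op_rearm_value(uint64_t config, uint64_t period,
				   bool has_rdwropcnt)
{
	uint64_t val = period >> 4;	/* same shift the driver applies */

	/* Carry the random low current-count bits only in cnt_ctl=1 mode. */
	if (has_rdwropcnt && (config & IBS_OP_CNT_CTL))
		val |= config & IBS_OP_CUR_CNT_RAND;

	return val;
}

/* Example with made-up register contents: random bits 0x5a, cnt_ctl set. */
static void ibs_op_rearm_example(void)
{
	uint64_t config = IBS_OP_CNT_CTL | (0x5AULL << 32);

	/* Random bits are retained when the capability is present... */
	assert(ibs_op_rearm_value(config, 100000, true) ==
	       ((100000ULL >> 4) | (0x5AULL << 32)));
	/* ...and dropped, as before the patch, when it is not. */
	assert(ibs_op_rearm_value(config, 100000, false) ==
	       (100000ULL >> 4));
}
```

Calling `ibs_op_rearm_example()` from a test harness exercises both branches. The sketch deliberately leaves out the MSR write and the throttle path, which remain in the driver.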

arch/x86/include/asm/perf_event.h

Lines changed: 8 additions & 4 deletions
@@ -252,16 +252,20 @@ struct pebs_lbr {
 #define IBSCTL_LVT_OFFSET_VALID		(1ULL<<8)
 #define IBSCTL_LVT_OFFSET_MASK		0x0F
 
-/* ibs fetch bits/masks */
+/* IBS fetch bits/masks */
 #define IBS_FETCH_RAND_EN	(1ULL<<57)
 #define IBS_FETCH_VAL		(1ULL<<49)
 #define IBS_FETCH_ENABLE	(1ULL<<48)
 #define IBS_FETCH_CNT		0xFFFF0000ULL
 #define IBS_FETCH_MAX_CNT	0x0000FFFFULL
 
-/* ibs op bits/masks */
-/* lower 4 bits of the current count are ignored: */
-#define IBS_OP_CUR_CNT		(0xFFFF0ULL<<32)
+/*
+ * IBS op bits/masks
+ * The lower 7 bits of the current count are random bits
+ * preloaded by hardware and ignored in software
+ */
+#define IBS_OP_CUR_CNT		(0xFFF80ULL<<32)
+#define IBS_OP_CUR_CNT_RAND	(0x0007FULL<<32)
 #define IBS_OP_CNT_CTL		(1ULL<<19)
 #define IBS_OP_VAL		(1ULL<<18)
 #define IBS_OP_ENABLE		(1ULL<<17)
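
As a sanity check on the new mask split, the sketch below verifies at compile time that `IBS_OP_CUR_CNT` and `IBS_OP_CUR_CNT_RAND` do not overlap and that together they cover the contiguous range `0xFFFFF << 32`. The `popcount64` and `ibs_op_rand_bits` helpers are hypothetical names introduced only for this illustration; the mask values are taken from the hunk above, and C11 `static_assert` is assumed to be available.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values as defined by this commit. */
#define IBS_OP_CUR_CNT		(0xFFF80ULL << 32)
#define IBS_OP_CUR_CNT_RAND	(0x0007FULL << 32)

/* The software-visible count and the random bits must not overlap. */
static_assert((IBS_OP_CUR_CNT & IBS_OP_CUR_CNT_RAND) == 0,
	      "count and random masks overlap");

/* Together they should form one contiguous field starting at bit 32. */
static_assert((IBS_OP_CUR_CNT | IBS_OP_CUR_CNT_RAND) == (0xFFFFFULL << 32),
	      "masks do not cover the whole field");

/* Hypothetical helper: portable population count. */
static unsigned int popcount64(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Hypothetical helper: number of bits in the random mask. */
static unsigned int ibs_op_rand_bits(void)
{
	return popcount64(IBS_OP_CUR_CNT_RAND);
}
```

`ibs_op_rand_bits()` evaluates to 7, matching the "lower 7 bits" wording of the new comment, whereas the old `0xFFFF0ULL` mask excluded only 4 bits.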
