Commit 5768402

virtuoso authored and Ingo Molnar committed
perf/ring_buffer: Use high order allocations for AUX buffers optimistically
Currently, the AUX buffer allocator will use high-order allocations
for PMUs that don't support hardware scatter-gather chaining to ensure
large contiguous blocks of pages, and always use an array of single
pages otherwise.

There is, however, a tangible performance benefit in using larger chunks
of contiguous memory even in the latter case, which comes from not having
to fetch the next page's address at every page boundary. In particular,
a task running under Intel PT on an Atom CPU shows 1.5%-2% less runtime
penalty with a single multi-page output region in snapshot mode (no PMI)
than with multiple single-page output regions, from ~6% down to ~4%.
For snapshot mode this does make a difference, as it is intended to
run over long periods of time.

For this reason, change the allocation policy to always optimistically
start with the highest possible order when allocating pages for the AUX
buffer, descending until the allocation succeeds or the order-zero
allocation fails.

Signed-off-by: Alexander Shishkin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
1 parent 6ea98b4 commit 5768402

1 file changed, +15 -17 lines changed

kernel/events/ring_buffer.c

Lines changed: 15 additions & 17 deletions
@@ -598,29 +598,27 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
 {
 	bool overwrite = !(flags & RING_BUFFER_WRITABLE);
 	int node = (event->cpu == -1) ? -1 : cpu_to_node(event->cpu);
-	int ret = -ENOMEM, max_order = 0;
+	int ret = -ENOMEM, max_order;
 
 	if (!has_aux(event))
 		return -EOPNOTSUPP;
 
-	if (event->pmu->capabilities & PERF_PMU_CAP_AUX_NO_SG) {
-		/*
-		 * We need to start with the max_order that fits in nr_pages,
-		 * not the other way around, hence ilog2() and not get_order.
-		 */
-		max_order = ilog2(nr_pages);
+	/*
+	 * We need to start with the max_order that fits in nr_pages,
+	 * not the other way around, hence ilog2() and not get_order.
+	 */
+	max_order = ilog2(nr_pages);
 
-		/*
-		 * PMU requests more than one contiguous chunks of memory
-		 * for SW double buffering
-		 */
-		if ((event->pmu->capabilities & PERF_PMU_CAP_AUX_SW_DOUBLEBUF) &&
-		    !overwrite) {
-			if (!max_order)
-				return -EINVAL;
+	/*
+	 * PMU requests more than one contiguous chunks of memory
+	 * for SW double buffering
+	 */
+	if ((event->pmu->capabilities & PERF_PMU_CAP_AUX_SW_DOUBLEBUF) &&
+	    !overwrite) {
+		if (!max_order)
+			return -EINVAL;
 
-			max_order--;
-		}
+		max_order--;
 	}
 
 	rb->aux_pages = kcalloc_node(nr_pages, sizeof(void *), GFP_KERNEL,