Commit d80da76

Kan Liang authored and acmel committed
perf c2c: Add option to enable the LBR stitching approach
With the LBR stitching approach, the reconstructed LBR call stack can go beyond the HW limitation. However, it may reconstruct invalid call stacks in some cases, e.g. exception handling such as setjmp/longjmp. It may also increase the processing time, especially when the number of samples with stitched LBRs is huge.

Add an option to enable the approach.

Signed-off-by: Kan Liang <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Pavel Gerasimov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vitaly Slobodskoy <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
1 parent 13e0c84 commit d80da76

2 files changed: +23 -0 lines changed


tools/perf/Documentation/perf-c2c.txt

Lines changed: 11 additions & 0 deletions
@@ -111,6 +111,17 @@ REPORT OPTIONS
 --display::
 	Switch to HITM type (rmt, lcl) to display and sort on. Total HITMs as default.
 
+--stitch-lbr::
+	Show callgraph with stitched LBRs, which may have more complete
+	callgraph. The perf.data file must have been obtained using
+	perf c2c record --call-graph lbr.
+	Disabled by default. In common cases with call stack overflows,
+	it can recreate better call stacks than the default lbr call stack
+	output. But this approach is not full proof. There can be cases
+	where it creates incorrect call stacks from incorrect matches.
+	The known limitations include exception handing such as
+	setjmp/longjmp will have calls/returns not match.
+
 C2C RECORD
 ----------
 The perf c2c record command setup options related to HITM cacheline analysis
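
To make the idea behind --stitch-lbr concrete, here is a toy sketch of stitching two call stacks captured by a depth-limited buffer. It is not perf's implementation: the function name `stitch`, the tiny `HW_DEPTH` of 4, and the single-entry overlap test are all assumptions made for illustration. Stacks are arrays of addresses with the innermost frame first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HW_DEPTH 4	/* hypothetical buffer depth, kept tiny for illustration */

/*
 * Toy stitcher (illustrative only, not perf's algorithm).
 * Copies cur into out. If cur filled the buffer, its outer frames may have
 * been dropped, so look for cur's outermost entry in prev and append the
 * frames that prev recorded beyond it. Returns the number of entries in out.
 */
static size_t stitch(const uint64_t *prev, size_t prev_n,
		     const uint64_t *cur, size_t cur_n,
		     uint64_t *out, size_t out_cap)
{
	size_t n = 0, i, j;

	assert(out != NULL);

	for (i = 0; i < cur_n && n < out_cap; i++)
		out[n++] = cur[i];

	/* A stack that did not fill the buffer lost nothing. */
	if (cur_n == 0 || cur_n < HW_DEPTH)
		return n;

	/* Find the overlap point: cur's outermost frame inside prev. */
	for (j = 0; j < prev_n; j++) {
		if (prev[j] == cur[cur_n - 1])
			break;
	}
	if (j == prev_n)
		return n;	/* no overlap, nothing to stitch */

	/* Append the older frames that only prev still holds. */
	for (j = j + 1; j < prev_n && n < out_cap; j++)
		out[n++] = prev[j];

	return n;
}
```

With prev = {0xd, 0xc, 0xb, 0xa} and cur = {0xf, 0xe, 0xd, 0xc}, the sketch yields the six frames 0xf, 0xe, 0xd, 0xc, 0xb, 0xa. Matching on a single entry also shows why such an approach can produce incorrect call stacks: an accidental match, or a stack rewritten by setjmp/longjmp, would splice unrelated frames together.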

tools/perf/builtin-c2c.c

Lines changed: 12 additions & 0 deletions
@@ -95,6 +95,7 @@ struct perf_c2c {
 	bool use_stdio;
 	bool stats_only;
 	bool symbol_full;
+	bool stitch_lbr;
 
 	/* HITM shared clines stats */
 	struct c2c_stats hitm_stats;
@@ -273,6 +274,9 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
 		return -1;
 	}
 
+	if (c2c.stitch_lbr)
+		al.thread->lbr_stitch_enable = true;
+
 	ret = sample__resolve_callchain(sample, &callchain_cursor, NULL,
 					evsel, &al, sysctl_perf_event_max_stack);
 	if (ret)
@@ -2601,6 +2605,12 @@ static int setup_callchain(struct evlist *evlist)
 		}
 	}
 
+	if (c2c.stitch_lbr && (mode != CALLCHAIN_LBR)) {
+		ui__warning("Can't find LBR callchain. Switch off --stitch-lbr.\n"
+			    "Please apply --call-graph lbr when recording.\n");
+		c2c.stitch_lbr = false;
+	}
+
 	callchain_param.record_mode = mode;
 	callchain_param.min_percent = 0;
 	return 0;
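
The hunk above degrades gracefully: if stitching was requested but the data was not recorded with LBR call graphs, it warns and turns the option off instead of failing. A minimal standalone sketch of that pattern follows; the enum, struct, and function names are hypothetical stand-ins and not perf's own types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the callchain record modes. */
enum record_mode { MODE_NONE, MODE_FP, MODE_DWARF, MODE_LBR };

/* Hypothetical holder for the user's request. */
struct stitch_request {
	bool stitch_lbr;
};

/*
 * Disable stitching when the recorded call graph is not LBR based.
 * Returns true if the request had to be switched off.
 */
static bool check_stitch_request(struct stitch_request *req,
				 enum record_mode mode)
{
	assert(req != NULL);

	if (req->stitch_lbr && mode != MODE_LBR) {
		fprintf(stderr,
			"Can't find LBR callchain. Switch off --stitch-lbr.\n");
		req->stitch_lbr = false;
		return true;
	}
	return false;
}
```

Clearing the flag once, before any sample is processed, means the per-sample path only needs a cheap boolean test.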
@@ -2752,6 +2762,8 @@ static int perf_c2c__report(int argc, const char **argv)
 	OPT_STRING('c', "coalesce", &coalesce, "coalesce fields",
 		   "coalesce fields: pid,tid,iaddr,dso"),
 	OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
+	OPT_BOOLEAN(0, "stitch-lbr", &c2c.stitch_lbr,
+		    "Enable LBR callgraph stitching approach"),
 	OPT_PARENT(c2c_options),
 	OPT_END()
 	};
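
The first argument of 0 in the new OPT_BOOLEAN entry means the option has no short form and is reachable only as --stitch-lbr. For readers unfamiliar with perf's option tables, the same shape can be sketched with the standard getopt_long() interface. This is an analogy only, not how perf parses options; `struct report_flags` and `parse_report_flags` are hypothetical names.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result structure for this sketch. */
struct report_flags {
	bool stitch_lbr;
	bool force;
};

/* Parse a long-only --stitch-lbr flag and a -f/--force flag. */
static int parse_report_flags(int argc, char **argv, struct report_flags *flags)
{
	int stitch = 0;
	const struct option longopts[] = {
		/* Long-only boolean: getopt_long stores 1 into stitch. */
		{ "stitch-lbr", no_argument, &stitch, 1 },
		{ "force",      no_argument, NULL, 'f' },
		{ NULL, 0, NULL, 0 },
	};
	int c;

	assert(flags != NULL);

	optind = 1;	/* restart scanning so repeated calls work */
	while ((c = getopt_long(argc, argv, "f", longopts, NULL)) != -1) {
		switch (c) {
		case 0:		/* flag-style long option, value already stored */
			break;
		case 'f':
			flags->force = true;
			break;
		default:	/* unknown option */
			return -1;
		}
	}
	flags->stitch_lbr = stitch != 0;
	return 0;
}
```

Because the flag defaults to false, stitching stays opt-in, which matches the "Disabled by default" wording in the documentation hunk.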
