Commit 13e0c84

Kan Liang authored and acmel committed
perf top: Add option to enable the LBR stitching approach
With the LBR stitching approach, the reconstructed LBR call stack can
break the HW limitation. However, it may reconstruct invalid call stacks
in some cases, e.g. exception handling such as setjmp/longjmp. Also, it
may impact the processing time, especially when the number of samples
with stitched LBRs is huge.

Add an option to enable the approach. The option must be used with
--call-graph lbr.

Signed-off-by: Kan Liang <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Pavel Gerasimov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vitaly Slobodskoy <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
1 parent 680d125 commit 13e0c84
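
The commit message describes stitching only at a high level. The following is a minimal sketch of the general idea, not perf's actual implementation: when the current sample's LBR buffer is full, look for its oldest entry in the previous sample's stack and carry over the older frames that follow it. All names here (`LBR_DEPTH`, `MAX_STITCHED`, `struct branch`, `struct lbr_stack`, `stitch_lbr`) are hypothetical, and the single-entry match is an assumption chosen for brevity.

```c
#include <stddef.h>
#include <stdint.h>

#define LBR_DEPTH    4   /* hypothetical hardware LBR depth */
#define MAX_STITCHED 16  /* hypothetical cap on a stitched stack */

struct branch {
	uint64_t from;
	uint64_t to;
};

/* entries[0] is the most recent call, entries[nr - 1] the oldest. */
struct lbr_stack {
	size_t nr;
	struct branch entries[MAX_STITCHED];
};

/*
 * Copy cur into out. If cur filled the hardware buffer, find cur's
 * oldest entry in prev and append everything older than it from prev.
 * Returns the number of entries in out.
 */
size_t stitch_lbr(const struct lbr_stack *prev, const struct lbr_stack *cur,
		  struct lbr_stack *out)
{
	const struct branch *oldest;
	size_t j, k;

	*out = *cur;

	/* A buffer that is not full lost nothing, so there is nothing to stitch. */
	if (cur->nr < LBR_DEPTH)
		return out->nr;

	oldest = &cur->entries[cur->nr - 1];

	for (j = 0; j < prev->nr; j++) {
		if (prev->entries[j].from != oldest->from ||
		    prev->entries[j].to != oldest->to)
			continue;

		/* Matched: carry over the frames older than the match. */
		for (k = j + 1; k < prev->nr && out->nr < MAX_STITCHED; k++)
			out->entries[out->nr++] = prev->entries[k];
		break;
	}

	return out->nr;
}
```

Matching on a single entry keeps the sketch short, and it also shows why the approach can go wrong: if control flow skips the usual call/return pairing (setjmp/longjmp, for example), an entry can match by coincidence and the carried-over frames no longer belong to the current stack.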

3 files changed: 21 additions, 0 deletions

tools/perf/Documentation/perf-top.txt

Lines changed: 9 additions & 0 deletions
@@ -319,6 +319,15 @@ Default is to monitor all CPUS.
 	go straight to the histogram browser, just like 'perf top' with no events
 	explicitely specified does.
 
+--stitch-lbr::
+	Show callgraph with stitched LBRs, which may have more complete
+	callgraph. The option must be used with --call-graph lbr recording.
+	Disabled by default. In common cases with call stack overflows,
+	it can recreate better call stacks than the default lbr call stack
+	output. But this approach is not full proof. There can be cases
+	where it creates incorrect call stacks from incorrect matches.
+	The known limitations include exception handing such as
+	setjmp/longjmp will have calls/returns not match.
 
 INTERACTIVE PROMPTING KEYS
 --------------------------

tools/perf/builtin-top.c

Lines changed: 11 additions & 0 deletions
@@ -33,6 +33,7 @@
 #include "util/map.h"
 #include "util/mmap.h"
 #include "util/session.h"
+#include "util/thread.h"
 #include "util/symbol.h"
 #include "util/synthetic-events.h"
 #include "util/top.h"
@@ -775,6 +776,9 @@ static void perf_event__process_sample(struct perf_tool *tool,
 	if (machine__resolve(machine, &al, sample) < 0)
 		return;
 
+	if (top->stitch_lbr)
+		al.thread->lbr_stitch_enable = true;
+
 	if (!machine->kptr_restrict_warned &&
 	    symbol_conf.kptr_restrict &&
 	    al.cpumode == PERF_RECORD_MISC_KERNEL) {
@@ -1571,6 +1575,8 @@ int cmd_top(int argc, const char **argv)
 		    "Sort the output by the event at the index n in group. "
 		    "If n is invalid, sort by the first event. "
 		    "WARNING: should be used on grouped events."),
+	OPT_BOOLEAN(0, "stitch-lbr", &top.stitch_lbr,
+		    "Enable LBR callgraph stitching approach"),
 	OPTS_EVSWITCH(&top.evswitch),
 	OPT_END()
 	};
@@ -1640,6 +1646,11 @@ int cmd_top(int argc, const char **argv)
 		}
 	}
 
+	if (top.stitch_lbr && !(callchain_param.record_mode == CALLCHAIN_LBR)) {
+		pr_err("Error: --stitch-lbr must be used with --call-graph lbr\n");
+		goto out_delete_evlist;
+	}
+
 	if (opts->branch_stack && callchain_param.enabled)
 		symbol_conf.show_branchflag_count = true;

tools/perf/util/top.h

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ struct perf_top {
 	bool use_tui, use_stdio;
 	bool vmlinux_warned;
 	bool dump_symtab;
+	bool stitch_lbr;
 	struct hist_entry *sym_filter_entry;
 	struct evsel *sym_evsel;
 	struct perf_session *session;
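
The new `stitch_lbr` field is a tool-wide setting, while the `perf_event__process_sample` hunk copies it onto the thread that each sample resolves to. A reduced model of that propagation, using hypothetical `struct top_model` and `struct thread_model` types rather than perf's real structures, could be sketched as:

```c
#include <stdbool.h>

/* Hypothetical reduction of the tool-wide state. */
struct top_model {
	bool stitch_lbr;
};

/* Hypothetical reduction of the per-thread state. */
struct thread_model {
	bool lbr_stitch_enable;
};

/*
 * Called once per sample: if stitching was requested on the command
 * line, mark the sample's thread so later call chain processing can
 * see the setting. The flag is only ever set here, never cleared.
 */
void process_sample(const struct top_model *top, struct thread_model *thread)
{
	if (top->stitch_lbr)
		thread->lbr_stitch_enable = true;
}
```

Setting the flag lazily per sample means threads that never produce a sample are never touched, at the cost of one branch per sample.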
