Commit 2322d6c

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more perf tooling updates from Thomas Gleixner:
 "Perf tool updates and fixes:

  perf stat:

   - Display user and system time for workload targets (Jiri Olsa)

  perf record:

   - Enable arbitrary event names thru name= modifier (Alexey Budankov)

  PowerPC:

   - Add a python script for hypervisor call statistics (Ravi Bangoria)

  Intel PT: (Adrian Hunter)

   - Fix sync_switch INTEL_PT_SS_NOT_TRACING

   - Fix decoding to accept CBR between FUP and corresponding TIP

   - Fix MTC timing after overflow

   - Fix "Unexpected indirect branch" error

  perf test:

   - record+probe_libc_inet_pton:

      - To get the symbol table for dynamic shared objects on ubuntu we
        need to pass the -D/--dynamic command line option, unlike with
        the fedora distros (Arnaldo Carvalho de Melo)

   - code-reading:

      - Fix perf_env setup for PTI entry trampolines (Adrian Hunter)

   - kmod-path:

      - Add tests for vdso32 and vdsox32 (Adrian Hunter)

   - Use header file util/debug.h (Thomas Richter)

  perf annotate:

   - Make the various UI backends (stdio, TUI, gtk) use more
     consistently structs with annotation options as specified by the
     user (Arnaldo Carvalho de Melo)

   - Move annotation specific knobs from the symbol_conf global kitchen
     sink to the annotation option structs (Arnaldo Carvalho de Melo)

  perf script:

   - Add more PMU fields to python scripts event handler dict (Jin Yao)

  Core:

   - Fix misleading error for some unparsable events mentioning PMUs
     when those are not involved in the problem (Jiri Olsa)

   - Consider BSS symbols when processing /proc/kallsyms ('B' and 'b')
     (Arnaldo Carvalho de Melo)

   - Be more robust when trying to use per-symbol histograms, checking
     for unlikely but possible cases where the space for the histograms
     wasn't allocated, print a debug message for such cases (Arnaldo
     Carvalho de Melo)

   - Fix symbol and object code resolution for vdso32 and vdsox32
     (Adrian Hunter)

   - No need to check for null when passing pointers to foo__get() style
     refcount grabbing helpers, just like in the kernel and with free(),
     it's safe to pass a NULL pointer to avoid having to check it before
     each and every foo__get() call (Arnaldo Carvalho de Melo)

   - Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)

   - Remove some needless globals, making them local (Arnaldo Carvalho
     de Melo)

   - Reduce usage of symbol_conf.use_callchain, using other means of
     finding out if callchains are in use or available for specific
     events, as we evolved this codebase to allow requesting callchains
     for just a subset of the monitored events. In time it will help
     polish recording and showing mixed sets across the various tools:

       perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions'

     (Arnaldo Carvalho de Melo)

   - Consider PTI entry trampolines in map__rip_2objdump() (Adrian
     Hunter)"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  perf script python: Add dict fields introduction to Documentation
  perf script python: Add more PMU fields to event handler dict
  perf script python: Move dsoname code to a new function
  perf symbols: Add BSS symbols when reading from /proc/kallsyms
  perf annnotate: Make __symbol__inc_addr_samples handle src->histograms == NULL
  perf intel-pt: Fix "Unexpected indirect branch" error
  perf intel-pt: Fix MTC timing after overflow
  perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
  perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
  perf script powerpc: Python script for hypervisor call statistics
  perf test record+probe_libc_inet_pton: Ask 'nm' for dynamic symbols
  perf map: Consider PTI entry trampolines in rip_2objdump()
  perf test code-reading: Fix perf_env setup for PTI entry trampolines
  perf tools: Fix pmu events parsing rule
  perf stat: Display user and system time
  perf record: Enable arbitrary event names thru name= modifier
  perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
  perf tests kmod-path: Add tests for vdso32 and vdsox32
  perf hists: Check if a hist_entry has callchains before using them
  perf hists: Introduce hist_entry__has_callchain() method
  ...

2 parents 9f3fbe8 + 2696ec4 commit 2322d6c

59 files changed: 998 additions, 427 deletions

tools/perf/Documentation/perf-list.txt

Lines changed: 5 additions & 1 deletion
@@ -124,7 +124,11 @@ The available PMUs and their raw parameters can be listed with
 For example the raw event "LSD.UOPS" core pmu event above could
 be specified as

-  perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=1/ ...
+  perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...
+
+  or using extended name syntax
+
+  perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...

 PER SOCKET PMUS
 ---------------

tools/perf/Documentation/perf-record.txt

Lines changed: 3 additions & 0 deletions
@@ -57,6 +57,9 @@ OPTIONS
 	  FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode and
 	  "no" for disable callgraph.
 	- 'stack-size': user stack size for dwarf mode
+	- 'name' : User defined event name. Single quotes (') may be used to
+		   escape symbols in the name from parsing by shell and tool
+		   like this: name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\'.

 	  See the linkperf:perf-list[1] man page for more parameters.
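
The quoting rule documented for name= can be checked without running perf. The sketch below is only an illustration: it uses Python's shlex module as a stand-in for POSIX shell word splitting. The helper name tool_sees and the placeholder workload "true" are assumptions and not part of this commit. The event specification itself is copied from the perf-list.txt example.

```python
import shlex

# Event spec copied from the perf-list.txt example; "true" is a placeholder workload.
ESCAPED = r"perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ true"
UNESCAPED = "perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name='LSD.UOPS_CYCLES:cmask=0x1'/ true"


def tool_sees(command_line):
    """Return the argument that follows -e after shell-style word splitting."""
    argv = shlex.split(command_line)  # POSIX rules: \' yields a literal quote
    return argv[argv.index("-e") + 1]


if __name__ == "__main__":
    print(tool_sees(ESCAPED))    # single quotes survive and reach the tool
    print(tool_sees(UNESCAPED))  # the shell has already stripped the quotes
```

With the backslash-escaped form the single quotes arrive in the argument, so the event parser can use them to delimit the name. With plain quotes the shell consumes them and the ':' inside the name reaches the parser unprotected.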

tools/perf/Documentation/perf-script-python.txt

Lines changed: 26 additions & 0 deletions
@@ -610,6 +610,32 @@ Various utility functions for use with perf script:
   nsecs_str(nsecs) - returns printable string in the form secs.nsecs
   avg(total, n) - returns average given a sum and a total number of values

+SUPPORTED FIELDS
+----------------
+
+Currently supported fields:
+
+ev_name, comm, pid, tid, cpu, ip, time, period, phys_addr, addr,
+symbol, dso, time_enabled, time_running, values, callchain,
+brstack, brstacksym, datasrc, datasrc_decode, iregs, uregs,
+weight, transaction, raw_buf, attr.
+
+Some fields have sub items:
+
+brstack:
+    from, to, from_dsoname, to_dsoname, mispred,
+    predicted, in_tx, abort, cycles.
+
+brstacksym:
+    items: from, to, pred, in_tx, abort (converted string)
+
+For example,
+We can use this code to print brstack "from", "to", "cycles".
+
+if 'brstack' in dict:
+	for entry in dict['brstack']:
+		print "from %s, to %s, cycles %s" % (entry["from"], entry["to"], entry["cycles"])
+
 SEE ALSO
 --------
 linkperf:perf-script[1]
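
The documentation example added above uses the Python 2 print statement. A minimal Python 3 sketch of the same idea follows. The helper name format_brstack and the sample values are made up for illustration; only the dict keys ("brstack", "from", "to", "cycles") come from the documentation, and process_event(param_dict) is assumed to be the generic handler entry point.

```python
def format_brstack(event):
    """Return one formatted line per branch stack entry, or [] if there is none."""
    lines = []
    for entry in event.get("brstack", []):
        lines.append("from %s, to %s, cycles %s"
                     % (entry["from"], entry["to"], entry["cycles"]))
    return lines


def process_event(param_dict):
    # Assumed handler entry point: print whatever branch entries the sample carries.
    for line in format_brstack(param_dict):
        print(line)


if __name__ == "__main__":
    # Hypothetical sample; real events carry many more fields.
    process_event({"ev_name": "cycles",
                   "brstack": [{"from": 4096, "to": 8192, "cycles": 3}]})
```

Keeping the formatting in a function that returns strings makes the handler easy to exercise outside perf with hand-built dicts.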

tools/perf/Documentation/perf-stat.txt

Lines changed: 29 additions & 11 deletions
@@ -310,20 +310,38 @@ Users who wants to get the actual value can apply --no-metric-only.
 EXAMPLES
 --------

-$ perf stat -- make -j
+$ perf stat -- make

- Performance counter stats for 'make -j':
+   Performance counter stats for 'make':

-    8117.370256  task clock ticks     #      11.281 CPU utilization factor
-            678  context switches     #       0.000 M/sec
-            133  CPU migrations       #       0.000 M/sec
-         235724  pagefaults           #       0.029 M/sec
-    24821162526  CPU cycles           #    3057.784 M/sec
-    18687303457  instructions         #    2302.138 M/sec
-      172158895  cache references     #      21.209 M/sec
-       27075259  cache misses         #       3.335 M/sec
+        83723.452481      task-clock:u (msec)       #    1.004 CPUs utilized
+                   0      context-switches:u        #    0.000 K/sec
+                   0      cpu-migrations:u          #    0.000 K/sec
+           3,228,188      page-faults:u             #    0.039 M/sec
+     229,570,665,834      cycles:u                  #    2.742 GHz
+     313,163,853,778      instructions:u            #    1.36  insn per cycle
+      69,704,684,856      branches:u                #  832.559 M/sec
+       2,078,861,393      branch-misses:u           #    2.98% of all branches

- Wall-clock time elapsed:   719.554352 msecs
+      83.409183620 seconds time elapsed
+
+      74.684747000 seconds user
+       8.739217000 seconds sys
+
+TIMINGS
+-------
+As displayed in the example above we can display 3 types of timings.
+We always display the time the counters were enabled/alive:
+
+        83.409183620 seconds time elapsed
+
+For workload sessions we also display time the workloads spent in
+user/system lands:
+
+        74.684747000 seconds user
+         8.739217000 seconds sys
+
+Those times are the very same as displayed by the 'time' tool.

 CSV FORMAT
 ----------
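
The three timings described in the new TIMINGS section can be reproduced outside perf. The sketch below is an analogy rather than perf's implementation: it runs a child workload and reads its user and system CPU time from resource.getrusage(RUSAGE_CHILDREN). This is comparable to what the 'time' tool reports. The function names and the sample workload are assumptions, and the code is Unix-only.

```python
import resource
import subprocess
import sys
import time


def time_workload(argv):
    """Run argv as a child; return (elapsed, user, sys) in seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=False)  # waits, so the child's usage is accounted
    elapsed = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return (elapsed,
            after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


def format_timings(elapsed, user, system):
    """Lay the numbers out in the same order as the perf stat example."""
    return ("%16.9f seconds time elapsed\n\n"
            "%16.9f seconds user\n"
            "%16.9f seconds sys" % (elapsed, user, system))


if __name__ == "__main__":
    # Hypothetical workload: a short CPU-bound Python one-liner.
    print(format_timings(*time_workload(
        [sys.executable, "-c", "sum(range(1000000))"])))
```

Elapsed time is wall-clock time, while user and sys are CPU time. Their sum can therefore be lower than elapsed for a workload that sleeps, or higher for one that runs on several CPUs.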

tools/perf/arch/common.c

Lines changed: 2 additions & 2 deletions
@@ -189,7 +189,7 @@ static int perf_env__lookup_binutils_path(struct perf_env *env,
 	return -1;
 }

-int perf_env__lookup_objdump(struct perf_env *env)
+int perf_env__lookup_objdump(struct perf_env *env, const char **path)
 {
 	/*
 	 * For live mode, env->arch will be NULL and we can use
@@ -198,5 +198,5 @@ int perf_env__lookup_objdump(struct perf_env *env)
 	if (env->arch == NULL)
 		return 0;

-	return perf_env__lookup_binutils_path(env, "objdump", &objdump_path);
+	return perf_env__lookup_binutils_path(env, "objdump", path);
 }

tools/perf/arch/common.h

Lines changed: 1 addition & 3 deletions
@@ -4,8 +4,6 @@

 #include "../util/env.h"

-extern const char *objdump_path;
-
-int perf_env__lookup_objdump(struct perf_env *env);
+int perf_env__lookup_objdump(struct perf_env *env, const char **path);

 #endif /* ARCH_PERF_COMMON_H */

tools/perf/builtin-annotate.c

Lines changed: 18 additions & 18 deletions
@@ -40,9 +40,8 @@
 struct perf_annotate {
 	struct perf_tool tool;
 	struct perf_session *session;
+	struct annotation_options opts;
 	bool use_tui, use_stdio, use_stdio2, use_gtk;
-	bool full_paths;
-	bool print_line;
 	bool skip_missing;
 	bool has_br_stack;
 	bool group_set;
@@ -162,12 +161,12 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
 	hist__account_cycles(sample->branch_stack, al, sample, false);

 	bi = he->branch_info;
-	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);

 	if (err)
 		goto out;

-	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);

 out:
 	return err;
@@ -249,7 +248,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
 	if (he == NULL)
 		return -ENOMEM;

-	ret = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
+	ret = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
 	hists__inc_nr_samples(hists, true);
 	return ret;
 }
@@ -289,10 +288,9 @@ static int hist_entry__tty_annotate(struct hist_entry *he,
 				    struct perf_annotate *ann)
 {
 	if (!ann->use_stdio2)
-		return symbol__tty_annotate(he->ms.sym, he->ms.map, evsel,
-					    ann->print_line, ann->full_paths, 0, 0);
-	return symbol__tty_annotate2(he->ms.sym, he->ms.map, evsel,
-				     ann->print_line, ann->full_paths);
+		return symbol__tty_annotate(he->ms.sym, he->ms.map, evsel, &ann->opts);
+
+	return symbol__tty_annotate2(he->ms.sym, he->ms.map, evsel, &ann->opts);
 }

 static void hists__find_annotations(struct hists *hists,
@@ -343,7 +341,7 @@ static void hists__find_annotations(struct hists *hists,
 			/* skip missing symbols */
 			nd = rb_next(nd);
 		} else if (use_browser == 1) {
-			key = hist_entry__tui_annotate(he, evsel, NULL);
+			key = hist_entry__tui_annotate(he, evsel, NULL, &ann->opts);

 			switch (key) {
 			case -1:
@@ -390,8 +388,9 @@ static int __cmd_annotate(struct perf_annotate *ann)
 		goto out;
 	}

-	if (!objdump_path) {
-		ret = perf_env__lookup_objdump(&session->header.env);
+	if (!ann->opts.objdump_path) {
+		ret = perf_env__lookup_objdump(&session->header.env,
+					       &ann->opts.objdump_path);
 		if (ret)
 			goto out;
 	}
@@ -476,6 +475,7 @@ int cmd_annotate(int argc, const char **argv)
 			.ordered_events = true,
 			.ordering_requires_timestamps = true,
 		},
+		.opts = annotation__default_options,
 	};
 	struct perf_data data = {
 		.mode = PERF_DATA_MODE_READ,
@@ -503,9 +503,9 @@ int cmd_annotate(int argc, const char **argv)
 		   "file", "vmlinux pathname"),
 	OPT_BOOLEAN('m', "modules", &symbol_conf.use_modules,
 		    "load module symbols - WARNING: use only with -k and LIVE kernel"),
-	OPT_BOOLEAN('l', "print-line", &annotate.print_line,
+	OPT_BOOLEAN('l', "print-line", &annotate.opts.print_lines,
 		    "print matching source lines (may be slow)"),
-	OPT_BOOLEAN('P', "full-paths", &annotate.full_paths,
+	OPT_BOOLEAN('P', "full-paths", &annotate.opts.full_path,
 		    "Don't shorten the displayed pathnames"),
 	OPT_BOOLEAN(0, "skip-missing", &annotate.skip_missing,
 		    "Skip symbols that cannot be annotated"),
@@ -516,13 +516,13 @@ int cmd_annotate(int argc, const char **argv)
 	OPT_CALLBACK(0, "symfs", NULL, "directory",
 		     "Look for files with symbols relative to this directory",
 		     symbol__config_symfs),
-	OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src,
+	OPT_BOOLEAN(0, "source", &annotate.opts.annotate_src,
 		    "Interleave source code with assembly code (default)"),
-	OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
+	OPT_BOOLEAN(0, "asm-raw", &annotate.opts.show_asm_raw,
 		    "Display raw encoding of assembly instructions (default)"),
-	OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
+	OPT_STRING('M', "disassembler-style", &annotate.opts.disassembler_style, "disassembler style",
 		   "Specify disassembler style (e.g. -M intel for intel syntax)"),
-	OPT_STRING(0, "objdump", &objdump_path, "path",
+	OPT_STRING(0, "objdump", &annotate.opts.objdump_path, "path",
 		   "objdump binary to use for disassembly and annotations"),
 	OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
 		    "Show event group information together"),

tools/perf/builtin-c2c.c

Lines changed: 1 addition & 1 deletion
@@ -1976,7 +1976,7 @@ static int filter_cb(struct hist_entry *he)
 	c2c_he = container_of(he, struct c2c_hist_entry, he);

 	if (c2c.show_src && !he->srcline)
-		he->srcline = hist_entry__get_srcline(he);
+		he->srcline = hist_entry__srcline(he);

 	calc_width(c2c_he);

tools/perf/builtin-kvm.c

Lines changed: 0 additions & 2 deletions
@@ -1438,8 +1438,6 @@ static int kvm_events_live(struct perf_kvm_stat *kvm,
 		goto out;
 	}

-	symbol_conf.nr_events = kvm->evlist->nr_entries;
-
 	if (perf_evlist__create_maps(kvm->evlist, &kvm->opts.target) < 0)
 		usage_with_options(live_usage, live_options);

tools/perf/builtin-probe.c

Lines changed: 1 addition & 2 deletions
@@ -81,8 +81,7 @@ static int parse_probe_event(const char *str)
 		params.target_used = true;
 	}

-	if (params.nsi)
-		pev->nsi = nsinfo__get(params.nsi);
+	pev->nsi = nsinfo__get(params.nsi);

 	/* Parse a perf-probe command into event */
 	ret = parse_perf_probe_command(str, pev);

tools/perf/builtin-report.c

Lines changed: 19 additions & 20 deletions
@@ -71,6 +71,7 @@ struct report {
 	bool group_set;
 	int max_stack;
 	struct perf_read_values show_threads_values;
+	struct annotation_options annotation_opts;
 	const char *pretty_printing_style;
 	const char *cpu_list;
 	const char *symbol_filter_str;
@@ -136,26 +137,25 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,

 	if (sort__mode == SORT_MODE__BRANCH) {
 		bi = he->branch_info;
-		err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+		err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
 		if (err)
 			goto out;

-		err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+		err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);

 	} else if (rep->mem_mode) {
 		mi = he->mem_info;
-		err = addr_map_symbol__inc_samples(&mi->daddr, sample, evsel->idx);
+		err = addr_map_symbol__inc_samples(&mi->daddr, sample, evsel);
 		if (err)
 			goto out;

-		err = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
+		err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);

 	} else if (symbol_conf.cumulate_callchain) {
 		if (single)
-			err = hist_entry__inc_addr_samples(he, sample, evsel->idx,
-							   al->addr);
+			err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
 	} else {
-		err = hist_entry__inc_addr_samples(he, sample, evsel->idx, al->addr);
+		err = hist_entry__inc_addr_samples(he, sample, evsel, al->addr);
 	}

 out:
@@ -181,11 +181,11 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
 			     rep->nonany_branch_mode);

 	bi = he->branch_info;
-	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
 	if (err)
 		goto out;

-	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel);

 	branch_type_count(&rep->brtype_stat, &bi->flags,
 			  bi->from.addr, bi->to.addr);
@@ -561,7 +561,7 @@ static int report__browse_hists(struct report *rep)
 		ret = perf_evlist__tui_browse_hists(evlist, help, NULL,
 						    rep->min_percent,
 						    &session->header.env,
-						    true);
+						    true, &rep->annotation_opts);
 		/*
 		 * Usually "ret" is the last pressed key, and we only
 		 * care if the key notifies us to switch data file.
@@ -946,12 +946,6 @@ parse_percent_limit(const struct option *opt, const char *str,
 	return 0;
 }

-#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
-
-const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
-				     CALLCHAIN_REPORT_HELP
-				     "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
-
 int cmd_report(int argc, const char **argv)
 {
 	struct perf_session *session;
@@ -960,6 +954,10 @@ int cmd_report(int argc, const char **argv)
 	bool has_br_stack = false;
 	int branch_mode = -1;
 	bool branch_call_mode = false;
+#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
+	const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
+					     CALLCHAIN_REPORT_HELP
+					     "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
 	char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
 	const char * const report_usage[] = {
 		"perf report [<options>]",
@@ -989,6 +987,7 @@ int cmd_report(int argc, const char **argv)
 		.max_stack = PERF_MAX_STACK_DEPTH,
 		.pretty_printing_style = "normal",
 		.socket_filter = -1,
+		.annotation_opts = annotation__default_options,
 	};
 	const struct option options[] = {
 	OPT_STRING('i', "input", &input_name, "file",
@@ -1078,11 +1077,11 @@ int cmd_report(int argc, const char **argv)
 		   "list of cpus to profile"),
 	OPT_BOOLEAN('I', "show-info", &report.show_full_info,
 		    "Display extended information about perf.data file"),
-	OPT_BOOLEAN(0, "source", &symbol_conf.annotate_src,
+	OPT_BOOLEAN(0, "source", &report.annotation_opts.annotate_src,
 		    "Interleave source code with assembly code (default)"),
-	OPT_BOOLEAN(0, "asm-raw", &symbol_conf.annotate_asm_raw,
+	OPT_BOOLEAN(0, "asm-raw", &report.annotation_opts.show_asm_raw,
 		    "Display raw encoding of assembly instructions (default)"),
-	OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
+	OPT_STRING('M', "disassembler-style", &report.annotation_opts.disassembler_style, "disassembler style",
 		   "Specify disassembler style (e.g. -M intel for intel syntax)"),
 	OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
 		    "Show a column with the sum of periods"),
@@ -1093,7 +1092,7 @@ int cmd_report(int argc, const char **argv)
 		     parse_branch_mode),
 	OPT_BOOLEAN(0, "branch-history", &branch_call_mode,
 		    "add last branch records to call history"),
-	OPT_STRING(0, "objdump", &objdump_path, "path",
+	OPT_STRING(0, "objdump", &report.annotation_opts.objdump_path, "path",
 		   "objdump binary to use for disassembly and annotations"),
 	OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
 		    "Disable symbol demangling"),
