
Commit 55b4ce6

Ingo Molnar authored and committed
Merge tag 'perf-core-for-mingo-4.17-20180305' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

- Be more robust when drawing arrows in the annotation TUI, avoiding a segfault when jump instructions have as a target addresses in functions other than the one currently being annotated. The full fix will come in the following days, when jumping to other functions will work as call instructions do (Arnaldo Carvalho de Melo)

- Allow asking for the maximum allowed sample rate in 'top' and 'record', i.e. 'perf record -F max' will read the kernel.perf_event_max_sample_rate sysctl and use it (Arnaldo Carvalho de Melo)

- When the user specifies a freq above kernel.perf_event_max_sample_rate, throttle it down to that max freq and warn the user about it; also add --strict-freq so that the previous behaviour of not starting the session when the desired freq can't be used can still be selected (Arnaldo Carvalho de Melo)

- Find 'call' instruction target symbol at parsing time, used so far in the TUI, part of the infrastructure changes that will end up allowing jumps to navigate to other functions, just like 'call' instructions (Arnaldo Carvalho de Melo)

- Use xyarray dimensions to iterate fds in 'perf stat' (Andi Kleen)

- Ignore threads for which the current user hasn't permissions when enabling system-wide --per-thread (Jin Yao)

- Fix some backtrace perf test cases to use 'perf record' + 'perf script' instead, till 'perf trace' starts using ordered_events or equivalent to avoid symbol resolving artifacts due to reordering of PERF_RECORD_MMAP events (Jiri Olsa)

- Fix crash in 'perf record' pipe mode, which needs to allocate the ID array even for a single event, unlike non-pipe mode (Jiri Olsa)

- Make the annoying fallback message, emitted when a newer 'perf top' binary tries to use overwrite mode on an older kernel that doesn't support it, appear only in debug output (Kan Liang)

- Switch the last users of the old APIs to the newer perf_mmap__read_event() one, then discard those old mmap read forward APIs (Kan Liang)

- Fix the usage on the 'perf kallsyms' man page (Sangwon Hong)

- Simplify cgroup arguments when tracking multiple events (weiping zhang)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
2 parents 8af3136 + 6afad54 commit 55b4ce6

34 files changed: +328, -146 lines

tools/perf/Documentation/perf-kallsyms.txt

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ perf-kallsyms - Searches running kernel for symbols
 SYNOPSIS
 --------
 [verse]
-'perf kallsyms <options> symbol_name[,symbol_name...]'
+'perf kallsyms' [<options>] symbol_name[,symbol_name...]

 DESCRIPTION
 -----------

tools/perf/Documentation/perf-record.txt

Lines changed: 13 additions & 2 deletions
@@ -191,9 +191,16 @@ OPTIONS
 -i::
 --no-inherit::
 	Child tasks do not inherit counters.
+
 -F::
 --freq=::
-	Profile at this frequency.
+	Profile at this frequency. Use 'max' to use the currently maximum
+	allowed frequency, i.e. the value in the kernel.perf_event_max_sample_rate
+	sysctl. Will throttle down to the currently maximum allowed frequency.
+	See --strict-freq.
+
+--strict-freq::
+	Fail if the specified frequency can't be used.

 -m::
 --mmap-pages=::
@@ -308,7 +315,11 @@ can be provided. Each cgroup is applied to the corresponding event, i.e., first
 to first event, second cgroup to second event and so on. It is possible to provide
 an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
 corresponding events, i.e., they always refer to events defined earlier on the command
-line.
+line. If the user wants to track multiple events for a specific cgroup, the user can
+use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'.
+
+If wanting to monitor, say, 'cycles' for a cgroup and also for system wide, this
+command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'.

 -b::
 --branch-any::
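
For reference, the 'max' value documented above comes from the kernel.perf_event_max_sample_rate sysctl; a small standalone C sketch that reads it via its procfs path (an illustration of where the number lives, not code from this commit):

#include <stdio.h>

int main(void)
{
	/* kernel.perf_event_max_sample_rate is exposed at this procfs path. */
	const char *path = "/proc/sys/kernel/perf_event_max_sample_rate";
	unsigned int max_rate = 0;
	FILE *f = fopen(path, "r");

	if (!f) {
		perror(path);
		return 1;
	}
	if (fscanf(f, "%u", &max_rate) != 1) {
		fprintf(stderr, "could not parse %s\n", path);
		fclose(f);
		return 1;
	}
	fclose(f);
	printf("perf record -F max would sample at %u Hz\n", max_rate);
	return 0;
}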

tools/perf/Documentation/perf-stat.txt

Lines changed: 5 additions & 1 deletion
@@ -118,7 +118,11 @@ can be provided. Each cgroup is applied to the corresponding event, i.e., first
 to first event, second cgroup to second event and so on. It is possible to provide
 an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
 corresponding events, i.e., they always refer to events defined earlier on the command
-line.
+line. If the user wants to track multiple events for a specific cgroup, the user can
+use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'.
+
+If wanting to monitor, say, 'cycles' for a cgroup and also for system wide, this
+command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'.

 -o file::
 --output file::

tools/perf/Documentation/perf-top.txt

Lines changed: 3 additions & 1 deletion
@@ -55,7 +55,9 @@ Default is to monitor all CPUS.

 -F <freq>::
 --freq=<freq>::
-	Profile at this frequency.
+	Profile at this frequency. Use 'max' to use the currently maximum
+	allowed frequency, i.e. the value in the kernel.perf_event_max_sample_rate
+	sysctl.

 -i::
 --inherit::

tools/perf/arch/x86/tests/perf-time-to-tsc.c

Lines changed: 9 additions & 2 deletions
@@ -60,6 +60,8 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe
 	union perf_event *event;
 	u64 test_tsc, comm1_tsc, comm2_tsc;
 	u64 test_time, comm1_time = 0, comm2_time = 0;
+	struct perf_mmap *md;
+	u64 end, start;

 	threads = thread_map__new(-1, getpid(), UINT_MAX);
 	CHECK_NOT_NULL__(threads);
@@ -109,7 +111,11 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe
 	perf_evlist__disable(evlist);

 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+		md = &evlist->mmap[i];
+		if (perf_mmap__read_init(md, false, &start, &end) < 0)
+			continue;
+
+		while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
 			struct perf_sample sample;

 			if (event->header.type != PERF_RECORD_COMM ||
@@ -128,8 +134,9 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe
 				comm2_time = sample.time;
 			}
 next_event:
-			perf_evlist__mmap_consume(evlist, i);
+			perf_mmap__consume(md, false);
 		}
+		perf_mmap__read_done(md);
 	}

 	if (!comm1_time || !comm2_time)
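
The hunks above are the first instance of a conversion that repeats in builtin-kvm.c, builtin-trace.c and the tests below: instead of perf_evlist__mmap_read()/perf_evlist__mmap_consume(), each per-CPU ring buffer is now bracketed by perf_mmap__read_init()/perf_mmap__read_done(), with perf_mmap__read_event()/perf_mmap__consume() inside the loop. A minimal sketch of that pattern as used in these diffs; the function name drain_all_mmaps() and the omitted event handling are my own, and it assumes the tools/perf in-tree headers rather than being a standalone program:

/* Sketch only: assumes the tools/perf tree (util/evlist.h, util/mmap.h). */
#include "util/evlist.h"
#include "util/mmap.h"

static void drain_all_mmaps(struct perf_evlist *evlist)
{
	union perf_event *event;
	struct perf_mmap *md;
	u64 end, start;
	int i;

	for (i = 0; i < evlist->nr_mmaps; i++) {
		md = &evlist->mmap[i];
		/* Check for data and set up the read window; skip empty maps. */
		if (perf_mmap__read_init(md, false, &start, &end) < 0)
			continue;

		while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
			/* ... handle the event here ... */
			perf_mmap__consume(md, false);	/* let the kernel reuse this record's space */
		}
		/* Mark this batch of reads as finished for this ring buffer. */
		perf_mmap__read_done(md);
	}
}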

tools/perf/builtin-kvm.c

Lines changed: 13 additions & 4 deletions
@@ -743,16 +743,24 @@ static bool verify_vcpu(int vcpu)
 static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
				   u64 *mmap_time)
 {
+	struct perf_evlist *evlist = kvm->evlist;
 	union perf_event *event;
+	struct perf_mmap *md;
+	u64 end, start;
 	u64 timestamp;
 	s64 n = 0;
 	int err;

 	*mmap_time = ULLONG_MAX;
-	while ((event = perf_evlist__mmap_read(kvm->evlist, idx)) != NULL) {
-		err = perf_evlist__parse_sample_timestamp(kvm->evlist, event, &timestamp);
+	md = &evlist->mmap[idx];
+	err = perf_mmap__read_init(md, false, &start, &end);
+	if (err < 0)
+		return (err == -EAGAIN) ? 0 : -1;
+
+	while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
+		err = perf_evlist__parse_sample_timestamp(evlist, event, &timestamp);
 		if (err) {
-			perf_evlist__mmap_consume(kvm->evlist, idx);
+			perf_mmap__consume(md, false);
 			pr_err("Failed to parse sample\n");
 			return -1;
 		}
@@ -762,7 +770,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
 		 * FIXME: Here we can't consume the event, as perf_session__queue_event will
 		 * point to it, and it'll get possibly overwritten by the kernel.
 		 */
-		perf_evlist__mmap_consume(kvm->evlist, idx);
+		perf_mmap__consume(md, false);

 		if (err) {
 			pr_err("Failed to enqueue sample: %d\n", err);
@@ -779,6 +787,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
			break;
 	}

+	perf_mmap__read_done(md);
 	return n;
 }


tools/perf/builtin-record.c

Lines changed: 17 additions & 1 deletion
@@ -45,6 +45,7 @@

 #include <errno.h>
 #include <inttypes.h>
+#include <locale.h>
 #include <poll.h>
 #include <unistd.h>
 #include <sched.h>
@@ -881,6 +882,15 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		}
 	}

+	/*
+	 * If we have just single event and are sending data
+	 * through pipe, we need to force the ids allocation,
+	 * because we synthesize event name through the pipe
+	 * and need the id for that.
+	 */
+	if (data->is_pipe && rec->evlist->nr_entries == 1)
+		rec->opts.sample_id = true;
+
 	if (record__open(rec) != 0) {
 		err = -1;
 		goto out_child;
@@ -1542,7 +1552,11 @@ static struct option __record_options[] = {
 	OPT_BOOLEAN(0, "tail-synthesize", &record.opts.tail_synthesize,
		    "synthesize non-sample events at the end of output"),
 	OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
-	OPT_UINTEGER('F', "freq", &record.opts.user_freq, "profile at this frequency"),
+	OPT_BOOLEAN(0, "strict-freq", &record.opts.strict_freq,
+		    "Fail if the specified frequency can't be used"),
+	OPT_CALLBACK('F', "freq", &record.opts, "freq or 'max'",
+		     "profile at this frequency",
+		     record__parse_freq),
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
		     "number of mmap data pages and AUX area tracing mmap pages",
		     record__parse_mmap_pages),
@@ -1651,6 +1665,8 @@ int cmd_record(int argc, const char **argv)
 	struct record *rec = &record;
 	char errbuf[BUFSIZ];

+	setlocale(LC_ALL, "");
+
 #ifndef HAVE_LIBBPF_SUPPORT
 # define set_nobuild(s, l, c) set_option_nobuild(record_options, s, l, "NO_LIBBPF=1", c)
 	set_nobuild('\0', "clang-path", true);

tools/perf/builtin-stat.c

Lines changed: 18 additions & 7 deletions
@@ -508,14 +508,13 @@ static int perf_stat_synthesize_config(bool is_pipe)

 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))

-static int __store_counter_ids(struct perf_evsel *counter,
-			       struct cpu_map *cpus,
-			       struct thread_map *threads)
+static int __store_counter_ids(struct perf_evsel *counter)
 {
 	int cpu, thread;

-	for (cpu = 0; cpu < cpus->nr; cpu++) {
-		for (thread = 0; thread < threads->nr; thread++) {
+	for (cpu = 0; cpu < xyarray__max_x(counter->fd); cpu++) {
+		for (thread = 0; thread < xyarray__max_y(counter->fd);
+		     thread++) {
 			int fd = FD(counter, cpu, thread);

 			if (perf_evlist__id_add_fd(evsel_list, counter,
@@ -535,7 +534,7 @@ static int store_counter_ids(struct perf_evsel *counter)
 	if (perf_evsel__alloc_id(counter, cpus->nr, threads->nr))
		return -ENOMEM;

-	return __store_counter_ids(counter, cpus, threads);
+	return __store_counter_ids(counter);
 }

 static bool perf_evsel__should_store_id(struct perf_evsel *counter)
@@ -638,7 +637,19 @@ static int __run_perf_stat(int argc, const char **argv)
				if (verbose > 0)
					ui__warning("%s\n", msg);
				goto try_again;
-			}
+			} else if (target__has_per_thread(&target) &&
+				   evsel_list->threads &&
+				   evsel_list->threads->err_thread != -1) {
+				/*
+				 * For global --per-thread case, skip current
+				 * error thread.
+				 */
+				if (!thread_map__remove(evsel_list->threads,
+							evsel_list->threads->err_thread)) {
+					evsel_list->threads->err_thread = -1;
+					goto try_again;
+				}
+			}

			perf_evsel__open_strerror(counter, &target,
						  errno, msg, sizeof(msg));
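
The __store_counter_ids() change works because the counter's fd table (an xyarray) already records its own cpu x thread dimensions, so the cpu_map/thread_map parameters become redundant. A simplified, self-contained stand-in (not perf's actual xyarray implementation) showing the same dimension-driven iteration:

#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for tools/perf's xyarray: a 2-D table that carries
 * its own dimensions, so callers iterate without extra map arguments. */
struct xy {
	int max_x, max_y;
	int data[];		/* max_x * max_y entries */
};

static struct xy *xy__new(int max_x, int max_y)
{
	struct xy *xy = calloc(1, sizeof(*xy) + sizeof(int) * max_x * max_y);

	if (xy) {
		xy->max_x = max_x;
		xy->max_y = max_y;
	}
	return xy;
}

static int *xy__entry(struct xy *xy, int x, int y)
{
	return &xy->data[x * xy->max_y + y];
}

int main(void)
{
	struct xy *fds = xy__new(2, 3);		/* e.g. 2 CPUs x 3 threads */
	int x, y;

	if (!fds)
		return 1;

	/* The dimensions come from the array itself, not from separate maps. */
	for (x = 0; x < fds->max_x; x++)
		for (y = 0; y < fds->max_y; y++)
			*xy__entry(fds, x, y) = x * 10 + y;

	printf("entry(1, 2) = %d\n", *xy__entry(fds, 1, 2));
	free(fds);
	return 0;
}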

tools/perf/builtin-top.c

Lines changed: 4 additions & 2 deletions
@@ -991,7 +991,7 @@ static int perf_top_overwrite_fallback(struct perf_top *top,
 	evlist__for_each_entry(evlist, counter)
		counter->attr.write_backward = false;
 	opts->overwrite = false;
-	ui__warning("fall back to non-overwrite mode\n");
+	pr_debug2("fall back to non-overwrite mode\n");
 	return 1;
 }

@@ -1307,7 +1307,9 @@ int cmd_top(int argc, const char **argv)
 	OPT_STRING(0, "sym-annotate", &top.sym_filter, "symbol name",
		   "symbol to annotate"),
 	OPT_BOOLEAN('z', "zero", &top.zero, "zero history across updates"),
-	OPT_UINTEGER('F', "freq", &opts->user_freq, "profile at this frequency"),
+	OPT_CALLBACK('F', "freq", &top.record_opts, "freq or 'max'",
+		     "profile at this frequency",
+		     record__parse_freq),
 	OPT_INTEGER('E', "entries", &top.print_entries,
		    "display this many functions"),
 	OPT_BOOLEAN('U', "hide_user_symbols", &top.hide_user_symbols,

tools/perf/builtin-trace.c

Lines changed: 9 additions & 2 deletions
@@ -2472,8 +2472,14 @@ static int trace__run(struct trace *trace, int argc, const char **argv)

 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		union perf_event *event;
+		struct perf_mmap *md;
+		u64 end, start;

-		while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+		md = &evlist->mmap[i];
+		if (perf_mmap__read_init(md, false, &start, &end) < 0)
+			continue;
+
+		while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
 			struct perf_sample sample;

 			++trace->nr_events;
@@ -2486,7 +2492,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)

 			trace__handle_event(trace, event, &sample);
 next_event:
-			perf_evlist__mmap_consume(evlist, i);
+			perf_mmap__consume(md, false);

 			if (interrupted)
				goto out_disable;
@@ -2496,6 +2502,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
				draining = true;
			}
 		}
+		perf_mmap__read_done(md);
 	}

 	if (trace->nr_events == before) {

tools/perf/perf.h

Lines changed: 4 additions & 0 deletions
@@ -61,6 +61,8 @@ struct record_opts {
 	bool tail_synthesize;
 	bool overwrite;
 	bool ignore_missing_thread;
+	bool strict_freq;
+	bool sample_id;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int auxtrace_mmap_pages;
@@ -82,4 +84,6 @@ struct record_opts {
 struct option;
 extern const char * const *record_usage;
 extern struct option *record_options;
+
+int record__parse_freq(const struct option *opt, const char *str, int unset);
 #endif

tools/perf/tests/bpf.c

Lines changed: 8 additions & 1 deletion
@@ -176,13 +176,20 @@ static int do_test(struct bpf_object *obj, int (*func)(void),

 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		union perf_event *event;
+		struct perf_mmap *md;
+		u64 end, start;

-		while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+		md = &evlist->mmap[i];
+		if (perf_mmap__read_init(md, false, &start, &end) < 0)
+			continue;
+
+		while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
 			const u32 type = event->header.type;

 			if (type == PERF_RECORD_SAMPLE)
				count ++;
 		}
+		perf_mmap__read_done(md);
 	}

 	if (count != expect) {

tools/perf/tests/code-reading.c

Lines changed: 9 additions & 2 deletions
@@ -409,15 +409,22 @@ static int process_events(struct machine *machine, struct perf_evlist *evlist,
			  struct state *state)
 {
 	union perf_event *event;
+	struct perf_mmap *md;
+	u64 end, start;
 	int i, ret;

 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+		md = &evlist->mmap[i];
+		if (perf_mmap__read_init(md, false, &start, &end) < 0)
+			continue;
+
+		while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
 			ret = process_event(machine, evlist, event, state);
-			perf_evlist__mmap_consume(evlist, i);
+			perf_mmap__consume(md, false);
 			if (ret < 0)
				return ret;
 		}
+		perf_mmap__read_done(md);
 	}
 	return 0;
 }

tools/perf/tests/keep-tracking.c

Lines changed: 8 additions & 2 deletions
@@ -27,18 +27,24 @@
 static int find_comm(struct perf_evlist *evlist, const char *comm)
 {
 	union perf_event *event;
+	struct perf_mmap *md;
+	u64 end, start;
 	int i, found;

 	found = 0;
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		while ((event = perf_evlist__mmap_read(evlist, i)) != NULL) {
+		md = &evlist->mmap[i];
+		if (perf_mmap__read_init(md, false, &start, &end) < 0)
+			continue;
+		while ((event = perf_mmap__read_event(md, false, &start, end)) != NULL) {
 			if (event->header.type == PERF_RECORD_COMM &&
			    (pid_t)event->comm.pid == getpid() &&
			    (pid_t)event->comm.tid == getpid() &&
			    strcmp(event->comm.comm, comm) == 0)
				found += 1;
-			perf_evlist__mmap_consume(evlist, i);
+			perf_mmap__consume(md, false);
 		}
+		perf_mmap__read_done(md);
 	}
 	return found;
 }
