-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[llvm-profgen] Support creating profiles of arbitrary events #99026
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change introduces two options which may be used to create profiles of arbitrary PMU events. 1. `--leading-ip-only` provides a simple sample-IP-based profile mode. This is not useful for building a profile of execution frequency, but it is useful for building new types of profiles. For example, to build a profile of unpredictable branches: ```sh perf record -b -e branch-misses:upp -o perf.data ... llvm-profgen --perfdata perf.data --leading-ip-only ... ``` 2. `--perf-event=event` enables the creation of a profile concerned with a specific event or set of events. The names given should match the "event" field as emitted by perf-script(1). This option has two spellings: `--perf-event` and `--perf-events`. The plural spelling accepts a comma-separated list. The singular spelling appends a single event name to the set of events which will be used. This is meant to accommodate event names containing commas. Combined, these options allow generating multiple kinds of profiles from a single `perf record` collection. For example, to generate both execution frequency and branch mispredict profiles: ```sh perf record -c 1000003 -b -e br_inst_retired.near_taken:upp,br_misp_retired.all_branches:upp ... llvm-profgen --output hotness.prof --perf-event=br_inst_retired.near_taken:upp ... llvm-profgen --leading-ip-only --output unpredictable.prof --perf-event=br_misp_retired.all_branches:upp ... ```
✅ With the latest revision this PR passed the C/C++ code formatter. |
@llvm/pr-subscribers-pgo Author: Tim Creech (tcreech-intel) ChangesThis change introduces two options which may be used to create profiles of arbitrary PMU events.
Combined, these options allow generating multiple kinds of profiles from a single
Patch is 59.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/99026.diff 9 Files Affected:
diff --git a/llvm/test/tools/llvm-profgen/Inputs/cmov_3.perfbin b/llvm/test/tools/llvm-profgen/Inputs/cmov_3.perfbin
new file mode 100755
index 0000000000000..7a1543041f805
Binary files /dev/null and b/llvm/test/tools/llvm-profgen/Inputs/cmov_3.perfbin differ
diff --git a/llvm/test/tools/llvm-profgen/Inputs/cmov_3.perfscript b/llvm/test/tools/llvm-profgen/Inputs/cmov_3.perfscript
new file mode 100644
index 0000000000000..3d29d444d56bb
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/Inputs/cmov_3.perfscript
@@ -0,0 +1,39 @@
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/3//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/24//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/29//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//-
+br_misp_retired.all_branches:upp: 4012fa 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/24//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/4//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/2//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/3//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//-
+br_misp_retired.all_branches:upp: 4012fa 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/24//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/24//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/24//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/26//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/6//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/21//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/14//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/2//-
+br_misp_retired.all_branches:upp: 4012fa 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/26//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/5//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/24//- 0x401310/0x4012f0/P/-/-/2//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//-
+br_misp_retired.all_branches:upp: 4012fa 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/2//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/2//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/27//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/2//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/29//- 0x401310/0x4012f0/P/-/-/24//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/19//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/21//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/4//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/28//- 0x401310/0x4012f0/P/-/-/3//- 0x401310/0x4012f0/P/-/-/26//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/3//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/4//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/26//- 0x401310/0x4012f0/P/-/-/25//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/2//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/27//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/1//-
+ br_inst_retired.near_taken:upp: 401310 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/5//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/3//- 0x401310/0x4012f0/P/-/-/22//- 0x4012fa/0x4012ff/M/-/-/1//- 0x401310/0x4012f0/P/-/-/29//- 0x401310/0x4012f0/P/-/-/23//- 0x4012fa/0x4012ff/M/-/-/2//- 0x401310/0x4012f0/P/-/-/1//- 0x4012fa/0x4012ff/P/-/-/1//- 0x401310/0x4012f0/P/-/-/26/...
[truncated]
|
Hi @tcreech-intel you can mention that this enhancement is for HWPGO (https://llvm.org/devmtg/2024-04/slides/TechnicalTalks/Xiao-EnablingHW-BasedPGO.pdf) to support a wide range of hardware events in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just some minor comments.
Thanks, @williamweixiao! I have fixed the issues you found. |
@williamweixiao @tcreech-intel for this kind of changes, can you give other reviews and code owners some time to respond before merging? I think a week is what's recommended for non-trivial changes. Also as I mentioned, for HWPGO, could you send an RFC to outline: 1) high level design (also including how this interacts with other PGO, and usage instructions), 2) performance results, 3) future plan? HWPGO is a good exploration, but when it comes to upstreaming as part of LLVM, generally the community would like to see results (performance) first. |
@@ -41,6 +41,17 @@ static cl::opt<bool> | |||
"and produce context-insensitive profile.")); | |||
cl::opt<bool> ShowDetailedWarning("show-detailed-warning", | |||
cl::desc("Show detailed warning message.")); | |||
cl::opt<bool> | |||
LeadingIPOnly("leading-ip-only", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is a bit confusing. I think what you meant is to ignore LBRs and only consuming leading IPs?
In that case, we should name it something like ignore-lbr-samples
, to be consistent with existing ignore-stack-samples
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, does use of profiled-event
require leading-ip-only
? If leading-ip-only
is not specified, what is the expected behavior?
@@ -575,14 +591,54 @@ bool PerfScriptReader::extractLBRStack(TraceStream &TraceIt, | |||
|
|||
// Skip the leading instruction pointer. | |||
size_t Index = 0; | |||
|
|||
StringRef EventName; | |||
// Skip a perf event name. This may or may not exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"This may or may not exist." <-- I can't parse this.
Can you add comment with expected input format?
WithColor::warning() << "No --perf-event filter was specified, but an " | ||
"\"event\" field was found in line " | ||
<< TraceIt.getLineNumber() << ": " | ||
<< TraceIt.getCurrentLine() << "\n"; | ||
} else if (std::find(PerfEventFilter.begin(), PerfEventFilter.end(), | ||
EventName) == PerfEventFilter.end()) { | ||
TraceIt.advance(); | ||
return false; | ||
} | ||
|
||
} else if (!PerfEventFilter.empty()) { | ||
WithColor::warning() << "A --perf-event filter was specified, but no " | ||
"\"event\" field found in line " | ||
<< TraceIt.getLineNumber() << ": " | ||
<< TraceIt.getCurrentLine() << "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are emitting warning on each sample, and this is going to make the warnings super noisy.
Hi @WenleiHe; my apologies. We will certainly allow more time for feedback moving forward. Thanks for your guidance and comments here. I will prepare another PR to address your review comments. We plan to prepare an RFC in the near future as requested. In the very short term our intent was to provide the branch mispredict parts of HWPGO with LLVM19 as a follow-up to our EuroLLVM presentation on HWPGO. However, that is not to say that we mean to rush the review process. The current usage is very much opt-in and relatively ad-hoc: one needs to run llvm-profgen a second time to produce a mispredict profile, then name this secondary profile when compiling to annotate with !mispredict metadata. (Similar to https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86InsertPrefetch.cpp .) The RFC will seek to improve this usage model so that it could potentially scale to many profile types. |
@tcreech-intel @williamweixiao I was there at the EuroLLVM talk. While I appreciate the work you guys are doing, I think what's missing is practical result beyond hand-crafted microbenchmark. It's one thing to explore ideas with prototype leveraging LLVM project, and then a different discussion when it comes to making prototypes into LLVM project itself, which means the entire community will be committed to carrying and maintaining such features/optimizations going forward. For that to happen, generally we need 1) practical benefit demonstrated beyond novelty, beyond toy-program showcasing the idea, 2) we need to have a coherent design including its interaction with other existing features. We we have a ton of prototypes that we didn't upstream too, including PEBS value profiling work.
It's pointless to send RFC after the fact. The purpose of RFC is to seek feedback, build consensus on worthiness of feature/optimization, and also on design/implementation, all before anything is committed upstream. This is a convention of how everyone works in LLVM community. I'd like to suggest pausing committing any of these changes related to HWPGO before RFC is out and consensus is built. |
StringRef EventName; | ||
// Skip a perf event name. This may or may not exist. | ||
if (Records.size() > Index && Records[Index].ends_with(":")) { | ||
EventName = Records[Index].ltrim().rtrim(':'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the format in cmov_3.perfscript
, event name should be extracted from extractCallstack
, rather than extractLBRStack
?
@@ -1062,6 +1132,18 @@ bool PerfScriptReader::isLBRSample(StringRef Line) { | |||
Line.trim().split(Records, " ", 2, false); | |||
if (Records.size() < 2) | |||
return false; | |||
// Check if there is an event name before the leading IP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the header comment for this function to include a representation of LBR sample with event name
bool SampleIPIsInternal = Binary->addressIsCode(SampleIP); | ||
if (SampleIPIsInternal) { | ||
// Form a half LBR entry where the sample IP is the destination. | ||
LBRStack.emplace_back(LBREntry(SampleIP, SampleIP)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't really fit in LBRStack, instead, it fits CallStack better. All of the special case from warnInvalidRange
can be avoided if we handle these events as part of extractCallstack
.
// When computing an IP-based profile we take the SUM of counts at the | ||
// location instead of applying duplication factors and taking the MAX. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of using IP sampling or LBR sampling, sample profile loader always take the max for profile annotation, so this is not correct in the general sense. I'm guessing what you meant is when consuming mispredict profile, sum is used at profile use time. In that case, this needs to be narrowed to only mispredict or certain profile type, not general IP profile.
"perf-event", | ||
cl::desc("Ignore samples not matching the given event names")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: name it profiled-event
with description Perf event to generate profile for, e.g. br_misp_retired.all_branches. Program execution count profile is generated when unspecified.
@@ -41,6 +41,17 @@ static cl::opt<bool> | |||
"and produce context-insensitive profile.")); | |||
cl::opt<bool> ShowDetailedWarning("show-detailed-warning", | |||
cl::desc("Show detailed warning message.")); | |||
cl::opt<bool> | |||
LeadingIPOnly("leading-ip-only", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, does use of profiled-event
require leading-ip-only
? If leading-ip-only
is not specified, what is the expected behavior?
Thanks. I went over the patch and I think we'd have a cleaner and more compatible implementation if this is all handled as part of callstack sample rather than lbr sample. If we do that, it might actually be easier to unland this first and review/reland a reworked version. |
…lvm#99026)" This reverts commit 0caf0c9.
Summary: This change introduces two options which may be used to create profiles of arbitrary PMU events. 1. `--leading-ip-only` provides a simple sample-IP-based profile mode. This is not useful for building a profile of execution frequency, but it is useful for building new types of profiles. For example, to build a profile of unpredictable branches: perf record -b -e branch-misses:upp -o perf.data ... llvm-profgen --perfdata perf.data --leading-ip-only ... 2. `--perf-event=event` enables the creation of a profile concerned with a specific event or set of events. The names given should match the "event" field as emitted by perf-script(1). This option has two spellings: `--perf-event` and `--perf-events`. The plural spelling accepts a comma-separated list. The singular spelling appends a single event name to the set of events which will be used. This is meant to accommodate event names containing commas. Combined, these options allow generating multiple kinds of profiles from a single `perf record` collection. For example, to generate both execution frequency and branch mispredict profiles: perf record -c 1000003 -b -e br_inst_retired.near_taken:upp,br_misp_retired.all_branches:upp ... llvm-profgen --output execution.prof --perf-event=br_inst_retired.near_taken:upp ... llvm-profgen --leading-ip-only --output unpredictable.prof --perf-event=br_misp_retired.all_branches:upp ... These additions are in support of more general HWPGO[^1], allowing feedback from a wider range of hardware events. [^1]: https://llvm.org/devmtg/2024-04/slides/TechnicalTalks/Xiao-EnablingHW-BasedPGO.pdf --------- Co-authored-by: Tim Creech <[email protected]> Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251204
This change introduces two options which may be used to create profiles of arbitrary PMU events.
--leading-ip-only
provides a simple sample-IP-based profile mode. This is not useful for building a profile of execution frequency, but it is useful for building new types of profiles.For example, to build a profile of unpredictable branches:
--perf-event=event
enables the creation of a profile concerned with a specific event or set of events. The names given should match the "event" field as emitted by perf-script(1).This option has two spellings:
--perf-event
and--perf-events
. The plural spelling accepts a comma-separated list. The singular spelling appends a single event name to the set of events which will be used. This is meant to accommodate event names containing commas.Combined, these options allow generating multiple kinds of profiles from a single
perf record
collection. For example, to generate both execution frequency and branch mispredict profiles:These additions are in support of more general HWPGO1, allowing feedback from a wider range of hardware events.
Footnotes
https://llvm.org/devmtg/2024-04/slides/TechnicalTalks/Xiao-EnablingHW-BasedPGO.pdf ↩