[BOLT] Make memory profile parsing optional #129585

Merged

aaupov
Contributor

@aaupov aaupov commented Mar 3, 2025

Introduce the `parse-mem-profile` option to limit the overhead of processing
tracing data (Intel PT or ARM ETM). By default, it is enabled for
perf data (existing behavior), unless `itrace` is passed to parse
tracing data, where memory profile parsing is extremely expensive.

Created using spr 1.3.4
@aaupov aaupov marked this pull request as ready for review March 3, 2025 21:43
@llvmbot llvmbot added the BOLT label Mar 3, 2025
@llvmbot
Member

llvmbot commented Mar 3, 2025

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-lldb

@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)

Changes

Introduce the `parse-mem-profile` option (on by default) to control
whether perf2bolt attempts to extract a memory profile from perf data.

The use case is tracing data (ARM SPE or Intel PT), where synthesizing
a memory profile is extremely expensive. Memory profile parsing is
switched off by default when working with trace data (detected via the
`itrace` flag), unless the option is passed explicitly.
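
For illustration, the default resolution described above can be modeled as a small standalone function. This is a sketch, not BOLT code: `AggregatorOptions`, its fields, and `shouldParseMemProfile` are hypothetical names, and the occurrence counter stands in for the `getNumOccurrences()` check used in the patch.

```cpp
#include <string>

// Hypothetical stand-in for the relevant command-line state (not BOLT's types).
struct AggregatorOptions {
  bool ParseMemProfile = true;             // mirrors cl::init(true)
  unsigned ParseMemProfileOccurrences = 0; // times the flag appeared on the command line
  std::string ITrace;                      // empty when --itrace was not passed
};

// Decide whether memory events should be parsed.
bool shouldParseMemProfile(const AggregatorOptions &Opts) {
  // An explicit user setting always wins, with or without --itrace.
  if (Opts.ParseMemProfileOccurrences > 0)
    return Opts.ParseMemProfile;
  // Otherwise keep the existing default for perf data and
  // switch parsing off when aggregating trace data.
  return Opts.ITrace.empty();
}
```

Under this model, a trace user who still wants a memory profile opts back in by passing the option explicitly.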


Full diff: https://github.com/llvm/llvm-project/pull/129585.diff

1 Files Affected:

  • (modified) bolt/lib/Profile/DataAggregator.cpp (+29-20)
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index d20626bd5062f..909ec32477468 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -61,6 +61,12 @@ FilterMemProfile("filter-mem-profile",
   cl::init(true),
   cl::cat(AggregatorCategory));
 
+static cl::opt<bool> ParseMemProfile(
+    "parse-mem-profile",
+    cl::desc("enable memory profile parsing if it's present in the input data, "
+             "on by default unless `--itrace` is set."),
+    cl::init(true), cl::cat(AggregatorCategory));
+
 static cl::opt<unsigned long long>
 FilterPID("pid",
   cl::desc("only use samples from process with specified PID"),
@@ -177,6 +183,10 @@ void DataAggregator::start() {
                       "script -F pid,event,ip",
                       /*Wait = */false);
   } else if (!opts::ITraceAggregation.empty()) {
+    // Disable parsing memory profile from trace data, unless requested by user.
+    if (!opts::ParseMemProfile.getNumOccurrences())
+      opts::ParseMemProfile = false;
+
     std::string ItracePerfScriptArgs = llvm::formatv(
         "script -F pid,brstack --itrace={0}", opts::ITraceAggregation);
     launchPerfProcess("branch events with itrace", MainEventsPPI,
@@ -187,12 +197,9 @@ void DataAggregator::start() {
                       /*Wait = */ false);
   }
 
-  // Note: we launch script for mem events regardless of the option, as the
-  //       command fails fairly fast if mem events were not collected.
-  launchPerfProcess("mem events",
-                    MemEventsPPI,
-                    "script -F pid,event,addr,ip",
-                    /*Wait = */false);
+  if (opts::ParseMemProfile)
+    launchPerfProcess("mem events", MemEventsPPI, "script -F pid,event,addr,ip",
+                      /*Wait = */ false);
 
   launchPerfProcess("process events", MMapEventsPPI,
                     "script --show-mmap-events --no-itrace",
@@ -213,7 +220,8 @@ void DataAggregator::abort() {
   sys::Wait(TaskEventsPPI.PI, 1, &Error);
   sys::Wait(MMapEventsPPI.PI, 1, &Error);
   sys::Wait(MainEventsPPI.PI, 1, &Error);
-  sys::Wait(MemEventsPPI.PI, 1, &Error);
+  if (opts::ParseMemProfile)
+    sys::Wait(MemEventsPPI.PI, 1, &Error);
 
   deleteTempFiles();
 
@@ -464,13 +472,6 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
     exit(1);
   };
 
-  auto MemEventsErrorCallback = [&](int ReturnCode, StringRef ErrBuf) {
-    Regex NoData("Samples for '.*' event do not have ADDR attribute set. "
-                 "Cannot print 'addr' field.");
-    if (!NoData.match(ErrBuf))
-      ErrorCallback(ReturnCode, ErrBuf);
-  };
-
   if (BC.IsLinuxKernel) {
     // Current MMap parsing logic does not work with linux kernel.
     // MMap entries for linux kernel uses PERF_RECORD_MMAP
@@ -511,13 +512,21 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
       (opts::BasicAggregation && parseBasicEvents()))
     errs() << "PERF2BOLT: failed to parse samples\n";
 
-  // Special handling for memory events
-  if (prepareToParse("mem events", MemEventsPPI, MemEventsErrorCallback))
-    return Error::success();
+  if (opts::ParseMemProfile) {
+    auto MemEventsErrorCallback = [&](int ReturnCode, StringRef ErrBuf) {
+      Regex NoData("Samples for '.*' event do not have ADDR attribute set. "
+                   "Cannot print 'addr' field.");
+      if (!NoData.match(ErrBuf))
+        ErrorCallback(ReturnCode, ErrBuf);
+    };
 
-  if (const std::error_code EC = parseMemEvents())
-    errs() << "PERF2BOLT: failed to parse memory events: " << EC.message()
-           << '\n';
+    if (prepareToParse("mem events", MemEventsPPI, MemEventsErrorCallback))
+      return Error::success();
+
+    if (const std::error_code EC = parseMemEvents())
+      errs() << "PERF2BOLT: failed to parse memory events: " << EC.message()
+             << '\n';
+  }
 
   deleteTempFiles();
 

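The `MemEventsErrorCallback` moved by this patch treats one specific perf diagnostic as benign, since a recording without memory samples is not an error. A minimal standalone sketch of that filter follows. It assumes `std::regex` in place of `llvm::Regex`, and the helper name `isMissingAddrDiagnostic` is hypothetical. Unlike the patch, the sketch escapes the literal dots in the pattern.

```cpp
#include <regex>
#include <string>

// Returns true when the stderr text contains perf's "no ADDR attribute"
// diagnostic, which only means that no memory events were collected.
bool isMissingAddrDiagnostic(const std::string &ErrBuf) {
  static const std::regex NoData(
      "Samples for '.*' event do not have ADDR attribute set\\. "
      "Cannot print 'addr' field\\.");
  // regex_search matches anywhere in the buffer, so surrounding
  // output from perf does not hide the diagnostic.
  return std::regex_search(ErrBuf, NoData);
}
```

A caller would forward the error to its generic handler only when this helper returns false, as the callback in the diff does.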
aaupov added 2 commits June 12, 2025 14:13
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@llvmbot llvmbot added the clang, lldb, and clang:driver labels Jun 12, 2025
@aaupov aaupov merged commit 902a991 into main Jun 12, 2025
7 checks passed
@aaupov aaupov deleted the users/aaupov/spr/bolt-make-memory-profile-parsing-optional branch June 12, 2025 21:46
tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
Introduce the `parse-mem-profile` option to limit the overhead of processing
tracing data (Intel PT or ARM ETM). By default, it is enabled for
perf data (existing behavior), unless `itrace` is passed to parse
tracing data, where memory profile parsing is extremely expensive. In that
case, the option must be set explicitly to parse the memory profile.
akuhlens pushed a commit to akuhlens/llvm-project that referenced this pull request Jun 24, 2025