Skip to content

[BOLT] DataAggregator support for binaries with multiple text segments #92815

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 25, 2024

Conversation

paschalis-mpeis
Copy link
Member

@paschalis-mpeis paschalis-mpeis commented May 20, 2024

When a binary has multiple text segments, the Size is computed as the
difference of the last address of these segments from the BaseAddress.
The base addresses of all text segments must be the same.

Background:
Larger binaries get two text segments mapped when loaded in memory.
BOLT processes only the first, which is not having a correct BaseAddress,
causing a wrong computation of a BinaryMMapInfo's size.

Consequently, BOLT wrongly thinks that many of the samples fall outside
the binary and ignores them. As a result, when used in heatmaps the
output excludes all those entries and the section hotness statistics are
wrong.

This bug is present in both the AArch64 and x86 backends.


This patch introduces the flag 'perf-script-events' that allows passing
perf events without BOLT having to parse them using 'perf script'.
The flag is used to pass a mock perf profile that has two memory
mappings for a mock binary that has two text segments. The size of the
mapping is updated as parseMMapEvents now processes all text segments.


Example used in unit tests:
From /proc/<BINARY PID>/maps, we have 2 text mappings, say A and B.

abc0000000-abc1000000 r-xp 011c0000 103:01 1573523 BINARY
abc2000000-abca000000 r-xp 031d0000 103:01 1573523 BINARY

Size of text mappings:

Mapping Size
A ~15MB
B ~135MB

Example on a real program:

2f7200000-2fabca000 r--p 00000000        bolted-binary
2fabd9000-2fe47c000 r-xp 039c9000        bolted-binary <- 1st txt segment
2fe48b000-2fe61d000 r--p 0727b000        bolted-binary
2fe62c000-2fe660000 rw-p 0740c000        bolted-binary
2fe660000-2fea4c000 rw-p 00000000
2fec00000-303dad000 r-xp 07a00000        bolted-binary <- 2nd (appears only on the bolted binary)

@llvmbot
Copy link
Member

llvmbot commented May 20, 2024

@llvm/pr-subscribers-bolt

Author: Paschalis Mpeis (paschalis-mpeis)

Changes

DRAFT PR: This is WIP.

Needs:

  • Add context describing the issue
  • Add testing

In short, it improves heatmaps by:

  1. Fixing a bug where the computation of a text section's size was wrong. As a result, running the heatmap tool on some bigger bolted binaries, the tool ignored most of the hits in hot/cold text segments.

  2. Adding to the legend the mapping between the bucket letters to the actual text sections.


Full diff: https://github.com/llvm/llvm-project/pull/92815.diff

2 Files Affected:

  • (modified) bolt/lib/Profile/DataAggregator.cpp (+14-3)
  • (modified) bolt/lib/Profile/Heatmap.cpp (+11)
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index e06debcee741e..8d5b5531654cb 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -2009,8 +2009,9 @@ std::error_code DataAggregator::parseMMapEvents() {
       return MI.second.PID == FileMMapInfo.second.PID;
     });
 
-    if (PIDExists)
-      continue;
+    // let duplicates to the multimap.
+    // if (PIDExists)
+    //   continue;
 
     GlobalMMapInfo.insert(FileMMapInfo);
   }
@@ -2067,7 +2068,17 @@ std::error_code DataAggregator::parseMMapEvents() {
       }
     }
 
-    BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo));
+    // The mapping was already in place, but there are cases where the size
+    // is wrong. Fix it if needed.
+    if(!BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo)).second) {
+      auto EndAddress = MMapInfo.MMapAddress + MMapInfo.Size;
+      auto FixedSize = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress;
+
+      if (FixedSize != BinaryMMapInfo[MMapInfo.PID].Size) {
+          outs() << "MMap size fixed: " << Twine::utohexstr(FixedSize) << " \n";
+         BinaryMMapInfo[MMapInfo.PID].Size = FixedSize;
+      }
+    }
   }
 
   if (BinaryMMapInfo.empty()) {
diff --git a/bolt/lib/Profile/Heatmap.cpp b/bolt/lib/Profile/Heatmap.cpp
index 210a5cc98c104..233b070bde2da 100644
--- a/bolt/lib/Profile/Heatmap.cpp
+++ b/bolt/lib/Profile/Heatmap.cpp
@@ -164,6 +164,7 @@ void Heatmap::print(raw_ostream &OS) const {
 
   // Print map legend
   OS << "Legend:\n";
+  OS << "\nRegions:\n";
   uint64_t PrevValue = 0;
   for (unsigned I = 0; I < sizeof(Range) / sizeof(Range[0]); ++I) {
     const uint64_t Value = Range[I];
@@ -173,6 +174,16 @@ void Heatmap::print(raw_ostream &OS) const {
     PrevValue = Value;
   }
 
+  {
+    OS << "\nSections:\n";
+    int Idx = 0;
+    for (auto TxtSeg : TextSections) {
+      OS << static_cast<char>('A' + ((Idx++) % 26)) << ": " << TxtSeg.Name
+         << ": 0x" <<  Twine::utohexstr(TxtSeg.BeginAddress) << "-0x"
+         << Twine::utohexstr(TxtSeg.EndAddress) << "\n";
+    }
+  }
+
   // Pos - character position from right in hex form.
   auto printHeader = [&](unsigned Pos) {
     OS << "            ";

Copy link

github-actions bot commented May 20, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@paschalis-mpeis paschalis-mpeis force-pushed the users/paschalis-mpeis/bolt-heatmap-fix branch from c74aaf4 to ff020d9 Compare June 25, 2024 09:18
@paschalis-mpeis paschalis-mpeis changed the title [BOLT] Heatmap fix on large binaries and printing mappings [BOLT] Fix heatmaps on large BOLTE'd binaries. Jun 25, 2024
@paschalis-mpeis paschalis-mpeis changed the title [BOLT] Fix heatmaps on large BOLTE'd binaries. [BOLT] Fix heatmaps on large BOLTEd binaries. Jun 25, 2024
@paschalis-mpeis paschalis-mpeis marked this pull request as ready for review June 25, 2024 13:21
Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Paschalis, thanks for the PR. I think it applies to all binaries with multiple code segments and is a welcome fix. Can you think of the way to test it? I'm a bit confused regarding the comment about the incorrect BaseAddress, perhaps the test case will clarify it for me.

@@ -2009,9 +2009,6 @@ std::error_code DataAggregator::parseMMapEvents() {
return MI.second.PID == FileMMapInfo.second.PID;
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer need the code above if we are tracking all mappings.

@@ -2067,7 +2064,21 @@ std::error_code DataAggregator::parseMMapEvents() {
}
}

BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo));
// In some larger binaries, the loaded binary gets a a second text segment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially, instead of storing the mapping for the first segment, you suggest to store the base address and the last address calculated as BaseAddress + Size, right? We will need to update the description of MMapInfo structure accordingly.

@paschalis-mpeis paschalis-mpeis force-pushed the users/paschalis-mpeis/bolt-heatmap-fix branch from ff020d9 to d5a5aaa Compare August 28, 2024 09:52
@paschalis-mpeis
Copy link
Member Author

Hi @maksfb, thanks for your review.

I went with a unit test. Consider this as a starting point and feel free to suggest more changes.


In short:
I added a flag perf-script-events to bypass BOLT's 'perf script' processing,
and I use it to supply custom mmap event entries that demonstrate the problem.

I invoke preprocessProfile with the mock perf profile+binary and finally check the computed size. I needed to expose a const reference to BinaryMMapInfo for checking the size.

The unit test ignores some errors incurred by not supplying valid 'memory events' for irrelevant processing in preprocessProfile.

@paschalis-mpeis paschalis-mpeis force-pushed the users/paschalis-mpeis/bolt-heatmap-fix branch from 946e149 to b1765dc Compare August 29, 2024 10:11
@@ -464,6 +470,13 @@ void DataAggregator::filterBinaryMMapInfo() {

int DataAggregator::prepareToParse(StringRef Name, PerfProcessInfo &Process,
PerfProcessErrorCallbackTy Callback) {
if (!opts::ReadPerfEvents.empty()) {
dbgs() << "PERF2BOLT: using pre-processed perf events for '" << Name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dbgs() << "PERF2BOLT: using pre-processed perf events for '" << Name
outs() << "PERF2BOLT: using pre-processed perf events for '" << Name

Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test!

Do you think we need to check that BaseAddress is calculated consistently across multiple segment mappings?

Please update comments on struct MMapInfo to reflect the effect of multiple segments.

@paschalis-mpeis paschalis-mpeis force-pushed the users/paschalis-mpeis/bolt-heatmap-fix branch from b1765dc to 14cf49c Compare November 5, 2024 09:08
@paschalis-mpeis paschalis-mpeis changed the title [BOLT] Fix heatmaps on large BOLTEd binaries. [BOLT] DataAggregator supports binaries with multiple text segments Nov 5, 2024
@paschalis-mpeis paschalis-mpeis changed the title [BOLT] DataAggregator supports binaries with multiple text segments [BOLT] DataAggregator support for binaries with multiple text segments Nov 5, 2024
@paschalis-mpeis
Copy link
Member Author

Thanks Maks for your review.

That is a good idea. I've added an assertion to ensure BaseAddress consistency and a unit test for it.
I have also forced-push to rebase to latest main and changed the PR title.

@paschalis-mpeis
Copy link
Member Author

Just pointing out that some tests are failing which are not related to this change or to BOLT (it's MLIR on Windows).
But if Meta has some further tests to run, it might be a good idea (I've saw this mentioned in other PRs).

@paschalis-mpeis
Copy link
Member Author

Gentle ping. I think the PR addresses all comments. Do we need any more changes?

Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Paschalis, sorry for the delay. Looks good to me, but please address a few nits before committing.

assert(MMapInfo.BaseAddress == BinaryMMapInfo[MMapInfo.PID].BaseAddress &&
"Base address on multiple segment mappings should match");

// Update section size.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Update section size.
// Update mapping size.

Comment on lines 2093 to 2094
uint64_t EndAddress = MMapInfo.MMapAddress + MMapInfo.Size;
uint64_t Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
uint64_t EndAddress = MMapInfo.MMapAddress + MMapInfo.Size;
uint64_t Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress;
const uint64_t EndAddress = MMapInfo.MMapAddress + MMapInfo.Size;
const uint64_t Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress;

Comment on lines 98 to 102
cl::opt<std::string>
ReadPerfEvents("perf-script-events",
cl::desc("skip perf event collection by supplying a "
"perf-script output in a textual format"),
cl::cat(AggregatorCategory));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't expect the end user to see this option, let's make it cl::Hidden or cl::ReallyHidden. Alternatively, we can modify DataAggregator interface for testing since you don't invoke the option via CL.

@@ -0,0 +1,142 @@
//===- bolt/unittest/Core/MemoryMaps.cpp -------------------------------===//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: adjust the line to end at 80 chars.

paschalis-mpeis and others added 4 commits November 20, 2024 15:18
When a binary has multiple text segments, the Size is computed as the
difference of the last address of these segments from the BaseAddress.
The base addresses of all text segments must be the same.

Background:
Larger binaries get two text segments mapped when loaded in memory.
BOLT processes only the first, which is not having a correct BaseAddress,
causing a wrong computation of a BinaryMMapInfo's size.

Consequently, BOLT wrongly thinks that many of the samples fall outside
the binary and ignores them. As a result, when used in heatmaps the
output excludes all those entries and the section hotness statistics are
wrong.

This bug is present in both the AArch64 and x86 backends.

---

This patch introduces the flag 'perf-script-events' that allows passing
perf events without BOLT having to parse them using 'perf script'.
The flag is used to pass a mock perf profile that has two memory
mappings for a mock binary that has two text segments. The size of the
mapping is updated as `parseMMapEvents` now processes all text segments.

---

Example used in unit tests:
From `/proc/<BINARY PID>/maps`, we have 2 text mappings, say A and B.

```
abc0000000-abc1000000 r-xp 011c0000 103:01 1573523 BINARY
abc2000000-abca000000 r-xp 031d0000 103:01 1573523 BINARY
```

Size of text mappings:

| Mapping |  Size  |
| ------- | ------ |
| A       | ~15MB  |
| B       | ~135MB |

---

Example on a real program:
```
2f7200000-2fabca000 r--p 00000000        bolted-binary
2fabd9000-2fe47c000 r-xp 039c9000        bolted-binary <- 1st txt segment
2fe48b000-2fe61d000 r--p 0727b000        bolted-binary
2fe62c000-2fe660000 rw-p 0740c000        bolted-binary
2fe660000-2fea4c000 rw-p 00000000
2fec00000-303dad000 r-xp 07a00000        bolted-binary <- 2nd (appears only on the bolted binary)
```
- Added to CoreTests in BUILD.gn
- Hiding DataAggregator std out/err outputs
@paschalis-mpeis paschalis-mpeis force-pushed the users/paschalis-mpeis/bolt-heatmap-fix branch from 3fb0ca5 to 31e98b1 Compare November 21, 2024 10:07
@paschalis-mpeis
Copy link
Member Author

Thanks for your review Maks. Addressed the nits and rebased to latest main to get the unrelated tests green.

I'll be merging this in a couple of working days, given no further comments from other reviewers.

@paschalis-mpeis paschalis-mpeis merged commit 4b71b37 into main Nov 25, 2024
8 checks passed
@paschalis-mpeis paschalis-mpeis deleted the users/paschalis-mpeis/bolt-heatmap-fix branch November 25, 2024 13:12
@llvm-ci
Copy link
Collaborator

llvm-ci commented Nov 25, 2024

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building bolt,llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/1688

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/37/87' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-19800-37-87.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=87 GTEST_SHARD_INDEX=37 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


@zmodem
Copy link
Collaborator

zmodem commented Nov 26, 2024

We're hitting test failures after this on Linux. From a build directory:

$ cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_ENABLE_PROJECTS='clang;lld;bolt' -DLLVM_ENABLE_RUNTIMES=compiler-rt -DLLVM_TARGETS_TO_BUILD='AArch64;X86' ../llvm && ninja check-bolt
[...]
******************** TEST 'BOLT-Unit :: Core/./CoreTests/26/27' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/work/llvm-project/build.xx/tools/bolt/unittests/Core/./CoreTests-BOLT-Unit-989610-26-27.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=27 GTEST_SHARD_INDEX=26 /work/llvm-project/build.xx/tools/bolt/unittests/Core/./CoreTests
--

Script:
--
/work/llvm-project/build.xx/tools/bolt/unittests/Core/./CoreTests --gtest_filter=AArch64/MemoryMapsTester.MultipleSegmentsMismatchedBaseAddress/0
--
/work/llvm-project/bolt/unittests/Core/MemoryMaps.cpp:139: Failure
Death test: { Error Err = DA.preprocessProfile(*BC); }
    Result: failed to die.
 Error msg:
[  DEATH   ] BOLT-WARNING: build-id will not be checked because we could not read one from input binary
[  DEATH   ] Error reading BOLT data input file: line 4, column 5: expected decimal number
[  DEATH   ] Found: name
[  DEATH   ] PERF2BOLT: failed to parse samples
[  DEATH   ] Error reading BOLT data input file: line 4, column 10: expected decimal number
[  DEATH   ] Found: name
[  DEATH   ] PERF2BOLT: failed to parse memory events: Input/output error
[  DEATH   ] 


/work/llvm-project/bolt/unittests/Core/MemoryMaps.cpp:139
Death test: { Error Err = DA.preprocessProfile(*BC); }
    Result: failed to die.
 Error msg:
[  DEATH   ] BOLT-WARNING: build-id will not be checked because we could not read one from input binary
[  DEATH   ] Error reading BOLT data input file: line 4, column 5: expected decimal number
[  DEATH   ] Found: name
[  DEATH   ] PERF2BOLT: failed to parse samples
[  DEATH   ] Error reading BOLT data input file: line 4, column 10: expected decimal number
[  DEATH   ] Found: name
[  DEATH   ] PERF2BOLT: failed to parse memory events: Input/output error
[  DEATH   ] 



********************
********************
Failed Tests (2):
  BOLT-Unit :: Core/./CoreTests/AArch64/MemoryMapsTester/MultipleSegmentsMismatchedBaseAddress/0
  BOLT-Unit :: Core/./CoreTests/X86/MemoryMapsTester/MultipleSegmentsMismatchedBaseAddress/0


Testing Time: 3.13s

Total Discovered Tests: 489
  Skipped    :   7 (1.43%)
  Unsupported:  49 (10.02%)
  Passed     : 431 (88.14%)
  Failed     :   2 (0.41%)

Can you please take a look? I'll prepare a revert in the meantime.

zmodem added a commit that referenced this pull request Nov 26, 2024
… segments (#92815)"

This caused test failures, see comment on the PR:

  Failed Tests (2):
    BOLT-Unit :: Core/./CoreTests/AArch64/MemoryMapsTester/MultipleSegmentsMismatchedBaseAddress/0
    BOLT-Unit :: Core/./CoreTests/X86/MemoryMapsTester/MultipleSegmentsMismatchedBaseAddress/0

> When a binary has multiple text segments, the Size is computed as the
> difference of the last address of these segments from the BaseAddress.
> The base addresses of all text segments must be the same.
>
> Introduces flag 'perf-script-events' for testing. It allows passing perf events
> without BOLT having to parse them using 'perf script'. The flag is used to
> pass a mock perf profile that has two memory mappings for a mock binary
> that has two text segments. The size of the mapping is updated as this
> change `parseMMapEvents` processes all text segments.

This reverts commit 4b71b37.
@paschalis-mpeis
Copy link
Member Author

Hi Hans, thanks for reporting this.

Are these errors from an official buildbot?
I am asking as the error reported in the earlier comment seems unrelated.

@zmodem
Copy link
Collaborator

zmodem commented Nov 26, 2024

No, the error is from a local build. Agreed that the comment about https://lab.llvm.org/buildbot/#/builders/146/builds/1688 seems unrelated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants