-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[BOLT] DataAggregator support for binaries with multiple text segments #92815
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-bolt Author: Paschalis Mpeis (paschalis-mpeis) ChangesDRAFT PR: This is WIP.Needs:
In short, it improves heatmaps by:
Full diff: https://github.com/llvm/llvm-project/pull/92815.diff 2 Files Affected:
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index e06debcee741e..8d5b5531654cb 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -2009,8 +2009,9 @@ std::error_code DataAggregator::parseMMapEvents() {
return MI.second.PID == FileMMapInfo.second.PID;
});
- if (PIDExists)
- continue;
+ // let duplicates to the multimap.
+ // if (PIDExists)
+ // continue;
GlobalMMapInfo.insert(FileMMapInfo);
}
@@ -2067,7 +2068,17 @@ std::error_code DataAggregator::parseMMapEvents() {
}
}
- BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo));
+ // The mapping was already in place, but there are cases where the size
+ // is wrong. Fix it if needed.
+ if(!BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo)).second) {
+ auto EndAddress = MMapInfo.MMapAddress + MMapInfo.Size;
+ auto FixedSize = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress;
+
+ if (FixedSize != BinaryMMapInfo[MMapInfo.PID].Size) {
+ outs() << "MMap size fixed: " << Twine::utohexstr(FixedSize) << " \n";
+ BinaryMMapInfo[MMapInfo.PID].Size = FixedSize;
+ }
+ }
}
if (BinaryMMapInfo.empty()) {
diff --git a/bolt/lib/Profile/Heatmap.cpp b/bolt/lib/Profile/Heatmap.cpp
index 210a5cc98c104..233b070bde2da 100644
--- a/bolt/lib/Profile/Heatmap.cpp
+++ b/bolt/lib/Profile/Heatmap.cpp
@@ -164,6 +164,7 @@ void Heatmap::print(raw_ostream &OS) const {
// Print map legend
OS << "Legend:\n";
+ OS << "\nRegions:\n";
uint64_t PrevValue = 0;
for (unsigned I = 0; I < sizeof(Range) / sizeof(Range[0]); ++I) {
const uint64_t Value = Range[I];
@@ -173,6 +174,16 @@ void Heatmap::print(raw_ostream &OS) const {
PrevValue = Value;
}
+ {
+ OS << "\nSections:\n";
+ int Idx = 0;
+ for (auto TxtSeg : TextSections) {
+ OS << static_cast<char>('A' + ((Idx++) % 26)) << ": " << TxtSeg.Name
+ << ": 0x" << Twine::utohexstr(TxtSeg.BeginAddress) << "-0x"
+ << Twine::utohexstr(TxtSeg.EndAddress) << "\n";
+ }
+ }
+
// Pos - character position from right in hex form.
auto printHeader = [&](unsigned Pos) {
OS << " ";
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
c74aaf4
to
ff020d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Paschalis, thanks for the PR. I think it applies to all binaries with multiple code segments and is a welcome fix. Can you think of the way to test it? I'm a bit confused regarding the comment about the incorrect BaseAddress
, perhaps the test case will clarify it for me.
bolt/lib/Profile/DataAggregator.cpp
Outdated
@@ -2009,9 +2009,6 @@ std::error_code DataAggregator::parseMMapEvents() { | |||
return MI.second.PID == FileMMapInfo.second.PID; | |||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We no longer need the code above if we are tracking all mappings.
bolt/lib/Profile/DataAggregator.cpp
Outdated
@@ -2067,7 +2064,21 @@ std::error_code DataAggregator::parseMMapEvents() { | |||
} | |||
} | |||
|
|||
BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo)); | |||
// In some larger binaries, the loaded binary gets a a second text segment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Essentially, instead of storing the mapping for the first segment, you suggest to store the base address and the last address calculated as BaseAddress + Size
, right? We will need to update the description of MMapInfo
structure accordingly.
ff020d9
to
d5a5aaa
Compare
Hi @maksfb, thanks for your review. I went with a unit test. Consider this as a starting point and feel free to suggest more changes. In short: I invoke The unit test ignores some errors incurred by not supplying valid 'memory events' for irrelevant processing in |
946e149
to
b1765dc
Compare
bolt/lib/Profile/DataAggregator.cpp
Outdated
@@ -464,6 +470,13 @@ void DataAggregator::filterBinaryMMapInfo() { | |||
|
|||
int DataAggregator::prepareToParse(StringRef Name, PerfProcessInfo &Process, | |||
PerfProcessErrorCallbackTy Callback) { | |||
if (!opts::ReadPerfEvents.empty()) { | |||
dbgs() << "PERF2BOLT: using pre-processed perf events for '" << Name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dbgs() << "PERF2BOLT: using pre-processed perf events for '" << Name | |
outs() << "PERF2BOLT: using pre-processed perf events for '" << Name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the test!
Do you think we need to check that BaseAddress
is calculated consistently across multiple segment mappings?
Please update comments on struct MMapInfo
to reflect the effect of multiple segments.
b1765dc
to
14cf49c
Compare
Thanks Maks for your review. That is a good idea. I've added an assertion to ensure BaseAddress consistency and a unit test for it. |
Just pointing out that some tests are failing which are not related to this change or to BOLT (it's MLIR on Windows). |
Gentle ping. I think the PR addresses all comments. Do we need any more changes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Paschalis, sorry for the delay. Looks good to me, but please address a few nits before committing.
bolt/lib/Profile/DataAggregator.cpp
Outdated
assert(MMapInfo.BaseAddress == BinaryMMapInfo[MMapInfo.PID].BaseAddress && | ||
"Base address on multiple segment mappings should match"); | ||
|
||
// Update section size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Update section size. | |
// Update mapping size. |
bolt/lib/Profile/DataAggregator.cpp
Outdated
uint64_t EndAddress = MMapInfo.MMapAddress + MMapInfo.Size; | ||
uint64_t Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint64_t EndAddress = MMapInfo.MMapAddress + MMapInfo.Size; | |
uint64_t Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress; | |
const uint64_t EndAddress = MMapInfo.MMapAddress + MMapInfo.Size; | |
const uint64_t Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress; |
bolt/lib/Profile/DataAggregator.cpp
Outdated
cl::opt<std::string> | ||
ReadPerfEvents("perf-script-events", | ||
cl::desc("skip perf event collection by supplying a " | ||
"perf-script output in a textual format"), | ||
cl::cat(AggregatorCategory)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't expect the end user to see this option, let's make it cl::Hidden
or cl::ReallyHidden
. Alternatively, we can modify DataAggregator
interface for testing since you don't invoke the option via CL.
bolt/unittests/Core/MemoryMaps.cpp
Outdated
@@ -0,0 +1,142 @@ | |||
//===- bolt/unittest/Core/MemoryMaps.cpp -------------------------------===// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: adjust the line to end at 80 chars.
When a binary has multiple text segments, the Size is computed as the difference of the last address of these segments from the BaseAddress. The base addresses of all text segments must be the same. Background: Larger binaries get two text segments mapped when loaded in memory. BOLT processes only the first, which is not having a correct BaseAddress, causing a wrong computation of a BinaryMMapInfo's size. Consequently, BOLT wrongly thinks that many of the samples fall outside the binary and ignores them. As a result, when used in heatmaps the output excludes all those entries and the section hotness statistics are wrong. This bug is present in both the AArch64 and x86 backends. --- This patch introduces the flag 'perf-script-events' that allows passing perf events without BOLT having to parse them using 'perf script'. The flag is used to pass a mock perf profile that has two memory mappings for a mock binary that has two text segments. The size of the mapping is updated as `parseMMapEvents` now processes all text segments. --- Example used in unit tests: From `/proc/<BINARY PID>/maps`, we have 2 text mappings, say A and B. ``` abc0000000-abc1000000 r-xp 011c0000 103:01 1573523 BINARY abc2000000-abca000000 r-xp 031d0000 103:01 1573523 BINARY ``` Size of text mappings: | Mapping | Size | | ------- | ------ | | A | ~15MB | | B | ~135MB | --- Example on a real program: ``` 2f7200000-2fabca000 r--p 00000000 bolted-binary 2fabd9000-2fe47c000 r-xp 039c9000 bolted-binary <- 1st txt segment 2fe48b000-2fe61d000 r--p 0727b000 bolted-binary 2fe62c000-2fe660000 rw-p 0740c000 bolted-binary 2fe660000-2fea4c000 rw-p 00000000 2fec00000-303dad000 r-xp 07a00000 bolted-binary <- 2nd (appears only on the bolted binary) ```
- Added to CoreTests in BUILD.gn - Hiding DataAggregator std out/err outputs
Causes issues on the CI too.
3fb0ca5
to
31e98b1
Compare
Thanks for your review Maks. Addressed the nits and rebased to latest main to get the unrelated tests green. I'll be merging this in a couple of working days, given no further comments from other reviewers. |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/1688 Here is the relevant piece of the build log for the reference
|
We're hitting test failures after this on Linux. From a build directory:
Can you please take a look? I'll prepare a revert in the meantime. |
… segments (#92815)" This caused test failures, see comment on the PR: Failed Tests (2): BOLT-Unit :: Core/./CoreTests/AArch64/MemoryMapsTester/MultipleSegmentsMismatchedBaseAddress/0 BOLT-Unit :: Core/./CoreTests/X86/MemoryMapsTester/MultipleSegmentsMismatchedBaseAddress/0 > When a binary has multiple text segments, the Size is computed as the > difference of the last address of these segments from the BaseAddress. > The base addresses of all text segments must be the same. > > Introduces flag 'perf-script-events' for testing. It allows passing perf events > without BOLT having to parse them using 'perf script'. The flag is used to > pass a mock perf profile that has two memory mappings for a mock binary > that has two text segments. The size of the mapping is updated as this > change `parseMMapEvents` processes all text segments. This reverts commit 4b71b37.
Hi Hans, thanks for reporting this. Are these errors from an official buildbot? |
No, the error is from a local build. Agreed that the comment about https://lab.llvm.org/buildbot/#/builders/146/builds/1688 seems unrelated. |
When a binary has multiple text segments, the Size is computed as the
difference of the last address of these segments from the BaseAddress.
The base addresses of all text segments must be the same.
Background:
Larger binaries get two text segments mapped when loaded in memory.
BOLT processes only the first, which is not having a correct BaseAddress,
causing a wrong computation of a BinaryMMapInfo's size.
Consequently, BOLT wrongly thinks that many of the samples fall outside
the binary and ignores them. As a result, when used in heatmaps the
output excludes all those entries and the section hotness statistics are
wrong.
This bug is present in both the AArch64 and x86 backends.
This patch introduces the flag 'perf-script-events' that allows passing
perf events without BOLT having to parse them using 'perf script'.
The flag is used to pass a mock perf profile that has two memory
mappings for a mock binary that has two text segments. The size of the
mapping is updated as
parseMMapEvents
now processes all text segments.Example used in unit tests:
From
/proc/<BINARY PID>/maps
, we have 2 text mappings, say A and B.Size of text mappings:
Example on a real program: