[BOLT][heatmap] Compute section utilization and partition score #139193
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
@llvm/pr-subscribers-bolt
Author: Amir Ayupov (aaupov)
Changes
Heatmap collects samples grouped by buckets. The size is configurable ...
Define section utilization as the number of buckets mapped to the ...
Note that for buckets that cross section boundaries, we will attribute ...
Test Plan: updated heatmap-preagg.test
Full diff: https://github.com/llvm/llvm-project/pull/139193.diff
4 Files Affected:
diff --git a/bolt/include/bolt/Profile/Heatmap.h b/bolt/include/bolt/Profile/Heatmap.h
index fc1e2cd30011e..c7b3d45fa5cc2 100644
--- a/bolt/include/bolt/Profile/Heatmap.h
+++ b/bolt/include/bolt/Profile/Heatmap.h
@@ -9,6 +9,7 @@
#ifndef BOLT_PROFILE_HEATMAP_H
#define BOLT_PROFILE_HEATMAP_H
+#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
#include <cstdint>
#include <map>
@@ -45,6 +46,10 @@ class Heatmap {
/// Map section names to their address range.
const std::vector<SectionNameAndRange> TextSections;
+ uint64_t getNumBuckets(uint64_t Begin, uint64_t End) const {
+ return End / BucketSize + !!(End % BucketSize) - Begin / BucketSize;
+ };
+
public:
explicit Heatmap(uint64_t BucketSize = 4096, uint64_t MinAddress = 0,
uint64_t MaxAddress = std::numeric_limits<uint64_t>::max(),
@@ -77,9 +82,22 @@ class Heatmap {
void printCDF(raw_ostream &OS) const;
- void printSectionHotness(StringRef Filename) const;
+ /// Struct describing individual section hotness.
+ struct SectionStats {
+ uint64_t Samples{0};
+ uint64_t Buckets{0};
+ };
+
+ /// Mapping from section name to associated \p SectionStats. Special entries:
+ /// - [total] for total stats,
+ /// - [unmapped] for samples outside any section, if non-zero.
+ using SectionStatsMap = StringMap<SectionStats>;
+
+ SectionStatsMap computeSectionStats() const;
+
+ void printSectionHotness(const SectionStatsMap &, StringRef Filename) const;
- void printSectionHotness(raw_ostream &OS) const;
+ void printSectionHotness(const SectionStatsMap &, raw_ostream &OS) const;
size_t size() const { return Map.size(); }
};
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index a5ac87ee781b2..11850fab28bb8 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -1357,10 +1357,12 @@ std::error_code DataAggregator::printLBRHeatMap() {
HM.printCDF(opts::OutputFilename);
else
HM.printCDF(opts::OutputFilename + ".csv");
+ Heatmap::SectionStatsMap Stats = HM.computeSectionStats();
if (opts::OutputFilename == "-")
- HM.printSectionHotness(opts::OutputFilename);
+ HM.printSectionHotness(Stats, opts::OutputFilename);
else
- HM.printSectionHotness(opts::OutputFilename + "-section-hotness.csv");
+ HM.printSectionHotness(Stats,
+ opts::OutputFilename + "-section-hotness.csv");
return std::error_code();
}
diff --git a/bolt/lib/Profile/Heatmap.cpp b/bolt/lib/Profile/Heatmap.cpp
index c7821b3a1a15a..d3ff74f664046 100644
--- a/bolt/lib/Profile/Heatmap.cpp
+++ b/bolt/lib/Profile/Heatmap.cpp
@@ -284,23 +284,24 @@ void Heatmap::printCDF(raw_ostream &OS) const {
Counts.clear();
}
-void Heatmap::printSectionHotness(StringRef FileName) const {
+void Heatmap::printSectionHotness(const Heatmap::SectionStatsMap &Stats,
+ StringRef FileName) const {
std::error_code EC;
raw_fd_ostream OS(FileName, EC, sys::fs::OpenFlags::OF_None);
if (EC) {
errs() << "error opening output file: " << EC.message() << '\n';
exit(1);
}
- printSectionHotness(OS);
+ printSectionHotness(Stats, OS);
}
-void Heatmap::printSectionHotness(raw_ostream &OS) const {
+StringMap<Heatmap::SectionStats> Heatmap::computeSectionStats() const {
uint64_t NumTotalCounts = 0;
- StringMap<uint64_t> SectionHotness;
+ StringMap<SectionStats> Stat;
unsigned TextSectionIndex = 0;
if (TextSections.empty())
- return;
+ return Stat;
uint64_t UnmappedHotness = 0;
auto RecordUnmappedBucket = [&](uint64_t Address, uint64_t Frequency) {
@@ -312,37 +313,61 @@ void Heatmap::printSectionHotness(raw_ostream &OS) const {
UnmappedHotness += Frequency;
};
- for (const std::pair<const uint64_t, uint64_t> &KV : Map) {
- NumTotalCounts += KV.second;
+ for (const auto [Bucket, Count] : Map) {
+ NumTotalCounts += Count;
// We map an address bucket to the first section (lowest address)
// overlapping with that bucket.
- auto Address = KV.first * BucketSize;
+ auto Address = Bucket * BucketSize;
while (TextSectionIndex < TextSections.size() &&
Address >= TextSections[TextSectionIndex].EndAddress)
TextSectionIndex++;
if (TextSectionIndex >= TextSections.size() ||
Address + BucketSize < TextSections[TextSectionIndex].BeginAddress) {
- RecordUnmappedBucket(Address, KV.second);
+ RecordUnmappedBucket(Address, Count);
continue;
}
- SectionHotness[TextSections[TextSectionIndex].Name] += KV.second;
+ SectionStats &SecStats = Stat[TextSections[TextSectionIndex].Name];
+ ++SecStats.Buckets;
+ SecStats.Samples += Count;
}
+ Stat["[total]"] = SectionStats{NumTotalCounts, Map.size()};
+ if (UnmappedHotness)
+ Stat["[unmapped]"] = SectionStats{UnmappedHotness, 0};
+
+ return Stat;
+}
+void Heatmap::printSectionHotness(const StringMap<SectionStats> &Stats,
+ raw_ostream &OS) const {
+ if (TextSections.empty())
+ return;
+
+ auto TotalIt = Stats.find("[total]");
+ assert(TotalIt != Stats.end() && "Malformed SectionStatsMap");
+ const uint64_t NumTotalCounts = TotalIt->second.Samples;
assert(NumTotalCounts > 0 &&
"total number of heatmap buckets should be greater than 0");
- OS << "Section Name, Begin Address, End Address, Percentage Hotness\n";
- for (auto &TextSection : TextSections) {
- OS << TextSection.Name << ", 0x"
- << Twine::utohexstr(TextSection.BeginAddress) << ", 0x"
- << Twine::utohexstr(TextSection.EndAddress) << ", "
- << format("%.4f",
- 100.0 * SectionHotness[TextSection.Name] / NumTotalCounts)
- << "\n";
+ OS << "Section Name, Begin Address, End Address, Percentage Hotness, "
+ << "Utilization Pct\n";
+ for (const auto [Name, Begin, End] : TextSections) {
+ uint64_t Samples = 0;
+ uint64_t Buckets = 0;
+ auto SectionIt = Stats.find(Name);
+ if (SectionIt != Stats.end()) {
+ Samples = SectionIt->second.Samples;
+ Buckets = SectionIt->second.Buckets;
+ }
+ const float RelHotness = 100. * Samples / NumTotalCounts;
+ const float BucketUtilization = 100. * Buckets / getNumBuckets(Begin, End);
+ OS << formatv("{0}, {1:x}, {2:x}, {3:f4}, {4:f4}\n", Name, Begin, End,
+ RelHotness, BucketUtilization);
}
- if (UnmappedHotness > 0)
- OS << "[unmapped], 0x0, 0x0, "
- << format("%.4f", 100.0 * UnmappedHotness / NumTotalCounts) << "\n";
+ auto UnmappedIt = Stats.find("[unmapped]");
+ if (UnmappedIt == Stats.end())
+ return;
+ const float UnmappedPct = 100. * UnmappedIt->second.Samples / NumTotalCounts;
+ OS << formatv("[unmapped], 0x0, 0x0, {0:f4}, 0\n", UnmappedPct);
}
} // namespace bolt
} // namespace llvm
diff --git a/bolt/test/X86/heatmap-preagg.test b/bolt/test/X86/heatmap-preagg.test
index 00d4d521b1adf..660d37fd03cbe 100644
--- a/bolt/test/X86/heatmap-preagg.test
+++ b/bolt/test/X86/heatmap-preagg.test
@@ -17,17 +17,19 @@ RUN: FileCheck %s --check-prefix CHECK-SEC-HOT-BAT --input-file %t2-section-hotn
CHECK-HEATMAP: PERF2BOLT: read 81 aggregated LBR entries
CHECK-HEATMAP: HEATMAP: invalid traces: 1
-CHECK-SEC-HOT: .init, 0x401000, 0x40101b, 16.8545
-CHECK-SEC-HOT-NEXT: .plt, 0x401020, 0x4010b0, 4.7583
-CHECK-SEC-HOT-NEXT: .text, 0x4010b0, 0x401c25, 78.3872
-CHECK-SEC-HOT-NEXT: .fini, 0x401c28, 0x401c35, 0.0000
+CHECK-SEC-HOT: Section Name, Begin Address, End Address, Percentage Hotness, Utilization Pct
+CHECK-SEC-HOT-NEXT: .init, 0x401000, 0x40101b, 16.8545, 100.0000
+CHECK-SEC-HOT-NEXT: .plt, 0x401020, 0x4010b0, 4.7583, 66.6667
+CHECK-SEC-HOT-NEXT: .text, 0x4010b0, 0x401c25, 78.3872, 85.1064
+CHECK-SEC-HOT-NEXT: .fini, 0x401c28, 0x401c35, 0.0000, 0.0000
CHECK-HEATMAP-BAT: PERF2BOLT: read 79 aggregated LBR entries
CHECK-HEATMAP-BAT: HEATMAP: invalid traces: 2
-CHECK-SEC-HOT-BAT: .init, 0x401000, 0x40101b, 17.2888
-CHECK-SEC-HOT-BAT-NEXT: .plt, 0x401020, 0x4010b0, 5.6132
+CHECK-SEC-HOT-BAT: Section Name, Begin Address, End Address, Percentage Hotness, Utilization Pct
+CHECK-SEC-HOT-BAT-NEXT: .init, 0x401000, 0x40101b, 17.2888, 100.0000
+CHECK-SEC-HOT-BAT-NEXT: .plt, 0x401020, 0x4010b0, 5.6132, 66.6667
CHECK-SEC-HOT-BAT-NEXT: .bolt.org.text, 0x4010b0, 0x401c25, 38.3385
-CHECK-SEC-HOT-BAT-NEXT: .fini, 0x401c28, 0x401c35, 0.0000
-CHECK-SEC-HOT-BAT-NEXT: .text, 0x800000, 0x8002cc, 38.7595
-CHECK-SEC-HOT-BAT-NEXT: .text.cold, 0x800300, 0x800415, 0.0000
+CHECK-SEC-HOT-BAT-NEXT: .fini, 0x401c28, 0x401c35, 0.0000, 0.0000
+CHECK-SEC-HOT-BAT-NEXT: .text, 0x800000, 0x8002cc, 38.7595, 91.6667
+CHECK-SEC-HOT-BAT-NEXT: .text.cold, 0x800300, 0x800415, 0.0000, 0.0000
The code LGTM.
For practical applications, we sometimes use 3-way function splitting, which breaks the code into more than two partitions. In such cases, the most interesting metrics and score should be attached to the non-cold partitions. With "hot text" enabled, the address range of such a partition can be identified by [__hot_start, __hot_end).
I checked one such binary with .warm and .cold sections. We report them separately, e.g.
Do you mean we should bundle
Yes, as the goal is to achieve the perfect score of "1" for that combo partition.
Let me add that as a follow-up (a synthetic hot text section between the symbols).
In heatmap mode, report samples and utilization of the section(s) between hot text markers `[__hot_start, __hot_end)`. The intended use is with multi-way splitting where there are several sections that contain "hot" code (e.g. `.text.warm` with CDSplit). Addresses the review comment on llvm/llvm-project#139193. Test Plan: updated heatmap-preagg.test
Heatmap groups samples into buckets of configurable size (`--block-size`
flag, 64 bytes by default, matching the X86 cache line size). Buckets are
mapped to their containing sections; a bucket that covers multiple
sections is attributed to the first overlapping section. Buckets not
mapped to any section are reported as unmapped.
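
The attribution rule above can be sketched as a small standalone function. This is only an illustration, not the code from the patch: `SectionRange` and `mapBucketToSection` are hypothetical names, sections are assumed to be sorted by address and non-overlapping, and the sketch uses a strict half-open overlap test, which may differ from the patch at exact bucket boundaries.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a section name and its [Begin, End) range.
struct SectionRange {
  std::string Name;
  uint64_t Begin;
  uint64_t End;
};

// Returns the name of the first (lowest-address) section overlapping the
// bucket [BucketAddr, BucketAddr + BucketSize), or "[unmapped]" if no
// section overlaps it. Sections must be sorted by address.
std::string mapBucketToSection(const std::vector<SectionRange> &Sections,
                               uint64_t BucketAddr, uint64_t BucketSize) {
  assert(BucketSize > 0 && "bucket size must be positive");
  const uint64_t BucketEnd = BucketAddr + BucketSize;
  for (const SectionRange &S : Sections) {
    if (S.End <= BucketAddr)
      continue; // Section lies entirely below the bucket.
    if (S.Begin >= BucketEnd)
      break; // Section (and all later ones) lies above the bucket.
    return S.Name;
  }
  return "[unmapped]";
}
```

With the section ranges from heatmap-preagg.test and 64-byte buckets, the bucket at 0x401000 covers both `.init` and the start of `.plt`, and this sketch attributes it to `.init`.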
Heatmap reports section hotness, which is the percentage of samples
attributed to the section.
Define section utilization as the percentage of buckets with non-zero
samples relative to the total number of section buckets.
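
As a rough sketch of the utilization definition, the functions below count the buckets a section spans and turn a hit-bucket count into a percentage. `numBuckets` mirrors the arithmetic of `getNumBuckets` in the diff; `utilizationPct` and its parameter names are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Number of BucketSize-aligned buckets touched by the range [Begin, End).
// Same arithmetic as getNumBuckets in the patch: round End up to a bucket
// boundary, round Begin down, and take the difference.
uint64_t numBuckets(uint64_t Begin, uint64_t End, uint64_t BucketSize) {
  assert(BucketSize > 0 && "bucket size must be positive");
  return End / BucketSize + (End % BucketSize != 0) - Begin / BucketSize;
}

// Percentage of a section's buckets that received at least one sample.
// HitBuckets is the number of buckets with non-zero samples attributed
// to the section (hypothetical parameter name).
double utilizationPct(uint64_t HitBuckets, uint64_t Begin, uint64_t End,
                      uint64_t BucketSize) {
  const uint64_t Total = numBuckets(Begin, End, BucketSize);
  return Total == 0 ? 0.0 : 100.0 * HitBuckets / Total;
}
```

Assuming the 64-byte default, `.plt` in the updated test (0x401020 to 0x4010b0) spans 3 buckets, so the expected 66.6667 is consistent with 2 of them having samples.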
Also define section partition score as the product of section hotness
(where the total excludes unmapped buckets) and mapped utilization,
ranging from 0 to 1 (higher is better).
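
A minimal sketch of the partition score follows, under two assumptions that the description does not spell out: both factors are taken as fractions rather than percentages (so the product stays within 0 to 1), and "mapped utilization" is read as the section's bucket utilization. `SectionSample` and `partitionScore` are hypothetical names, not identifiers from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-section inputs for the score.
struct SectionSample {
  uint64_t Samples;    // Samples attributed to the section.
  uint64_t HitBuckets; // Section buckets with non-zero samples.
  uint64_t NumBuckets; // Total buckets spanned by the section.
};

// Partition score = hotness fraction * utilization fraction, where the
// hotness denominator excludes unmapped samples (assumed interpretation).
double partitionScore(const SectionSample &S, uint64_t TotalSamples,
                      uint64_t UnmappedSamples) {
  assert(UnmappedSamples <= TotalSamples && "unmapped exceeds total");
  const uint64_t MappedSamples = TotalSamples - UnmappedSamples;
  if (MappedSamples == 0 || S.NumBuckets == 0)
    return 0.0;
  const double Hotness = static_cast<double>(S.Samples) / MappedSamples;
  const double Utilization =
      static_cast<double>(S.HitBuckets) / S.NumBuckets;
  return Hotness * Utilization;
}
```

Under these assumptions, a section that receives every mapped sample and has samples in every one of its buckets scores 1, while a section with half of the mapped samples and half of its buckets hit scores 0.25.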
The intended use of the new metrics is with a production profile
collected from a BOLT-optimized binary. In this case, the partition score
of .text (hot text if function splitting is enabled) reflects how
representative the optimization profile is and the quality of hot-cold
splitting.
A partition score of 1 means that all samples fall into hot text and all
buckets (cache lines) in hot text are exercised, which is equivalent to
perfect hot-cold splitting.
Test Plan: updated heatmap-preagg.test