Commit 3fa8210

Merge branch 'main' into users/vikramRH/enable_opt
2 parents 9cccf69 + 94279ae commit 3fa8210

663 files changed, +98741 -100390 lines changed

bolt/docs/CommandLineArgumentReference.md

Lines changed: 7 additions & 1 deletion
@@ -283,6 +283,12 @@
 
   List of functions to pad with amount of bytes
 
+- `--print-mappings`
+
+  Print mappings in the legend, between characters/blocks and text sections
+  (default false).
+
+
 - `--profile-format=<value>`
 
   Format to dump profile output in aggregation mode, default is fdata
@@ -1240,4 +1246,4 @@
 
 - `--print-options`
 
-  Print non-default options after command line parsing
+  Print non-default options after command line parsing

bolt/docs/HeatmapHeader.png

75 KB

bolt/docs/Heatmaps.md

Lines changed: 56 additions & 12 deletions
@@ -1,9 +1,9 @@
 # Code Heatmaps
 
 BOLT has gained the ability to print code heatmaps based on
-sampling-based LBR profiles generated by `perf`. The output is produced
-in colored ASCII to be displayed in a color-capable terminal. It looks
-something like this:
+sampling-based profiles generated by `perf`, either with `LBR` data or not.
+The output is produced in colored ASCII to be displayed in a color-capable
+terminal. It looks something like this:
 
 ![](./Heatmap.png)
 
@@ -32,20 +32,64 @@ $ llvm-bolt-heatmap -p perf.data <executable>
 ```
 
 By default the heatmap will be dumped to *stdout*. You can change it
-with `-o <heatmapfile>` option. Each character/block in the heatmap
-shows the execution data accumulated for corresponding 64 bytes of
-code. You can change this granularity with a `-block-size` option.
-E.g. set it to 4096 to see code usage grouped by 4K pages.
-Other useful options are:
+with `-o <heatmapfile>` option.
 
-```bash
--line-size=<uint>   - number of entries per line (default 256)
--max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
-```
 
 If you prefer to look at the data in a browser (or would like to share
 it that way), then you can use an HTML conversion tool. E.g.:
 
 ```bash
 $ aha -b -f <heatmapfile> > <heatmapfile>.html
 ```
+
+---
+
+## Background on heatmaps:
+A heatmap is effectively a histogram that is rendered into a grid for better
+visualization.
+In theory we can generate a heatmap using any binary and a perf profile.
+
+Each block/character in the heatmap shows the execution data accumulated for
+corresponding 64 bytes of code. You can change this granularity with a
+`-block-size` option.
+E.g. set it to 4096 to see code usage grouped by 4K pages.
+
+
+When a block is shown as a dot, it means that no samples were found for that
+address.
+When it is shown as a letter, it indicates a captured sample on a particular
+text section of the binary.
+To show a mapping between letters and text sections in the legend, use
+`-print-mappings`.
+When a sampled address does not belong to any of the text sections, the
+characters 'o' or 'O' will be shown.
+
+The legend shows by default the ranges in the heatmap according to the number
+of samples per block.
+A color is assigned per range, except the first two ranges that are distinguished by
+lower and upper case letters.
+
+On the Y axis, each row/line starts with an actual address of the binary.
+Consecutive lines in the heatmap advance by the same amount, with the binary
+size covered by a line dependent on the block size and the line size.
+An empty new line is inserted for larger gaps between samples.
+
+On the X axis, the horizontally emitted hex numbers can help *estimate* where
+in the line the samples lie, but they cannot be combined to provide a full
+address, as they are relative to both the bucket and line sizes.
+
+In the example below, the highlighted `0x100` column is not an offset to each
+row's address, but instead, it points to the middle of the line.
+For the generation, the default bucket size was used with a line size of 128.
+
+
+![](./HeatmapHeader.png)
+
+
+Some useful options are:
+
+```
+-line-size=<uint>   - number of entries per line (default 256)
+-max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
+-print-mappings     - print mappings in the legend, between characters/blocks and text sections (default false)
+```
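The Y-axis paragraph added to Heatmaps.md says the span of the binary covered by one heatmap line depends on the block size and the line size. A minimal sketch of that arithmetic follows. It assumes the span is simply the product of the two and that rows start at multiples of that span. The helper names `bytesPerLine` and `blockIndexInLine` are hypothetical and not part of BOLT.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes of code covered by one heatmap row, assuming it
// equals the block size (-block-size) times the entries per row (-line-size).
constexpr uint64_t bytesPerLine(uint64_t BlockSize, uint64_t LineSize) {
  return BlockSize * LineSize;
}

// Hypothetical helper: zero-based position of the block that holds Address
// within its row, assuming rows start at multiples of bytesPerLine().
inline uint64_t blockIndexInLine(uint64_t Address, uint64_t BlockSize,
                                 uint64_t LineSize) {
  assert(BlockSize > 0 && LineSize > 0 && "sizes must be non-zero");
  return (Address % bytesPerLine(BlockSize, LineSize)) / BlockSize;
}
```

Under these assumptions, the documented defaults (64-byte blocks, 256 entries per line) give 16384 bytes per row, and a line size of 128 with the default block size gives 8192 bytes per row.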

bolt/include/bolt/Core/MCPlusBuilder.h

Lines changed: 6 additions & 2 deletions
@@ -2041,9 +2041,13 @@ class MCPlusBuilder {
     return InstructionListType();
   }
 
+  /// Returns a function body that contains only a return instruction. An
+  /// example usage is a workaround for the '__bolt_fini_trampoline' of
+  /// Instrumentation.
   virtual InstructionListType createDummyReturnFunction(MCContext *Ctx) const {
-    llvm_unreachable("not implemented");
-    return InstructionListType();
+    InstructionListType Insts(1);
+    createReturn(Insts[0]);
+    return Insts;
   }
 
   /// This method takes an indirect call instruction and splits it up into an
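The hunk above moves the body of `createDummyReturnFunction` from the X86 builder into the base class, so any target that provides `createReturn` gets the dummy function for free. The pattern can be sketched in isolation as below. `MockInst`, `MockBuilder`, and `MockAArch64Builder` are hypothetical stand-ins written for this illustration, not BOLT or LLVM types, and the pure-virtual hook is an assumption of the sketch rather than how `MCPlusBuilder` declares it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for an instruction; real BOLT code uses llvm::MCInst.
struct MockInst {
  std::string Opcode;
};
using MockInstList = std::vector<MockInst>;

class MockBuilder {
public:
  virtual ~MockBuilder() = default;

  // Target hook: fill Inst with the target's return instruction.
  virtual void createReturn(MockInst &Inst) const = 0;

  // Shared default: a function body that holds only a return instruction,
  // built from the target hook so that targets need no override of their own.
  virtual MockInstList createDummyReturnFunction() const {
    MockInstList Insts(1);
    createReturn(Insts[0]);
    assert(!Insts[0].Opcode.empty() && "target must emit a return");
    return Insts;
  }
};

// Hypothetical target that only implements the hook.
class MockAArch64Builder : public MockBuilder {
public:
  void createReturn(MockInst &Inst) const override { Inst.Opcode = "ret"; }
};
```

In this sketch a new target inherits a working `createDummyReturnFunction` as soon as it implements the hook, which mirrors what the new `bolt/test/AArch64/dummy-return.s` test checks for `__bolt_fini_trampoline`.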

bolt/include/bolt/Utils/CommandLineOpts.h

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ extern llvm::cl::opt<unsigned> ExecutionCountThreshold;
 extern llvm::cl::opt<unsigned> HeatmapBlock;
 extern llvm::cl::opt<unsigned long long> HeatmapMaxAddress;
 extern llvm::cl::opt<unsigned long long> HeatmapMinAddress;
+extern llvm::cl::opt<bool> HeatmapPrintMappings;
 extern llvm::cl::opt<bool> HotData;
 extern llvm::cl::opt<bool> HotFunctionsAtEnd;
 extern llvm::cl::opt<bool> HotText;

bolt/lib/Profile/Heatmap.cpp

Lines changed: 18 additions & 0 deletions
@@ -13,6 +13,7 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Format.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/raw_ostream.h"
 #include <algorithm>
@@ -164,6 +165,7 @@ void Heatmap::print(raw_ostream &OS) const {
 
   // Print map legend
   OS << "Legend:\n";
+  OS << "\nRanges:\n";
   uint64_t PrevValue = 0;
   for (unsigned I = 0; I < sizeof(Range) / sizeof(Range[0]); ++I) {
     const uint64_t Value = Range[I];
@@ -172,6 +174,22 @@ void Heatmap::print(raw_ostream &OS) const {
     OS << " : (" << PrevValue << ", " << Value << "]\n";
     PrevValue = Value;
   }
+  if (opts::HeatmapPrintMappings) {
+    OS << "\nSections:\n";
+    unsigned SectionIdx = 0;
+    for (auto TxtSeg : TextSections) {
+      const char Upper = static_cast<char>('A' + ((SectionIdx++) % 26));
+      const char Lower = static_cast<char>(std::tolower(Upper));
+      OS << formatv(" {0}/{1} : {2,-10} ", Lower, Upper, TxtSeg.Name);
+      if (MaxAddress > 0xffffffff)
+        OS << format("0x%016" PRIx64, TxtSeg.BeginAddress) << "-"
+           << format("0x%016" PRIx64, TxtSeg.EndAddress) << "\n";
+      else
+        OS << format("0x%08" PRIx64, TxtSeg.BeginAddress) << "-"
+           << format("0x%08" PRIx64, TxtSeg.EndAddress) << "\n";
+    }
+    OS << "\n";
+  }
 
   // Pos - character position from right in hex form.
   auto printHeader = [&](unsigned Pos) {
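The new `Sections:` legend assigns each text section a lower/upper letter pair that wraps after 26 sections, and prints addresses 16 hex digits wide when `MaxAddress` exceeds 32 bits and 8 digits wide otherwise. The standalone sketch below mirrors that logic with the C standard library in place of `formatv`/`format`. The helper names `sectionLetters` and `legendLine` are hypothetical, and the exact spacing follows the format string as it appears in this diff, which may differ from the upstream source.

```cpp
#include <cassert>
#include <cctype>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>

// Lower/upper letter pair for the section at position Idx; wraps after 26.
inline std::pair<char, char> sectionLetters(unsigned Idx) {
  const char Upper = static_cast<char>('A' + (Idx % 26));
  const char Lower =
      static_cast<char>(std::tolower(static_cast<unsigned char>(Upper)));
  return {Lower, Upper};
}

// Hypothetical helper: one legend line for a section. Addresses are printed
// 16 hex digits wide when MaxAddress does not fit in 32 bits, else 8 wide.
inline std::string legendLine(unsigned Idx, const std::string &Name,
                              uint64_t Begin, uint64_t End,
                              uint64_t MaxAddress) {
  assert(Begin <= End && "section range must be ordered");
  const std::pair<char, char> Letters = sectionLetters(Idx);
  char Buf[256];
  if (MaxAddress > 0xffffffffULL)
    std::snprintf(Buf, sizeof(Buf),
                  "%c/%c : %-10s 0x%016" PRIx64 "-0x%016" PRIx64,
                  Letters.first, Letters.second, Name.c_str(), Begin, End);
  else
    std::snprintf(Buf, sizeof(Buf),
                  "%c/%c : %-10s 0x%08" PRIx64 "-0x%08" PRIx64,
                  Letters.first, Letters.second, Name.c_str(), Begin, End);
  return Buf;
}
```

Because the index is taken modulo 26, a binary with more than 26 text sections would reuse letters in this scheme, so section 26 shares `a/A` with section 0.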

bolt/lib/Target/X86/X86MCPlusBuilder.cpp

Lines changed: 0 additions & 6 deletions
@@ -3241,12 +3241,6 @@ class X86MCPlusBuilder : public MCPlusBuilder {
     return Insts;
   }
 
-  InstructionListType createDummyReturnFunction(MCContext *Ctx) const override {
-    InstructionListType Insts(1);
-    createReturn(Insts[0]);
-    return Insts;
-  }
-
   BlocksVectorTy indirectCallPromotion(
       const MCInst &CallInst,
       const std::vector<std::pair<MCSymbol *, uint64_t>> &Targets,

bolt/lib/Utils/CommandLineOpts.cpp

Lines changed: 6 additions & 0 deletions
@@ -105,6 +105,12 @@ cl::opt<unsigned long long> HeatmapMinAddress(
     cl::desc("minimum address considered valid for heatmap (default 0)"),
     cl::Optional, cl::cat(HeatmapCategory));
 
+cl::opt<bool> HeatmapPrintMappings(
+    "print-mappings", cl::init(false),
+    cl::desc("print mappings in the legend, between characters/blocks and text "
+             "sections (default false)"),
+    cl::Optional, cl::cat(HeatmapCategory));
+
 cl::opt<bool> HotData("hot-data",
                       cl::desc("hot data symbols support (relocation mode)"),
                       cl::cat(BoltCategory));

bolt/test/AArch64/dummy-return.s

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# REQUIRES: system-linux,target=aarch64{{.*}}
+
+# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o
+# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q -static
+# RUN: llvm-bolt -instrument -instrumentation-sleep-time=1 %t.exe \
+# RUN:   -o %t.instr 2>&1 | FileCheck %s
+# RUN: llvm-objdump --disassemble-symbols=__bolt_fini_trampoline %t.instr -D \
+# RUN:   | FileCheck %s -check-prefix=CHECK-ASM
+
+# CHECK: BOLT-INFO: output linked against instrumentation runtime library
+# CHECK-ASM: <__bolt_fini_trampoline>:
+# CHECK-ASM-NEXT: ret
+
+  .text
+  .align 4
+  .global _start
+  .type _start, %function
+_start:
+  bl foo
+  ret
+  .size _start, .-_start
+
+  .global foo
+  .type foo, %function
+foo:
+  mov w0, wzr
+  ret
+  .size foo, .-foo
