Commit 1b763f2

[BOLT] Add secondary entry points to BAT
Provide secondary entry points for `EntryDiscriminator` call info field in YAML profile.

Increases BAT section size to:
- large binary: 39655300 bytes (1.03x the original),
- medium binary: 3834328 bytes (0.65x),
- small binary: 924 bytes (0.64x).

Depends on: #76911

Test Plan:
- Updated bolt-address-translation{,-yaml}.test
- Added openssl test: rafaelauler/bolt-tests#30

Reviewers: dcci, rafaelauler, maksfb, ayermolo

Reviewed By: rafaelauler

Pull Request: #86218
1 parent da385e8 commit 1b763f2

5 files changed: +97 -30 lines changed

bolt/docs/BAT.md

Lines changed: 36 additions & 25 deletions
@@ -42,21 +42,21 @@ and [BoltAddressTranslation.cpp](/bolt/lib/Profile/BoltAddressTranslation.cpp).
 ### Layout
 The general layout is as follows:
 ```
-Hot functions table header
-|------------------|
-|  Function entry  |
-| |--------------| |
-| | OutOff InOff | |
-| |--------------| |
-~~~~~~~~~~~~~~~~~~~~
+Hot functions table
+Cold functions table

-Cold functions table header
+Functions table:
 |------------------|
 |  Function entry  |
-| |--------------| |
-| | OutOff InOff | |
-| |--------------| |
-~~~~~~~~~~~~~~~~~~~~
+|                  |
+| Address          |
+| translation      |
+| table            |
+|                  |
+| Secondary entry  |
+| points           |
+|------------------|
+
 ```

 ### Functions table
@@ -74,19 +74,20 @@ internal offsets, and between hot and cold fragments, to better spread deltas
 and save space.

 Hot indices are delta encoded, implicitly starting at zero.
-| Entry  | Encoding | Description |
-| ------ | ------| ----------- |
-| `Address` | Continuous, Delta, ULEB128 | Function address in the output binary |
-| `HotIndex` | Delta, ULEB128 | Cold functions only: index of corresponding hot function in hot functions table |
-| `FuncHash` | 8b | Hot functions only: function hash for input function |
-| `NumBlocks` | ULEB128 | Hot functions only: number of basic blocks in the original function |
-| `NumEntries` | ULEB128 | Number of address translation entries for a function |
-| `EqualElems` | ULEB128 | Hot functions only: number of equal offsets in the beginning of a function |
-| `BranchEntries` | Bitmask, `alignTo(EqualElems, 8)` bits | Hot functions only: if `EqualElems` is non-zero, bitmask denoting entries with `BRANCHENTRY` bit |
-
-Function header is followed by `EqualElems` offsets (hot functions only) and
-`NumEntries-EqualElems` (`NumEntries` for cold functions) pairs of offsets for
-current function.
+| Entry  | Encoding | Description | Hot/Cold |
+| ------ | ------| ----------- | ------ |
+| `Address` | Continuous, Delta, ULEB128 | Function address in the output binary | Both |
+| `HotIndex` | Delta, ULEB128 | Index of corresponding hot function in hot functions table | Cold |
+| `FuncHash` | 8b | Function hash for input function | Hot |
+| `NumBlocks` | ULEB128 | Number of basic blocks in the original function | Hot |
+| `NumSecEntryPoints` | ULEB128 | Number of secondary entry points in the original function | Hot |
+| `NumEntries` | ULEB128 | Number of address translation entries for a function | Both |
+| `EqualElems` | ULEB128 | Number of equal offsets in the beginning of a function | Hot |
+| `BranchEntries` | Bitmask, `alignTo(EqualElems, 8)` bits | If `EqualElems` is non-zero, bitmask denoting entries with `BRANCHENTRY` bit | Hot |
+
+Function header is followed by *Address Translation Table* with `NumEntries`
+total entries, and *Secondary Entry Points* table with `NumSecEntryPoints`
+entries (hot functions only).

 ### Address translation table
 Delta encoding means that only the difference with the previous corresponding
@@ -98,8 +99,18 @@ entry is encoded. Input offsets implicitly start at zero.
 | `BBHash` | Optional, 8b | Basic block hash in input binary | BB |
 | `BBIdx` | Optional, Delta, ULEB128 | Basic block index in input binary | BB |

+For hot fragments, the table omits the first `EqualElems` input offsets
+where the input offset equals output offset.
+
 `BRANCHENTRY` bit denotes whether a given offset pair is a control flow source
 (branch or call instruction). If not set, it signifies a control flow target
 (basic block offset).
 `InputAddr` is omitted for equal offsets in input and output function. In this
 case, `BRANCHENTRY` bits are encoded separately in a `BranchEntries` bitvector.
+
+### Secondary Entry Points table
+The table is emitted for hot fragments only. It contains `NumSecEntryPoints`
+offsets denoting secondary entry points, delta encoded, implicitly starting at zero.
+| Entry | Encoding | Description |
+| ----- | -------- | ----------- |
+| `SecEntryPoint` | Delta, ULEB128 | Secondary entry point offset |
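
The Secondary Entry Points table described above stores each offset as a ULEB128-encoded delta from the previous one, starting from an implicit zero. The following is a minimal, self-contained sketch of that round trip. It does not use LLVM's own LEB128 helpers; `writeULEB128`, `readULEB128`, `encodeSecEntryPoints` and `decodeSecEntryPoints` are hypothetical names chosen for illustration, and the sketch assumes the offsets arrive in ascending order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append Value to Out as ULEB128: 7 payload bits per byte, high bit set on
// every byte except the last. (Hypothetical helper, not the LLVM API.)
void writeULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
}

// Read one ULEB128 value from In at Pos and advance Pos past it.
uint64_t readULEB128(const std::vector<uint8_t> &In, size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (true) {
    assert(Pos < In.size() && "truncated ULEB128");
    assert(Shift < 64 && "ULEB128 value too large");
    uint8_t Byte = In[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if (!(Byte & 0x80))
      break;
    Shift += 7;
  }
  return Value;
}

// Delta-encode offsets, implicitly starting at zero.
// Assumption of this sketch: Offsets is sorted in ascending order.
std::vector<uint8_t>
encodeSecEntryPoints(const std::vector<uint32_t> &Offsets) {
  std::vector<uint8_t> Out;
  uint32_t Prev = 0;
  for (uint32_t Offset : Offsets) {
    assert(Offset >= Prev && "offsets must be ascending");
    writeULEB128(Offset - Prev, Out);
    Prev = Offset;
  }
  return Out;
}

// Decode Count delta-encoded offsets back into absolute offsets.
std::vector<uint32_t> decodeSecEntryPoints(const std::vector<uint8_t> &In,
                                           uint32_t Count) {
  std::vector<uint32_t> Offsets;
  size_t Pos = 0;
  uint32_t Offset = 0;
  for (uint32_t I = 0; I != Count; ++I) {
    Offset += static_cast<uint32_t>(readULEB128(In, Pos));
    Offsets.push_back(Offset);
  }
  return Offsets;
}
```

The count is passed separately to the decoder because the format keeps `NumSecEntryPoints` in the function header rather than next to the offsets.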

bolt/include/bolt/Profile/BoltAddressTranslation.h

Lines changed: 3 additions & 0 deletions
@@ -150,6 +150,9 @@ class BoltAddressTranslation {
   /// Map a function to its basic blocks count
   std::unordered_map<uint64_t, size_t> NumBasicBlocksMap;

+  /// Map a function to its secondary entry points vector
+  std::unordered_map<uint64_t, std::vector<uint32_t>> SecondaryEntryPointsMap;
+
   /// Links outlined cold bocks to their original function
   std::map<uint64_t, uint64_t> ColdPartSource;

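
The new member maps a function address to the vector of its secondary entry point offsets. As an illustration of how such a map can be queried, the sketch below uses `find` so that a lookup for a function without secondary entry points does not insert an empty vector, which `operator[]` would do. The alias `SecondaryEntryPointsMapTy` and the helper `numSecondaryEntryPoints` are hypothetical names and are not part of this commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Same shape as the member added in the header above.
using SecondaryEntryPointsMapTy =
    std::unordered_map<uint64_t, std::vector<uint32_t>>;

// Hypothetical helper: number of secondary entry points recorded for
// Address, or zero when the function has none. Never modifies the map.
size_t numSecondaryEntryPoints(const SecondaryEntryPointsMapTy &Map,
                               uint64_t Address) {
  auto It = Map.find(Address);
  return It == Map.end() ? 0 : It->second.size();
}
```

Taking the map by `const` reference makes the no-insertion property a compile-time guarantee, since `operator[]` is not available on a `const` `std::unordered_map`.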

bolt/lib/Profile/BoltAddressTranslation.cpp

Lines changed: 56 additions & 3 deletions
@@ -88,14 +88,21 @@ void BoltAddressTranslation::write(const BinaryContext &BC, raw_ostream &OS) {
     if (Function.isIgnored() || (!BC.HasRelocations && !Function.isSimple()))
       continue;

-    // TBD: handle BAT functions w/multiple entry points.
-    if (Function.isMultiEntry())
-      continue;
+    uint32_t NumSecondaryEntryPoints = 0;
+    Function.forEachEntryPoint([&](uint64_t Offset, const MCSymbol *) {
+      if (!Offset)
+        return true;
+      ++NumSecondaryEntryPoints;
+      SecondaryEntryPointsMap[OutputAddress].push_back(Offset);
+      return true;
+    });

     LLVM_DEBUG(dbgs() << "Function name: " << Function.getPrintName() << "\n");
     LLVM_DEBUG(dbgs() << " Address reference: 0x"
                       << Twine::utohexstr(Function.getOutputAddress()) << "\n");
     LLVM_DEBUG(dbgs() << formatv(" Hash: {0:x}\n", getBFHash(OutputAddress)));
+    LLVM_DEBUG(dbgs() << " Secondary Entry Points: " << NumSecondaryEntryPoints
+                      << '\n');

     MapTy Map;
     for (const BinaryBasicBlock *const BB :
@@ -185,6 +192,10 @@ void BoltAddressTranslation::writeMaps(std::map<uint64_t, MapTy> &Maps,
                       << Twine::utohexstr(Address) << ".\n");
     encodeULEB128(Address - PrevAddress, OS);
     PrevAddress = Address;
+    const uint32_t NumSecondaryEntryPoints =
+        SecondaryEntryPointsMap.count(Address)
+            ? SecondaryEntryPointsMap[Address].size()
+            : 0;
     if (Cold) {
       size_t HotIndex =
           std::distance(ColdPartSource.begin(), ColdPartSource.find(Address));
@@ -199,6 +210,10 @@ void BoltAddressTranslation::writeMaps(std::map<uint64_t, MapTy> &Maps,
       size_t NumBasicBlocks = getBBHashMap(HotInputAddress).getNumBasicBlocks();
       LLVM_DEBUG(dbgs() << "Basic blocks: " << NumBasicBlocks << '\n');
       encodeULEB128(NumBasicBlocks, OS);
+      // Secondary entry points
+      encodeULEB128(NumSecondaryEntryPoints, OS);
+      LLVM_DEBUG(dbgs() << "Secondary Entry Points: " << NumSecondaryEntryPoints
+                        << '\n');
     }
     encodeULEB128(NumEntries, OS);
     // For hot fragments only: encode the number of equal offsets
@@ -244,6 +259,17 @@ void BoltAddressTranslation::writeMaps(std::map<uint64_t, MapTy> &Maps,
                                      InOffset >> 1, BBHash, BBIndex));
       }
     }
+    uint32_t PrevOffset = 0;
+    if (!Cold && NumSecondaryEntryPoints) {
+      LLVM_DEBUG(dbgs() << "Secondary entry points: ");
+      // Secondary entry point offsets, delta-encoded
+      for (uint32_t Offset : SecondaryEntryPointsMap[Address]) {
+        encodeULEB128(Offset - PrevOffset, OS);
+        LLVM_DEBUG(dbgs() << formatv("{0:x} ", Offset));
+        PrevOffset = Offset;
+      }
+      LLVM_DEBUG(dbgs() << '\n');
+    }
   }
 }

@@ -287,6 +313,7 @@ void BoltAddressTranslation::parseMaps(std::vector<uint64_t> &HotFuncs,
     const uint64_t Address = PrevAddress + DE.getULEB128(&Offset, &Err);
     uint64_t HotAddress = Cold ? 0 : Address;
     PrevAddress = Address;
+    uint32_t SecondaryEntryPoints = 0;
     if (Cold) {
       HotIndex += DE.getULEB128(&Offset, &Err);
       HotAddress = HotFuncs[HotIndex];
@@ -303,6 +330,12 @@ void BoltAddressTranslation::parseMaps(std::vector<uint64_t> &HotFuncs,
       LLVM_DEBUG(dbgs() << formatv("{0:x}: #bbs {1}, {2} bytes\n", Address,
                                    NumBasicBlocks,
                                    getULEB128Size(NumBasicBlocks)));
+      // Secondary entry points
+      SecondaryEntryPoints = DE.getULEB128(&Offset, &Err);
+      LLVM_DEBUG(
+          dbgs() << formatv("{0:x}: secondary entry points {1}, {2} bytes\n",
+                            Address, SecondaryEntryPoints,
+                            getULEB128Size(SecondaryEntryPoints)));
     }
     const uint32_t NumEntries = DE.getULEB128(&Offset, &Err);
     // Equal offsets, hot fragments only.
@@ -370,6 +403,19 @@ void BoltAddressTranslation::parseMaps(std::vector<uint64_t> &HotFuncs,
       });
     }
     Maps.insert(std::pair<uint64_t, MapTy>(Address, Map));
+    if (!Cold && SecondaryEntryPoints) {
+      uint32_t EntryPointOffset = 0;
+      LLVM_DEBUG(dbgs() << "Secondary entry points: ");
+      for (uint32_t EntryPointId = 0; EntryPointId != SecondaryEntryPoints;
+           ++EntryPointId) {
+        uint32_t OffsetDelta = DE.getULEB128(&Offset, &Err);
+        EntryPointOffset += OffsetDelta;
+        SecondaryEntryPointsMap[Address].push_back(EntryPointOffset);
+        LLVM_DEBUG(dbgs() << formatv("{0:x}/{1}b ", EntryPointOffset,
+                                     getULEB128Size(OffsetDelta)));
+      }
+      LLVM_DEBUG(dbgs() << '\n');
+    }
   }
 }

@@ -397,6 +443,13 @@ void BoltAddressTranslation::dump(raw_ostream &OS) {
         OS << formatv(" hash: {0:x}", BBHashMap.getBBHash(Val));
       OS << "\n";
     }
+    if (SecondaryEntryPointsMap.count(Address)) {
+      const std::vector<uint32_t> &SecondaryEntryPoints =
+          SecondaryEntryPointsMap[Address];
+      OS << SecondaryEntryPoints.size() << " secondary entry points:\n";
+      for (uint32_t EntryPointOffset : SecondaryEntryPoints)
+        OS << formatv("{0:x}\n", EntryPointOffset);
+    }
     OS << "\n";
   }
   const size_t NumColdParts = ColdPartSource.size();

bolt/test/X86/bolt-address-translation-yaml.test

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ RUN:   | FileCheck --check-prefix CHECK-BOLT-YAML %s

 WRITE-BAT-CHECK: BOLT-INFO: Wrote 5 BAT maps
 WRITE-BAT-CHECK: BOLT-INFO: Wrote 4 function and 22 basic block hashes
-WRITE-BAT-CHECK: BOLT-INFO: BAT section size (bytes): 380
+WRITE-BAT-CHECK: BOLT-INFO: BAT section size (bytes): 384

 READ-BAT-CHECK-NOT: BOLT-ERROR: unable to save profile in YAML format for input file processed by BOLT
 READ-BAT-CHECK: BOLT-INFO: Parsed 5 BAT entries

bolt/test/X86/bolt-address-translation.test

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@
 # CHECK: BOLT: 3 out of 7 functions were overwritten.
 # CHECK: BOLT-INFO: Wrote 6 BAT maps
 # CHECK: BOLT-INFO: Wrote 3 function and 58 basic block hashes
-# CHECK: BOLT-INFO: BAT section size (bytes): 920
+# CHECK: BOLT-INFO: BAT section size (bytes): 924
 #
 # usqrt mappings (hot part). We match against any key (left side containing
 # the bolted binary offsets) because BOLT may change where it puts instructions
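
Both updated expectations grow by 4 bytes. One contributor visible in the diff is the new `NumSecEntryPoints` field, which is written for every hot function even when the count is zero. The sketch below shows why a zero count still costs one byte in ULEB128. `uleb128Size` is a hypothetical stand-in written for this example, not the LLVM helper used in the patch, and the commit does not itemize the 4 bytes, so this is not a full accounting of the size change.

```cpp
#include <cstdint>

// Number of bytes needed to ULEB128-encode Value: one byte per started
// group of 7 bits, with a minimum of one byte for zero.
unsigned uleb128Size(uint64_t Value) {
  unsigned Size = 0;
  do {
    Value >>= 7;
    ++Size;
  } while (Value != 0);
  return Size;
}
```

Under this encoding any count below 128 fits in a single byte, so functions with few or no secondary entry points pay one byte each for the new header field.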
