
Commit a7cf0a1

[BOLT] Add BOLT Address Translation documentation (#76899)
Test Plan: Open the page in browser
1 parent fb09447 commit a7cf0a1

1 file changed

+97
-0
lines changed

bolt/docs/BAT.md

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
1+
# BOLT Address Translation (BAT)
2+
# Purpose
3+
A regular profile collection for BOLT involves collecting samples from
4+
an unoptimized binary. BOLT Address Translation allows collecting a profile
5+
from a BOLT-optimized binary and using it to optimize the input (pre-BOLT)
6+
binary.
7+
8+
# Overview
9+
BOLT Address Translation is an extra section (`.note.bolt_bat`) inserted by BOLT
10+
into the output binary, containing translation tables and split-function linkage
11+
information. This information enables mapping the profile back from the optimized
12+
binary onto the original binary.
13+
14+
# Usage
15+
The `--enable-bat` flag controls generation of the BAT section. A sampled profile
16+
needs to be passed along with the optimized binary containing the BAT section to
17+
`perf2bolt`, which reads the BAT section and produces an fdata profile for the original
18+
binary. Note that YAML profile generation is not supported since BAT doesn't
19+
contain the metadata for input functions.
20+
21+
# Internals
22+
## Section contents
23+
The section is organized as follows:
24+
- Functions table
25+
- Address translation tables
26+
- Fragment linkage table
27+
28+
## Construction and parsing
29+
The BAT section is created from the `BoltAddressTranslation` class, which captures
30+
address translation information provided by the BOLT linker. It is then encoded as a
31+
note section in the output binary.
32+
33+
During profile conversion, when a BAT-enabled binary is passed to `perf2bolt`,
34+
the `BoltAddressTranslation` class is populated from the BAT section. The class is then
35+
queried by `DataAggregator` during sample processing to reconstruct addresses/
36+
offsets in the input binary.
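
The lookup that reconstructs an input offset from a sampled output offset can be sketched as follows. This is a minimal illustration, not the actual `BoltAddressTranslation` or `DataAggregator` logic: the function name `translate`, the sorted-list representation, and the rule that offsets advance together within a range are all assumptions made for the example.

```python
from bisect import bisect_right

def translate(entries, out_offset):
    """Map a function offset in the output binary to an input-binary offset.

    entries: (output_offset, input_offset) pairs sorted by output_offset.
    Returns None when out_offset precedes the first entry.
    """
    keys = [out for out, _ in entries]
    # Find the last entry whose output offset is <= the sampled offset.
    i = bisect_right(keys, out_offset) - 1
    if i < 0:
        return None
    out_base, in_base = entries[i]
    # Assumption: within one range, input and output offsets advance together.
    return in_base + (out_offset - out_base)

# Hypothetical table for one function: blocks were reordered by the optimizer.
table = [(0x0, 0x10), (0x8, 0x40), (0x20, 0x18)]
print(hex(translate(table, 0xC)))  # 0x44
```

A sorted list with binary search keeps each query logarithmic in the number of entries, which matters when many samples are translated.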
37+
38+
## Encoding format
39+
The encoding is specified in
40+
[BoltAddressTranslation.h](/bolt/include/bolt/Profile/BoltAddressTranslation.h)
41+
and [BoltAddressTranslation.cpp](/bolt/lib/Profile/BoltAddressTranslation.cpp).
42+
43+
### Layout
44+
The general layout is as follows:
45+
```
46+
Functions table header
47+
|------------------|
48+
| Function entry |
49+
| |--------------| |
50+
| | OutOff InOff | |
51+
| |--------------| |
52+
~~~~~~~~~~~~~~~~~~~~
53+
54+
Fragment linkage header
55+
|------------------|
56+
| ColdAddr HotAddr |
57+
~~~~~~~~~~~~~~~~~~~~
58+
```
59+
60+
### Functions table
61+
Header:
62+
| Entry | Width | Description |
63+
| ------ | ----- | ----------- |
64+
| `NumFuncs` | 4B | Number of functions in the functions table |
65+
66+
The header is followed by the Functions table with `NumFuncs` entries, each starting with a function header:
67+
| Entry | Width | Description |
68+
| ------ | ------| ----------- |
69+
| `Address` | 8B | Function address in the output binary |
70+
| `NumEntries` | 4B | Number of address translation entries for a function |
71+
72+
Each function header is followed by `NumEntries` pairs of offsets for the current
73+
function.
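
A reader for this part of the section might look like the sketch below. It follows the field widths given in the tables (4-byte `NumFuncs`, 8-byte `Address`, 4-byte `NumEntries`, then pairs of 4-byte values). Little-endian byte order and the helper name `parse_functions_table` are assumptions for illustration; the authoritative encoding is in the `BoltAddressTranslation` sources.

```python
import struct

def parse_functions_table(data, offset=0):
    """Parse the Functions table; return ({address: [(out, in), ...]}, end_offset)."""
    # Header: NumFuncs (4B). Little-endian is an assumption of this sketch.
    (num_funcs,) = struct.unpack_from("<I", data, offset)
    offset += 4
    functions = {}
    for _ in range(num_funcs):
        # Function header: Address (8B) followed by NumEntries (4B).
        address, num_entries = struct.unpack_from("<QI", data, offset)
        offset += 12
        pairs = []
        for _ in range(num_entries):
            # One translation entry: two 4B values, kept raw here.
            out_off, in_off = struct.unpack_from("<II", data, offset)
            offset += 8
            pairs.append((out_off, in_off))
        functions[address] = pairs
    return functions, offset

# Hypothetical blob: one function at 0x401000 with two offset pairs.
blob = (struct.pack("<I", 1)
        + struct.pack("<QI", 0x401000, 2)
        + struct.pack("<II", 0x0, 0x0)
        + struct.pack("<II", 0x10, 0x24))
functions, end = parse_functions_table(blob)
print(functions, end)
```

Returning the end offset lets a caller continue reading whatever follows the Functions table in the same buffer.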
74+
75+
### Address translation table
76+
| Entry | Width | Description |
77+
| ------ | ------| ----------- |
78+
| `OutputAddr` | 4B | Offset from the function start in the output binary |
79+
| `InputAddr` | 4B | Offset from the function start in the input binary, with the `BRANCHENTRY` flag in the top bit |
80+
81+
The `BRANCHENTRY` bit denotes whether a given offset pair is a control flow source
82+
(branch or call instruction). If not set, it signifies a control flow target
83+
(basic block offset).
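
Separating the flag from the offset can be sketched as below. The mask value `0x80000000` is inferred from the description of a top bit in a 4-byte field and is an assumption of this example, as is the helper name `decode_input_offset`; check the constant in `BoltAddressTranslation.h` before relying on it.

```python
# Assumed mask: the top bit of a 4-byte value, as the table describes.
BRANCHENTRY = 0x80000000

def decode_input_offset(raw):
    """Split a raw 4-byte input value into (offset, is_branch_source)."""
    is_branch_source = bool(raw & BRANCHENTRY)
    # Clear the flag bit and keep the result within 32 bits.
    offset = raw & ~BRANCHENTRY & 0xFFFFFFFF
    return offset, is_branch_source

print(decode_input_offset(0x80000024))  # (36, True): control flow source
print(decode_input_offset(0x00000024))  # (36, False): control flow target
```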
84+
85+
### Fragment linkage table
86+
Following the Functions table, the fragment linkage table is encoded to link split
87+
cold fragments with their main (hot) fragment.
88+
Header:
89+
| Entry | Width | Description |
90+
| ------ | ------------ | ----------- |
91+
| `NumColdEntries` | 4B | Number of split functions in the functions table |
92+
93+
`NumColdEntries` pairs of addresses follow:
94+
| Entry | Width | Description |
95+
| ------ | ------| ----------- |
96+
| `ColdAddress` | 8B | Cold fragment address in output binary |
97+
| `HotAddress` | 8B | Hot fragment address in output binary |
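
Reading the linkage table into a cold-to-hot map could look like the following sketch. It uses the widths from the tables above (4-byte `NumColdEntries`, then pairs of 8-byte addresses). Little-endian byte order and the name `parse_fragment_linkage` are assumptions for illustration only.

```python
import struct

def parse_fragment_linkage(data, offset=0):
    """Parse the fragment linkage table; return ({cold_addr: hot_addr}, end_offset)."""
    # Header: NumColdEntries (4B). Little-endian is an assumption of this sketch.
    (num_cold,) = struct.unpack_from("<I", data, offset)
    offset += 4
    cold_to_hot = {}
    for _ in range(num_cold):
        # Each entry: ColdAddress (8B) followed by HotAddress (8B).
        cold, hot = struct.unpack_from("<QQ", data, offset)
        offset += 16
        cold_to_hot[cold] = hot
    return cold_to_hot, offset

# Hypothetical blob: one cold fragment at 0x500000 linked to its hot fragment.
linkage_blob = struct.pack("<I", 1) + struct.pack("<QQ", 0x500000, 0x401000)
cold_to_hot, linkage_end = parse_fragment_linkage(linkage_blob)
print(cold_to_hot, linkage_end)
```

With such a map, a sample that lands in a cold fragment can be attributed to the function that owns the corresponding hot fragment.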
