| 1 | +# BOLT Address Translation (BAT) |
| 2 | +# Purpose |
| 3 | +A regular profile collection for BOLT involves collecting samples from an |
| 4 | +unoptimized binary. BOLT Address Translation allows collecting a profile |
| 5 | +from a BOLT-optimized binary and using it to optimize the input (pre-BOLT) |
| 6 | +binary. |
| 7 | + |
| 8 | +# Overview |
| 9 | +BOLT Address Translation is an extra section (`.note.bolt_bat`) that BOLT |
| 10 | +inserts into the output binary. It contains translation tables and |
| 11 | +split-function linkage information, which enable mapping the profile from the |
| 12 | +optimized binary back onto the original binary. |
| 13 | + |
| 14 | +# Usage |
| 15 | +The `--enable-bat` flag controls generation of the BAT section. The sampled |
| 16 | +profile must be passed, along with the optimized binary containing the BAT |
| 17 | +section, to `perf2bolt`, which reads the BAT section and produces an fdata |
| 18 | +profile for the original binary. Note that YAML profile generation is not |
| 19 | +supported, since BAT doesn't contain the metadata for input functions. |
| 20 | + |
| 21 | +# Internals |
| 22 | +## Section contents |
| 23 | +The section is organized as follows: |
| 24 | +- Functions table |
| 25 | + - Address translation tables |
| 26 | +- Fragment linkage table |
| 27 | + |
| 28 | +## Construction and parsing |
| 29 | +The BAT section is created from the `BoltAddressTranslation` class, which |
| 30 | +captures address translation information provided by the BOLT linker. It is |
| 31 | +then encoded as a note section in the output binary. |
| 32 | + |
| 33 | +During profile conversion, when a BAT-enabled binary is passed to `perf2bolt`, |
| 34 | +the `BoltAddressTranslation` class is populated from the BAT section. |
| 35 | +`DataAggregator` then queries the class during sample processing to |
| 36 | +reconstruct addresses and offsets in the input binary. |
| 37 | + |
| 38 | +## Encoding format |
| 39 | +The encoding is specified in |
| 40 | +[BoltAddressTranslation.h](/bolt/include/bolt/Profile/BoltAddressTranslation.h) |
| 41 | +and [BoltAddressTranslation.cpp](/bolt/lib/Profile/BoltAddressTranslation.cpp). |
| 42 | + |
| 43 | +### Layout |
| 44 | +The general layout is as follows: |
| 45 | +``` |
| 46 | +Functions table header |
| 47 | +|------------------| |
| 48 | +| Function entry | |
| 49 | +| |--------------| | |
| 50 | +| | OutOff InOff | | |
| 51 | +| |--------------| | |
| 52 | +~~~~~~~~~~~~~~~~~~~~ |
| 53 | + |
| 54 | +Fragment linkage header |
| 55 | +|------------------| |
| 56 | +| ColdAddr HotAddr | |
| 57 | +~~~~~~~~~~~~~~~~~~~~ |
| 58 | +``` |
| 59 | + |
| 60 | +### Functions table |
| 61 | +Header: |
| 62 | +| Entry | Width | Description | |
| 63 | +| ------ | ----- | ----------- | |
| 64 | +| `NumFuncs` | 4B | Number of functions in the functions table | |
| 65 | + |
| 66 | +The header is followed by the functions table with `NumFuncs` entries. |
| 67 | +| Entry | Width | Description | |
| 68 | +| ------ | ------| ----------- | |
| 69 | +| `Address` | 8B | Function address in the output binary | |
| 70 | +| `NumEntries` | 4B | Number of address translation entries for a function | |
| 71 | + |
| 72 | +Each function header is followed by `NumEntries` pairs of offsets for that |
| 73 | +function. |
| 74 | + |
| 75 | +### Address translation table |
| 76 | +| Entry | Width | Description | |
| 77 | +| ------ | ------| ----------- | |
| 78 | +| `OutputAddr` | 4B | Function offset in output binary | |
| 79 | +| `InputAddr` | 4B | Function offset in input binary with `BRANCHENTRY` top bit | |
| 80 | + |
| 81 | +The `BRANCHENTRY` bit denotes whether a given offset pair is a control flow |
| 82 | +source (branch or call instruction). If the bit is not set, the pair is a |
| 83 | +control flow target (basic block offset). |
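
As a minimal sketch of how a reader might separate the flag from the offset: the mask value `0x80000000` is an assumption that follows from "top bit" of a 4-byte field, and `split_input_field` is a hypothetical helper, not part of BOLT.

```python
# Assumption: BRANCHENTRY is the most significant bit of the 4-byte InputAddr field.
BRANCHENTRY = 0x80000000


def split_input_field(raw):
    """Return (input_offset, is_branch_source) for a raw 32-bit InputAddr value."""
    is_branch_source = bool(raw & BRANCHENTRY)
    # Clear the flag bit to recover the plain offset in the input function.
    input_offset = raw & ~BRANCHENTRY & 0xFFFFFFFF
    return input_offset, is_branch_source


# A branch or call instruction (control flow source) at input offset 0x10.
source = split_input_field(0x80000010)
# A basic block start (control flow target) at input offset 0x20.
target = split_input_field(0x00000020)
```

Keeping the flag inside the offset field means each translation entry stays at 8 bytes, at the cost of one usable bit of input offset range.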
| 84 | + |
| 85 | +### Fragment linkage table |
| 86 | +Following the functions table, the fragment linkage table is encoded to link |
| 87 | +split cold fragments with the main (hot) fragment. |
| 88 | +Header: |
| 89 | +| Entry | Width | Description | |
| 90 | +| ------ | ------------ | ----------- | |
| 91 | +| `NumColdEntries` | 4B | Number of split (cold) fragment entries in the linkage table | |
| 92 | + |
| 93 | +`NumColdEntries` pairs of addresses follow: |
| 94 | +| Entry | Width | Description | |
| 95 | +| ------ | ------| ----------- | |
| 96 | +| `ColdAddress` | 8B | Cold fragment address in output binary | |
| 97 | +| `HotAddress` | 8B | Hot fragment address in output binary | |
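
The layout above can be walked with a short reader. This is a sketch under stated assumptions, not the actual `BoltAddressTranslation` parser: it assumes little-endian fields packed back to back with no padding, and that `data` starts at the functions table header (any ELF note header already stripped). The name `parse_bat` is hypothetical.

```python
import struct


def parse_bat(data):
    """Parse a BAT payload into (functions, cold_to_hot).

    Assumptions (not stated in the text): little-endian encoding, no padding
    between fields, and `data` begins at the functions table header.
    """
    pos = 0

    def read(fmt):
        # Unpack one record at the current position and advance past it.
        nonlocal pos
        values = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        return values

    # Functions table: NumFuncs (4B), then per function Address (8B),
    # NumEntries (4B), and NumEntries pairs of 4B offsets.
    functions = {}
    (num_funcs,) = read("<I")
    for _ in range(num_funcs):
        address, num_entries = read("<QI")
        functions[address] = [read("<II") for _ in range(num_entries)]

    # Fragment linkage table: NumColdEntries (4B), then
    # (ColdAddress, HotAddress) pairs of 8B each.
    (num_cold,) = read("<I")
    cold_to_hot = dict(read("<QQ") for _ in range(num_cold))
    return functions, cold_to_hot


# Hand-built payload: one function with two entries and one cold fragment.
sample = (
    struct.pack("<I", 1)
    + struct.pack("<QI", 0x401000, 2)
    + struct.pack("<II", 0x0, 0x0)
    + struct.pack("<II", 0x8, 0x80000010)
    + struct.pack("<I", 1)
    + struct.pack("<QQ", 0x600000, 0x401000)
)
functions, cold_to_hot = parse_bat(sample)
```

The raw input values are returned unmodified here, so a caller would still need to mask off the `BRANCHENTRY` bit before using them as offsets.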