[BOLT] Delta-encode offsets in BAT #76900

Merged: 12 commits, Jan 11, 2024
6 changes: 4 additions & 2 deletions bolt/docs/BAT.md
@@ -73,10 +73,12 @@ Function header is followed by `NumEntries` pairs of offsets for current
 function.
 
 ### Address translation table
+Delta encoding means that only the difference with the previous corresponding
+entry is encoded. Offsets implicitly start at zero.
 | Entry | Encoding | Description |
 | ------ | ------| ----------- |
-| `OutputAddr` | ULEB128 | Function offset in output binary |
-| `InputAddr` | ULEB128 | Function offset in input binary with `BRANCHENTRY` LSB bit |
+| `OutputOffset` | Delta, ULEB128 | Function offset in output binary |
+| `InputOffset` | Delta, SLEB128 | Function offset in input binary with `BRANCHENTRY` LSB bit |
 
 `BRANCHENTRY` bit denotes whether a given offset pair is a control flow source
 (branch or call instruction). If not set, it signifies a control flow target
21 changes: 14 additions & 7 deletions bolt/lib/Profile/BoltAddressTranslation.cpp
@@ -114,9 +114,12 @@ void BoltAddressTranslation::write(const BinaryContext &BC, raw_ostream &OS) {
<< Twine::utohexstr(Address) << ".\n");
     encodeULEB128(Address, OS);
     encodeULEB128(NumEntries, OS);
+    uint64_t InOffset = 0, OutOffset = 0;
+    // Output and Input addresses and delta-encoded
     for (std::pair<const uint32_t, uint32_t> &KeyVal : Map) {
-      encodeULEB128(KeyVal.first, OS);
-      encodeULEB128(KeyVal.second, OS);
+      encodeULEB128(KeyVal.first - OutOffset, OS);
+      encodeSLEB128(KeyVal.second - InOffset, OS);
+      std::tie(OutOffset, InOffset) = KeyVal;
     }
   }
   const uint32_t NumColdEntries = ColdPartSource.size();
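
A note on the expression `KeyVal.second - InOffset` in the hunk above: `KeyVal.second` is a `uint32_t` and `InOffset` is a `uint64_t`, so the subtraction is unsigned, yet the result is passed to a signed LEB128 encoder. The following is a minimal sketch of why that still yields the intended negative delta when the input offset decreases. `signedDelta` is a hypothetical helper written for illustration; it is not part of BOLT.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the arithmetic of `KeyVal.second - InOffset`.
// It is an illustration only, not code from the patch.
int64_t signedDelta(uint32_t Current, uint64_t Previous) {
  // Current is converted to uint64_t, so the subtraction is done in
  // unsigned 64-bit arithmetic and wraps modulo 2^64 if Current < Previous.
  uint64_t Wrapped = Current - Previous;
  // Converting the wrapped value to int64_t recovers the signed difference
  // (guaranteed since C++20; two's complement behaviour in practice before).
  return static_cast<int64_t>(Wrapped);
}
```

For example, `signedDelta(0x1f0, 0x200)` evaluates to `-16`, which a signed LEB128 encoder can store in a single byte.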
@@ -164,12 +167,16 @@ std::error_code BoltAddressTranslation::parse(StringRef Buf) {

     LLVM_DEBUG(dbgs() << "Parsing " << NumEntries << " entries for 0x"
                       << Twine::utohexstr(Address) << "\n");
+    uint64_t InputOffset = 0, OutputOffset = 0;
     for (uint32_t J = 0; J < NumEntries; ++J) {
-      const uint32_t OutputAddr = DE.getULEB128(&Offset, &Err);
-      const uint32_t InputAddr = DE.getULEB128(&Offset, &Err);
-      Map.insert(std::pair<uint32_t, uint32_t>(OutputAddr, InputAddr));
-      LLVM_DEBUG(dbgs() << Twine::utohexstr(OutputAddr) << " -> "
-                        << Twine::utohexstr(InputAddr) << "\n");
+      const uint64_t OutputDelta = DE.getULEB128(&Offset, &Err);
+      const int64_t InputDelta = DE.getSLEB128(&Offset, &Err);
+      OutputOffset += OutputDelta;
+      InputOffset += InputDelta;
+      Map.insert(std::pair<uint32_t, uint32_t>(OutputOffset, InputOffset));
+      LLVM_DEBUG(dbgs() << Twine::utohexstr(OutputOffset) << " -> "
+                        << Twine::utohexstr(InputOffset) << " (" << OutputDelta
+                        << ", " << InputDelta << ")\n");
     }
     Maps.insert(std::pair<uint64_t, MapTy>(Address, Map));
   }
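
The write and parse changes above can be exercised together outside of LLVM. The sketch below is a standalone round trip under stated assumptions: `putULEB128`, `putSLEB128`, `getULEB128`, `getSLEB128`, `encodeDelta` and `decodeDelta` are hypothetical stand-ins that operate on a byte vector rather than `raw_ostream` and `DataExtractor`, the pairs are assumed to be sorted by output offset (so output deltas are never negative), and the `BRANCHENTRY` bit is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using OffsetPair = std::pair<uint32_t, uint32_t>; // (output, input)

// Hypothetical stand-in for an unsigned LEB128 writer.
void putULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit: more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
}

// Hypothetical stand-in for a signed LEB128 writer.
void putSLEB128(int64_t Value, std::vector<uint8_t> &Out) {
  bool More = true;
  while (More) {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7; // arithmetic shift keeps the sign
    bool SignBit = (Byte & 0x40) != 0;
    if ((Value == 0 && !SignBit) || (Value == -1 && SignBit))
      More = false;
    else
      Byte |= 0x80;
    Out.push_back(Byte);
  }
}

uint64_t getULEB128(const std::vector<uint8_t> &In, size_t &Pos) {
  uint64_t Result = 0;
  unsigned Shift = 0;
  uint8_t Byte;
  do {
    assert(Pos < In.size() && Shift < 64 && "malformed ULEB128");
    Byte = In[Pos++];
    Result |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  return Result;
}

int64_t getSLEB128(const std::vector<uint8_t> &In, size_t &Pos) {
  uint64_t Result = 0;
  unsigned Shift = 0;
  uint8_t Byte;
  do {
    assert(Pos < In.size() && Shift < 64 && "malformed SLEB128");
    Byte = In[Pos++];
    Result |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  if (Shift < 64 && (Byte & 0x40))
    Result |= ~uint64_t(0) << Shift; // sign-extend the last byte
  return static_cast<int64_t>(Result);
}

// Writes each pair as (unsigned output delta, signed input delta).
// Assumes Pairs is sorted by output offset; offsets start at zero.
std::vector<uint8_t> encodeDelta(const std::vector<OffsetPair> &Pairs) {
  std::vector<uint8_t> Out;
  uint32_t PrevOut = 0, PrevIn = 0;
  for (const OffsetPair &P : Pairs) {
    putULEB128(P.first - PrevOut, Out);
    putSLEB128(int64_t(P.second) - int64_t(PrevIn), Out);
    PrevOut = P.first;
    PrevIn = P.second;
  }
  return Out;
}

// Reads NumEntries delta pairs and accumulates them back into offsets.
std::vector<OffsetPair> decodeDelta(const std::vector<uint8_t> &In,
                                    size_t NumEntries) {
  std::vector<OffsetPair> Pairs;
  size_t Pos = 0;
  uint64_t OutOff = 0;
  int64_t InOff = 0;
  for (size_t I = 0; I < NumEntries; ++I) {
    OutOff += getULEB128(In, Pos);
    InOff += getSLEB128(In, Pos);
    Pairs.emplace_back(uint32_t(OutOff), uint32_t(InOff));
  }
  return Pairs;
}
```

With the made-up pairs `(0x0, 0x0)`, `(0x80, 0x200)`, `(0x90, 0x1f0)`, `encodeDelta` produces 8 bytes, whereas writing each absolute offset as ULEB128 would take 10. The third pair shows where the saving comes from: its deltas `+0x10` and `-0x10` fit in one byte each, while the absolute values `0x90` and `0x1f0` need two bytes each. Passing the encoded bytes and the entry count to `decodeDelta` returns the original pairs.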
2 changes: 1 addition & 1 deletion bolt/test/X86/bolt-address-translation.test
@@ -37,7 +37,7 @@
 # CHECK: BOLT: 3 out of 7 functions were overwritten.
 # CHECK: BOLT-INFO: Wrote 6 BAT maps
 # CHECK: BOLT-INFO: Wrote 3 BAT cold-to-hot entries
-# CHECK: BOLT-INFO: BAT section size (bytes): 680
+# CHECK: BOLT-INFO: BAT section size (bytes): 436
 #
 # usqrt mappings (hot part). We match against any key (left side containing
 # the bolted binary offsets) because BOLT may change where it puts instructions