Commit ad4cead

[BOLT][DWARF][NFC] Initialize CloneUnitCtxMap with current partition size (#75876)
Previously we always allocated the maximum size for the vector containing DWARFUnitInfo. In real use cases this means we allocate a giant vector when processing a single CU or, in the ThinLTO case, multiple CUs. This led to a lot of memory overhead and a 2x BOLT processing slowdown for at least one service built with monolithic DWARF. For binaries built with LTO with clang, all CUs that have cross references share an abbrev table and are processed in one batch. The rest of the CUs are processed in batches of --cu-processing-batch-size, which defaults to 1. For theoretical cases where cross-CU references are present but the CUs do not share an abbrev table, CloneUnitCtxMap grows as each CU is processed.
1 parent 5537483 commit ad4cead

1 file changed

+9
-7
lines changed

bolt/lib/Core/DIEBuilder.cpp

Lines changed: 9 additions & 7 deletions
@@ -266,13 +266,11 @@ void DIEBuilder::buildCompileUnits(const bool Init) {
 }
 void DIEBuilder::buildCompileUnits(const std::vector<DWARFUnit *> &CUs) {
   BuilderState.reset(new State());
-  // Initializing to full size because there could be cross CU references with
-  // different abbrev offsets. LLVM happens to output CUs that have cross CU
-  // references with the same abbrev table. So destinations end up in the first
-  // set, even if they themselves don't have src cross cu ref. We could have
-  // cases where this is not the case. In which case this container needs to be
-  // big enough for all.
-  getState().CloneUnitCtxMap.resize(DwarfContext->getNumCompileUnits());
+  // Allocating enough for current batch being processed.
+  // In real use cases we either processing a batch of CUs with no cross
+  // references, or if they do have them it is due to LTO. With clang they will
+  // share the same abbrev table. In either case this vector will not grow.
+  getState().CloneUnitCtxMap.resize(CUs.size());
   getState().Type = ProcessingType::CUs;
   for (DWARFUnit *CU : CUs)
     registerUnit(*CU, false);
@@ -897,6 +895,10 @@ void DIEBuilder::registerUnit(DWARFUnit &DU, bool NeedSort) {
                 });
   }
   getState().UnitIDMap[getHash(DU)] = getState().DUList.size();
+  // This handles the case where we do have cross cu references, but CUs do not
+  // share the same abbrev table.
+  if (getState().DUList.size() == getState().CloneUnitCtxMap.size())
+    getState().CloneUnitCtxMap.emplace_back();
   getState().DUList.push_back(&DU);
 }
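
The sizing strategy in this change can be illustrated outside of BOLT. The following is a minimal, standalone sketch, not the actual DIEBuilder code: `Unit`, `UnitCtx`, `State`, and `beginBatch` are hypothetical stand-ins, and only the names `CloneUnitCtxMap`, `DUList`, and `registerUnit` are borrowed from the diff. It shows the context vector being sized for the current batch and growing by one slot only when a unit outside that batch is registered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for BOLT's types; not the real definitions.
struct Unit {
  int Id;
};
struct UnitCtx {
  std::vector<int> DieInfo;
};

struct State {
  std::vector<UnitCtx> CloneUnitCtxMap; // one context slot per registered unit
  std::vector<const Unit *> DUList;     // registered units, in registration order
};

// Size the context vector for the current batch only, instead of for every
// compile unit in the binary.
void beginBatch(State &S, std::size_t BatchSize) {
  S.CloneUnitCtxMap.resize(BatchSize);
}

// Register a unit and return its index. A new slot is appended only when the
// pre-sized slots are already used up, which corresponds to a unit pulled in
// from outside the batch (for example through a cross-CU reference).
std::size_t registerUnit(State &S, const Unit &U) {
  // Invariant: there is never more than one missing slot at this point.
  assert(S.DUList.size() <= S.CloneUnitCtxMap.size());
  if (S.DUList.size() == S.CloneUnitCtxMap.size())
    S.CloneUnitCtxMap.emplace_back();
  S.DUList.push_back(&U);
  return S.DUList.size() - 1;
}
```

With a batch of two units, registering a third unit makes the vector grow from two slots to three; in the common case described in the commit message, the vector never grows past the batch size.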
