Skip to content

Commit 8d9db94

Browse files
Reland "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#95482)
Make `FunctionsToImportTy` an `unordered_map` rather than `DenseMap`. Credit goes to jvoung@ for the 'DenseMap -> unordered_map' change. This is a reland of #92718 * `DenseMap` allocates space for a large number of key/value pairs and wastes space when the number of elements are small. * While init bucket size is zero [1], it quickly allocates buckets for 64 elements [2] when the number of elements is small (for example, 3 or 4 elements). The programmer manual [3] also mentions it could waste space. * Experiments show `FunctionsToImportTy.size()` is smaller than 4 for multiple binaries with high indexing ram usage. `unordered_map` grows factor is at most 2 in llvm libc [4] for insert operations. With this change, `ComputeCrossModuleImport` ram increase is smaller than 0.5G on a couple of binaries with high indexing ram usage. A wider range of (pre-release) tests pass. [1] https://github.com/llvm/llvm-project/blob/ad79a14c9e5ec4a369eed4adf567c22cc029863f/llvm/include/llvm/ADT/DenseMap.h#L431-L432 [2] https://github.com/llvm/llvm-project/blob/ad79a14c9e5ec4a369eed4adf567c22cc029863f/llvm/include/llvm/ADT/DenseMap.h#L849 [3] https://llvm.org/docs/ProgrammersManual.html#llvm-adt-densemap-h [4] https://github.com/llvm/llvm-project/blob/ad79a14c9e5ec4a369eed4adf567c22cc029863f/libcxx/include/__hash_table#L1525-L1526 **Original commit message** The goal is to populate `declaration` import status if a new flag `-import-declaration` is on. * For in-process ThinLTO, the `declaration` status is visible to backend `function-import` pass, so `FunctionImporter::importFunctions` should read the import status and be no-op for declaration summaries. Basically, the postlink pipeline is updated to keep its current behavior (import definitions), but not updated to handle `declaration` summaries. Two use cases ([better call-graph sort](https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5) or [cross-module auto-init](#87597 (comment))) would use this bit differently. * For distributed ThinLTO, the `declaration` status is not serialized to bitcode. As discussed, #87600 will do this.
1 parent 8367030 commit 8d9db94

File tree

9 files changed

+449
-85
lines changed

9 files changed

+449
-85
lines changed

llvm/include/llvm/IR/ModuleSummaryIndex.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -587,6 +587,10 @@ class GlobalValueSummary {
587587

588588
void setImportKind(ImportKind IK) { Flags.ImportType = IK; }
589589

590+
GlobalValueSummary::ImportKind importType() const {
591+
return static_cast<ImportKind>(Flags.ImportType);
592+
}
593+
590594
GlobalValue::VisibilityTypes getVisibility() const {
591595
return (GlobalValue::VisibilityTypes)Flags.Visibility;
592596
}
@@ -1272,6 +1276,9 @@ using ModulePathStringTableTy = StringMap<ModuleHash>;
12721276
/// a particular module, and provide efficient access to their summary.
12731277
using GVSummaryMapTy = DenseMap<GlobalValue::GUID, GlobalValueSummary *>;
12741278

1279+
/// A set of global value summary pointers.
1280+
using GVSummaryPtrSet = SmallPtrSet<GlobalValueSummary *, 4>;
1281+
12751282
/// Map of a type GUID to type id string and summary (multimap used
12761283
/// in case of GUID conflicts).
12771284
using TypeIdSummaryMapTy =

llvm/include/llvm/Transforms/IPO/FunctionImport.h

Lines changed: 15 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@
2020
#include <memory>
2121
#include <string>
2222
#include <system_error>
23+
#include <unordered_map>
2324
#include <unordered_set>
2425
#include <utility>
2526

@@ -31,9 +32,13 @@ class Module;
3132
/// based on the provided summary informations.
3233
class FunctionImporter {
3334
public:
34-
/// Set of functions to import from a source module. Each entry is a set
35-
/// containing all the GUIDs of all functions to import for a source module.
36-
using FunctionsToImportTy = std::unordered_set<GlobalValue::GUID>;
35+
/// The functions to import from a source module and their import type.
36+
/// Note we choose unordered_map over (Small)DenseMap. The number of imports
37+
/// from a source module could be small but DenseMap size grows to 64 quickly
38+
/// and not memory efficient (see
39+
/// https://llvm.org/docs/ProgrammersManual.html#llvm-adt-densemap-h)
40+
using FunctionsToImportTy =
41+
std::unordered_map<GlobalValue::GUID, GlobalValueSummary::ImportKind>;
3742

3843
/// The different reasons selectCallee will chose not to import a
3944
/// candidate.
@@ -99,8 +104,13 @@ class FunctionImporter {
99104
/// index's module path string table).
100105
using ImportMapTy = DenseMap<StringRef, FunctionsToImportTy>;
101106

102-
/// The set contains an entry for every global value the module exports.
103-
using ExportSetTy = DenseSet<ValueInfo>;
107+
/// The map contains an entry for every global value the module exports.
108+
/// The key is ValueInfo, and the value indicates whether the definition
109+
/// or declaration is visible to another module. If a function's definition is
110+
/// visible to other modules, the global values this function referenced are
111+
/// visible and shouldn't be internalized.
112+
/// TODO: Rename to `ExportMapTy`.
113+
using ExportSetTy = DenseMap<ValueInfo, GlobalValueSummary::ImportKind>;
104114

105115
/// A function of this type is used to load modules referenced by the index.
106116
using ModuleLoaderTy =

llvm/lib/LTO/LTO.cpp

Lines changed: 20 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -123,6 +123,9 @@ void llvm::computeLTOCacheKey(
123123
support::endian::write64le(Data, I);
124124
Hasher.update(Data);
125125
};
126+
auto AddUint8 = [&](const uint8_t I) {
127+
Hasher.update(ArrayRef<uint8_t>((const uint8_t *)&I, 1));
128+
};
126129
AddString(Conf.CPU);
127130
// FIXME: Hash more of Options. For now all clients initialize Options from
128131
// command-line flags (which is unsupported in production), but may set
@@ -158,18 +161,18 @@ void llvm::computeLTOCacheKey(
158161
auto ModHash = Index.getModuleHash(ModuleID);
159162
Hasher.update(ArrayRef<uint8_t>((uint8_t *)&ModHash[0], sizeof(ModHash)));
160163

161-
std::vector<uint64_t> ExportsGUID;
164+
std::vector<std::pair<uint64_t, uint8_t>> ExportsGUID;
162165
ExportsGUID.reserve(ExportList.size());
163-
for (const auto &VI : ExportList) {
164-
auto GUID = VI.getGUID();
165-
ExportsGUID.push_back(GUID);
166-
}
166+
for (const auto &[VI, ExportType] : ExportList)
167+
ExportsGUID.push_back(
168+
std::make_pair(VI.getGUID(), static_cast<uint8_t>(ExportType)));
167169

168170
// Sort the export list elements GUIDs.
169171
llvm::sort(ExportsGUID);
170-
for (uint64_t GUID : ExportsGUID) {
172+
for (auto [GUID, ExportType] : ExportsGUID) {
171173
// The export list can impact the internalization, be conservative here
172174
Hasher.update(ArrayRef<uint8_t>((uint8_t *)&GUID, sizeof(GUID)));
175+
AddUint8(ExportType);
173176
}
174177

175178
// Include the hash for every module we import functions from. The set of
@@ -201,19 +204,21 @@ void llvm::computeLTOCacheKey(
201204
[](const ImportModule &Lhs, const ImportModule &Rhs) -> bool {
202205
return Lhs.getHash() < Rhs.getHash();
203206
});
204-
std::vector<uint64_t> ImportedGUIDs;
207+
std::vector<std::pair<uint64_t, uint8_t>> ImportedGUIDs;
205208
for (const ImportModule &Entry : ImportModulesVector) {
206209
auto ModHash = Entry.getHash();
207210
Hasher.update(ArrayRef<uint8_t>((uint8_t *)&ModHash[0], sizeof(ModHash)));
208211

209212
AddUint64(Entry.getFunctions().size());
210213

211214
ImportedGUIDs.clear();
212-
for (auto &Fn : Entry.getFunctions())
213-
ImportedGUIDs.push_back(Fn);
215+
for (auto &[Fn, ImportType] : Entry.getFunctions())
216+
ImportedGUIDs.push_back(std::make_pair(Fn, ImportType));
214217
llvm::sort(ImportedGUIDs);
215-
for (auto &GUID : ImportedGUIDs)
218+
for (auto &[GUID, Type] : ImportedGUIDs) {
216219
AddUint64(GUID);
220+
AddUint8(Type);
221+
}
217222
}
218223

219224
// Include the hash for the resolved ODR.
@@ -283,9 +288,9 @@ void llvm::computeLTOCacheKey(
283288
// Imported functions may introduce new uses of type identifier resolutions,
284289
// so we need to collect their used resolutions as well.
285290
for (const ImportModule &ImpM : ImportModulesVector)
286-
for (auto &ImpF : ImpM.getFunctions()) {
291+
for (auto &[GUID, UnusedImportType] : ImpM.getFunctions()) {
287292
GlobalValueSummary *S =
288-
Index.findSummaryInModule(ImpF, ImpM.getIdentifier());
293+
Index.findSummaryInModule(GUID, ImpM.getIdentifier());
289294
AddUsedThings(S);
290295
// If this is an alias, we also care about any types/etc. that the aliasee
291296
// may reference.
@@ -1397,6 +1402,7 @@ class lto::ThinBackendProc {
13971402
llvm::StringRef ModulePath,
13981403
const std::string &NewModulePath) {
13991404
std::map<std::string, GVSummaryMapTy> ModuleToSummariesForIndex;
1405+
14001406
std::error_code EC;
14011407
gatherImportedSummariesForModule(ModulePath, ModuleToDefinedGVSummaries,
14021408
ImportList, ModuleToSummariesForIndex);
@@ -1405,6 +1411,8 @@ class lto::ThinBackendProc {
14051411
sys::fs::OpenFlags::OF_None);
14061412
if (EC)
14071413
return errorCodeToError(EC);
1414+
1415+
// TODO: Serialize declaration bits to bitcode.
14081416
writeIndexToFile(CombinedIndex, OS, &ModuleToSummariesForIndex);
14091417

14101418
if (ShouldEmitImportsFiles) {

llvm/lib/LTO/LTOBackend.cpp

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -720,7 +720,14 @@ bool lto::initImportList(const Module &M,
720720
if (Summary->modulePath() == M.getModuleIdentifier())
721721
continue;
722722
// Add an entry to provoke importing by thinBackend.
723-
ImportList[Summary->modulePath()].insert(GUID);
723+
// Try emplace the entry first. If an entry with the same key already
724+
// exists, set the value to 'std::min(existing-value, new-value)' to make
725+
// sure a definition takes precedence over a declaration.
726+
auto [Iter, Inserted] = ImportList[Summary->modulePath()].try_emplace(
727+
GUID, Summary->importType());
728+
729+
if (!Inserted)
730+
Iter->second = std::min(Iter->second, Summary->importType());
724731
}
725732
}
726733
return true;

0 commit comments

Comments
 (0)