Skip to content

Commit 91b0501

Browse files
committed
[clang][modules] Do not resolve HeaderFileInfo externally in ASTWriter (llvm#87848)
Clang uses the `HeaderFileInfo` struct to track bits of information on header files, which gets used throughout the compiler. We also use this to compute the set of affecting module maps in `ASTWriter` and in the end serialize the information into the `HEADER_SEARCH_TABLE` record of a PCM file, allowing clients to learn about headers from the module. In doing so, Clang asks for existing `HeaderFileInfo` for all known `FileEntries`. Note that this asks the loaded PCM files for the information they have on each header file in question. This seems unnecessary: we only want to serialize information on header files that either belong to the current module or that got included textually. Loaded PCM files can't provide us with any useful information. For explicit modules with lazy loading (using `-fmodule-map-file=<path>` with `-fmodule-file=<name>=<path>`) the compiler knows about header files listed in the module map files on the command-line. This can be a large number. Asking for existing `HeaderFileInfo` can trigger deserialization of `HEADER_SEARCH_TABLE` from loaded PCM files. Keys of the on-disk hash table consist of the header file size and modification time. However, with explicit modules Clang zeroes out the modification time. Moreover, if you import lots of modules, some of their header files end up having identical sizes. This means lots of hash collisions that can only be resolved by running the serialized filename through `FileManager` and comparing equality of the `FileEntry`. This ends up being super expensive, essentially re-stating lots of the transitively loaded SDK header files. This patch cleans up the API for getting `HeaderFileInfo` and makes sure `ASTWriter` uses the version that doesn't ask loaded PCM files for more information. This removes the excessive stat traffic coming from `ASTWriter` hopefully without changing observable behavior. (cherry picked from commit 84df7a0)
1 parent 3bcf7f7 commit 91b0501

File tree

4 files changed

+60
-55
lines changed

4 files changed

+60
-55
lines changed

clang-tools-extra/clangd/index/SymbolCollector.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -440,7 +440,7 @@ class SymbolCollector::HeaderFileURICache {
440440
// Framework headers are spelled as <FrameworkName/Foo.h>, not
441441
// "path/FrameworkName.framework/Headers/Foo.h".
442442
auto &HS = PP->getHeaderSearchInfo();
443-
if (const auto *HFI = HS.getExistingFileInfo(*FE, /*WantExternal*/ false))
443+
if (const auto *HFI = HS.getExistingLocalFileInfo(*FE))
444444
if (!HFI->Framework.empty())
445445
if (auto Spelling =
446446
getFrameworkHeaderIncludeSpelling(*FE, HFI->Framework, HS))

clang/include/clang/Lex/HeaderSearch.h

Lines changed: 13 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -540,14 +540,15 @@ class HeaderSearch {
540540
/// Return whether the specified file is a normal header,
541541
/// a system header, or a C++ friendly system header.
542542
SrcMgr::CharacteristicKind getFileDirFlavor(const FileEntry *File) {
543-
return (SrcMgr::CharacteristicKind)getFileInfo(File).DirInfo;
543+
if (const HeaderFileInfo *HFI = getExistingFileInfo(File))
544+
return (SrcMgr::CharacteristicKind)HFI->DirInfo;
545+
return (SrcMgr::CharacteristicKind)HeaderFileInfo().DirInfo;
544546
}
545547

546548
/// Mark the specified file as a "once only" file due to
547549
/// \#pragma once.
548550
void MarkFileIncludeOnce(const FileEntry *File) {
549-
HeaderFileInfo &FI = getFileInfo(File);
550-
FI.isPragmaOnce = true;
551+
getFileInfo(File).isPragmaOnce = true;
551552
}
552553

553554
/// Mark the specified file as a system header, e.g. due to
@@ -828,16 +829,17 @@ class HeaderSearch {
828829

829830
unsigned header_file_size() const { return FileInfo.size(); }
830831

831-
/// Return the HeaderFileInfo structure for the specified FileEntry,
832-
/// in preparation for updating it in some way.
832+
/// Return the HeaderFileInfo structure for the specified FileEntry, in
833+
/// preparation for updating it in some way.
833834
HeaderFileInfo &getFileInfo(const FileEntry *FE);
834835

835-
/// Return the HeaderFileInfo structure for the specified FileEntry,
836-
/// if it has ever been filled in.
837-
/// \param WantExternal Whether the caller wants purely-external header file
838-
/// info (where \p External is true).
839-
const HeaderFileInfo *getExistingFileInfo(const FileEntry *FE,
840-
bool WantExternal = true) const;
836+
/// Return the HeaderFileInfo structure for the specified FileEntry, if it has
837+
/// ever been filled in (either locally or externally).
838+
const HeaderFileInfo *getExistingFileInfo(const FileEntry *FE) const;
839+
840+
/// Return the headerFileInfo structure for the specified FileEntry, if it has
841+
/// ever been filled in locally.
842+
const HeaderFileInfo *getExistingLocalFileInfo(const FileEntry *FE) const;
841843

842844
SearchDirIterator search_dir_begin() { return {*this, 0}; }
843845
SearchDirIterator search_dir_end() { return {*this, SearchDirs.size()}; }

clang/lib/Lex/HeaderSearch.cpp

Lines changed: 42 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -949,9 +949,13 @@ OptionalFileEntryRef HeaderSearch::LookupFile(
949949
// If we have no includer, that means we're processing a #include
950950
// from a module build. We should treat this as a system header if we're
951951
// building a [system] module.
952-
bool IncluderIsSystemHeader =
953-
Includer ? getFileInfo(Includer).DirInfo != SrcMgr::C_User :
954-
BuildSystemModule;
952+
bool IncluderIsSystemHeader = [&]() {
953+
if (!Includer)
954+
return BuildSystemModule;
955+
const HeaderFileInfo *HFI = getExistingFileInfo(Includer);
956+
assert(HFI && "includer without file info");
957+
return HFI->DirInfo != SrcMgr::C_User;
958+
}();
955959
if (OptionalFileEntryRef FE = getFileAndSuggestModule(
956960
TmpDir, IncludeLoc, IncluderAndDir.second, IncluderIsSystemHeader,
957961
RequestingModule, SuggestedModule)) {
@@ -966,10 +970,11 @@ OptionalFileEntryRef HeaderSearch::LookupFile(
966970
// Note that we only use one of FromHFI/ToHFI at once, due to potential
967971
// reallocation of the underlying vector potentially making the first
968972
// reference binding dangling.
969-
HeaderFileInfo &FromHFI = getFileInfo(Includer);
970-
unsigned DirInfo = FromHFI.DirInfo;
971-
bool IndexHeaderMapHeader = FromHFI.IndexHeaderMapHeader;
972-
StringRef Framework = FromHFI.Framework;
973+
const HeaderFileInfo *FromHFI = getExistingFileInfo(Includer);
974+
assert(FromHFI && "includer without file info");
975+
unsigned DirInfo = FromHFI->DirInfo;
976+
bool IndexHeaderMapHeader = FromHFI->IndexHeaderMapHeader;
977+
StringRef Framework = FromHFI->Framework;
973978

974979
HeaderFileInfo &ToHFI = getFileInfo(&FE->getFileEntry());
975980
ToHFI.DirInfo = DirInfo;
@@ -1158,10 +1163,12 @@ OptionalFileEntryRef HeaderSearch::LookupFile(
11581163
// "Foo" is the name of the framework in which the including header was found.
11591164
if (!Includers.empty() && Includers.front().first && !isAngled &&
11601165
!Filename.contains('/')) {
1161-
HeaderFileInfo &IncludingHFI = getFileInfo(Includers.front().first);
1162-
if (IncludingHFI.IndexHeaderMapHeader) {
1166+
const HeaderFileInfo *IncludingHFI =
1167+
getExistingFileInfo(Includers.front().first);
1168+
assert(IncludingHFI && "includer without file info");
1169+
if (IncludingHFI->IndexHeaderMapHeader) {
11631170
SmallString<128> ScratchFilename;
1164-
ScratchFilename += IncludingHFI.Framework;
1171+
ScratchFilename += IncludingHFI->Framework;
11651172
ScratchFilename += '/';
11661173
ScratchFilename += Filename;
11671174

@@ -1294,11 +1301,11 @@ OptionalFileEntryRef HeaderSearch::LookupSubframeworkHeader(
12941301
}
12951302

12961303
// This file is a system header or C++ unfriendly if the old file is.
1297-
//
1298-
// Note that the temporary 'DirInfo' is required here, as either call to
1299-
// getFileInfo could resize the vector and we don't want to rely on order
1300-
// of evaluation.
1301-
unsigned DirInfo = getFileInfo(ContextFileEnt).DirInfo;
1304+
const HeaderFileInfo *ContextHFI = getExistingFileInfo(ContextFileEnt);
1305+
assert(ContextHFI && "context file without file info");
1306+
// Note that the temporary 'DirInfo' is required here, as the call to
1307+
// getFileInfo could resize the vector and might invalidate 'ContextHFI'.
1308+
unsigned DirInfo = ContextHFI->DirInfo;
13021309
getFileInfo(&File->getFileEntry()).DirInfo = DirInfo;
13031310

13041311
FrameworkName.pop_back(); // remove the trailing '/'
@@ -1356,8 +1363,6 @@ static void mergeHeaderFileInfo(HeaderFileInfo &HFI,
13561363
HFI.Framework = OtherHFI.Framework;
13571364
}
13581365

1359-
/// getFileInfo - Return the HeaderFileInfo structure for the specified
1360-
/// FileEntry.
13611366
HeaderFileInfo &HeaderSearch::getFileInfo(const FileEntry *FE) {
13621367
if (FE->getUID() >= FileInfo.size())
13631368
FileInfo.resize(FE->getUID() + 1);
@@ -1374,28 +1379,20 @@ HeaderFileInfo &HeaderSearch::getFileInfo(const FileEntry *FE) {
13741379
}
13751380

13761381
HFI->IsValid = true;
1377-
// We have local information about this header file, so it's no longer
1378-
// strictly external.
1382+
// We assume the caller has local information about this header file, so it's
1383+
// no longer strictly external.
13791384
HFI->External = false;
13801385
return *HFI;
13811386
}
13821387

1383-
const HeaderFileInfo *
1384-
HeaderSearch::getExistingFileInfo(const FileEntry *FE,
1385-
bool WantExternal) const {
1386-
// If we have an external source, ensure we have the latest information.
1387-
// FIXME: Use a generation count to check whether this is really up to date.
1388+
const HeaderFileInfo *HeaderSearch::getExistingFileInfo(const FileEntry *FE) const {
13881389
HeaderFileInfo *HFI;
13891390
if (ExternalSource) {
1390-
if (FE->getUID() >= FileInfo.size()) {
1391-
if (!WantExternal)
1392-
return nullptr;
1391+
if (FE->getUID() >= FileInfo.size())
13931392
FileInfo.resize(FE->getUID() + 1);
1394-
}
13951393

13961394
HFI = &FileInfo[FE->getUID()];
1397-
if (!WantExternal && (!HFI->IsValid || HFI->External))
1398-
return nullptr;
1395+
// FIXME: Use a generation count to check whether this is really up to date.
13991396
if (!HFI->Resolved) {
14001397
auto ExternalHFI = ExternalSource->GetHeaderFileInfo(FE);
14011398
if (ExternalHFI.IsValid) {
@@ -1404,16 +1401,25 @@ HeaderSearch::getExistingFileInfo(const FileEntry *FE,
14041401
mergeHeaderFileInfo(*HFI, ExternalHFI);
14051402
}
14061403
}
1407-
} else if (FE->getUID() >= FileInfo.size()) {
1408-
return nullptr;
1409-
} else {
1404+
} else if (FE->getUID() < FileInfo.size()) {
14101405
HFI = &FileInfo[FE->getUID()];
1406+
} else {
1407+
HFI = nullptr;
14111408
}
14121409

1413-
if (!HFI->IsValid || (HFI->External && !WantExternal))
1414-
return nullptr;
1410+
return (HFI && HFI->IsValid) ? HFI : nullptr;
1411+
}
1412+
1413+
const HeaderFileInfo *
1414+
HeaderSearch::getExistingLocalFileInfo(const FileEntry *FE) const {
1415+
HeaderFileInfo *HFI;
1416+
if (FE->getUID() < FileInfo.size()) {
1417+
HFI = &FileInfo[FE->getUID()];
1418+
} else {
1419+
HFI = nullptr;
1420+
}
14151421

1416-
return HFI;
1422+
return (HFI && HFI->IsValid && !HFI->External) ? HFI : nullptr;
14171423
}
14181424

14191425
bool HeaderSearch::isFileMultipleIncludeGuarded(const FileEntry *File) const {

clang/lib/Serialization/ASTWriter.cpp

Lines changed: 4 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -183,8 +183,7 @@ GetAffectingModuleMaps(const Preprocessor &PP, Module *RootModule) {
183183
if (!File)
184184
continue;
185185

186-
const HeaderFileInfo *HFI =
187-
HS.getExistingFileInfo(File, /*WantExternal*/ false);
186+
const HeaderFileInfo *HFI = HS.getExistingLocalFileInfo(File);
188187
if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader))
189188
continue;
190189

@@ -2062,14 +2061,12 @@ void ASTWriter::WriteHeaderSearch(const HeaderSearch &HS) {
20622061
if (!File)
20632062
continue;
20642063

2065-
// Get the file info. This will load info from the external source if
2066-
// necessary. Skip emitting this file if we have no information on it
2067-
// as a header file (in which case HFI will be null) or if it hasn't
2064+
// Get the file info. Skip emitting this file if we have no information on
2065+
// it as a header file (in which case HFI will be null) or if it hasn't
20682066
// changed since it was loaded. Also skip it if it's for a modular header
20692067
// from a different module; in that case, we rely on the module(s)
20702068
// containing the header to provide this information.
2071-
const HeaderFileInfo *HFI =
2072-
HS.getExistingFileInfo(File, /*WantExternal*/!Chain);
2069+
const HeaderFileInfo *HFI = HS.getExistingLocalFileInfo(File);
20732070
if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader))
20742071
continue;
20752072

0 commit comments

Comments
 (0)