Skip to content

[clang][modules] Avoid calling expensive SourceManager::translateFile() #86216

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 28, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 8 additions & 7 deletions clang/include/clang/Lex/ModuleMap.h
Original file line number Diff line number Diff line change
Expand Up @@ -263,8 +263,8 @@ class ModuleMap {
Attributes Attrs;

/// If \c InferModules is non-zero, the module map file that allowed
/// inferred modules. Otherwise, nullopt.
OptionalFileEntryRef ModuleMapFile;
/// inferred modules. Otherwise, invalid.
FileID ModuleMapFID;

/// The names of modules that cannot be inferred within this
/// directory.
Expand All @@ -279,8 +279,7 @@ class ModuleMap {

/// A mapping from an inferred module to the module map that allowed the
/// inference.
// FIXME: Consider making the values non-optional.
llvm::DenseMap<const Module *, OptionalFileEntryRef> InferredModuleAllowedBy;
llvm::DenseMap<const Module *, FileID> InferredModuleAllowedBy;

llvm::DenseMap<const Module *, AdditionalModMapsSet> AdditionalModMaps;

Expand Down Expand Up @@ -618,8 +617,9 @@ class ModuleMap {
///
/// \param Module The module whose module map file will be returned, if known.
///
/// \returns The file entry for the module map file containing the given
/// module, or nullptr if the module definition was inferred.
/// \returns The FileID for the module map file containing the given module,
/// invalid if the module definition was inferred.
FileID getContainingModuleMapFileID(const Module *Module) const;
OptionalFileEntryRef getContainingModuleMapFile(const Module *Module) const;

/// Get the module map file that (along with the module name) uniquely
Expand All @@ -631,9 +631,10 @@ class ModuleMap {
/// of inferred modules, returns the module map that allowed the inference
/// (e.g. contained 'module *'). Otherwise, returns
/// getContainingModuleMapFile().
FileID getModuleMapFileIDForUniquing(const Module *M) const;
OptionalFileEntryRef getModuleMapFileForUniquing(const Module *M) const;

void setInferredModuleAllowedBy(Module *M, OptionalFileEntryRef ModMap);
void setInferredModuleAllowedBy(Module *M, FileID ModMapFID);

/// Canonicalize \p Path in a manner suitable for a module map file. In
/// particular, this canonicalizes the parent directory separately from the
Expand Down
19 changes: 17 additions & 2 deletions clang/lib/Frontend/CompilerInstance.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1337,9 +1337,24 @@ static bool compileModule(CompilerInstance &ImportingInstance,
// Get or create the module map that we'll use to build this module.
ModuleMap &ModMap
= ImportingInstance.getPreprocessor().getHeaderSearchInfo().getModuleMap();
SourceManager &SourceMgr = ImportingInstance.getSourceManager();
bool Result;
if (OptionalFileEntryRef ModuleMapFile =
ModMap.getContainingModuleMapFile(Module)) {
if (FileID ModuleMapFID = ModMap.getContainingModuleMapFileID(Module);
ModuleMapFID.isValid()) {
// We want to use the top-level module map. If we don't, the compiling
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the connection between this change and the rest of the commit?

Copy link
Contributor Author

@jansvoboda11 jansvoboda11 Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider that SourceManager::translateFile() always prefers local SLocEntries. Now let's say that a module is defined in a module map that gets included via the extern directive from a top-level module map Clang find via header search. Before loading PCM for this module, the following code will resolve to the local SLocEntry (there are no loaded entries yet).

SourceManager::translateFile(
  SourceManager::getFileEntryRefForID( // ModuleMap::getContainingModuleMapFile(Module)
    SourceManager::getFileID(          //
      Module::DefinitionLoc)))         //

After compiling and loading the PCM for such module, Module::DefinitionLoc gets updated and now points into the loaded SLocEntry. Therefore, ModuleMap::getContainingModuleMapFile(Module) returns the loaded SLocEntry. Then, SourceManager::translateFile() maps that to the local SLocEntry, so we end up with a result identical to the previous one.

This patch gets rid of the unnecessary conversions and does this instead.

SourceManager::getFileID(Module::DefinitionLoc)

This means that after loading the PCM for the module, this points into the table of loaded SLocEntries. The importer knows that the containing module map is not a top level module map. When it attempts to walk up the include chain to get the the top-level module map for this module (using SourceManager::getIncludeLoc()) it gets an invalid SourceLocation. This is because the PCM was not compiled from the top-level module map, its SourceManager never knew about the file, and therefore can't walk up from the loaded SLocEntry.

This means that the true top-level module map never makes it into the set of affecting module maps we compute in ASTWriter. It's in turn left out from the INPUT_FILE block that only has the containing module map that's marked as not-top-level file. This ultimately breaks the dependency scanner. When generating -fmodule-map-file= arguments for the explicit compilation, it only uses files that made it into the INPUT_FILE block (are affecting) and are top-level (to avoid exploding the command line with transitive module maps). This then breaks the explicit compile, which doesn't have any (neither the top-level nor the containing module map) for the module in question.

So this piece of code is here to ensure implicit builds are always invoked with the top-level module map, not the containing one. This fixes the walk over the include chain of loaded SLocEntries and fixes clang/test/ClangScanDeps/modules-extern-unrelated.m.

// instance may think the containing module map is a top-level one, while
// the importing instance knows it's included from a parent module map via
// the extern directive. This mismatch could bite us later.
SourceLocation Loc = SourceMgr.getIncludeLoc(ModuleMapFID);
while (Loc.isValid() && isModuleMap(SourceMgr.getFileCharacteristic(Loc))) {
ModuleMapFID = SourceMgr.getFileID(Loc);
Loc = SourceMgr.getIncludeLoc(ModuleMapFID);
}

OptionalFileEntryRef ModuleMapFile =
SourceMgr.getFileEntryRefForID(ModuleMapFID);
assert(ModuleMapFile && "Top-level module map with no FileID");

// Canonicalize compilation to start with the public module map. This is
// vital for submodules declarations in the private module maps to be
// correctly parsed when depending on a top level module in the public one.
Expand Down
10 changes: 8 additions & 2 deletions clang/lib/Frontend/FrontendAction.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -535,8 +535,14 @@ static Module *prepareToBuildModule(CompilerInstance &CI,
if (*OriginalModuleMap != CI.getSourceManager().getFileEntryRefForID(
CI.getSourceManager().getMainFileID())) {
M->IsInferred = true;
CI.getPreprocessor().getHeaderSearchInfo().getModuleMap()
.setInferredModuleAllowedBy(M, *OriginalModuleMap);
auto FileCharacter =
M->IsSystem ? SrcMgr::C_System_ModuleMap : SrcMgr::C_User_ModuleMap;
FileID OriginalModuleMapFID = CI.getSourceManager().getOrCreateFileID(
*OriginalModuleMap, FileCharacter);
CI.getPreprocessor()
.getHeaderSearchInfo()
.getModuleMap()
.setInferredModuleAllowedBy(M, OriginalModuleMapFID);
}
}

Expand Down
66 changes: 36 additions & 30 deletions clang/lib/Lex/ModuleMap.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -648,8 +648,7 @@ ModuleMap::findOrCreateModuleForHeaderInUmbrellaDir(FileEntryRef File) {
UmbrellaModule = UmbrellaModule->Parent;

if (UmbrellaModule->InferSubmodules) {
OptionalFileEntryRef UmbrellaModuleMap =
getModuleMapFileForUniquing(UmbrellaModule);
FileID UmbrellaModuleMap = getModuleMapFileIDForUniquing(UmbrellaModule);

// Infer submodules for each of the directories we found between
// the directory of the umbrella header and the directory where
Expand Down Expand Up @@ -1021,7 +1020,7 @@ Module *ModuleMap::inferFrameworkModule(DirectoryEntryRef FrameworkDir,

// If the framework has a parent path from which we're allowed to infer
// a framework module, do so.
OptionalFileEntryRef ModuleMapFile;
FileID ModuleMapFID;
if (!Parent) {
// Determine whether we're allowed to infer a module map.
bool canInfer = false;
Expand Down Expand Up @@ -1060,7 +1059,7 @@ Module *ModuleMap::inferFrameworkModule(DirectoryEntryRef FrameworkDir,
Attrs.IsExhaustive |= inferred->second.Attrs.IsExhaustive;
Attrs.NoUndeclaredIncludes |=
inferred->second.Attrs.NoUndeclaredIncludes;
ModuleMapFile = inferred->second.ModuleMapFile;
ModuleMapFID = inferred->second.ModuleMapFID;
}
}
}
Expand All @@ -1069,7 +1068,7 @@ Module *ModuleMap::inferFrameworkModule(DirectoryEntryRef FrameworkDir,
if (!canInfer)
return nullptr;
} else {
ModuleMapFile = getModuleMapFileForUniquing(Parent);
ModuleMapFID = getModuleMapFileIDForUniquing(Parent);
}

// Look for an umbrella header.
Expand All @@ -1086,7 +1085,7 @@ Module *ModuleMap::inferFrameworkModule(DirectoryEntryRef FrameworkDir,
Module *Result = new Module(ModuleName, SourceLocation(), Parent,
/*IsFramework=*/true, /*IsExplicit=*/false,
NumCreatedModules++);
InferredModuleAllowedBy[Result] = ModuleMapFile;
InferredModuleAllowedBy[Result] = ModuleMapFID;
Result->IsInferred = true;
if (!Parent) {
if (LangOpts.CurrentModule == ModuleName)
Expand Down Expand Up @@ -1307,28 +1306,34 @@ void ModuleMap::addHeader(Module *Mod, Module::Header Header,
Cb->moduleMapAddHeader(Header.Entry.getName());
}

OptionalFileEntryRef
ModuleMap::getContainingModuleMapFile(const Module *Module) const {
FileID ModuleMap::getContainingModuleMapFileID(const Module *Module) const {
if (Module->DefinitionLoc.isInvalid())
return std::nullopt;
return {};

return SourceMgr.getFileEntryRefForID(
SourceMgr.getFileID(Module->DefinitionLoc));
return SourceMgr.getFileID(Module->DefinitionLoc);
}

OptionalFileEntryRef
ModuleMap::getModuleMapFileForUniquing(const Module *M) const {
ModuleMap::getContainingModuleMapFile(const Module *Module) const {
return SourceMgr.getFileEntryRefForID(getContainingModuleMapFileID(Module));
}

FileID ModuleMap::getModuleMapFileIDForUniquing(const Module *M) const {
if (M->IsInferred) {
assert(InferredModuleAllowedBy.count(M) && "missing inferred module map");
return InferredModuleAllowedBy.find(M)->second;
}
return getContainingModuleMapFile(M);
return getContainingModuleMapFileID(M);
}

OptionalFileEntryRef
ModuleMap::getModuleMapFileForUniquing(const Module *M) const {
return SourceMgr.getFileEntryRefForID(getModuleMapFileIDForUniquing(M));
}

void ModuleMap::setInferredModuleAllowedBy(Module *M,
OptionalFileEntryRef ModMap) {
void ModuleMap::setInferredModuleAllowedBy(Module *M, FileID ModMapFID) {
assert(M->IsInferred && "module not inferred");
InferredModuleAllowedBy[M] = ModMap;
InferredModuleAllowedBy[M] = ModMapFID;
}

std::error_code
Expand Down Expand Up @@ -1517,7 +1522,7 @@ namespace clang {
ModuleMap &Map;

/// The current module map file.
FileEntryRef ModuleMapFile;
FileID ModuleMapFID;

/// Source location of most recent parsed module declaration
SourceLocation CurrModuleDeclLoc;
Expand Down Expand Up @@ -1585,13 +1590,12 @@ namespace clang {
bool parseOptionalAttributes(Attributes &Attrs);

public:
explicit ModuleMapParser(Lexer &L, SourceManager &SourceMgr,
const TargetInfo *Target, DiagnosticsEngine &Diags,
ModuleMap &Map, FileEntryRef ModuleMapFile,
DirectoryEntryRef Directory, bool IsSystem)
ModuleMapParser(Lexer &L, SourceManager &SourceMgr,
const TargetInfo *Target, DiagnosticsEngine &Diags,
ModuleMap &Map, FileID ModuleMapFID,
DirectoryEntryRef Directory, bool IsSystem)
: L(L), SourceMgr(SourceMgr), Target(Target), Diags(Diags), Map(Map),
ModuleMapFile(ModuleMapFile), Directory(Directory),
IsSystem(IsSystem) {
ModuleMapFID(ModuleMapFID), Directory(Directory), IsSystem(IsSystem) {
Tok.clear();
consumeToken();
}
Expand Down Expand Up @@ -2011,11 +2015,13 @@ void ModuleMapParser::parseModuleDecl() {
}

if (TopLevelModule &&
ModuleMapFile != Map.getContainingModuleMapFile(TopLevelModule)) {
assert(ModuleMapFile != Map.getModuleMapFileForUniquing(TopLevelModule) &&
ModuleMapFID != Map.getContainingModuleMapFileID(TopLevelModule)) {
assert(ModuleMapFID !=
Map.getModuleMapFileIDForUniquing(TopLevelModule) &&
"submodule defined in same file as 'module *' that allowed its "
"top-level module");
Map.addAdditionalModuleMapFile(TopLevelModule, ModuleMapFile);
Map.addAdditionalModuleMapFile(
TopLevelModule, *SourceMgr.getFileEntryRefForID(ModuleMapFID));
}
}

Expand Down Expand Up @@ -2120,7 +2126,8 @@ void ModuleMapParser::parseModuleDecl() {
ActiveModule->NoUndeclaredIncludes = true;
ActiveModule->Directory = Directory;

StringRef MapFileName(ModuleMapFile.getName());
StringRef MapFileName(
SourceMgr.getFileEntryRefForID(ModuleMapFID)->getName());
if (MapFileName.ends_with("module.private.modulemap") ||
MapFileName.ends_with("module_private.map")) {
ActiveModule->ModuleMapIsPrivate = true;
Expand Down Expand Up @@ -2906,7 +2913,7 @@ void ModuleMapParser::parseInferredModuleDecl(bool Framework, bool Explicit) {
// We'll be inferring framework modules for this directory.
Map.InferredDirectories[Directory].InferModules = true;
Map.InferredDirectories[Directory].Attrs = Attrs;
Map.InferredDirectories[Directory].ModuleMapFile = ModuleMapFile;
Map.InferredDirectories[Directory].ModuleMapFID = ModuleMapFID;
// FIXME: Handle the 'framework' keyword.
}

Expand Down Expand Up @@ -3139,8 +3146,7 @@ bool ModuleMap::parseModuleMapFile(FileEntryRef File, bool IsSystem,
Buffer->getBufferStart() + (Offset ? *Offset : 0),
Buffer->getBufferEnd());
SourceLocation Start = L.getSourceLocation();
ModuleMapParser Parser(L, SourceMgr, Target, Diags, *this, File, Dir,
IsSystem);
ModuleMapParser Parser(L, SourceMgr, Target, Diags, *this, ID, Dir, IsSystem);
bool Result = Parser.parseModuleMapFile();
ParsedModuleMap[File] = Result;

Expand Down
17 changes: 8 additions & 9 deletions clang/lib/Serialization/ASTWriter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -193,17 +193,17 @@ std::set<const FileEntry *> GetAffectingModuleMaps(const Preprocessor &PP,
const ModuleMap &MM = HS.getModuleMap();
SourceManager &SourceMgr = PP.getSourceManager();

std::set<const FileEntry *> ModuleMaps{};
auto CollectIncludingModuleMaps = [&](FileEntryRef F) {
std::set<const FileEntry *> ModuleMaps;
auto CollectIncludingModuleMaps = [&](FileID FID, FileEntryRef F) {
if (!ModuleMaps.insert(F).second)
return;
FileID FID = SourceMgr.translateFile(F);
SourceLocation Loc = SourceMgr.getIncludeLoc(FID);
// The include location of inferred module maps can point into the header
// file that triggered the inferring. Cut off the walk if that's the case.
while (Loc.isValid() && isModuleMap(SourceMgr.getFileCharacteristic(Loc))) {
FID = SourceMgr.getFileID(Loc);
if (!ModuleMaps.insert(*SourceMgr.getFileEntryRefForID(FID)).second)
F = *SourceMgr.getFileEntryRefForID(FID);
if (!ModuleMaps.insert(F).second)
break;
Loc = SourceMgr.getIncludeLoc(FID);
}
Expand All @@ -216,13 +216,13 @@ std::set<const FileEntry *> GetAffectingModuleMaps(const Preprocessor &PP,
break;
// The containing module map is affecting, because it's being pointed
// into by Module::DefinitionLoc.
if (auto ModuleMapFile = MM.getContainingModuleMapFile(Mod))
CollectIncludingModuleMaps(*ModuleMapFile);
if (FileID FID = MM.getContainingModuleMapFileID(Mod); FID.isValid())
CollectIncludingModuleMaps(FID, *SourceMgr.getFileEntryRefForID(FID));
// For inferred modules, the module map that allowed inferring is not in
// the include chain of the virtual containing module map file. It did
// affect the compilation, though.
if (auto ModuleMapFile = MM.getModuleMapFileForUniquing(Mod))
CollectIncludingModuleMaps(*ModuleMapFile);
if (FileID FID = MM.getModuleMapFileIDForUniquing(Mod); FID.isValid())
CollectIncludingModuleMaps(FID, *SourceMgr.getFileEntryRefForID(FID));
}
};

Expand Down Expand Up @@ -4728,7 +4728,6 @@ void ASTWriter::computeNonAffectingInputFiles() {
continue;

if (!isModuleMap(File.getFileCharacteristic()) ||
AffectingModuleMaps.empty() ||
llvm::is_contained(AffectingModuleMaps, *Cache->OrigEntry))
continue;

Expand Down
1 change: 1 addition & 0 deletions clang/test/ClangScanDeps/modules-extern-unrelated.m
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,7 @@
// CHECK-NEXT: "context-hash": "{{.*}}",
// CHECK-NEXT: "file-deps": [
// CHECK-NEXT: "[[PREFIX]]/first/module.modulemap",
// CHECK-NEXT: "[[PREFIX]]/second/module.modulemap",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a side-effect of the change in CompilerInstance.cpp. The implicit scan now uses the top-level module map to compile modules instead of the containing module map and that makes it included in file dependencies. This change is a bit unfortunate, since the explicit command line we generate actually only uses the containing module map, not the top-level one. Ideally, we'd either find a way to remove this file dependency or we'd make the explicit compile to also consume the top-level module map (in which case this file dependency would be real).

// CHECK-NEXT: "[[PREFIX]]/second/second.h",
// CHECK-NEXT: "[[PREFIX]]/second/second.modulemap"
// CHECK-NEXT: ],
Expand Down