Skip to content

[clang][modules] Allow module maps with textual headers to be non-affecting #89441

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 24, 2024

Conversation

jansvoboda11
Copy link
Contributor

@jansvoboda11 jansvoboda11 commented Apr 19, 2024

When writing out a PCM, we skip serializing headers' HeaderFileInfo struct whenever this condition evaluates to true:

!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader)

However, when Clang parses a module map file, each textual header gets a HFI with isModuleHeader=false, isTextualModuleHeader=true and isCompilingModuleHeader=false. This means the condition evaluates to false even if the header was never included and the module map did not affect the compilation. Each PCM file that happened to parse such module map then contains a copy of the HeaderFileInfo struct for all textual headers, and considers the containing module map affecting.

This patch makes it so that we skip headers that have not been included, essentially removing the virality of textual headers when it comes to PCM serialization.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules labels Apr 19, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 19, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-modules

Author: Jan Svoboda (jansvoboda11)

Changes

Depends on #89428.


Full diff: https://github.com/llvm/llvm-project/pull/89441.diff

8 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+5)
  • (modified) clang/include/clang/Lex/HeaderSearchOptions.h (+6-1)
  • (modified) clang/lib/Serialization/ASTWriter.cpp (+9-7)
  • (modified) clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp (+5)
  • (modified) clang/test/ClangScanDeps/modules-full.cpp (+3)
  • (removed) clang/test/Modules/add-remove-irrelevant-module-map.m (-32)
  • (added) clang/test/Modules/prune-non-affecting-module-map-files-textual.c (+26)
  • (added) clang/test/Modules/prune-non-affecting-module-map-files.m (+62)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index d83031b05cebc4..a3d2863bc7507e 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3100,6 +3100,11 @@ defm modules_skip_header_search_paths : BoolFOption<"modules-skip-header-search-
     HeaderSearchOpts<"ModulesSkipHeaderSearchPaths">, DefaultFalse,
     PosFlag<SetTrue, [], [], "Disable writing header search paths">,
     NegFlag<SetFalse>, BothFlags<[], [CC1Option]>>;
+def fno_modules_prune_non_affecting_module_map_files :
+    Flag<["-"], "fno-modules-prune-non-affecting-module-map-files">,
+    Group<f_Group>, Flags<[]>, Visibility<[CC1Option]>,
+    MarshallingInfoNegativeFlag<HeaderSearchOpts<"ModulesPruneNonAffectingModuleMaps">>,
+    HelpText<"Do not prune non-affecting module map files when writing module files">;
 
 def fincremental_extensions :
   Flag<["-"], "fincremental-extensions">,
diff --git a/clang/include/clang/Lex/HeaderSearchOptions.h b/clang/include/clang/Lex/HeaderSearchOptions.h
index 637dc77e5d957e..e4437ac0e35263 100644
--- a/clang/include/clang/Lex/HeaderSearchOptions.h
+++ b/clang/include/clang/Lex/HeaderSearchOptions.h
@@ -252,6 +252,10 @@ class HeaderSearchOptions {
   LLVM_PREFERRED_TYPE(bool)
   unsigned ModulesSkipPragmaDiagnosticMappings : 1;
 
+  /// Whether to prune non-affecting module map files from PCM files.
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned ModulesPruneNonAffectingModuleMaps : 1;
+
   LLVM_PREFERRED_TYPE(bool)
   unsigned ModulesHashContent : 1;
 
@@ -280,7 +284,8 @@ class HeaderSearchOptions {
         ModulesValidateDiagnosticOptions(true),
         ModulesSkipDiagnosticOptions(false),
         ModulesSkipHeaderSearchPaths(false),
-        ModulesSkipPragmaDiagnosticMappings(false), ModulesHashContent(false),
+        ModulesSkipPragmaDiagnosticMappings(false),
+        ModulesPruneNonAffectingModuleMaps(true), ModulesHashContent(false),
         ModulesStrictContextHash(false), ModulesIncludeVFSUsage(false) {}
 
   /// AddPath - Add the \p Path path to the specified \p Group list.
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 6dd87b5d200db6..4825c245a4b846 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -166,9 +166,9 @@ namespace {
 
 std::optional<std::set<const FileEntry *>>
 GetAffectingModuleMaps(const Preprocessor &PP, Module *RootModule) {
-  // Without implicit module map search, there's no good reason to know about
-  // any module maps that are not affecting.
-  if (!PP.getHeaderSearchInfo().getHeaderSearchOpts().ImplicitModuleMaps)
+  if (!PP.getHeaderSearchInfo()
+           .getHeaderSearchOpts()
+           .ModulesPruneNonAffectingModuleMaps)
     return std::nullopt;
 
   SmallVector<const Module *> ModulesToProcess{RootModule};
@@ -187,7 +187,8 @@ GetAffectingModuleMaps(const Preprocessor &PP, Module *RootModule) {
       continue;
 
     const HeaderFileInfo *HFI = HS.getExistingLocalFileInfo(*File);
-    if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader))
+    if (!HFI || ((HFI->isModuleHeader || HFI->isTextualModuleHeader) &&
+                 !HFI->isCompilingModuleHeader))
       continue;
 
     for (const auto &KH : HS.findResolvedModulesForHeader(*File)) {
@@ -2055,11 +2056,12 @@ void ASTWriter::WriteHeaderSearch(const HeaderSearch &HS) {
 
     // Get the file info. Skip emitting this file if we have no information on
     // it as a header file (in which case HFI will be null) or if it hasn't
-    // changed since it was loaded. Also skip it if it's for a modular header
-    // from a different module; in that case, we rely on the module(s)
+    // changed since it was loaded. Also skip it if it's for a non-excluded
+    // header from a different module; in that case, we rely on the module(s)
     // containing the header to provide this information.
     const HeaderFileInfo *HFI = HS.getExistingLocalFileInfo(*File);
-    if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader))
+    if (!HFI || ((HFI->isModuleHeader || HFI->isTextualModuleHeader) &&
+                 !HFI->isCompilingModuleHeader))
       continue;
 
     // Massage the file path into an appropriate form.
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index e19f19b2528c15..f46324ee9989eb 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -179,6 +179,11 @@ makeCommonInvocationForModuleBuild(CompilerInvocation CI) {
   CI.resetNonModularOptions();
   CI.clearImplicitModuleBuildOptions();
 
+  // The scanner takes care to avoid passing non-affecting module maps to the
+  // explicit compiles. No need to do extra work just to find out there are no
+  // module map files to prune.
+  CI.getHeaderSearchOpts().ModulesPruneNonAffectingModuleMaps = false;
+
   // Remove options incompatible with explicit module build or are likely to
   // differ between identical modules discovered from different translation
   // units.
diff --git a/clang/test/ClangScanDeps/modules-full.cpp b/clang/test/ClangScanDeps/modules-full.cpp
index 59efef0ecbaa64..a00a431eb56911 100644
--- a/clang/test/ClangScanDeps/modules-full.cpp
+++ b/clang/test/ClangScanDeps/modules-full.cpp
@@ -33,6 +33,7 @@
 // CHECK-NEXT:       "command-line": [
 // CHECK-NEXT:         "-cc1"
 // CHECK:              "-emit-module"
+// CHECK:              "-fno-modules-prune-non-affecting-module-map-files"
 // CHECK:              "-fmodule-file={{.*}}[[PREFIX]]/module-cache{{(_clangcl)?}}/[[HASH_H2_DINCLUDE]]/header2-{{[A-Z0-9]+}}.pcm"
 // CHECK-NOT:          "-fimplicit-module-maps"
 // CHECK:              "-fmodule-name=header1"
@@ -51,6 +52,7 @@
 // CHECK-NEXT:       "command-line": [
 // CHECK-NEXT:         "-cc1",
 // CHECK:              "-emit-module",
+// CHECK:              "-fno-modules-prune-non-affecting-module-map-files"
 // CHECK-NOT:          "-fimplicit-module-maps",
 // CHECK:              "-fmodule-name=header1",
 // CHECK:              "-fno-implicit-modules",
@@ -68,6 +70,7 @@
 // CHECK-NEXT:       "command-line": [
 // CHECK-NEXT:         "-cc1",
 // CHECK:              "-emit-module",
+// CHECK:              "-fno-modules-prune-non-affecting-module-map-files"
 // CHECK:              "-fmodule-name=header2",
 // CHECK-NOT:          "-fimplicit-module-maps",
 // CHECK:              "-fno-implicit-modules",
diff --git a/clang/test/Modules/add-remove-irrelevant-module-map.m b/clang/test/Modules/add-remove-irrelevant-module-map.m
deleted file mode 100644
index 7e3e58037e6f21..00000000000000
--- a/clang/test/Modules/add-remove-irrelevant-module-map.m
+++ /dev/null
@@ -1,32 +0,0 @@
-// RUN: rm -rf %t && mkdir %t
-// RUN: split-file %s %t
-
-//--- a/module.modulemap
-module a {}
-
-//--- b/module.modulemap
-module b {}
-
-//--- c/module.modulemap
-module c {}
-
-//--- module.modulemap
-module m { header "m.h" }
-//--- m.h
-@import c;
-
-//--- test-simple.m
-// expected-no-diagnostics
-@import m;
-
-// Build modules with the non-affecting "a/module.modulemap".
-// RUN: %clang_cc1 -I %t/a -I %t/b -I %t/c -I %t -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache -fdisable-module-hash %t/test-simple.m -verify
-// RUN: mv %t/cache %t/cache-with
-
-// Build modules without the non-affecting "a/module.modulemap".
-// RUN: rm -rf %t/a/module.modulemap
-// RUN: %clang_cc1 -I %t/a -I %t/b -I %t/c -I %t -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/cache -fdisable-module-hash %t/test-simple.m -verify
-// RUN: mv %t/cache %t/cache-without
-
-// Check that the PCM files are bit-for-bit identical.
-// RUN: diff %t/cache-with/m.pcm %t/cache-without/m.pcm
diff --git a/clang/test/Modules/prune-non-affecting-module-map-files-textual.c b/clang/test/Modules/prune-non-affecting-module-map-files-textual.c
new file mode 100644
index 00000000000000..8f8f00560b1834
--- /dev/null
+++ b/clang/test/Modules/prune-non-affecting-module-map-files-textual.c
@@ -0,0 +1,26 @@
+// This test checks that a module map with a textual header can be marked as
+// non-affecting.
+
+// RUN: rm -rf %t && mkdir %t
+// RUN: split-file %s %t
+
+//--- A.modulemap
+module A { textual header "A.h" }
+//--- B.modulemap
+module B { header "B.h" export * }
+//--- A.h
+typedef int A_int;
+//--- B.h
+#include "A.h"
+typedef A_int B_int;
+
+// RUN: %clang_cc1 -fmodules -emit-module %t/A.modulemap -fmodule-name=A -o %t/A.pcm \
+// RUN:   -fmodule-map-file=%t/A.modulemap -fmodule-map-file=%t/B.modulemap
+
+// RUN: %clang_cc1 -fmodules -emit-module %t/B.modulemap -fmodule-name=B -o %t/B0.pcm \
+// RUN:   -fmodule-map-file=%t/A.modulemap -fmodule-map-file=%t/B.modulemap -fmodule-file=%t/A.pcm
+
+// RUN: %clang_cc1 -fmodules -emit-module %t/B.modulemap -fmodule-name=B -o %t/B1.pcm \
+// RUN:                                    -fmodule-map-file=%t/B.modulemap -fmodule-file=%t/A.pcm
+
+// RUN: diff %t/B0.pcm %t/B1.pcm
diff --git a/clang/test/Modules/prune-non-affecting-module-map-files.m b/clang/test/Modules/prune-non-affecting-module-map-files.m
new file mode 100644
index 00000000000000..ba2b3a306eaf46
--- /dev/null
+++ b/clang/test/Modules/prune-non-affecting-module-map-files.m
@@ -0,0 +1,62 @@
+// Check that the presence of non-affecting module map files does not affect the
+// contents of PCM files.
+
+// RUN: rm -rf %t && mkdir %t
+// RUN: split-file %s %t
+
+//--- a/module.modulemap
+module a {}
+
+//--- b/module.modulemap
+module b {}
+
+//--- c/module.modulemap
+module c { header "c.h" }
+//--- c/c.h
+@import b;
+
+//--- tu.m
+@import c;
+
+//--- explicit-mms-common-args.rsp
+-fmodule-map-file=b/module.modulemap -fmodule-map-file=c/module.modulemap -fmodules -fmodules-cache-path=cache -fdisable-module-hash -fsyntax-only tu.m
+//--- implicit-search-args.rsp
+-I a -I b -I c -fimplicit-module-maps -fmodules -fmodules-cache-path=cache -fdisable-module-hash -fsyntax-only tu.m
+//--- implicit-search-args.rsp-end
+
+// Test with explicit module map files.
+//
+// RUN: %clang_cc1 -working-directory %t @%t/explicit-mms-common-args.rsp
+// RUN: mv %t/cache %t/cache-explicit-no-a-prune
+// RUN: %clang_cc1 -working-directory %t @%t/explicit-mms-common-args.rsp -fno-modules-prune-non-affecting-module-map-files
+// RUN: mv %t/cache %t/cache-explicit-no-a-keep
+//
+// RUN: %clang_cc1 -working-directory %t -fmodule-map-file=a/module.modulemap @%t/explicit-mms-common-args.rsp
+// RUN: mv %t/cache %t/cache-explicit-a-prune
+// RUN: %clang_cc1 -working-directory %t -fmodule-map-file=a/module.modulemap @%t/explicit-mms-common-args.rsp -fno-modules-prune-non-affecting-module-map-files
+// RUN: mv %t/cache %t/cache-explicit-a-keep
+//
+// RUN: diff %t/cache-explicit-no-a-prune/c.pcm %t/cache-explicit-a-prune/c.pcm
+// RUN: not diff %t/cache-explicit-no-a-keep/c.pcm %t/cache-explicit-a-keep/c.pcm
+
+// Test with implicit module map search.
+//
+// RUN: %clang_cc1 -working-directory %t @%t/implicit-search-args.rsp
+// RUN: mv %t/cache %t/cache-implicit-no-a-prune
+// RUN: %clang_cc1 -working-directory %t @%t/implicit-search-args.rsp -fno-modules-prune-non-affecting-module-map-files
+// RUN: mv %t/cache %t/cache-implicit-no-a-keep
+//
+// FIXME: Instead of removing "a/module.modulemap" from the file system, we
+//        could drop the "-I a" search path argument in combination with the
+//        "-fmodules-skip-header-search-paths" flag. Unfortunately, that flag
+//        does not prevent serialization of the search path usage bit vector,
+//        making the files differ anyways.
+// RUN: rm %t/a/module.modulemap
+//
+// RUN: %clang_cc1 -working-directory %t @%t/implicit-search-args.rsp
+// RUN: mv %t/cache %t/cache-implicit-a-prune
+// RUN: %clang_cc1 -working-directory %t @%t/implicit-search-args.rsp -fno-modules-prune-non-affecting-module-map-files
+// RUN: mv %t/cache %t/cache-implicit-a-keep
+//
+// RUN: diff %t/cache-implicit-no-a-prune/c.pcm %t/cache-implicit-a-prune/c.pcm
+// RUN: not diff %t/cache-implicit-no-a-keep/c.pcm %t/cache-implicit-a-keep/c.pcm

@jansvoboda11 jansvoboda11 changed the title [clang][modules] Allow module map files with textual headers be non-affecting [clang][modules] Allow module map files with textual headers to be non-affecting Apr 19, 2024
@jansvoboda11 jansvoboda11 changed the title [clang][modules] Allow module map files with textual headers to be non-affecting [clang][modules] Allow module maps with textual headers to be non-affecting Apr 19, 2024
@jansvoboda11 jansvoboda11 force-pushed the textual-non-affecting branch from c68b646 to 9165d60 Compare April 19, 2024 19:47
@jansvoboda11
Copy link
Contributor Author

Hmm, this will probably only work if you compile A into a PCM and then pass it to B. This is not how "Modules/preprocess-decluse.cpp" is set up.

@sam-mccall
Copy link
Collaborator

I can confirm applying this allows our targets to build again! 🎉
Thank you, will take a look at the implementation now.

@sam-mccall
Copy link
Collaborator

Hmm, this will probably only work if you compile A into a PCM and then pass it to B. This is not how "Modules/preprocess-decluse.cpp" is set up.

In our case we always have a chain A <- B <- C.
A.modulemap can be affecting for B but should not be for C.
(Approximately, A covers approximately clang's headers (textual), B covers the standard library (modular). Every other library C sees A.modulemap, B.modulemap, B.pcm. There is no PCM for A).

Copy link
Collaborator

@sam-mccall sam-mccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. It corresponds to our build structure and fixes the new build failures we saw after 0cd0aa0.

Really appreciate your work on this!

Copy link
Contributor

@ian-twilightcoder ian-twilightcoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't totally understand the change, but I don't see anything wrong with it!

@ian-twilightcoder
Copy link
Contributor

Hmm, this will probably only work if you compile A into a PCM and then pass it to B. This is not how "Modules/preprocess-decluse.cpp" is set up.

In our case we always have a chain A <- B <- C. A.modulemap can be affecting for B but should not be for C. (Approximately, A covers approximately clang's headers (textual), B covers the standard library (modular). Every other library C sees A.modulemap, B.modulemap, B.pcm. There is no PCM for A).

clang's headers all have proper modules now, are you sure you still need A?

@jansvoboda11
Copy link
Contributor Author

I updated the description of this PR, hopefully it makes more sense now. I still need to investigate what goes wrong in "Modules/preprocess-decluse.cpp". It seems that it assumes %t/b.pcm embeds the information from "a.modulemap".

@ian-twilightcoder
Copy link
Contributor

I updated the description of this PR, hopefully it makes more sense now. I still need to investigate what goes wrong in "Modules/preprocess-decluse.cpp". It seems that it assumes %t/b.pcm embeds the information from "a.modulemap".

Is this a pre-existing issue, or did my patch change to make "each textual header gets a HFI"?

@jansvoboda11
Copy link
Contributor Author

Is this a pre-existing issue, or did my patch change to make "each textual header gets a HFI"?

I'm not sure yet, I think we'll need to investigate Sam's reproducer and consider Richard's comment. It's possible that we now really do end up with header file infos for textual headers where we didn't before, which triggers the virality much more often.

@sam-mccall
Copy link
Collaborator

clang's headers all have proper modules now, are you sure you still need A?

We can't use clang/lib/Headers/module.modulemap, so we need something to describe those headers.

Why can't we? In our build system, it's the build system's job to generate modulemaps for your libraries, and you don't get to supply your own and build a PCM from it. This is a design decision with some far-reaching ups and downs.

It needs a little special bootstrapping logic for the compiler/stdlib headers that can't be part of any real library, but that bootstrapping is much simpler when we're just injecting one hand-written modulemap, and don't also need the build system to produce a PCM.

(I misspoke slightly: A is textual headers for clang+stdlib, and B is modular passthrough headers for the parts that are self-contained. A is special-cased in the bulid system, B is a normal library except that everything implicitly depends on it).

@sam-mccall
Copy link
Collaborator

sam-mccall commented Apr 19, 2024

Is this a pre-existing issue, or did my patch change to make "each textual header gets a HFI"?

My best understanding that your patch gave textual headersHFIs when the module map was loaded, rather than when the header was included. This shouldn't have mattered, but for the latent pre-existing bug: this caused the modulemap to be "affecting" and thus serialized. (Fixed by this patch)

Your patch changed an early-bailout condition in markFileModuleHeader (1426) from if !ModularHeader to if ExcludedHeader, so now we carry on further in case of textual headers.
The next line calls getExistingFileInfo which is probably fine, and then bails out if ModularHeader (1430). But with textual headers, we keep going. Finally on line 1433 we call getFileInfo and the HFI is forced to exist.

(Having read through this a few times, I really appreciate the detail you put in the comments and commit message, thanks for that!)

@ian-twilightcoder
Copy link
Contributor

Is this a pre-existing issue, or did my patch change to make "each textual header gets a HFI"?

My best understanding that your patch gave textual headersHFIs when the module map was loaded, rather than when the header was included. This shouldn't have mattered, but for the latent pre-existing bug: this caused the modulemap to be "affecting" and thus serialized. (Fixed by this patch)

Your patch changed an early-bailout condition in markFileModuleHeader (1426) from if !ModularHeader to if ExcludedHeader, so now we carry on further in case of textual headers. The next line calls getExistingFileInfo which is probably fine, and then bails out if ModularHeader (1430). But with textual headers, we keep going. Finally on line 1433 we call getFileInfo and the HFI is forced to exist.

(Having read through this a few times, I really appreciate the detail you put in the comments and commit message, thanks for that!)

Ah yeah, that makes sense. Thanks for the help (and Richard and Jan and Ilya too) in finding and fixing this. HFI's are a lot tricker than they look on the surface!

@sam-mccall
Copy link
Collaborator

I updated the description of this PR, hopefully it makes more sense now. I still need to investigate what goes wrong in "Modules/preprocess-decluse.cpp". It seems that it assumes %t/b.pcm embeds the information from "a.modulemap".

I think it should embed that information. The high-level idea here is: the PCM should describe the source of the modulemaps that are relevant to build it.

In this test, that means that b.pcm should have a slocentry for a.modulemap, because that map was relevant in allowing b.h to include a.h. (If a.pcm existed, then it would encode a's module map and we'd use that instead. But as it only has textual headers, we don't build a PCM).

The problem is HFI->isTextualModuleHeader && !HFI->isCompilingModuleHeader is true for a.h when building B: it's from a different module, but being built as part of this one. We don't have a way to represent this I think: it used to be effectively "HFI exists, it's not a modular header, assume we're parsing it" but now we're creating the HFI in non-parsing cases.

I can imagine we could change the semantics of isCompilingModuleHeader for textual headers to be true if the header is being included within a TU for the compiling module, rather than included in the modulemap.

@jansvoboda11
Copy link
Contributor Author

@sam-mccall That makes sense.

I think one option we have here is to consider all module maps describing a textual header that got included as affecting. I'm concerned that a long chain of textual header includes might again be problematic.

For this particular test, we only need to consider the module maps describing used modules affecting to keep -fmodules-strict-decluse happy, which seems like a more precise solution.

@sam-mccall
Copy link
Collaborator

@sam-mccall That makes sense.

I think one option we have here is to consider all module maps describing a textual header that got included as affecting. I'm concerned that a long chain of textual header includes might again be problematic.

Yeah, that's the option that comes closest to preserving previous behavior, and the one that I think most precisely captures the "affecting" semantics - including the module maps that were required for the compilation.

I've sent a version of this as #89729, based on your first commit here.

For this particular test, we only need to consider the module maps describing used modules affecting to keep -fmodules-strict-decluse happy, which seems like a more precise solution.

I'm not sure this is correct in general: if A uses B which uses C, and both B and C are textual, then C.modulemap affects the compilation of A.pcm but won't be picked up.

@jansvoboda11
Copy link
Contributor Author

I've sent a version of this as #89729, based on your first commit here.

I left a comment there.

I think one option we have here is to consider all module maps describing a textual header that got included as affecting. I'm concerned that a long chain of textual header includes might again be problematic.

Yeah, that's the option that comes closest to preserving previous behavior, and the one that I think most precisely captures the "affecting" semantics - including the module maps that were required for the compilation.

Ok, if that doesn't explode your builds again, it makes sense to consider those module maps affecting as well. I updated the patch that now checks for Preprocessor::alreadyIncluded() for textual headers.

@@ -187,7 +187,8 @@ GetAffectingModuleMaps(const Preprocessor &PP, Module *RootModule) {
continue;

const HeaderFileInfo *HFI = HS.getExistingLocalFileInfo(*File);
if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader))
if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader) ||
(HFI->isTextualModuleHeader && !PP.alreadyIncluded(*File)))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not happy with allowing the textual headers through here, since that puts the containing module to ModulesToProcess, for which we then check the imports, uses and undeclared uses. I don't think that's necessary.

We should probably track the module that covers the textual header we included in similar way to the other modules I listed (e.g. llvm::SmallSetVector<Module *, 2> Module::TextualIncludes, or just stash them into Module::AffectingClangModules and include those in affecting module map computation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I don't have any intuition for whether it's necessary or not.

Shortcutting this analysis seems like a useful change of possible, but I think restoring the previous behavior is a helpful starting point.

Copy link
Collaborator

@sam-mccall sam-mccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good, much neater than my approach.

I'm interested in whether the DirectUses change is related to this, and would like to slap on a couple of tests.

But either way, this looks good and it would be great to have it landed!

@@ -0,0 +1,20 @@
// This test checks that a module map with a textual header can be marked as
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a useful test, I think there are a couple of other affecting-ness properties that are important to test:

  • that a textual header that is included means its module is affecting (i.e. the include matters)
  • that this textual-affecting-ness doesn't leak outside the module (a module depending on B isn't affected by A.modulemap)

I tried to cover these in https://github.com/llvm/llvm-project/pull/89729/files#diff-2eed05f9a85bc5a79b9651ee0f23e5f1494e94a2f32e093847aa6dae5ce5d839 - it might be nice to add these as a second test here (I can do it as a followup if you prefer)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch unfortunately doesn't pass that test (on line 44) due to this: when computing affecting module maps for B, "A.h" makes it through the HFI filter, module A is added to ModulesToProcess and the module map for its import X is considered affecting.

I tried splitting up that logic so that we consider imports and undeclared uses only of the current module and its submodules. That doesn't work either, because PP.alreadyIncluded("X.h") returns true, since it checks the global (cross-module) state, not the local state. This might be possible with #71117. Until then, I think we need some "has this been included locally" state that's local to the current TU. Maybe that can live in HeaderFileInfo as per your patch, or as a Preprocessor state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should work with b03c34f.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And thanks for the extended test case!)

@@ -187,7 +187,8 @@ GetAffectingModuleMaps(const Preprocessor &PP, Module *RootModule) {
continue;

const HeaderFileInfo *HFI = HS.getExistingLocalFileInfo(*File);
if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader))
if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader) ||
(HFI->isTextualModuleHeader && !PP.alreadyIncluded(*File)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I don't have any intuition for whether it's necessary or not.

Shortcutting this analysis seems like a useful change of possible, but I think restoring the previous behavior is a helpful starting point.

…r, not just textual module headers. Also remove DirectUses.
Copy link

github-actions bot commented Apr 23, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@jansvoboda11 jansvoboda11 force-pushed the textual-non-affecting branch from ac069f4 to 2993a63 Compare April 23, 2024 20:40
Copy link
Collaborator

@sam-mccall sam-mccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@@ -1574,6 +1574,7 @@ bool HeaderSearch::ShouldEnterIncludeFile(Preprocessor &PP,
}
}

FileInfo.IsLocallyIncluded = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider placing this at the end of HandleHeaderIncludeOrImport rather than here:

  • it looks like there are cases where this function returns true and we don't enter the file textually (or at least, it's not obvious that there are no such cases)
  • having a load-bearing side-effect of a "should we do X" query seems unexpected and liable to break during refactoring or in edge cases

Concretely I don't see a case where this doesn't work though, up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I wanted to colocate this with call to Preprocessor::markIncluded(). If we find a better way of doing things, we can move both of them.

const HeaderFileInfo *HFI = HS.getExistingLocalFileInfo(*File);
if (!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader))
if (!HFI || (!HFI->isCompilingModuleHeader &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this seems hard to follow, consider splitting up as:

if (!HFI)
  continue; // we're not relying on this file at all
if (HFI->isModuleHeader && !HFI->isCompilingModuleHeader)
  continue; // will import HFI from its module
if (!HFI->isCompilingModuleHeader && !PP->alreadyIncluded(*File))
  continue; // unused header needs no description

and possibly pulling out as static shouldDescribeHeader(HFI) or so?

@jansvoboda11 jansvoboda11 merged commit d609029 into llvm:main Apr 24, 2024
@jansvoboda11 jansvoboda11 deleted the textual-non-affecting branch April 24, 2024 16:06
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request May 1, 2024
…ecting (llvm#89441)

When writing out a PCM, we skip serializing headers' `HeaderFileInfo`
struct whenever this condition evaluates to `true`:

```c++
!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader)
```

However, when Clang parses a module map file, each textual header gets a
`HFI` with `isModuleHeader=false`, `isTextualModuleHeader=true` and
`isCompilingModuleHeader=false`. This means the condition evaluates to
`false` even if the header was never included and the module map did not
affect the compilation. Each PCM file that happened to parse such module
map then contains a copy of the `HeaderFileInfo` struct for all textual
headers, and considers the containing module map affecting.

This patch makes it so that we skip headers that have not been included,
essentially removing the virality of textual headers when it comes to
PCM serialization.

(cherry picked from commit d609029)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants