Skip to content

[clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds #83660

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 5, 2024

Conversation

ian-twilightcoder
Copy link
Contributor

Once a file has been #import'ed, it gets stamped as if it was #pragma once and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as <assert.h>, will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. <assert.h> can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or #pragma once.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked textual in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like <assert.h> to always be included when modules are enabled, even if #import is erroneously used somewhere.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules labels Mar 2, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 2, 2024

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: Ian Anderson (ian-twilightcoder)

Changes

Once a file has been #import'ed, it gets stamped as if it was #pragma once and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as <assert.h>, will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. <assert.h> can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or #pragma once.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked textual in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like <assert.h> to always be included when modules are enabled, even if #import is erroneously used somewhere.


Full diff: https://github.com/llvm/llvm-project/pull/83660.diff

7 Files Affected:

  • (modified) clang/include/clang/Lex/HeaderSearch.h (+20-4)
  • (modified) clang/lib/Lex/HeaderSearch.cpp (+99-57)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+1-1)
  • (modified) clang/test/Modules/builtin-import.mm (+2)
  • (modified) clang/test/Modules/import-textual-noguard.mm (+5-1)
  • (modified) clang/test/Modules/import-textual.mm (+2)
  • (added) clang/test/Modules/multiple-import.m (+43)
diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h
index 705dcfa8aacc3f..bf8981b94d1b57 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -78,11 +78,17 @@ struct HeaderFileInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned External : 1;
 
-  /// Whether this header is part of a module.
+  /// Whether this header is part of and built with a module.
   LLVM_PREFERRED_TYPE(bool)
   unsigned isModuleHeader : 1;
 
-  /// Whether this header is part of the module that we are building.
+  /// Whether this header is a textual header in a module (isModuleHeader will
+  /// be false)
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned isTextualModuleHeader : 1;
+
+  /// Whether this header is part of and built with the module that we are
+  /// building.
   LLVM_PREFERRED_TYPE(bool)
   unsigned isCompilingModuleHeader : 1;
 
@@ -128,13 +134,20 @@ struct HeaderFileInfo {
 
   HeaderFileInfo()
       : isImport(false), isPragmaOnce(false), DirInfo(SrcMgr::C_User),
-        External(false), isModuleHeader(false), isCompilingModuleHeader(false),
-        Resolved(false), IndexHeaderMapHeader(false), IsValid(false)  {}
+        External(false), isModuleHeader(false), isTextualModuleHeader(false),
+        isCompilingModuleHeader(false), Resolved(false),
+        IndexHeaderMapHeader(false), IsValid(false) {}
 
   /// Retrieve the controlling macro for this header file, if
   /// any.
   const IdentifierInfo *
   getControllingMacro(ExternalPreprocessorSource *External);
+
+  /// Update the module membership bits based on the header role.
+  ///
+  /// isModuleHeader will potentially be set, but not cleared.
+  /// isTextualModuleHeader will be set or cleared based on the role update.
+  void mergeModuleMembership(ModuleMap::ModuleHeaderRole Role);
 };
 
 /// An external source of header file information, which may supply
@@ -522,6 +535,9 @@ class HeaderSearch {
   ///
   /// \return false if \#including the file will have no effect or true
   /// if we should include it.
+  ///
+  /// \param M The module to which `File` belongs (this should usually be the
+  /// SuggestedModule returned by LookupFile/LookupSubframeworkHeader)
   bool ShouldEnterIncludeFile(Preprocessor &PP, FileEntryRef File,
                               bool isImport, bool ModulesEnabled, Module *M,
                               bool &IsFirstIncludeOfFile);
diff --git a/clang/lib/Lex/HeaderSearch.cpp b/clang/lib/Lex/HeaderSearch.cpp
index fcc2b56df166b8..fb6ec456dd277e 100644
--- a/clang/lib/Lex/HeaderSearch.cpp
+++ b/clang/lib/Lex/HeaderSearch.cpp
@@ -1307,6 +1307,23 @@ OptionalFileEntryRef HeaderSearch::LookupSubframeworkHeader(
 // File Info Management.
 //===----------------------------------------------------------------------===//
 
+static void mergeHeaderFileInfoModuleBits(HeaderFileInfo &HFI,
+                                          bool isModuleHeader,
+                                          bool isTextualModuleHeader) {
+  assert((!isModuleHeader || !isTextualModuleHeader) &&
+         "A header can't build with a module and be textual at the same time");
+  HFI.isModuleHeader |= isModuleHeader;
+  if (HFI.isModuleHeader)
+    HFI.isTextualModuleHeader = false;
+  else
+    HFI.isTextualModuleHeader |= isTextualModuleHeader;
+}
+
+void HeaderFileInfo::mergeModuleMembership(ModuleMap::ModuleHeaderRole Role) {
+  mergeHeaderFileInfoModuleBits(*this, ModuleMap::isModular(Role),
+                                (Role & ModuleMap::TextualHeader));
+}
+
 /// Merge the header file info provided by \p OtherHFI into the current
 /// header file info (\p HFI)
 static void mergeHeaderFileInfo(HeaderFileInfo &HFI,
@@ -1315,7 +1332,8 @@ static void mergeHeaderFileInfo(HeaderFileInfo &HFI,
 
   HFI.isImport |= OtherHFI.isImport;
   HFI.isPragmaOnce |= OtherHFI.isPragmaOnce;
-  HFI.isModuleHeader |= OtherHFI.isModuleHeader;
+  mergeHeaderFileInfoModuleBits(HFI, OtherHFI.isModuleHeader,
+                                OtherHFI.isTextualModuleHeader);
 
   if (!HFI.ControllingMacro && !HFI.ControllingMacroID) {
     HFI.ControllingMacro = OtherHFI.ControllingMacro;
@@ -1403,11 +1421,9 @@ bool HeaderSearch::isFileMultipleIncludeGuarded(FileEntryRef File) const {
 void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE,
                                         ModuleMap::ModuleHeaderRole Role,
                                         bool isCompilingModuleHeader) {
-  bool isModularHeader = ModuleMap::isModular(Role);
-
   // Don't mark the file info as non-external if there's nothing to change.
   if (!isCompilingModuleHeader) {
-    if (!isModularHeader)
+    if ((Role & ModuleMap::ExcludedHeader))
       return;
     auto *HFI = getExistingFileInfo(FE);
     if (HFI && HFI->isModuleHeader)
@@ -1415,82 +1431,109 @@ void HeaderSearch::MarkFileModuleHeader(FileEntryRef FE,
   }
 
   auto &HFI = getFileInfo(FE);
-  HFI.isModuleHeader |= isModularHeader;
-  HFI.isCompilingModuleHeader |= isCompilingModuleHeader;
+  HFI.mergeModuleMembership(Role);
+  HFI.isCompilingModuleHeader |= (HFI.isModuleHeader && isCompilingModuleHeader);
 }
 
 bool HeaderSearch::ShouldEnterIncludeFile(Preprocessor &PP,
                                           FileEntryRef File, bool isImport,
                                           bool ModulesEnabled, Module *M,
                                           bool &IsFirstIncludeOfFile) {
-  ++NumIncluded; // Count # of attempted #includes.
-
+  // An include file should be entered if either:
+  // 1. This is the first include of the file.
+  // 2. This file can be included multiple times, that is it's not an
+  //    "include-once" file.
+  //
+  // Include-once is controlled by these preprocessor directives.
+  //
+  // #pragma once
+  // This directive is in the include file, and marks it as an include-once
+  // file.
+  //
+  // #import <file>
+  // This directive is in the includer, and indicates that the include file
+  // should only be entered if this is the first include.
+  ++NumIncluded;
   IsFirstIncludeOfFile = false;
-
-  // Get information about this file.
   HeaderFileInfo &FileInfo = getFileInfo(File);
 
-  // FIXME: this is a workaround for the lack of proper modules-aware support
-  // for #import / #pragma once
-  auto TryEnterImported = [&]() -> bool {
+  auto MaybeReenterImportedFile = [&]() -> bool {
     if (!ModulesEnabled)
       return false;
+    // Modules add a wrinkle though: what's included isn't necessarily visible.
+    // Consider this module.
+    // module Example {
+    //   module A { header "a.h" export * }
+    //   module B { header "b.h" export * }
+    // }
+    // b.h includes c.h. The main file includes a.h, which will trigger a module
+    // build of Example, and c.h will be included. However, c.h isn't visible to
+    // the main file. Normally this is fine, the main file can just include c.h
+    // if it needs it. If c.h is in a module, the include will translate into a
+    // module import, this function will be skipped, and everything will work as
+    // expected. However, if c.h is not in a module (or is `textual`), then this
+    // function will run. If c.h is include-once, it will not be entered from
+    // the main file and it will still not be visible.
+
     // Ensure FileInfo bits are up to date.
     ModMap.resolveHeaderDirectives(File);
-    // Modules with builtins are special; multiple modules use builtins as
-    // modular headers, example:
-    //
-    //    module stddef { header "stddef.h" export * }
-    //
-    // After module map parsing, this expands to:
-    //
-    //    module stddef {
-    //      header "/path_to_builtin_dirs/stddef.h"
-    //      textual "stddef.h"
-    //    }
+
+    // This brings up a subtlety of #import - it's not a very good indicator of
+    // include-once. Developers are often unaware of the difference between
+    // #include and #import, and tend to use one or the other indiscrimiately.
+    // In order to support #include on include-once headers that lack macro
+    // guards and `#pragma once` (which is the vast majority of Objective-C
+    // headers), if a file is ever included with #import, it's marked as
+    // isImport in the HeaderFileInfo and treated as include-once. This allows
+    // #include to work in Objective-C.
+    // #include <Foundation/Foundation.h>
+    // #include <Foundation/NSString.h>
+    // Foundation.h has an #import of NSString.h, and so the second #include is
+    // skipped even though NSString.h has no `#pragma once` and no macro guard.
     //
-    // It's common that libc++ and system modules will both define such
-    // submodules. Make sure cached results for a builtin header won't
-    // prevent other builtin modules from potentially entering the builtin
-    // header. Note that builtins are header guarded and the decision to
-    // actually enter them is postponed to the controlling macros logic below.
-    bool TryEnterHdr = false;
-    if (FileInfo.isCompilingModuleHeader && FileInfo.isModuleHeader)
-      TryEnterHdr = ModMap.isBuiltinHeader(File);
-
-    // Textual headers can be #imported from different modules. Since ObjC
-    // headers find in the wild might rely only on #import and do not contain
-    // controlling macros, be conservative and only try to enter textual headers
-    // if such macro is present.
-    if (!FileInfo.isModuleHeader &&
-        FileInfo.getControllingMacro(ExternalLookup))
-      TryEnterHdr = true;
-    return TryEnterHdr;
+    // However, this helpfulness causes problems with modules. If c.h is not an
+    // include-once file, but something included it with #import anyway (as is
+    // typical in Objective-C code), this include will be skipped and c.h will
+    // not be visible. If the file is not `#pragma once`, consider it not
+    // include-once if it is a `textual` header in a module.
+    return !FileInfo.isPragmaOnce && FileInfo.isTextualModuleHeader;
+    // If the include file has a macro guard, then it might still not be
+    // re-entered if the controlling macro is visibly defined. e.g. another
+    // header in the module being built included this file and local submodule
+    // visibility is not enabled.
+
+    // It might be tempting to re-enter the include-once file if it's not
+    // visible in an attempt to make it visible. However this will still cause
+    // redeclaration errors against the known-but-not-visible declarations. The
+    // include file not being visible will most likely cause "undefined x"
+    // errors, but at least there's a slim chance of compilation succeeding.
   };
 
-  // If this is a #import directive, check that we have not already imported
-  // this header.
   if (isImport) {
-    // If this has already been imported, don't import it again.
+    // As discussed above, record that this file was ever `#import`ed, and treat
+    // it as an include-once file from here out.
     FileInfo.isImport = true;
-
-    // Has this already been #import'ed or #include'd?
-    if (PP.alreadyIncluded(File) && !TryEnterImported())
+    if (PP.alreadyIncluded(File) && !MaybeReenterImportedFile())
       return false;
   } else {
-    // Otherwise, if this is a #include of a file that was previously #import'd
-    // or if this is the second #include of a #pragma once file, ignore it.
-    if ((FileInfo.isPragmaOnce || FileInfo.isImport) && !TryEnterImported())
+    // isPragmaOnce and isImport are only set after the file has been included
+    // at least once. If either are set then this is a repeat #include of an
+    // include-once file.
+    if (FileInfo.isPragmaOnce ||
+        (FileInfo.isImport && !MaybeReenterImportedFile()))
       return false;
   }
 
-  // Next, check to see if the file is wrapped with #ifndef guards.  If so, and
-  // if the macro that guards it is defined, we know the #include has no effect.
-  if (const IdentifierInfo *ControllingMacro
-      = FileInfo.getControllingMacro(ExternalLookup)) {
+  // As a final optimization, check for a macro guard and skip entering the file
+  // if the controlling macro is defined. The macro guard will effectively erase
+  // the file's contents, and the include would have no effect other than to
+  // waste time opening and reading a file.
+  if (const IdentifierInfo *ControllingMacro =
+          FileInfo.getControllingMacro(ExternalLookup)) {
     // If the header corresponds to a module, check whether the macro is already
-    // defined in that module rather than checking in the current set of visible
-    // modules.
+    // defined in that module rather than checking all visible modules. This is
+    // mainly to cover corner cases where the same controlling macro is used in
+    // different files in multiple modules.
     if (M ? PP.isMacroDefinedInLocalModule(ControllingMacro, M)
           : PP.isMacroDefined(ControllingMacro)) {
       ++NumMultiIncludeFileOptzn;
@@ -1499,7 +1542,6 @@ bool HeaderSearch::ShouldEnterIncludeFile(Preprocessor &PP,
   }
 
   IsFirstIncludeOfFile = PP.markIncluded(File);
-
   return true;
 }
 
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 683a076e6bc399..862618eb9ed7f3 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -2094,7 +2094,7 @@ HeaderFileInfoTrait::ReadData(internal_key_ref key, const unsigned char *d,
       Module::Header H = {std::string(key.Filename), "", *FE};
       ModMap.addHeader(Mod, H, HeaderRole, /*Imported=*/true);
     }
-    HFI.isModuleHeader |= ModuleMap::isModular(HeaderRole);
+    HFI.mergeModuleMembership(HeaderRole);
   }
 
   // This HeaderFileInfo was externally loaded.
diff --git a/clang/test/Modules/builtin-import.mm b/clang/test/Modules/builtin-import.mm
index 8a27cb358484c9..52db9c15803ce5 100644
--- a/clang/test/Modules/builtin-import.mm
+++ b/clang/test/Modules/builtin-import.mm
@@ -1,4 +1,6 @@
 // RUN: rm -rf %t
+// RUN: %clang_cc1 -fsyntax-only -nobuiltininc -nostdinc++ -isysroot %S/Inputs/libc-libcxx/sysroot -isystem %S/Inputs/libc-libcxx/sysroot/usr/include/c++/v1 -isystem %S/Inputs/libc-libcxx/sysroot/usr/include -std=c++11 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -x objective-c++ %s
+// RUN: rm -rf %t
 // RUN: %clang_cc1 -fsyntax-only -nobuiltininc -nostdinc++ -isysroot %S/Inputs/libc-libcxx/sysroot -isystem %S/Inputs/libc-libcxx/sysroot/usr/include/c++/v1 -isystem %S/Inputs/libc-libcxx/sysroot/usr/include -std=c++11 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -x objective-c++ -fmodules-local-submodule-visibility %s
 
 #include <stdio.h>
diff --git a/clang/test/Modules/import-textual-noguard.mm b/clang/test/Modules/import-textual-noguard.mm
index dd124b6609d000..ffc506117f519d 100644
--- a/clang/test/Modules/import-textual-noguard.mm
+++ b/clang/test/Modules/import-textual-noguard.mm
@@ -1,7 +1,11 @@
 // RUN: rm -rf %t
+// RUN: %clang_cc1 -fsyntax-only -std=c++11 -fmodules -fimplicit-module-maps -I%S/Inputs/import-textual/M2 -fmodules-cache-path=%t -x objective-c++ %s -verify
+// RUN: rm -rf %t
 // RUN: %clang_cc1 -fsyntax-only -std=c++11 -fmodules -fimplicit-module-maps -I%S/Inputs/import-textual/M2 -fmodules-cache-path=%t -x objective-c++ -fmodules-local-submodule-visibility %s -verify
 
-#include "A/A.h" // expected-error {{could not build module 'M'}}
+// expected-no-diagnostics
+
+#include "A/A.h"
 #include "B/B.h"
 
 typedef aint xxx;
diff --git a/clang/test/Modules/import-textual.mm b/clang/test/Modules/import-textual.mm
index 6593239d7fd709..94a6aa448cdc2a 100644
--- a/clang/test/Modules/import-textual.mm
+++ b/clang/test/Modules/import-textual.mm
@@ -1,4 +1,6 @@
 // RUN: rm -rf %t
+// RUN: %clang_cc1 -fsyntax-only -std=c++11 -fmodules -fimplicit-module-maps -I%S/Inputs/import-textual/M -fmodules-cache-path=%t -x objective-c++ %s -verify
+// RUN: rm -rf %t
 // RUN: %clang_cc1 -fsyntax-only -std=c++11 -fmodules -fimplicit-module-maps -I%S/Inputs/import-textual/M -fmodules-cache-path=%t -x objective-c++ -fmodules-local-submodule-visibility %s -verify
 
 // expected-no-diagnostics
diff --git a/clang/test/Modules/multiple-import.m b/clang/test/Modules/multiple-import.m
new file mode 100644
index 00000000000000..24b0f50cd3433c
--- /dev/null
+++ b/clang/test/Modules/multiple-import.m
@@ -0,0 +1,43 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -fmodules-cache-path=%t/no-lsv -fmodules -fimplicit-module-maps -I%t %t/multiple-imports.m -verify
+// RUN: %clang_cc1 -fmodules-cache-path=%t/lsv -fmodules -fimplicit-module-maps -fmodules-local-submodule-visibility -I%t %t/multiple-imports.m -verify
+
+//--- multiple-imports.m
+// expected-no-diagnostics
+#import <one.h>
+#import <assert.h>
+void test(void) {
+  assert(0);
+}
+
+//--- module.modulemap
+module Submodules [system] {
+  module one {
+    header "one.h"
+    export *
+  }
+  module two {
+    header "two.h"
+    export *
+  }
+}
+
+module libc [system] {
+  textual header "assert.h"
+}
+
+//--- one.h
+#ifndef one_h
+#define one_h
+#endif
+
+//--- two.h
+#ifndef two_h
+#define two_h
+#include <assert.h>
+#endif
+
+//--- assert.h
+#undef assert
+#define assert(expression) ((void)0)

@ian-twilightcoder
Copy link
Contributor Author

Thanks to @Bigcheese for helping with this patch.

https://reviews.llvm.org/D26267 looks like the most recent significant update to this code that we could find. It had an additional allowance where clang headers that were not marked textual could be re-entered. I think this was to allow stddef.h (which is actually a multiple-include file despite it not being textual) to be entered multiple times while building the Darwin module. That shouldn't be necessary since nothing in the Darwin module or libc++ defines the __need_ macros.

The unit tests added in D26267 still pass, so I don't think the builtin special case is needed anymore, but I'm still doing testing.

@llvm llvm deleted a comment from github-actions bot Mar 2, 2024
@jansvoboda11
Copy link
Contributor

Once a file has been #import'ed, it gets stamped as if it was #pragma once and will not be re-entered, even on #include.

Can you explain how this is happening? The only place where HeaderFileInfo::isPragmaOnce is set to true is in HeaderSearch::MarkFileIncludeOnce(), only called from Preprocessor::HandlePragmaOnce(), which seems correct.

The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order.

Could this be also solved by tracking the "already included" state per submodule?

@ian-twilightcoder
Copy link
Contributor Author

Once a file has been #import'ed, it gets stamped as if it was #pragma once and will not be re-entered, even on #include.

Can you explain how this is happening? The only place where HeaderFileInfo::isPragmaOnce is set to true is in HeaderSearch::MarkFileIncludeOnce(), only called from Preprocessor::HandlePragmaOnce(), which seems correct.

It gets HeaderFileInfo::isImport which causes it to be treated the same as #pragma once from then on.

The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order.

Could this be also solved by tracking the "already included" state per submodule?

That would fix headers that can be included multiple times (which is what I'm trying to fix here). But I think that would break headers that can only be included once and are not part of a module, since their declarations would go into multiple pcms.

@ian-twilightcoder ian-twilightcoder force-pushed the import-textual branch 2 times, most recently from 08681ff to f90fe8b Compare March 5, 2024 00:09
@ian-twilightcoder
Copy link
Contributor Author

Thanks to @Bigcheese for helping with this patch.

https://reviews.llvm.org/D26267 looks like the most recent significant update to this code that we could find. It had an additional allowance where clang headers that were not marked textual could be re-entered. I think this was to allow stddef.h (which is actually a multiple-include file despite it not being textual) to be entered multiple times while building the Darwin module. That shouldn't be necessary since nothing in the Darwin module or libc++ defines the __need_ macros.

The unit tests added in D26267 still pass, so I don't think the builtin special case is needed anymore, but I'm still doing testing.

@bcardosolopes as far as I can tell the builtin special case isn't needed. Can you think of anything else I might need to test?

@vsapsai
Copy link
Collaborator

vsapsai commented Mar 11, 2024

To clarify a little bit

[...] The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors).

The necessity isn't actually true. The same definition in multiple modules shouldn't confuse clang as long as these definitions are identical. But this is a side issue.

To re-iterate what this change is about:

  1. #import implies a file is a single-include
  2. Textual header in a module map implies a file is a multi-include (aka re-entrant)
  3. When we have both #import and the header marked as textual, #import "wins", i.e. the file is single-include
  4. You want to make that when we have both #import and a textual header, textual should "win", i.e. the file should be multi-include

Is it correct?

@ian-twilightcoder
Copy link
Contributor Author

ian-twilightcoder commented Mar 12, 2024

To clarify a little bit

[...] The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors).

The necessity isn't actually true. The same definition in multiple modules shouldn't confuse clang as long as these definitions are identical. But this is a side issue.

Sometimes it does confuse clang, at least I saw problems with a typedef enum when I made an include-once header textual.

To re-iterate what this change is about:

  1. #import implies a file is a single-include
  2. Textual header in a module map implies a file is a multi-include (aka re-entrant)
  3. When we have both #import and the header marked as textual, #import "wins", i.e. the file is single-include
  4. You want to make that when we have both #import and a textual header, textual should "win", i.e. the file should be multi-include

Is it correct?

That's correct. #import is an external source - often it comes from the users of the header and not the author, and the users might not be consistent with each other. textual comes from the author and a much stronger indicator of intent.

@vsapsai
Copy link
Collaborator

vsapsai commented Mar 12, 2024

Sometimes it does confuse clang, at least I saw problems with a typedef enum when I made an include-once header textual.

Ok, I see. I would just consider it a bug, not a design decision.

That's correct. #import is an external source - often it comes from the users of the header and not the author, and the users might not be consistent with each other. textual comes from the author and a much stronger indicator of intent.

Hmm, I need to think more about it. Usually I prefer for modular builds to act like non-modular as it minimizes the surprises and allows to have a portable code. But I haven't really considered the implications of such preference.

I see why you prefer to give more priority to textual. To give a different perspective, what if a developer wants to use a newer library, for example, ncurses. The rest of SDK expects an older version and the developer might need to do some header gymnastics to make it work. So there are cases when developers need to go against library author's intent. I'm thinking that library authors aren't able to predict all the ways their library is used, so it is useful to have some flexibility around that, even at the cost of misusing the library. But this is a general opinion, I don't have specific data about the needs to tweak libraries or the risks of a library misuse. So the opinion isn't particularly strong.

@ian-twilightcoder
Copy link
Contributor Author

Sometimes it does confuse clang, at least I saw problems with a typedef enum when I made an include-once header textual.

Ok, I see. I would just consider it a bug, not a design decision.

That's correct. #import is an external source - often it comes from the users of the header and not the author, and the users might not be consistent with each other. textual comes from the author and a much stronger indicator of intent.

Hmm, I need to think more about it. Usually I prefer for modular builds to act like non-modular as it minimizes the surprises and allows to have a portable code. But I haven't really considered the implications of such preference.

Yes generally, but modules build more headers than non-modular builds, nothing we can do about that.

I see why you prefer to give more priority to textual. To give a different perspective, what if a developer wants to use a newer library, for example, ncurses. The rest of SDK expects an older version and the developer might need to do some header gymnastics to make it work. So there are cases when developers need to go against library author's intent. I'm thinking that library authors aren't able to predict all the ways their library is used, so it is useful to have some flexibility around that, even at the cost of misusing the library. But this is a general opinion, I don't have specific data about the needs to tweak libraries or the risks of a library misuse. So the opinion isn't particularly strong.

That one seems like an edge case, and using #import wouldn't necessarily help you because the SDK module could build first before your #import is seen.

@vsapsai
Copy link
Collaborator

vsapsai commented Apr 2, 2024

That one seems like an edge case, and using #import wouldn't necessarily help you because the SDK module could build first before your #import is seen.

This isn't a specific use case I want to support. I just want to express it is impossible to predict all the ways a library is going to be used. And taking away the control from developers can make their lives harder.

Thinking about it more I still don't like the idea that the source code isn't the source of truth but some side file is. I've wasted too much time trying to figure out why an include isn't including anything and that was because of no_undeclared_includes in a module map.

@ian-twilightcoder
Copy link
Contributor Author

That one seems like an edge case, and using #import wouldn't necessarily help you because the SDK module could build first before your #import is seen.

This isn't a specific use case I want to support. I just want to express it is impossible to predict all the ways a library is going to be used. And taking away the control from developers can make their lives harder.

Thinking about it more I still don't like the idea that the source code isn't the source of truth but some side file is. I've wasted too much time trying to figure out why an include isn't including anything and that was because of no_undeclared_includes in a module map.

The current behavior of #import with modules causes the same kind of maddening "why isn't this include doing anything?" behavior that no_undeclared_includes does. #import is just a broken feature. It relies on every single include in the OS/SDK to correctly choose between #include and #import, and the Apple SDKs have it wrong all over the place. I really don't think it's reasonable to expect humans to know which one to use the difference is so esoteric. And when a header changes such that it needs to be used with #include instead of #import, it's also too much to expect all of the clients, internal and external, to switch. Because it's not enough that the SDK has it right, the clients have to have it right too, even the first #import will silently do nothing if a not-visible submodule previously did an #include. While I agree with the sentiment that it's frustrating that module maps add an extra layer of behavior that's sometimes unintuitive, I think the fact is that #import is a misguided feature that can't practically be made to work properly, especially with modules, and this PR is the best workaround we've come up with so far.

@jansvoboda11
Copy link
Contributor

jansvoboda11 commented Apr 3, 2024

Could this be also solved by tracking the "already included" state per submodule?

That would fix headers that can be included multiple times (which is what I'm trying to fix here). But I think that would break headers that can only be included once and are not part of a module, since their declarations would go into multiple pcms.

How exactly would that break? We do allow multiple declarations with the same name as long as they are structurally equivalent.

@ian-twilightcoder
Copy link
Contributor Author

Could this be also solved by tracking the "already included" state per submodule?

That would fix headers that can be included multiple times (which is what I'm trying to fix here). But I think that would break headers that can only be included once and are not part of a module, since their declarations would go into multiple pcms.

How exactly would that break? We do allow multiple declarations with the same name as long as they are structurally equivalent.

Only the simplest of declarations successfully do that comparison. Even if they did it would still be an issue for Swift which uses the module name as part of the type name.

…ompletely invisible in clang module builds

Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as <assert.h>, will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. <assert.h> can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely.

Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order.

#import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like <assert.h> to always be included when modules are enabled, even if #import is erroneously used somewhere.
Copy link
Contributor

@jansvoboda11 jansvoboda11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ian-twilightcoder ian-twilightcoder merged commit 0cd0aa0 into llvm:main Apr 5, 2024
@ian-twilightcoder ian-twilightcoder deleted the import-textual branch April 5, 2024 17:13
@ilya-biryukov
Copy link
Contributor

ilya-biryukov commented Apr 10, 2024

@ian-twilightcoder this change seemed to cause our internal builds to run out of source location space.

remark: source manager location address space usage: [-Rsloc-usage]
note: 408559B in local locations, 2146921126B in locations loaded from AST files, for a total of 2147329685B (99% of available space)

The main offenders from the follow-ups are:

  • module map files for standard library and C/C++ runtime headers, each entered a few thousand times,
  • various headers without header guards, e.g. libc++'s __config and others,
  • arm_neon.h.
    Without looking deeper into what the change does, this might be an expected outcome here. However, let me know if something already looks off from this small description (e.g. the module maps entered so many times is surprising, I'll need to see if this happened before).

I will investigate this further, but we might not be able to change how our builds work quickly.
Would it be ok to revert this change or put the new behavior under a flag to give us some more time to investigate?

@ian-twilightcoder
Copy link
Contributor Author

@ian-twilightcoder this change seemed to cause our internal builds to run out of source location space.

remark: source manager location address space usage: [-Rsloc-usage]
note: 408559B in local locations, 2146921126B in locations loaded from AST files, for a total of 2147329685B (99% of available space)

The main offenders from the follow-ups are:

  • module map files for standard library and C/C++ runtime headers, each entered a few thousand times,
  • various headers without header guards, e.g. libc++'s __config and others,
  • arm_neon.h.
    Without looking deeper into what the change does, this might be an expected outcome here. However, let me know if something already looks off from this small description (e.g. the module maps entered so many times is surprising, I'll need to see if this happened before).

I will investigate this further, but we might not be able to change how our builds work quickly. Would it be ok to revert this change or put the new behavior under a flag to give us some more time to investigate?

That's surprising, the code to re-enter files is HeaderSearch::ShouldEnterIncludeFile, and I only changed the behavior there for #import. The #include behavior is still governed by #pragma once and FileInfo.getControllingMacro, and that isn't changed. Are you sure it's this change? Do you have a repro case?

@ilya-biryukov
Copy link
Contributor

It's definitely caused by that change, reverting it fixes the failure.

We use module maps so that some of our #include get processed as #import, I believe the same code gets executed for our use-case. Because our setup is so unusual, producing a repro is hard and make take some time. (Build system generated module maps and #include is essentially turned into #import by Clang). I will try to investigate this further today and come back with more details.

@ilya-biryukov
Copy link
Contributor

Most of the increase seems to be coming from module maps that describe textual headers:

// This describes builtin headers.
Before: gen-crosstool-x86.cppmap:1:1: note: file entered 630 times using 83989080B of space
After: gen-crosstool-x86.cppmap:1:1: note: file entered 2260 times using 319523320B of space

I will dig further to see where it's coming from.

@ian-twilightcoder
Copy link
Contributor Author

Are your textual headers meant to be included more than once? If not, do they use header guards or #pragma once?

@ian-twilightcoder
Copy link
Contributor Author

It's definitely caused by that change, reverting it fixes the failure.

We use module maps so that some of our #include get processed as #import, I believe the same code gets executed for our use-case. Because our setup is so unusual, producing a repro is hard and make take some time. (Build system generated module maps and #include is essentially turned into #import by Clang).

clang never transforms #include to #import. It will effectively attempt to transform #include and #import to @import but that's different.

@ilya-biryukov
Copy link
Contributor

ilya-biryukov commented Apr 12, 2024

Are your textual headers meant to be included more than once? If not, do they use header guards or #pragma once?

As I mentioned, the increase is coming from the module map files.
The textual headers are header-guarded and not intended to be included more than once.

clang never transforms #include to #import. It will effectively attempt to transform #include and #import to @import but that's different.

I meant to say that we use module maps to replace #include with module imports. Sorry for getting the exact terminology wrong.

@ilya-biryukov
Copy link
Contributor

I still don't understand why the number of FileIDs for module maps has increased so much. I am observing this increase when compiling a file that has many module dependencies, it creates only one FileID for a module map I mentioned above, the rest 2259 are from loaded binary PCM files.

@ian-twilightcoder
Copy link
Contributor Author

Wow that's a ton of PCM files. Why are your headers textual if they're only meant to be included once? Do they need to inherit the includer's preprocessor environment or something like that?

@jyknight
Copy link
Member

jyknight commented Apr 12, 2024

I think the bug this change was attempting to fix is actually the same as #38554 and related bugs -- which, it appears, was not really fixed.

The underlying problem here is that #pragma once should should work identically to ifndef guards, as far as what macros/decls are made visible to the includer. However, with the current modules implementation, it fails to do so. This PR addresses only one symptom of that issue, but does not address the underlying problem.

Additionally, I think the semantics change implemented here is a bad idea. It significantly complicates the semantics of #import, in a way that I think is justifiable, especially given that there is a way to fix the identified issue in a more principled way.

Yes, #import was a bad and confusing idea, and should have been dropped 20 years ago, but adding more special cases to it does not make it better. A header file being marked "textual" in a module-map does not mean a file should be be included multiple times -- it simply means that you didn't want to build it into a module. Linking these two concepts together ties together things which have no good reason to be related. [edited this bit after posting; forgot to finish the thought]

As such, I would propose this PR ought to be reverted, and work to fix the underlying issue be done instead.

Here's a test case which I think demonstrates the actual bug -- it fails both before and after this PR. If you modify 'header.h' to use include guards, then it passes. This difference in behavior should not exist.

// RUN: rm -rf %t
// RUN: split-file %s %t
// RUN: %clang_cc1 -std=c17 -fmodules-cache-path=%t/no-lsv -fmodules -fimplicit-module-maps -I%t %t/pragma-once-modules.c -verify
// RUN: %clang_cc1 -std=c17 -fmodules-cache-path=%t/lsv -fmodules -fimplicit-module-maps -fmodules-local-submodule-visibility -I%t %t/pragma-once-modules.c -verify

//--- pragma-once-modules.c
// expected-no-diagnostics
#include "one.h"
#include "header.h"
my_int test(void) {
  return MY_MACRO;
}

//--- module.modulemap
module Submodules {
  module one {
    header "one.h"
    export *
  }
  module two {
    header "two.h"
    export *
  }
}

//--- one.h
#ifndef one_h
#define one_h
#endif

//--- two.h
#ifndef two_h
#define two_h
#include "header.h"
#endif

//--- header.h
#pragma once
#define MY_MACRO 5
typedef int my_int;

It produces the following unexpected output.

error: 'expected-error' diagnostics seen but not expected: 
  [...]/pragma-once-modules.c Line 4: missing '#include "header.h"'; 'my_int' must be declared before it is used
  [...]/pragma-once-modules.c Line 5: use of undeclared identifier 'MY_MACRO'
error: 'expected-note' diagnostics seen but not expected: 
  [...]/header.h Line 3: declaration here is not visible

@jyknight
Copy link
Member

Oh -- I'd also note that even if this is reverted, ilya-biryukov may want to continue to investigate the source-location issue -- it's entirely possible that the correct fix will trigger that same problem!

@ian-twilightcoder
Copy link
Contributor Author

ian-twilightcoder commented Apr 12, 2024

I don't really think it's the same thing. The problem I'm trying to fix is that nobody knows when it's appropriate to use #import vs #include, and the unfortunate convention of Objective-C makes it impossible for header owners to indicate if they support being included multiple times or not, as using header guards or #pragma once is very "un-Objective-C". In other words, not having a header guard is 1) not a very good indication since include-once ObjC headers almost never have header guards, and 2) the header developer is at the mercy of every ObjC developer getting it right. Marking the header as textual is a fairly reasonable way for header owners to indicate that their header is meant to be included multiple times (e.g. stddef.h), even if ObjC developers don't know that and used #import. (The other use of textual that I've seen is for headers like unwind_arm_ehabi.h that are only allowed to have a single includer, and aren't really standalone.) If you don't want the header to build with the module, that's what excluded is for is it not?

An alternative solution would be @jansvoboda11's #71117, but I believe that will cause issues with non-modular headers building into multiple modules.

@ian-twilightcoder
Copy link
Contributor Author

ian-twilightcoder commented Apr 12, 2024

Also this change isn't trying to modify #include in any kind of way, only #import. So it's quite intentional that your test behaves the same before and after. It's also not trying to make including non-modular headers from modular headers do anything better, it's mostly considered an error to include a non-modular header from a modular header.

@jyknight
Copy link
Member

The problem I'm trying to fix is that nobody knows when it's appropriate to use #import vs #include

But you haven't really (and I think cannot) fixed that.

using header guards or #pragma once is very "un-Objective-C".

Yes, this is quite unfortunate. The best answer would be to change the recommended style, but unfortunately, it's probably 20-30 years too late to do so effectively. (I mean, it would be a start if Apple added #pragma once to all of the SDK headers.)

Your initial comment said:

Normally this isn't a big problem, [...] However, when clang modules are involved, this can cause the header to be hidden entirely.

And I am basically agreeing with this. The current state of objc include/import, while unfortunate, has existed for decades, and is usually not a big problem. And, I think that means we shouldn't add more complexity to the semantics of #import in an attempt to fix it. But the "header is hidden entirely" is a big problem, and that part is due to the bug I described -- so if that bug is fixed, we're back to "isn't a big problem", without making this change in semantics.

An alternative solution would be @jansvoboda11's #71117, but I believe that will cause issues with non-modular headers building into multiple modules.

I haven't looked at that, but it sounds promising. The end result should be that #imported and #pragma once-guarded files are treated the same way as #ifndef-guarded files. I wouldn't expect that to cause issues with non-modular headers included in multiple modules, given that works for #ifndef-guarded files, today.

Also this change isn't trying to modify #include in any kind of way, only #import. So it's quite intentional that your test behaves the same before and after.

I know you weren't trying to fix that -- but I'm saying that's the thing that should be fixed, and that doing so would fix the serious part of the problem you described in the initial message.

@ian-twilightcoder
Copy link
Contributor Author

The end result should be that #imported and #pragma once-guarded files are treated the same way as #ifndef-guarded files.

While I don't necessarily disagree with that goal in principle, it wasn't true before this change either. There was already some special casing to defeat #import. This change just adds another special case to the #import path. #include doesn't even go through the modified code that is MaybeReenterImportedFile (née TryEnterImported).

I know you weren't trying to fix that -- but I'm saying that's the thing that should be fixed, and that doing so would fix the serious part of the problem you described in the initial message.

Modular headers including non-modular headers is a special problem. If we let every module enter the non-modular header (which Jan's patch does), its declarations will build into multiple modules/pcm files, where they will 1) conflict and cause errors in the several cases where type merging can't handle them, and 2) come from multiple modules which is undefined behavior for Swift (i.e. Swift will see mod1.type and mod2.type and won't know how to resolve un-qualified type in code). So I don't think that case really can be fixed.

@jyknight
Copy link
Member

The end result should be that #imported and #pragma once-guarded files are treated the same way as #ifndef-guarded files.

While I don't necessarily disagree with that goal in principle, it wasn't true before this change either.

Well, yes, I know it isn't true yet -- that's exactly what I've been trying to say, and demonstrated with the test -- it's not true and that's a bug.

Modular headers including non-modular headers is a special problem. If we let every module enter the non-modular header (which Jan's patch does), its declarations will build into multiple modules/pcm files

I'm not really sure what you're getting at in this paragraph. Is there a new special problem introduced here?

Yes, a textual header can be built into multiple modules/pcm files -- that's the nature of it being textual, right? Certainly it's always be a better to modularize headers from the bottom-up, and avoid textually including a header into multiple modules. And, yes, the performance cost of duplicating the decls into multiple modules' pcm files can be quite significant, depending on the size of the header. Yet, sometimes it happens. And, when it does, the declaration-merging takes care of it -- for C/C++ clients. This all works (and is used) today, with ifndef-guarded header inclusion.

But a textually-included header shouldn't be re-entered if it's been visibly included already in the current compilation -- including if that was exported from an imported module. It should only be re-entered if the previous include is not visible in the current context. Maybe you're saying Jan's patch doesn't do that properly yet. (I could see that being the case, since the behavior mostly just falls out automatically for the ifndef-guard case, based on whether the guard-macro was made visible or not!)

@jyknight
Copy link
Member

To get back to my previous point about the semantics implemented by this patch being unfortunate -- the upshot is that, now:

#include <assert.h>
#define NDEBUG
#import <assert.h>

will include assert.h twice (even though the latter is an "import") only if modules are enabled. If modules are disabled, import keeps the original behavior of suppressing multiple inclusion. This doesn't make sense: with modules disabled, everything is textual -- to say that "textual header" in a modulemap should make a header somehow "more" textual than the default non-modular behavior is weird.

That is what I meant by my previous comment that this change has tied together features which should not be related -- and that's the main reason why I believe this PR should be reverted.

If we do need to implement a behavior of suppressing #import's include-once behavior for certain headers, it should be based on something within the header being included. E.g. could create a new#pragma multiply_include_even_with_import and add it to assert.h -- this would be effectively the reverse of #pragma once. I do not believe it's really needed (once the "header is hidden entirely" bug is fixed), but it would be consistent and not produce broken/surprising semantics.

@ian-twilightcoder
Copy link
Contributor Author

Wall of text incoming... Sorry but our documentation in this area is poor, so I feel like I need to spell out the assumptions behind this change.

First of all, this should be a relatively minor change that should affect a very limited set of circumstances.

  1. Only #import should be affected, not #include.
  2. Only headers marked textual header in a module map should be affected, nothing else.
  3. Headers that use #pragma once or have header guards should not be affected.
  4. Module map reading should not be affected.

That should end up being a rather narrow change, and if something else got affected I'd like to know what and fix it.

#import is already treated differently if modules are enabled. This adds another case to the special treatment, but doesn't introduce that fact that sometimes #imported headers will be re-entered when building with modules. We can say we don't like #import behaving differently when modules are enabled. However, it's not going to be any harder to undo this special treatment than it will to undo the other ones, if we ever care enough to try, so I don't think this is such an egregious change.

textual headers are already treated differently as well. To better define that, there are 4 different types of headers in a clang modules world.

  1. Headers covered by a header statement in a module map - ye olde standard include-once header.
  2. Headers covered by a textual header statement in a module map - usually headers that are intended to be included multiple times. Generally these are X macro generator headers, or special headers like <assert.h> or clang's <stdarg.h> and <stddef.h> which are designed to have different behavior based on the preprocessor environment of the includer.
  3. Headers covered by an exclude header statement in a module map - headers that can't be used in a module. Sometimes these are just obsolete headers, and sometimes it's headers that have dependencies that can't be used in a module.
  4. Headers that are not listed in any module map. These are essentially the same thing as excluded headers.

There's a semantic difference between textual and excluded headers. Textual headers are anointed to be safe to use in modules, and excluded headers are not supposed to be included from any modular header. (If I had my way, -Wnon-modular-include-in-module would default to an error much like -Wmodule-import-in-extern-c does, but that can be a soap box for another day.) Textual headers are already treated differently from excluded headers in a few ways, such as they respect [no_undeclared_includes]/use, while excluded headers don't.

All this to say that #imported headers are already treated different when modules are enabled, and textual headers are already treated different than excluded headers. I think it's fine to tweak this different treatment, especially since it's solving a real world problem that occurs in a fully modularized system. I also think it's preferable to focus the change down to the very specific problem, which is #imported textual headers, instead of trying to fix all of the submodule visibility problems we can think of.

@ian-twilightcoder
Copy link
Contributor Author

To get back to my previous point about the semantics implemented by this patch being unfortunate -- the upshot is that, now:

#include <assert.h>
#define NDEBUG
#import <assert.h>

It would be nice if we could make this work without modules too. #pragma many or something would do that, but I fear we'd have compatibility issues if we added a pragma that's only supported by clang to a bunch of system headers.

This doesn't make sense: with modules disabled, everything is textual -- to say that "textual header" in a modulemap should make a header somehow "more" textual than the default non-modular behavior is weird.

With modules disabled, everything isn't textual - it's effectively excluded. textual was probably not the best name for that keyword in the module map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants