Skip to content

[Modules] Fix an identifier hiding a function-like macro definition. #135471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 16, 2025

Conversation

vsapsai
Copy link
Collaborator

@vsapsai vsapsai commented Apr 12, 2025

We emit a macro definition only in a module defining it. But it means that if another module has an identifier with the same name as the macro, the users of such module won't be able to use the macro anymore.

Fix by storing that an identifier has a macro definition that's not in a current module (MacroDirectivesOffset == 0). This way IdentifierLookupVisitor knows not to stop at the first module with an identifier but to keep checking included modules for the actual macro definition.

Fixes issue #32040.

rdar://30258278

We emit a macro definition only in a module defining it. But it means
that if another module has an identifier with the same name as the macro,
the users of such module won't be able to use the macro anymore.

Fix by storing that an identifier has a macro definition that's not in
a current module (`MacroDirectivesOffset == 0`). This way
`IdentifierLookupVisitor` knows not to stop at the first module with
an identifier but to keep checking included modules for the actual macro
definition.

Fixes issue llvm#32040.

rdar://30258278
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:modules C++20 modules and Clang Header Modules labels Apr 12, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 12, 2025

@llvm/pr-subscribers-clang

Author: Volodymyr Sapsai (vsapsai)

Changes

We emit a macro definition only in a module defining it. But it means that if another module has an identifier with the same name as the macro, the users of such module won't be able to use the macro anymore.

Fix by storing that an identifier has a macro definition that's not in a current module (MacroDirectivesOffset == 0). This way IdentifierLookupVisitor knows not to stop at the first module with an identifier but to keep checking included modules for the actual macro definition.

Fixes issue #32040.

rdar://30258278


Full diff: https://github.com/llvm/llvm-project/pull/135471.diff

4 Files Affected:

  • (modified) clang/lib/Serialization/ASTReader.cpp (+11-5)
  • (modified) clang/lib/Serialization/ASTReaderInternals.h (+6)
  • (modified) clang/lib/Serialization/ASTWriter.cpp (+11-5)
  • (added) clang/test/Modules/shadow-macro.c (+29)
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 02c31dff620ec..e8a3c148f9f61 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -1131,7 +1131,7 @@ IdentifierInfo *ASTIdentifierLookupTrait::ReadData(const internal_key_type& k,
   bool HasRevertedTokenIDToIdentifier = readBit(Bits);
   bool Poisoned = readBit(Bits);
   bool ExtensionToken = readBit(Bits);
-  bool HadMacroDefinition = readBit(Bits);
+  bool HasMacroDefinition = readBit(Bits);
 
   assert(Bits == 0 && "Extra bits in the identifier?");
   DataLen -= sizeof(uint16_t) * 2;
@@ -1151,14 +1151,17 @@ IdentifierInfo *ASTIdentifierLookupTrait::ReadData(const internal_key_type& k,
          "Incorrect C++ operator keyword flag");
   (void)CPlusPlusOperatorKeyword;
 
-  // If this identifier is a macro, deserialize the macro
-  // definition.
-  if (HadMacroDefinition) {
+  // If this identifier has a macro definition, deserialize it or notify the
+  // visitor the actual definition is in a different module.
+  if (HasMacroDefinition) {
     uint32_t MacroDirectivesOffset =
         endian::readNext<uint32_t, llvm::endianness::little>(d);
     DataLen -= 4;
 
-    Reader.addPendingMacro(II, &F, MacroDirectivesOffset);
+    if (MacroDirectivesOffset)
+      Reader.addPendingMacro(II, &F, MacroDirectivesOffset);
+    else
+      hasMacroDefinitionInDependencies = true;
   }
 
   Reader.SetIdentifierInfo(ID, II);
@@ -2419,6 +2422,9 @@ namespace {
       // declarations it needs.
       ++NumIdentifierLookupHits;
       Found = *Pos;
+      if (Trait.hasMoreInformationInDependencies())
+        // Look for the identifier in extra modules as they contain more info.
+        return false;
       return true;
     }
 
diff --git a/clang/lib/Serialization/ASTReaderInternals.h b/clang/lib/Serialization/ASTReaderInternals.h
index 353e0a53cad9b..4a7794889b039 100644
--- a/clang/lib/Serialization/ASTReaderInternals.h
+++ b/clang/lib/Serialization/ASTReaderInternals.h
@@ -286,6 +286,8 @@ class ASTIdentifierLookupTrait : public ASTIdentifierLookupTraitBase {
   // identifier that was constructed before the AST file was read.
   IdentifierInfo *KnownII;
 
+  bool hasMacroDefinitionInDependencies = false;
+
 public:
   using data_type = IdentifierInfo *;
 
@@ -300,6 +302,10 @@ class ASTIdentifierLookupTrait : public ASTIdentifierLookupTraitBase {
   IdentifierID ReadIdentifierID(const unsigned char *d);
 
   ASTReader &getReader() const { return Reader; }
+
+  bool hasMoreInformationInDependencies() const {
+    return hasMacroDefinitionInDependencies;
+  }
 };
 
 /// The on-disk hash table used to contain information about
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index a48c05061626a..05a3b8f08029f 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -3792,7 +3792,10 @@ bool IsInterestingIdentifier(const IdentifierInfo *II, uint64_t MacroOffset,
       II->getNotableIdentifierID() != tok::NotableIdentifierKind::not_notable ||
       II->getBuiltinID() != Builtin::ID::NotBuiltin ||
       II->getObjCKeywordID() != tok::ObjCKeywordKind::objc_not_keyword;
-  if (MacroOffset || II->isPoisoned() || (!IsModule && IsInteresting) ||
+  if (MacroOffset ||
+      (II->hasMacroDefinition() &&
+       II->hasFETokenInfoChangedSinceDeserialization()) ||
+      II->isPoisoned() || (!IsModule && IsInteresting) ||
       II->hasRevertedTokenIDToIdentifier() ||
       (NeedDecls && II->getFETokenInfo()))
     return true;
@@ -3871,7 +3874,8 @@ class ASTIdentifierTableTrait {
     if (isInterestingIdentifier(II, MacroOffset)) {
       DataLen += 2; // 2 bytes for builtin ID
       DataLen += 2; // 2 bytes for flags
-      if (MacroOffset)
+      if (MacroOffset || (II->hasMacroDefinition() &&
+                          II->hasFETokenInfoChangedSinceDeserialization()))
         DataLen += 4; // MacroDirectives offset.
 
       if (NeedDecls && IdResolver)
@@ -3902,15 +3906,17 @@ class ASTIdentifierTableTrait {
     assert((Bits & 0xffff) == Bits && "ObjCOrBuiltinID too big for ASTReader.");
     LE.write<uint16_t>(Bits);
     Bits = 0;
-    bool HadMacroDefinition = MacroOffset != 0;
-    Bits = (Bits << 1) | unsigned(HadMacroDefinition);
+    bool HasMacroDefinition =
+        (MacroOffset != 0) || (II->hasMacroDefinition() &&
+                               II->hasFETokenInfoChangedSinceDeserialization());
+    Bits = (Bits << 1) | unsigned(HasMacroDefinition);
     Bits = (Bits << 1) | unsigned(II->isExtensionToken());
     Bits = (Bits << 1) | unsigned(II->isPoisoned());
     Bits = (Bits << 1) | unsigned(II->hasRevertedTokenIDToIdentifier());
     Bits = (Bits << 1) | unsigned(II->isCPlusPlusOperatorKeyword());
     LE.write<uint16_t>(Bits);
 
-    if (HadMacroDefinition)
+    if (HasMacroDefinition)
       LE.write<uint32_t>(MacroOffset);
 
     if (NeedDecls && IdResolver) {
diff --git a/clang/test/Modules/shadow-macro.c b/clang/test/Modules/shadow-macro.c
new file mode 100644
index 0000000000000..4cd7cf0500322
--- /dev/null
+++ b/clang/test/Modules/shadow-macro.c
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/modules.cache \
+// RUN:            -fsyntax-only %t/test.c -verify
+// Test again with the populated module cache.
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/modules.cache \
+// RUN:            -fsyntax-only %t/test.c -verify
+
+// Test that an identifier with the same name as a macro doesn't hide this
+// macro from the includers.
+
+//--- macro-definition.h
+#define __P(protos) ()
+#define __Q(protos) ()
+
+//--- macro-transitive.h
+#include "macro-definition.h"
+void test(int __P) {} // not "interesting" identifier
+struct __Q {};        // "interesting" identifier
+
+//--- module.modulemap
+module MacroDefinition { header "macro-definition.h" export * }
+module MacroTransitive { header "macro-transitive.h" export * }
+
+//--- test.c
+// expected-no-diagnostics
+#include "macro-transitive.h"
+void foo __P(());
+void bar __Q(());

@llvmbot
Copy link
Member

llvmbot commented Apr 12, 2025

@llvm/pr-subscribers-clang-modules

Author: Volodymyr Sapsai (vsapsai)

Changes

We emit a macro definition only in a module defining it. But it means that if another module has an identifier with the same name as the macro, the users of such module won't be able to use the macro anymore.

Fix by storing that an identifier has a macro definition that's not in a current module (MacroDirectivesOffset == 0). This way IdentifierLookupVisitor knows not to stop at the first module with an identifier but to keep checking included modules for the actual macro definition.

Fixes issue #32040.

rdar://30258278


Full diff: https://github.com/llvm/llvm-project/pull/135471.diff

4 Files Affected:

  • (modified) clang/lib/Serialization/ASTReader.cpp (+11-5)
  • (modified) clang/lib/Serialization/ASTReaderInternals.h (+6)
  • (modified) clang/lib/Serialization/ASTWriter.cpp (+11-5)
  • (added) clang/test/Modules/shadow-macro.c (+29)
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 02c31dff620ec..e8a3c148f9f61 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -1131,7 +1131,7 @@ IdentifierInfo *ASTIdentifierLookupTrait::ReadData(const internal_key_type& k,
   bool HasRevertedTokenIDToIdentifier = readBit(Bits);
   bool Poisoned = readBit(Bits);
   bool ExtensionToken = readBit(Bits);
-  bool HadMacroDefinition = readBit(Bits);
+  bool HasMacroDefinition = readBit(Bits);
 
   assert(Bits == 0 && "Extra bits in the identifier?");
   DataLen -= sizeof(uint16_t) * 2;
@@ -1151,14 +1151,17 @@ IdentifierInfo *ASTIdentifierLookupTrait::ReadData(const internal_key_type& k,
          "Incorrect C++ operator keyword flag");
   (void)CPlusPlusOperatorKeyword;
 
-  // If this identifier is a macro, deserialize the macro
-  // definition.
-  if (HadMacroDefinition) {
+  // If this identifier has a macro definition, deserialize it or notify the
+  // visitor the actual definition is in a different module.
+  if (HasMacroDefinition) {
     uint32_t MacroDirectivesOffset =
         endian::readNext<uint32_t, llvm::endianness::little>(d);
     DataLen -= 4;
 
-    Reader.addPendingMacro(II, &F, MacroDirectivesOffset);
+    if (MacroDirectivesOffset)
+      Reader.addPendingMacro(II, &F, MacroDirectivesOffset);
+    else
+      hasMacroDefinitionInDependencies = true;
   }
 
   Reader.SetIdentifierInfo(ID, II);
@@ -2419,6 +2422,9 @@ namespace {
       // declarations it needs.
       ++NumIdentifierLookupHits;
       Found = *Pos;
+      if (Trait.hasMoreInformationInDependencies())
+        // Look for the identifier in extra modules as they contain more info.
+        return false;
       return true;
     }
 
diff --git a/clang/lib/Serialization/ASTReaderInternals.h b/clang/lib/Serialization/ASTReaderInternals.h
index 353e0a53cad9b..4a7794889b039 100644
--- a/clang/lib/Serialization/ASTReaderInternals.h
+++ b/clang/lib/Serialization/ASTReaderInternals.h
@@ -286,6 +286,8 @@ class ASTIdentifierLookupTrait : public ASTIdentifierLookupTraitBase {
   // identifier that was constructed before the AST file was read.
   IdentifierInfo *KnownII;
 
+  bool hasMacroDefinitionInDependencies = false;
+
 public:
   using data_type = IdentifierInfo *;
 
@@ -300,6 +302,10 @@ class ASTIdentifierLookupTrait : public ASTIdentifierLookupTraitBase {
   IdentifierID ReadIdentifierID(const unsigned char *d);
 
   ASTReader &getReader() const { return Reader; }
+
+  bool hasMoreInformationInDependencies() const {
+    return hasMacroDefinitionInDependencies;
+  }
 };
 
 /// The on-disk hash table used to contain information about
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index a48c05061626a..05a3b8f08029f 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -3792,7 +3792,10 @@ bool IsInterestingIdentifier(const IdentifierInfo *II, uint64_t MacroOffset,
       II->getNotableIdentifierID() != tok::NotableIdentifierKind::not_notable ||
       II->getBuiltinID() != Builtin::ID::NotBuiltin ||
       II->getObjCKeywordID() != tok::ObjCKeywordKind::objc_not_keyword;
-  if (MacroOffset || II->isPoisoned() || (!IsModule && IsInteresting) ||
+  if (MacroOffset ||
+      (II->hasMacroDefinition() &&
+       II->hasFETokenInfoChangedSinceDeserialization()) ||
+      II->isPoisoned() || (!IsModule && IsInteresting) ||
       II->hasRevertedTokenIDToIdentifier() ||
       (NeedDecls && II->getFETokenInfo()))
     return true;
@@ -3871,7 +3874,8 @@ class ASTIdentifierTableTrait {
     if (isInterestingIdentifier(II, MacroOffset)) {
       DataLen += 2; // 2 bytes for builtin ID
       DataLen += 2; // 2 bytes for flags
-      if (MacroOffset)
+      if (MacroOffset || (II->hasMacroDefinition() &&
+                          II->hasFETokenInfoChangedSinceDeserialization()))
         DataLen += 4; // MacroDirectives offset.
 
       if (NeedDecls && IdResolver)
@@ -3902,15 +3906,17 @@ class ASTIdentifierTableTrait {
     assert((Bits & 0xffff) == Bits && "ObjCOrBuiltinID too big for ASTReader.");
     LE.write<uint16_t>(Bits);
     Bits = 0;
-    bool HadMacroDefinition = MacroOffset != 0;
-    Bits = (Bits << 1) | unsigned(HadMacroDefinition);
+    bool HasMacroDefinition =
+        (MacroOffset != 0) || (II->hasMacroDefinition() &&
+                               II->hasFETokenInfoChangedSinceDeserialization());
+    Bits = (Bits << 1) | unsigned(HasMacroDefinition);
     Bits = (Bits << 1) | unsigned(II->isExtensionToken());
     Bits = (Bits << 1) | unsigned(II->isPoisoned());
     Bits = (Bits << 1) | unsigned(II->hasRevertedTokenIDToIdentifier());
     Bits = (Bits << 1) | unsigned(II->isCPlusPlusOperatorKeyword());
     LE.write<uint16_t>(Bits);
 
-    if (HadMacroDefinition)
+    if (HasMacroDefinition)
       LE.write<uint32_t>(MacroOffset);
 
     if (NeedDecls && IdResolver) {
diff --git a/clang/test/Modules/shadow-macro.c b/clang/test/Modules/shadow-macro.c
new file mode 100644
index 0000000000000..4cd7cf0500322
--- /dev/null
+++ b/clang/test/Modules/shadow-macro.c
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/modules.cache \
+// RUN:            -fsyntax-only %t/test.c -verify
+// Test again with the populated module cache.
+// RUN: %clang_cc1 -fmodules -fimplicit-module-maps -fmodules-cache-path=%t/modules.cache \
+// RUN:            -fsyntax-only %t/test.c -verify
+
+// Test that an identifier with the same name as a macro doesn't hide this
+// macro from the includers.
+
+//--- macro-definition.h
+#define __P(protos) ()
+#define __Q(protos) ()
+
+//--- macro-transitive.h
+#include "macro-definition.h"
+void test(int __P) {} // not "interesting" identifier
+struct __Q {};        // "interesting" identifier
+
+//--- module.modulemap
+module MacroDefinition { header "macro-definition.h" export * }
+module MacroTransitive { header "macro-transitive.h" export * }
+
+//--- test.c
+// expected-no-diagnostics
+#include "macro-transitive.h"
+void foo __P(());
+void bar __Q(());

@vsapsai
Copy link
Collaborator Author

vsapsai commented Apr 12, 2025

cc @bcardosolopes @vgvassilev as participants of the original bug discussion.

@vgvassilev
Copy link
Contributor

Thanks! Does the pr fix https://bugs.llvm.org//show_bug.cgi?id=32670

@vsapsai
Copy link
Collaborator Author

vsapsai commented Apr 14, 2025

Thanks! Does the pr fix https://bugs.llvm.org//show_bug.cgi?id=32670

Nope. The test case in #32017 is still failing. I believe it is a different issue as we don't have any identifiers with the same names as macros. And seems like the behavior wasn't affected by d79514e (aka r259901) while #32040 was exposed by that change.

Copy link
Contributor

@vgvassilev vgvassilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch looks reasonable to me, however, I'd wait for somebody with more experience in the macro logic in modules.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, looked at the other tests in Modules directory and think that the test should start with "macro-" for consistency. Probably something like "macro-identifier-hiding.c" as we have "macro-hiding.cpp" already.

Copy link
Contributor

@jansvoboda11 jansvoboda11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The behavior is a bit subtle (always was), but it's good to have this edge-case fixed.

@vsapsai vsapsai merged commit 81739c3 into llvm:main Apr 16, 2025
11 checks passed
@vsapsai vsapsai deleted the shadow-macro-identifier branch April 16, 2025 17:15
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…lvm#135471)

We emit a macro definition only in a module defining it. But it means
that if another module has an identifier with the same name as the
macro, the users of such module won't be able to use the macro anymore.

Fix by storing that an identifier has a macro definition that's not in a
current module (`MacroDirectivesOffset == 0`). This way
`IdentifierLookupVisitor` knows not to stop at the first module with an
identifier but to keep checking included modules for the actual macro
definition.

Fixes issue llvm#32040.

rdar://30258278
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants