Skip to content

[Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros #90574

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Jul 20, 2024

Conversation

yronglin
Copy link
Contributor

@yronglin yronglin commented Apr 30, 2024

This PR implement P3034R1 Module Declarations Shouldn’t be Macros, and refactor the convoluted state machines in module name lexical analysis.

@yronglin yronglin requested review from cor3ntin and ChuanqiXu9 April 30, 2024 09:20
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules labels Apr 30, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 30, 2024

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: None (yronglin)

Changes

This PR implement P3034R1 Module Declarations Shouldn’t be Macros


Full diff: https://github.com/llvm/llvm-project/pull/90574.diff

9 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/include/clang/Basic/DiagnosticParseKinds.td (+2)
  • (modified) clang/lib/Parse/Parser.cpp (+7)
  • (added) clang/test/CXX/cpp/cpp.module/p1.cppm (+13)
  • (added) clang/test/CXX/cpp/cpp.module/version.h (+8)
  • (modified) clang/test/CXX/module/basic/basic.link/module-declaration.cpp (+20-15)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.import/p1.cppm (+2-2)
  • (modified) clang/test/SemaCXX/modules.cppm (+4)
  • (modified) clang/www/cxx_status.html (+1-1)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1abc00a25f1f42..40c6bd63e9948f 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -153,6 +153,8 @@ C++2c Feature Support
 
 - Implemented `P2748R5 Disallow Binding a Returned Glvalue to a Temporary <https://wg21.link/P2748R5>`_.
 
+- Implemented `P3034R1 Module Declarations Shouldn’t be Macros <https://wg21.link/P3034R1>`_.
+
 Resolutions to C++ Defect Reports
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 - Substitute template parameter pack, when it is not explicitly specified
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index fdffb35ea0d955..0d4b526ec6d15a 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1671,6 +1671,8 @@ def err_unexpected_module_decl : Error<
   "module declaration can only appear at the top level">;
 def err_module_expected_ident : Error<
   "expected a module name after '%select{module|import}0'">;
+def err_module_decl_cannot_be_macros : Error<
+  "module declaration cannot be a macro">;
 def err_attribute_not_module_attr : Error<
   "%0 attribute cannot be applied to a module">;
 def err_keyword_not_module_attr : Error<
diff --git a/clang/lib/Parse/Parser.cpp b/clang/lib/Parse/Parser.cpp
index adcbe5858bc78e..ef66348a83125c 100644
--- a/clang/lib/Parse/Parser.cpp
+++ b/clang/lib/Parse/Parser.cpp
@@ -2690,6 +2690,13 @@ bool Parser::ParseModuleName(
       return true;
     }
 
+    // P3034R1: Module Declarations Shouldn’t be Macros
+    if (!IsImport && Tok.getLocation().isMacroID()) {
+      Diag(Tok, diag::err_module_decl_cannot_be_macros);
+      SkipUntil(tok::semi);
+      return true;
+    }
+
     // Record this part of the module path.
     Path.push_back(std::make_pair(Tok.getIdentifierInfo(), Tok.getLocation()));
     ConsumeToken();
diff --git a/clang/test/CXX/cpp/cpp.module/p1.cppm b/clang/test/CXX/cpp/cpp.module/p1.cppm
new file mode 100644
index 00000000000000..b439366db3fba0
--- /dev/null
+++ b/clang/test/CXX/cpp/cpp.module/p1.cppm
@@ -0,0 +1,13 @@
+// RUN: %clang_cc1 -std=c++20 -emit-module-interface %s -triple x86_64-linux-gnu -DTEST=1 -verify
+// RUN: %clang_cc1 -std=c++20 -emit-module-interface %s -triple x86_64-linux-gnu -DTEST=2 -verify
+
+module;
+export module x;
+#include "version.h"
+#if TEST == 1
+export module VERSION;  // expected-error {{module declaration cannot be a macro}}
+#endif // TEST == 1
+
+#if TEST == 2
+export module A.B;      // expected-error {{module declaration cannot be a macro}}
+#endif // TEST == 2
diff --git a/clang/test/CXX/cpp/cpp.module/version.h b/clang/test/CXX/cpp/cpp.module/version.h
new file mode 100644
index 00000000000000..4608934290950b
--- /dev/null
+++ b/clang/test/CXX/cpp/cpp.module/version.h
@@ -0,0 +1,8 @@
+#ifndef VERSION_H
+#define VERSION_H
+
+#define VERSION libv5
+#define A a
+#define B b
+
+#endif
diff --git a/clang/test/CXX/module/basic/basic.link/module-declaration.cpp b/clang/test/CXX/module/basic/basic.link/module-declaration.cpp
index d71358cc7a571f..aa4bb52a57face 100644
--- a/clang/test/CXX/module/basic/basic.link/module-declaration.cpp
+++ b/clang/test/CXX/module/basic/basic.link/module-declaration.cpp
@@ -9,26 +9,26 @@
 //
 // Module implementation for unknown and known module. (The former is ill-formed.)
 // RUN: %clang_cc1 -std=c++20 -I%t -fmodule-file=x.y=%t/x.y.pcm -verify -x c++ %t/M.cpp \
-// RUN:            -DTEST=1 -DEXPORT= -DMODULE_NAME=z
+// RUN:            -DTEST=1
 // RUN: %clang_cc1 -std=c++20 -I%t -fmodule-file=x=%t/x.pcm -fmodule-file=x.y=%t/x.y.pcm -verify -x c++ %t/M.cpp \
-// RUN:            -DTEST=2 -DEXPORT= -DMODULE_NAME=x
+// RUN:            -DTEST=2
 //
 // Module interface for unknown and known module. (The latter is ill-formed due to
 // redefinition.)
 // RUN: %clang_cc1 -std=c++20 -I%t -fmodule-file=x.y=%t/x.y.pcm -verify %t/M.cpp \
-// RUN:            -DTEST=3 -DEXPORT=export -DMODULE_NAME=z
+// RUN:            -DTEST=3
 // RUN: %clang_cc1 -std=c++20 -I%t -fmodule-file=x.y=%t/x.y.pcm -verify %t/M.cpp \
-// RUN:            -DTEST=4 -DEXPORT=export -DMODULE_NAME=x
+// RUN:            -DTEST=4
 //
 // Miscellaneous syntax.
 // RUN: %clang_cc1 -std=c++20 -I%t -fmodule-file=x.y=%t/x.y.pcm -verify %t/M.cpp \
-// RUN:            -DTEST=7 -DEXPORT=export -DMODULE_NAME='z elderberry'
+// RUN:            -DTEST=7
 // RUN: %clang_cc1 -std=c++20 -I%t -fmodule-file=x.y=%t/x.y.pcm -verify %t/M.cpp \
-// RUN:            -DTEST=8 -DEXPORT=export -DMODULE_NAME='z [[]]'
+// RUN:            -DTEST=8
 // RUN: %clang_cc1 -std=c++20 -I%t -fmodule-file=x.y=%t/x.y.pcm -verify %t/M.cpp \
-// RUN:            -DTEST=9 -DEXPORT=export -DMODULE_NAME='z [[fancy]]'
+// RUN:            -DTEST=9
 // RUN: %clang_cc1 -std=c++20 -I%t -fmodule-file=x.y=%t/x.y.pcm -verify %t/M.cpp \
-// RUN:            -DTEST=10 -DEXPORT=export -DMODULE_NAME='z [[maybe_unused]]'
+// RUN:            -DTEST=10
 
 //--- x.cppm
 export module x;
@@ -40,15 +40,20 @@ int c;
 
 //--- M.cpp
 
-EXPORT module MODULE_NAME;
-#if TEST == 7
-// expected-error@-2 {{expected ';'}} expected-error@-2 {{a type specifier is required}}
+#if TEST == 1
+module z; // expected-error {{module 'z' not found}}
+#elif TEST == 2
+module x; // expected-no-diagnostics
+#elif TEST == 3
+export module z; // expected-no-diagnostics
+#elif TEST == 4
+export module x; // expected-no-diagnostics
+#elif TEST == 7
+export module z elderberry; // expected-error {{expected ';'}} expected-error {{a type specifier is required}}
 #elif TEST == 9
-// expected-warning@-4 {{unknown attribute 'fancy' ignored}}
+export module z [[fancy]]; // expected-warning {{unknown attribute 'fancy' ignored}}
 #elif TEST == 10
-// expected-error-re@-6 {{'maybe_unused' attribute cannot be applied to a module{{$}}}}
-#elif TEST == 1
-// expected-error@-8 {{module 'z' not found}}
+export module z [[maybe_unused]]; // expected-error-re {{'maybe_unused' attribute cannot be applied to a module{{$}}}}
 #else
 // expected-no-diagnostics
 #endif
diff --git a/clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.import/p1.cppm b/clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.import/p1.cppm
index 873e4c0edeac25..074589ccc26926 100644
--- a/clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.import/p1.cppm
+++ b/clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.import/p1.cppm
@@ -35,9 +35,9 @@ int use_3 = c; // expected-error {{use of undeclared identifier 'c'}}
 
 //--- test.cpp
 #ifdef INTERFACE
-export module MODULE_NAME;
+export module MODULE_NAME; // expected-error {{module declaration cannot be a macro}}
 #else
-module MODULE_NAME;
+module MODULE_NAME; // expected-error {{module declaration cannot be a macro}}
 #endif
 
 import x;
diff --git a/clang/test/SemaCXX/modules.cppm b/clang/test/SemaCXX/modules.cppm
index 41204be76eafa1..75bbc5366d4a71 100644
--- a/clang/test/SemaCXX/modules.cppm
+++ b/clang/test/SemaCXX/modules.cppm
@@ -7,6 +7,10 @@
 // expected-no-diagnostics
 #endif
 
+#if TEST == 3
+// expected-error {{module declaration cannot be a macro}}
+#endif //  TEST == 3
+
 export module foo;
 
 static int m;
diff --git a/clang/www/cxx_status.html b/clang/www/cxx_status.html
index 0996abc2405857..8ae9a25caf3604 100755
--- a/clang/www/cxx_status.html
+++ b/clang/www/cxx_status.html
@@ -182,7 +182,7 @@ <h2 id="cxx26">C++2c implementation status</h2>
  <tr>
   <td>Module Declarations Shouldn’t be Macros</td>
   <td><a href="https://wg21.link/P3034R1">P3034R1</a> (<a href="#dr">DR</a>)</td>
-  <td class="none" align="center">No</td>
+  <td class="unreleased" align="center">Clang 19</td>
  </tr>
  <tr>
   <td>Trivial infinite loops are not Undefined Behavior</td>

Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise. I'd like to leave this to @cor3ntin

@yronglin
Copy link
Contributor Author

yronglin commented Apr 30, 2024

The paper does not clearly says whether disallow function-like macro is also needed, but I think disallow function-like macro has the same goal as the paper. WDYT? @cor3ntin @ChuanqiXu9

The wording in the paper said:
No identifier in the pp-module-name or pp-module-partition shall currently be defined as an object-like macro.

@cor3ntin
Copy link
Contributor

A function like macro needs parentheses to be substituted. As such,

#define foo()
module foo;

Should be allowed (and foo is not a macro expansion on line 2)

module foo(); should be Ill-formed.

@ChuanqiXu9
Copy link
Member

The paper does not clearly says whether disallow function-like macro is also needed, but I think disallow function-like macro has the same goal as the paper. WDYT? @cor3ntin @ChuanqiXu9

The wording in the paper said: No identifier in the pp-module-name or pp-module-partition shall currently be defined as an object-like macro.

The intention of the paper is, we can get the module name of a TU by cat and grep. So I feel the current wording is correct: https://godbolt.org/z/45xnsh7Eh

@yronglin
Copy link
Contributor Author

module foo(); should be Ill-formed.

Ahh, so case like the following also should be ill-formed:

#define FUNC_LIKE(X) function_like_##X
export module FUNC_LIKE(bar);

we can get the module name of a TU by cat and grep

This means that no macros of any kind can appear in the module name.

Thank you all. Things much clearer now.

@cor3ntin
Copy link
Contributor

This means that no macros of any kind can appear in the module name.

Exactly. This is why the grammar in the wording is defined in terms of pp-token and not identifiers: no macro expansion takes place at all.

@yronglin
Copy link
Contributor Author

This means that no macros of any kind can appear in the module name.

Exactly. This is why the grammar in the wording is defined in terms of pp-token and not identifiers: no macro expansion takes place at all.

Thanks! Let's use LexUnexpandedToken during parsing module name.

Copy link

github-actions bot commented May 3, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@yronglin
Copy link
Contributor Author

yronglin commented May 3, 2024

Dose this note have a conflict with P3034R1?

[Note 1: Each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens. — end note]

@cor3ntin
Copy link
Contributor

cor3ntin commented May 3, 2024

Dose this note have a conflict with P3034R1?

[Note 1: Each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens. — end note]

My reading is that

#define SOME_MACRO
module foo SOME_MACRO;

SOME_MACRO is expanded

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach looks good.

Do we have existing tests for the use of module as an identifier outside of a module declaration?

@cor3ntin cor3ntin requested a review from Bigcheese May 3, 2024 14:10
@yronglin
Copy link
Contributor Author

yronglin commented May 3, 2024

I think the approach looks good.

Do we have existing tests for the use of module as an identifier outside of a module declaration?

IIUC, do you mean something looks like the following? I didn't find it in the test case:

void foo() {
  int module = 0;
}

@yronglin
Copy link
Contributor Author

yronglin commented May 3, 2024

My reading is that

#define SOME_MACRO
module foo SOME_MACRO;

SOME_MACRO is expanded

If this is the case, the current processing in the Preprocessor::LexAfterModuleDecl function is incorrect. The implementation in this PR treats the tokens between module and ; as normal text(use LexUnexpandedToken). So I'm a bit confused about the wording of [cpp.module]/p3 and the note.

Any preprocessing tokens after the module preprocessing token in the module directive are processed just as in normal text.
[Note 1: Each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens. — end note]

@cor3ntin
Copy link
Contributor

cor3ntin commented May 4, 2024

I think the approach looks good.
Do we have existing tests for the use of module as an identifier outside of a module declaration?

IIUC, do you mean something looks like the following? I didn't find it in the test case:

void foo() {
  int module = 0;
}

Yes, it might be useful to have that somewhere

I think in ParseModuleDecl, you want to reset the lexer kind before parsing attributes.
(and add tests for that)

Here is a test:

#define ATTRS [[]]
#define SEMICOLON
export module unexpanded : unexpanded : ATTRS SEMICOLON

Sorry i did not notice that earlier.
The goal of the paper is that the name of the module is not a macro. everything else can be, including attributes attached to the module

@yronglin
Copy link
Contributor Author

yronglin commented May 4, 2024

Thanks! I'd like to move the implementation from Parser to Preprocessor. I think It would make more sense to do these checks during lexing.

@yronglin
Copy link
Contributor Author

Thanks for the review!

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think all the concerns have been addressed.

Give it until tomorrow before landing in case @ChuanqiXu9 @Bigcheese @AaronBallman have additional comments

Thanks a lot

@yronglin
Copy link
Contributor Author

Thank you, I can't finish this PR without your help.

Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more comments from me.

@yronglin yronglin merged commit e77a01d into llvm:main Jul 20, 2024
8 checks passed
@yronglin
Copy link
Contributor Author

I landed this PR and I'd be happy to follow up if you have any concerns.

@dcci
Copy link
Member

dcci commented Jul 22, 2024

This commit seems to have broken well-formed code in thrift:

fbcode/thrift/compiler/generate/t_py_generator.cc:881:20: error: declaration of variable 'module' with deduced type 'const auto &' requires an initializer
  881 |   for (const auto &module : modules) {
      |                    ^
fbcode/thrift/compiler/generate/t_py_generator.cc:881:27: error: expected ';' in 'for' statement specifier
  881 |   for (const auto &module : modules) {
      |                           ^
fbcode/thrift/compiler/generate/t_py_generator.cc:881:27: error: expected expression
fbcode/thrift/compiler/generate/t_py_generator.cc:881:27: error: expected ';' in 'for' statement specifier
4 errors generated.

@dcci
Copy link
Member

dcci commented Jul 22, 2024

Code in question:
https://github.com/facebook/fbthrift/blob/ccfcd6743488c26d720bc4ff3348534491078b43/thrift/compiler/generate/t_py_generator.cc#L881

^ I am trying to find out a reduced testcase, but sending this here in case it rings a bell.

@mizvekov
Copy link
Contributor

I am also seeing breakages which are solved by reverting this change.

The failure occurs when importing modules from within header files:

error: expected a module name after 'import'
   11 | import std;

@yronglin
Copy link
Contributor Author

Thank you for reporting this issue. if we have a reproducer, I'll try to quick fix it today.

@mizvekov
Copy link
Contributor

I don't think I'll be able to provide a reproducer today. Regardless, bugs in modules are in general not easy to reduce.
Unless you think you can fix this soon enough, I think a revert is appropriate here.

@yronglin
Copy link
Contributor Author

yronglin commented Jul 22, 2024

I am also seeing breakages which are solved by reverting this change.

The failure occurs when importing modules from within header files:

error: expected a module name after 'import'
   11 | import std;

It would be nice to have a reproducer. I can't tell the root cause from this error, so I can't use it to strengthen the current tests.

import std is definitely covered by the current test, but we can get more useful information if you provide some more context.

@yronglin
Copy link
Contributor Author

This commit seems to have broken well-formed code in thrift:

fbcode/thrift/compiler/generate/t_py_generator.cc:881:20: error: declaration of variable 'module' with deduced type 'const auto &' requires an initializer
  881 |   for (const auto &module : modules) {
      |                    ^
fbcode/thrift/compiler/generate/t_py_generator.cc:881:27: error: expected ';' in 'for' statement specifier
  881 |   for (const auto &module : modules) {
      |                           ^
fbcode/thrift/compiler/generate/t_py_generator.cc:881:27: error: expected expression
fbcode/thrift/compiler/generate/t_py_generator.cc:881:27: error: expected ';' in 'for' statement specifier
4 errors generated.

This is clearer, I think I may have figured out where the problem is.

@mizvekov
Copy link
Contributor

It would be nice to have a reproducer. I can't tell the root cause from this error, so I can't use it to strengthen the current tests.

I intend to provide a reproducer. I just can't do it today. So don't wait on it as a justification to avoid reverting.

@dcci
Copy link
Member

dcci commented Jul 22, 2024

This is clearer, I think I may have figured out where the problem is.

^ I have a reproducer that's running through creduce. It should be ready tomorrow morning PST time, and I'll add you here. In the meanwhile, do you mind if I revert this given we understand it breaks several cases so we can get our services building again? It affected many codebases.

@dcci
Copy link
Member

dcci commented Jul 22, 2024

Notice you already reverted, thanks. I'll follow-up with a reproducer once creduce finishes running.

@yronglin
Copy link
Contributor Author

Thanks!

@yronglin
Copy link
Contributor Author

It would be nice to have a reproducer. I can't tell the root cause from this error, so I can't use it to strengthen the current tests.

I intend to provide a reproducer. I just can't do it today. So don't wait on it as a justification to avoid reverting.

The PR has already been reverted before this, and no one is avoiding reverting it.

@cor3ntin
Copy link
Contributor

When you create a new PR, can you try to preserve the history of your commits, so that we can easily see what you changed? Thanks

@yronglin
Copy link
Contributor Author

When you create a new PR, can you try to preserve the history of your commits, so that we can easily see what you changed? Thanks

Sure, I'd be happy to do that! I will put the new changes in a separate commit. 😁

@dcci
Copy link
Member

dcci commented Jul 22, 2024

Reduced: #99894

@yronglin
Copy link
Contributor Author

After investigation, it was found that the root cause of the problem was that this PR mistakenly treated some identifiers named module as module-keywords, and packaged the subsequent identifiers into annot_module_name.

yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…0574)

Summary:
This PR implement [P3034R1 Module Declarations Shouldn’t be
Macros](https://wg21.link/P3034R1), and refactor the convoluted state
machines in module name lexical analysis.

---------

Signed-off-by: yronglin <[email protected]>
Co-authored-by: Aaron Ballman <[email protected]>
Co-authored-by: cor3ntin <[email protected]>

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251236
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…cros" (#99838)

Summary: Reverts #90574

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251154
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants