-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang] Do not serialize function definitions without a body #121550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang] Do not serialize function definitions without a body #121550
Conversation
An instantiated templated function definition may not have a body due to parsing errors inside the templated function. When serializing, an assert is tripped inside `ASTRecordWriter::AddFunctionDefinition` when building with assertions enabled. The instantiation may happen on an intermediate module. The test case was reduced from `mp-units`.
@llvm/pr-subscribers-clang-modules Author: Alejandro Álvarez Ayllón (alejandro-alvarez-sonarsource) ChangesAn instantiated templated function definition may not have a body due to parsing errors inside the templated function. When serializing, an assert is triggered inside The instantiation may happen on an intermediate module. The test case was reduced from Full diff: https://github.com/llvm/llvm-project/pull/121550.diff 2 Files Affected:
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 4a6027943072c0..36b9e3e2ba1720 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -6227,8 +6227,10 @@ void ASTWriter::WriteDeclUpdatesBlocks(ASTContext &Context,
// Add a trailing update record, if any. These must go last because we
// lazily load their attached statement.
if (!GeneratingReducedBMI || !CanElideDeclDef(D)) {
- if (HasUpdatedBody) {
- const auto *Def = cast<FunctionDecl>(D);
+ assert(!(HasUpdatedBody && HasAddedVarDefinition) &&
+ "Declaration can not be both a FunctionDecl and a VarDecl");
+ if (const auto *Def = dyn_cast<FunctionDecl>(D);
+ HasUpdatedBody && Def->doesThisDeclarationHaveABody()) {
Record.push_back(UPD_CXX_ADDED_FUNCTION_DEFINITION);
Record.push_back(Def->isInlined());
Record.AddSourceLocation(Def->getInnerLocStart());
diff --git a/clang/test/Modules/missing-body-in-import.cpp b/clang/test/Modules/missing-body-in-import.cpp
new file mode 100644
index 00000000000000..b52ebba15087a3
--- /dev/null
+++ b/clang/test/Modules/missing-body-in-import.cpp
@@ -0,0 +1,42 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: cd %t
+
+// RUN: %clang_cc1 -std=c++23 mod1.cppm -emit-module-interface -o mod1.pcm -fallow-pcm-with-compiler-errors -verify
+// RUN: %clang_cc1 -std=c++23 mod2.cppm -emit-module-interface -o mod2.pcm -fmodule-file=mod1=mod1.pcm -verify -fallow-pcm-with-compiler-errors
+// RUN: %clang_cc1 -std=c++23 mod3.cppm -emit-module-interface -o mod3.pcm -fmodule-file=mod1=mod1.pcm -fmodule-file=mod2=mod2.pcm -verify -fallow-pcm-with-compiler-errors
+// RUN: %clang_cc1 -std=c++23 main.cpp -fmodule-file=mod1=mod1.pcm -fmodule-file=mod2=mod2.pcm -fmodule-file=mod3=mod3.pcm -verify -fallow-pcm-with-compiler-errors -ast-dump-all
+
+//--- mod1.cppm
+export module mod1;
+
+export template <unsigned N, unsigned M>
+class A {
+public:
+ constexpr A(const char[], const char[]) {
+ auto x = BrokenExpr; // expected-error {{use of undeclared identifier 'BrokenExpr'}}
+ }
+};
+
+export template<A<1,1> NTTP>
+struct B {};
+
+template < unsigned N, unsigned M >
+A(const char (&)[N], const char (&)[M]) -> A< 1, 1 >;
+
+//--- mod2.cppm
+export module mod2;
+import mod1;
+
+struct C: B <A{"a", "b"}> { // expected-error {{non-type template argument is not a constant expression}}
+ constexpr C(int a) { }
+};
+
+//--- mod3.cppm
+// expected-no-diagnostics
+export module mod3;
+export import mod2;
+
+//--- main.cpp
+// expected-no-diagnostics
+import mod3; // no crash
|
@llvm/pr-subscribers-clang Author: Alejandro Álvarez Ayllón (alejandro-alvarez-sonarsource) ChangesAn instantiated templated function definition may not have a body due to parsing errors inside the templated function. When serializing, an assert is triggered inside The instantiation may happen on an intermediate module. The test case was reduced from Full diff: https://github.com/llvm/llvm-project/pull/121550.diff 2 Files Affected:
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 4a6027943072c0..36b9e3e2ba1720 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -6227,8 +6227,10 @@ void ASTWriter::WriteDeclUpdatesBlocks(ASTContext &Context,
// Add a trailing update record, if any. These must go last because we
// lazily load their attached statement.
if (!GeneratingReducedBMI || !CanElideDeclDef(D)) {
- if (HasUpdatedBody) {
- const auto *Def = cast<FunctionDecl>(D);
+ assert(!(HasUpdatedBody && HasAddedVarDefinition) &&
+ "Declaration can not be both a FunctionDecl and a VarDecl");
+ if (const auto *Def = dyn_cast<FunctionDecl>(D);
+ HasUpdatedBody && Def->doesThisDeclarationHaveABody()) {
Record.push_back(UPD_CXX_ADDED_FUNCTION_DEFINITION);
Record.push_back(Def->isInlined());
Record.AddSourceLocation(Def->getInnerLocStart());
diff --git a/clang/test/Modules/missing-body-in-import.cpp b/clang/test/Modules/missing-body-in-import.cpp
new file mode 100644
index 00000000000000..b52ebba15087a3
--- /dev/null
+++ b/clang/test/Modules/missing-body-in-import.cpp
@@ -0,0 +1,42 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: cd %t
+
+// RUN: %clang_cc1 -std=c++23 mod1.cppm -emit-module-interface -o mod1.pcm -fallow-pcm-with-compiler-errors -verify
+// RUN: %clang_cc1 -std=c++23 mod2.cppm -emit-module-interface -o mod2.pcm -fmodule-file=mod1=mod1.pcm -verify -fallow-pcm-with-compiler-errors
+// RUN: %clang_cc1 -std=c++23 mod3.cppm -emit-module-interface -o mod3.pcm -fmodule-file=mod1=mod1.pcm -fmodule-file=mod2=mod2.pcm -verify -fallow-pcm-with-compiler-errors
+// RUN: %clang_cc1 -std=c++23 main.cpp -fmodule-file=mod1=mod1.pcm -fmodule-file=mod2=mod2.pcm -fmodule-file=mod3=mod3.pcm -verify -fallow-pcm-with-compiler-errors -ast-dump-all
+
+//--- mod1.cppm
+export module mod1;
+
+export template <unsigned N, unsigned M>
+class A {
+public:
+ constexpr A(const char[], const char[]) {
+ auto x = BrokenExpr; // expected-error {{use of undeclared identifier 'BrokenExpr'}}
+ }
+};
+
+export template<A<1,1> NTTP>
+struct B {};
+
+template < unsigned N, unsigned M >
+A(const char (&)[N], const char (&)[M]) -> A< 1, 1 >;
+
+//--- mod2.cppm
+export module mod2;
+import mod1;
+
+struct C: B <A{"a", "b"}> { // expected-error {{non-type template argument is not a constant expression}}
+ constexpr C(int a) { }
+};
+
+//--- mod3.cppm
+// expected-no-diagnostics
+export module mod3;
+export import mod2;
+
+//--- main.cpp
+// expected-no-diagnostics
+import mod3; // no crash
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks correct but I'd like @ChuanqiXu9 to confirm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, previously we didn't allow -fallow-pcm-with-compiler-errors
so we didn't meet such problems. Out of curiosity (not a blocking issue), why do you want this feature for C++20 modules?
if (const auto *Def = dyn_cast<FunctionDecl>(D); | ||
HasUpdatedBody && Def->doesThisDeclarationHaveABody()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically, it looks better to not set HasUpdatedBody
if the declaration doesn't have a body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed as suggested.
Sanitizers, linters and indexers have to deal with code using non supported C++ language features or extensions. This is also true for C++20 modules. While conceivably one may implement module support without going through the intermediate PCM, to keep the semantics close to what the compiler does, the most straightforward is to do what the compiler does and go throught the same codepath as when actually compiling 😃 |
Just to make sure I understand things correctly, you're saying:
You hope the linter doesn't give up But why Sanitizers? IIRC, they should only work if the project compiles. Or are you saying the static analyzer? |
Precisely.
Sorry, my mind slipped there. Indeed I meant analyzers. |
if (Kind == UPD_CXX_ADDED_FUNCTION_DEFINITION) { | ||
assert(isa<FunctionDecl>(D) && "expected FunctionDecl"); | ||
HasUpdatedBody = dyn_cast<FunctionDecl>(D)->hasBody(); | ||
} else if (Kind == UPD_CXX_ADDED_VAR_DEFINITION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} else if (Kind == UPD_CXX_ADDED_VAR_DEFINITION) | |
} else if (Kind == UPD_CXX_ADDED_VAR_DEFINITION) { |
Please see LLVM's policy for the use of {
: https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements
we hope the style is consistent with the if-else branches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
if (Kind == UPD_CXX_ADDED_FUNCTION_DEFINITION) { | ||
assert(isa<FunctionDecl>(D) && "expected FunctionDecl"); | ||
HasUpdatedBody = dyn_cast<FunctionDecl>(D)->hasBody(); | ||
} else if (Kind == UPD_CXX_ADDED_VAR_DEFINITION) | ||
HasAddedVarDefinition = true; | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto,
else | |
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with nits
Co-authored-by: Chuanqi Xu <[email protected]>
else if (Kind == UPD_CXX_ADDED_VAR_DEFINITION) | ||
if (Kind == UPD_CXX_ADDED_FUNCTION_DEFINITION) { | ||
assert(isa<FunctionDecl>(D) && "expected FunctionDecl"); | ||
HasUpdatedBody = cast<FunctionDecl>(D)->hasBody(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to skip this on the last iteration. We should use doesThisDeclarationHaveABody
instead of hasBody
. Since hasBody
may be more expensive.
And also it is still slightly better to not set UPD_CXX_ADDED_FUNCTION_DEFINITION
if the function doesn't have a body. We can check this in CompletedImplicitDefinition and VariableDefinitionInstantiated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem! I want the get it right, not to rush it :D
Changed, although I imagine you meant FunctionDefinitionInstantiated (not VariableDefinitionInstantiated)
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now.
We can land this after you fix the formatting issue. I generally use the following command to format my patch:
git diff -U0 --no-color --relative HEAD^ | clang/tools/clang-format/clang-format-diff.py -p1 -i
feel free to use other methods
@@ -7230,6 +7230,9 @@ void ASTWriter::CompletedImplicitDefinition(const FunctionDecl *D) { | |||
if (!D->isFromASTFile()) | |||
return; // Declaration not imported from PCH. | |||
|
|||
if (!D->doesThisDeclarationHaveABody()) | |||
return; // The function definition may not have a body due to parsing errors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to see the comments above the if.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved
7b2f79d
to
cefbcf0
Compare
Thanks for the review! Would you mind doing the merge? I do not have permission. |
I am just waiting the CI to be green : ) |
…1550) An instantiated templated function definition may not have a body due to parsing errors inside the templated function. When serializing, an assert is triggered inside `ASTRecordWriter::AddFunctionDefinition`. The instantiation may happen on an intermediate module. The test case was reduced from `mp-units`.
An instantiated templated function definition may not have a body due to parsing errors inside the templated function. When serializing, an assert is triggered inside
ASTRecordWriter::AddFunctionDefinition
.The instantiation may happen on an intermediate module.
The test case was reduced from
mp-units
.