Skip to content

[clang] Fix a crash issue that caused by handling of fields with initializers in nested anonymous unions #113049

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 10, 2024

Conversation

yronglin
Copy link
Contributor

@yronglin yronglin commented Oct 19, 2024

Fixes: #112560

This PR create an RecoveryExpr for invalid in-class-initializer.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Oct 19, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 19, 2024

@llvm/pr-subscribers-clang

Author: None (yronglin)

Changes

Fix a crash issue that caused by handling of fields with initializers in nested anonymous unions.
If my thinking is wrong, I apologize, and IIUC seems it's caused by 5ae5774.


Full diff: https://github.com/llvm/llvm-project/pull/113049.diff

2 Files Affected:

  • (modified) clang/lib/Sema/SemaInit.cpp (+10-5)
  • (modified) clang/test/SemaCXX/cxx1y-initializer-aggregates.cpp (+11)
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index f560865681fa5a..4878dcb5d2c8e0 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -868,13 +868,19 @@ InitListChecker::FillInEmptyInitializations(const InitializedEntity &Entity,
 
   if (const RecordType *RType = ILE->getType()->getAs<RecordType>()) {
     const RecordDecl *RDecl = RType->getDecl();
-    if (RDecl->isUnion() && ILE->getInitializedFieldInUnion()) {
+    if (RDecl->isUnion() && ILE->getInitializedFieldInUnion())
       FillInEmptyInitForField(0, ILE->getInitializedFieldInUnion(), Entity, ILE,
                               RequiresSecondPass, FillWithNoInit);
+    else if (RDecl->isUnion() && isa<CXXRecordDecl>(RDecl) &&
+             cast<CXXRecordDecl>(RDecl)->hasInClassInitializer()) {
+      for (auto *Field : RDecl->fields()) {
+        if (Field->hasInClassInitializer()) {
+          FillInEmptyInitForField(0, Field, Entity, ILE, RequiresSecondPass,
+                                  FillWithNoInit);
+          break;
+        }
+      }
     } else {
-      assert((!RDecl->isUnion() || !isa<CXXRecordDecl>(RDecl) ||
-              !cast<CXXRecordDecl>(RDecl)->hasInClassInitializer()) &&
-             "We should have computed initialized fields already");
       // The fields beyond ILE->getNumInits() are default initialized, so in
       // order to leave them uninitialized, the ILE is expanded and the extra
       // fields are then filled with NoInitExpr.
@@ -2296,7 +2302,6 @@ void InitListChecker::CheckStructUnionTypes(
           return;
         }
       }
-      llvm_unreachable("Couldn't find in-class initializer");
     }
 
     // Value-initialize the first member of the union that isn't an unnamed
diff --git a/clang/test/SemaCXX/cxx1y-initializer-aggregates.cpp b/clang/test/SemaCXX/cxx1y-initializer-aggregates.cpp
index 03a6800898d18f..8360b8fd7d8ee2 100644
--- a/clang/test/SemaCXX/cxx1y-initializer-aggregates.cpp
+++ b/clang/test/SemaCXX/cxx1y-initializer-aggregates.cpp
@@ -115,3 +115,14 @@ namespace nested_union {
   // of Test3, or we should exclude f(Test3) as a candidate.
   static_assert(f({1}) == 2, ""); // expected-error {{call to 'f' is ambiguous}}
 }
+
+// Fix crash issue https://github.com/llvm/llvm-project/issues/112560.
+// Make sure clang compiles the following code without crashing:
+namespace GH112560 {
+union U {
+  int f = ; // expected-error {{expected expression}}
+};
+void foo() {
+  U g{};
+}
+} // namespace GH112560

// Make sure clang compiles the following code without crashing:
namespace GH112560 {
union U {
int f = ; // expected-error {{expected expression}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, the reason the code in SemaInit is getting confused is that the union is marked "hasInClassInitializer" because of the initializer... but then there isn't actually an initializer anywhere because we failed to parse it.

I think hasInClassInitializer() on RecordDecl needs to be consistent with hasInClassInitializer() on FieldDecl. If hasInClassInitializer is true, there should be an initializer somewhere. Either we should pretend the user didn't write the "=", or we should construct a RecoveryExpr to represent the missing initializer. Adding extra handling to every bit of code that checks hasInClassInitializer spreads out error handling to places it shouldn't exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! Make sense, there should have a RecoveryExpr as the initializer. I'd like to give a try.

@shafik
Copy link
Collaborator

shafik commented Nov 16, 2024

Just wanted to ping on this and see how it was going.

@efriedma-quic
Copy link
Collaborator

On my end, still waiting for review comments to be addressed.

…ializers in nested anonymous unions

Signed-off-by: yronglin <[email protected]>
@yronglin
Copy link
Contributor Author

yronglin commented Nov 20, 2024

Sorry for the very late update, I had some troubles in the past two months.

This update check the number of C++11 in-class-initializer in the class and update CXXRecordDecl::DefinitionData::HasInClassInitializer when calling FieldDecl::hasInClassInitializer().

I think we might go futhur to create RecoveryExpr for invalid in-class-initializer in FieldDecl, but this will interrupt the processing of the rest class members. e.g. The "default member initializer for 'r' needed" will missing because FibTree *l is invalid in the following:

namespace use_self {
  struct FibTree {
    int n;
    FibTree *l = // expected-note {{declared here}}
      n > 1 ? new FibTree{n-1} : &fib0; // expected-error {{default member initializer for 'l' needed}}
    FibTree *r = // expected-note {{declared here}}
      n > 2 ? new FibTree{n-2} : &fib0; // expected-error {{default member initializer for 'r' needed}}
    int v = l->v + r->v;

    static FibTree fib0;
  };
  FibTree FibTree::fib0{0, nullptr, nullptr, 1};

  int fib(int n) { return FibTree{n}.v; }
}

The hadError flag will be true after filled in FibTree *l.

for (auto *Field : RDecl->fields()) {
if (Field->isUnnamedBitField())
continue;
if (hadError)
return;
FillInEmptyInitForField(Init, Field, Entity, ILE, RequiresSecondPass,
FillWithNoInit);
if (hadError)
return;
++Init;
// Only look at the first initialization of a union.
if (RDecl->isUnion())
break;
}

Sould we create an RecoveryExpr for invalid in-class-initializer? WDYT?

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Nov 20, 2024

We could maybe look at setting hadError less aggressively in InitListChecker, for cases where the error is unlikely to impact the overall parse. If a member has a RecoveryExpr as its initializer, it's probably reasonable to continue producing error messages.

That said, I don't really care if we suppress the second error message in your FibTree example. It's not a situation I'd expect people to trip over regularly.


When I first looked at the patch, I didn't realize the error messages were coming out of delayed C++ field parsing. I don't think the current version of the patch is viable: modifying RecordDecl::hasInClassInitializer() after it's already been queried is likely to lead to strange results.

@yronglin yronglin requested a review from Endilll as a code owner November 24, 2024 16:10
@yronglin
Copy link
Contributor Author

Thanks for your review! The latest update add a RecoveryExpr for invalid in-class-initializer. But I have some concerns about the inconsistency between RecordDecl::hasInClassInitializer() and FieldDecl::hasInClassInitializer(). FieldDecl has a method removeInClassInitializer to modifiy the AST Node.

@yronglin yronglin requested a review from hokein November 24, 2024 16:16
Copy link
Collaborator

@hokein hokein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attaching an empty RecoveryExpr as an initializer to indicate that initialization was attempted but failed seems reasonable to me. I'll leave the approval to @efriedma-quic.

Signed-off-by: yronglin <[email protected]>
@yronglin
Copy link
Contributor Author

friendly ping~

Copy link
Collaborator

@hokein hokein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the PR description still up-to-date?

@yronglin
Copy link
Contributor Author

yronglin commented Dec 4, 2024

Is the PR description still up-to-date?

Yeah! The description out-of-date.


// We do not want to aggressively set the hadError flag and cutoff
// parsing. Try to recover when in-class-initializer had errors.
if (Field->getInClassInitializer() &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the purpose of this code. Why are we not handling this in BuildCXXDefaultInitExpr()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've move the code into BuildCXXDefaultInitExpr(), the original purpose of putting it here is to return quickly and avoid entering any context again.

    ExprResult DIE;
      {
        // Enter a default initializer rebuild context, then we can support
        // lifetime extension of temporary created by aggregate initialization
        // using a default member initializer.
        // CWG1815 (https://wg21.link/CWG1815).
        EnterExpressionEvaluationContext RebuildDefaultInit(
            SemaRef, Sema::ExpressionEvaluationContext::PotentiallyEvaluated);
        SemaRef.currentEvaluationContext().RebuildDefaultArgOrDefaultInit =
            true;
        SemaRef.currentEvaluationContext().DelayedDefaultInitializationContext =
            SemaRef.parentEvaluationContext()
                .DelayedDefaultInitializationContext;
        SemaRef.currentEvaluationContext().InLifetimeExtendingContext =
            SemaRef.parentEvaluationContext().InLifetimeExtendingContext;
        DIE = SemaRef.BuildCXXDefaultInitExpr(Loc, Field);
      }

// in-class-initializer had errors.
if (Field->getInClassInitializer() &&
Field->getInClassInitializer()->containsErrors())
return Field->getInClassInitializer();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the idea here is that we're throwing away too much if we just return an ExprError? I assume everything would work without crashing, we'd just produce slightly different diagnostics. So... I guess my first question is, do we really need to do this in this patch, as opposed to a followup?

BuildCXXDefaultInitExpr is supposed to, as the name suggests, return a CXXDefaultInitExpr. As-is, this is going to produce weird results with erroneous expressions that aren't just a RecoveryExpr, like typo corrections. Maybe we can explicitly create a RecoveryExpr here, though, nested inside the CXXDefaultInitExpr, or something like that.

Copy link
Contributor Author

@yronglin yronglin Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the idea here is that we're throwing away too much if we just return an ExprError?

Yes.

BuildCXXDefaultInitExpr is supposed to, as the name suggests, return a CXXDefaultInitExpr.

Agree.

Should we remove this code from BuildCXXDefaultInitExpr? Because we createdRecoveryExpr in Sema::ActOnFinishCXXInClassMemberInitializer.

  // If we might have already tried and failed to instantiate, don't try again.
  if (Field->isInvalidDecl())
    return ExprError();

and the AST will be:

-CompoundStmt 0x12480cb70 <col:12, line:494:1> 
        537:       `-DeclStmt 0x12480cb58 <line:493:3, col:8> 
        538:         `-VarDecl 0x12480ca40 <col:3, col:7> col:5 g 'U':'GH112560::U' listinit 
        539:           `-InitListExpr 0x12480cae8 <col:6, col:7> 'U':'GH112560::U' contains-errors field Field 0x12480c870 'f' 'int' 
        540:             `-CXXDefaultInitExpr 0x12480cb30 <col:7> 'int' contains-errors has rewritten init 
        541:               `-RecoveryExpr 0x12480c910 <line:482:9> 'int' contains-errors 

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I'm a little concerned this is going to lead to other crashes due to exposing more codepaths to "invalid" decls. But it seems like it's doing the right thing in the given cases.

@yronglin
Copy link
Contributor Author

Thanks for the review!

@yronglin yronglin merged commit 740861d into llvm:main Dec 10, 2024
8 checks passed
@yronglin
Copy link
Contributor Author

LGTM

I'm a little concerned this is going to lead to other crashes due to exposing more codepaths to "invalid" decls. But it seems like it's doing the right thing in the given cases.

Yeah, if the PR breaks clang, I'll revert it quickly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang++] Couldn't find in-class initializer UNREACHABLE executed at /root/llvm-project/clang/lib/Sema/SemaInit.cpp:2299!
5 participants