Skip to content

[clang][dataflow] Correctly treat empty initializer lists for unions. #82986

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 1, 2024

Conversation

martinboehme
Copy link
Contributor

This fixes a crash introduced by #82348
but also adds additional handling to make sure that we treat empty initializer
lists for both unions and structs/classes correctly (see tests added in this
patch).

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Feb 26, 2024
@llvmbot
Copy link
Member

llvmbot commented Feb 26, 2024

@llvm/pr-subscribers-clang-analysis

@llvm/pr-subscribers-clang

Author: None (martinboehme)

Changes

This fixes a crash introduced by #82348
but also adds additional handling to make sure that we treat empty initializer
lists for both unions and structs/classes correctly (see tests added in this
patch).


Full diff: https://github.com/llvm/llvm-project/pull/82986.diff

3 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+6-1)
  • (modified) clang/lib/Analysis/FlowSensitive/Transfer.cpp (+14-1)
  • (modified) clang/unittests/Analysis/FlowSensitive/TransferTest.cpp (+66-2)
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index 0cfc26ea952cda..fd7b06efcc7861 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -983,7 +983,7 @@ StorageLocation &Environment::createObjectInternal(const ValueDecl *D,
   }
 
   Value *Val = nullptr;
-  if (InitExpr)
+  if (InitExpr) {
     // In the (few) cases where an expression is intentionally
     // "uninterpreted", `InitExpr` is not associated with a value.  There are
     // two ways to handle this situation: propagate the status, so that
@@ -998,6 +998,11 @@ StorageLocation &Environment::createObjectInternal(const ValueDecl *D,
     // default value (assuming we don't update the environment API to return
     // references).
     Val = getValue(*InitExpr);
+
+    if (!Val && isa<ImplicitValueInitExpr>(InitExpr) &&
+        InitExpr->getType()->isPointerType())
+      Val = &getOrCreateNullPointerValue(InitExpr->getType()->getPointeeType());
+  }
   if (!Val)
     Val = createValue(Ty);
 
diff --git a/clang/lib/Analysis/FlowSensitive/Transfer.cpp b/clang/lib/Analysis/FlowSensitive/Transfer.cpp
index cd1f04e53cff68..d73965806a533f 100644
--- a/clang/lib/Analysis/FlowSensitive/Transfer.cpp
+++ b/clang/lib/Analysis/FlowSensitive/Transfer.cpp
@@ -685,9 +685,22 @@ class TransferVisitor : public ConstStmtVisitor<TransferVisitor> {
 
     // `S->inits()` contains all the initializer expressions, including the
     // ones for direct base classes.
-    auto Inits = S->inits();
+    ArrayRef<Expr *> Inits = S->inits();
     size_t InitIdx = 0;
 
+    // Unions initialized with an empty initializer list need special treatment.
+    // For structs/classes initialized with an empty initializer list, Clang
+    // puts `ImplicitValueInitExpr`s in `InitListExpr::inits()`, but for unions,
+    // it doesn't do this -- so we create an `ImplicitValueInitExpr` ourselves.
+    std::optional<ImplicitValueInitExpr> ImplicitValueInitForUnion;
+    SmallVector<Expr *> InitsForUnion;
+    if (S->getType()->isUnionType() && Inits.empty()) {
+      assert(FieldsForInit.size() == 1);
+      ImplicitValueInitForUnion.emplace(FieldsForInit.front()->getType());
+      InitsForUnion.push_back(&*ImplicitValueInitForUnion);
+      Inits = InitsForUnion;
+    }
+
     // Initialize base classes.
     if (auto* R = S->getType()->getAsCXXRecordDecl()) {
       assert(FieldsForInit.size() + R->getNumBases() == Inits.size());
diff --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index e7d74581865a32..2eb9a37f289426 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -2393,8 +2393,72 @@ TEST(TransferTest, InitListExprAsUnion) {
         auto &FLoc = getFieldLoc<RecordStorageLocation>(
             *Env.getThisPointeeStorageLocation(), "F", ASTCtx);
         auto *AVal = cast<PointerValue>(getFieldValue(&FLoc, "a", ASTCtx, Env));
-        ASSERT_EQ(AVal, &getValueForDecl<PointerValue>(ASTCtx, Env, "null"));
-        ASSERT_EQ(getFieldValue(&FLoc, "b", ASTCtx, Env), nullptr);
+        EXPECT_EQ(AVal, &getValueForDecl<PointerValue>(ASTCtx, Env, "null"));
+        EXPECT_EQ(getFieldValue(&FLoc, "b", ASTCtx, Env), nullptr);
+      });
+}
+
+TEST(TransferTest, EmptyInitListExprForUnion) {
+  // This is a crash repro.
+  std::string Code = R"cc(
+    class target {
+      union {
+        int *a;
+        bool *b;
+      } F;
+
+     public:
+      constexpr target() : F{} {
+        int *null = nullptr;
+        F.b;  // Make sure we reference 'b' so it is modeled.
+        // [[p]]
+      }
+    };
+  )cc";
+  runDataflow(
+      Code,
+      [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+         ASTContext &ASTCtx) {
+        const Environment &Env = getEnvironmentAtAnnotation(Results, "p");
+
+        auto &FLoc = getFieldLoc<RecordStorageLocation>(
+            *Env.getThisPointeeStorageLocation(), "F", ASTCtx);
+        auto *AVal = cast<PointerValue>(getFieldValue(&FLoc, "a", ASTCtx, Env));
+        EXPECT_EQ(AVal, &getValueForDecl<PointerValue>(ASTCtx, Env, "null"));
+        EXPECT_EQ(getFieldValue(&FLoc, "b", ASTCtx, Env), nullptr);
+      });
+}
+
+TEST(TransferTest, EmptyInitListExprForStruct) {
+  std::string Code = R"cc(
+    class target {
+      struct {
+        int *a;
+        bool *b;
+      } F;
+
+     public:
+      constexpr target() : F{} {
+        int *NullIntPtr = nullptr;
+        bool *NullBoolPtr = nullptr;
+        // [[p]]
+      }
+    };
+  )cc";
+  runDataflow(
+      Code,
+      [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+         ASTContext &ASTCtx) {
+        const Environment &Env = getEnvironmentAtAnnotation(Results, "p");
+
+        auto &FLoc = getFieldLoc<RecordStorageLocation>(
+            *Env.getThisPointeeStorageLocation(), "F", ASTCtx);
+        auto *AVal = cast<PointerValue>(getFieldValue(&FLoc, "a", ASTCtx, Env));
+        ASSERT_EQ(AVal,
+                  &getValueForDecl<PointerValue>(ASTCtx, Env, "NullIntPtr"));
+        auto *BVal = cast<PointerValue>(getFieldValue(&FLoc, "b", ASTCtx, Env));
+        ASSERT_EQ(BVal,
+                  &getValueForDecl<PointerValue>(ASTCtx, Env, "NullBoolPtr"));
       });
 }
 

size_t InitIdx = 0;

// Unions initialized with an empty initializer list need special treatment.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering how does CodeGen deal with this? If it happens to have similar extra logic, we could simplify both by changing the AST.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that CodeGen also appears to special-case this like we do:

if (NumInitElements) {

And if we changed the AST so that we had an ImplicitValueInitExpr here, it looks as if the general case in CodeGen would do the right thing.

I'll make a note of this for myself.

This fixes a crash introduced by llvm#82348
but also adds additional handling to make sure that we treat empty initializer
lists for both unions and structs/classes correctly (see tests added in this
patch).
@martinboehme martinboehme force-pushed the piper_export_cl_610355746 branch from a48dc2e to 607c26e Compare February 27, 2024 18:39
@martinboehme
Copy link
Contributor Author

Just modified this PR to essentially reapply #82348 under it.

auto &FLoc = getFieldLoc<RecordStorageLocation>(
*Env.getThisPointeeStorageLocation(), "F", ASTCtx);
auto *AVal = cast<PointerValue>(getFieldValue(&FLoc, "a", ASTCtx, Env));
ASSERT_EQ(AVal,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why ASSERT (vs EXPECT, like above)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point -- these should have been EXPECT_EQ. Changed.


auto &FLoc = getFieldLoc<RecordStorageLocation>(
*Env.getThisPointeeStorageLocation(), "F", ASTCtx);
auto *AVal = cast<PointerValue>(getFieldValue(&FLoc, "a", ASTCtx, Env));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is a set when the initializer is empty? I'm guessing it's because line 699 in Transfer.cpp uses front() and so initializes the first field of the union. If so, perhaps say this explicitly in the comment preceding those lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment explaining what the language rules are that govern the behavior we expect here.

@martinboehme martinboehme merged commit 128780b into llvm:main Mar 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants