Skip to content

[clang][dataflow] Don't propagate result objects in unevaluated contexts #90438

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 2, 2024

Conversation

martinboehme
Copy link
Contributor

Trying to do so can cause crashes -- see newly added test and the comments in
the fix.

We're starting to see a repeating pattern here: We're getting crashes because
ResultObjectVisitor and getReferencedDecls() don't agree on which parts of
the AST to visit and, hence, which fields should be modeled.

I think we should ensure consistency between these two parts of the code by
using a RecursiveASTVisitor in getReferencedDecls()1; the
Traverse...() functions that control which parts of the AST we visit would go
in a common base class that would be used for both ResultObjectVisitor and
getReferencedDecls().

I'd like to focus this PR, however, on a targeted fix for the current crash and
postpone the refactoring to a later PR (which will be easier to revert if there
are unintended side-effects).

Footnotes

  1. As an added bonus, this would make the code better structured and more
    efficient than the current sequence of if (dyn_cast<T>(...)) statements).

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Apr 29, 2024
@martinboehme martinboehme requested review from ymand and Xazax-hun April 29, 2024 08:05
@llvmbot
Copy link
Member

llvmbot commented Apr 29, 2024

@llvm/pr-subscribers-clang-analysis

Author: None (martinboehme)

Changes

Trying to do so can cause crashes -- see newly added test and the comments in
the fix.

We're starting to see a repeating pattern here: We're getting crashes because
ResultObjectVisitor and getReferencedDecls() don't agree on which parts of
the AST to visit and, hence, which fields should be modeled.

I think we should ensure consistency between these two parts of the code by
using a RecursiveASTVisitor in getReferencedDecls()1; the
Traverse...() functions that control which parts of the AST we visit would go
in a common base class that would be used for both ResultObjectVisitor and
getReferencedDecls().

I'd like to focus this PR, however, on a targeted fix for the current crash and
postpone the refactoring to a later PR (which will be easier to revert if there
are unintended side-effects).


Full diff: https://github.com/llvm/llvm-project/pull/90438.diff

2 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+7)
  • (modified) clang/unittests/Analysis/FlowSensitive/TransferTest.cpp (+17)
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index d79e734402892a..f32ee4a2528a8b 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -350,6 +350,13 @@ class ResultObjectVisitor : public RecursiveASTVisitor<ResultObjectVisitor> {
     return RecursiveASTVisitor<ResultObjectVisitor>::TraverseDecl(D);
   }
 
+  bool TraverseDecltypeTypeLoc(DecltypeTypeLoc Node) {
+    // Don't traverse `decltype` because we don't analyze its argument (as it
+    // isn't executed) and hence don't model fields that are only used in the
+    // argument of a `decltype`.
+    return true;
+  }
+
   bool TraverseBindingDecl(BindingDecl *BD) {
     // `RecursiveASTVisitor` doesn't traverse holding variables for
     // `BindingDecl`s by itself, so we need to tell it to.
diff --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index 301bec32c0cf1d..3b8fde0eb244c1 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -3331,6 +3331,23 @@ TEST(TransferTest, ResultObjectLocationDontVisitNestedRecordDecl) {
          ASTContext &ASTCtx) {});
 }
 
+TEST(TransferTest, ResultObjectLocationDontVisitDeclTypeArg) {
+  // This is a crash repro.
+  // We used to crash because when propagating result objects, we would visit
+  // the argument of `decltype()`, but we don't model fields used only in these.
+  std::string Code = R"cc(
+    struct S1 {};
+    struct S2 { S1 s1; };
+    void target() {
+        decltype(S2{ S1() }) Dummy;
+    }
+  )cc";
+  runDataflow(
+      Code,
+      [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+         ASTContext &ASTCtx) {});
+}
+
 TEST(TransferTest, StaticCast) {
   std::string Code = R"(
     void target(int Foo) {

Footnotes

  1. As an added bonus, this would make the code better structured and more
    efficient than the current sequence of if (dyn_cast&lt;T&gt;(...)) statements).

@llvmbot
Copy link
Member

llvmbot commented Apr 29, 2024

@llvm/pr-subscribers-clang

Author: None (martinboehme)

Changes

Trying to do so can cause crashes -- see newly added test and the comments in
the fix.

We're starting to see a repeating pattern here: We're getting crashes because
ResultObjectVisitor and getReferencedDecls() don't agree on which parts of
the AST to visit and, hence, which fields should be modeled.

I think we should ensure consistency between these two parts of the code by
using a RecursiveASTVisitor in getReferencedDecls()1; the
Traverse...() functions that control which parts of the AST we visit would go
in a common base class that would be used for both ResultObjectVisitor and
getReferencedDecls().

I'd like to focus this PR, however, on a targeted fix for the current crash and
postpone the refactoring to a later PR (which will be easier to revert if there
are unintended side-effects).


Full diff: https://github.com/llvm/llvm-project/pull/90438.diff

2 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+7)
  • (modified) clang/unittests/Analysis/FlowSensitive/TransferTest.cpp (+17)
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index d79e734402892a..f32ee4a2528a8b 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -350,6 +350,13 @@ class ResultObjectVisitor : public RecursiveASTVisitor<ResultObjectVisitor> {
     return RecursiveASTVisitor<ResultObjectVisitor>::TraverseDecl(D);
   }
 
+  bool TraverseDecltypeTypeLoc(DecltypeTypeLoc Node) {
+    // Don't traverse `decltype` because we don't analyze its argument (as it
+    // isn't executed) and hence don't model fields that are only used in the
+    // argument of a `decltype`.
+    return true;
+  }
+
   bool TraverseBindingDecl(BindingDecl *BD) {
     // `RecursiveASTVisitor` doesn't traverse holding variables for
     // `BindingDecl`s by itself, so we need to tell it to.
diff --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index 301bec32c0cf1d..3b8fde0eb244c1 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -3331,6 +3331,23 @@ TEST(TransferTest, ResultObjectLocationDontVisitNestedRecordDecl) {
          ASTContext &ASTCtx) {});
 }
 
+TEST(TransferTest, ResultObjectLocationDontVisitDeclTypeArg) {
+  // This is a crash repro.
+  // We used to crash because when propagating result objects, we would visit
+  // the argument of `decltype()`, but we don't model fields used only in these.
+  std::string Code = R"cc(
+    struct S1 {};
+    struct S2 { S1 s1; };
+    void target() {
+        decltype(S2{ S1() }) Dummy;
+    }
+  )cc";
+  runDataflow(
+      Code,
+      [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+         ASTContext &ASTCtx) {});
+}
+
 TEST(TransferTest, StaticCast) {
   std::string Code = R"(
     void target(int Foo) {

Footnotes

  1. As an added bonus, this would make the code better structured and more
    efficient than the current sequence of if (dyn_cast&lt;T&gt;(...)) statements).

@@ -350,6 +350,13 @@ class ResultObjectVisitor : public RecursiveASTVisitor<ResultObjectVisitor> {
return RecursiveASTVisitor<ResultObjectVisitor>::TraverseDecl(D);
}

bool TraverseDecltypeTypeLoc(DecltypeTypeLoc Node) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other instances in the language that are unevaluated including: typeid, noexcept, sizeof. Do you anticipate similar problems with those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other instances in the language that are unevaluated including: typeid, noexcept, sizeof. Do you anticipate similar problems with those?

Excellent point, thanks. I'll expand this PR to more generally exclude all unevaluated contexts. Will ping this PR when I've changed this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done -- ready for review again.

It turns out that for some of these (typeid, sizeof), the tests already passed even before I excluded them in ResultObjectVisitor. This is because getReferencedDecls() does actually collect declarations referenced in these unevaluated contexts (even though maybe it shouldn't). Nevertheless, I have decided to exclude these unevaluated contexts from ResultObjectVisitor because the transfer function will never visit them, so they don't need result objects to be propagated into them.

It also turns out that, even though the operand of the noexcept operator in an unevaluated operand, it still shows up in the CFG for some reason, and hence we also need to propagate result objects for it (see also comments added to the code). Maybe there's a reason for this, or maybe it's something that should be fixed in the CFG, but for the time being, I've chosen to leave this as it is here.

Copy link
Collaborator

@Xazax-hun Xazax-hun Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also turns out that, even though the operand of the noexcept operator in an unevaluated operand, it still shows up in the CFG for some reason

This surprises me, I can't think of a reason why we would want to have it in the CFG. This keyword appears a lot in generic code, so maybe excluding it from the CFG might have measurable perf/memory benefits. I'd think it might be worth to give this a try and see if something breaks (in a separate PR).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll take a look at that!

@martinboehme
Copy link
Contributor Author

Windows CI failure is in Driver/amdgpu-toolchain.c. This looks unrelated.

…xts.

Trying to do so can cause crashes -- see newly added test and the comments in
the fix.

We're starting to see a repeating pattern here: We're getting crashes because
`ResultObjectVisitor` and `getReferencedDecls()` don't agree on which parts of
the AST to visit and, hence, which fields should be modeled.

I think we should ensure consistency between these two parts of the code by
using a `RecursiveASTVisitor` in `getReferencedDecls()`[^1]; the
`Traverse...()` functions that control which parts of the AST we visit would go
in a common base class that would be used for both `ResultObjectVisitor` and
`getReferencedDecls()`.

I'd like to focus this PR, however, on a targeted fix for the current crash and
postpone the refactoring to a later PR (which will be easier to revert if there
are unintended side-effects).

[^1]: As an added bonus, this would make the code better structured and more
efficient than the current sequence of `if (dyn_cast<T>(...))` statements).
@martinboehme martinboehme force-pushed the piper_export_cl_628985572 branch from 75a1198 to 2cf2947 Compare April 30, 2024 12:12
@martinboehme martinboehme changed the title [clang][dataflow] Don't propagate result objects inside decltype. [clang][dataflow] Don't propagate result objects in unevaluated contexts Apr 30, 2024
Copy link
Collaborator

@Xazax-hun Xazax-hun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@martinboehme martinboehme merged commit 597a315 into llvm:main May 2, 2024
@cHackz18
Copy link

cHackz18 commented May 2, 2024

Hello and good morning from the UK!

It looks as though this change caused a test failure on the following buildbot:
https://lab.llvm.org/buildbot/#/builders/216/builds/38446

Are you able to take a look see?

many thanks for yoru time,
Tom W

TomWeaver18 pushed a commit that referenced this pull request May 2, 2024
@TomWeaver18
Copy link
Contributor

Reverted in 2252c5c

@martinboehme
Copy link
Contributor Author

Hello and good morning from the UK!

It looks as though this change caused a test failure on the following buildbot: https://lab.llvm.org/buildbot/#/builders/216/builds/38446

Are you able to take a look see?

many thanks for yoru time, Tom W

Sorry, didn't see this in time.

Reverted in 2252c5c

Thanks for the revert!

Will investigate and reland with a fix.

martinboehme added a commit to martinboehme/llvm-project that referenced this pull request May 6, 2024
martinboehme added a commit to martinboehme/llvm-project that referenced this pull request May 6, 2024
martinboehme added a commit that referenced this pull request May 6, 2024
…xts (reland #90438) (#91172)

This relands #90348 with a fix for a [buildbot
failure](https://lab.llvm.org/buildbot/#/builders/216/builds/38446)
caused by the test being run with `-fno-rtti`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants