-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang][dataflow] Check for backedges directly (instead of loop statements). #68923
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-analysis @llvm/pr-subscribers-clang Author: Yitzhak Mandelbaum (ymand) ChangesWiden on backedge nodes, instead of nodes with a loop statement as terminator. Full diff: https://github.com/llvm/llvm-project/pull/68923.diff 3 Files Affected:
diff --git a/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp b/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
index 6b167891c1a3ac8..c635edd12219f0e 100644
--- a/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
+++ b/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
@@ -12,7 +12,6 @@
//===----------------------------------------------------------------------===//
#include <algorithm>
-#include <memory>
#include <optional>
#include <system_error>
#include <utility>
@@ -33,8 +32,8 @@
#include "clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h"
#include "clang/Analysis/FlowSensitive/Value.h"
#include "llvm/ADT/ArrayRef.h"
-#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallBitVector.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/Error.h"
@@ -53,19 +52,8 @@ static int blockIndexInPredecessor(const CFGBlock &Pred,
return BlockPos - Pred.succ_begin();
}
-static bool isLoopHead(const CFGBlock &B) {
- if (const auto *T = B.getTerminatorStmt())
- switch (T->getStmtClass()) {
- case Stmt::WhileStmtClass:
- case Stmt::DoStmtClass:
- case Stmt::ForStmtClass:
- case Stmt::CXXForRangeStmtClass:
- return true;
- default:
- return false;
- }
-
- return false;
+static bool isBackedgeNode(const CFGBlock &B) {
+ return B.getLoopTarget() != nullptr;
}
namespace {
@@ -502,14 +490,16 @@ runTypeErasedDataflowAnalysis(
PostVisitCFG) {
PrettyStackTraceAnalysis CrashInfo(CFCtx, "runTypeErasedDataflowAnalysis");
- PostOrderCFGView POV(&CFCtx.getCFG());
- ForwardDataflowWorklist Worklist(CFCtx.getCFG(), &POV);
+ const clang::CFG &CFG = CFCtx.getCFG();
+ PostOrderCFGView POV(&CFG);
+ ForwardDataflowWorklist Worklist(CFG, &POV);
+
std::vector<std::optional<TypeErasedDataflowAnalysisState>> BlockStates(
- CFCtx.getCFG().size());
+ CFG.size());
// The entry basic block doesn't contain statements so it can be skipped.
- const CFGBlock &Entry = CFCtx.getCFG().getEntry();
+ const CFGBlock &Entry = CFG.getEntry();
BlockStates[Entry.getBlockID()] = {Analysis.typeErasedInitialElement(),
InitEnv.fork()};
Worklist.enqueueSuccessors(&Entry);
@@ -553,7 +543,7 @@ runTypeErasedDataflowAnalysis(
llvm::errs() << "Old Env:\n";
OldBlockState->Env.dump();
});
- if (isLoopHead(*Block)) {
+ if (isBackedgeNode(*Block)) {
LatticeJoinEffect Effect1 = Analysis.widenTypeErased(
NewBlockState.Lattice, OldBlockState->Lattice);
LatticeJoinEffect Effect2 =
diff --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index 632632a1b30e78b..7c697fa5f87d941 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -4099,6 +4099,20 @@ TEST(TransferTest, LoopDereferencingChangingRecordPointerConverges) {
ASSERT_THAT_ERROR(checkDataflowWithNoopAnalysis(Code), llvm::Succeeded());
}
+TEST(TransferTest, LoopWithDisjunctiveConditionConverges) {
+ std::string Code = R"cc(
+ bool foo();
+
+ void target() {
+ bool c = false;
+ while (foo() || foo()) {
+ c = true;
+ }
+ }
+ )cc";
+ ASSERT_THAT_ERROR(checkDataflowWithNoopAnalysis(Code), llvm::Succeeded());
+}
+
TEST(TransferTest, DoesNotCrashOnUnionThisExpr) {
std::string Code = R"(
union Union {
diff --git a/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp b/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
index 2425bb8711bdbaf..6800d736afd9b08 100644
--- a/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
@@ -913,6 +913,34 @@ TEST_F(FlowConditionTest, WhileStmt) {
});
}
+TEST_F(FlowConditionTest, WhileStmtWithAssignmentInCondition) {
+ std::string Code = R"(
+ void target(bool Foo) {
+ // This test checks whether the analysis preserves the connection between
+ // the value of `Foo` and the assignment expression, despite widening.
+ // The equality operator generates a fresh boolean variable on each
+ // interpretation, which forces use of widening.
+ while ((Foo = (3 == 4))) {
+ (void)0;
+ /*[[p]]*/
+ }
+ }
+ )";
+ runDataflow(
+ Code,
+ [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+ ASTContext &ASTCtx) {
+ const ValueDecl *FooDecl = findValueDecl(ASTCtx, "Foo");
+ ASSERT_THAT(FooDecl, NotNull());
+
+ ASSERT_THAT(Results.keys(), UnorderedElementsAre("p"));
+ const Environment &Env = getEnvironmentAtAnnotation(Results, "p");
+
+ auto &FooVal = cast<BoolValue>(Env.getValue(*FooDecl))->formula();
+ EXPECT_TRUE(Env.flowConditionImplies(FooVal));
+ });
+}
+
TEST_F(FlowConditionTest, Conjunction) {
std::string Code = R"(
void target(bool Foo, bool Bar) {
|
…ments). Widen on backedge nodes, instead of nodes with a loop statement as terminator. This fixes llvm#67834 and a precision loss from assignment in a loop condition. The commit contains tests for both of these issues.
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice fix! So, it looks like our definition of "loop head" was wrong.
Indeed. We didn't think of the way control-flow constructs inside the condition expression would be represented. Ironically (?), this is actually simpler and fixes another issue besides, which is nice. ;) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with comments
PostOrderCFGView POV(&CFCtx.getCFG()); | ||
ForwardDataflowWorklist Worklist(CFCtx.getCFG(), &POV); | ||
const clang::CFG &CFG = CFCtx.getCFG(); | ||
PostOrderCFGView POV(&CFG); | ||
ForwardDataflowWorklist Worklist(CFG, &POV); | ||
|
||
std::vector<std::optional<TypeErasedDataflowAnalysisState>> BlockStates( | ||
CFCtx.getCFG().size()); | ||
CFG.size()); | ||
|
||
// The entry basic block doesn't contain statements so it can be skipped. | ||
const CFGBlock &Entry = CFCtx.getCFG().getEntry(); | ||
const CFGBlock &Entry = CFG.getEntry(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This refactoring is entirely independent of the rest of this PR. Suggest submitting changes like this as a separate "NFC" PR to minimize the size of PRs and ease review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was originally part of this patch, but I ended up undoing some of the new changes. I considered separating this out, but I think for such small NFC changes you run the real risk that you simply won't follow up. That is, it's a great idea, but if we try to make this the norm, I think the result will be many lost small cleanup opportunities. Maybe that's worth it, for review clarity. I don't really know where to draw the line.
} | ||
|
||
return false; | ||
static bool isBackedgeNode(const CFGBlock &B) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a short comment explaining what a "backedge node" is?
(I think the term "backedge node" isn't otherwise defined in Clang or the framework -- or if it is, it would be useful to add a link.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Some blocks are used to represent the "loop edge" to the start of a loop |
Fwiw, I think this comment is better than the one in the CFG, so I don't know that linking is worth much, especially since it can be found pretty easily just by looking at getLoopTarget
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but
-
It still requires work (not everyone has an IDE with great cross-linking, so this could be multiple clicks), and
-
We're introducing a new term here ("backedge node"), and it's not clear without an explanation whether a) we intend for this to be a synonym for "has loop target" (but then why aren't we reusing that term -- is it because we think "backedge node" is clearer?), or b)
getLoopTarget() != nullptr
happens to be the right implementation today, but we think it might change in the future, so we're wrapping this check in a function?
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
Outdated
Show resolved
Hide resolved
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
Outdated
Show resolved
Hide resolved
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
Outdated
Show resolved
Hide resolved
… of loop statements).
Widen on backedge nodes, instead of nodes with a loop statement as terminator.
This fixes #67834 and a precision loss from assignment in a loop condition. The
commit contains tests for both of these issues.