Skip to content

[clang][dataflow] Check for backedges directly (instead of loop statements). #68923

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 16, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 15 additions & 20 deletions clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,6 @@
//===----------------------------------------------------------------------===//

#include <algorithm>
#include <memory>
#include <optional>
#include <system_error>
#include <utility>
Expand All @@ -33,8 +32,8 @@
#include "clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h"
#include "clang/Analysis/FlowSensitive/Value.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallBitVector.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/Error.h"

Expand All @@ -53,19 +52,14 @@ static int blockIndexInPredecessor(const CFGBlock &Pred,
return BlockPos - Pred.succ_begin();
}

static bool isLoopHead(const CFGBlock &B) {
if (const auto *T = B.getTerminatorStmt())
switch (T->getStmtClass()) {
case Stmt::WhileStmtClass:
case Stmt::DoStmtClass:
case Stmt::ForStmtClass:
case Stmt::CXXForRangeStmtClass:
return true;
default:
return false;
}

return false;
// A "backedge" node is a block introduced in the CFG exclusively to indicate a
// loop backedge. They are exactly identified by the presence of a non-null
// pointer to the entry block of the loop condition. Note that this is not
// necessarily the block with the loop statement as terminator, because
// short-circuit operators will result in multiple blocks encoding the loop
// condition, only one of which will contain the loop statement as terminator.
static bool isBackedgeNode(const CFGBlock &B) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a short comment explaining what a "backedge node" is?

(I think the term "backedge node" isn't otherwise defined in Clang or the framework -- or if it is, it would be useful to add a link.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// Some blocks are used to represent the "loop edge" to the start of a loop

Fwiw, I think this comment is better than the one in the CFG, so I don't know that linking is worth much, especially since it can be found pretty easily just by looking at getLoopTarget.

Copy link
Contributor

@martinboehme martinboehme Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but

  • It still requires work (not everyone has an IDE with great cross-linking, so this could be multiple clicks), and

  • We're introducing a new term here ("backedge node"), and it's not clear without an explanation whether a) we intend for this to be a synonym for "has loop target" (but then why aren't we reusing that term -- is it because we think "backedge node" is clearer?), or b) getLoopTarget() != nullptr happens to be the right implementation today, but we think it might change in the future, so we're wrapping this check in a function?

return B.getLoopTarget() != nullptr;
}

namespace {
Expand Down Expand Up @@ -502,14 +496,15 @@ runTypeErasedDataflowAnalysis(
PostVisitCFG) {
PrettyStackTraceAnalysis CrashInfo(CFCtx, "runTypeErasedDataflowAnalysis");

PostOrderCFGView POV(&CFCtx.getCFG());
ForwardDataflowWorklist Worklist(CFCtx.getCFG(), &POV);
const clang::CFG &CFG = CFCtx.getCFG();
PostOrderCFGView POV(&CFG);
ForwardDataflowWorklist Worklist(CFG, &POV);

std::vector<std::optional<TypeErasedDataflowAnalysisState>> BlockStates(
CFCtx.getCFG().size());
CFG.size());

// The entry basic block doesn't contain statements so it can be skipped.
const CFGBlock &Entry = CFCtx.getCFG().getEntry();
const CFGBlock &Entry = CFG.getEntry();
Comment on lines -505 to +507
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This refactoring is entirely independent of the rest of this PR. Suggest submitting changes like this as a separate "NFC" PR to minimize the size of PRs and ease review.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was originally part of this patch, but I ended up undoing some of the new changes. I considered separating this out, but I think for such small NFC changes you run the real risk that you simply won't follow up. That is, it's a great idea, but if we try to make this the norm, I think the result will be many lost small cleanup opportunities. Maybe that's worth it, for review clarity. I don't really know where to draw the line.

BlockStates[Entry.getBlockID()] = {Analysis.typeErasedInitialElement(),
InitEnv.fork()};
Worklist.enqueueSuccessors(&Entry);
Expand Down Expand Up @@ -553,7 +548,7 @@ runTypeErasedDataflowAnalysis(
llvm::errs() << "Old Env:\n";
OldBlockState->Env.dump();
});
if (isLoopHead(*Block)) {
if (isBackedgeNode(*Block)) {
LatticeJoinEffect Effect1 = Analysis.widenTypeErased(
NewBlockState.Lattice, OldBlockState->Lattice);
LatticeJoinEffect Effect2 =
Expand Down
14 changes: 14 additions & 0 deletions clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4099,6 +4099,20 @@ TEST(TransferTest, LoopDereferencingChangingRecordPointerConverges) {
ASSERT_THAT_ERROR(checkDataflowWithNoopAnalysis(Code), llvm::Succeeded());
}

TEST(TransferTest, LoopWithShortCircuitedConditionConverges) {
std::string Code = R"cc(
bool foo();

void target() {
bool c = false;
while (foo() || foo()) {
c = true;
}
}
)cc";
ASSERT_THAT_ERROR(checkDataflowWithNoopAnalysis(Code), llvm::Succeeded());
}

TEST(TransferTest, DoesNotCrashOnUnionThisExpr) {
std::string Code = R"(
union Union {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -913,6 +913,29 @@ TEST_F(FlowConditionTest, WhileStmt) {
});
}

TEST_F(FlowConditionTest, WhileStmtWithAssignmentInCondition) {
std::string Code = R"(
void target(bool Foo) {
// This test checks whether the analysis preserves the connection between
// the value of `Foo` and the assignment expression, despite widening.
// The equality operator generates a fresh boolean variable on each
// interpretation, which forces use of widening.
while ((Foo = (3 == 4))) {
(void)0;
/*[[p]]*/
}
}
)";
runDataflow(
Code,
[](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
ASTContext &ASTCtx) {
const Environment &Env = getEnvironmentAtAnnotation(Results, "p");
auto &FooVal = getValueForDecl<BoolValue>(ASTCtx, Env, "Foo").formula();
EXPECT_TRUE(Env.flowConditionImplies(FooVal));
});
}

TEST_F(FlowConditionTest, Conjunction) {
std::string Code = R"(
void target(bool Foo, bool Bar) {
Expand Down