[clang][dataflow] Process terminator condition within transferCFGBlock(). #78127

Merged: 2 commits, Jan 23, 2024
13 changes: 11 additions & 2 deletions clang/include/clang/Analysis/FlowSensitive/Transfer.h
@@ -25,10 +25,17 @@ namespace dataflow {
 /// Maps statements to the environments of basic blocks that contain them.
 class StmtToEnvMap {
 public:
+  // `CurBlockID` is the ID of the block currently being processed, and
+  // `CurState` is the pending state currently associated with this block. These
+  // are supplied separately as the pending state for the current block may not
+  // yet be represented in `BlockToState`.
   StmtToEnvMap(const ControlFlowContext &CFCtx,
                llvm::ArrayRef<std::optional<TypeErasedDataflowAnalysisState>>
-                   BlockToState)
-      : CFCtx(CFCtx), BlockToState(BlockToState) {}
+                   BlockToState,
+               unsigned CurBlockID,
+               const TypeErasedDataflowAnalysisState &CurState)
+      : CFCtx(CFCtx), BlockToState(BlockToState), CurBlockID(CurBlockID),
+        CurState(CurState) {}

   /// Returns the environment of the basic block that contains `S`.
   /// The result is guaranteed never to be null.
@@ -37,6 +44,8 @@ class StmtToEnvMap {
 private:
   const ControlFlowContext &CFCtx;
   llvm::ArrayRef<std::optional<TypeErasedDataflowAnalysisState>> BlockToState;
+  unsigned CurBlockID;
+  const TypeErasedDataflowAnalysisState &CurState;
 };

 /// Evaluates `S` and updates `Env` accordingly.
2 changes: 2 additions & 0 deletions clang/lib/Analysis/FlowSensitive/Transfer.cpp
@@ -42,6 +42,8 @@ const Environment *StmtToEnvMap::getEnvironment(const Stmt &S) const {
   assert(BlockIt != CFCtx.getStmtToBlock().end());
   if (!CFCtx.isBlockReachable(*BlockIt->getSecond()))
     return nullptr;
+  if (BlockIt->getSecond()->getBlockID() == CurBlockID)
+    return &CurState.Env;
   const auto &State = BlockToState[BlockIt->getSecond()->getBlockID()];
   if (!(State))
     return nullptr;
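The lookup rule added to `StmtToEnvMap::getEnvironment()` can be illustrated outside of Clang. The following is a minimal sketch, not the Clang implementation: `Env`, `State`, and `PendingAwareEnvMap` are hypothetical stand-ins, and blocks are identified by plain indices rather than looked up from a statement. It only shows the idea that the pending state of the block being processed takes precedence over whatever is stored for that block.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the dataflow types; these are not Clang's classes.
struct Env {
  std::string Label;
};
struct State {
  Env E;
};

class PendingAwareEnvMap {
public:
  PendingAwareEnvMap(const std::vector<std::optional<State>> &BlockToState,
                     unsigned CurBlockID, const State &CurState)
      : BlockToState(BlockToState), CurBlockID(CurBlockID),
        CurState(CurState) {}

  // Returns the environment for `BlockID`, or nullptr if no state has been
  // recorded for that block yet.
  const Env *getEnvironment(unsigned BlockID) const {
    assert(BlockID < BlockToState.size());
    // The block currently being processed is answered from the pending state,
    // because the stored entry may be missing or stale.
    if (BlockID == CurBlockID)
      return &CurState.E;
    const std::optional<State> &S = BlockToState[BlockID];
    return S ? &S->E : nullptr;
  }

private:
  const std::vector<std::optional<State>> &BlockToState;
  unsigned CurBlockID;
  const State &CurState;
};
```

In this sketch a query for the current block never consults the stored table, which mirrors the early return on `CurBlockID` in the hunk above.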
64 changes: 40 additions & 24 deletions clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
@@ -75,9 +75,8 @@ using TerminatorVisitorRetTy = std::pair<const Expr *, bool>;
 class TerminatorVisitor
     : public ConstStmtVisitor<TerminatorVisitor, TerminatorVisitorRetTy> {
 public:
-  TerminatorVisitor(const StmtToEnvMap &StmtToEnv, Environment &Env,
-                    int BlockSuccIdx)
-      : StmtToEnv(StmtToEnv), Env(Env), BlockSuccIdx(BlockSuccIdx) {}
+  TerminatorVisitor(Environment &Env, int BlockSuccIdx)
+      : Env(Env), BlockSuccIdx(BlockSuccIdx) {}

   TerminatorVisitorRetTy VisitIfStmt(const IfStmt *S) {
     auto *Cond = S->getCond();
@@ -126,19 +125,12 @@ class TerminatorVisitor

 private:
   TerminatorVisitorRetTy extendFlowCondition(const Expr &Cond) {
-    // The terminator sub-expression might not be evaluated.
-    if (Env.getValue(Cond) == nullptr)
-      transfer(StmtToEnv, Cond, Env);
-
     auto *Val = Env.get<BoolValue>(Cond);
-    // Value merging depends on flow conditions from different environments
-    // being mutually exclusive -- that is, they cannot both be true in their
-    // entirety (even if they may share some clauses). So, we need *some* value
-    // for the condition expression, even if just an atom.
-    if (Val == nullptr) {
-      Val = &Env.makeAtomicBoolValue();
-      Env.setValue(Cond, *Val);
-    }
+    // In transferCFGBlock(), we ensure that we always have a `Value` for the
+    // terminator condition, so assert this.
+    // We consciously assert ourselves instead of asserting via `cast()` so
+    // that we get a more meaningful line number if the assertion fails.
+    assert(Val != nullptr);

     bool ConditionValue = true;
     // The condition must be inverted for the successor that encompasses the
@@ -152,7 +144,6 @@ class TerminatorVisitor
     return {&Cond, ConditionValue};
   }

-  const StmtToEnvMap &StmtToEnv;
   Environment &Env;
   int BlockSuccIdx;
 };
@@ -335,10 +326,8 @@ computeBlockInputState(const CFGBlock &Block, AnalysisContext &AC) {
         // when the terminator is taken. Copy now.
         TypeErasedDataflowAnalysisState Copy = MaybePredState->fork();

-        const StmtToEnvMap StmtToEnv(AC.CFCtx, AC.BlockStates);
         auto [Cond, CondValue] =
-            TerminatorVisitor(StmtToEnv, Copy.Env,
-                              blockIndexInPredecessor(*Pred, Block))
+            TerminatorVisitor(Copy.Env, blockIndexInPredecessor(*Pred, Block))

Review comment from Xazax-hun (Collaborator), Jan 19, 2024:

I approved the PR because I believe it is a step in the right direction, but I can't help feeling that something is a bit off. I wonder if we could reorganize things to make errors like this impossible. Specifically, I wonder whether we can ensure that we never need these extra analysis state copies during transfer and can always work in place on the state associated with the corresponding block.

Here, we evaluate the side effects of the terminator statement multiple times, once for each successor of the block. This is OK, since each evaluation happens on a separate copy of the state, but it feels redundant and error-prone. I wonder if it would be possible to do the transfer for the terminator only once, when the basic block is processed, and here only call transferBranchTypeErased without ever doing a regular transfer.

                 .Visit(PredTerminatorStmt);
         if (Cond != nullptr)
           // FIXME: Call transferBranchTypeErased even if BuiltinTransferOpts
@@ -356,12 +345,13 @@ computeBlockInputState(const CFGBlock &Block, AnalysisContext &AC) {

 /// Built-in transfer function for `CFGStmt`.
 static void
-builtinTransferStatement(const CFGStmt &Elt,
+builtinTransferStatement(unsigned CurBlockID, const CFGStmt &Elt,
                          TypeErasedDataflowAnalysisState &InputState,
                          AnalysisContext &AC) {
   const Stmt *S = Elt.getStmt();
   assert(S != nullptr);
-  transfer(StmtToEnvMap(AC.CFCtx, AC.BlockStates), *S, InputState.Env);
+  transfer(StmtToEnvMap(AC.CFCtx, AC.BlockStates, CurBlockID, InputState), *S,
+           InputState.Env);
 }

 /// Built-in transfer function for `CFGInitializer`.
@@ -428,12 +418,12 @@ builtinTransferInitializer(const CFGInitializer &Elt,
   }
 }

-static void builtinTransfer(const CFGElement &Elt,
+static void builtinTransfer(unsigned CurBlockID, const CFGElement &Elt,
                             TypeErasedDataflowAnalysisState &State,
                             AnalysisContext &AC) {
   switch (Elt.getKind()) {
   case CFGElement::Statement:
-    builtinTransferStatement(Elt.castAs<CFGStmt>(), State, AC);
+    builtinTransferStatement(CurBlockID, Elt.castAs<CFGStmt>(), State, AC);
     break;
   case CFGElement::Initializer:
     builtinTransferInitializer(Elt.castAs<CFGInitializer>(), State);
@@ -477,7 +467,7 @@ transferCFGBlock(const CFGBlock &Block, AnalysisContext &AC,
     AC.Log.enterElement(Element);
     // Built-in analysis
     if (AC.Analysis.builtinOptions()) {
-      builtinTransfer(Element, State, AC);
+      builtinTransfer(Block.getBlockID(), Element, State, AC);
     }

     // User-provided analysis
@@ -489,6 +479,32 @@ transferCFGBlock(const CFGBlock &Block, AnalysisContext &AC,
     }
     AC.Log.recordState(State);
   }
+
+  // If we have a terminator, evaluate its condition.
+  // This `Expr` may not appear as a `CFGElement` anywhere else, and it's
+  // important that we evaluate it here (rather than while processing the
+  // terminator) so that we put the corresponding value in the right
+  // environment.
+  if (const Expr *TerminatorCond =
+          dyn_cast_or_null<Expr>(Block.getTerminatorCondition())) {
+    if (State.Env.getValue(*TerminatorCond) == nullptr)
+      // FIXME: This only runs the builtin transfer, not the analysis-specific
+      // transfer. Fixing this isn't trivial, as the analysis-specific transfer
+      // takes a `CFGElement` as input, but some expressions only show up as a
+      // terminator condition, but not as a `CFGElement`. The condition of an if
+      // statement is one such example.
+      transfer(
+          StmtToEnvMap(AC.CFCtx, AC.BlockStates, Block.getBlockID(), State),
+          *TerminatorCond, State.Env);
+
+    // If the transfer function didn't produce a value, create an atom so that
+    // we have *some* value for the condition expression. This ensures that
+    // when we extend the flow condition, it actually changes.
+    if (State.Env.getValue(*TerminatorCond) == nullptr)
+      State.Env.setValue(*TerminatorCond, State.Env.makeAtomicBoolValue());
+    AC.Log.recordState(State);
+  }
+
   return State;
 }

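The split of responsibilities in this file's diff can be sketched in isolation: the block's own state receives a value for the terminator condition once, and each successor then only reads that value. The code below is a toy model under stated assumptions, not Clang code. `ToyEnv`, `ensureConditionValue`, and `branchConstraint` are hypothetical names, conditions are keyed by strings, and atoms are plain integers. The polarity rule (successor index 0 is the "true" edge) is an assumption of the sketch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy environment: maps a condition (named by a string) to the
// integer id of a symbolic boolean atom.
struct ToyEnv {
  std::map<std::string, int> Values;
  int NextAtom = 0;

  const int *getValue(const std::string &Cond) const {
    auto It = Values.find(Cond);
    return It == Values.end() ? nullptr : &It->second;
  }
  int makeAtom() { return NextAtom++; }
};

// Block-level step: make sure `Cond` has a value in the block's own state.
// `Transfer` models the builtin transfer function and may or may not set one.
void ensureConditionValue(
    ToyEnv &Env, const std::string &Cond,
    const std::function<void(ToyEnv &, const std::string &)> &Transfer) {
  if (Env.getValue(Cond) == nullptr)
    Transfer(Env, Cond);
  // Fallback: create an atom so that some value always exists.
  if (Env.getValue(Cond) == nullptr)
    Env.Values[Cond] = Env.makeAtom();
}

// Per-successor step: only reads the value; it never evaluates `Cond` itself.
// Returns the atom and whether it must hold (true) or be negated (false).
std::pair<int, bool> branchConstraint(const ToyEnv &PredEnv,
                                      const std::string &Cond, int SuccIdx) {
  const int *Val = PredEnv.getValue(Cond);
  assert(Val != nullptr && "block-level step must have run first");
  return {*Val, SuccIdx == 0};
}
```

Because both successors read the same atom from the predecessor's state, their constraints are the atom and its negation, which cannot both hold. If each successor created its own fallback atom in a private copy, as the removed code in `extendFlowCondition()` could do, the two constraints would no longer be related.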
1 change: 1 addition & 0 deletions clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp
@@ -123,6 +123,7 @@ recordState(Elements=1, Branches=0, Joins=0)
 enterElement(b (ImplicitCastExpr, LValueToRValue, _Bool))
 transfer()
 recordState(Elements=2, Branches=0, Joins=0)
+recordState(Elements=2, Branches=0, Joins=0)
 enterBlock(3, false)
 transferBranch(0)
27 changes: 27 additions & 0 deletions clang/unittests/Analysis/FlowSensitive/SignAnalysisTest.cpp
@@ -895,6 +895,33 @@ TEST(SignAnalysisTest, BinaryEQ) {
       LangStandard::lang_cxx17);
 }

+TEST(SignAnalysisTest, ComplexLoopCondition) {
+  std::string Code = R"(
+    int foo();
+    void fun() {
+      int a, b;
+      while ((a = foo()) > 0 && (b = foo()) > 0) {
+        a;
+        b;
+        // [[p]]
+      }
+    }
+  )";
+  runDataflow(
+      Code,
+      [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+         ASTContext &ASTCtx) {
+        const Environment &Env = getEnvironmentAtAnnotation(Results, "p");
+
+        const ValueDecl *A = findValueDecl(ASTCtx, "a");
+        const ValueDecl *B = findValueDecl(ASTCtx, "b");
+
+        EXPECT_TRUE(isPositive(A, ASTCtx, Env));
+        EXPECT_TRUE(isPositive(B, ASTCtx, Env));
+      },
+      LangStandard::lang_cxx17);
+}
+
 TEST(SignAnalysisTest, JoinToTop) {
   std::string Code = R"(
     int foo();
31 changes: 31 additions & 0 deletions clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -6408,4 +6408,35 @@ TEST(TransferTest, DifferentReferenceLocInJoin) {
       });
 }

+// This test verifies correct modeling of a relational dependency that goes
+// through unmodeled functions (the simple `cond()` in this case).
+TEST(TransferTest, ConditionalRelation) {
+  std::string Code = R"(
+    bool cond();
+    void target() {
+       bool a = true;
+       bool b = true;
+       if (cond()) {
+         a = false;
+         if (cond()) {
+           b = false;
+         }
+       }
+       (void)0;
+       // [[p]]
+    }
+ )";
+  runDataflow(
+      Code,
+      [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+         ASTContext &ASTCtx) {
+        const Environment &Env = getEnvironmentAtAnnotation(Results, "p");
+        auto &A = Env.arena();
+        auto &VarA = getValueForDecl<BoolValue>(ASTCtx, Env, "a").formula();
+        auto &VarB = getValueForDecl<BoolValue>(ASTCtx, Env, "b").formula();
+
+        EXPECT_FALSE(Env.allows(A.makeAnd(VarA, A.makeNot(VarB))));
+      });
+}
+
 } // namespace
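
The property checked by `ConditionalRelation` can be confirmed by brute force on the concrete program. The sketch below is independent of the dataflow framework: `Outcome`, `runTarget`, and `anyPathAllowsAAndNotB` are hypothetical helpers that replay `target()` with the two `cond()` results passed in as parameters and check that no path ends with `a` true and `b` false.

```cpp
#include <cassert>
#include <initializer_list>

// Final values of `a` and `b` at the annotated program point.
struct Outcome {
  bool A;
  bool B;
};

// Replays the body of target() from the test. `First` and `Second` stand in
// for the results of the two calls to the unmodeled function cond().
Outcome runTarget(bool First, bool Second) {
  bool A = true;
  bool B = true;
  if (First) {
    A = false;
    if (Second)
      B = false;
  }
  return {A, B};
}

// Returns true if some combination of cond() results yields `a && !b`.
bool anyPathAllowsAAndNotB() {
  for (bool First : {false, true})
    for (bool Second : {false, true}) {
      Outcome O = runTarget(First, Second);
      if (O.A && !O.B)
        return true;
    }
  return false;
}
```

Since `b` can only become false on a path where `a` has already been set to false, the enumeration finds no such path, which is the relation the test expects the analysis to preserve through `Env.allows()`.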