Skip to content

Commit fc9821a

Browse files
committed
Reland "[dataflow] Add dedicated representation of boolean formulas"
This reverts commit 7a72ce9. Test problems were due to unspecified order of function arg evaluation. Reland "[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)" This properly frees the Value hierarchy from managing boolean formulas. We still distinguish AtomicBoolValue; this type is used in client code. However we expect to convert such uses to BoolValue (where the distinction is not needed) or Atom (where atomic identity is intended), and then fold AtomicBoolValue into FormulaBoolValue. We also distinguish TopBoolValue; this has distinct rules for widen/join/equivalence, and top-ness is not represented in Formula. It'd be nice to find a cleaner representation (e.g. the absence of a formula), but no immediate plans. For now, BoolValues with the same Formula are deduplicated. This doesn't seem desirable, as Values are mutable by their creators (properties). We can probably drop this for FormulaBoolValue immediately (not in this patch, to isolate changes). For AtomicBoolValue we first need to update clients to stop using value pointers for atom identity. The data structures around flow conditions are updated: - flow condition tokens are Atom, rather than AtomicBoolValue* - conditions are Formula, rather than BoolValue Most APIs were changed directly, some with many clients had a new version added and the existing one deprecated. The factories for BoolValues in Environment keep their existing signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue) and are not deprecated. These have very many clients and finding the most ergonomic API & migration path still needs some thought. Differential Revision: https://reviews.llvm.org/D153469
1 parent 74eac85 commit fc9821a

24 files changed

+896
-1427
lines changed

clang/include/clang/Analysis/FlowSensitive/Arena.h

Lines changed: 66 additions & 50 deletions
Original file line numberDiff line numberDiff line change
@@ -8,19 +8,18 @@
88
#ifndef LLVM_CLANG_ANALYSIS_FLOWSENSITIVE__ARENA_H
99
#define LLVM_CLANG_ANALYSIS_FLOWSENSITIVE__ARENA_H
1010

11+
#include "clang/Analysis/FlowSensitive/Formula.h"
1112
#include "clang/Analysis/FlowSensitive/StorageLocation.h"
1213
#include "clang/Analysis/FlowSensitive/Value.h"
1314
#include <vector>
1415

1516
namespace clang::dataflow {
1617

1718
/// The Arena owns the objects that model data within an analysis.
18-
/// For example, `Value` and `StorageLocation`.
19+
/// For example, `Value`, `StorageLocation`, `Atom`, and `Formula`.
1920
class Arena {
2021
public:
21-
Arena()
22-
: TrueVal(create<AtomicBoolValue>()),
23-
FalseVal(create<AtomicBoolValue>()) {}
22+
Arena() : True(makeAtom()), False(makeAtom()) {}
2423
Arena(const Arena &) = delete;
2524
Arena &operator=(const Arena &) = delete;
2625

@@ -56,74 +55,91 @@ class Arena {
5655
.get());
5756
}
5857

59-
/// Returns a boolean value that represents the conjunction of `LHS` and
60-
/// `RHS`. Subsequent calls with the same arguments, regardless of their
61-
/// order, will return the same result. If the given boolean values represent
62-
/// the same value, the result will be the value itself.
63-
BoolValue &makeAnd(BoolValue &LHS, BoolValue &RHS);
64-
65-
/// Returns a boolean value that represents the disjunction of `LHS` and
66-
/// `RHS`. Subsequent calls with the same arguments, regardless of their
67-
/// order, will return the same result. If the given boolean values represent
68-
/// the same value, the result will be the value itself.
69-
BoolValue &makeOr(BoolValue &LHS, BoolValue &RHS);
70-
71-
/// Returns a boolean value that represents the negation of `Val`. Subsequent
72-
/// calls with the same argument will return the same result.
73-
BoolValue &makeNot(BoolValue &Val);
74-
75-
/// Returns a boolean value that represents `LHS => RHS`. Subsequent calls
76-
/// with the same arguments, will return the same result. If the given boolean
77-
/// values represent the same value, the result will be a value that
78-
/// represents the true boolean literal.
79-
BoolValue &makeImplies(BoolValue &LHS, BoolValue &RHS);
80-
81-
/// Returns a boolean value that represents `LHS <=> RHS`. Subsequent calls
82-
/// with the same arguments, regardless of their order, will return the same
83-
/// result. If the given boolean values represent the same value, the result
84-
/// will be a value that represents the true boolean literal.
85-
BoolValue &makeEquals(BoolValue &LHS, BoolValue &RHS);
58+
/// Creates a BoolValue wrapping a particular formula.
59+
///
60+
/// Passing in the same formula will result in the same BoolValue.
61+
/// FIXME: Interning BoolValues but not other Values is inconsistent.
62+
/// Decide whether we want Value interning or not.
63+
BoolValue &makeBoolValue(const Formula &);
64+
65+
/// Creates a fresh atom and wraps in in an AtomicBoolValue.
66+
/// FIXME: For now, identical-address AtomicBoolValue <=> identical atom.
67+
/// Stop relying on pointer identity and remove this guarantee.
68+
AtomicBoolValue &makeAtomValue() {
69+
return cast<AtomicBoolValue>(makeBoolValue(makeAtomRef(makeAtom())));
70+
}
71+
72+
/// Creates a fresh Top boolean value.
73+
TopBoolValue &makeTopValue() {
74+
// No need for deduplicating: there's no way to create aliasing Tops.
75+
return create<TopBoolValue>(makeAtomRef(makeAtom()));
76+
}
8677

8778
/// Returns a symbolic integer value that models an integer literal equal to
8879
/// `Value`. These literals are the same every time.
8980
/// Integer literals are not typed; the type is determined by the `Expr` that
9081
/// an integer literal is associated with.
9182
IntegerValue &makeIntLiteral(llvm::APInt Value);
9283

93-
/// Returns a symbolic boolean value that models a boolean literal equal to
94-
/// `Value`. These literals are the same every time.
95-
AtomicBoolValue &makeLiteral(bool Value) const {
96-
return Value ? TrueVal : FalseVal;
84+
// Factories for boolean formulas.
85+
// Formulas are interned: passing the same arguments return the same result.
86+
// For commutative operations like And/Or, interning ignores order.
87+
// Simplifications are applied: makeOr(X, X) => X, etc.
88+
89+
/// Returns a formula for the conjunction of `LHS` and `RHS`.
90+
const Formula &makeAnd(const Formula &LHS, const Formula &RHS);
91+
92+
/// Returns a formula for the disjunction of `LHS` and `RHS`.
93+
const Formula &makeOr(const Formula &LHS, const Formula &RHS);
94+
95+
/// Returns a formula for the negation of `Val`.
96+
const Formula &makeNot(const Formula &Val);
97+
98+
/// Returns a formula for `LHS => RHS`.
99+
const Formula &makeImplies(const Formula &LHS, const Formula &RHS);
100+
101+
/// Returns a formula for `LHS <=> RHS`.
102+
const Formula &makeEquals(const Formula &LHS, const Formula &RHS);
103+
104+
/// Returns a formula for the variable A.
105+
const Formula &makeAtomRef(Atom A);
106+
107+
/// Returns a formula for a literal true/false.
108+
const Formula &makeLiteral(bool Value) {
109+
return makeAtomRef(Value ? True : False);
97110
}
98111

112+
/// Returns a new atomic boolean variable, distinct from any other.
113+
Atom makeAtom() { return static_cast<Atom>(NextAtom++); };
114+
99115
/// Creates a fresh flow condition and returns a token that identifies it. The
100116
/// token can be used to perform various operations on the flow condition such
101117
/// as adding constraints to it, forking it, joining it with another flow
102118
/// condition, or checking implications.
103-
AtomicBoolValue &makeFlowConditionToken() {
104-
return create<AtomicBoolValue>();
105-
}
119+
Atom makeFlowConditionToken() { return makeAtom(); }
106120

107121
private:
122+
llvm::BumpPtrAllocator Alloc;
123+
108124
// Storage for the state of a program.
109125
std::vector<std::unique_ptr<StorageLocation>> Locs;
110126
std::vector<std::unique_ptr<Value>> Vals;
111127

112128
// Indices that are used to avoid recreating the same integer literals and
113129
// composite boolean values.
114130
llvm::DenseMap<llvm::APInt, IntegerValue *> IntegerLiterals;
115-
llvm::DenseMap<std::pair<BoolValue *, BoolValue *>, ConjunctionValue *>
116-
ConjunctionVals;
117-
llvm::DenseMap<std::pair<BoolValue *, BoolValue *>, DisjunctionValue *>
118-
DisjunctionVals;
119-
llvm::DenseMap<BoolValue *, NegationValue *> NegationVals;
120-
llvm::DenseMap<std::pair<BoolValue *, BoolValue *>, ImplicationValue *>
121-
ImplicationVals;
122-
llvm::DenseMap<std::pair<BoolValue *, BoolValue *>, BiconditionalValue *>
123-
BiconditionalVals;
124-
125-
AtomicBoolValue &TrueVal;
126-
AtomicBoolValue &FalseVal;
131+
using FormulaPair = std::pair<const Formula *, const Formula *>;
132+
llvm::DenseMap<FormulaPair, const Formula *> Ands;
133+
llvm::DenseMap<FormulaPair, const Formula *> Ors;
134+
llvm::DenseMap<const Formula *, const Formula *> Nots;
135+
llvm::DenseMap<FormulaPair, const Formula *> Implies;
136+
llvm::DenseMap<FormulaPair, const Formula *> Equals;
137+
llvm::DenseMap<Atom, const Formula *> AtomRefs;
138+
139+
llvm::DenseMap<const Formula *, BoolValue *> FormulaValues;
140+
unsigned NextAtom = 0;
141+
142+
Atom True, False;
127143
};
128144

129145
} // namespace clang::dataflow

clang/include/clang/Analysis/FlowSensitive/DataflowAnalysisContext.h

Lines changed: 13 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -138,33 +138,31 @@ class DataflowAnalysisContext {
138138
PointerValue &getOrCreateNullPointerValue(QualType PointeeType);
139139

140140
/// Adds `Constraint` to the flow condition identified by `Token`.
141-
void addFlowConditionConstraint(AtomicBoolValue &Token,
142-
BoolValue &Constraint);
141+
void addFlowConditionConstraint(Atom Token, const Formula &Constraint);
143142

144143
/// Creates a new flow condition with the same constraints as the flow
145144
/// condition identified by `Token` and returns its token.
146-
AtomicBoolValue &forkFlowCondition(AtomicBoolValue &Token);
145+
Atom forkFlowCondition(Atom Token);
147146

148147
/// Creates a new flow condition that represents the disjunction of the flow
149148
/// conditions identified by `FirstToken` and `SecondToken`, and returns its
150149
/// token.
151-
AtomicBoolValue &joinFlowConditions(AtomicBoolValue &FirstToken,
152-
AtomicBoolValue &SecondToken);
150+
Atom joinFlowConditions(Atom FirstToken, Atom SecondToken);
153151

154152
/// Returns true if and only if the constraints of the flow condition
155153
/// identified by `Token` imply that `Val` is true.
156-
bool flowConditionImplies(AtomicBoolValue &Token, BoolValue &Val);
154+
bool flowConditionImplies(Atom Token, const Formula &);
157155

158156
/// Returns true if and only if the constraints of the flow condition
159157
/// identified by `Token` are always true.
160-
bool flowConditionIsTautology(AtomicBoolValue &Token);
158+
bool flowConditionIsTautology(Atom Token);
161159

162160
/// Returns true if `Val1` is equivalent to `Val2`.
163161
/// Note: This function doesn't take into account constraints on `Val1` and
164162
/// `Val2` imposed by the flow condition.
165-
bool equivalentBoolValues(BoolValue &Val1, BoolValue &Val2);
163+
bool equivalentFormulas(const Formula &Val1, const Formula &Val2);
166164

167-
LLVM_DUMP_METHOD void dumpFlowCondition(AtomicBoolValue &Token,
165+
LLVM_DUMP_METHOD void dumpFlowCondition(Atom Token,
168166
llvm::raw_ostream &OS = llvm::dbgs());
169167

170168
/// Returns the `ControlFlowContext` registered for `F`, if any. Otherwise,
@@ -181,7 +179,7 @@ class DataflowAnalysisContext {
181179
/// included in `Constraints` to provide contextually-accurate results, e.g.
182180
/// if any definitions or relationships of the values in `Constraints` have
183181
/// been stored in flow conditions.
184-
Solver::Result querySolver(llvm::SetVector<BoolValue *> Constraints);
182+
Solver::Result querySolver(llvm::SetVector<const Formula *> Constraints);
185183

186184
private:
187185
friend class Environment;
@@ -209,12 +207,12 @@ class DataflowAnalysisContext {
209207
/// to track tokens of flow conditions that were already visited by recursive
210208
/// calls.
211209
void addTransitiveFlowConditionConstraints(
212-
AtomicBoolValue &Token, llvm::SetVector<BoolValue *> &Constraints,
213-
llvm::DenseSet<AtomicBoolValue *> &VisitedTokens);
210+
Atom Token, llvm::SetVector<const Formula *> &Constraints,
211+
llvm::DenseSet<Atom> &VisitedTokens);
214212

215213
/// Returns true if the solver is able to prove that there is no satisfying
216214
/// assignment for `Constraints`
217-
bool isUnsatisfiable(llvm::SetVector<BoolValue *> Constraints) {
215+
bool isUnsatisfiable(llvm::SetVector<const Formula *> Constraints) {
218216
return querySolver(std::move(Constraints)).getStatus() ==
219217
Solver::Result::Status::Unsatisfiable;
220218
}
@@ -253,9 +251,8 @@ class DataflowAnalysisContext {
253251
// Flow conditions depend on other flow conditions if they are created using
254252
// `forkFlowCondition` or `joinFlowConditions`. The graph of flow condition
255253
// dependencies is stored in the `FlowConditionDeps` map.
256-
llvm::DenseMap<AtomicBoolValue *, llvm::DenseSet<AtomicBoolValue *>>
257-
FlowConditionDeps;
258-
llvm::DenseMap<AtomicBoolValue *, BoolValue *> FlowConditionConstraints;
254+
llvm::DenseMap<Atom, llvm::DenseSet<Atom>> FlowConditionDeps;
255+
llvm::DenseMap<Atom, const Formula *> FlowConditionConstraints;
259256

260257
llvm::DenseMap<const FunctionDecl *, ControlFlowContext> FunctionContexts;
261258

clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h

Lines changed: 23 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -456,77 +456,86 @@ class Environment {
456456
template <typename T, typename... Args>
457457
std::enable_if_t<std::is_base_of<Value, T>::value, T &>
458458
create(Args &&...args) {
459-
return DACtx->arena().create<T>(std::forward<Args>(args)...);
459+
return arena().create<T>(std::forward<Args>(args)...);
460460
}
461461

462462
/// Returns a symbolic integer value that models an integer literal equal to
463463
/// `Value`
464464
IntegerValue &getIntLiteralValue(llvm::APInt Value) const {
465-
return DACtx->arena().makeIntLiteral(Value);
465+
return arena().makeIntLiteral(Value);
466466
}
467467

468468
/// Returns a symbolic boolean value that models a boolean literal equal to
469469
/// `Value`
470470
AtomicBoolValue &getBoolLiteralValue(bool Value) const {
471-
return DACtx->arena().makeLiteral(Value);
471+
return cast<AtomicBoolValue>(
472+
arena().makeBoolValue(arena().makeLiteral(Value)));
472473
}
473474

474475
/// Returns an atomic boolean value.
475476
BoolValue &makeAtomicBoolValue() const {
476-
return DACtx->arena().create<AtomicBoolValue>();
477+
return arena().makeAtomValue();
477478
}
478479

479480
/// Returns a unique instance of boolean Top.
480481
BoolValue &makeTopBoolValue() const {
481-
return DACtx->arena().create<TopBoolValue>();
482+
return arena().makeTopValue();
482483
}
483484

484485
/// Returns a boolean value that represents the conjunction of `LHS` and
485486
/// `RHS`. Subsequent calls with the same arguments, regardless of their
486487
/// order, will return the same result. If the given boolean values represent
487488
/// the same value, the result will be the value itself.
488489
BoolValue &makeAnd(BoolValue &LHS, BoolValue &RHS) const {
489-
return DACtx->arena().makeAnd(LHS, RHS);
490+
return arena().makeBoolValue(
491+
arena().makeAnd(LHS.formula(), RHS.formula()));
490492
}
491493

492494
/// Returns a boolean value that represents the disjunction of `LHS` and
493495
/// `RHS`. Subsequent calls with the same arguments, regardless of their
494496
/// order, will return the same result. If the given boolean values represent
495497
/// the same value, the result will be the value itself.
496498
BoolValue &makeOr(BoolValue &LHS, BoolValue &RHS) const {
497-
return DACtx->arena().makeOr(LHS, RHS);
499+
return arena().makeBoolValue(
500+
arena().makeOr(LHS.formula(), RHS.formula()));
498501
}
499502

500503
/// Returns a boolean value that represents the negation of `Val`. Subsequent
501504
/// calls with the same argument will return the same result.
502505
BoolValue &makeNot(BoolValue &Val) const {
503-
return DACtx->arena().makeNot(Val);
506+
return arena().makeBoolValue(arena().makeNot(Val.formula()));
504507
}
505508

506509
/// Returns a boolean value represents `LHS` => `RHS`. Subsequent calls with
507510
/// the same arguments, will return the same result. If the given boolean
508511
/// values represent the same value, the result will be a value that
509512
/// represents the true boolean literal.
510513
BoolValue &makeImplication(BoolValue &LHS, BoolValue &RHS) const {
511-
return DACtx->arena().makeImplies(LHS, RHS);
514+
return arena().makeBoolValue(
515+
arena().makeImplies(LHS.formula(), RHS.formula()));
512516
}
513517

514518
/// Returns a boolean value represents `LHS` <=> `RHS`. Subsequent calls with
515519
/// the same arguments, regardless of their order, will return the same
516520
/// result. If the given boolean values represent the same value, the result
517521
/// will be a value that represents the true boolean literal.
518522
BoolValue &makeIff(BoolValue &LHS, BoolValue &RHS) const {
519-
return DACtx->arena().makeEquals(LHS, RHS);
523+
return arena().makeBoolValue(
524+
arena().makeEquals(LHS.formula(), RHS.formula()));
520525
}
521526

522527
/// Returns the token that identifies the flow condition of the environment.
523-
AtomicBoolValue &getFlowConditionToken() const { return *FlowConditionToken; }
528+
Atom getFlowConditionToken() const { return FlowConditionToken; }
524529

525530
/// Adds `Val` to the set of clauses that constitute the flow condition.
531+
void addToFlowCondition(const Formula &);
532+
LLVM_DEPRECATED("Use Formula version instead", "")
526533
void addToFlowCondition(BoolValue &Val);
527534

528535
/// Returns true if and only if the clauses that constitute the flow condition
529536
/// imply that `Val` is true.
537+
bool flowConditionImplies(const Formula &) const;
538+
LLVM_DEPRECATED("Use Formula version instead", "")
530539
bool flowConditionImplies(BoolValue &Val) const;
531540

532541
/// Returns the `DeclContext` of the block being analysed, if any. Otherwise,
@@ -547,6 +556,8 @@ class Environment {
547556
/// Returns the `DataflowAnalysisContext` used by the environment.
548557
DataflowAnalysisContext &getDataflowAnalysisContext() const { return *DACtx; }
549558

559+
Arena &arena() const { return DACtx->arena(); }
560+
550561
LLVM_DUMP_METHOD void dump() const;
551562
LLVM_DUMP_METHOD void dump(raw_ostream &OS) const;
552563

@@ -617,7 +628,7 @@ class Environment {
617628
std::pair<StructValue *, const ValueDecl *>>
618629
MemberLocToStruct;
619630

620-
AtomicBoolValue *FlowConditionToken;
631+
Atom FlowConditionToken;
621632
};
622633

623634
/// Returns the storage location for the implicit object of a

0 commit comments

Comments
 (0)