[clang][dataflow] Add an early-out to `flowConditionImplies()` / `flowConditionAllows()`. #78172

martinboehme · 2024-01-15T15:41:00Z

This saves having to assemble the set of constraints and run the SAT solver in
the trivial case of flowConditionImplies(true) or
flowConditionAllows(false).

This is an update / reland of my previous reverted
#77453. That PR contained a
logic bug -- the early-out for flowConditionAllows() was wrong because my
intuition about the logic was wrong. (In particular, note that
flowConditionImplies(F) does not imply flowConditionAllows(F), even though
this may run counter to intuition.)

I've now done what I should have done on the first iteration and added more
tests. These pass both with and without my early-outs.

This patch is a performance win on the benchmarks for the Crubit nullability
checker, except for one slight regression on a relatively short benchmark:

name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     68.5µs ± 7%  67.6µs ± 4%    ~     (p=0.159 n=18+19)
BM_PointerAnalysisIntLoop          173µs ± 3%   162µs ± 4%  -6.40%  (p=0.000 n=19+20)
BM_PointerAnalysisPointerLoop      307µs ± 2%   312µs ± 4%  +1.56%  (p=0.013 n=18+20)
BM_PointerAnalysisBranch           199µs ± 4%   181µs ± 4%  -8.81%  (p=0.000 n=20+20)
BM_PointerAnalysisLoopAndBranch    503µs ± 3%   508µs ± 2%    ~     (p=0.081 n=18+19)
BM_PointerAnalysisTwoLoops         304µs ± 4%   286µs ± 2%  -6.04%  (p=0.000 n=19+20)
BM_PointerAnalysisJoinFilePath    4.78ms ± 3%  4.54ms ± 4%  -4.97%  (p=0.000 n=20+20)
BM_PointerAnalysisCallInLoop      3.05ms ± 3%  2.90ms ± 4%  -5.05%  (p=0.000 n=19+20)

When running clang-tidy on real-world code, the results are less clear. In
three runs, averaged, on an arbitrarily chosen input file, I get 11.60 s of user
time without this patch and 11.40 s with it, though with considerable
measurement noise (I'm seeing up to 0.2 s of variation between runs).

Still, this is a very simple change, and it is a clear win in benchmarks, so I
think it is worth making.

…wConditionAllows()`. This saves having to assemble the set of constraints and run the SAT solver in the trivial case of `flowConditionImplies(true)` or `flowConditionAllows(false)`. This is an update / reland of my previous reverted [llvm#77453](llvm#77453). That PR contained a logic bug -- the early-out for `flowConditionAllows()` was wrong because my intuition about the logic was wrong. (In particular, note that `flowConditionImplies(F)` does not imply `flowConditionAllows(F)`, even though this may run counter to intuition.) I've now done what I should have done on the first iteration and added more tests. These pass both with and without my early-outs. This patch is a performance win on the benchmarks for the Crubit nullability checker, except for one slight regression on a relatively short benchmark: ``` name old cpu/op new cpu/op delta BM_PointerAnalysisCopyPointer 68.5µs ± 7% 67.6µs ± 4% ~ (p=0.159 n=18+19) BM_PointerAnalysisIntLoop 173µs ± 3% 162µs ± 4% -6.40% (p=0.000 n=19+20) BM_PointerAnalysisPointerLoop 307µs ± 2% 312µs ± 4% +1.56% (p=0.013 n=18+20) BM_PointerAnalysisBranch 199µs ± 4% 181µs ± 4% -8.81% (p=0.000 n=20+20) BM_PointerAnalysisLoopAndBranch 503µs ± 3% 508µs ± 2% ~ (p=0.081 n=18+19) BM_PointerAnalysisTwoLoops 304µs ± 4% 286µs ± 2% -6.04% (p=0.000 n=19+20) BM_PointerAnalysisJoinFilePath 4.78ms ± 3% 4.54ms ± 4% -4.97% (p=0.000 n=20+20) BM_PointerAnalysisCallInLoop 3.05ms ± 3% 2.90ms ± 4% -5.05% (p=0.000 n=19+20) ``` When running clang-tidy on real-world code, the results are less clear. In three runs, averaged, on an arbitrarily chosen input file, I get 11.60 s of user time without this patch and 11.40 s with it, though with considerable measurement noise (I'm seeing up to 0.2 s of variation between runs). Still, this is a very simple change, and it is a clear win in benchmarks, so I think it is worth making.

llvmbot · 2024-01-15T15:41:28Z

@llvm/pr-subscribers-clang-analysis

@llvm/pr-subscribers-clang

Author: None (martinboehme)

Changes

This saves having to assemble the set of constraints and run the SAT solver in
the trivial case of flowConditionImplies(true) or
flowConditionAllows(false).

This is an update / reland of my previous reverted
#77453. That PR contained a
logic bug -- the early-out for flowConditionAllows() was wrong because my
intuition about the logic was wrong. (In particular, note that
flowConditionImplies(F) does not imply flowConditionAllows(F), even though
this may run counter to intuition.)

I've now done what I should have done on the first iteration and added more
tests. These pass both with and without my early-outs.

This patch is a performance win on the benchmarks for the Crubit nullability
checker, except for one slight regression on a relatively short benchmark:

name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     68.5µs ± 7%  67.6µs ± 4%    ~     (p=0.159 n=18+19)
BM_PointerAnalysisIntLoop          173µs ± 3%   162µs ± 4%  -6.40%  (p=0.000 n=19+20)
BM_PointerAnalysisPointerLoop      307µs ± 2%   312µs ± 4%  +1.56%  (p=0.013 n=18+20)
BM_PointerAnalysisBranch           199µs ± 4%   181µs ± 4%  -8.81%  (p=0.000 n=20+20)
BM_PointerAnalysisLoopAndBranch    503µs ± 3%   508µs ± 2%    ~     (p=0.081 n=18+19)
BM_PointerAnalysisTwoLoops         304µs ± 4%   286µs ± 2%  -6.04%  (p=0.000 n=19+20)
BM_PointerAnalysisJoinFilePath    4.78ms ± 3%  4.54ms ± 4%  -4.97%  (p=0.000 n=20+20)
BM_PointerAnalysisCallInLoop      3.05ms ± 3%  2.90ms ± 4%  -5.05%  (p=0.000 n=19+20)

When running clang-tidy on real-world code, the results are less clear. In
three runs, averaged, on an arbitrarily chosen input file, I get 11.60 s of user
time without this patch and 11.40 s with it, though with considerable
measurement noise (I'm seeing up to 0.2 s of variation between runs).

Still, this is a very simple change, and it is a clear win in benchmarks, so I
think it is worth making.

Full diff: https://github.com/llvm/llvm-project/pull/78172.diff

3 Files Affected:

(modified) clang/include/clang/Analysis/FlowSensitive/Formula.h (+4)
(modified) clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp (+6)
(modified) clang/unittests/Analysis/FlowSensitive/DataflowAnalysisContextTest.cpp (+27-3)

diff --git a/clang/include/clang/Analysis/FlowSensitive/Formula.h b/clang/include/clang/Analysis/FlowSensitive/Formula.h
index 982e400c1deff1f..0e6352403a832fe 100644
--- a/clang/include/clang/Analysis/FlowSensitive/Formula.h
+++ b/clang/include/clang/Analysis/FlowSensitive/Formula.h
@@ -75,6 +75,10 @@ class alignas(const Formula *) Formula {
     return static_cast<bool>(Value);
   }
 
+  bool isLiteral(bool b) const {
+    return kind() == Literal && static_cast<bool>(Value) == b;
+  }
+
   ArrayRef<const Formula *> operands() const {
     return ArrayRef(reinterpret_cast<Formula *const *>(this + 1),
                     numOperands(kind()));
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp b/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
index fa114979c8e3263..d95b3324ef6b2ae 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowAnalysisContext.cpp
@@ -174,6 +174,9 @@ Solver::Result DataflowAnalysisContext::querySolver(
 
 bool DataflowAnalysisContext::flowConditionImplies(Atom Token,
                                                    const Formula &F) {
+  if (F.isLiteral(true))
+    return true;
+
   // Returns true if and only if truth assignment of the flow condition implies
   // that `F` is also true. We prove whether or not this property holds by
   // reducing the problem to satisfiability checking. In other words, we attempt
@@ -188,6 +191,9 @@ bool DataflowAnalysisContext::flowConditionImplies(Atom Token,
 
 bool DataflowAnalysisContext::flowConditionAllows(Atom Token,
                                                   const Formula &F) {
+  if (F.isLiteral(false))
+    return false;
+
   llvm::SetVector<const Formula *> Constraints;
   Constraints.insert(&arena().makeAtomRef(Token));
   Constraints.insert(&F);
diff --git a/clang/unittests/Analysis/FlowSensitive/DataflowAnalysisContextTest.cpp b/clang/unittests/Analysis/FlowSensitive/DataflowAnalysisContextTest.cpp
index 88eb11045b9e9f9..4f7a72c502ccfea 100644
--- a/clang/unittests/Analysis/FlowSensitive/DataflowAnalysisContextTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/DataflowAnalysisContextTest.cpp
@@ -33,10 +33,34 @@ TEST_F(DataflowAnalysisContextTest, DistinctTopsNotEquivalent) {
   EXPECT_FALSE(Context.equivalentFormulas(X.formula(), Y.formula()));
 }
 
-TEST_F(DataflowAnalysisContextTest, EmptyFlowCondition) {
+TEST_F(DataflowAnalysisContextTest, TautologicalFlowConditionImplies) {
   Atom FC = A.makeFlowConditionToken();
-  auto &C = A.makeAtomRef(A.makeAtom());
-  EXPECT_FALSE(Context.flowConditionImplies(FC, C));
+  EXPECT_TRUE(Context.flowConditionImplies(FC, A.makeLiteral(true)));
+  EXPECT_FALSE(Context.flowConditionImplies(FC, A.makeLiteral(false)));
+  EXPECT_FALSE(Context.flowConditionImplies(FC, A.makeAtomRef(A.makeAtom())));
+}
+
+TEST_F(DataflowAnalysisContextTest, TautologicalFlowConditionAllows) {
+  Atom FC = A.makeFlowConditionToken();
+  EXPECT_TRUE(Context.flowConditionAllows(FC, A.makeLiteral(true)));
+  EXPECT_FALSE(Context.flowConditionAllows(FC, A.makeLiteral(false)));
+  EXPECT_TRUE(Context.flowConditionAllows(FC, A.makeAtomRef(A.makeAtom())));
+}
+
+TEST_F(DataflowAnalysisContextTest, ContradictoryFlowConditionImpliesAnything) {
+  Atom FC = A.makeFlowConditionToken();
+  Context.addFlowConditionConstraint(FC, A.makeLiteral(false));
+  EXPECT_TRUE(Context.flowConditionImplies(FC, A.makeLiteral(true)));
+  EXPECT_TRUE(Context.flowConditionImplies(FC, A.makeLiteral(false)));
+  EXPECT_TRUE(Context.flowConditionImplies(FC, A.makeAtomRef(A.makeAtom())));
+}
+
+TEST_F(DataflowAnalysisContextTest, ContradictoryFlowConditionAllowsNothing) {
+  Atom FC = A.makeFlowConditionToken();
+  Context.addFlowConditionConstraint(FC, A.makeLiteral(false));
+  EXPECT_FALSE(Context.flowConditionAllows(FC, A.makeLiteral(true)));
+  EXPECT_FALSE(Context.flowConditionAllows(FC, A.makeLiteral(false)));
+  EXPECT_FALSE(Context.flowConditionAllows(FC, A.makeAtomRef(A.makeAtom())));
 }
 
 TEST_F(DataflowAnalysisContextTest, AddFlowConditionConstraint) {

…wConditionAllows()`. (llvm#78172) This saves having to assemble the set of constraints and run the SAT solver in the trivial case of `flowConditionImplies(true)` or `flowConditionAllows(false)`. This is an update / reland of my previous reverted [llvm#77453](llvm#77453). That PR contained a logic bug -- the early-out for `flowConditionAllows()` was wrong because my intuition about the logic was wrong. (In particular, note that `flowConditionImplies(F)` does not imply `flowConditionAllows(F)`, even though this may run counter to intuition.) I've now done what I should have done on the first iteration and added more tests. These pass both with and without my early-outs. This patch is a performance win on the benchmarks for the Crubit nullability checker, except for one slight regression on a relatively short benchmark: ``` name old cpu/op new cpu/op delta BM_PointerAnalysisCopyPointer 68.5µs ± 7% 67.6µs ± 4% ~ (p=0.159 n=18+19) BM_PointerAnalysisIntLoop 173µs ± 3% 162µs ± 4% -6.40% (p=0.000 n=19+20) BM_PointerAnalysisPointerLoop 307µs ± 2% 312µs ± 4% +1.56% (p=0.013 n=18+20) BM_PointerAnalysisBranch 199µs ± 4% 181µs ± 4% -8.81% (p=0.000 n=20+20) BM_PointerAnalysisLoopAndBranch 503µs ± 3% 508µs ± 2% ~ (p=0.081 n=18+19) BM_PointerAnalysisTwoLoops 304µs ± 4% 286µs ± 2% -6.04% (p=0.000 n=19+20) BM_PointerAnalysisJoinFilePath 4.78ms ± 3% 4.54ms ± 4% -4.97% (p=0.000 n=20+20) BM_PointerAnalysisCallInLoop 3.05ms ± 3% 2.90ms ± 4% -5.05% (p=0.000 n=19+20) ``` When running clang-tidy on real-world code, the results are less clear. In three runs, averaged, on an arbitrarily chosen input file, I get 11.60 s of user time without this patch and 11.40 s with it, though with considerable measurement noise (I'm seeing up to 0.2 s of variation between runs). Still, this is a very simple change, and it is a clear win in benchmarks, so I think it is worth making.

llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Jan 15, 2024

martinboehme requested review from ymand and Xazax-hun January 15, 2024 15:41

Xazax-hun approved these changes Jan 15, 2024

View reviewed changes

martinboehme merged commit af1463d into llvm:main Jan 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[clang][dataflow] Add an early-out to `flowConditionImplies()` / `flowConditionAllows()`. #78172

[clang][dataflow] Add an early-out to `flowConditionImplies()` / `flowConditionAllows()`. #78172

Uh oh!

martinboehme commented Jan 15, 2024

Uh oh!

llvmbot commented Jan 15, 2024 •

edited

Loading

Uh oh!

Uh oh!

[clang][dataflow] Add an early-out to flowConditionImplies() / flowConditionAllows(). #78172

[clang][dataflow] Add an early-out to flowConditionImplies() / flowConditionAllows(). #78172

Uh oh!

Conversation

martinboehme commented Jan 15, 2024

Uh oh!

llvmbot commented Jan 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

[clang][dataflow] Add an early-out to `flowConditionImplies()` / `flowConditionAllows()`. #78172

[clang][dataflow] Add an early-out to `flowConditionImplies()` / `flowConditionAllows()`. #78172

llvmbot commented Jan 15, 2024 •

edited

Loading