[clang][CFG] Change child order in Reverse Post Order (RPO) iteration. #80030

ymand · 2024-01-30T16:44:54Z

The CFG orders the blocks of loop bodies before those of loop successors (both numerically, and in the successor order of the loop condition block). So, RPO necessarily reverses that order, placing the loop successor before the loop body. For many analyses, particularly those that converge to a fixpoint, this results in potentially significant extra work, because loop successors will necessarily need to be reconsidered once the algorithm has reached a fixpoint on the loop body.

This definition of CFG graph traits reverses the order of children, so that loop bodies will come first in an RPO. Then, the algorithm can fully evaluate the loop and only then consider successor blocks.

llvmbot · 2024-01-30T16:45:26Z

@llvm/pr-subscribers-clang-analysis

Author: Yitzhak Mandelbaum (ymand)

Changes

The CFG orders the blocks of loop bodies before those of loop successors
(both numerically, and in the successor order of the loop condition
block). So, RPO necessarily reverses that order, placing the loop successor
before the loop body. For many analyses, particularly those that converge
to a fixpoint, this results in potentially significant extra work, because loop
successors will necessarily need to be reconsidered once the algorithm has
reached a fixpoint on the loop body.

This definition of CFG graph traits reverses the order of children, so that loop
bodies will come first in an RPO. Then, the algorithm can fully evaluate the
loop and only then consider successor blocks.

Full diff: https://github.com/llvm/llvm-project/pull/80030.diff

3 Files Affected:

(modified) clang/include/clang/Analysis/Analyses/PostOrderCFGView.h (+35-1)
(modified) clang/unittests/Analysis/CFGTest.cpp (+1-9)
(modified) clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp (+7-8)

diff --git a/clang/include/clang/Analysis/Analyses/PostOrderCFGView.h b/clang/include/clang/Analysis/Analyses/PostOrderCFGView.h
index 4356834adf76f..031c4bc15338e 100644
--- a/clang/include/clang/Analysis/Analyses/PostOrderCFGView.h
+++ b/clang/include/clang/Analysis/Analyses/PostOrderCFGView.h
@@ -70,7 +70,41 @@ class PostOrderCFGView : public ManagedAnalysis {
   };
 
 private:
-  using po_iterator = llvm::po_iterator<const CFG *, CFGBlockSet, true>;
+  // The CFG orders the blocks of loop bodies before those of loop successors
+  // (both numerically, and in the successor order of the loop condition
+  // block). So, RPO necessarily reverses that order, placing the loop successor
+  // *before* the loop body. For many analyses, particularly those that converge
+  // to a fixpoint, this results in potentially significant extra work as loop
+  // successors will necessarily need to be reconsidered once the algorithm has
+  // reached a fixpoint on the loop body.
+  //
+  // This definition of CFG graph traits reverses the order of children, so that
+  // loop bodies will come first in an RPO.
+  struct CFGLoopBodyFirstTraits {
+    using NodeRef = const ::clang::CFGBlock *;
+    using ChildIteratorType = ::clang::CFGBlock::const_succ_reverse_iterator;
+
+    static ChildIteratorType child_begin(NodeRef N) { return N->succ_rbegin(); }
+    static ChildIteratorType child_end(NodeRef N) { return N->succ_rend(); }
+
+    using nodes_iterator = ::clang::CFG::const_iterator;
+
+    static NodeRef getEntryNode(const ::clang::CFG *F) {
+      return &F->getEntry();
+    }
+
+    static nodes_iterator nodes_begin(const ::clang::CFG *F) {
+      return F->nodes_begin();
+    }
+
+    static nodes_iterator nodes_end(const ::clang::CFG *F) {
+      return F->nodes_end();
+    }
+
+    static unsigned size(const ::clang::CFG *F) { return F->size(); }
+  };
+  using po_iterator =
+      llvm::po_iterator<const CFG *, CFGBlockSet, true, CFGLoopBodyFirstTraits>;
   std::vector<const CFGBlock *> Blocks;
 
   using BlockOrderTy = llvm::DenseMap<const CFGBlock *, unsigned>;
diff --git a/clang/unittests/Analysis/CFGTest.cpp b/clang/unittests/Analysis/CFGTest.cpp
index 9fa034a0e472e..2b27da0081425 100644
--- a/clang/unittests/Analysis/CFGTest.cpp
+++ b/clang/unittests/Analysis/CFGTest.cpp
@@ -276,14 +276,6 @@ TEST(CFG, Worklists) {
                            ForwardNodes.begin()));
   }
 
-  // RPO: 876321054
-  // WTO: 876534210
-  // So, we construct the WTO order accordingly from the reference order.
-  std::vector<const CFGBlock *> WTOOrder = {
-      ReferenceOrder[0], ReferenceOrder[1], ReferenceOrder[2],
-      ReferenceOrder[7], ReferenceOrder[3], ReferenceOrder[8],
-      ReferenceOrder[4], ReferenceOrder[5], ReferenceOrder[6]};
-
   {
     using ::testing::ElementsAreArray;
     std::optional<WeakTopologicalOrdering> WTO = getIntervalWTO(*CFG);
@@ -297,7 +289,7 @@ TEST(CFG, Worklists) {
     while (const CFGBlock *B = WTOWorklist.dequeue())
       WTONodes.push_back(B);
 
-    EXPECT_THAT(WTONodes, ElementsAreArray(WTOOrder));
+    EXPECT_THAT(WTONodes, ElementsAreArray(*WTO));
   }
 
   std::reverse(ReferenceOrder.begin(), ReferenceOrder.end());
diff --git a/clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp b/clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp
index c5594aa3fe9d1..57920c49a7d3d 100644
--- a/clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp
@@ -5,7 +5,6 @@
 #include "clang/Analysis/FlowSensitive/DataflowLattice.h"
 #include "llvm/Testing/Support/Error.h"
 #include "gtest/gtest.h"
-#include <optional>
 
 namespace clang::dataflow::test {
 namespace {
@@ -90,7 +89,7 @@ class TestLogger : public Logger {
 AnalysisInputs<TestAnalysis> makeInputs() {
   const char *Code = R"cpp(
 int target(bool b, int p, int q) {
-  return b ? p : q;    
+  return b ? p : q;
 }
 )cpp";
   static const std::vector<std::string> Args = {
@@ -125,17 +124,17 @@ transfer()
 recordState(Elements=2, Branches=0, Joins=0)
 recordState(Elements=2, Branches=0, Joins=0)
 
-enterBlock(3, false)
-transferBranch(0)
+enterBlock(2, false)
+transferBranch(1)
 recordState(Elements=2, Branches=1, Joins=0)
-enterElement(q)
+enterElement(p)
 transfer()
 recordState(Elements=3, Branches=1, Joins=0)
 
-enterBlock(2, false)
-transferBranch(1)
+enterBlock(3, false)
+transferBranch(0)
 recordState(Elements=2, Branches=1, Joins=0)
-enterElement(p)
+enterElement(q)
 transfer()
 recordState(Elements=3, Branches=1, Joins=0)

llvmbot · 2024-01-30T16:45:26Z

@llvm/pr-subscribers-clang

Author: Yitzhak Mandelbaum (ymand)

Changes

The CFG orders the blocks of loop bodies before those of loop successors
(both numerically, and in the successor order of the loop condition
block). So, RPO necessarily reverses that order, placing the loop successor
before the loop body. For many analyses, particularly those that converge
to a fixpoint, this results in potentially significant extra work, because loop
successors will necessarily need to be reconsidered once the algorithm has
reached a fixpoint on the loop body.

This definition of CFG graph traits reverses the order of children, so that loop
bodies will come first in an RPO. Then, the algorithm can fully evaluate the
loop and only then consider successor blocks.

Full diff: https://github.com/llvm/llvm-project/pull/80030.diff

3 Files Affected:

(modified) clang/include/clang/Analysis/Analyses/PostOrderCFGView.h (+35-1)
(modified) clang/unittests/Analysis/CFGTest.cpp (+1-9)
(modified) clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp (+7-8)

diff --git a/clang/include/clang/Analysis/Analyses/PostOrderCFGView.h b/clang/include/clang/Analysis/Analyses/PostOrderCFGView.h
index 4356834adf76f..031c4bc15338e 100644
--- a/clang/include/clang/Analysis/Analyses/PostOrderCFGView.h
+++ b/clang/include/clang/Analysis/Analyses/PostOrderCFGView.h
@@ -70,7 +70,41 @@ class PostOrderCFGView : public ManagedAnalysis {
   };
 
 private:
-  using po_iterator = llvm::po_iterator<const CFG *, CFGBlockSet, true>;
+  // The CFG orders the blocks of loop bodies before those of loop successors
+  // (both numerically, and in the successor order of the loop condition
+  // block). So, RPO necessarily reverses that order, placing the loop successor
+  // *before* the loop body. For many analyses, particularly those that converge
+  // to a fixpoint, this results in potentially significant extra work as loop
+  // successors will necessarily need to be reconsidered once the algorithm has
+  // reached a fixpoint on the loop body.
+  //
+  // This definition of CFG graph traits reverses the order of children, so that
+  // loop bodies will come first in an RPO.
+  struct CFGLoopBodyFirstTraits {
+    using NodeRef = const ::clang::CFGBlock *;
+    using ChildIteratorType = ::clang::CFGBlock::const_succ_reverse_iterator;
+
+    static ChildIteratorType child_begin(NodeRef N) { return N->succ_rbegin(); }
+    static ChildIteratorType child_end(NodeRef N) { return N->succ_rend(); }
+
+    using nodes_iterator = ::clang::CFG::const_iterator;
+
+    static NodeRef getEntryNode(const ::clang::CFG *F) {
+      return &F->getEntry();
+    }
+
+    static nodes_iterator nodes_begin(const ::clang::CFG *F) {
+      return F->nodes_begin();
+    }
+
+    static nodes_iterator nodes_end(const ::clang::CFG *F) {
+      return F->nodes_end();
+    }
+
+    static unsigned size(const ::clang::CFG *F) { return F->size(); }
+  };
+  using po_iterator =
+      llvm::po_iterator<const CFG *, CFGBlockSet, true, CFGLoopBodyFirstTraits>;
   std::vector<const CFGBlock *> Blocks;
 
   using BlockOrderTy = llvm::DenseMap<const CFGBlock *, unsigned>;
diff --git a/clang/unittests/Analysis/CFGTest.cpp b/clang/unittests/Analysis/CFGTest.cpp
index 9fa034a0e472e..2b27da0081425 100644
--- a/clang/unittests/Analysis/CFGTest.cpp
+++ b/clang/unittests/Analysis/CFGTest.cpp
@@ -276,14 +276,6 @@ TEST(CFG, Worklists) {
                            ForwardNodes.begin()));
   }
 
-  // RPO: 876321054
-  // WTO: 876534210
-  // So, we construct the WTO order accordingly from the reference order.
-  std::vector<const CFGBlock *> WTOOrder = {
-      ReferenceOrder[0], ReferenceOrder[1], ReferenceOrder[2],
-      ReferenceOrder[7], ReferenceOrder[3], ReferenceOrder[8],
-      ReferenceOrder[4], ReferenceOrder[5], ReferenceOrder[6]};
-
   {
     using ::testing::ElementsAreArray;
     std::optional<WeakTopologicalOrdering> WTO = getIntervalWTO(*CFG);
@@ -297,7 +289,7 @@ TEST(CFG, Worklists) {
     while (const CFGBlock *B = WTOWorklist.dequeue())
       WTONodes.push_back(B);
 
-    EXPECT_THAT(WTONodes, ElementsAreArray(WTOOrder));
+    EXPECT_THAT(WTONodes, ElementsAreArray(*WTO));
   }
 
   std::reverse(ReferenceOrder.begin(), ReferenceOrder.end());
diff --git a/clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp b/clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp
index c5594aa3fe9d1..57920c49a7d3d 100644
--- a/clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/LoggerTest.cpp
@@ -5,7 +5,6 @@
 #include "clang/Analysis/FlowSensitive/DataflowLattice.h"
 #include "llvm/Testing/Support/Error.h"
 #include "gtest/gtest.h"
-#include <optional>
 
 namespace clang::dataflow::test {
 namespace {
@@ -90,7 +89,7 @@ class TestLogger : public Logger {
 AnalysisInputs<TestAnalysis> makeInputs() {
   const char *Code = R"cpp(
 int target(bool b, int p, int q) {
-  return b ? p : q;    
+  return b ? p : q;
 }
 )cpp";
   static const std::vector<std::string> Args = {
@@ -125,17 +124,17 @@ transfer()
 recordState(Elements=2, Branches=0, Joins=0)
 recordState(Elements=2, Branches=0, Joins=0)
 
-enterBlock(3, false)
-transferBranch(0)
+enterBlock(2, false)
+transferBranch(1)
 recordState(Elements=2, Branches=1, Joins=0)
-enterElement(q)
+enterElement(p)
 transfer()
 recordState(Elements=3, Branches=1, Joins=0)
 
-enterBlock(2, false)
-transferBranch(1)
+enterBlock(3, false)
+transferBranch(0)
 recordState(Elements=2, Branches=1, Joins=0)
-enterElement(p)
+enterElement(q)
 transfer()
 recordState(Elements=3, Branches=1, Joins=0)

ymand · 2024-01-30T16:47:05Z

Gabor, please let me know if you think anyone else should review this change, given that there are assorted clients of the PostCFG view. We think this is a benefit for anyone using a post-order view in a worklist algorithm, but have a hard time anticipating all the potential uses.

Xazax-hun · 2024-01-30T17:10:00Z

I agree that this is a change other clients would also benefit from. I am not that worried about breaking other users, RPO is a partial ordering, and now we are picking a different (but valid) order. Clients depending on this detail are depending on an implementation detail and it would probably be Hyrum's law in action. I do not have an example where the old behavior would be desirable, so I think it might be better to change/fix the clients in that case. If we are proven wrong, we can always introduce a non-type template argument as a customization point to revert back to the old behavior.

The CFG orders the blocks of loop bodies before those of loop successors (both numerically, and in the successor order of the loop condition block). So, RPO necessarily reverses that order, placing the loop successor *before* the loop body. For many analyses, particularly those that converge to a fixpoint, this results in potentially significant extra work, because loop successors will necessarily need to be reconsidered once the algorithm has reached a fixpoint on the loop body. This definition of CFG graph traits reverses the order of children, so that loop bodies will come first in an RPO. Then, the algorithm can fully evaluate the loop and only then consider successor blocks.

This expands to three nested for loops and led to "max blocks visited" errors when ymand@ initially tried to change the order in which block successors are visited in the dataflow framework ([#80030](llvm/llvm-project#80030)). We now no longer see these "max blocks visited" errors with this test because of [other](3dfd35e) [fixes](llvm/llvm-project#80033) that have landed in the meantime. Nevertheless, it seems worthwhile to put this test in place to prevent regressions. PiperOrigin-RevId: 603020024 Change-Id: I7a8428a77a2f4252d012eec30e42ccb0c1b96bc5

Our intent is to signal "perform the default behavior", i.e. keep the `MergedVal` that has been produced by the framework. This would be done by returning true (which is also what the [default implementation](https://github.com/llvm/llvm-project/blob/d5c9d402f07e7448cd46870a59d981f582682548/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h#L105) does). However, we were returning false, which means that the framework should drop the `MergedVal` and remove the corresponding entry from `LocToVal`. This was initially causing "max blocks visited" errors on the `TriplyNestedForLoopSingleIteration` test when ymand@ tried to change the order in which the dataflow framework visits block successors ([#80030](llvm/llvm-project#80030)). We now no longer see these errors because of [other](3dfd35e) [fixes](llvm/llvm-project#80033) that have landed in the meantime. Nevertheless, it seems worthwhile to fix `merge()`, as this reduces the number of block visits required to achieve convergence. On the `TriplyNestedForLoopSingleIteration` test, we require 66 block visits before this patch, and 53 block visits after. The issue was that when performing the join at the loop head, our custom merge logic was causing the merged value for the variable `int c` to be discarded from `LocToVal`. This in turn was blocking convergence because widening [considers](https://github.com/llvm/llvm-project/blob/d5c9d402f07e7448cd46870a59d981f582682548/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp#L626) two environments to be different if the sizes of their `LocToVal` maps differ. As an aside, widening is inconsistent in this respect with the convergence check that we perform on blocks that don't get widened. [`Environment::equivalentTo()`](https://github.com/llvm/llvm-project/blob/d5c9d402f07e7448cd46870a59d981f582682548/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp#L570), which performs this check, uses `compareKeyToValueMaps()` to compare the `LocToVal` maps of the two environments, and `compareKeyToValueMaps()` only compares values for locations that are in both `LocToVal` maps and ignores locations that only occur in one of the maps. We should probably fix this inconsistency. PiperOrigin-RevId: 603025629 Change-Id: Id02993e576023c44a335c8991020f7e233a75320

ymand requested a review from martinboehme January 30, 2024 16:45

llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Jan 30, 2024

ymand requested a review from Xazax-hun January 30, 2024 16:45

ymand force-pushed the fix-rpo branch from 05caf24 to 974b337 Compare January 30, 2024 16:49

Xazax-hun approved these changes Jan 30, 2024

View reviewed changes

ymand force-pushed the fix-rpo branch from 974b337 to 4ea1f14 Compare January 30, 2024 21:07

ymand merged commit 2c5a0d3 into llvm:main Jan 30, 2024

ymand deleted the fix-rpo branch January 30, 2024 21:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[clang][CFG] Change child order in Reverse Post Order (RPO) iteration. #80030

[clang][CFG] Change child order in Reverse Post Order (RPO) iteration. #80030

Uh oh!

ymand commented Jan 30, 2024 •

edited

Loading

Uh oh!

llvmbot commented Jan 30, 2024

Uh oh!

llvmbot commented Jan 30, 2024

Uh oh!

ymand commented Jan 30, 2024

Uh oh!

Xazax-hun commented Jan 30, 2024 •

edited

Loading

Uh oh!

Uh oh!

[clang][CFG] Change child order in Reverse Post Order (RPO) iteration. #80030

[clang][CFG] Change child order in Reverse Post Order (RPO) iteration. #80030

Uh oh!

Conversation

ymand commented Jan 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jan 30, 2024

Uh oh!

llvmbot commented Jan 30, 2024

Uh oh!

ymand commented Jan 30, 2024

Uh oh!

Xazax-hun commented Jan 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

ymand commented Jan 30, 2024 •

edited

Loading

Xazax-hun commented Jan 30, 2024 •

edited

Loading