-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SPIR-V] Fix block sorting with irreducible CFG #116996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-spir-v Author: Nathan Gauër (Keenuts) ChangesBlock sorting was assuming reducible CFG. Meaning we always had a best node to continue with. Irreducible CFG makes breaks this assumption, so the algorithm looped indefinitely because no node was a valid candidate. Fixes #116692 Full diff: https://github.com/llvm/llvm-project/pull/116996.diff 4 Files Affected:
diff --git a/llvm/lib/Target/SPIRV/SPIRVUtils.cpp b/llvm/lib/Target/SPIRV/SPIRVUtils.cpp
index aeb2c29f7b8618..902dcf2ca28649 100644
--- a/llvm/lib/Target/SPIRV/SPIRVUtils.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVUtils.cpp
@@ -525,6 +525,12 @@ size_t PartialOrderingVisitor::GetNodeRank(BasicBlock *BB) const {
continue;
auto Iterator = BlockToOrder.end();
+ // This block hasn't been ranked yet. Ignoring.
+ // This doesn't happen often, but when dealing with irreducible CFG, we have
+ // to rank nodes without knowing the rank of all their predecessors.
+ if (Iterator == BlockToOrder.end())
+ continue;
+
Loop *L = LI.getLoopFor(P);
BasicBlock *Latch = L ? L->getLoopLatch() : nullptr;
@@ -550,15 +556,27 @@ size_t PartialOrderingVisitor::visit(BasicBlock *BB, size_t Unused) {
ToVisit.push(BB);
Queued.insert(BB);
+ // When the graph is irreducible, we can end up in a case where each
+ // node has a predecessor we haven't ranked yet.
+ // When such case arise, we have to pick a node to continue.
+ // This index is used to determine when we looped through all candidates.
+ // Each time a candidate is processed, this counter is reset.
+ // If the index is larger than the queue size, it means we looped.
+ size_t QueueIndex = 0;
+
while (ToVisit.size() != 0) {
BasicBlock *BB = ToVisit.front();
ToVisit.pop();
- if (!CanBeVisited(BB)) {
+ // Either the node is a candidate, or we looped already, and this is
+ // the first node we tried.
+ if (!CanBeVisited(BB) && QueueIndex <= ToVisit.size()) {
ToVisit.push(BB);
+ QueueIndex++;
continue;
}
+ QueueIndex = 0;
size_t Rank = GetNodeRank(BB);
OrderInfo Info = {Rank, BlockToOrder.size()};
BlockToOrder.emplace(BB, Info);
diff --git a/llvm/test/CodeGen/SPIRV/structurizer/cf.if.nested.ll b/llvm/test/CodeGen/SPIRV/structurizer/cf.if.nested.ll
index a69475a59db6f4..a44eec94db687e 100644
--- a/llvm/test/CodeGen/SPIRV/structurizer/cf.if.nested.ll
+++ b/llvm/test/CodeGen/SPIRV/structurizer/cf.if.nested.ll
@@ -34,28 +34,28 @@
; CHECK: %[[#bb30:]] = OpLabel
; CHECK: OpSelectionMerge %[[#bb31:]] None
; CHECK: OpBranchConditional %[[#]] %[[#bb32:]] %[[#bb33:]]
-; CHECK: %[[#bb32:]] = OpLabel
+; CHECK: %[[#bb32]] = OpLabel
; CHECK: OpSelectionMerge %[[#bb34:]] None
-; CHECK: OpBranchConditional %[[#]] %[[#bb35:]] %[[#bb34:]]
-; CHECK: %[[#bb33:]] = OpLabel
+; CHECK: OpBranchConditional %[[#]] %[[#bb35:]] %[[#bb34]]
+; CHECK: %[[#bb33]] = OpLabel
; CHECK: OpSelectionMerge %[[#bb36:]] None
; CHECK: OpBranchConditional %[[#]] %[[#bb37:]] %[[#bb38:]]
-; CHECK: %[[#bb35:]] = OpLabel
-; CHECK: OpBranch %[[#bb34:]]
-; CHECK: %[[#bb37:]] = OpLabel
-; CHECK: OpBranch %[[#bb36:]]
-; CHECK: %[[#bb38:]] = OpLabel
+; CHECK: %[[#bb35]] = OpLabel
+; CHECK: OpBranch %[[#bb34]]
+; CHECK: %[[#bb34]] = OpLabel
+; CHECK: OpBranch %[[#bb31]]
+; CHECK: %[[#bb37]] = OpLabel
+; CHECK: OpBranch %[[#bb36]]
+; CHECK: %[[#bb38]] = OpLabel
; CHECK: OpSelectionMerge %[[#bb39:]] None
-; CHECK: OpBranchConditional %[[#]] %[[#bb40:]] %[[#bb39:]]
-; CHECK: %[[#bb34:]] = OpLabel
-; CHECK: OpBranch %[[#bb31:]]
-; CHECK: %[[#bb40:]] = OpLabel
-; CHECK: OpBranch %[[#bb39:]]
-; CHECK: %[[#bb39:]] = OpLabel
-; CHECK: OpBranch %[[#bb36:]]
-; CHECK: %[[#bb36:]] = OpLabel
-; CHECK: OpBranch %[[#bb31:]]
-; CHECK: %[[#bb31:]] = OpLabel
+; CHECK: OpBranchConditional %[[#]] %[[#bb40:]] %[[#bb39]]
+; CHECK: %[[#bb40]] = OpLabel
+; CHECK: OpBranch %[[#bb39]]
+; CHECK: %[[#bb39]] = OpLabel
+; CHECK: OpBranch %[[#bb36]]
+; CHECK: %[[#bb36]] = OpLabel
+; CHECK: OpBranch %[[#bb31]]
+; CHECK: %[[#bb31]] = OpLabel
; CHECK: OpReturnValue %[[#]]
; CHECK: OpFunctionEnd
; CHECK: %[[#func_26:]] = OpFunction %[[#void:]] DontInline %[[#]]
diff --git a/llvm/unittests/Target/SPIRV/CMakeLists.txt b/llvm/unittests/Target/SPIRV/CMakeLists.txt
index e9fe4883e5b024..2af36225c5f200 100644
--- a/llvm/unittests/Target/SPIRV/CMakeLists.txt
+++ b/llvm/unittests/Target/SPIRV/CMakeLists.txt
@@ -15,6 +15,6 @@ set(LLVM_LINK_COMPONENTS
add_llvm_target_unittest(SPIRVTests
SPIRVConvergenceRegionAnalysisTests.cpp
+ SPIRVSortBlocksTests.cpp
SPIRVAPITest.cpp
- )
-
+)
diff --git a/llvm/unittests/Target/SPIRV/SPIRVSortBlocksTests.cpp b/llvm/unittests/Target/SPIRV/SPIRVSortBlocksTests.cpp
new file mode 100644
index 00000000000000..487f116aa3ace6
--- /dev/null
+++ b/llvm/unittests/Target/SPIRV/SPIRVSortBlocksTests.cpp
@@ -0,0 +1,262 @@
+//===- SPIRVSortBlocksTests.cpp ----------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "SPIRVUtils.h"
+#include "llvm/Analysis/DominanceFrontier.h"
+#include "llvm/Analysis/PostDominators.h"
+#include "llvm/AsmParser/Parser.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/LegacyPassManager.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassInstrumentation.h"
+#include "llvm/IR/Type.h"
+#include "llvm/IR/TypedPointerType.h"
+#include "llvm/Support/SourceMgr.h"
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+#include <queue>
+
+using namespace llvm;
+using namespace llvm::SPIRV;
+
+class SPIRVSortBlocksTest : public testing::Test {
+protected:
+ void TearDown() override { M.reset(); }
+
+ bool run(StringRef Assembly) {
+ assert(M == nullptr &&
+ "Calling runAnalysis multiple times is unsafe. See getAnalysis().");
+
+ SMDiagnostic Error;
+ M = parseAssemblyString(Assembly, Error, Context);
+ assert(M && "Bad assembly. Bad test?");
+ llvm::Function *F = M->getFunction("main");
+ return sortBlocks(*F);
+ }
+
+ void checkBasicBlockOrder(std::vector<const char *> &&Expected) {
+ llvm::Function *F = M->getFunction("main");
+ auto It = F->begin();
+ for (const auto *Name : Expected) {
+ ASSERT_TRUE(It != F->end())
+ << "Expected block \"" << Name
+ << "\" but reached the end of the function instead.";
+ ASSERT_TRUE(It->getName() == Name)
+ << "Error: expected block \"" << Name << "\" got \"" << It->getName()
+ << "\"";
+ It++;
+ }
+ EXPECT_TRUE(It == F->end());
+ ASSERT_TRUE(It == F->end())
+ << "No more blocks were expected, but function has more.";
+ }
+
+protected:
+ LLVMContext Context;
+ std::unique_ptr<Module> M;
+};
+
+TEST_F(SPIRVSortBlocksTest, DefaultRegion) {
+ StringRef Assembly = R"(
+ define void @main() convergent "hlsl.numthreads"="4,8,16" "hlsl.shader"="compute" {
+ ret void
+ }
+ )";
+
+ EXPECT_FALSE(run(Assembly));
+}
+
+TEST_F(SPIRVSortBlocksTest, BasicBlockSwap) {
+ StringRef Assembly = R"(
+ define void @main() convergent "hlsl.numthreads"="4,8,16" "hlsl.shader"="compute" {
+ entry:
+ br label %middle
+ exit:
+ ret void
+ middle:
+ br label %exit
+ }
+ )";
+
+ EXPECT_TRUE(run(Assembly));
+ checkBasicBlockOrder({"entry", "middle", "exit"});
+}
+
+// Simple loop:
+// entry -> header <-----------------+
+// | `-> body -> continue -+
+// `-> end
+TEST_F(SPIRVSortBlocksTest, LoopOrdering) {
+ StringRef Assembly = R"(
+ define void @main() convergent "hlsl.numthreads"="4,8,16" "hlsl.shader"="compute" {
+ entry:
+ %1 = icmp ne i32 0, 0
+ br label %header
+ end:
+ ret void
+ body:
+ br label %continue
+ continue:
+ br label %header
+ header:
+ br i1 %1, label %body, label %end
+ }
+ )";
+
+ EXPECT_TRUE(run(Assembly));
+ checkBasicBlockOrder({"entry", "header", "body", "continue", "end"});
+}
+
+// Diamond condition:
+// +-> A -+
+// entry -+ +-> C
+// +-> B -+
+//
+// A and B order can be flipped with no effect, but it must be remain
+// deterministic/stable.
+TEST_F(SPIRVSortBlocksTest, DiamondCondition) {
+ StringRef Assembly = R"(
+ define void @main() convergent "hlsl.numthreads"="4,8,16" "hlsl.shader"="compute" {
+ entry:
+ %1 = icmp ne i32 0, 0
+ br i1 %1, label %a, label %b
+ c:
+ ret void
+ b:
+ br label %c
+ a:
+ br label %c
+ }
+ )";
+
+ EXPECT_TRUE(run(Assembly));
+ checkBasicBlockOrder({"entry", "a", "b", "c"});
+}
+
+// Skip condition:
+// +-> A -+
+// entry -+ +-> C
+// +------+
+TEST_F(SPIRVSortBlocksTest, SkipCondition) {
+ StringRef Assembly = R"(
+ define void @main() convergent "hlsl.numthreads"="4,8,16" "hlsl.shader"="compute" {
+ entry:
+ %1 = icmp ne i32 0, 0
+ br i1 %1, label %a, label %c
+ c:
+ ret void
+ a:
+ br label %c
+ }
+ )";
+
+ EXPECT_TRUE(run(Assembly));
+ checkBasicBlockOrder({"entry", "a", "c"});
+}
+
+// Crossing conditions:
+// +------+ +-> C -+
+// +-> A -+ | | |
+// entry -+ +--_|_-+ +-> E
+// +-> B -+ | |
+// +------+----> D -+
+//
+// A & B have the same rank.
+// C & D have the same rank, but are after A & B.
+// E if the last block.
+TEST_F(SPIRVSortBlocksTest, CrossingCondition) {
+ StringRef Assembly = R"(
+ define void @main() convergent "hlsl.numthreads"="4,8,16" "hlsl.shader"="compute" {
+ entry:
+ %1 = icmp ne i32 0, 0
+ br i1 %1, label %a, label %b
+ e:
+ ret void
+ c:
+ br label %e
+ b:
+ br i1 %1, label %d, label %c
+ d:
+ br label %e
+ a:
+ br i1 %1, label %c, label %d
+ }
+ )";
+
+ EXPECT_TRUE(run(Assembly));
+ checkBasicBlockOrder({"entry", "a", "b", "c", "d", "e"});
+}
+
+// Irreducible CFG
+// digraph {
+// entry -> A;
+//
+// A -> B;
+// A -> C;
+//
+// B -> A;
+// B -> C;
+//
+// C -> B;
+// }
+//
+// Order starts with Entry and A. Order of B and C can change, but must remain
+// stable.
+TEST_F(SPIRVSortBlocksTest, IrreducibleOrdering) {
+ StringRef Assembly = R"(
+ define void @main() convergent "hlsl.numthreads"="4,8,16" "hlsl.shader"="compute" {
+ entry:
+ %1 = icmp ne i32 0, 0
+ br label %a
+
+ b:
+ br i1 %1, label %a, label %c
+
+ c:
+ br label %b
+
+ a:
+ br i1 %1, label %b, label %c
+
+ }
+ )";
+
+ EXPECT_TRUE(run(Assembly));
+ checkBasicBlockOrder({"entry", "a", "b", "c"});
+}
+
+TEST_F(SPIRVSortBlocksTest, IrreducibleOrderingBeforeReduction) {
+ StringRef Assembly = R"(
+ define void @main() convergent "hlsl.numthreads"="4,8,16" "hlsl.shader"="compute" {
+ entry:
+ %1 = icmp ne i32 0, 0
+ br label %a
+
+ c:
+ br i1 %1, label %d, label %e
+
+ e:
+ ret void
+
+ b:
+ br i1 %1, label %c, label %d
+
+ a:
+ br label %b
+
+ d:
+ br i1 %1, label %b, label %c
+
+ }
+ )";
+
+ EXPECT_TRUE(run(Assembly));
+ checkBasicBlockOrder({"entry", "a", "b", "c", "d", "e"});
+}
|
Does it make sense to add the reproducer from #116692, or those changes in the PR covers the reproducer's role in full? |
I added 2 versions of the reproducer in the tests:
|
00ce1ff
to
cb13282
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this works for irreducible regions because I am not sure what we are trying to accomplish.
This seems to be the strategy: if no node in toVisit
has all of it predecessor ranked yet, then essentially randomly pick a node in toVisit
as the next node.
Theoretically we could have:
A -> B
A -> C
A -> D
B -> C
B -> D
C -> B
In this case, you will get stuck with toVisit = { B, C, D }
. Do we care if we choose D
next, even if it is not part of the irreducible region? This would mean that D
will have the same rank as B
and C
, and will have the earlier traversal number for the tie-breaker.
I tryed to find the restrictions on block order in spir-v. All I found was:
The order of blocks in a function must satisfy the rule that blocks appear before all blocks they dominate.
If this is the only restriction, then we can do a much simpler algorithm: any pre-order traversal of the dominator tree.
I thought there was an restriction where a constructs has to be properly nested in each other for structured control flow. So we could not have H1, H2, M1, M2 where H is a header with M as its merge block. However, I cannot find that.
For irreducible sub-graphs, what we want is just to comply with the requirement that a block must be placed the ones it dominates in the binary.
In this case, no, we don't care. Maybe if we need a better fix, we could decide to isolate the irreducible region (here determine graph could be reducible if B-C were merged, hence pick should be between B and C).
This would be fine for irreducible graph sorting. But the topological visitor used here is also used for structurizing. Given this graph:
The Dom-tree is as follows:
The header dominates both the exit and the continue block. But we MUST give C a lower rank than D so we can properly structurize the graph. |
Why is this important? I thought that was part of the spec, but I cannot find it. Is it part of how the structurizer is implemented? For changes to this PR, you should update the comments in SPIRVUtils.h to mention the case where it might fail to give a topological order, and why it does not matter. There should be an assert that when it fails we do not need to run the structurizer (we are not compiling for Vulkan). EDIT: Now that I think about it. I don't like that a class could silently fail to do what it is suppose to do. It could be confusing for people who try to us it in a different context. I'm wondering if we should assert the that CFG is reducible before we even start, and use a different sorting algorithm when the Those are my thoughts. However, I'm okay with whatever the community decides. |
For the problem of BB sorting for binary output, I don't think there is a spec bit that says so. We have 3 parts:
Because I had the partial ordering, which provided a super-set of the requirements for sorting, I used it to sort blocks.
Sure, I can add a comment. But for irreducible, this order is not defined, so it does not "fail". It just gives 1 version of it.
Once again, IMO it doesn't fail. Topological sorting is not defined the graphs with cycles, even less for irreducible graphs. Hence this is a kinda-topological with a bias on the loop body ordering. So I see 2 ways forward:
|
I'll send another PR which sorts the IR using a preorder traversal instead. This one will instead add tests to the visitor, and make irreducible case fail and not hang. |
Block sorting was assuming reducible CFG. Meaning we always had a best node to continue with. Irreducible CFG makes breaks this assumption, so the algorithm looped indefinitely because no node was a valid candidate. Fixes llvm#116692 Signed-off-by: Nathan Gauër <[email protected]>
This commit also changes many tests as the new sorting has a slightly different order: - second brcond operand goes first if possible. - switch operands are sorted last to first.
cb13282
to
9302b74
Compare
Signed-off-by: Nathan Gauër <[email protected]>
Signed-off-by: Nathan Gauër <[email protected]>
Signed-off-by: Nathan Gauër <[email protected]>
Hello all! The PR to handle irreducible in the visitor is #117887 |
Irrelevant to this PR, but nevertheless -- this |
Same here, can see it's not related to this PR, but cannot reproduce it on my machine. Not sure why. |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/1715 Here is the relevant piece of the build log for the reference
|
Block sorting was assuming reducible CFG. Meaning we always had a best node to continue with. Irreducible CFG makes breaks this assumption, so the algorithm looped indefinitely because no node was a valid candidate.
Fixes #116692