Commit 53f967f
[AMDGPU] Run unreachable-mbb-elimination after isel to clean up PHIs.
Summary:
- As LCSSA is turned on just before isel, it may create PHIs of the flow that are consumed by pseudo structurized CFG instructions. When those PHIs are eliminated at O0, a COPY may be placed wrongly, because these pseudo structurized CFG instructions are considered part of the MBB prologue.
- Run an extra `unreachable-mbb-elimination` at the end of isel to clean up the PHIs.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64353

llvm-svn: 367023
1 parent 0ef3f27 commit 53f967f
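
The cleanup the summary relies on can be illustrated with a toy model. The sketch below is not LLVM's implementation and uses none of its APIs; `Phi`, `Block`, `Function`, `eliminateUnreachable`, and `makeExample` are hypothetical names invented for illustration. It assumes the pass does three things: drops blocks unreachable from the entry, prunes PHI inputs that come from blocks that are no longer predecessors, and folds any PHI left with a single incoming value (such as one created by LCSSA) into that value.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy CFG model (hypothetical, not LLVM's MachineFunction).
struct Phi {
  std::string Def;
  // Each entry is (incoming value, predecessor block).
  std::vector<std::pair<std::string, std::string>> Incoming;
};

struct Block {
  std::vector<std::string> Succs;
  std::vector<Phi> Phis;
};

using Function = std::map<std::string, Block>;

// Collect every block reachable from Entry by following successor edges.
std::set<std::string> reachableFrom(const Function &F,
                                    const std::string &Entry) {
  std::set<std::string> Seen;
  std::vector<std::string> Work{Entry};
  while (!Work.empty()) {
    std::string B = Work.back();
    Work.pop_back();
    if (!Seen.insert(B).second)
      continue;
    auto It = F.find(B);
    if (It == F.end())
      continue;
    for (const std::string &S : It->second.Succs)
      Work.push_back(S);
  }
  return Seen;
}

// Erase unreachable blocks, prune stale PHI inputs, and fold single-input
// PHIs. Returns a map from each folded PHI's def to its replacement value.
std::map<std::string, std::string>
eliminateUnreachable(Function &F, const std::string &Entry) {
  assert(F.count(Entry) && "entry block must exist");
  std::set<std::string> Live = reachableFrom(F, Entry);
  for (auto It = F.begin(); It != F.end();) {
    if (Live.count(It->first) == 0)
      It = F.erase(It);
    else
      ++It;
  }

  // Recompute predecessors from the surviving blocks only.
  std::map<std::string, std::set<std::string>> Preds;
  for (const auto &Entry2 : F)
    for (const std::string &S : Entry2.second.Succs)
      Preds[S].insert(Entry2.first);

  std::map<std::string, std::string> Replaced;
  for (auto &NameAndBlock : F) {
    const std::set<std::string> &P = Preds[NameAndBlock.first];
    std::vector<Phi> Kept;
    for (Phi &Ph : NameAndBlock.second.Phis) {
      // Drop inputs whose source block is no longer a predecessor.
      Ph.Incoming.erase(
          std::remove_if(Ph.Incoming.begin(), Ph.Incoming.end(),
                         [&P](const std::pair<std::string, std::string> &In) {
                           return P.count(In.second) == 0;
                         }),
          Ph.Incoming.end());
      if (Ph.Incoming.size() == 1)
        Replaced[Ph.Def] = Ph.Incoming[0].first; // fold the trivial PHI
      else
        Kept.push_back(std::move(Ph));
    }
    NameAndBlock.second.Phis = std::move(Kept);
  }
  return Replaced;
}

// Example CFG: entry -> loop -> {loop, exit}, plus an unreachable "dead"
// block that also branches to exit. "x.lcssa" mimics an LCSSA-style PHI.
Function makeExample() {
  Function F;
  F["entry"].Succs = {"loop"};
  F["loop"].Succs = {"loop", "exit"};
  F["loop"].Phis.push_back(Phi{"i", {{"zero", "entry"}, {"i.next", "loop"}}});
  F["dead"].Succs = {"exit"};
  F["exit"].Phis.push_back(Phi{"x.lcssa", {{"x", "loop"}}});
  F["exit"].Phis.push_back(Phi{"y", {{"a", "loop"}, {"b", "dead"}}});
  return F;
}
```

Calling `eliminateUnreachable(F, "entry")` on the result of `makeExample()` removes `dead`, leaves the two-input loop PHI untouched, and folds both PHIs in `exit`. In this model a folded PHI needs no COPY at the top of its block, which is the situation the commit aims for before PHI elimination runs.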

2 files changed: +29 -0 lines changed

llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp

Lines changed: 3 additions & 0 deletions
@@ -882,6 +882,9 @@ bool GCNPassConfig::addInstSelector() {
   addPass(createSILowerI1CopiesPass());
   addPass(createSIFixupVectorISelPass());
   addPass(createSIAddIMGInitPass());
+  // FIXME: Remove this once the phi on CF_END is cleaned up by either removing
+  // LCSSA or other ways.
+  addPass(&UnreachableMachineBlockElimID);
   return false;
 }

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+; RUN: llc -march=amdgcn -O0 -o - %s | FileCheck %s
+
+; CHECK-LABEL: non_uniform_loop
+; CHECK: s_endpgm
+define amdgpu_kernel void @non_uniform_loop(float addrspace(1)* %array) {
+entry:
+  %w = tail call i32 @llvm.amdgcn.workitem.id.x()
+  br label %for.cond
+
+for.cond:
+  %i = phi i32 [0, %entry], [%i.next, %for.inc]
+  %cmp = icmp ult i32 %i, %w
+  br i1 %cmp, label %for.body, label %for.end
+
+for.body:
+  br label %for.inc
+
+for.inc:
+  %i.next = add i32 %i, 1
+  br label %for.cond
+
+for.end:
+  ret void
+}
+
+declare i32 @llvm.amdgcn.workitem.id.x()
