Commit 33967ad

[AMDGPU] Skip non-first terminators when forcing emit zero flag
1 parent ed77df5 commit 33967ad

2 files changed: +24 -1 lines changed

llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp

Lines changed: 3 additions & 1 deletion
@@ -1824,7 +1824,9 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
   // Verify that the wait is actually needed.
   ScoreBrackets.simplifyWaitcnt(Wait);
 
-  if (ForceEmitZeroFlag)
+  // When forcing emit, we need to skip terminators because that would break the
+  // terminators of the MBB if we emit a waitcnt between terminators.
+  if (ForceEmitZeroFlag && !MI.isTerminator())
     Wait = WCG->getAllZeroWaitcnt(/*IncludeVSCnt=*/false);
 
   if (ForceEmitWaitcnt[LOAD_CNT])
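
The comment in this hunk relies on the rule that a machine basic block's terminators must sit together at the end of the block, so a non-terminator such as a waitcnt may not appear between two of them. The following is a minimal standalone sketch of that reasoning, not the pass itself: `Inst`, `Block`, `terminatorsFormSuffix`, `insertForcedWaits`, and `runExample` are hypothetical names invented for illustration, and the toy model inserts a wait before every instruction, which is an assumption about force-zero mode rather than a description of what `SIInsertWaitcnts` actually does.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for MachineInstr: only the fields this sketch needs.
struct Inst {
  std::string Name;
  bool IsTerminator;
};

// Hypothetical stand-in for a machine basic block.
using Block = std::vector<Inst>;

// Returns true if, once a terminator appears, every later instruction is
// also a terminator (terminators form a contiguous suffix of the block).
bool terminatorsFormSuffix(const Block &BB) {
  bool SeenTerminator = false;
  for (const Inst &I : BB) {
    if (I.IsTerminator)
      SeenTerminator = true;
    else if (SeenTerminator)
      return false; // A non-terminator follows a terminator.
  }
  return true;
}

// Toy model of force-zero mode (an assumption, not the real pass): put a
// wait before each instruction, optionally skipping terminators.
Block insertForcedWaits(const Block &BB, bool SkipTerminators) {
  Block Out;
  for (const Inst &I : BB) {
    if (!(SkipTerminators && I.IsTerminator))
      Out.push_back({"S_WAITCNT 0", false});
    Out.push_back(I);
  }
  return Out;
}

// Uses the two-terminator block from the test in this commit.
void runExample() {
  Block BB = {{"S_CBRANCH_SCC1", true}, {"S_BRANCH", true}};
  // Without the guard, a wait lands between the two branches.
  assert(!terminatorsFormSuffix(insertForcedWaits(BB, false)));
  // With the guard, the branches stay adjacent at the end of the block.
  assert(terminatorsFormSuffix(insertForcedWaits(BB, true)));
}
```

In this model the unguarded variant yields wait, `S_CBRANCH_SCC1`, wait, `S_BRANCH`, which is the shape the patch avoids; the guarded variant leaves the two branches adjacent, matching the `CHECK` / `CHECK-NEXT` pair in the added test.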
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -run-pass si-insert-waitcnts -amdgpu-waitcnt-forcezero=1 %s -o - | FileCheck %s
+
+...
+
+# CHECK-LABEL: waitcnt-debug-non-first-terminators
+# CHECK: S_CBRANCH_SCC1 %bb.1, implicit $scc
+# CHECK-NEXT: S_BRANCH %bb.2, implicit $scc
+
+name: waitcnt-debug-non-first-terminators
+liveins:
+machineFunctionInfo:
+  isEntryFunction: true
+body: |
+  bb.0:
+    S_CBRANCH_SCC1 %bb.1, implicit $scc
+    S_BRANCH %bb.2, implicit $scc
+  bb.1:
+    S_NOP 0
+  bb.2:
+    S_NOP 0
+...
