Commit 8aa7881

shiltian authored and DanielCChen committed
[AMDGPU] Skip terminators when forcing emit zero flag (llvm#112116)
When forcing emit zero, we need to skip the terminators of an MBB; otherwise a waitcnt would be emitted between terminators and the terminator list of the MBB would be broken.
1 parent 22560a7 commit 8aa7881
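
To make the failure mode concrete, here is a minimal sketch in plain C++ that does not use any LLVM API. It assumes a toy model in which a block is a vector of instructions and is well formed only when its terminators form one contiguous run at the end. All names (`ToyInstr`, `ToyBlock`, `terminatorsAreContiguous`, `forceZeroWaits`, `makeTwoBranchBlock`) are hypothetical and chosen for illustration; only the idea of guarding on "is this a terminator" comes from the commit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not llvm::MachineInstr).
struct ToyInstr {
  std::string Opcode;
  bool IsTerminator;
};

using ToyBlock = std::vector<ToyInstr>;

// A block is well formed when no non-terminator follows a terminator,
// i.e. the terminators form one contiguous run at the end of the block.
bool terminatorsAreContiguous(const ToyBlock &BB) {
  bool SeenTerminator = false;
  for (const ToyInstr &I : BB) {
    if (I.IsTerminator)
      SeenTerminator = true;
    else if (SeenTerminator)
      return false;
  }
  return true;
}

// Insert a zero wait before every instruction. When SkipTerminators is set,
// terminators get no wait, mirroring the `!MI.isTerminator()` guard.
ToyBlock forceZeroWaits(const ToyBlock &BB, bool SkipTerminators) {
  assert(terminatorsAreContiguous(BB) && "input block must be well formed");
  ToyBlock Out;
  for (const ToyInstr &I : BB) {
    if (!(SkipTerminators && I.IsTerminator))
      Out.push_back(ToyInstr{"S_WAITCNT 0", false});
    Out.push_back(I);
  }
  return Out;
}

// Same shape as bb.0 in the commit's test: two terminators and nothing else.
ToyBlock makeTwoBranchBlock() {
  ToyBlock BB;
  BB.push_back(ToyInstr{"S_CBRANCH_SCC1", true});
  BB.push_back(ToyInstr{"S_BRANCH", true});
  return BB;
}
```

In this model, `forceZeroWaits(makeTwoBranchBlock(), false)` places a wait between the two branches, so `terminatorsAreContiguous` rejects the result. With `SkipTerminators` set to `true` the block is returned unchanged, which matches the `bb.0` CHECK lines in the test.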

2 files changed: +36 -1 lines changed

llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp

Lines changed: 3 additions & 1 deletion
@@ -1824,7 +1824,9 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
   // Verify that the wait is actually needed.
   ScoreBrackets.simplifyWaitcnt(Wait);
 
-  if (ForceEmitZeroFlag)
+  // When forcing emit, we need to skip terminators because that would break the
+  // terminators of the MBB if we emit a waitcnt between terminators.
+  if (ForceEmitZeroFlag && !MI.isTerminator())
     Wait = WCG->getAllZeroWaitcnt(/*IncludeVSCnt=*/false);
 
   if (ForceEmitWaitcnt[LOAD_CNT])
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -run-pass si-insert-waitcnts -amdgpu-waitcnt-forcezero=1 %s -o - | FileCheck %s
+
+---
+name: waitcnt-debug-non-first-terminators
+liveins:
+machineFunctionInfo:
+  isEntryFunction: true
+body: |
+  ; CHECK-LABEL: name: waitcnt-debug-non-first-terminators
+  ; CHECK: bb.0:
+  ; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: S_CBRANCH_SCC1 %bb.1, implicit $scc
+  ; CHECK-NEXT: S_BRANCH %bb.2, implicit $scc
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT: successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: S_WAITCNT 0
+  ; CHECK-NEXT: S_NOP 0
+  ; CHECK-NEXT: {{ $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT: S_WAITCNT 0
+  ; CHECK-NEXT: S_NOP 0
+  bb.0:
+    S_CBRANCH_SCC1 %bb.1, implicit $scc
+    S_BRANCH %bb.2, implicit $scc
+  bb.1:
+    S_NOP 0
+  bb.2:
+    S_NOP 0
+...
