Commit b9ba053

[AMDGPU] Don't S_MOV_B32 into $scc
The peephole optimizer tries to replace
```
%n:sgpr_32 = S_MOV_B32 x
$scc = COPY %n
```
with an `S_MOV_B32` directly into `$scc`. This crashes because `S_MOV_B32` cannot take `$scc` as its destination.

We currently generate code like this from GlobalISel when lowering a `G_BRCOND` with a constant condition. We should probably look into removing this kind of branch altogether, but until then we should at least not crash.

This patch fixes the issue by making sure we don't apply the peephole optimization when trying to move into a physical register that doesn't belong to the correct register class.

Differential Revision: https://reviews.llvm.org/D148117
1 parent 243e62b commit b9ba053

2 files changed: +51, -1 lines changed

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp

Lines changed: 6 additions & 1 deletion
@@ -3090,7 +3090,12 @@ bool SIInstrInfo::FoldImmediate(MachineInstr &UseMI, MachineInstr &DefMI,
       assert(UseMI.getOperand(1).getReg().isVirtual());
     }

-    UseMI.setDesc(get(NewOpc));
+    const MCInstrDesc &NewMCID = get(NewOpc);
+    if (DstReg.isPhysical() &&
+        !RI.getRegClass(NewMCID.operands()[0].RegClass)->contains(DstReg))
+      return false;
+
+    UseMI.setDesc(NewMCID);
     UseMI.getOperand(1).ChangeToImmediate(Imm.getSExtValue());
     UseMI.addImplicitDefUseOperands(*UseMI.getParent()->getParent());
     return true;
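
The guard in the hunk above can be modeled outside LLVM. The following is a minimal sketch, not the real implementation: `Reg`, `RegClass`, `InstrDesc`, and `canFoldInto` are hypothetical stand-ins invented for illustration, and the register class membership shown is an assumption. It only mirrors the shape of the check: a physical destination is accepted only if it belongs to the register class of the new instruction's operand 0.

```cpp
// Toy model of the guard added to SIInstrInfo::FoldImmediate.
// None of these types are LLVM's; all names here are hypothetical.
#include <cassert>
#include <set>
#include <string>

// Hypothetical register: a name plus a flag saying whether it is physical.
struct Reg {
  std::string Name;
  bool Physical;
};

// Hypothetical register class: the set of physical register names it holds.
struct RegClass {
  std::set<std::string> Members;
  bool contains(const Reg &R) const { return Members.count(R.Name) != 0; }
};

// Hypothetical instruction description: only the class of operand 0 (the def).
struct InstrDesc {
  const RegClass *DstClass;
};

// Returns true if a COPY into Dst may be rewritten to the instruction
// described by NewDesc. In this toy model a virtual destination is always
// accepted; a physical one must be a member of the destination class.
bool canFoldInto(const Reg &Dst, const InstrDesc &NewDesc) {
  if (Dst.Physical && !NewDesc.DstClass->contains(Dst))
    return false;
  return true;
}
```

With an assumed class containing `sgpr0` and `sgpr1` but not `scc`, `canFoldInto` accepts a physical `sgpr1` and any virtual register, and rejects a physical `scc`. This corresponds to the three cases exercised by the `fold_simm_physical`, `fold_simm_virtual`, and `dont_fold_simm_scc` tests in this commit.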

llvm/test/CodeGen/AMDGPU/fold_16bit_imm.mir renamed to llvm/test/CodeGen/AMDGPU/peephole-fold-imm.mir

Lines changed: 45 additions & 0 deletions
@@ -1,6 +1,51 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
 # RUN: llc -mtriple=amdgcn--amdhsa -mcpu=gfx908 -verify-machineinstrs -run-pass peephole-opt -o - %s | FileCheck -check-prefix=GCN %s

+---
+name: fold_simm_virtual
+body: |
+  bb.0:
+
+    ; GCN-LABEL: name: fold_simm_virtual
+    ; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
+    ; GCN-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 0
+    ; GCN-NEXT: SI_RETURN_TO_EPILOG
+    %0:sreg_32 = S_MOV_B32 0
+    %1:sreg_32 = COPY killed %0
+    SI_RETURN_TO_EPILOG
+
+...
+
+---
+name: fold_simm_physical
+body: |
+  bb.0:
+
+    ; GCN-LABEL: name: fold_simm_physical
+    ; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
+    ; GCN-NEXT: $sgpr1 = S_MOV_B32 0
+    ; GCN-NEXT: SI_RETURN_TO_EPILOG
+    %0:sreg_32 = S_MOV_B32 0
+    $sgpr1 = COPY killed %0
+    SI_RETURN_TO_EPILOG
+
+...
+
+---
+name: dont_fold_simm_scc
+body: |
+  bb.0:
+
+    ; GCN-LABEL: name: dont_fold_simm_scc
+    ; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
+    ; GCN-NEXT: $scc = COPY killed [[S_MOV_B32_]]
+    ; GCN-NEXT: SI_RETURN_TO_EPILOG
+    %0:sreg_32 = S_MOV_B32 0
+    $scc = COPY killed %0
+    SI_RETURN_TO_EPILOG
+
+...
+
 ---
 name: fold_simm_16_sub_to_lo
 body: |
