Commit fe1800a

[AMDGPU] Fix register class constraints for si-fold-operands pass
This fixes an issue where the si-fold-operands pass would fold immediate values into COPY instructions whose destination is an av_32 register, which is illegal. The pass now bails out of the fold when the destination register class is a vector superclass (TRI->isVectorSuperClass(DestRC)).
1 parent 955c02d commit fe1800a

2 files changed: +41 -0 lines changed


llvm/lib/Target/AMDGPU/SIFoldOperands.cpp

Lines changed: 5 additions & 0 deletions
@@ -1047,6 +1047,11 @@ void SIFoldOperandsImpl::foldOperand(
     if (MovOp == AMDGPU::COPY)
       return;
 
+    // Check if the destination register of the MOV operation belongs
+    // to a vector superclass. Folding would be illegal.
+    if (TRI->isVectorSuperClass(DestRC))
+      return;
+
     MachineInstr::mop_iterator ImpOpI = UseMI->implicit_operands().begin();
     MachineInstr::mop_iterator ImpOpE = UseMI->implicit_operands().end();
     while (ImpOpI != ImpOpE) {
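
The new guard can be illustrated with a small standalone toy model. This is a sketch only, not LLVM code: `ToyRegClass`, its flags, and `passesVectorSuperClassGuard` are hypothetical names invented for illustration. It also assumes that a "vector superclass" means a class that can hold both VGPRs and AGPRs but no SGPRs, which is how `av_32` behaves in the tests of this commit. It models only this one early return; the real pass applies many other conditions before folding.

```cpp
#include <cassert>

// Hypothetical stand-in for a register class: records which register
// files the class may allocate from. Not an LLVM type.
struct ToyRegClass {
  const char *Name;
  bool HasSGPRs;
  bool HasVGPRs;
  bool HasAGPRs;
};

// Toy register classes named after the ones that appear in the MIR test.
constexpr ToyRegClass SReg32{"sreg_32", true, false, false};
constexpr ToyRegClass VGPR32{"vgpr_32", false, true, false};
constexpr ToyRegClass AGPR32{"agpr_32", false, false, true};
constexpr ToyRegClass AV32{"av_32", false, true, true};

// Assumption: a vector superclass spans both VGPRs and AGPRs, no SGPRs.
constexpr bool isVectorSuperClass(const ToyRegClass &RC) {
  return RC.HasVGPRs && RC.HasAGPRs && !RC.HasSGPRs;
}

// Models only the early return added in this commit: the fold is
// abandoned when the COPY destination class is a vector superclass.
constexpr bool passesVectorSuperClassGuard(const ToyRegClass &DestRC) {
  return !isVectorSuperClass(DestRC);
}

// Usage check for the toy model; call it from main() to exercise it.
inline void checkToyModel() {
  assert(!passesVectorSuperClassGuard(AV32));  // av_32: fold is skipped
  assert(passesVectorSuperClassGuard(VGPR32)); // guard does not fire
  assert(passesVectorSuperClassGuard(AGPR32)); // guard does not fire
  assert(passesVectorSuperClassGuard(SReg32)); // guard does not fire
}
```

In this model only `av_32` trips the guard, which matches the test expectations in this commit: the `COPY` into `av_32` is left in place rather than rewritten into a move of the immediate.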
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -mcpu=gfx90a -run-pass=si-fold-operands -verify-machineinstrs -o - %s 2>&1 | FileCheck %s
+
+---
+
+name: s_mov_b32_imm_literal_copy_s_to_av_32
+tracksRegLiveness: true
+body: |
+  bb.0:
+    ; CHECK-LABEL: name: s_mov_b32_imm_literal_copy_s_to_av_32
+    ; CHECK: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 999
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:av_32 = COPY [[S_MOV_B32_]]
+    ; CHECK-NEXT: $agpr0 = COPY [[COPY]]
+    ; CHECK-NEXT: S_ENDPGM 0
+    %0:sreg_32 = S_MOV_B32 999
+    %1:av_32 = COPY %0
+    $agpr0 = COPY %1
+    S_ENDPGM 0
+...
+
+---
+
+name: v_mov_b32_imm_literal_copy_v_to_av_32
+tracksRegLiveness: true
+body: |
+  bb.0:
+    ; CHECK-LABEL: name: v_mov_b32_imm_literal_copy_v_to_av_32
+    ; CHECK: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 999, implicit $exec
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:av_32 = COPY [[V_MOV_B32_e32_]]
+    ; CHECK-NEXT: $agpr0 = COPY [[COPY]]
+    ; CHECK-NEXT: S_ENDPGM 0
+    %0:vgpr_32 = V_MOV_B32_e32 999, implicit $exec
+    %1:av_32 = COPY %0
+    $agpr0 = COPY %1
+    S_ENDPGM 0
+...
