Commit 6b7afaa

[AMDGPU][True16] fix a bug in codeGen causing e64 with wrong vgpr type to shrink (#102942)
This bug was introduced in #102198. That patch changed the check to use the realTrue16 flag; however, some t16 instructions are implemented with fake16 and have Lo128 register types. Thus we should still use the hasTrue16Bit flag for the shrinking check.

Co-authored-by: guochen2 <[email protected]>
1 parent 825dbbb commit 6b7afaa

2 files changed: +29 -1 lines changed

llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp

Lines changed: 1 addition & 1 deletion
@@ -1048,7 +1048,7 @@ bool SIShrinkInstructions::runOnMachineFunction(MachineFunction &MF) {
               MachineFunctionProperties::Property::NoVRegs))
         continue;
 
-      if (ST->useRealTrue16Insts() && AMDGPU::isTrue16Inst(MI.getOpcode()) &&
+      if (ST->hasTrue16BitInsts() && AMDGPU::isTrue16Inst(MI.getOpcode()) &&
           !shouldShrinkTrue16(MI))
         continue;

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -run-pass=si-shrink-instructions -verify-machineinstrs -o - %s | FileCheck -check-prefix=GFX1100 %s
+
+---
+name: 16bit_lo128_shrink
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $vgpr127
+    ; GFX1100-LABEL: name: 16bit_lo128_shrink
+    ; GFX1100: liveins: $vgpr127
+    ; GFX1100-NEXT: {{ $}}
+    ; GFX1100-NEXT: V_CMP_EQ_U16_t16_e32 0, $vgpr127, implicit-def $vcc, implicit $exec, implicit $exec
+    $vcc_lo = V_CMP_EQ_U16_t16_e64 0, $vgpr127, implicit-def $vcc, implicit $exec
+...
+
+---
+name: 16bit_lo128_no_shrink
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $vgpr128
+    ; GFX1100-LABEL: name: 16bit_lo128_no_shrink
+    ; GFX1100: liveins: $vgpr128
+    ; GFX1100-NEXT: {{ $}}
+    ; GFX1100-NEXT: $vcc_lo = V_CMP_EQ_U16_t16_e64 0, $vgpr128, implicit-def $vcc_lo, implicit $exec
+    $vcc_lo = V_CMP_EQ_U16_t16_e64 0, $vgpr128, implicit-def $vcc, implicit $exec
+...
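
To make the effect of the one-line change easier to follow, here is a minimal standalone sketch of the guard before and after the fix. It is not LLVM code: the `Subtarget` struct, the two `skipsShrink*` helpers, and the single-index `shouldShrinkTrue16` are hypothetical simplifications. In particular, the sketch assumes the operand check reduces to "the VGPR is one of the low 128 (v0 to v127)", which is what the two tests above suggest, and it models a fake16 configuration as one where t16 instructions exist but real True16 code generation is off.

```cpp
// Hypothetical stand-in for the two subtarget queries named in the diff.
struct Subtarget {
  bool HasTrue16BitInsts;  // target provides t16 instruction encodings
  bool UseRealTrue16Insts; // real True16 code generation is enabled
};

// Simplified model of the operand check (an assumption, not the LLVM
// implementation): the e32 form of a t16 instruction can only address
// the low 128 VGPRs.
bool shouldShrinkTrue16(unsigned VGPRIndex) { return VGPRIndex < 128; }

// Guard as written before this commit: only consulted the operand check
// when real True16 was enabled, so fake16 instructions bypassed it.
bool skipsShrinkBefore(const Subtarget &ST, bool IsTrue16Inst,
                       unsigned VGPRIndex) {
  return ST.UseRealTrue16Insts && IsTrue16Inst &&
         !shouldShrinkTrue16(VGPRIndex);
}

// Guard as written after this commit: the operand check applies whenever
// the target has t16 instructions, including the fake16 case.
bool skipsShrinkAfter(const Subtarget &ST, bool IsTrue16Inst,
                      unsigned VGPRIndex) {
  return ST.HasTrue16BitInsts && IsTrue16Inst &&
         !shouldShrinkTrue16(VGPRIndex);
}
```

Under these assumptions, with `Subtarget{true, false}` and a t16 instruction using `$vgpr128`, `skipsShrinkBefore` returns false (the pass would go on to attempt the shrink) while `skipsShrinkAfter` returns true (the e64 form is kept), matching the `16bit_lo128_no_shrink` test. For `$vgpr127` both guards return false, so the instruction is still shrunk as in `16bit_lo128_shrink`.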
