Commit 88149fb

Jeffrey Byrnes (jrbyrnes) authored and committed
[AMDGPU][GFX908] IndirectCopyToAGPR: Confirm modified register is dst reg of accvgpr_write
IndirectCopyToAGPR should be reworked so as to avoid optimizing during copy lowering. However, as it stands, the code is buggy. This patch replaces the call to definesRegister with modifiesRegister, and confirms that the dest reg of the found accvgpr_write is in fact the src reg of our copy.

Differential Revision: https://reviews.llvm.org/D149873
Change-Id: Id8a61659ac15565dcb970069d0624f0925a46e6d
1 parent 39fe48b commit 88149fb

2 files changed, +10 -5 lines changed


llvm/lib/Target/AMDGPU/SIInstrInfo.cpp

Lines changed: 5 additions & 2 deletions
@@ -578,9 +578,12 @@ static void indirectCopyToAGPR(const SIInstrInfo &TII,
   if (!RegsOverlap) {
     for (auto Def = MI, E = MBB.begin(); Def != E; ) {
       --Def;
-      if (!Def->definesRegister(SrcReg, &RI))
+
+      if (!Def->modifiesRegister(SrcReg, &RI))
         continue;
-      if (Def->getOpcode() != AMDGPU::V_ACCVGPR_WRITE_B32_e64)
+
+      if (Def->getOpcode() != AMDGPU::V_ACCVGPR_WRITE_B32_e64 ||
+          Def->getOperand(0).getReg() != SrcReg)
         break;
 
       MachineOperand &DefOp = Def->getOperand(1);
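
The effect of the two changed conditions can be illustrated with a small standalone model. This is only a sketch and does not use any LLVM API: `RegMask`, `ToyInstr`, `toyDefinesRegister`, `toyModifiesRegister`, and the two `*CheckForwardsSource` helpers are hypothetical names invented here. It assumes that `definesRegister` matches a def of the register itself or of a super-register, and that `modifiesRegister` matches any def that overlaps the register. Registers are modelled as bitmasks of the 32-bit lanes they cover.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only: a register is a bitmask of the 32-bit AGPR lanes it covers.
using RegMask = std::uint32_t;
constexpr RegMask AGPR1 = 1u << 1;
constexpr RegMask AGPR2 = 1u << 2;
constexpr RegMask AGPR1_AGPR2 = AGPR1 | AGPR2;

enum class Opcode { AccVgprWrite, Other };

struct ToyInstr {
  Opcode Op;
  RegMask Dst;                       // explicit destination (operand 0)
  std::vector<RegMask> ImplicitDefs; // implicit-def operands
};

// Assumed "fully defines" semantics: some def is Reg or a super-register of it.
bool toyDefinesRegister(const ToyInstr &I, RegMask Reg) {
  auto Covers = [Reg](RegMask D) { return (D & Reg) == Reg; };
  if (Covers(I.Dst))
    return true;
  for (RegMask D : I.ImplicitDefs)
    if (Covers(D))
      return true;
  return false;
}

// Assumed "modifies" semantics: some def overlaps Reg, even partially.
bool toyModifiesRegister(const ToyInstr &I, RegMask Reg) {
  if (I.Dst & Reg)
    return true;
  for (RegMask D : I.ImplicitDefs)
    if (D & Reg)
      return true;
  return false;
}

// Condition before the patch: any accvgpr_write that "defines" SrcReg is
// treated as the instruction that produced SrcReg's value.
bool oldCheckForwardsSource(const ToyInstr &Def, RegMask SrcReg) {
  return toyDefinesRegister(Def, SrcReg) && Def.Op == Opcode::AccVgprWrite;
}

// Condition after the patch: the write must also target SrcReg explicitly.
bool newCheckForwardsSource(const ToyInstr &Def, RegMask SrcReg) {
  return toyModifiesRegister(Def, SrcReg) &&
         Def.Op == Opcode::AccVgprWrite && Def.Dst == SrcReg;
}
```

Under these assumptions, a write shaped like `$agpr1 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit-def $agpr1_agpr2` (modelled as `ToyInstr{Opcode::AccVgprWrite, AGPR1, {AGPR1_AGPR2}}`) passes the old condition for `SrcReg = AGPR2` because of its implicit super-register def, even though it only writes `$agpr1`. The new condition rejects it, which corresponds to the `break` in the patched loop.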

llvm/test/CodeGen/AMDGPU/accvgpr-copy.mir

Lines changed: 5 additions & 3 deletions
@@ -900,9 +900,11 @@ body: |
     ; GFX908-NEXT: $vgpr1 = V_ACCVGPR_READ_B32_e64 $agpr1, implicit $exec, implicit $agpr0_agpr1
     ; GFX908-NEXT: $agpr2 = V_ACCVGPR_WRITE_B32_e64 $vgpr1, implicit $exec, implicit-def $agpr1_agpr2
     ; GFX908-NEXT: $vgpr0 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec, implicit $agpr0_agpr1
-    ; GFX908-NEXT: $agpr1 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit $exec, implicit-def $agpr1_agpr2
-    ; GFX908-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit-def $agpr3_agpr4, implicit $agpr1_agpr2
-    ; GFX908-NEXT: $agpr3 = V_ACCVGPR_WRITE_B32_e64 $vgpr0, implicit $exec, implicit killed $agpr1_agpr2, implicit $exec
+    ; GFX908-NEXT: $agpr1 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr0, implicit $exec, implicit $exec, implicit-def $agpr1_agpr2
+    ; GFX908-NEXT: $vgpr0 = V_ACCVGPR_READ_B32_e64 $agpr2, implicit $exec, implicit $agpr1_agpr2
+    ; GFX908-NEXT: $agpr4 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr0, implicit $exec, implicit-def $agpr3_agpr4
+    ; GFX908-NEXT: $vgpr255 = V_ACCVGPR_READ_B32_e64 killed $agpr1, implicit $exec, implicit killed $agpr1_agpr2
+    ; GFX908-NEXT: $agpr3 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr255, implicit $exec, implicit $exec
     ; GFX90A-LABEL: name: a2_to_a2_implicit_defs
     ; GFX90A: liveins: $agpr0_agpr1
     ; GFX90A-NEXT: {{ $}}
