-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AMDGPU] S_SET_GPR_IDX_ON can be passed an immediate index #125086
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU] S_SET_GPR_IDX_ON can be passed an immediate index #125086
Conversation
@llvm/pr-subscribers-backend-amdgpu Author: Jon Chesterfield (JonChesterfield) ChangesOversight found by ISel fuzz effort. Assuming the argument is a register, in some cases it can be an immediate. Tablegen's type for the instruction is SSrc_b32, i.e. register or immediate fine. Added the repro from the bug reporter as a test case - prior to this patch llvm will assert in getReg. Fixes SWDEV-508589 Full diff: https://github.com/llvm/llvm-project/pull/125086.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 5727d14ec49e8a..2c7665f5b8acfa 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -2366,11 +2366,10 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
assert(ST.useVGPRIndexMode());
Register VecReg = MI.getOperand(0).getReg();
bool IsUndef = MI.getOperand(1).isUndef();
- Register Idx = MI.getOperand(3).getReg();
Register SubReg = MI.getOperand(4).getImm();
MachineInstr *SetOn = BuildMI(MBB, MI, DL, get(AMDGPU::S_SET_GPR_IDX_ON))
- .addReg(Idx)
+ .add(MI.getOperand(3)) // Index
.addImm(AMDGPU::VGPRIndexMode::DST_ENABLE);
SetOn->getOperand(3).setIsUndef();
diff --git a/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll b/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
new file mode 100644
index 00000000000000..d86f497aa5e13d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx90a -verify-machineinstrs | FileCheck %s
+
+define amdgpu_kernel void @copy_to_reg_frameindex(ptr addrspace(1) %out, i32 %a, i32 %b, i32 %c) {
+; CHECK-LABEL: copy_to_reg_frameindex:
+; CHECK: ; %bb.0: ; %entry
+; CHECK-NEXT: ; implicit-def: $vgpr0
+; CHECK-NEXT: .LBB0_1: ; %loop
+; CHECK-NEXT: ; =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: s_cmp_lt_u32 0, 16
+; CHECK-NEXT: s_set_gpr_idx_on 0, gpr_idx(DST)
+; CHECK-NEXT: v_mov_b32_e32 v0, 0
+; CHECK-NEXT: s_set_gpr_idx_off
+; CHECK-NEXT: s_cbranch_scc1 .LBB0_1
+; CHECK-NEXT: ; %bb.2: ; %done
+; CHECK-NEXT: s_load_dwordx2 s[0:1], s[4:5], 0x24
+; CHECK-NEXT: v_mov_b32_e32 v1, 0
+; CHECK-NEXT: s_waitcnt lgkmcnt(0)
+; CHECK-NEXT: global_store_dword v1, v0, s[0:1]
+; CHECK-NEXT: s_endpgm
+entry:
+ %B = srem i32 %c, -1
+ %alloca = alloca [16 x i32], align 4, addrspace(5)
+ br label %loop
+
+loop:
+ %inc = phi i32 [ 0, %entry ], [ %inc.i, %loop ]
+ %ptr = getelementptr [16 x i32], ptr addrspace(5) %alloca, i32 0, i32 %inc
+ store i32 %inc, ptr addrspace(5) %ptr, align 4
+ %inc.i = add i32 %inc, %B
+ %cnd = icmp uge i32 %inc.i, 16
+ br i1 %cnd, label %done, label %loop
+
+done:
+ %tmp1 = load i32, ptr addrspace(5) %alloca, align 4
+ store i32 %tmp1, ptr addrspace(1) %out, align 4
+ ret void
+}
|
0965b0b
to
525f117
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG, thanks
525f117
to
ac4ea31
Compare
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/95/builds/9050 Here is the relevant piece of the build log for the reference
|
@@ -2366,11 +2366,11 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) const { | |||
assert(ST.useVGPRIndexMode()); | |||
Register VecReg = MI.getOperand(0).getReg(); | |||
bool IsUndef = MI.getOperand(1).isUndef(); | |||
Register Idx = MI.getOperand(3).getReg(); | |||
MachineOperand Idx = MI.getOperand(3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use operands by value
@@ -0,0 +1,38 @@ | |||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py | |||
; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx90a -verify-machineinstrs | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need -verify-machineinstrs
; CHECK-NEXT: s_endpgm | ||
entry: | ||
%B = srem i32 %c, -1 | ||
%alloca = alloca [16 x i32], align 4, addrspace(5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is relying on promote alloca turning this into vector indexing? Should pre-fold that in the IR. Also, I thought we disabled the path to use the indexing mode by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that seems to be the case. Promote alloca gives a <16 x i32> which then turns into an unrolled loop which makes a bit of a mess of the test case, agreed on revising it to the vector input
Applied the above feedback in 4f358d7, thanks! |
Oversight found by ISel fuzz effort. Assuming the argument is a register, in some cases it can be an immediate. Tablegen's type for the instruction is SSrc_b32, i.e. register or immediate fine. Added the repro from the bug reporter as a test case - prior to this patch llvm will assert in getReg. Fixes SWDEV-508589
Oversight found by ISel fuzz effort. Assuming the argument is a register, in some cases it can be an immediate. Tablegen's type for the instruction is SSrc_b32, i.e. register or immediate fine. Added the repro from the bug reporter as a test case - prior to this patch llvm will assert in getReg.
Fixes SWDEV-508589