Skip to content

[AMDGPU] S_SET_GPR_IDX_ON can be passed an immediate index #125086

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 30, 2025

Conversation

JonChesterfield
Copy link
Collaborator

Oversight found by ISel fuzz effort. Assuming the argument is a register, in some cases it can be an immediate. Tablegen's type for the instruction is SSrc_b32, i.e. register or immediate fine. Added the repro from the bug reporter as a test case - prior to this patch llvm will assert in getReg.

Fixes SWDEV-508589

@llvmbot
Copy link
Member

llvmbot commented Jan 30, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Jon Chesterfield (JonChesterfield)

Changes

Oversight found by ISel fuzz effort. Assuming the argument is a register, in some cases it can be an immediate. Tablegen's type for the instruction is SSrc_b32, i.e. register or immediate fine. Added the repro from the bug reporter as a test case - prior to this patch llvm will assert in getReg.

Fixes SWDEV-508589


Full diff: https://github.com/llvm/llvm-project/pull/125086.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+1-2)
  • (added) llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll (+38)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 5727d14ec49e8a..2c7665f5b8acfa 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -2366,11 +2366,10 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
     assert(ST.useVGPRIndexMode());
     Register VecReg = MI.getOperand(0).getReg();
     bool IsUndef = MI.getOperand(1).isUndef();
-    Register Idx = MI.getOperand(3).getReg();
     Register SubReg = MI.getOperand(4).getImm();
 
     MachineInstr *SetOn = BuildMI(MBB, MI, DL, get(AMDGPU::S_SET_GPR_IDX_ON))
-                              .addReg(Idx)
+                              .add(MI.getOperand(3)) // Index
                               .addImm(AMDGPU::VGPRIndexMode::DST_ENABLE);
     SetOn->getOperand(3).setIsUndef();
 
diff --git a/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll b/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
new file mode 100644
index 00000000000000..d86f497aa5e13d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx90a -verify-machineinstrs | FileCheck %s
+
+define amdgpu_kernel void @copy_to_reg_frameindex(ptr addrspace(1) %out, i32 %a, i32 %b, i32 %c) {
+; CHECK-LABEL: copy_to_reg_frameindex:
+; CHECK:       ; %bb.0: ; %entry
+; CHECK-NEXT:    ; implicit-def: $vgpr0
+; CHECK-NEXT:  .LBB0_1: ; %loop
+; CHECK-NEXT:    ; =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    s_cmp_lt_u32 0, 16
+; CHECK-NEXT:    s_set_gpr_idx_on 0, gpr_idx(DST)
+; CHECK-NEXT:    v_mov_b32_e32 v0, 0
+; CHECK-NEXT:    s_set_gpr_idx_off
+; CHECK-NEXT:    s_cbranch_scc1 .LBB0_1
+; CHECK-NEXT:  ; %bb.2: ; %done
+; CHECK-NEXT:    s_load_dwordx2 s[0:1], s[4:5], 0x24
+; CHECK-NEXT:    v_mov_b32_e32 v1, 0
+; CHECK-NEXT:    s_waitcnt lgkmcnt(0)
+; CHECK-NEXT:    global_store_dword v1, v0, s[0:1]
+; CHECK-NEXT:    s_endpgm
+entry:
+  %B = srem i32 %c, -1
+  %alloca = alloca [16 x i32], align 4, addrspace(5)
+  br label %loop
+
+loop:
+  %inc = phi i32 [ 0, %entry ], [ %inc.i, %loop ]
+  %ptr = getelementptr [16 x i32], ptr addrspace(5) %alloca, i32 0, i32 %inc
+  store i32 %inc, ptr addrspace(5) %ptr, align 4
+  %inc.i = add i32 %inc, %B
+  %cnd = icmp uge i32 %inc.i, 16
+  br i1 %cnd, label %done, label %loop
+
+done:
+  %tmp1 = load i32, ptr addrspace(5) %alloca, align 4
+  store i32 %tmp1, ptr addrspace(1) %out, align 4
+  ret void
+}

Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG, thanks

@JonChesterfield JonChesterfield merged commit c39fba2 into llvm:main Jan 30, 2025
5 of 6 checks passed
@JonChesterfield JonChesterfield deleted the jc_frameindex_assert branch January 30, 2025 16:40
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 30, 2025

LLVM Buildbot has detected a new failure on builder clang-ppc64le-linux-test-suite running on ppc64le-clang-test-suite while building llvm at step 4 "cmake-configure".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/95/builds/9050

Here is the relevant piece of the build log for the reference
Step 4 (cmake-configure) failure: cmake (failure)
CMake Deprecation Warning at /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/llvm-project/cmake/Modules/CMakePolicy.cmake:6 (cmake_policy):
  The OLD behavior for policy CMP0116 will be removed from a future version
  of CMake.

  The cmake-policies(7) manual explains that the OLD behaviors of all
  policies are deprecated and that a policy should be set to OLD only under
  specific short-term circumstances.  Projects should be ported to the NEW
  behavior and not rely on setting a policy to OLD.
Call Stack (most recent call first):
  CMakeLists.txt:8 (include)


CMake Error at CMakeLists.txt:47 (project):
  Running

   '/usr/bin/ninja-build' '--version'

  failed with:

   no such file or directory


-- Configuring incomplete, errors occurred!

@@ -2366,11 +2366,11 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
assert(ST.useVGPRIndexMode());
Register VecReg = MI.getOperand(0).getReg();
bool IsUndef = MI.getOperand(1).isUndef();
Register Idx = MI.getOperand(3).getReg();
MachineOperand Idx = MI.getOperand(3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use operands by value

@@ -0,0 +1,38 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx90a -verify-machineinstrs | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need -verify-machineinstrs

; CHECK-NEXT: s_endpgm
entry:
%B = srem i32 %c, -1
%alloca = alloca [16 x i32], align 4, addrspace(5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is relying on promote alloca turning this into vector indexing? Should pre-fold that in the IR. Also, I thought we disabled the path to use the indexing mode by default

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that seems to be the case. Promote alloca gives a <16 x i32> which then turns into an unrolled loop which makes a bit of a mess of the test case, agreed on revising it to the vector input

@JonChesterfield
Copy link
Collaborator Author

Applied the above feedback in 4f358d7, thanks!

jrbyrnes pushed a commit to jrbyrnes/llvm-project that referenced this pull request May 27, 2025
Oversight found by ISel fuzz effort. Assuming the argument is a
register, in some cases it can be an immediate. Tablegen's type for the
instruction is SSrc_b32, i.e. register or immediate fine. Added the
repro from the bug reporter as a test case - prior to this patch llvm
will assert in getReg.

Fixes SWDEV-508589
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants