[AMDGPU] S_SET_GPR_IDX_ON can be passed an immediate index #125086

JonChesterfield · 2025-01-30T16:22:55Z

Oversight found by the ISel fuzzing effort. The expansion assumed the index argument is always a register, but in some cases it can be an immediate. TableGen's operand type for the instruction is SSrc_b32, i.e. either a register or an immediate is fine. Added the repro from the bug reporter as a test case; prior to this patch, LLVM asserts in getReg.

Fixes SWDEV-508589
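
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It does not use LLVM: `ToyOperand`, `emitAssumingReg`, and `emitAnyKind` are hypothetical names invented for this example, and an exception stands in for the assertion that LLVM's `getReg` hits. The point is only that code which unconditionally asks for a register fails on an immediate, while code that forwards the operand as-is (as the patch does with `.add(MI.getOperand(3))`) handles both kinds.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a machine operand. This is NOT LLVM's
// MachineOperand; it only models "register or immediate".
struct ToyOperand {
  enum Kind { Register, Immediate };
  Kind K;
  int64_t Value; // register number or immediate value, depending on K

  static ToyOperand reg(unsigned R) {
    return ToyOperand{Register, static_cast<int64_t>(R)};
  }
  static ToyOperand imm(int64_t V) { return ToyOperand{Immediate, V}; }

  bool isReg() const { return K == Register; }

  // Models the failure: asking an immediate for its register is an error.
  // (LLVM asserts here; this sketch throws so the failure can be observed.)
  unsigned getReg() const {
    if (!isReg())
      throw std::logic_error("getReg() called on a non-register operand");
    return static_cast<unsigned>(Value);
  }
};

// Shape of the old code: assumes the index is always a register.
std::string emitAssumingReg(const ToyOperand &Idx) {
  return "s_set_gpr_idx_on s" + std::to_string(Idx.getReg());
}

// Shape of the fixed code: accepts whichever operand kind it is given.
std::string emitAnyKind(const ToyOperand &Idx) {
  if (Idx.isReg())
    return "s_set_gpr_idx_on s" + std::to_string(Idx.getReg());
  return "s_set_gpr_idx_on " + std::to_string(Idx.Value);
}
```

With an immediate index of 0, `emitAnyKind` yields text of the same shape as the `s_set_gpr_idx_on 0` line in the new test's CHECK output, whereas `emitAssumingReg` fails.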

llvmbot · 2025-01-30T16:23:34Z

@llvm/pr-subscribers-backend-amdgpu

Author: Jon Chesterfield (JonChesterfield)

Changes

Oversight found by the ISel fuzzing effort. The expansion assumed the index argument is always a register, but in some cases it can be an immediate. TableGen's operand type for the instruction is SSrc_b32, i.e. either a register or an immediate is fine. Added the repro from the bug reporter as a test case; prior to this patch, LLVM asserts in getReg.

Fixes SWDEV-508589

Full diff: https://github.com/llvm/llvm-project/pull/125086.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+1-2)
(added) llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll (+38)

diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 5727d14ec49e8a..2c7665f5b8acfa 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -2366,11 +2366,10 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
     assert(ST.useVGPRIndexMode());
     Register VecReg = MI.getOperand(0).getReg();
     bool IsUndef = MI.getOperand(1).isUndef();
-    Register Idx = MI.getOperand(3).getReg();
     Register SubReg = MI.getOperand(4).getImm();
 
     MachineInstr *SetOn = BuildMI(MBB, MI, DL, get(AMDGPU::S_SET_GPR_IDX_ON))
-                              .addReg(Idx)
+                              .add(MI.getOperand(3)) // Index
                               .addImm(AMDGPU::VGPRIndexMode::DST_ENABLE);
     SetOn->getOperand(3).setIsUndef();
 
diff --git a/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll b/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
new file mode 100644
index 00000000000000..d86f497aa5e13d
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx90a -verify-machineinstrs | FileCheck %s
+
+define amdgpu_kernel void @copy_to_reg_frameindex(ptr addrspace(1) %out, i32 %a, i32 %b, i32 %c) {
+; CHECK-LABEL: copy_to_reg_frameindex:
+; CHECK:       ; %bb.0: ; %entry
+; CHECK-NEXT:    ; implicit-def: $vgpr0
+; CHECK-NEXT:  .LBB0_1: ; %loop
+; CHECK-NEXT:    ; =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    s_cmp_lt_u32 0, 16
+; CHECK-NEXT:    s_set_gpr_idx_on 0, gpr_idx(DST)
+; CHECK-NEXT:    v_mov_b32_e32 v0, 0
+; CHECK-NEXT:    s_set_gpr_idx_off
+; CHECK-NEXT:    s_cbranch_scc1 .LBB0_1
+; CHECK-NEXT:  ; %bb.2: ; %done
+; CHECK-NEXT:    s_load_dwordx2 s[0:1], s[4:5], 0x24
+; CHECK-NEXT:    v_mov_b32_e32 v1, 0
+; CHECK-NEXT:    s_waitcnt lgkmcnt(0)
+; CHECK-NEXT:    global_store_dword v1, v0, s[0:1]
+; CHECK-NEXT:    s_endpgm
+entry:
+  %B = srem i32 %c, -1
+  %alloca = alloca [16 x i32], align 4, addrspace(5)
+  br label %loop
+
+loop:
+  %inc = phi i32 [ 0, %entry ], [ %inc.i, %loop ]
+  %ptr = getelementptr [16 x i32], ptr addrspace(5) %alloca, i32 0, i32 %inc
+  store i32 %inc, ptr addrspace(5) %ptr, align 4
+  %inc.i = add i32 %inc, %B
+  %cnd = icmp uge i32 %inc.i, 16
+  br i1 %cnd, label %done, label %loop
+
+done:
+  %tmp1 = load i32, ptr addrspace(5) %alloca, align 4
+  store i32 %tmp1, ptr addrspace(1) %out, align 4
+  ret void
+}

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp

jhuber6

LG, thanks

llvm-ci · 2025-01-30T16:47:54Z

LLVM Buildbot has detected a new failure on builder clang-ppc64le-linux-test-suite running on ppc64le-clang-test-suite while building llvm at step 4 "cmake-configure".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/95/builds/9050

Here is the relevant piece of the build log for the reference

Step 4 (cmake-configure) failure: cmake (failure)
CMake Deprecation Warning at /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/llvm-project/cmake/Modules/CMakePolicy.cmake:6 (cmake_policy):
  The OLD behavior for policy CMP0116 will be removed from a future version
  of CMake.

  The cmake-policies(7) manual explains that the OLD behaviors of all
  policies are deprecated and that a policy should be set to OLD only under
  specific short-term circumstances.  Projects should be ported to the NEW
  behavior and not rely on setting a policy to OLD.
Call Stack (most recent call first):
  CMakeLists.txt:8 (include)


CMake Error at CMakeLists.txt:47 (project):
  Running

   '/usr/bin/ninja-build' '--version'

  failed with:

   no such file or directory


-- Configuring incomplete, errors occurred!

arsenm · 2025-01-30T17:35:06Z

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp

@@ -2366,11 +2366,11 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
    assert(ST.useVGPRIndexMode());
    Register VecReg = MI.getOperand(0).getReg();
    bool IsUndef = MI.getOperand(1).isUndef();
-    Register Idx = MI.getOperand(3).getReg();
+    MachineOperand Idx = MI.getOperand(3);


Don't use operands by value
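
One way to read this comment: declaring a local of the operand type copies the operand, whereas binding a const reference (or passing `MI.getOperand(3)` straight to the builder, as the merged diff does) does not. The sketch below is a toy model, not LLVM code; `CountedOperand` and `ToyInstr` are hypothetical types that merely count copies so the difference is observable.

```cpp
#include <cassert>

// Hypothetical operand type that counts how often it is copied.
struct CountedOperand {
  int Imm;
  static int &copies() {
    static int N = 0;
    return N;
  }
  explicit CountedOperand(int I) : Imm(I) {}
  CountedOperand(const CountedOperand &O) : Imm(O.Imm) { ++copies(); }
};

// Hypothetical instruction owning a single operand.
struct ToyInstr {
  CountedOperand Op;
  explicit ToyInstr(int I) : Op(I) {}
  const CountedOperand &getOperand() const { return Op; }
};

// Declaring a local by value makes a detached copy of the operand.
int readByValue(const ToyInstr &MI) {
  CountedOperand Idx = MI.getOperand();
  return Idx.Imm;
}

// Binding a const reference refers to the instruction's own operand.
int readByReference(const ToyInstr &MI) {
  const CountedOperand &Idx = MI.getOperand();
  return Idx.Imm;
}
```

Both functions read the same value; only `readByValue` increments the copy counter.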

arsenm · 2025-01-30T17:35:18Z

llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll

@@ -0,0 +1,38 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=amdgcn -mcpu=gfx90a -verify-machineinstrs | FileCheck %s


Don't need -verify-machineinstrs

arsenm · 2025-01-30T17:36:32Z

llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll

+; CHECK-NEXT:    s_endpgm
+entry:
+  %B = srem i32 %c, -1
+  %alloca = alloca [16 x i32], align 4, addrspace(5)


I assume this relies on promote-alloca turning this into vector indexing? That should be pre-folded in the IR. Also, I thought we disabled the path that uses the indexing mode by default.

Yeah, that seems to be the case. Promote alloca gives a <16 x i32>, which then turns into an unrolled loop that makes a bit of a mess of the test case. Agreed on revising it to take a vector input.

JonChesterfield · 2025-01-30T20:46:19Z

Applied the above feedback in 4f358d7, thanks!

JonChesterfield requested review from jayfoad, arsenm, ronlieb and jhuber6 January 30, 2025 16:22

llvmbot added the backend:AMDGPU label Jan 30, 2025

jhuber6 reviewed Jan 30, 2025

JonChesterfield force-pushed the jc_frameindex_assert branch from 0965b0b to 525f117 Compare January 30, 2025 16:31

jhuber6 approved these changes Jan 30, 2025

[AMDGPU] S_SET_GPR_IDX_ON can be passed an immediate index

ac4ea31

JonChesterfield force-pushed the jc_frameindex_assert branch from 525f117 to ac4ea31 Compare January 30, 2025 16:35

JonChesterfield merged commit c39fba2 into llvm:main Jan 30, 2025
5 of 6 checks passed

JonChesterfield deleted the jc_frameindex_assert branch January 30, 2025 16:40

arsenm reviewed Jan 30, 2025
