[AMDGPU] iglp.opt does not clobber memory operands #126976

jrbyrnes · 2025-02-12T21:45:05Z

I think it was an accident that this wasn't included.

Change-Id: I2a35580b3935a1578d537e869ac17b2a0052eb9a

llvmbot · 2025-02-12T21:45:37Z

@llvm/pr-subscribers-backend-amdgpu

Author: Jeffrey Byrnes (jrbyrnes)

Changes

I think it was an accident that this wasn't included.

Full diff: https://github.com/llvm/llvm-project/pull/126976.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.cpp (+1)
(added) llvm/test/CodeGen/AMDGPU/iglp-no-clobber.ll (+39)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.cpp
index a5bfdb7bf6eac..57289c3e8bbf4 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.cpp
@@ -367,6 +367,7 @@ bool isReallyAClobber(const Value *Ptr, MemoryDef *Def, AAResults *AA) {
     case Intrinsic::amdgcn_wave_barrier:
     case Intrinsic::amdgcn_sched_barrier:
     case Intrinsic::amdgcn_sched_group_barrier:
+    case Intrinsic::amdgcn_iglp_opt:
       return false;
     default:
       break;
diff --git a/llvm/test/CodeGen/AMDGPU/iglp-no-clobber.ll b/llvm/test/CodeGen/AMDGPU/iglp-no-clobber.ll
new file mode 100644
index 0000000000000..6346f632e537f
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/iglp-no-clobber.ll
@@ -0,0 +1,39 @@
+; RUN: llc -mtriple=amdgcn -mcpu=gfx942 --stop-after=si-fix-sgpr-copies < %s | FileCheck %s
+
+; iglp.opt should not be flagged as clobbering the memory operand for the global_load, and we should be able to
+; lower into the scalar version (i.e. should not need to lower into vector version with waterfall loop)
+; CHECK-NOT: WATERFALL
+
+define amdgpu_kernel void @_attn_forward_fp8e5_128x32x64_BW128(ptr addrspace(1) %in, ptr addrspace(3) %out) {
+.lr.ph:
+  br label %1
+
+1:                                                ; preds = %1, %.lr.ph
+  %addr = phi ptr addrspace(1) [ null, %.lr.ph ], [ %gep, %1 ]
+  %offset = phi i64 [ 0, %.lr.ph ], [ %nextOff, %1 ]
+  %inc = phi i32 [0, %.lr.ph], [ %incCond, %1 ] 
+  %rsrc = tail call ptr addrspace(8) @llvm.amdgcn.make.buffer.rsrc.p1(ptr addrspace(1) %addr, i16 0, i32 0, i32 0)
+  %load = tail call <2 x i32> @llvm.amdgcn.raw.ptr.buffer.load.v2i32(ptr addrspace(8) %rsrc, i32 0, i32 0, i32 0)
+  %load.bc = bitcast <2 x i32> %load to <8 x i8>
+  %load.elem = extractelement <8 x i8> %load.bc, i64 0
+  tail call void @llvm.amdgcn.iglp.opt(i32 0)
+  %vec = insertelement <4 x i8> zeroinitializer, i8 %load.elem, i64 0
+  %vec.bc = bitcast <4 x i8> %vec to <2 x half>
+  %shuff = shufflevector <2 x half> %vec.bc, <2 x half> zeroinitializer, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %gep = getelementptr i8, ptr addrspace(1) %in, i64 %offset
+  %unmaskedload49 = load <1 x i64>, ptr addrspace(1) null, align 8
+  %nextOff = extractelement <1 x i64> %unmaskedload49, i64 0
+  %incCond = add i32 %inc, 1
+  %cond = icmp eq i32 %incCond, 32
+  br i1 %cond, label %2, label %1 
+
+2:
+  store <4 x half> %shuff, ptr addrspace(3) %out, align 8
+  ret void
+}
+
+; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
+declare ptr addrspace(8) @llvm.amdgcn.make.buffer.rsrc.p1(ptr addrspace(1) readnone, i16, i32, i32) #0
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(argmem: read)
+declare <2 x i32> @llvm.amdgcn.raw.ptr.buffer.load.v2i32(ptr addrspace(8) nocapture readonly, i32, i32, i32 immarg) #1
\ No newline at end of file

rampitec

LGTM

Change-Id: Id2ffadfb30dd2aa0edf5d01507c98994c3e41876

llvm/test/CodeGen/AMDGPU/iglp-no-clobber.ll

arsenm · 2025-02-13T04:39:00Z

llvm/test/CodeGen/AMDGPU/iglp-no-clobber.ll

+
+; iglp.opt should not be flagged as clobbering the memory operand for the global_load, and we should be able to
+; lower into the scalar version (i.e. should not need to lower into vector version with waterfall loop)
+; CHECK-NOT: WATERFALL


This is much too fragile of a check. Positively check the output, preferably the final output

Ok - since theres a couple comments I'll open a separate PR

llvm/test/CodeGen/AMDGPU/iglp-no-clobber.ll

I think it was an accident that this wasn't included.

Addressed comments in #126976

Addressed comments in llvm/llvm-project#126976

I think it was an accident that this wasn't included.

Addressed comments in llvm#126976

[AMDGPU] iglp.opt does not clobber memory operands

c688773

Change-Id: I2a35580b3935a1578d537e869ac17b2a0052eb9a

jrbyrnes requested review from kerbowa and rampitec February 12, 2025 21:45

llvmbot added the backend:AMDGPU label Feb 12, 2025

rampitec approved these changes Feb 12, 2025

View reviewed changes

Add newline to test

6ea351c

Change-Id: Id2ffadfb30dd2aa0edf5d01507c98994c3e41876

jrbyrnes merged commit c5a4512 into llvm:main Feb 12, 2025
4 of 5 checks passed

arsenm reviewed Feb 13, 2025

View reviewed changes

flovent pushed a commit to flovent/llvm-project that referenced this pull request Feb 13, 2025

[AMDGPU] iglp.opt does not clobber memory operands (llvm#126976)

51cad11

I think it was an accident that this wasn't included.

jrbyrnes mentioned this pull request Feb 13, 2025

[AMDGPU] NFC: Fix some details for lit test #127141

Merged

joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 14, 2025

[AMDGPU] iglp.opt does not clobber memory operands (llvm#126976)

4158095

I think it was an accident that this wasn't included.

arsenm pushed a commit that referenced this pull request Feb 16, 2025

[AMDGPU] NFC: Fix some details for lit test (#127141)

a1120c9

Addressed comments in #126976

github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 16, 2025

Automerge: [AMDGPU] NFC: Fix some details for lit test (#127141)

11d1eb3

Addressed comments in llvm/llvm-project#126976

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025

[AMDGPU] iglp.opt does not clobber memory operands (llvm#126976)

c83f7d3

I think it was an accident that this wasn't included.

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025

[AMDGPU] NFC: Fix some details for lit test (llvm#127141)

82ede64

Addressed comments in llvm#126976

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[AMDGPU] iglp.opt does not clobber memory operands #126976

[AMDGPU] iglp.opt does not clobber memory operands #126976

Uh oh!

jrbyrnes commented Feb 12, 2025

Uh oh!

llvmbot commented Feb 12, 2025

Uh oh!

rampitec left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

arsenm Feb 13, 2025

Uh oh!

jrbyrnes Feb 13, 2025

Uh oh!

jrbyrnes Feb 13, 2025

Uh oh!

Uh oh!

Uh oh!

[AMDGPU] iglp.opt does not clobber memory operands #126976

[AMDGPU] iglp.opt does not clobber memory operands #126976

Uh oh!

Conversation

jrbyrnes commented Feb 12, 2025

Uh oh!

llvmbot commented Feb 12, 2025

Uh oh!

rampitec left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

arsenm Feb 13, 2025

Choose a reason for hiding this comment

Uh oh!

jrbyrnes Feb 13, 2025

Choose a reason for hiding this comment

Uh oh!

jrbyrnes Feb 13, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!