[AMDGPU] Skip handling of non-byte types in promote alloca. #128769

sgundapa · 2025-02-25T20:22:02Z

Non-byte-sized types such as i1 could be packed and supported, but for the time being allocas of these types are not promoted.

Issue found by fuzzer.

llvmbot · 2025-02-25T20:22:35Z

@llvm/pr-subscribers-backend-amdgpu

Author: Sumanth Gundapaneni (sgundapa)

Changes

Non-byte types like i1 can be packed and be supported. For the time being these types are not promoted.

Issue found by fuzzer.

Full diff: https://github.com/llvm/llvm-project/pull/128769.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (+9-2)
(added) llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll (+21)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 28016b5936ccf..007f930cea4f3 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -759,6 +759,14 @@ bool AMDGPUPromoteAllocaImpl::tryPromoteAllocaToVector(AllocaInst &Alloca) {
     return false;
   }
 
+  Type *VecEltTy = VectorTy->getElementType();
+  constexpr unsigned SIZE_OF_BYTE = 8;
+  unsigned ElementSizeInBits = DL->getTypeSizeInBits(VecEltTy);
+  // FIXME: The non-byte type like i1 can be packed and be supported, but
+  // currently we do not handle them.
+  if (ElementSizeInBits % SIZE_OF_BYTE != 0)
+    return false;
+
   std::map<GetElementPtrInst *, WeakTrackingVH> GEPVectorIdx;
   SmallVector<Instruction *> WorkList;
   SmallVector<Instruction *> UsersToRemove;
@@ -776,8 +784,7 @@ bool AMDGPUPromoteAllocaImpl::tryPromoteAllocaToVector(AllocaInst &Alloca) {
 
   LLVM_DEBUG(dbgs() << "  Attempting promotion to: " << *VectorTy << "\n");
 
-  Type *VecEltTy = VectorTy->getElementType();
-  unsigned ElementSize = DL->getTypeSizeInBits(VecEltTy) / 8;
+  unsigned ElementSize = ElementSizeInBits / SIZE_OF_BYTE;
   for (auto *U : Uses) {
     Instruction *Inst = cast<Instruction>(U->getUser());
 
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll
new file mode 100644
index 0000000000000..3d2234f0a7ac3
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -passes=amdgpu-promote-alloca < %s | FileCheck %s
+
+; Verify that we do not crash and not promote non-byte alloca types.
+define <8 x i1> @non_byte_alloca_type() {
+; CHECK-LABEL: define <8 x i1> @non_byte_alloca_type() {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[C:%.*]] = icmp ugt <16 x i1> zeroinitializer, zeroinitializer
+; CHECK-NEXT:    [[RP:%.*]] = alloca <8 x i1>, align 1
+; CHECK-NEXT:    [[TMP0:%.*]] = load <8 x i1>, ptr [[RP]], align 1
+; CHECK-NEXT:    store <16 x i1> [[C]], ptr [[RP]], align 2
+; CHECK-NEXT:    ret <8 x i1> [[TMP0]]
+;
+entry:
+  %C = icmp ugt <16 x i1> zeroinitializer, zeroinitializer
+  %RP = alloca <8 x i1>, align 1
+  %0 = load <8 x i1>, ptr %RP, align 1
+  store <16 x i1> %C, ptr %RP, align 2
+  ret <8 x i1> %0
+}
+

shiltian · 2025-02-25T21:07:08Z

llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp

@@ -776,8 +784,7 @@ bool AMDGPUPromoteAllocaImpl::tryPromoteAllocaToVector(AllocaInst &Alloca) {

  LLVM_DEBUG(dbgs() << "  Attempting promotion to: " << *VectorTy << "\n");

-  Type *VecEltTy = VectorTy->getElementType();
-  unsigned ElementSize = DL->getTypeSizeInBits(VecEltTy) / 8;
+  unsigned ElementSize = ElementSizeInBits / SIZE_OF_BYTE;


IIUC, SIZE_OF_BYTE is defined by whatever compiler compiles LLVM rather than by the AMDGPU target.

Do you mean deriving the value from the data layout, with something like "DL.getTypeSizeInBits(Type::getInt8Ty(M->getContext()))"?

I have defined it as "constexpr unsigned SIZE_OF_BYTE = 8" on line 763. Should I pick a different name?

Oh, I missed that part. Hardcoding 8 is probably fine for now and for the near future, but the proper approach is definitely to query DL.

arsenm

Looking at the actual code, I don't see why this doesn't just work for this case. Is the assert wrong?

arsenm · 2025-02-26T03:47:47Z

llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll

+; CHECK-NEXT:    ret <8 x i1> [[TMP0]]
+;
+entry:
+  %C = icmp ugt <16 x i1> zeroinitializer, zeroinitializer


Use something that can't fold away

arsenm · 2025-02-26T03:48:03Z

llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll

+;
+entry:
+  %C = icmp ugt <16 x i1> zeroinitializer, zeroinitializer
+  %RP = alloca <8 x i1>, align 1


Use the correct alloca address space. Also, this issue isn't about the UB under-alignment, so correct that.

That's correct. Here is an example that might trigger UB:

@g = global <8 x float> <float 4.200000e+01, float 4.200000e+01, float 4.200000e+01, float 4.200000e+01, float 4.200000e+01, float 4.200000e+01, float 4.200000e+01, float 4.200000e+01>

define <8 x i1> @f(float %0, i32 %1, i16 %2) {
BB:
  %LGV = load <8 x float>, ptr @g, align 32
  %RP = alloca <8 x i1>, align 1
  %L = load <8 x float>, ptr %RP, align 32
  %C = fcmp olt <8 x float> %L, %LGV
  ret <8 x i1> %C
}

Also, if you do not specify the address space, wouldn't it default to the generic address space, which is "0"?

arsenm · 2025-02-26T03:48:40Z

llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp

+  unsigned ElementSizeInBits = DL->getTypeSizeInBits(VecEltTy);
+  // FIXME: The non-byte type like i1 can be packed and be supported, but
+  // currently we do not handle them.
+  if (ElementSizeInBits % SIZE_OF_BYTE != 0)


Best to replicate typeSizeEqualsStoreSize

Thanks. Will do
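
To illustrate what a typeSizeEqualsStoreSize-style check expresses, here is a minimal standalone sketch. It is not LLVM code: `storeSizeInBytes` and `sizeEqualsStoreSize` are hypothetical helpers operating on plain bit widths, and the sketch assumes a type's store size is its bit width rounded up to whole bytes.

```cpp
#include <cstdint>

// Hypothetical stand-in for a DataLayout store-size query (not LLVM code).
// Assumption: store size = bit width rounded up to a whole number of bytes.
constexpr std::uint64_t storeSizeInBytes(std::uint64_t sizeInBits) {
  return (sizeInBits + 7) / 8;
}

// True when the type fills its storage exactly, i.e. it has no padding bits.
constexpr bool sizeEqualsStoreSize(std::uint64_t sizeInBits) {
  return sizeInBits == storeSizeInBytes(sizeInBits) * 8;
}

// Compile-time checks of the model for a few element widths.
static_assert(!sizeEqualsStoreSize(1), "i1 uses 1 bit of a 1-byte slot");
static_assert(sizeEqualsStoreSize(8), "i8 fills its byte");
static_assert(sizeEqualsStoreSize(32), "i32 or float fills 4 bytes");
static_assert(!sizeEqualsStoreSize(17), "i17 is padded to 24 bits");
```

Under this model the result is the same as testing `ElementSizeInBits % 8 == 0` on the element type; the suggested form simply states the intent (no padding bits) rather than the arithmetic.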

arsenm · 2025-02-26T03:50:01Z

llvm/test/CodeGen/AMDGPU/promote-alloca-skip-non-byte-type.ll

+  store <16 x i1> %C, ptr %RP, align 2
+  ret <8 x i1> %0
+}
+


Can you add some tests for the scalar case? Only the subvector extract was a problem?

The assertion triggered here is due to the subvector being <2 x i1> while the access type is <16 x i1>.
The access size for <16 x i1> is 2 bytes. The computation that derives the subvector relies on this access size and ends up with <2 x i1>, which triggers the assert because the store sizes do not match.

assert(DL.getTypeStoreSize(SubVecTy) == DL.getTypeStoreSize(AccessTy));

Yes, the assertions I am seeing are all triggered while handling subvectors for loads and stores.
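
The arithmetic behind that mismatch can be modeled in a few lines of standalone C++. This is a sketch under stated assumptions, not the pass's code: all helper names are hypothetical, vectors are assumed to be bit-packed, store size is assumed to be the bit width rounded up to bytes, and the subvector element count is assumed to be the access store size divided by the element store size.

```cpp
#include <cstdint>

// Assumption: store size = bit width rounded up to whole bytes.
constexpr std::uint64_t storeSizeInBytes(std::uint64_t sizeInBits) {
  return (sizeInBits + 7) / 8;
}

// Store size of <numElts x iN>, assuming elements are bit-packed.
constexpr std::uint64_t vectorStoreSize(std::uint64_t numElts,
                                        std::uint64_t eltBits) {
  return storeSizeInBytes(numElts * eltBits);
}

// Assumed derivation: subvector length = access bytes / element store bytes.
constexpr std::uint64_t subVectorElts(std::uint64_t accessElts,
                                      std::uint64_t eltBits) {
  return vectorStoreSize(accessElts, eltBits) / storeSizeInBytes(eltBits);
}

// Models the quoted assert: store size of the subvector == store size of the access.
constexpr bool subVectorMatchesAccess(std::uint64_t accessElts,
                                      std::uint64_t eltBits) {
  return vectorStoreSize(subVectorElts(accessElts, eltBits), eltBits) ==
         vectorStoreSize(accessElts, eltBits);
}

// <16 x i1>: 2 bytes / 1 byte per i1 gives <2 x i1>, which stores in 1 byte.
static_assert(subVectorElts(16, 1) == 2, "subvector is <2 x i1>");
static_assert(!subVectorMatchesAccess(16, 1), "1 byte != 2 bytes");
// Byte-sized elements round-trip: <16 x i8> stays 16 elements and 16 bytes.
static_assert(subVectorMatchesAccess(16, 8), "byte-sized elements match");
```

In this model the mismatch only appears when the element's bit width differs from its store size, which is consistent with the assertions showing up on the subvector load and store paths.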

sgundapa · 2025-03-10T14:32:30Z

ping

ritter-x2a · 2025-04-14T09:23:39Z

#134042, which subsumes this PR, has landed in trunk.

sgundapa · 2025-04-21T14:28:01Z

This issue is addressed in #134042.

[AMDGPU] Skip handling non-byte types in promote alloca.

4082aa0

Non-byte types like i1 can be packed and be supported. For the time being these types are not promoted. Issue found by fuzzer.

sgundapa requested review from arsenm and bcahoon February 25, 2025 20:22

llvmbot added the backend:AMDGPU label Feb 25, 2025

sgundapa changed the title ~~[AMDGPU] Skip handling non-byte types in promote alloca.~~ [AMDGPU] Skip handling of non-byte types in promote alloca. Feb 25, 2025

shiltian reviewed Feb 25, 2025

arsenm reviewed Feb 26, 2025

arsenm requested a review from Pierre-vh February 26, 2025 09:54

arsenm mentioned this pull request Apr 2, 2025

[AMDGPU] Avoid crashes for non-byte-sized types in PromoteAlloca #134042

Merged

sgundapa closed this Apr 21, 2025

sgundapa deleted the alloca branch April 21, 2025 14:28
