Skip to content

[llvm][opt][Transforms] Replacement calloc should match replaced malloc #110524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 1, 2024

Conversation

AlexVlx
Copy link
Contributor

@AlexVlx AlexVlx commented Sep 30, 2024

Currently DSE unconditionally emits calloc as returning a pointer to AS0. However, this is incorrect for targets that have a non-zero default AS, as it'd not match the malloc signature. This patch addresses that by piping through the AS for the pointer returned by malloc into the calloc insertion call.

@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Alex Voicu (AlexVlx)

Changes

Currently DSE unconditionally emits calloc as returning a pointer to AS0. However, this is incorrect for targets that have a non-zero default AS, as it'd not match the malloc signature. This patch addresses that by piping through the AS for the pointer returned by malloc into the calloc insertion call.


Full diff: https://github.com/llvm/llvm-project/pull/110524.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Transforms/Utils/BuildLibCalls.h (+1-1)
  • (modified) llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (+2-1)
  • (modified) llvm/lib/Transforms/Utils/BuildLibCalls.cpp (+3-2)
  • (added) llvm/test/Transforms/DeadStoreElimination/malloc-to-calloc-with-nonzero-default-as.ll (+63)
diff --git a/llvm/include/llvm/Transforms/Utils/BuildLibCalls.h b/llvm/include/llvm/Transforms/Utils/BuildLibCalls.h
index 1979c4af770b02..8d6d8e21cc9abf 100644
--- a/llvm/include/llvm/Transforms/Utils/BuildLibCalls.h
+++ b/llvm/include/llvm/Transforms/Utils/BuildLibCalls.h
@@ -251,7 +251,7 @@ namespace llvm {
 
   /// Emit a call to the calloc function.
   Value *emitCalloc(Value *Num, Value *Size, IRBuilderBase &B,
-                    const TargetLibraryInfo &TLI);
+                    const TargetLibraryInfo &TLI, unsigned AddrSpace = 0);
 
   /// Emit a call to the hot/cold operator new function.
   Value *emitHotColdNew(Value *Num, IRBuilderBase &B,
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index a304f7b056f5f7..39c3d76974a65c 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -1945,7 +1945,8 @@ struct DSEState {
     IRBuilder<> IRB(Malloc);
     Type *SizeTTy = Malloc->getArgOperand(0)->getType();
     auto *Calloc = emitCalloc(ConstantInt::get(SizeTTy, 1),
-                              Malloc->getArgOperand(0), IRB, TLI);
+                              Malloc->getArgOperand(0), IRB, TLI,
+                              Malloc->getType()->getPointerAddressSpace());
     if (!Calloc)
       return false;
 
diff --git a/llvm/lib/Transforms/Utils/BuildLibCalls.cpp b/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
index d4727dece19f62..06d0d6bab224bc 100644
--- a/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
+++ b/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
@@ -1978,7 +1978,7 @@ Value *llvm::emitMalloc(Value *Num, IRBuilderBase &B, const DataLayout &DL,
 }
 
 Value *llvm::emitCalloc(Value *Num, Value *Size, IRBuilderBase &B,
-                        const TargetLibraryInfo &TLI) {
+                        const TargetLibraryInfo &TLI, unsigned AddrSpace) {
   Module *M = B.GetInsertBlock()->getModule();
   if (!isLibFuncEmittable(M, &TLI, LibFunc_calloc))
     return nullptr;
@@ -1986,7 +1986,8 @@ Value *llvm::emitCalloc(Value *Num, Value *Size, IRBuilderBase &B,
   StringRef CallocName = TLI.getName(LibFunc_calloc);
   Type *SizeTTy = getSizeTTy(B, &TLI);
   FunctionCallee Calloc = getOrInsertLibFunc(M, TLI, LibFunc_calloc,
-                                             B.getPtrTy(), SizeTTy, SizeTTy);
+                                             B.getPtrTy(AddrSpace), SizeTTy,
+                                             SizeTTy);
   inferNonMandatoryLibFuncAttrs(M, CallocName, TLI);
   CallInst *CI = B.CreateCall(Calloc, {Num, Size}, CallocName);
 
diff --git a/llvm/test/Transforms/DeadStoreElimination/malloc-to-calloc-with-nonzero-default-as.ll b/llvm/test/Transforms/DeadStoreElimination/malloc-to-calloc-with-nonzero-default-as.ll
new file mode 100644
index 00000000000000..78bde6e87ef6c2
--- /dev/null
+++ b/llvm/test/Transforms/DeadStoreElimination/malloc-to-calloc-with-nonzero-default-as.ll
@@ -0,0 +1,63 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=dse < %s | FileCheck %s
+
+%struct.type = type { ptr addrspace(4), ptr addrspace(4) }
+
+define ptr @malloc_to_calloc() {
+; CHECK-LABEL: define ptr @malloc_to_calloc() {
+; CHECK-NEXT:    [[STRUCT_ALLOCA:%.*]] = alloca [[STRUCT_TYPE:%.*]], align 8
+; CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 56, ptr nonnull [[STRUCT_ALLOCA]]) #[[ATTR6:[0-9]+]]
+; CHECK-NEXT:    [[STRUCT_BYTE_8:%.*]] = getelementptr inbounds i8, ptr [[STRUCT_ALLOCA]], i64 8
+; CHECK-NEXT:    [[STRUCT_BYTE_4:%.*]] = getelementptr inbounds i8, ptr [[STRUCT_ALLOCA]], i64 4
+; CHECK-NEXT:    [[CALLOC1:%.*]] = call ptr addrspace(4) @calloc(i64 1, i64 4)
+; CHECK-NEXT:    [[CALLOC:%.*]] = call ptr addrspace(4) @calloc(i64 1, i64 4)
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, ptr [[STRUCT_BYTE_8]], i64 4
+; CHECK-NEXT:    call void @llvm.memset.p0.i64(ptr noundef nonnull align 4 [[TMP1]], i8 42, i64 4, i1 false)
+; CHECK-NEXT:    store i32 43, ptr [[STRUCT_BYTE_8]], align 4
+; CHECK-NEXT:    [[RET:%.*]] = load ptr, ptr [[STRUCT_BYTE_8]], align 8
+; CHECK-NEXT:    call void @readnone(ptr [[STRUCT_BYTE_4]])
+; CHECK-NEXT:    call void @readnone(ptr [[STRUCT_BYTE_8]])
+; CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 56, ptr nonnull [[STRUCT_ALLOCA]]) #[[ATTR6]]
+; CHECK-NEXT:    call void @use(ptr addrspace(4) [[CALLOC1]])
+; CHECK-NEXT:    call void @use(ptr addrspace(4) [[CALLOC]])
+; CHECK-NEXT:    ret ptr [[RET]]
+;
+  %struct.alloca = alloca %struct.type, align 8
+  call void @llvm.lifetime.start.p4(i64 56, ptr nonnull %struct.alloca) nounwind
+  %struct.byte.8 = getelementptr inbounds i8, ptr %struct.alloca, i64 8
+  %struct.byte.4 = getelementptr inbounds i8, ptr %struct.alloca, i64 4
+
+  ; Set of removable memory deffs
+  %m1 = tail call ptr addrspace(4) @malloc(i64 4)
+  %m2 = tail call ptr addrspace(4) @malloc(i64 4)
+  store i32 0, ptr %struct.byte.4
+  store i32 0, ptr %struct.byte.8
+  call void @llvm.memset.p4.i64(ptr addrspace(4) noundef nonnull align 4 %m2, i8 0, i64 4, i1 false)
+  call void @llvm.memset.p4.i64(ptr addrspace(4) noundef nonnull align 4 %m1, i8 0, i64 4, i1 false)
+
+  ; Set %struct.alloca[8, 16) to 42.
+  call void @llvm.memset.p0.i64(ptr noundef nonnull align 4 %struct.byte.8, i8 42, i64 8, i1 false)
+  ; Set %struct.alloca[8, 12) to 43.
+  store i32 43, ptr %struct.byte.8, align 4
+  ; Set %struct.alloca[4, 8) to 44.
+  store i32 44, ptr %struct.byte.4, align 4
+  ; Return %struct.alloca[8, 16).
+  %ret = load ptr, ptr %struct.byte.8
+  call void @readnone(ptr %struct.byte.4);
+  call void @readnone(ptr %struct.byte.8);
+  call void @llvm.lifetime.end.p0(i64 56, ptr nonnull %struct.alloca) nounwind
+  call void @use(ptr addrspace(4) %m1)
+  call void @use(ptr addrspace(4) %m2)
+  ret ptr %ret
+}
+
+declare void @llvm.memset.p0.i64(ptr nocapture writeonly, i8, i64, i1 immarg)
+declare void @llvm.memset.p4.i64(ptr addrspace(4) nocapture writeonly, i8, i64, i1 immarg)
+declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture)
+declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture)
+
+declare noalias ptr addrspace(4) @malloc(i64) willreturn allockind("alloc,uninitialized") "alloc-family"="malloc"
+declare void @readnone(ptr) readnone nounwind
+declare void @free(ptr addrspace(4) nocapture) allockind("free") "alloc-family"="malloc"
+
+declare void @use(ptr addrspace(4))

Copy link

github-actions bot commented Sep 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -251,7 +251,7 @@ namespace llvm {

/// Emit a call to the calloc function.
Value *emitCalloc(Value *Num, Value *Size, IRBuilderBase &B,
const TargetLibraryInfo &TLI);
const TargetLibraryInfo &TLI, unsigned AddrSpace = 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prevent mistakes, can we make this a required parameter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, I did not because it seems that most interfaces pass it as a defaulted parameter. I do agree with you that it'd be better not to do that though, and since this is only caller in one place, the impact is limited.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AlexVlx AlexVlx merged commit 4852374 into llvm:main Oct 1, 2024
8 checks passed
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024
…alloc` (llvm#110524)

Currently DSE unconditionally emits `calloc` as returning a pointer to
AS0. However, this is incorrect for targets that have a non-zero default
AS, as it'd not match the `malloc` signature. This patch addresses that
by piping through the AS for the pointer returned by `malloc` into the
`calloc` insertion call.
searlmc1 added a commit to ROCm/llvm-project that referenced this pull request Dec 5, 2024
Adds the following patches
AMDGPU: Remove wavefrontsize64 feature from dummy target llvm#117410
[LLVM][NFC] Use used's element type if available llvm#116804
[llvm][AMDGPU] Fold llvm.amdgcn.wavefrontsize early llvm#114481
[clang][Driver][HIP] Add support for mixing AMDGCNSPIRV & concrete offload-archs. llvm#113509
[clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V llvm#110695
[llvm][opt][Transforms] Replacement calloc should match replaced malloc llvm#110524
[clang][HIP] Don't use the OpenCLKernel CC when targeting AMDGCNSPIRV llvm#110447
[cuda][HIP] constant should imply constant llvm#110182
[llvm][SPIRV] Expose fast popcnt support for SPIR-V targets llvm#109845
[clang][CodeGen][SPIR-V] Fix incorrect SYCL usage, implement missing interface llvm#109415
[SPIRV][RFC] Rework / extend support for memory scopes llvm#106429
[clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. llvm#102776

Change-Id: I2b9ab54aba1c9345b9b0eb84409e6ed6c3cdb6cd
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants