Skip to content

[WIP] Consider datalayout sentinel pointer value for isKnownNonZero check #91769

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ranapratap55
Copy link
Contributor

Updated isKnownNonZero method to consider data layout sentinel pointer value. Related PR #83109.

@ranapratap55 ranapratap55 requested review from arsenm and Pierre-vh May 10, 2024 17:07
@ranapratap55 ranapratap55 requested a review from nikic as a code owner May 10, 2024 17:07
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels May 10, 2024
@llvmbot
Copy link
Member

llvmbot commented May 10, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Rana Pratap Reddy (ranapratap55)

Changes

Updated isKnownNonZero method to consider data layout sentinel pointer value. Related PR #83109.


Patch is 368.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91769.diff

16 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+9)
  • (added) llvm/test/Transforms/Attributor/ArgumentPromotion/array-dl-sentinel.ll (+56)
  • (added) llvm/test/Transforms/Attributor/IPConstantProp/pthreads-dl-sentinel.ll (+122)
  • (added) llvm/test/Transforms/Attributor/callbacks-dl-sentinel.ll (+321)
  • (added) llvm/test/Transforms/Attributor/nocapture-2-dl-sentinel.ll (+892)
  • (added) llvm/test/Transforms/Attributor/nofree-dl-sentinel.ll (+516)
  • (added) llvm/test/Transforms/Attributor/read_write_returned_arguments_scc-dl-sentinel.ll (+336)
  • (added) llvm/test/Transforms/Attributor/returned-dl-sentinel.ll (+1421)
  • (added) llvm/test/Transforms/Attributor/value-simplify-dl-sentinel.ll (+1732)
  • (added) llvm/test/Transforms/InstCombine/alloca-cast-debuginfo-dl-sentinel.ll (+88)
  • (added) llvm/test/Transforms/InstCombine/alloca-in-non-alloca-as-sentinel.ll (+65)
  • (added) llvm/test/Transforms/InstCombine/assume-dl-sentinel.ll (+944)
  • (added) llvm/test/Transforms/InstCombine/assume-icmp-null-select-dl-sentinel.ll (+37)
  • (added) llvm/test/Transforms/InstSimplify/compare-dl-sentinel.ll (+3270)
  • (added) llvm/test/Transforms/InstSimplify/icmp-dl-sentinel.ll (+414)
  • (added) llvm/test/Transforms/InstSimplify/null-ptr-is-valid-dl-sentinel.ll (+23)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 375385aca7a39..51ad01219c6e2 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -3098,6 +3098,15 @@ bool isKnownNonZero(const Value *V, const APInt &DemandedElts,
   // Check for pointer simplifications.
 
   if (PointerType *PtrTy = dyn_cast<PointerType>(Ty)) {
+    const DataLayout &DL = Q.DL;
+    if (DL.isSentinelValueDefined() && PtrTy) {
+      unsigned AddrSpace = PtrTy->getPointerAddressSpace();
+
+      if (DL.getSentinelPointerValue(AddrSpace) == 0)
+        return false;
+      return true;
+    }
+
     // A byval, inalloca may not be null in a non-default addres space. A
     // nonnull argument is assumed never 0.
     if (const Argument *A = dyn_cast<Argument>(V)) {
diff --git a/llvm/test/Transforms/Attributor/ArgumentPromotion/array-dl-sentinel.ll b/llvm/test/Transforms/Attributor/ArgumentPromotion/array-dl-sentinel.ll
new file mode 100755
index 0000000000000..1f6ee8bbc348a
--- /dev/null
+++ b/llvm/test/Transforms/Attributor/ArgumentPromotion/array-dl-sentinel.ll
@@ -0,0 +1,56 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals
+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-z0:1-z2:neg1-z3:neg1-z5:neg1-S32-A5-G1-ni:7:8:9"
+; RUN: opt -aa-pipeline=basic-aa -passes=attributor -attributor-manifest-internal  -attributor-annotate-decl-cs  -S < %s | FileCheck %s --check-prefixes=CHECK,TUNIT
+; RUN: opt -aa-pipeline=basic-aa -passes=attributor-cgscc -attributor-manifest-internal  -attributor-annotate-decl-cs -S < %s | FileCheck %s --check-prefixes=CHECK,CGSCC
+;
+; FIXME: The GEP + BC + GEP solution we create is not great but correct.
+
+declare void @use(ptr nocapture readonly %arg)
+
+define void @caller() {
+; TUNIT-LABEL: define {{[^@]+}}@caller() {
+; TUNIT-NEXT:  entry:
+; TUNIT-NEXT:    [[LEFT:%.*]] = alloca [3 x i32], align 4
+; TUNIT-NEXT:    [[TMP0:%.*]] = load i32, ptr [[LEFT]], align 4
+; TUNIT-NEXT:    [[LEFT_B4:%.*]] = getelementptr i8, ptr [[LEFT]], i64 4
+; TUNIT-NEXT:    [[TMP1:%.*]] = load i32, ptr [[LEFT_B4]], align 4
+; TUNIT-NEXT:    [[LEFT_B8:%.*]] = getelementptr i8, ptr [[LEFT]], i64 8
+; TUNIT-NEXT:    [[TMP2:%.*]] = load i32, ptr [[LEFT_B8]], align 4
+; TUNIT-NEXT:    call void @callee(i32 [[TMP0]], i32 [[TMP1]], i32 [[TMP2]])
+; TUNIT-NEXT:    ret void
+;
+; CGSCC-LABEL: define {{[^@]+}}@caller() {
+; CGSCC-NEXT:  entry:
+; CGSCC-NEXT:    call void @callee(i32 undef, i32 undef, i32 undef)
+; CGSCC-NEXT:    ret void
+;
+entry:
+  %left = alloca [3 x i32], align 4
+  call void @callee(ptr %left)
+  ret void
+}
+
+define internal void @callee(ptr noalias %arg) {
+; CHECK: Function Attrs: memory(readwrite, argmem: none)
+; CHECK-LABEL: define {{[^@]+}}@callee
+; CHECK-SAME: (i32 [[TMP0:%.*]], i32 [[TMP1:%.*]], i32 [[TMP2:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[ARG_PRIV:%.*]] = alloca [3 x i32], align 4, addrspace(5)
+; CHECK-NEXT:    store i32 [[TMP0]], ptr addrspace(5) [[ARG_PRIV]], align 4
+; CHECK-NEXT:    [[ARG_PRIV_B4:%.*]] = getelementptr i8, ptr addrspace(5) [[ARG_PRIV]], i64 4
+; CHECK-NEXT:    store i32 [[TMP1]], ptr addrspace(5) [[ARG_PRIV_B4]], align 4
+; CHECK-NEXT:    [[ARG_PRIV_B8:%.*]] = getelementptr i8, ptr addrspace(5) [[ARG_PRIV]], i64 8
+; CHECK-NEXT:    store i32 [[TMP2]], ptr addrspace(5) [[ARG_PRIV_B8]], align 4
+; CHECK-NEXT:    [[TMP3:%.*]] = addrspacecast ptr addrspace(5) [[ARG_PRIV]] to ptr
+; CHECK-NEXT:    call void @use(ptr noalias nocapture nofree noundef nonnull readonly align 4 dereferenceable(12) [[TMP3]])
+; CHECK-NEXT:    ret void
+;
+entry:
+  call void @use(ptr %arg)
+  ret void
+}
+;.
+; TUNIT: attributes #[[ATTR0]] = { memory(readwrite, argmem: none) }
+;.
+; CGSCC: attributes #[[ATTR0]] = { memory(readwrite, argmem: none) }
+;.
diff --git a/llvm/test/Transforms/Attributor/IPConstantProp/pthreads-dl-sentinel.ll b/llvm/test/Transforms/Attributor/IPConstantProp/pthreads-dl-sentinel.ll
new file mode 100755
index 0000000000000..b93017bd58986
--- /dev/null
+++ b/llvm/test/Transforms/Attributor/IPConstantProp/pthreads-dl-sentinel.ll
@@ -0,0 +1,122 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals
+; RUN: opt -aa-pipeline=basic-aa -passes=attributor -attributor-manifest-internal  -attributor-annotate-decl-cs  -S < %s | FileCheck %s --check-prefixes=CHECK,TUNIT
+; RUN: opt -aa-pipeline=basic-aa -passes=attributor-cgscc -attributor-manifest-internal  -attributor-annotate-decl-cs -S < %s | FileCheck %s --check-prefixes=CHECK,CGSCC
+;
+;    #include <pthread.h>
+;
+;    ptr GlobalVPtr;
+;
+;    static ptr foo(ptr arg) { return arg; }
+;    static ptr bar(ptr arg) { return arg; }
+;
+;    int main() {
+;      pthread_t thread;
+;      pthread_create(&thread, NULL, foo, NULL);
+;      pthread_create(&thread, NULL, bar, &GlobalVPtr);
+;      return 0;
+;    }
+;
+; Verify the constant values NULL and &GlobalVPtr are propagated into foo and
+; bar, respectively.
+;
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-z0:1-z2:neg1-z3:neg1-z5:neg1-S128"
+
+%union.pthread_attr_t = type { i64, [48 x i8] }
+
+@GlobalVPtr = common dso_local global ptr null, align 8
+
+; FIXME: nocapture & noalias for @GlobalVPtr in %call1
+; FIXME: nocapture & noalias for %alloc2 in %call3
+
+;.
+; CHECK: @GlobalVPtr = common dso_local global ptr null, align 8
+;.
+define dso_local i32 @main() {
+; TUNIT-LABEL: define {{[^@]+}}@main() {
+; TUNIT-NEXT:  entry:
+; TUNIT-NEXT:    [[ALLOC11:%.*]] = alloca i8, i32 0, align 8
+; TUNIT-NEXT:    [[ALLOC22:%.*]] = alloca i8, i32 0, align 8
+; TUNIT-NEXT:    [[THREAD:%.*]] = alloca i64, align 8
+; TUNIT-NEXT:    [[CALL:%.*]] = call i32 @pthread_create(ptr noundef nonnull align 8 dereferenceable(8) [[THREAD]], ptr noundef align 4294967296 null, ptr noundef nonnull @foo, ptr nofree readnone align 4294967296 undef)
+; TUNIT-NEXT:    [[CALL1:%.*]] = call i32 @pthread_create(ptr noundef nonnull align 8 dereferenceable(8) [[THREAD]], ptr noundef align 4294967296 null, ptr noundef nonnull @bar, ptr noalias nocapture nofree nonnull readnone align 8 dereferenceable(8) undef)
+; TUNIT-NEXT:    [[CALL2:%.*]] = call i32 @pthread_create(ptr noundef nonnull align 8 dereferenceable(8) [[THREAD]], ptr noundef align 4294967296 null, ptr noundef nonnull @baz, ptr noalias nocapture nofree noundef nonnull readnone align 8 dereferenceable(1) [[ALLOC11]])
+; TUNIT-NEXT:    [[CALL3:%.*]] = call i32 @pthread_create(ptr noundef nonnull align 8 dereferenceable(8) [[THREAD]], ptr noundef align 4294967296 null, ptr noundef nonnull @buz, ptr noalias nofree noundef nonnull readnone align 8 dereferenceable(1) "no-capture-maybe-returned" [[ALLOC22]])
+; TUNIT-NEXT:    ret i32 0
+;
+; CGSCC-LABEL: define {{[^@]+}}@main() {
+; CGSCC-NEXT:  entry:
+; CGSCC-NEXT:    [[ALLOC1:%.*]] = alloca i8, align 8
+; CGSCC-NEXT:    [[ALLOC2:%.*]] = alloca i8, align 8
+; CGSCC-NEXT:    [[THREAD:%.*]] = alloca i64, align 8
+; CGSCC-NEXT:    unreachable
+;
+entry:
+  %alloc1 = alloca i8, align 8
+  %alloc2 = alloca i8, align 8
+  %thread = alloca i64, align 8
+  %call = call i32 @pthread_create(ptr nonnull %thread, ptr null, ptr nonnull @foo, ptr null)
+  %call1 = call i32 @pthread_create(ptr nonnull %thread, ptr null, ptr nonnull @bar, ptr @GlobalVPtr)
+  %call2 = call i32 @pthread_create(ptr nonnull %thread, ptr null, ptr nonnull @baz, ptr nocapture %alloc1)
+  %call3 = call i32 @pthread_create(ptr nonnull %thread, ptr null, ptr nonnull @buz, ptr %alloc2)
+  ret i32 0
+}
+
+declare !callback !0 dso_local i32 @pthread_create(ptr, ptr, ptr, ptr)
+
+define internal ptr @foo(ptr %arg) {
+; CHECK: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
+; CHECK-LABEL: define {{[^@]+}}@foo
+; CHECK-SAME: (ptr noalias nocapture nofree nonnull readnone align 4294967296 [[ARG:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    unreachable
+;
+entry:
+  ret ptr %arg
+}
+
+define internal ptr @bar(ptr %arg) {
+; CHECK: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
+; CHECK-LABEL: define {{[^@]+}}@bar
+; CHECK-SAME: (ptr noalias nocapture nofree nonnull readnone align 8 dereferenceable(8) [[ARG:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    ret ptr @GlobalVPtr
+;
+entry:
+  ret ptr %arg
+}
+
+define internal ptr @baz(ptr %arg) {
+; CHECK: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
+; CHECK-LABEL: define {{[^@]+}}@baz
+; CHECK-SAME: (ptr noalias nofree noundef nonnull readnone returned align 8 dereferenceable(1) "no-capture-maybe-returned" [[ARG:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    ret ptr [[ARG]]
+;
+entry:
+  ret ptr %arg
+}
+
+define internal ptr @buz(ptr %arg) {
+; CHECK: Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
+; CHECK-LABEL: define {{[^@]+}}@buz
+; CHECK-SAME: (ptr noalias nofree noundef nonnull readnone returned align 8 dereferenceable(1) "no-capture-maybe-returned" [[ARG:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    ret ptr [[ARG]]
+;
+entry:
+  ret ptr %arg
+}
+
+!1 = !{i64 2, i64 3, i1 false}
+!0 = !{!1}
+;.
+; TUNIT: attributes #[[ATTR0]] = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) }
+;.
+; CGSCC: attributes #[[ATTR0]] = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) }
+;.
+; TUNIT: [[META0:![0-9]+]] = !{[[META1:![0-9]+]]}
+; TUNIT: [[META1]] = !{i64 2, i64 3, i1 false}
+;.
+; CGSCC: [[META0:![0-9]+]] = !{[[META1:![0-9]+]]}
+; CGSCC: [[META1]] = !{i64 2, i64 3, i1 false}
+;.
diff --git a/llvm/test/Transforms/Attributor/callbacks-dl-sentinel.ll b/llvm/test/Transforms/Attributor/callbacks-dl-sentinel.ll
new file mode 100755
index 0000000000000..1e66ba25f166a
--- /dev/null
+++ b/llvm/test/Transforms/Attributor/callbacks-dl-sentinel.ll
@@ -0,0 +1,321 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals
+; RUN: opt -aa-pipeline=basic-aa -passes=attributor -attributor-manifest-internal  -attributor-annotate-decl-cs  -S < %s | FileCheck %s --check-prefixes=CHECK,TUNIT
+; RUN: opt -aa-pipeline=basic-aa -passes=attributor-cgscc -attributor-manifest-internal  -attributor-annotate-decl-cs -S < %s | FileCheck %s --check-prefixes=CHECK,CGSCC
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-z0:1-z2:neg1-z3:neg1-z5:neg1-S128"
+
+; Test 0
+;
+; Make sure we propagate information from the caller to the callback callee but
+; only for arguments that are mapped through the callback metadata. Here, the
+; first two arguments of the call and the callback callee do not correspond to
+; each other but argument 3-5 of the transitive call site in the caller match
+; arguments 2-4 of the callback callee. Here we should see information and value
+; transfer in both directions.
+
+define void @t0_caller(ptr %a) {
+; TUNIT-LABEL: define {{[^@]+}}@t0_caller
+; TUNIT-SAME: (ptr nonnull align 256 [[A:%.*]]) {
+; TUNIT-NEXT:  entry:
+; TUNIT-NEXT:    [[B:%.*]] = alloca i32, align 32
+; TUNIT-NEXT:    [[C:%.*]] = alloca ptr, align 64
+; TUNIT-NEXT:    [[PTR:%.*]] = alloca i32, align 128
+; TUNIT-NEXT:    store i32 42, ptr [[B]], align 32
+; TUNIT-NEXT:    store ptr [[B]], ptr [[C]], align 64
+; TUNIT-NEXT:    call void (ptr, ptr, ptr, ...) @t0_callback_broker(ptr noundef align 4294967296 null, ptr noundef nonnull align 128 dereferenceable(4) [[PTR]], ptr noundef nonnull @t0_callback_callee, ptr nonnull align 256 [[A]], i64 undef, ptr noalias nocapture nofree noundef nonnull readonly align 64 dereferenceable(8) [[C]])
+; TUNIT-NEXT:    ret void
+;
+; CGSCC-LABEL: define {{[^@]+}}@t0_caller
+; CGSCC-SAME: (ptr nonnull align 256 [[A:%.*]]) {
+; CGSCC-NEXT:  entry:
+; CGSCC-NEXT:    [[B:%.*]] = alloca i32, align 32
+; CGSCC-NEXT:    [[C:%.*]] = alloca ptr, align 64
+; CGSCC-NEXT:    [[PTR:%.*]] = alloca i32, align 128
+; CGSCC-NEXT:    store i32 42, ptr [[B]], align 32
+; CGSCC-NEXT:    store ptr [[B]], ptr [[C]], align 64
+; CGSCC-NEXT:    call void (ptr, ptr, ptr, ...) @t0_callback_broker(ptr noundef align 4294967296 null, ptr noundef nonnull align 128 dereferenceable(4) [[PTR]], ptr noundef nonnull @t0_callback_callee, ptr nonnull align 256 [[A]], i64 noundef 99, ptr noalias nocapture nofree noundef nonnull readonly align 64 dereferenceable(8) [[C]])
+; CGSCC-NEXT:    ret void
+;
+entry:
+  %b = alloca i32, align 32
+  %c = alloca ptr, align 64
+  %ptr = alloca i32, align 128
+  store i32 42, ptr %b, align 4
+  store ptr %b, ptr %c, align 8
+  call void (ptr, ptr, ptr, ...) @t0_callback_broker(ptr null, ptr %ptr, ptr @t0_callback_callee, ptr %a, i64 99, ptr %c)
+  ret void
+}
+
+; Note that the first two arguments are provided by the callback_broker according to the callback in !1 below!
+; The others are annotated with alignment information, amongst others, or even replaced by the constants passed to the call.
+define internal void @t0_callback_callee(ptr %is_not_null, ptr %ptr, ptr %a, i64 %b, ptr %c) {
+;
+; TUNIT-LABEL: define {{[^@]+}}@t0_callback_callee
+; TUNIT-SAME: (ptr nocapture nofree noundef nonnull writeonly align 4 dereferenceable(4) [[IS_NOT_NULL:%.*]], ptr nocapture nofree noundef nonnull readonly align 8 dereferenceable(4) [[PTR:%.*]], ptr nonnull align 256 [[A:%.*]], i64 [[B:%.*]], ptr noalias nocapture nofree noundef nonnull readonly align 64 dereferenceable(8) [[C:%.*]]) {
+; TUNIT-NEXT:  entry:
+; TUNIT-NEXT:    [[PTR_VAL:%.*]] = load i32, ptr [[PTR]], align 8
+; TUNIT-NEXT:    store i32 [[PTR_VAL]], ptr [[IS_NOT_NULL]], align 4
+; TUNIT-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[C]], align 64
+; TUNIT-NEXT:    tail call void @t0_check(ptr nonnull align 256 [[A]], i64 noundef 99, ptr nonnull align 32 [[TMP0]])
+; TUNIT-NEXT:    ret void
+;
+; CGSCC-LABEL: define {{[^@]+}}@t0_callback_callee
+; CGSCC-SAME: (ptr nocapture nofree noundef nonnull writeonly align 4 dereferenceable(4) [[IS_NOT_NULL:%.*]], ptr nocapture nofree noundef nonnull readonly align 8 dereferenceable(4) [[PTR:%.*]], ptr nonnull align 256 [[A:%.*]], i64 [[B:%.*]], ptr noalias nocapture nofree noundef nonnull readonly align 64 dereferenceable(8) [[C:%.*]]) {
+; CGSCC-NEXT:  entry:
+; CGSCC-NEXT:    [[PTR_VAL:%.*]] = load i32, ptr [[PTR]], align 8
+; CGSCC-NEXT:    store i32 [[PTR_VAL]], ptr [[IS_NOT_NULL]], align 4
+; CGSCC-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[C]], align 64
+; CGSCC-NEXT:    tail call void @t0_check(ptr nonnull align 256 [[A]], i64 noundef 99, ptr nonnull [[TMP0]])
+; CGSCC-NEXT:    ret void
+;
+entry:
+  %ptr_val = load i32, ptr %ptr, align 8
+  store i32 %ptr_val, ptr %is_not_null
+  %0 = load ptr, ptr %c, align 8
+  tail call void @t0_check(ptr %a, i64 %b, ptr %0)
+  ret void
+}
+
+declare void @t0_check(ptr align 256, i64, ptr)
+
+declare !callback !0 void @t0_callback_broker(ptr, ptr, ptr, ...)
+
+; Test 1
+;
+; Similar to test 0 but with some additional annotations (noalias/nocapute) to make sure
+; we deduce and propagate noalias and others properly.
+
+define void @t1_caller(ptr noalias %a) {
+;
+; TUNIT-LABEL: define {{[^@]+}}@t1_caller
+; TUNIT-SAME: (ptr noalias nocapture nonnull align 256 [[A:%.*]]) {
+; TUNIT-NEXT:  entry:
+; TUNIT-NEXT:    [[B:%.*]] = alloca i32, align 32
+; TUNIT-NEXT:    [[C:%.*]] = alloca ptr, align 64
+; TUNIT-NEXT:    [[PTR:%.*]] = alloca i32, align 128
+; TUNIT-NEXT:    store i32 42, ptr [[B]], align 32
+; TUNIT-NEXT:    store ptr [[B]], ptr [[C]], align 64
+; TUNIT-NEXT:    call void (ptr, ptr, ptr, ...) @t1_callback_broker(ptr noundef align 4294967296 null, ptr noalias nocapture noundef nonnull align 128 dereferenceable(4) [[PTR]], ptr nocapture noundef nonnull @t1_callback_callee, ptr nocapture nonnull align 256 [[A]], i64 undef, ptr noalias nocapture nofree noundef nonnull readonly align 64 dereferenceable(8) [[C]])
+; TUNIT-NEXT:    ret void
+;
+; CGSCC-LABEL: define {{[^@]+}}@t1_caller
+; CGSCC-SAME: (ptr noalias nocapture nonnull align 256 [[A:%.*]]) {
+; CGSCC-NEXT:  entry:
+; CGSCC-NEXT:    [[B:%.*]] = alloca i32, align 32
+; CGSCC-NEXT:    [[C:%.*]] = alloca ptr, align 64
+; CGSCC-NEXT:    [[PTR:%.*]] = alloca i32, align 128
+; CGSCC-NEXT:    store i32 42, ptr [[B]], align 32
+; CGSCC-NEXT:    store ptr [[B]], ptr [[C]], align 64
+; CGSCC-NEXT:    call void (ptr, ptr, ptr, ...) @t1_callback_broker(ptr noundef align 4294967296 null, ptr noalias nocapture noundef nonnull align 128 dereferenceable(4) [[PTR]], ptr nocapture noundef nonnull @t1_callback_callee, ptr nocapture nonnull align 256 [[A]], i64 noundef 99, ptr noalias nocapture nofree noundef nonnull readonly align 64 dereferenceable(8) [[C]])
+; CGSCC-NEXT:    ret void
+;
+entry:
+  %b = alloca i32, align 32
+  %c = alloca ptr, align 64
+  %ptr = alloca i32, align 128
+  store i32 42, ptr %b, align 4
+  store ptr %b, ptr %c, align 8
+  call void (ptr, ptr, ptr, ...) @t1_callback_broker(ptr null, ptr %ptr, ptr @t1_callback_callee, ptr %a, i64 99, ptr %c)
+  ret void
+}
+
+; Note that the first two arguments are provided by the callback_broker according to the callback in !1 below!
+; The others are annotated with alignment information, amongst others, or even replaced by the constants passed to the call.
+define internal void @t1_callback_callee(ptr %is_not_null, ptr %ptr, ptr %a, i64 %b, ptr %c) {
+;
+; TUNIT: Function Attrs: nosync
+; TUNIT-LABEL: define {{[^@]+}}@t1_callback_callee
+; TUNIT-SAME: (ptr nocapture nofree noundef nonnull writeonly align 4 dereferenceable(4) [[IS_NOT_NULL:%.*]], ptr nocapture nofree noundef nonnull readonly align 8 dereferenceable(4) [[PTR:%.*]], ptr nocapture nonnull align 256 [[A:%.*]], i64 [[B:%.*]], ptr noalias nocapture nofree noundef nonnull readonly align 64 dereferenceable(8) [[C:%.*]]) #[[ATTR0:[0-9]+]] {
+; TUNIT-NEXT:  entry:
+; TUNIT-NEXT:    [[PTR_VAL:%.*]] = load i32, ptr [[PTR]], align 8
+; TUNIT-NEXT:    store i32 [[PTR_VAL]], ptr [[IS_NOT_NULL]], align 4
+; TUNIT-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[C]], align 64
+; TUNIT-NEXT:    tail call void @t1_check(ptr nocapture nonnull align 256 [[A]], i64 noundef 99, ptr nocapture nonnull align 32 [[TMP0]])
+; TUNIT-NEXT:    ret void
+;
+; CGSCC: Function Attrs: nosync
+; CGSCC-LABEL: define {{[^@]+}}@t1_callback_callee
+; CGSCC-SAME: (ptr nocapture nofree noundef nonnull writeonly align 4 dereferenceable(4) [[IS_NOT_NULL:%.*]], ptr nocapture nofree noundef nonnull readonly align 8 dereferenceable(4) [[PTR:%.*]], ptr nocapture nonnull align 256 [[A:%.*]], i64 [[B:%.*]], ptr noalias nocapture nofree noundef nonnull readonly align 64 dereferenceable(8) [[C:%.*]]) #[[ATTR0:[0-9]+]] {
+; CGSCC-NEXT:  entry:
+; CGSCC-NEXT:    [[PTR_VAL:%.*]] = load i32, ptr [[PTR]], align 8
+; CGSCC-NEXT:    store i32 [[PTR_VAL]], ptr [[IS_NOT_NULL]], align 4
+; CGSCC-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[C]], align 64
+; CGSCC-NEXT:    tail call void @t1_check(ptr nocapture nonnull align 256 [[A]], i64 noundef 99, ptr nocapture nonnull [[TMP0]])
+; CGSCC-NEXT:    ret void
+;
+entry:
+  %ptr_val = load i32, ptr %ptr, align 8
+  store i32 %ptr_val, ptr %is_not_null
+  %0 = load ptr, ptr %c, align 8
+  tail call void @t1_check(ptr %a, i64 %b, ptr %0)
+  ret void
+}
+
+declare void @t1_check(ptr nocapture align 256, i64, ptr nocapture) nosync
+
+declare !callback !0 void @t1_callback_broker(ptr nocapture , ptr nocapture , ptr nocapture, ...)
+
+; Test 2
+;
+; Similar to test 1 but checking that the noalias is only placed if potential synchronization through @t2_check is preserved.
+
+define void @t2_caller(ptr noalias %a) {
+; TUNIT-LABEL: define {{[^@]+}}@t2_caller
+; TUNIT-SAME: (ptr noalias nocapture nonnull align 256 [[A:%.*]]) {
+; TUNIT-NEXT:  entry:
+; TUNIT-NEXT:    [[B:%.*]] = alloca i32, align 32
+; TUNIT-NEXT:    [[C:%.*]] = alloca ptr, align 64
+; TUNIT-NEXT:    [[PTR:%.*]] = alloca i32, align 128
+; TUNIT-NEXT:    store i32 42, ptr [[B]], align 32
+; TUNIT-NEXT:    store ptr [[B]], ptr [[C]], align 64
+; TUNIT-NEXT:    call void (ptr, ptr, ptr, ...) @t2_callback_broker(ptr ...
[truncated]

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what's going on in the tests. These seem to just be a bunch of existing tests, copied with a datalayout applied

Comment on lines 3105 to 3107
if (DL.getSentinelPointerValue(AddrSpace) == 0)
return false;
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fold to return of boolean expression

@@ -3098,6 +3098,15 @@ bool isKnownNonZero(const Value *V, const APInt &DemandedElts,
// Check for pointer simplifications.

if (PointerType *PtrTy = dyn_cast<PointerType>(Ty)) {
const DataLayout &DL = Q.DL;
if (DL.isSentinelValueDefined() && PtrTy) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PtrTy check is redundant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fold to return of boolean expression
PtrTy check is redundant

removed PtrTy check and updated fold to return.

@@ -0,0 +1,56 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals
target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-z0:1-z2:neg1-z3:neg1-z5:neg1-S32-A5-G1-ni:7:8:9"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can reduce this to just the relevant piece of the datalayout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can reduce this to just the relevant piece of the datalayout

updated with and without datalayout with relevant string

; Gracefully handle the alloca that is not in the alloca AS (=5)

target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-z0:1-z2:neg1-z3:neg1-z5:neg1-S32-A5-G1-ni:7:8:9"
target triple = "amdgcn-amd-amdhsa"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the triples

;
.master:
%0 = alloca i8, i64 8, align 1
store ptr undef, ptr %0, align 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using undef and use named values

; CHECK-NEXT: ret void
;
bb:
%alloca = alloca [30 x %struct.widget], i32 0, align 16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix alloca of 0 size. Also all of these should use the alloca address space

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the triples
Avoid using undef and use named values
Drop CC and weak
Fix alloca of 0 size. Also all of these should use the alloca address space

removed CC, weak and triples. updated alloca address space and used named values.

; RUN: opt -aa-pipeline=basic-aa -passes=attributor-cgscc -attributor-manifest-internal -attributor-annotate-decl-cs -S < %s | FileCheck %s --check-prefixes=CHECK,CGSCC
;
; FIXME: The GEP + BC + GEP solution we create is not great but correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this test is supposed to be showing. None of the address spaces are non-0

@@ -0,0 +1,1421 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals
; RUN: opt -aa-pipeline=basic-aa -passes=attributor -attributor-manifest-internal -attributor-annotate-decl-cs -S < %s | FileCheck %s --check-prefixes=CHECK,TUNIT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, all these attributor tests seem to just be copies of existing tests with the datalayout applied. This needs to be more targeted (and ideally would have 2 identical copies of functions in the same test file, with different property address spaces)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, all these attributor tests seem to just be copies of existing tests with the datalayout applied. This needs to be more targeted (and ideally would have 2 identical copies of functions in the same test file, with different property address spaces)

Updated tests with and without datalayout string changes. added datalayout string to opt run cmd.


; FIXME: The interpretation of nonnull assumes a 0 pointer value, so
; this really is an incorrect fold.
define i1 @nonnull_may_be_zero(ptr addrspace(5) nonnull %ptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have copies of identical functions except for the address space

target datalayout = "A5-z0:1-z2:neg1-z3:neg1-z5:neg1"

; A 0 valued byval pointer may be valid
define i1 @byval_may_be_zero(ptr addrspace(5) byval(i32) %ptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also test a plain, non-byval pointer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have copies of identical functions except for the address space
Also test a plain, non-byval pointer

added byval, plain, and non-byval pointer identical functions with different address space.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right. Pointers can have contain an all-zero value no matter what the sentinel value is.

I think you want to modify NullPointerIsDefined?

@arsenm
Copy link
Contributor

arsenm commented May 13, 2024

This doesn't look right. Pointers can have contain an all-zero value no matter what the sentinel value is.

I think you want to modify NullPointerIsDefined?

There are a few pieces I think need to change. I think we need a variant of isKnownNonZero that specifically checks for known-non-null pointers. Currently some of the uses of this are using as a null check.

Additionally we need to eventually do something about ConstantPointerNull, where you should be able to assume the bitvalue is what the datalayout says

@efriedma-quic
Copy link
Collaborator

I think I'd rather just leave ConstantPointerNull as meaning the all-zero bit pattern. Some places actually do care that it's all-zero, and there's not much benefit to changing it. But better to discuss that on an RFC, probably.

Copy link

github-actions bot commented May 17, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@ranapratap55 ranapratap55 force-pushed the rnimmaka/sentinelValue branch from 24dbbe4 to 6425e33 Compare May 17, 2024 09:34
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointers can have contain an all-zero value no matter what the sentinel value is.

@ranapratap55
Copy link
Contributor Author

I think I'd rather just leave ConstantPointerNull as meaning the all-zero bit pattern. Some places actually do care that it's all-zero, and there's not much benefit to changing it. But better to discuss that on an RFC, probably.

sure, will look into it and will start an RFC with more details.

@shiltian
Copy link
Contributor

Hi @ranapratap55 , I wonder if there is any progress on this. I'm interested in the idea of DL nullptr, which is very helpful For our use.

@ranapratap55
Copy link
Contributor Author

This PR adds a new field as sentinel value in the DL string for all unlisted addrspaces. I am looking into ways to use this in the IR optimizations. There are few APIs(knownbits, knownnull, ConstantPointerNull)which needs to be updated to use this change.

@arsenm
Copy link
Contributor

arsenm commented Feb 13, 2025

ConstantPointerNull which needs to be updated to use this change.

I think we need to introduce a new constant type, not start reinterpreting ConstantPointerNull. There will still be use for 0 initialized pointer values, and practically speaking we'll never get any migration done if we try to change the interpretation up front

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants