-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[Clang] Add __builtin_assume_dereferenceable to encode deref assumption. #121789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-clang-codegen Author: Florian Hahn (fhahn) ChangesThis patch adds a new __builtin_assume_dereferenceable to encode dereferenceability of a pointer using llvm.assume with an operand bundle. For now the builtin only accepts constant sizes, I am planning to drop this restriction in a follow-up change. This can be used to better optimize cases where a pointer is known to be dereferenceable, e.g. unconditionally loading from p2 when vectorizing the loop.
Full diff: https://github.com/llvm/llvm-project/pull/121789.diff 7 Files Affected:
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index e020710c7aa4f5..1a96647ae765e2 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2761,6 +2761,41 @@ etc.).
Query for this feature with ``__has_builtin(__builtin_assume_separate_storage)``.
+``__builtin_assume_dereferenceable``
+-------------------------------------
+
+``__builtin_assume_derefernceable`` is used to provide the optimizer with the
+knowledge that the pointer argument P is dereferenceable up to the specified
+number of bytes.
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_assume_dereferenceable(const void *, size_t)
+
+**Example of Use**:
+
+.. code-block:: c++
+
+ int foo(int *x, int y) {
+ __builtin_assume_dereferenceable(x, 4);
+ int z = 0;
+ if (y == 1) {
+ // The optimizer may execute the load of x unconditionally.
+ z = *x;
+ }
+ return z;
+ }
+
+**Description**:
+
+The arguments to this function provide a start pointer ``P`` and a size ``S``.
+``P`` must be non-null and ``S`` at least 1. The optimizer may assume that
+``S`` bytes are dereferenceable starting at ``P``.
+
+Query for this feature with ``__has_builtin(__builtin_assume_dereferenceable)``.
+
``__builtin_offsetof``
----------------------
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 468c16050e2bf0..e9d8c6f621afa9 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -839,6 +839,12 @@ def BuiltinAssumeAligned : Builtin {
let Prototype = "void*(void const*, size_t, ...)";
}
+def BuiltinAssumeDereferenceable : Builtin {
+ let Spellings = ["__builtin_assume_dereferenceable"];
+ let Attributes = [NoThrow, Const, Constexpr];
+ let Prototype = "void(void const*, _Constant size_t)";
+}
+
def BuiltinFree : Builtin {
let Spellings = ["__builtin_free"];
let Attributes = [FunctionWithBuiltinPrefix, NoThrow];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index c419fb0cc055e0..0743d4d5eef49e 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3726,6 +3726,16 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
AlignmentCI, OffsetValue);
return RValue::get(PtrValue);
}
+ case Builtin::BI__builtin_assume_dereferenceable: {
+ const Expr *Ptr = E->getArg(0);
+ Value *PtrValue = EmitScalarExpr(Ptr);
+ Value *SizeValue = EmitScalarExpr(E->getArg(1));
+ if (SizeValue->getType() != IntPtrTy)
+ SizeValue =
+ Builder.CreateIntCast(SizeValue, IntPtrTy, false, "casted.size");
+ Builder.CreateDereferenceableAssumption(PtrValue, SizeValue);
+ return RValue::get(nullptr);
+ }
case Builtin::BI__assume:
case Builtin::BI__builtin_assume: {
if (E->getArg(0)->HasSideEffects(getContext()))
diff --git a/clang/test/CodeGen/builtin-assume-dereferenceable.c b/clang/test/CodeGen/builtin-assume-dereferenceable.c
new file mode 100644
index 00000000000000..cadffd4a84c264
--- /dev/null
+++ b/clang/test/CodeGen/builtin-assume-dereferenceable.c
@@ -0,0 +1,34 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm -o - %s | FileCheck %s
+
+// CHECK-LABEL: @test1(
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[A_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: store ptr [[A:%.*]], ptr [[A_ADDR]], align 8
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[A_ADDR]], align 8
+// CHECK-NEXT: call void @llvm.assume(i1 true) [ "dereferenceable"(ptr [[TMP0]], i64 10) ]
+// CHECK-NEXT: [[TMP1:%.*]] = load ptr, ptr [[A_ADDR]], align 8
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i64 0
+// CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
+// CHECK-NEXT: ret i32 [[TMP2]]
+//
+int test1(int *a) {
+ __builtin_assume_dereferenceable(a, 10);
+ return a[0];
+}
+
+// CHECK-LABEL: @test2(
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[A_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: store ptr [[A:%.*]], ptr [[A_ADDR]], align 8
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[A_ADDR]], align 8
+// CHECK-NEXT: call void @llvm.assume(i1 true) [ "dereferenceable"(ptr [[TMP0]], i64 32) ]
+// CHECK-NEXT: [[TMP1:%.*]] = load ptr, ptr [[A_ADDR]], align 8
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i64 0
+// CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
+// CHECK-NEXT: ret i32 [[TMP2]]
+//
+int test2(int *a) {
+ __builtin_assume_dereferenceable(a, 32ull);
+ return a[0];
+}
diff --git a/clang/test/Sema/builtin-assume-dereferenceable.c b/clang/test/Sema/builtin-assume-dereferenceable.c
new file mode 100644
index 00000000000000..07b23022043cd7
--- /dev/null
+++ b/clang/test/Sema/builtin-assume-dereferenceable.c
@@ -0,0 +1,41 @@
+// RUN: %clang_cc1 -DSIZE_T_64 -fsyntax-only -Wno-strict-prototypes -triple x86_64-linux -verify %s
+
+int test1(int *a) {
+ __builtin_assume_dereferenceable(a, 32);
+ return a[0];
+}
+
+int test2(int *a) {
+ __builtin_assume_dereferenceable(a, 32ull);
+ return a[0];
+}
+
+int test3(int *a) {
+ __builtin_assume_dereferenceable(a, 32u);
+ return a[0];
+}
+
+int test4(int *a, unsigned size) {
+ a = __builtin_assume_dereferenceable(a, size); // expected-error {{argument to '__builtin_assume_dereferenceable' must be a constant integer}}
+ return a[0];
+}
+
+int test5(int *a, unsigned long long size) {
+ a = __builtin_assume_dereferenceable(a, size); // expected-error {{argument to '__builtin_assume_dereferenceable' must be a constant integer}}
+ return a[0];
+}
+
+int test6(float a) {
+ __builtin_assume_dereferenceable(a, 2); // expected-error {{passing 'float' to parameter of incompatible type 'const void *'}}
+ return 0;;
+}
+
+int test7(int *a) {
+ __builtin_assume_dereferenceable(a, 32, 1); // expected-error {{too many arguments to function call, expected 2, have 3}}
+ return a[0];
+}
+
+int test8(int *a) {
+ __builtin_assume_dereferenceable(a); // expected-error {{too few arguments to function call, expected 2, have 1}}
+ return a[0];
+}
diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
index b73309175f20d1..4da303d28e3d7b 100644
--- a/llvm/include/llvm/IR/IRBuilder.h
+++ b/llvm/include/llvm/IR/IRBuilder.h
@@ -2678,6 +2678,11 @@ class IRBuilderBase {
CallInst *CreateAlignmentAssumption(const DataLayout &DL, Value *PtrValue,
Value *Alignment,
Value *OffsetValue = nullptr);
+
+ /// Create an assume intrinsic call that represents an dereferencable
+ /// assumption on the provided pointer.
+ ///
+ CallInst *CreateDereferenceableAssumption(Value *PtrValue, Value *SizeValue);
};
/// This provides a uniform API for creating instructions and inserting
diff --git a/llvm/lib/IR/IRBuilder.cpp b/llvm/lib/IR/IRBuilder.cpp
index 27b499e42a4e4c..87636a33ee044b 100644
--- a/llvm/lib/IR/IRBuilder.cpp
+++ b/llvm/lib/IR/IRBuilder.cpp
@@ -1274,6 +1274,16 @@ CallInst *IRBuilderBase::CreateAlignmentAssumption(const DataLayout &DL,
return CreateAlignmentAssumptionHelper(DL, PtrValue, Alignment, OffsetValue);
}
+CallInst *IRBuilderBase::CreateDereferenceableAssumption(Value *PtrValue,
+ Value *SizeValue) {
+ assert(isa<PointerType>(PtrValue->getType()) &&
+ "trying to create an deferenceable assumption on a non-pointer?");
+ SmallVector<Value *, 4> Vals({PtrValue, SizeValue});
+ OperandBundleDefT<Value *> DereferenceableOpB("dereferenceable", Vals);
+ return CreateAssumption(ConstantInt::getTrue(getContext()),
+ {DereferenceableOpB});
+}
+
IRBuilderDefaultInserter::~IRBuilderDefaultInserter() = default;
IRBuilderCallbackInserter::~IRBuilderCallbackInserter() = default;
IRBuilderFolder::~IRBuilderFolder() = default;
|
clang/docs/LanguageExtensions.rst
Outdated
**Description**: | ||
|
||
The arguments to this function provide a start pointer ``P`` and a size ``S``. | ||
``P`` must be non-null and ``S`` at least 1. The optimizer may assume that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the 'must be non-null' create an optimization opportunity here too? I guess that is kind of what 'dereferenceable' means, but should we better point out that:
int foo(int * x) {
__builtin_assume_dereferenceable(x, 1);
if (x) {
return 1;
}
return 0;
}
Can be optimized to just: return 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In IR, dereferenceable and nonnull are separate, if the builtin assumes both, make 2 operand bundles and the IR passes should pick it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In IR, dereferenceable and nonnull are separate, if the builtin assumes both, make 2 operand bundles and the IR passes should pick it up.
Is it? What is the difference with dereferenceable_or_null
then? It seems to me that dereferenceable
implies nonnull
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dereferenceable implies nonnull if the null pointer is not dereferenceable (duh).
Unless -fno-delete-null-pointer-checks
is used, the null pointer is known to not be dereferenceable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @erichkeane's example should be simplify-able due to the metadata.
I think it should be safe to add nonnull as well, done in the latest update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dereferenceable implies nonnull if the null pointer is not dereferenceable (duh).
Unless
-fno-delete-null-pointer-checks
is used, the null pointer is known to not be dereferenceable.
It depends on the address space/target, e.g., AS(3) on AMD GPUs has a dereferenceable NULL.
You need to use the helper that takes the function and the AS tells you if NULL is dereferenceable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone back and adjusted the code to not add NonNull and updated the docs here to not specify that the pointer must be non-null. IIUC LLVM's dereferenceable
should do the right thing already automatically.
@@ -3726,6 +3726,16 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, | |||
AlignmentCI, OffsetValue); | |||
return RValue::get(PtrValue); | |||
} | |||
case Builtin::BI__builtin_assume_dereferenceable: { | |||
const Expr *Ptr = E->getArg(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason this is a separate variable but the Size one doesn't do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulled Size out to variable as well, thanks!
@@ -839,6 +839,12 @@ def BuiltinAssumeAligned : Builtin { | |||
let Prototype = "void*(void const*, size_t, ...)"; | |||
} | |||
|
|||
def BuiltinAssumeDereferenceable : Builtin { | |||
let Spellings = ["__builtin_assume_dereferenceable"]; | |||
let Attributes = [NoThrow, Const, Constexpr]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This says Constexpr
, but there's no implementation for that that, and of course no tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped this for now and added a Sema test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be really nice to make this constexpr
, since we'd have to jump through hoops otherwise to use this in e.g. string
(yes, I already have ideas).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constexpr implementation should be pretty easy, since this is effectively a no-op, right? Though perhaps you still have to evaluate the arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make sure that the claimed range is actually dereferenceable, just like we check in __builtin_assume_aligned
that that pointer is actually guaranteed to be aligned properly. I don't know how complicated that would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT about adding constexpr
support separately? One use case I am interested in is using this with some libc++ data types like std::vector
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how you want to use it with vector
, but I'd be fine with making constexpr
support a follow-up patch. Given that vector
is constexpr
I'd really like to see the support implemented before adding it to libc++ though, since it'd require quite a bit of working around non-constexpr support otherwise.
9facf13
to
0711638
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 more question on the semantics + a typo, but the code LGTM.
clang/docs/LanguageExtensions.rst
Outdated
__builtin_assume_dereferenceable(x, 4); | ||
int z = 0; | ||
if (y == 1) { | ||
// The optimizer may execute the load of x unconditionally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better description as to WHY it can do so unconditionally would be nice. I originally assumed it was making some sort of assumption about the value of 'y' and was unclear why that would be the case.
Here's a side note: Does this builtin ALSO imply the opposite? A __builtin_assume_dereferenceable
states that the 1st 4 bytes are dereferenceable (so only index 0?), but does that mean it is NOT derferenceable
beyond that? For example, do we NOW know that:
x[3]
is UB? I should hope not...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better description as to WHY it can do so unconditionally would be nice. I originally assumed it was making some sort of assumption about the value of 'y' and was unclear why that would be the case.
Added, hope it makes things clearer.
x[3] is UB? I should hope not...
The attribute doesn't imply that anything beyond the specified size is not dereferenceable, I updated the docs to hopefully make this clearer, thanks
clang/docs/LanguageExtensions.rst
Outdated
``__builtin_assume_dereferenceable`` | ||
------------------------------------- | ||
|
||
``__builtin_assume_derefernceable`` is used to provide the optimizer with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``__builtin_assume_derefernceable`` is used to provide the optimizer with the | |
``__builtin_assume_dereferenceable`` is used to provide the optimizer with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed thanks!
Thanks, this is awesome! A non-constant version would be even more so :) |
Actually, how does this interact with ASan? e.g. would "I know there is enough storage to deref, but it may be poisoned by container annotations" be a valid use-case for this, or would we have to not annotate pointers in that case? |
// CHECK-NEXT: store ptr [[A:%.*]], ptr [[A_ADDR]], align 8 | ||
// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[A_ADDR]], align 8 | ||
// CHECK-NEXT: call void @llvm.assume(i1 true) [ "dereferenceable"(ptr [[TMP0]], i64 32) ] | ||
// CHECK-NEXT: call void @llvm.assume(i1 true) [ "nonnull"(ptr [[TMP0]]) ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why two assumes and not one? I think later we'd merge them, maybe avoid the hassle.
Also, as mentioned in the other comment, we can't just do nonnull, we probably want a test for the non-nonnull case:
https://godbolt.org/z/1f3r8K9Pn
Note also the nonnull deduction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to not emit the NonNull assumption again, dereferenceable
should already d the right thing for addrspace 0 pointers and if nullptr is dereferenceable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, we usually derive nonnull from deref if we can.
148add9
to
ae9dd3b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG, one suggestion for the docs.
clang/docs/LanguageExtensions.rst
Outdated
------------------------------------- | ||
|
||
``__builtin_assume_dereferenceable`` is used to provide the optimizer with the | ||
knowledge that the pointer argument P is dereferenceable up to the specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
knowledge that the pointer argument P is dereferenceable up to the specified | |
knowledge that the pointer argument P is dereferenceable at least up to the specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That has been adjusted a while ago, thanks!
I'm concerned about the soundness of our current implementation for these assumptions, see my comment at #120962 (comment). Not sure we should be exposing the current implementation from clang. |
ae9dd3b
to
141496b
Compare
Let's wait here and move the discussion into the other issue I guess. I replied there too. |
#126117) Update LangRef and code using `Dereferenceable` in assume bundles to only use the information if it is safe at the point of use. `Dereferenceable` in an assume bundle is only guaranteed at the point of the assumption, but may not be guaranteed at later points, because the pointer may have been freed. Update code using `Dereferenceable` to only use it if the pointer cannot be freed. This can further be refined to check if the pointer could be freed between assume and use. This follows up on #123196. With that change, it should be safe to expose dereferenceable assumptions more widely as in #121789 PR: #126117
This patch adds a new __builtin_assume_dereferenceable to encode dereferenceability of a pointer using llvm.assume with an operand bundle. For now the builtin only accepts constant sizes, I am planning to drop this restriction in a follow-up change. This can be used to better optimize cases where a pointer is known to be dereferenceable, e.g. unconditionally loading from p2 when vectorizing the loop. int *get_ptr(); void foo(int* src, int x) { int *p2 = get_ptr(); __builtin_assume_aligned(p2, 4); __builtin_assume_dereferenceable(p2, 4000); for (unsigned I = 0; I != 1000; ++I) { int x = src[I]; if (x == 0) x = p2[I]; src[I] = x; } }
141496b
to
2835b3c
Compare
#126117 has landed now which adjusted the semantics of dereferenceable assume bundles and I think it should be safe to go ahead with exposing this in Clang. Unless there are any further concerns I am planning on submitting this soon :) |
Yeah, this is good to go now. |
…s at assume. (#126117) Update LangRef and code using `Dereferenceable` in assume bundles to only use the information if it is safe at the point of use. `Dereferenceable` in an assume bundle is only guaranteed at the point of the assumption, but may not be guaranteed at later points, because the pointer may have been freed. Update code using `Dereferenceable` to only use it if the pointer cannot be freed. This can further be refined to check if the pointer could be freed between assume and use. This follows up on llvm/llvm-project#123196. With that change, it should be safe to expose dereferenceable assumptions more widely as in llvm/llvm-project#121789 PR: llvm/llvm-project#126117
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/11474 Here is the relevant piece of the build log for the reference
|
…ef assumption. (#121789) This patch adds a new __builtin_assume_dereferenceable to encode dereferenceability of a pointer using llvm.assume with an operand bundle. For now the builtin only accepts constant sizes, I am planning to drop this restriction in a follow-up change. This can be used to better optimize cases where a pointer is known to be dereferenceable, e.g. unconditionally loading from p2 when vectorizing the loop. int *get_ptr(); void foo(int* src, int x) { int *p2 = get_ptr(); __builtin_assume_aligned(p2, 4); __builtin_assume_dereferenceable(p2, 4000); for (unsigned I = 0; I != 1000; ++I) { int x = src[I]; if (x == 0) x = p2[I]; src[I] = x; } } PR: llvm/llvm-project#121789
llvm#126117) Update LangRef and code using `Dereferenceable` in assume bundles to only use the information if it is safe at the point of use. `Dereferenceable` in an assume bundle is only guaranteed at the point of the assumption, but may not be guaranteed at later points, because the pointer may have been freed. Update code using `Dereferenceable` to only use it if the pointer cannot be freed. This can further be refined to check if the pointer could be freed between assume and use. This follows up on llvm#123196. With that change, it should be safe to expose dereferenceable assumptions more widely as in llvm#121789 PR: llvm#126117
…on. (llvm#121789) This patch adds a new __builtin_assume_dereferenceable to encode dereferenceability of a pointer using llvm.assume with an operand bundle. For now the builtin only accepts constant sizes, I am planning to drop this restriction in a follow-up change. This can be used to better optimize cases where a pointer is known to be dereferenceable, e.g. unconditionally loading from p2 when vectorizing the loop. int *get_ptr(); void foo(int* src, int x) { int *p2 = get_ptr(); __builtin_assume_aligned(p2, 4); __builtin_assume_dereferenceable(p2, 4000); for (unsigned I = 0; I != 1000; ++I) { int x = src[I]; if (x == 0) x = p2[I]; src[I] = x; } } PR: llvm#121789
llvm#126117) Update LangRef and code using `Dereferenceable` in assume bundles to only use the information if it is safe at the point of use. `Dereferenceable` in an assume bundle is only guaranteed at the point of the assumption, but may not be guaranteed at later points, because the pointer may have been freed. Update code using `Dereferenceable` to only use it if the pointer cannot be freed. This can further be refined to check if the pointer could be freed between assume and use. This follows up on llvm#123196. With that change, it should be safe to expose dereferenceable assumptions more widely as in llvm#121789 PR: llvm#126117
…on. (llvm#121789) This patch adds a new __builtin_assume_dereferenceable to encode dereferenceability of a pointer using llvm.assume with an operand bundle. For now the builtin only accepts constant sizes, I am planning to drop this restriction in a follow-up change. This can be used to better optimize cases where a pointer is known to be dereferenceable, e.g. unconditionally loading from p2 when vectorizing the loop. int *get_ptr(); void foo(int* src, int x) { int *p2 = get_ptr(); __builtin_assume_aligned(p2, 4); __builtin_assume_dereferenceable(p2, 4000); for (unsigned I = 0; I != 1000; ++I) { int x = src[I]; if (x == 0) x = p2[I]; src[I] = x; } } PR: llvm#121789
This patch adds a new __builtin_assume_dereferenceable to encode dereferenceability of a pointer using llvm.assume with an operand bundle.
For now the builtin only accepts constant sizes, I am planning to drop this restriction in a follow-up change.
This can be used to better optimize cases where a pointer is known to be dereferenceable, e.g. unconditionally loading from p2 when vectorizing the loop.