[Intrinsics] Add @llvm.dereferenceable intrinsic. #120755
This patch adds a @llvm.dereferenceable intrinsic that can be used to mark pointers as dereferenceable at the point the intrinsic is called. The semantics match the meaning of the dereferenceable function argument attribute.

The goal of the intrinsic is to preserve dereferenceability information after moving memory instructions. This allows us to vectorize cases such as https://clang.godbolt.org/z/Y1bedbhs3, where we currently fail due to instcombine sinking the load into then/else blocks.

Alternatively, we could use an assume bundle with a dereferenceable attribute. But as a follow-up I would like to allow non-immediate size arguments, and this may not mesh well with the attribute.
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-aarch64
Author: Florian Hahn (fhahn)
Full diff: https://github.com/llvm/llvm-project/pull/120755.diff
5 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 7e01331b20c570..ef45995339a0dc 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -29153,6 +29153,45 @@ attach various forms of information to operands that dominate specific
uses. It is not meant for general use, only for building temporary
renaming forms that require value splits at certain points.
+.. _int_dereferenceable:
+
+'``llvm.dereferenceable``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+ declare ptr @llvm.dereferenceable(ptr %p, i64 <size>)
+
+Overview:
+"""""""""
+
+The ``llvm.dereferenceable`` intrinsic allows the optimizer to assume that the
+returned pointer is dereferenceable at the point the intrinsic is called.
+
+Arguments:
+""""""""""
+
+The arguments of the call are the pointer which will be marked as
+dereferenceable and the number of bytes known to be dereferenceable. ``<size>``
+must be a constant.
+
+Semantics:
+""""""""""
+
+The intrinsic returns the input pointer. The returned pointer is dereferenceable
+at the point the intrinsic is called. A pointer that is dereferenceable can be
+loaded from speculatively without a risk of trapping. This implies that the
+input pointer is not null and is neither undef nor poison. The number of bytes
+known to be dereferenceable is provided as the second argument. It is legal for
+the number of bytes to be less than the size of the pointee type.
+
+The semantics above match the semantics of the ``dereferenceable(<n>)``
+parameter attribute.
+
+
.. _type.test:
'``llvm.type.test``' Intrinsic
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index ee877349a33149..cdf2fb0f74529e 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -954,6 +954,10 @@ def int_call_preallocated_teardown : DefaultAttrsIntrinsic<[], [llvm_token_ty]>;
def int_callbr_landingpad : Intrinsic<[llvm_any_ty], [LLVMMatchType<0>],
[IntrNoMerge]>;
+// Attach dereferenceability information to a pointer.
+def int_dereferenceable: DefaultAttrsIntrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_i64_ty],
+ [IntrInaccessibleMemOnly, ImmArg<ArgIndex<1>>]>;
+
//===------------------- Standard C Library Intrinsics --------------------===//
//
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index f8d7c3ef7bbe71..da975c9a2a61a6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -8293,6 +8293,9 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
visitVectorExtractLastActive(I, Intrinsic);
return;
}
+ case Intrinsic::dereferenceable:
+ setValue(&I, getValue(I.getArgOperand(0)));
+ return;
}
}
diff --git a/llvm/test/CodeGen/AArch64/dereferenceable-intrinsics.ll b/llvm/test/CodeGen/AArch64/dereferenceable-intrinsics.ll
new file mode 100644
index 00000000000000..293cd31e3969e9
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/dereferenceable-intrinsics.ll
@@ -0,0 +1,14 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=arm64-apple-macosx -o - %s | FileCheck %s
+
+declare ptr @llvm.dereferenceable(ptr, i64 immarg)
+
+define i64 @foo(ptr %a) {
+; CHECK-LABEL: foo:
+; CHECK: ; %bb.0:
+; CHECK-NEXT: ldr x0, [x0]
+; CHECK-NEXT: ret
+ %d = call ptr @llvm.dereferenceable(ptr %a, i64 4)
+ %l = load i64, ptr %d
+ ret i64 %l
+}
diff --git a/llvm/test/Verifier/dereferenceable-intrinsics.ll b/llvm/test/Verifier/dereferenceable-intrinsics.ll
new file mode 100644
index 00000000000000..92948daf2c5a47
--- /dev/null
+++ b/llvm/test/Verifier/dereferenceable-intrinsics.ll
@@ -0,0 +1,11 @@
+; RUN: not llvm-as < %s -o /dev/null 2>&1 | FileCheck %s
+
+declare ptr @llvm.dereferenceable(ptr, i64 immarg)
+
+define void @transpose(ptr %p, i64 %x) {
+; CHECK: immarg operand has non-immediate parameter
+ %d.0 = call ptr @llvm.dereferenceable(ptr %p, i64 4)
+ %d.1 = call ptr @llvm.dereferenceable(ptr %p, i64 %x)
+ ret void
+}
+
Alternatively we could use an assume bundle with a dereferenceable attribute. But as a follow-up I would like to allow non-immediate size arguments, and this may not mesh well with the attribute.
I don't understand this bit. Using non-immediate arguments for operand bundle assumes is generally okay.
I don't think we should introduce a new intrinsic if existing assume infrastructure covers it.
I assumed there might be an issue with customizing the logic to handle the attribute with immediate only in the function arg context and variable arguments in the assume context. But if that won't be an issue I'll do it that way, assuming the general idea seems good.
Ah sorry, another advantage of using an intrinsic is that it is easier to find a dominating
Variable arguments should already work (as in not break anything -- just be unused). We explicitly allow this in LangRef:
Not sure what mechanism you have in mind here for llvm.dereferenceable. The assumes will be part of the AssumptionCache. isDereferenceableAndAlignedPointer() already supports operand bundle assumes for dereferenceable, but I think it currently has a weird implementation quirk where you also need to have an aligned bundle on the same assume.
Which semantics? In-tree code assumes two different ones; see the old globally_dereferancably discussion/patch. I assume this is dereferenceable "here", not "globally", right?
Yep, at the assume call; but I think that should match the assume bundle? I put up #121789 to expose the functionality via a builtin in clang, for now just with constant sizes.
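For readers following the thread, here is a minimal sketch of the assume-bundle alternative discussed above, applied to the same shape as `foo` in the AArch64 test. The function name `@foo_assume` and the sizes are illustrative assumptions, not part of the patch. The `"align"` bundle is included only because of the quirk mentioned above, and whether a given LLVM version actually uses the bundle this way is not verified here. A constant size is used to stay on the conservative side, even though the thread suggests non-immediate sizes are acceptable.

```llvm
; Hypothetical example, not taken from the patch.
declare void @llvm.assume(i1)

define i64 @foo_assume(ptr %a) {
  ; State that %a is dereferenceable for 8 bytes (and 8-byte aligned) at this
  ; point, using operand bundles on llvm.assume instead of a new intrinsic.
  call void @llvm.assume(i1 true) [ "dereferenceable"(ptr %a, i64 8), "align"(ptr %a, i64 8) ]
  %l = load i64, ptr %a
  ret i64 %l
}
```

Unlike the proposed intrinsic, the assume does not return a new pointer; the information is attached to `%a` at the location of the call, so later uses keep referring to the original value.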