-
Notifications
You must be signed in to change notification settings - Fork 14.3k
clang/OpenCL: Fix assertion on call to function with addrspace argument #115093
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clang/OpenCL: Fix assertion on call to function with addrspace argument #115093
Conversation
Don't know how anything was working before. There must have been a recent regression, but I haven't looked yet.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-codegen Author: Matt Arsenault (arsenm) ChangesDon't know how anything was working before. There must have been a recent regression, Full diff: https://github.com/llvm/llvm-project/pull/115093.diff 2 Files Affected:
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8f4f5d3ed81601..b6d0715cb3fde5 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5394,7 +5394,8 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
// can happen due to trivial type mismatches.
if (FirstIRArg < IRFuncTy->getNumParams() &&
V->getType() != IRFuncTy->getParamType(FirstIRArg))
- V = Builder.CreateBitCast(V, IRFuncTy->getParamType(FirstIRArg));
+ V = Builder.CreateAddrSpaceCast(V,
+ IRFuncTy->getParamType(FirstIRArg));
if (ArgHasMaybeUndefAttr)
V = Builder.CreateFreeze(V);
diff --git a/clang/test/CodeGenOpenCL/implicit-addrspacecast-function-parameter.cl b/clang/test/CodeGenOpenCL/implicit-addrspacecast-function-parameter.cl
new file mode 100644
index 00000000000000..4a7bb8227c3393
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/implicit-addrspacecast-function-parameter.cl
@@ -0,0 +1,68 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -cl-std=CL2.0 -triple amdgcn-amd-amdhsa -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+// Check there's no assertion when passing a pointer to an address space
+// qualified argument.
+
+extern void private_ptr(__private int *);
+extern void local_ptr(__local int *);
+extern void generic_ptr(__generic int *);
+
+// CHECK-LABEL: define dso_local void @use_of_private_var(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[X:%.*]] = alloca i32, align 4, addrspace(5)
+// CHECK-NEXT: [[X_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[X]] to ptr
+// CHECK-NEXT: call void @llvm.lifetime.start.p5(i64 4, ptr addrspace(5) [[X]]) #[[ATTR4:[0-9]+]]
+// CHECK-NEXT: store i32 0, ptr [[X_ASCAST]], align 4, !tbaa [[TBAA4:![0-9]+]]
+// CHECK-NEXT: [[TMP0:%.*]] = addrspacecast ptr [[X_ASCAST]] to ptr addrspace(5)
+// CHECK-NEXT: call void @private_ptr(ptr addrspace(5) noundef [[TMP0]]) #[[ATTR5:[0-9]+]]
+// CHECK-NEXT: call void @generic_ptr(ptr noundef [[X_ASCAST]]) #[[ATTR5]]
+// CHECK-NEXT: call void @llvm.lifetime.end.p5(i64 4, ptr addrspace(5) [[X]]) #[[ATTR4]]
+// CHECK-NEXT: ret void
+//
+void use_of_private_var()
+{
+ int x = 0 ;
+ private_ptr(&x);
+ generic_ptr(&x);
+}
+
+// CHECK-LABEL: define dso_local void @addr_of_arg(
+// CHECK-SAME: i32 noundef [[X:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: [[X_ADDR:%.*]] = alloca i32, align 4, addrspace(5)
+// CHECK-NEXT: [[X_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[X_ADDR]] to ptr
+// CHECK-NEXT: store i32 [[X]], ptr [[X_ADDR_ASCAST]], align 4, !tbaa [[TBAA4]]
+// CHECK-NEXT: [[TMP0:%.*]] = addrspacecast ptr [[X_ADDR_ASCAST]] to ptr addrspace(5)
+// CHECK-NEXT: call void @private_ptr(ptr addrspace(5) noundef [[TMP0]]) #[[ATTR5]]
+// CHECK-NEXT: call void @generic_ptr(ptr noundef [[X_ADDR_ASCAST]]) #[[ATTR5]]
+// CHECK-NEXT: ret void
+//
+void addr_of_arg(int x)
+{
+ private_ptr(&x);
+ generic_ptr(&x);
+}
+
+// CHECK-LABEL: define dso_local amdgpu_kernel void @use_of_local_var(
+// CHECK-SAME: ) #[[ATTR3:[0-9]+]] !kernel_arg_addr_space [[META8:![0-9]+]] !kernel_arg_access_qual [[META8]] !kernel_arg_type [[META8]] !kernel_arg_base_type [[META8]] !kernel_arg_type_qual [[META8]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: call void @local_ptr(ptr addrspace(3) noundef @use_of_local_var.x) #[[ATTR5]]
+// CHECK-NEXT: call void @generic_ptr(ptr noundef addrspacecast (ptr addrspace(3) @use_of_local_var.x to ptr)) #[[ATTR5]]
+// CHECK-NEXT: ret void
+//
+__kernel void use_of_local_var()
+{
+ __local int x;
+ local_ptr(&x);
+ generic_ptr(&x);
+}
+
+//.
+// CHECK: [[TBAA4]] = !{[[META5:![0-9]+]], [[META5]], i64 0}
+// CHECK: [[META5]] = !{!"int", [[META6:![0-9]+]], i64 0}
+// CHECK: [[META6]] = !{!"omnipotent char", [[META7:![0-9]+]], i64 0}
+// CHECK: [[META7]] = !{!"Simple C/C++ TBAA"}
+// CHECK: [[META8]] = !{}
+//.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following also currently crashes with the flags in question:
void use_of_private_var()
{
int x = 0 ;
__private void* xx = &x;
}
And the AST for the call in your testcase looks fine:
`-CallExpr 0xd4aeec8 <line:7:5, col:19> 'void'
|-ImplicitCastExpr 0xd4aeeb0 <col:5> 'void (*)(__private int *__private)' <FunctionToPointerDecay>
| `-DeclRefExpr 0xd4aedf0 <col:5> 'void (__private int *__private)' Function 0xd4ae950 'private_ptr' 'void (__private int *__private)'
`-UnaryOperator 0xd4aee30 <col:17, col:18> '__private int *' prefix '&' cannot overflow
`-DeclRefExpr 0xd4aee10 <col:18> '__private int' lvalue Var 0xd4aed50 'x' '__private int'
I think this is actually a bug in the implementation of the "&" operator.
I don’t know if we want to do this blindly, it’s generally a good catch for actual bugs. I assume that this “worked” before because the AS map hack was in place. |
I’m not sure that’s a bug, I think that’s just the wonky OpenCL rules at play. I have a separate PR fixing this at the root which is stuck because @arsenm had objections and there’s a more robust way of handling ‘sret’, see #113930 |
Don't know how anything was working before. There must have been a recent regression,
but I haven't looked yet.