Commit 1b7a095

[Clang][AMDGPU] Permit language address spaces for AMDGPU globals (#66205)
Summary:
Currently, there is an assertion that prevents us from emitting an AMDGPU global with a non-target specific address space (i.e. numerical attribute). I'm unsure what the original intentions of this assertion were, but we should be able to use OpenCL address spaces when compiling directly to AMDGPU from C++. This is permitted on NVPTX so I'm unsure what this assertion is guarding. The patch simply removes the assertion and adds a test to ensure that these emit the expected address spaces.

Fixes #65069
1 parent ba81cd1 commit 1b7a095

2 files changed: +62 -1 lines changed


clang/lib/CodeGen/Targets/AMDGPU.cpp

Lines changed: 0 additions & 1 deletion
@@ -446,7 +446,6 @@ AMDGPUTargetCodeGenInfo::getGlobalVarAddressSpace(CodeGenModule &CGM,
     return DefaultGlobalAS;

   LangAS AddrSpace = D->getType().getAddressSpace();
-  assert(AddrSpace == LangAS::Default || isTargetAddressSpace(AddrSpace));
   if (AddrSpace != LangAS::Default)
     return AddrSpace;
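To illustrate why the removed assertion rejected OpenCL address spaces, here is a minimal standalone sketch. It is not Clang's code: the `LangAS` enum below is an abbreviated stand-in with illustrative values, and `passesRemovedAssert` is a hypothetical helper that just restates the deleted condition. The only property it relies on is that numeric `address_space(N)` values are encoded at or above `FirstTargetAddressSpace`, while language address spaces sit below it.

```cpp
#include <cassert>

// Abbreviated stand-in for clang's LangAS (assumption: values are illustrative,
// only the ordering relative to FirstTargetAddressSpace matters here).
enum class LangAS : unsigned {
  Default = 0,
  opencl_global,
  opencl_local,
  opencl_constant,
  opencl_private,
  opencl_generic,
  FirstTargetAddressSpace // numeric address_space(N) values start here
};

// True only for address spaces written as a number, e.g. address_space(1).
constexpr bool isTargetAddressSpace(LangAS AS) {
  return static_cast<unsigned>(AS) >=
         static_cast<unsigned>(LangAS::FirstTargetAddressSpace);
}

// Encodes a numeric target address space as a LangAS value.
constexpr LangAS getLangASFromTargetAS(unsigned TargetAS) {
  return static_cast<LangAS>(
      TargetAS + static_cast<unsigned>(LangAS::FirstTargetAddressSpace));
}

// Hypothetical helper: the condition of the assertion this commit deletes.
constexpr bool passesRemovedAssert(LangAS AS) {
  return AS == LangAS::Default || isTargetAddressSpace(AS);
}

// Numeric and default address spaces satisfied the old condition...
static_assert(passesRemovedAssert(LangAS::Default), "default is accepted");
static_assert(passesRemovedAssert(getLangASFromTargetAS(1)), "numeric is accepted");
// ...but a language address space such as opencl_global did not.
static_assert(!passesRemovedAssert(LangAS::opencl_global), "language AS is rejected");
```

Under this model, a global declared with `[[clang::opencl_global]]` would have tripped the assertion in an assertions-enabled build, whereas `[[clang::address_space(1)]]` would not.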

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --check-globals --version 3
+// RUN: %clang_cc1 -cc1 -triple amdgcn-amd-amdhsa -emit-llvm %s -o - | FileCheck %s
+
+int [[clang::opencl_global]] a = 100;
+int [[clang::opencl_generic]] b = 42;
+int [[clang::opencl_constant]] c = 999;
+[[clang::loader_uninitialized]] int [[clang::opencl_local]] d;
+[[clang::loader_uninitialized]] int [[clang::opencl_private]] e;
+
+int [[clang::address_space(1)]] x = 100;
+int [[clang::address_space(0)]] y = 42;
+int [[clang::address_space(4)]] z = 999;
+[[clang::loader_uninitialized]] int [[clang::address_space(3)]] w;
+[[clang::loader_uninitialized]] int [[clang::address_space(5)]] u;
+
+int [[clang::address_space(6)]] aaa = 1000;
+int [[clang::address_space(999)]] bbb = 1234;
+
+//.
+// CHECK: @a = addrspace(1) global i32 100, align 4
+// CHECK: @b = global i32 42, align 4
+// CHECK: @c = addrspace(4) constant i32 999, align 4
+// CHECK: @d = addrspace(3) global i32 undef, align 4
+// CHECK: @e = addrspace(5) global i32 undef, align 4
+// CHECK: @x = addrspace(1) global i32 100, align 4
+// CHECK: @y = global i32 42, align 4
+// CHECK: @z = addrspace(4) global i32 999, align 4
+// CHECK: @w = addrspace(3) global i32 undef, align 4
+// CHECK: @u = addrspace(5) global i32 undef, align 4
+// CHECK: @aaa = addrspace(6) global i32 1000, align 4
+// CHECK: @bbb = addrspace(999) global i32 1234, align 4
+// CHECK: @llvm.amdgcn.abi.version = weak_odr hidden local_unnamed_addr addrspace(4) constant i32 400
+//.
+// CHECK-LABEL: define dso_local amdgpu_kernel void @foo(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: store i32 0, ptr addrspace(1) @a, align 4
+// CHECK-NEXT: store i32 0, ptr @b, align 4
+// CHECK-NEXT: store i32 0, ptr addrspace(3) @d, align 4
+// CHECK-NEXT: store i32 0, ptr addrspace(5) @e, align 4
+// CHECK-NEXT: store i32 0, ptr addrspace(1) @x, align 4
+// CHECK-NEXT: store i32 0, ptr @y, align 4
+// CHECK-NEXT: store i32 0, ptr addrspace(3) @d, align 4
+// CHECK-NEXT: store i32 0, ptr addrspace(5) @u, align 4
+// CHECK-NEXT: store i32 0, ptr addrspace(6) @aaa, align 4
+// CHECK-NEXT: store i32 0, ptr addrspace(999) @bbb, align 4
+// CHECK-NEXT: ret void
+//
+extern "C" [[clang::amdgpu_kernel]] void foo() {
+  a = 0;
+  b = 0;
+  d = 0;
+  e = 0;
+
+  x = 0;
+  y = 0;
+  d = 0;
+  u = 0;
+
+  aaa = 0;
+  bbb = 0;
+}
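The CHECK lines in this test pair each OpenCL attribute with the numeric address space of its explicitly numbered counterpart. As a reading aid only, the sketch below tabulates that pairing exactly as the test records it. `OpenCLSpace` and `expectedAMDGPUAddrSpace` are hypothetical names invented for this illustration and are not part of Clang. The value 0 for the generic case reflects that `@b` is emitted with no `addrspace(...)` qualifier.

```cpp
#include <cassert>

// Hypothetical enum naming the five OpenCL attributes exercised by the test.
enum class OpenCLSpace { Global, Generic, Constant, Local, Private };

// Hypothetical helper: the addrspace number each attribute produced in the
// test's CHECK lines for the amdgcn-amd-amdhsa triple.
inline unsigned expectedAMDGPUAddrSpace(OpenCLSpace S) {
  switch (S) {
  case OpenCLSpace::Global:   return 1; // @a = addrspace(1) global
  case OpenCLSpace::Generic:  return 0; // @b = global (no addrspace printed)
  case OpenCLSpace::Constant: return 4; // @c = addrspace(4) constant
  case OpenCLSpace::Local:    return 3; // @d = addrspace(3) global
  case OpenCLSpace::Private:  return 5; // @e = addrspace(5) global
  }
  return 0; // unreachable for valid enumerators
}
```

For example, `expectedAMDGPUAddrSpace(OpenCLSpace::Local)` yields 3, matching both `@d` (declared with `opencl_local`) and `@w` (declared with `address_space(3)`).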
