-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[OpenMP][USM] Adds test for -fopenmp-force-usm flag #75467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang Author: Jan Patrick Lehr (jplehr) ChangesThis adds a basic test to check the correct generation of double indirect access to declare target globals in USM mode vs non-USM mode. Marked as XFAIL to first land test and then enable in subsequent patch. Full diff: https://github.com/llvm/llvm-project/pull/75467.diff 1 Files Affected:
diff --git a/clang/test/OpenMP/force-usm.c b/clang/test/OpenMP/force-usm.c
new file mode 100644
index 00000000000000..222705322b8976
--- /dev/null
+++ b/clang/test/OpenMP/force-usm.c
@@ -0,0 +1,73 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "__omp_offloading_[0-9a-z]+_[0-9a-z]+" "pl_cond[.].+[.|,]" --prefix-filecheck-ir-name _ --version 3
+// XFAIL: amdgpu-registered-target
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown -fopenmp-targets=amdgcn-amd-amdhsa -include %S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa -fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -include %S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck -check-prefix=CHECK-USM %s
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown -fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa -fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck -check-prefix=CHECK-DEFAULT %s
+// expected-no-diagnostics
+
+extern "C" void *malloc(unsigned int b);
+
+int GI;
+#pragma omp declare target
+int *pGI;
+#pragma omp end declare target
+
+int main(void) {
+
+ GI = 0;
+
+ pGI = (int *) malloc(sizeof(int));
+ *pGI = 42;
+
+#pragma omp target map(pGI[:1], GI)
+ {
+ GI = 1;
+ *pGI = 2;
+ }
+
+ return 0;
+}
+
+// CHECK-USM-LABEL: define weak_odr protected amdgpu_kernel void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-USM-SAME: (ptr noundef nonnull align 4 dereferenceable(4) [[GI:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-USM-NEXT: entry:
+// CHECK-USM-NEXT: [[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-USM-NEXT: [[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[GI_ADDR]] to ptr
+// CHECK-USM-NEXT: store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT: [[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT: [[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-USM-NEXT: [[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-USM-NEXT: br i1 [[EXEC_USER_CODE]], label [[USER_CODE_ENTRY:%.*]], label [[WORKER_EXIT:%.*]]
+// CHECK-USM: user_code.entry:
+// CHECK-USM-NEXT: store i32 1, ptr [[TMP0]], align 4
+// CHECK-USM-NEXT: [[TMP2:%.*]] = load ptr, ptr @pGI_decl_tgt_ref_ptr, align 8
+// CHECK-USM-NEXT: [[TMP3:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK-USM-NEXT: store i32 2, ptr [[TMP3]], align 4
+// CHECK-USM-NEXT: call void @__kmpc_target_deinit(ptr addrspacecast (ptr addrspace(1) @[[GLOB1]] to ptr), i8 1)
+// CHECK-USM-NEXT: ret void
+// CHECK-USM: worker.exit:
+// CHECK-USM-NEXT: ret void
+//
+//
+// CHECK-DEFAULT-LABEL: define weak_odr protected amdgpu_kernel void @{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-DEFAULT-SAME: (ptr noundef nonnull align 4 dereferenceable(4) [[GI:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-DEFAULT-NEXT: entry:
+// CHECK-DEFAULT-NEXT: [[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-DEFAULT-NEXT: [[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[GI_ADDR]] to ptr
+// CHECK-DEFAULT-NEXT: store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-DEFAULT-NEXT: [[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-DEFAULT-NEXT: [[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-DEFAULT-NEXT: [[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-DEFAULT-NEXT: br i1 [[EXEC_USER_CODE]], label [[USER_CODE_ENTRY:%.*]], label [[WORKER_EXIT:%.*]]
+// CHECK-DEFAULT: user_code.entry:
+// CHECK-DEFAULT-NEXT: store i32 1, ptr [[TMP0]], align 4
+// CHECK-DEFAULT-NEXT: [[TMP2:%.*]] = load ptr, ptr addrspacecast (ptr addrspace(1) @pGI to ptr), align 8
+// CHECK-DEFAULT-NEXT: store i32 2, ptr [[TMP2]], align 4
+// CHECK-DEFAULT-NEXT: call void @__kmpc_target_deinit(ptr addrspacecast (ptr addrspace(1) @[[GLOB1]] to ptr), i8 1)
+// CHECK-DEFAULT-NEXT: ret void
+// CHECK-DEFAULT: worker.exit:
+// CHECK-DEFAULT-NEXT: ret void
+//
|
This adds a basic test to check the correct generation of double indirect access to declare target globals in USM mode vs non-USM mode. Marked as XFAIL to first land test and then enable in subsequent patch.
ea2a919
to
d3d073d
Compare
I updated the feature PR (#75468) with a different solution. Will update the test after feedback if the route I took in the other PR is seen as OK. |
How do the IR checks verify the flag works? |
Test should probably show that IR is equivalent to |
I mean, the test shows device IR for two functions, doesn't it? I am not aware that part is impacted by USM. Host globals/register functions are. |
The IR is impacted for the global that is in that test case. Lines ~46-50 (first IR section) vs line ~68 (second IR section). The remaining code is indeed the same. |
I see. FWIW "pGI_decl_tgt_ref_ptr" is really not a good name (not your fault). |
Closing this. Test is now part of feature-PR. |
This adds a basic test to check the correct generation of double indirect access to declare target globals in USM mode vs non-USM mode.
I am a bit unhappy with the way this test is set up, but could not find a better way to do it. Happy to improve that and add more tests then.
Marked as XFAIL to first land test and then enable in subsequent patch.