[NFC][OpenMP][MLIR] Add test for lowering parallel workshare GPU loop #76144

DominikAdamski · 2023-12-21T12:26:40Z

This test checks that the MLIR code is lowered according to the schema below:

func1() {
  call __kmpc_parallel_51(..., func2, ...)
}

func2() {
  call __kmpc_for_static_loop_4u(..., func3, ...)
}

func3() {
  // loop body
}
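
The nesting in the schema can be modeled with a minimal Python sketch. The `mock_*` functions are hypothetical stand-ins for the runtime entry points named above: they take far fewer arguments than the real `__kmpc_*` calls and run everything sequentially on one thread, so this only illustrates which function hands which outlined function to the runtime.

```python
# Hypothetical stand-ins for the device runtime entry points named in the
# schema. The real __kmpc_* functions take more arguments and distribute work
# across GPU threads; these mocks run everything sequentially.

def mock_kmpc_parallel_51(outlined_fn, args):
    """Stand-in for __kmpc_parallel_51: invoke the outlined parallel region."""
    outlined_fn(args)


def mock_kmpc_for_static_loop_4u(body_fn, body_arg, num_iters):
    """Stand-in for __kmpc_for_static_loop_4u: call the body once per iteration."""
    for counter in range(num_iters):
        body_fn(counter, body_arg)


def func3(counter, array):
    """Outlined loop body: store the loop counter at its own index."""
    array[counter] = counter


def func2(array):
    """Outlined parallel region: pass the loop body to the loop runtime call."""
    mock_kmpc_for_static_loop_4u(func3, array, num_iters=10)


def func1(array):
    """Original function: pass the parallel region to the parallel runtime call."""
    mock_kmpc_parallel_51(func2, array)


result = [None] * 10
func1(result)
print(result)
```

Each level only passes a function reference to the level below it, which is why the test checks that the symbol passed to one call is the same symbol defined later in the module.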

llvmbot · 2023-12-21T12:27:08Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-llvm

Author: Dominik Adamski (DominikAdamski)

Changes

This test checks that the MLIR code is lowered according to the schema below:

func1() {
  call __kmpc_parallel_51(..., func2, ...)
}

func2() {
  call __kmpc_for_static_loop_4u(..., func3, ...)
}

func3() {
  // loop body
}

Full diff: https://github.com/llvm/llvm-project/pull/76144.diff

1 Files Affected:

(added) mlir/test/Target/LLVMIR/omptarget-parallel-wsloop.mlir (+36)

diff --git a/mlir/test/Target/LLVMIR/omptarget-parallel-wsloop.mlir b/mlir/test/Target/LLVMIR/omptarget-parallel-wsloop.mlir
new file mode 100644
index 00000000000000..43d0934d3a931e
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/omptarget-parallel-wsloop.mlir
@@ -0,0 +1,36 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// The aim of the test is to check the GPU LLVM IR codegen
+// for nested omp do loop inside omp target region
+
+module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.alloca_memory_space", 5 : ui32>>, llvm.data_layout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8", llvm.target_triple = "amdgcn-amd-amdhsa", omp.is_gpu = true, omp.is_target_device = true } {
+  llvm.func @target_parallel_wsloop(%arg0: !llvm.ptr ){
+    omp.parallel {
+      %loop_ub = llvm.mlir.constant(9 : i32) : i32
+      %loop_lb = llvm.mlir.constant(0 : i32) : i32
+      %loop_step = llvm.mlir.constant(1 : i32) : i32
+      omp.wsloop for  (%loop_cnt) : i32 = (%loop_lb) to (%loop_ub) inclusive step (%loop_step) {
+        %gep = llvm.getelementptr %arg0[0, %loop_cnt] : (!llvm.ptr, i32) -> !llvm.ptr, !llvm.array<10 x i32>
+        llvm.store %loop_cnt, %gep : i32, !llvm.ptr
+        omp.yield
+      }
+     omp.terminator
+    }
+
+    llvm.return
+  }
+
+}
+// CHECK:      call void @__kmpc_parallel_51(ptr addrspacecast
+// CHECK-SAME:  (ptr addrspace(1) @[[GLOB:[0-9]+]] to ptr),
+// CHECK-SAME:  i32 %[[THREAD_NUM:.*]], i32 1, i32 -1, i32 -1,
+// CHECK-SAME:  ptr @[[PARALLEL_FUNC:.*]], ptr null, ptr %[[PARALLEL_ARGS:.*]], i64 1)
+
+// CHECK:      define internal void @[[PARALLEL_FUNC]]
+// CHECK-SAME:  (ptr noalias noundef %[[TID_ADDR:.*]], ptr noalias noundef %[[ZERO_ADDR:.*]],
+// CHECK-SAME:  ptr %[[ARG_PTR:.*]])
+// CHECK: call void @__kmpc_for_static_loop_4u(ptr addrspacecast (ptr addrspace(1) @[[GLOB]] to ptr),
+// CHECK-SAME:   ptr @[[LOOP_BODY_FUNC:.*]], ptr %[[LOO_BODY_FUNC_ARG:.*]], i32 10,
+// CHECK-SAME:   i32 %[[THREAD_NUM:.*]], i32 0)
+
+// CHECK:      define internal void @[[LOOP_BODY_FUNC]](i32 %[[CNT:.*]], ptr %[[LOOP_BODY_ARG_PTR:.*]]) {
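
The CHECK lines above tie each runtime call to the outlined function it receives: `[[NAME:regex]]` captures a name at the call site and `[[NAME]]` requires the same name at the later `define`. The following is a much-simplified Python model of that capture-and-reuse behavior, not FileCheck's actual implementation. The helper `match_check_pattern` and the two sample IR strings are made up for illustration.

```python
import re

# [[NAME:regex]] defines a variable; [[NAME]] reuses an earlier definition.
SUBSTITUTION = re.compile(r"\[\[(\w+)(?::(.*?))?\]\]")


def match_check_pattern(pattern, text, variables):
    """Search `text` for a simplified FileCheck-style pattern.

    Literal parts of the pattern are matched verbatim. Variables defined by
    the pattern are stored in `variables` on success. Using an undefined
    variable raises KeyError in this sketch.
    """
    parts = []
    defined = []
    pos = 0
    for block in SUBSTITUTION.finditer(pattern):
        parts.append(re.escape(pattern[pos:block.start()]))
        name, var_regex = block.group(1), block.group(2)
        if var_regex is not None:
            # Definition: capture whatever the regex matches under this name.
            parts.append(f"(?P<{name}>{var_regex})")
            defined.append(name)
        else:
            # Use: require the exact text captured earlier.
            parts.append(re.escape(variables[name]))
        pos = block.end()
    parts.append(re.escape(pattern[pos:]))
    found = re.search("".join(parts), text)
    if found is None:
        return False
    for name in defined:
        variables[name] = found.group(name)
    return True


# Made-up sample lines, shaped loosely like the IR the test expects.
ir_call = "call void @__kmpc_parallel_51(ptr %ident, ptr @outlined_fn, ptr null)"
ir_define = "define internal void @outlined_fn(ptr %tid)"

variables = {}
call_ok = match_check_pattern("ptr @[[PARALLEL_FUNC:[a-z_]+]], ptr null", ir_call, variables)
define_ok = match_check_pattern("define internal void @[[PARALLEL_FUNC]](", ir_define, variables)
print(call_ok, define_ok, variables)
```

If the function defined later had a different name from the one passed to the call, the second match would fail, which is the property the test relies on.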

kiranchandramohan · 2023-12-21T15:19:51Z

Could you share the command-line flags that have to be used with flang-new to get the LLVM IR in this test?

kiranchandramohan · 2023-12-21T15:37:51Z

Could you share the command-line flags that have to be used with flang-new to get the LLVM IR in this test?

I guess it is something like the following.

flang-new -fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp -fopenmp-is-target-device do.f90

DominikAdamski · 2023-12-21T16:04:58Z

The MLIR test case was reduced by hand.

The initial code was similar to:

subroutine test(i, beta) bind(C)
!$omp declare target
        use ISO_C_BINDING
        integer (C_INT), dimension(*), intent(out) :: beta
        integer (C_INT), value :: i
!$omp parallel
!$omp do
      do i = 1, 10
        beta(i) = i
      end do
!$omp end do
!$omp end parallel
end subroutine

I am able to generate the test-openmp-amdgcn-amd-amdhsa-gfx90a-llvmir.mlir file with this command:
flang-new -save-temps -c -fopenmp --offload-arch=gfx90a test.f95

Or:
flang-new -fc1 -triple amdgcn-amd-amdhsa -emit-llvm-bc -fopenmp -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -target-cpu gfx90a -fopenmp-host-ir-file-path test-host-x86_64-unknown-linux-gnu.bc -fopenmp-is-target-device -mframe-pointer=all -o test-openmp-amdgcn-amd-amdhsa-gfx90a.tmp.bc -save-temps=cwd -x f95 test-openmp-amdgcn-amd-amdhsa-gfx90a.i
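
As a quick cross-check between the Fortran source and the hand-reduced MLIR, both loops run the same number of iterations. The sketch below computes the trip count for each; `trip_count` is a hypothetical helper written for this note, and it only handles positive steps. The result is consistent with the `i32 10` argument in the `__kmpc_for_static_loop_4u` CHECK line, assuming that argument is the iteration count.

```python
def trip_count(lower, upper, step, inclusive=True):
    """Number of iterations of a counted loop with a positive step."""
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    last = upper if inclusive else upper - 1
    if last < lower:
        return 0
    return (last - lower) // step + 1


# Fortran source: do i = 1, 10
fortran_iterations = trip_count(1, 10, 1)

# Reduced MLIR: (%loop_lb) to (%loop_ub) inclusive step (%loop_step),
# with lb = 0, ub = 9, step = 1
mlir_iterations = trip_count(0, 9, 1)

print(fortran_iterations, mlir_iterations)
```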

[NFC][OpenMP][MLIR] Add test for lowering parallel workshare GPU loop

545f76d

DominikAdamski requested review from kiranchandramohan and kiranktp December 21, 2023 12:26

llvmbot added mlir:llvm mlir labels Dec 21, 2023

DominikAdamski requested review from jsjodin and skatrak December 21, 2023 12:27

DominikAdamski requested review from ergawy and agozillon December 21, 2023 12:27

jdoerfert approved these changes Dec 21, 2023

DominikAdamski merged commit ffabf73 into llvm:main Dec 22, 2023

DominikAdamski deleted the omptarget_parallel_wsloop branch December 22, 2023 10:58
