
[flang][cuda] Add entry point to launch global function with cluster_dims #113958


Merged: 1 commit merged on Oct 29, 2024

Conversation

clementval (Contributor)

No description provided.

@llvmbot added labels flang:runtime, flang (Flang issues not falling into any other category) on Oct 28, 2024
@llvmbot (Member) commented Oct 28, 2024

@llvm/pr-subscribers-flang-runtime

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/113958.diff

2 Files Affected:

  • (modified) flang/include/flang/Runtime/CUDA/kernel.h (+7-1)
  • (modified) flang/runtime/CUDA/kernel.cpp (+24-1)
diff --git a/flang/include/flang/Runtime/CUDA/kernel.h b/flang/include/flang/Runtime/CUDA/kernel.h
index cf07d874a082c0..85afda09e347ae 100644
--- a/flang/include/flang/Runtime/CUDA/kernel.h
+++ b/flang/include/flang/Runtime/CUDA/kernel.h
@@ -15,13 +15,19 @@
 
 extern "C" {
 
-// This function uses intptr_t instead of CUDA's unsigned int to match
+// These functions use intptr_t instead of CUDA's unsigned int to match
 // the type of MLIR's index type. This avoids the need for casts in the
 // generated MLIR code.
+
 void RTDEF(CUFLaunchKernel)(const void *kernelName, intptr_t gridX,
     intptr_t gridY, intptr_t gridZ, intptr_t blockX, intptr_t blockY,
     intptr_t blockZ, int32_t smem, void **params, void **extra);
 
+void RTDEF(CUFLaunchClusterKernel)(const void *kernelName, intptr_t clusterX,
+    intptr_t clusterY, intptr_t clusterZ, intptr_t gridX, intptr_t gridY,
+    intptr_t gridZ, intptr_t blockX, intptr_t blockY, intptr_t blockZ,
+    int32_t smem, void **params, void **extra);
+
 } // extern "C"
 
 #endif // FORTRAN_RUNTIME_CUDA_KERNEL_H_
diff --git a/flang/runtime/CUDA/kernel.cpp b/flang/runtime/CUDA/kernel.cpp
index f81153a1af4bc7..abb7ebb72e5923 100644
--- a/flang/runtime/CUDA/kernel.cpp
+++ b/flang/runtime/CUDA/kernel.cpp
@@ -25,9 +25,32 @@ void RTDEF(CUFLaunchKernel)(const void *kernel, intptr_t gridX, intptr_t gridY,
   blockDim.x = blockX;
   blockDim.y = blockY;
   blockDim.z = blockZ;
-  cudaStream_t stream = 0;
+  cudaStream_t stream = 0; // TODO stream management
   CUDA_REPORT_IF_ERROR(
       cudaLaunchKernel(kernel, gridDim, blockDim, params, smem, stream));
 }
 
+void RTDEF(CUFLaunchClusterKernel)(const void *kernel, intptr_t clusterX,
+    intptr_t clusterY, intptr_t clusterZ, intptr_t gridX, intptr_t gridY,
+    intptr_t gridZ, intptr_t blockX, intptr_t blockY, intptr_t blockZ,
+    int32_t smem, void **params, void **extra) {
+  cudaLaunchConfig_t config;
+  config.gridDim.x = gridX;
+  config.gridDim.y = gridY;
+  config.gridDim.z = gridZ;
+  config.blockDim.x = blockX;
+  config.blockDim.y = blockY;
+  config.blockDim.z = blockZ;
+  config.dynamicSmemBytes = smem;
+  config.stream = 0; // TODO stream management
+  cudaLaunchAttribute launchAttr[1];
+  launchAttr[0].id = cudaLaunchAttributeClusterDimension;
+  launchAttr[0].val.clusterDim.x = clusterX;
+  launchAttr[0].val.clusterDim.y = clusterY;
+  launchAttr[0].val.clusterDim.z = clusterZ;
+  config.numAttrs = 1;
+  config.attrs = launchAttr;
+  CUDA_REPORT_IF_ERROR(cudaLaunchKernelExC(&config, kernel, params));
+}
+
 } // extern "C"
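For readers unfamiliar with thread block clusters, the new `CUFLaunchClusterKernel` entry point is essentially a thin wrapper around `cudaLaunchKernelExC` with a `cudaLaunchAttributeClusterDimension` attribute. A standalone sketch of that same launch pattern is shown below; `kernelExample`, the grid/block/cluster sizes, and the no-argument kernel are all illustrative assumptions, not part of this PR, and running it requires a cluster-capable GPU (sm_90+, CUDA 11.8 or later).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical no-argument kernel, for illustration only.
__global__ void kernelExample() {}

int main() {
  cudaLaunchConfig_t config = {};
  config.gridDim = dim3(8, 1, 1);   // grid dims must be divisible by cluster dims
  config.blockDim = dim3(128, 1, 1);
  config.dynamicSmemBytes = 0;      // corresponds to the `smem` parameter
  config.stream = 0;                // default stream, as in the PR

  // Attach the cluster dimension as a launch attribute, mirroring
  // the CUFLaunchClusterKernel implementation above.
  cudaLaunchAttribute launchAttr[1];
  launchAttr[0].id = cudaLaunchAttributeClusterDimension;
  launchAttr[0].val.clusterDim.x = 2;  // 2 thread blocks per cluster along x
  launchAttr[0].val.clusterDim.y = 1;
  launchAttr[0].val.clusterDim.z = 1;
  config.attrs = launchAttr;
  config.numAttrs = 1;

  // kernelParams may be nullptr for a kernel that takes no arguments.
  cudaError_t err =
      cudaLaunchKernelExC(&config, (const void *)kernelExample, nullptr);
  printf("launch: %s\n", cudaGetErrorString(err));
  cudaDeviceSynchronize();
  return 0;
}
```

One design note visible in the PR: the runtime zero-initializes nothing and sets every `cudaLaunchConfig_t` field explicitly, whereas the sketch uses `= {}` so unset fields are well-defined.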

@clementval clementval merged commit 0b700f2 into main Oct 29, 2024
11 checks passed
@clementval clementval deleted the users/clementval/cuf_cluster_rt branch October 29, 2024 17:01
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024