
[flang][cuda] Add cuf.shared_memory operation #131392

Merged: 3 commits merged into main on Mar 14, 2025

Conversation

clementval
Contributor

Introduce the cuf.shared_memory operation. The operation returns a pointer into shared memory for a specific variable. The shared memory is materialized as a global in address space 3, and each variable points into it at a different offset.

Follow-up patches will add lowering and conversion of this operation.
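
To illustrate the idea of several variables sharing one block at different byte offsets, here is a minimal C++ sketch. It is not code from this patch or from flang: the names `SharedVar`, `Placement`, `SharedLayout`, `alignUp`, and `assignOffsets` are hypothetical, and the packing policy (declaration order, each variable rounded up to its own power-of-two alignment) is an assumption, since the PR leaves offset assignment to follow-up patches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one shared variable (not a flang type).
struct SharedVar {
  std::string name;
  std::size_t sizeInBytes;
  std::size_t alignment; // assumed to be a power of two
};

// Hypothetical result: where a variable lives inside the shared block.
// The offset is 32-bit to mirror the op's optional I32 `offset` attribute.
struct Placement {
  std::string name;
  std::uint32_t offset; // bytes from the shared memory base address
};

struct SharedLayout {
  std::vector<Placement> placements;
  std::size_t totalSize = 0; // bytes needed for the whole shared block
};

// Round `value` up to the next multiple of `alignment`.
inline std::size_t alignUp(std::size_t value, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0 &&
         "alignment must be a power of two");
  return (value + alignment - 1) & ~(alignment - 1);
}

// Assumed policy: pack variables in declaration order, aligning each one.
inline SharedLayout assignOffsets(const std::vector<SharedVar> &vars) {
  SharedLayout layout;
  std::size_t cursor = 0;
  for (const SharedVar &var : vars) {
    cursor = alignUp(cursor, var.alignment);
    layout.placements.push_back(
        {var.name, static_cast<std::uint32_t>(cursor)});
    cursor += var.sizeInBytes;
  }
  layout.totalSize = cursor;
  return layout;
}
```

Under these assumptions, two 4-byte variables such as `a` and `b` in the static test case would land at byte offsets 0 and 4 of the same block. The actual assignment is left to the follow-up lowering patches and may differ.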

@clementval clementval requested a review from wangzpgi March 14, 2025 21:10
@llvmbot llvmbot added the flang and flang:fir-hlfir labels Mar 14, 2025
@llvmbot
Member

llvmbot commented Mar 14, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/131392.diff

2 Files Affected:

  • (modified) flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td (+29)
  • (modified) flang/test/Fir/cuf.mlir (+27)
diff --git a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td
index c1021da0cfb21..eda129fb59ded 100644
--- a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td
+++ b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td
@@ -349,4 +349,33 @@ def cuf_DeviceAddressOp : cuf_Op<"device_address", []> {
   let results = (outs fir_ReferenceType:$addr);
 }
 
+def cuf_SharedMemoryOp
+    : cuf_Op<"shared_memory", [AttrSizedOperandSegments, Pure]> {
+  let summary = "Get the pointer to the kernel shared memory";
+
+  let description = [{
+    Return the pointer in the shared memory relative to the specified offset.
+  }];
+
+  let arguments = (ins TypeAttr:$in_type, OptionalAttr<StrAttr>:$uniq_name,
+      OptionalAttr<StrAttr>:$bindc_name, Variadic<AnyIntegerType>:$typeparams,
+      Variadic<AnyIntegerType>:$shape,
+      OptionalAttr<I32Attr>:$offset // offset in bytes from the shared memory
+                                    // base address.
+  );
+
+  let results = (outs fir_ReferenceType:$ptr);
+
+  let assemblyFormat = [{
+      $in_type (`(` $typeparams^ `:` type($typeparams) `)`)?
+        (`,` $shape^ `:` type($shape) )?  attr-dict `->` qualified(type($ptr))
+  }];
+
+  let builders = [OpBuilder<(ins "mlir::Type":$inType,
+      "llvm::StringRef":$uniqName, "llvm::StringRef":$bindcName,
+      CArg<"mlir::ValueRange", "{}">:$typeparams,
+      CArg<"mlir::ValueRange", "{}">:$shape,
+      CArg<"llvm::ArrayRef<mlir::NamedAttribute>", "{}">:$attributes)>];
+}
+
 #endif // FORTRAN_DIALECT_CUF_CUF_OPS
diff --git a/flang/test/Fir/cuf.mlir b/flang/test/Fir/cuf.mlir
index 188044d04b848..d38b26a4548ed 100644
--- a/flang/test/Fir/cuf.mlir
+++ b/flang/test/Fir/cuf.mlir
@@ -86,3 +86,30 @@ func.func @_QPsub1() {
 // CHECK: cuf.alloc
 // CHECK: cuf.free
 
+// -----
+
+ gpu.module @cuda_device_mod {
+  gpu.func @_QPdynshared() kernel {
+    %c-1 = arith.constant -1 : index
+    %6 = cuf.shared_memory !fir.array<?xf32>, %c-1 : index {bindc_name = "r", uniq_name = "_QFdynsharedEr"} -> !fir.ref<!fir.array<?xf32>>
+    %7 = fir.shape %c-1 : (index) -> !fir.shape<1>
+    %8 = fir.declare %6(%7) {data_attr = #cuf.cuda<shared>, uniq_name = "_QFdynsharedEr"} : (!fir.ref<!fir.array<?xf32>>, !fir.shape<1>) -> !fir.ref<!fir.array<?xf32>>
+    gpu.return
+  }
+}
+
+// CHECK: cuf.shared_memory
+
+// -----
+
+gpu.module @cuda_device_mod {
+  gpu.func @_QPshared_static() attributes {cuf.proc_attr = #cuf.cuda_proc<global>} {
+    %0 = cuf.shared_memory i32 {bindc_name = "a", uniq_name = "_QFshared_staticEa"} -> !fir.ref<i32>
+    %1 = fir.declare %0 {data_attr = #cuf.cuda<shared>, uniq_name = "_QFshared_staticEa"} : (!fir.ref<i32>) -> !fir.ref<i32>
+    %2 = cuf.shared_memory i32 {bindc_name = "b", uniq_name = "_QFshared_staticEb"} -> !fir.ref<i32>
+    %3 = fir.declare %2 {data_attr = #cuf.cuda<shared>, uniq_name = "_QFshared_staticEb"} : (!fir.ref<i32>) -> !fir.ref<i32>
+    gpu.return
+  }
+}
+
+// CHECK-COUNT-2: cuf.shared_memory 

@clementval clementval merged commit 4818623 into main Mar 14, 2025
8 of 10 checks passed
@clementval clementval deleted the users/clementval/cuf_shared_memory_op branch March 14, 2025 22:43