[flang][cuda] Defined allocator for unified data #102189

clementval · 2024-08-06T17:48:59Z

CUDA unified variables were set to use the same allocator as managed variables. This patch adds a specific allocator for unified variables. For now it calls the managed allocator underneath, but we want the flexibility to change that in the future.

llvmbot · 2024-08-06T17:49:30Z

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-runtime

Author: Valentin Clement (バレンタインクレメン) (clementval)

Changes

CUDA unified variables were set to use the same allocator as managed variables. This patch adds a specific allocator for unified variables. For now it calls the managed allocator underneath, but we want the flexibility to change that in the future.

Full diff: https://github.com/llvm/llvm-project/pull/102189.diff

4 Files Affected:

(modified) flang/include/flang/Runtime/CUDA/allocator.h (+3)
(modified) flang/include/flang/Runtime/allocator-registry.h (+2-1)
(modified) flang/lib/Lower/ConvertVariable.cpp (+3-2)
(modified) flang/runtime/CUDA/allocator.cpp (+12)

diff --git a/flang/include/flang/Runtime/CUDA/allocator.h b/flang/include/flang/Runtime/CUDA/allocator.h
index 46ff5dbe2f385..70729c3d9f188 100644
--- a/flang/include/flang/Runtime/CUDA/allocator.h
+++ b/flang/include/flang/Runtime/CUDA/allocator.h
@@ -36,5 +36,8 @@ void CUFFreeDevice(void *);
 void *CUFAllocManaged(std::size_t);
 void CUFFreeManaged(void *);
 
+void *CUFAllocUnified(std::size_t);
+void CUFFreeUnified(void *);
+
 } // namespace Fortran::runtime::cuf
 #endif // FORTRAN_RUNTIME_CUDA_ALLOCATOR_H_
diff --git a/flang/include/flang/Runtime/allocator-registry.h b/flang/include/flang/Runtime/allocator-registry.h
index 209b4d2e44e9b..acfada506fafc 100644
--- a/flang/include/flang/Runtime/allocator-registry.h
+++ b/flang/include/flang/Runtime/allocator-registry.h
@@ -19,8 +19,9 @@ static constexpr unsigned kDefaultAllocator = 0;
 static constexpr unsigned kPinnedAllocatorPos = 1;
 static constexpr unsigned kDeviceAllocatorPos = 2;
 static constexpr unsigned kManagedAllocatorPos = 3;
+static constexpr unsigned kUnifiedAllocatorPos = 4;
 
-#define MAX_ALLOCATOR 5
+#define MAX_ALLOCATOR 7 // 3 bits are reserved in the descriptor.
 
 namespace Fortran::runtime {
 
diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp
index 45389091b8164..ffbbea238647c 100644
--- a/flang/lib/Lower/ConvertVariable.cpp
+++ b/flang/lib/Lower/ConvertVariable.cpp
@@ -1860,9 +1860,10 @@ static unsigned getAllocatorIdx(const Fortran::semantics::Symbol &sym) {
       return kPinnedAllocatorPos;
     if (*cudaAttr == Fortran::common::CUDADataAttr::Device)
       return kDeviceAllocatorPos;
-    if (*cudaAttr == Fortran::common::CUDADataAttr::Managed ||
-        *cudaAttr == Fortran::common::CUDADataAttr::Unified)
+    if (*cudaAttr == Fortran::common::CUDADataAttr::Managed)
       return kManagedAllocatorPos;
+    if (*cudaAttr == Fortran::common::CUDADataAttr::Unified)
+      return kUnifiedAllocatorPos;
   }
   return kDefaultAllocator;
 }
diff --git a/flang/runtime/CUDA/allocator.cpp b/flang/runtime/CUDA/allocator.cpp
index 26a3c29696269..5292dd54322bd 100644
--- a/flang/runtime/CUDA/allocator.cpp
+++ b/flang/runtime/CUDA/allocator.cpp
@@ -26,6 +26,8 @@ void CUFRegisterAllocator() {
       kDeviceAllocatorPos, {&CUFAllocDevice, CUFFreeDevice});
   allocatorRegistry.Register(
       kManagedAllocatorPos, {&CUFAllocManaged, CUFFreeManaged});
+  allocatorRegistry.Register(
+      kUnifiedAllocatorPos, {&CUFAllocUnified, CUFFreeUnified});
 }
 
 void *CUFAllocPinned(std::size_t sizeInBytes) {
@@ -57,4 +59,14 @@ void CUFFreeManaged(void *p) {
   CUDA_REPORT_IF_ERROR(cuMemFree(reinterpret_cast<CUdeviceptr>(p)));
 }
 
+void *CUFAllocUnified(std::size_t sizeInBytes) {
+  // Call alloc managed for the time being.
+  return CUFAllocManaged(sizeInBytes);
+}
+
+void CUFFreeUnified(void *p) {
+  // Call free managed for the time being.
+  CUFFreeManaged(p);
+}
+
 } // namespace Fortran::runtime::cuf
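
The effect of the diff can be illustrated with a small host-only sketch: a symbol's CUDA data attribute selects an allocator index, and the registry entry at that index supplies the alloc/free pair. The sketch is not the flang runtime code. The `sketch` namespace, `Registry`, `Allocator`, `DataAttr`, `AllocatorIndexFor`, and `MakeRegistry` are hypothetical names introduced for illustration, and `std::malloc`/`std::free` stand in for the CUDA managed-memory calls so the example builds without a CUDA toolkit. Only the index constants and the forwarding from the unified pair to the managed pair are taken from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace sketch { // hypothetical namespace, not part of flang

// Index values as they appear in the patch.
constexpr unsigned kDefaultAllocator = 0;
constexpr unsigned kPinnedAllocatorPos = 1;
constexpr unsigned kDeviceAllocatorPos = 2;
constexpr unsigned kManagedAllocatorPos = 3;
constexpr unsigned kUnifiedAllocatorPos = 4;
constexpr unsigned kMaxAllocator = 7; // mirrors MAX_ALLOCATOR in the patch

using AllocFct = void *(*)(std::size_t);
using FreeFct = void (*)(void *);

// Assumed shape of a registry entry: one alloc/free function pair.
struct Allocator {
  AllocFct alloc;
  FreeFct dealloc;
};

// Assumed minimal registry: a fixed-size table indexed by allocator position.
class Registry {
public:
  void Register(unsigned pos, Allocator a) {
    assert(pos < kMaxAllocator && "allocator index out of range");
    table_[pos] = a;
  }
  const Allocator &Get(unsigned pos) const {
    assert(pos < kMaxAllocator && "allocator index out of range");
    return table_[pos];
  }

private:
  std::array<Allocator, kMaxAllocator> table_{}; // entries start as null
};

// Host malloc/free stand in for the CUDA managed allocation calls.
inline void *AllocManaged(std::size_t n) { return std::malloc(n); }
inline void FreeManaged(void *p) { std::free(p); }

// As in the patch: the unified pair forwards to the managed pair for now.
inline void *AllocUnified(std::size_t n) { return AllocManaged(n); }
inline void FreeUnified(void *p) { FreeManaged(p); }

// Hypothetical stand-in for Fortran::common::CUDADataAttr.
enum class DataAttr { None, Pinned, Device, Managed, Unified };

// After the patch, Managed and Unified map to distinct indices.
inline unsigned AllocatorIndexFor(DataAttr attr) {
  switch (attr) {
  case DataAttr::Pinned:
    return kPinnedAllocatorPos;
  case DataAttr::Device:
    return kDeviceAllocatorPos;
  case DataAttr::Managed:
    return kManagedAllocatorPos;
  case DataAttr::Unified:
    return kUnifiedAllocatorPos;
  case DataAttr::None:
    break;
  }
  return kDefaultAllocator;
}

// Registers only the two entries relevant to this sketch.
inline Registry MakeRegistry() {
  Registry r;
  r.Register(kManagedAllocatorPos, {&AllocManaged, &FreeManaged});
  r.Register(kUnifiedAllocatorPos, {&AllocUnified, &FreeUnified});
  return r;
}

} // namespace sketch
```

Because unified data now has its own index and its own registry entry, a later change to the unified allocation strategy would only need to touch the unified function pair, leaving the managed path and the lowering code unchanged.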

clementval requested review from wangzpgi and vzakhari August 6, 2024 17:48

llvmbot added the flang:runtime, flang, and flang:fir-hlfir labels Aug 6, 2024

clementval added the flang:cuf label Aug 6, 2024

wangzpgi approved these changes Aug 6, 2024

[flang][cuda] Defined allocator for unified data

da6b36a

clementval force-pushed the cuf_unified_allocator branch from a8497fa to da6b36a Compare August 6, 2024 21:07

clementval merged commit 388b632 into llvm:main Aug 6, 2024
5 of 6 checks passed

clementval deleted the cuf_unified_allocator branch August 6, 2024 21:30
