[libc] Use the NVIDIA device allocator for GPU malloc #124277

jhuber6 · 2025-01-24T14:37:53Z

Summary:
This is a blocker on another patch in the OpenMP runtime. The problem is
that NVIDIA truly doesn't handle RPC-based allocations very well. It
cannot reliably update the MMU while a kernel is running and it will
usually deadlock if called from a separate thread due to internal use of
TLS.

This patch just removes the definition of malloc and free for NVPTX.
The result here is that they will be undefined, which is the cue for the
nvlink linker to define them for us. So, as far as libc is concerned
it still implements malloc.

Summary: This is a blocker on another patch in the OpenMP runtime. The problem is that NVIDIA truly doesn't handle RPC-based allocations very well. It cannot reliably update the MMU while a kernel is running and it will usually deadlock if called from a separate thread due to internal use of TLS. This patch just removes the definition of `malloc` and `free` for NVPTX. The result here is that they will be undefined, which is the cue for the `nvlink` linker to define them for us. So, as far as `libc` is concerned it still implements malloc.

llvmbot · 2025-01-24T14:38:35Z

@llvm/pr-subscribers-libc

Author: Joseph Huber (jhuber6)

Changes

Summary:
This is a blocker on another patch in the OpenMP runtime. The problem is
that NVIDIA truly doesn't handle RPC-based allocations very well. It
cannot reliably update the MMU while a kernel is running and it will
usually deadlock if called from a separate thread due to internal use of
TLS.

This patch just removes the definition of malloc and free for NVPTX.
The result here is that they will be undefined, which is the cue for the
nvlink linker to define them for us. So, as far as libc is concerned
it still implements malloc.

Full diff: https://github.com/llvm/llvm-project/pull/124277.diff

3 Files Affected:

(modified) libc/src/stdlib/gpu/free.cpp (+4)
(modified) libc/src/stdlib/gpu/malloc.cpp (+4)
(modified) libc/test/src/stdlib/CMakeLists.txt (+2-1)

diff --git a/libc/src/stdlib/gpu/free.cpp b/libc/src/stdlib/gpu/free.cpp
index 1f0e9ec7359740..6ef9d718315a5c 100644
--- a/libc/src/stdlib/gpu/free.cpp
+++ b/libc/src/stdlib/gpu/free.cpp
@@ -14,6 +14,10 @@
 
 namespace LIBC_NAMESPACE_DECL {
 
+// FIXME: For now we just default to the NVIDIA device allocator which is
+// always available on NVPTX targets. This will be implemented fully later.
+#ifndef LIBC_TARGET_ARCH_IS_NVPTX
 LLVM_LIBC_FUNCTION(void, free, (void *ptr)) { gpu::deallocate(ptr); }
+#endif
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/stdlib/gpu/malloc.cpp b/libc/src/stdlib/gpu/malloc.cpp
index 54f2d8843996ee..b5909cb9cb4d02 100644
--- a/libc/src/stdlib/gpu/malloc.cpp
+++ b/libc/src/stdlib/gpu/malloc.cpp
@@ -14,8 +14,12 @@
 
 namespace LIBC_NAMESPACE_DECL {
 
+// FIXME: For now we just default to the NVIDIA device allocator which is
+// always available on NVPTX targets. This will be implemented fully later.
+#ifndef LIBC_TARGET_ARCH_IS_NVPTX
 LLVM_LIBC_FUNCTION(void *, malloc, (size_t size)) {
   return gpu::allocate(size);
 }
+#endif
 
 } // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/test/src/stdlib/CMakeLists.txt b/libc/test/src/stdlib/CMakeLists.txt
index 8cc0428632ba39..aba76833be9d41 100644
--- a/libc/test/src/stdlib/CMakeLists.txt
+++ b/libc/test/src/stdlib/CMakeLists.txt
@@ -420,7 +420,8 @@ if(LLVM_LIBC_FULL_BUILD)
   )
 
   # Only baremetal and GPU has an in-tree 'malloc' implementation.
-  if(LIBC_TARGET_OS_IS_BAREMETAL OR LIBC_TARGET_OS_IS_GPU)
+  if((LIBC_TARGET_OS_IS_BAREMETAL OR LIBC_TARGET_OS_IS_GPU) AND
+      NOT LIBC_TARGET_ARCHITECTURE_IS_NVPTX)
     add_libc_test(
       malloc_test
       HERMETIC_TEST_ONLY

jhuber6 · 2025-01-24T15:34:37Z

This unblocks #112988

jhuber6 requested review from jplehr, lntue, michaelrj-google and shiltian January 24, 2025 14:37

llvmbot added the libc label Jan 24, 2025

shiltian approved these changes Jan 24, 2025

View reviewed changes

jhuber6 merged commit 256f40d into llvm:main Jan 24, 2025
7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[libc] Use the NVIDIA device allocator for GPU malloc #124277

[libc] Use the NVIDIA device allocator for GPU malloc #124277

Uh oh!

jhuber6 commented Jan 24, 2025

Uh oh!

llvmbot commented Jan 24, 2025

Uh oh!

jhuber6 commented Jan 24, 2025

Uh oh!

Uh oh!

Uh oh!

[libc] Use the NVIDIA device allocator for GPU malloc #124277

[libc] Use the NVIDIA device allocator for GPU malloc #124277

Uh oh!

Conversation

jhuber6 commented Jan 24, 2025

Uh oh!

llvmbot commented Jan 24, 2025

Uh oh!

jhuber6 commented Jan 24, 2025

Uh oh!

Uh oh!

Uh oh!