Commit 256f40d

[libc] Use the NVIDIA device allocator for GPU malloc (#124277)
Summary: This is a blocker on another patch in the OpenMP runtime. The problem is that NVIDIA does not handle RPC-based allocations well: it cannot reliably update the MMU while a kernel is running, and it will usually deadlock if called from a separate thread due to internal use of TLS. This patch removes the definitions of `malloc` and `free` for NVPTX. As a result they are left undefined, which is the cue for the `nvlink` linker to define them for us. So, as far as `libc` is concerned, it still implements malloc.
1 parent 37bf0a1 commit 256f40d
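
The guard pattern described in the summary can be sketched on the host as follows. This is only an illustration under stated assumptions, not the libc code: `DEMO_TARGET_IS_NVPTX` is a hypothetical macro standing in for `LIBC_TARGET_ARCH_IS_NVPTX`, and `demo::allocate` / `demo::deallocate` are hypothetical stand-ins for the RPC-backed `gpu::allocate` / `gpu::deallocate`. When the macro is defined, the two entry points stay declared but undefined, so resolving them is left to the linker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace demo {

// Hypothetical stand-ins for the RPC-backed gpu::allocate / gpu::deallocate.
inline void *allocate(std::size_t size) { return std::malloc(size); }
inline void deallocate(void *ptr) { std::free(ptr); }

// Declarations are always visible, so callers compile on every target.
void *demo_malloc(std::size_t size);
void demo_free(void *ptr);

// DEMO_TARGET_IS_NVPTX is a hypothetical macro playing the role of
// LIBC_TARGET_ARCH_IS_NVPTX. When it is defined, no definitions are emitted
// and the symbols remain undefined for the linker to resolve.
#ifndef DEMO_TARGET_IS_NVPTX
void *demo_malloc(std::size_t size) { return allocate(size); }
void demo_free(void *ptr) { deallocate(ptr); }
#endif

// Allocate an int, write to it, read it back, and release it.
inline bool round_trip() {
  int *p = static_cast<int *>(demo_malloc(sizeof(int)));
  assert(p != nullptr && "allocation failed");
  *p = 42;
  bool ok = (*p == 42);
  demo_free(p);
  return ok;
}

} // namespace demo
```

Building this sketch with `-DDEMO_TARGET_IS_NVPTX` and no other provider of `demo_malloc` / `demo_free` should fail at link time with undefined references, which mirrors the state the patch intentionally creates for `malloc` and `free` on NVPTX.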

3 files changed, +10 -1 lines changed


libc/src/stdlib/gpu/free.cpp

Lines changed: 4 additions & 0 deletions
@@ -14,6 +14,10 @@
 
 namespace LIBC_NAMESPACE_DECL {
 
+// FIXME: For now we just default to the NVIDIA device allocator which is
+// always available on NVPTX targets. This will be implemented fully later.
+#ifndef LIBC_TARGET_ARCH_IS_NVPTX
 LLVM_LIBC_FUNCTION(void, free, (void *ptr)) { gpu::deallocate(ptr); }
+#endif
 
 } // namespace LIBC_NAMESPACE_DECL

libc/src/stdlib/gpu/malloc.cpp

Lines changed: 4 additions & 0 deletions
@@ -14,8 +14,12 @@
 
 namespace LIBC_NAMESPACE_DECL {
 
+// FIXME: For now we just default to the NVIDIA device allocator which is
+// always available on NVPTX targets. This will be implemented fully later.
+#ifndef LIBC_TARGET_ARCH_IS_NVPTX
 LLVM_LIBC_FUNCTION(void *, malloc, (size_t size)) {
   return gpu::allocate(size);
 }
+#endif
 
 } // namespace LIBC_NAMESPACE_DECL

libc/test/src/stdlib/CMakeLists.txt

Lines changed: 2 additions & 1 deletion
@@ -420,7 +420,8 @@ if(LLVM_LIBC_FULL_BUILD)
   )
 
   # Only baremetal and GPU has an in-tree 'malloc' implementation.
-  if(LIBC_TARGET_OS_IS_BAREMETAL OR LIBC_TARGET_OS_IS_GPU)
+  if((LIBC_TARGET_OS_IS_BAREMETAL OR LIBC_TARGET_OS_IS_GPU) AND
+      NOT LIBC_TARGET_ARCHITECTURE_IS_NVPTX)
     add_libc_test(
       malloc_test
       HERMETIC_TEST_ONLY
