Skip to content

[Libomptarget] Remove temporary files in AMDGPU JIT impl #77980

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 16, 2024

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Jan 12, 2024

Summary:
This patch cleans up some of the JIT handling for AMDGPU as well as
removing its temporary files. Previously these would be left in the
temporary directory after the program was run. This costs some extra
time, but the correct solution to avoid that is to create a sufficient
entrypoint into ld.lld that we can simply pass a memory buffer into.

Summary:
This patch cleans up some of the JIT handling for AMDGPU as well as
removing its temporary files. Previously these would be left in the
temporary directory after the program was run. This costs some extra
time, but the correct solution to avoid that is to create a sufficient
entrypoint into `ld.lld` that we can simply pass a memory buffer into.
@llvmbot
Copy link
Member

llvmbot commented Jan 12, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Joseph Huber (jhuber6)

Changes

Summary:
This patch cleans up some of the JIT handling for AMDGPU as well as
removing its temporary files. Previously these would be left in the
temporary directory after the program was run. This costs some extra
time, but the correct solution to avoid that is to create a sufficient
entrypoint into ld.lld that we can simply pass a memory buffer into.


Full diff: https://github.com/llvm/llvm-project/pull/77980.diff

1 Files Affected:

  • (modified) openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp (+32-18)
diff --git a/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp b/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
index e7a38a93c9acbe..ba38ea0e34c56c 100644
--- a/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -38,6 +38,7 @@
 #include "llvm/Frontend/OpenMP/OMPConstants.h"
 #include "llvm/Frontend/OpenMP/OMPGridValues.h"
 #include "llvm/Support/Error.h"
+#include "llvm/Support/FileOutputBuffer.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Program.h"
@@ -1999,21 +2000,27 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
 
     // TODO: We should try to avoid materialization but there seems to be no
     // good linker interface w/o file i/o.
-    SmallString<128> LinkerOutputFilePath;
-    std::error_code EC = sys::fs::createTemporaryFile(
-        "amdgpu-pre-link-jit", ".out", LinkerOutputFilePath);
+    SmallString<128> LinkerInputFilePath;
+    std::error_code EC = sys::fs::createTemporaryFile("amdgpu-pre-link-jit",
+                                                      "o", LinkerInputFilePath);
     if (EC)
-      return createStringError(EC,
-                               "Failed to create temporary file for linker");
-
-    SmallString<128> LinkerInputFilePath = LinkerOutputFilePath;
-    LinkerInputFilePath.pop_back_n(2);
+      return Plugin::error("Failed to create temporary file for linker");
+
+    // Write the file's contents to the output file.
+    Expected<std::unique_ptr<FileOutputBuffer>> OutputOrErr =
+        FileOutputBuffer::create(LinkerInputFilePath, MB->getBuffer().size());
+    if (!OutputOrErr)
+      return OutputOrErr.takeError();
+    std::unique_ptr<FileOutputBuffer> Output = std::move(*OutputOrErr);
+    llvm::copy(MB->getBuffer(), Output->getBufferStart());
+    if (Error E = Output->commit())
+      return std::move(E);
 
-    auto FD = raw_fd_ostream(LinkerInputFilePath.data(), EC);
+    SmallString<128> LinkerOutputFilePath;
+    EC = sys::fs::createTemporaryFile("amdgpu-pre-link-jit", "so",
+                                      LinkerOutputFilePath);
     if (EC)
-      return createStringError(EC, "Failed to open temporary file for linker");
-    FD.write(MB->getBufferStart(), MB->getBufferSize());
-    FD.close();
+      return Plugin::error("Failed to create temporary file for linker");
 
     const auto &ErrorOrPath = sys::findProgramByName("lld");
     if (!ErrorOrPath)
@@ -2025,7 +2032,6 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
          "Using `%s` to link JITed amdgcn ouput.", LLDPath.c_str());
 
     std::string MCPU = "-plugin-opt=mcpu=" + getComputeUnitKind();
-
     StringRef Args[] = {LLDPath,
                         "-flavor",
                         "gnu",
@@ -2039,12 +2045,20 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
     std::string Error;
     int RC = sys::ExecuteAndWait(LLDPath, Args, std::nullopt, {}, 0, 0, &Error);
     if (RC)
-      return createStringError(inconvertibleErrorCode(),
-                               "Linking optimized bitcode failed: %s",
-                               Error.c_str());
+      return Plugin::error("Linking optimized bitcode failed: %s",
+                           Error.c_str());
+
+    auto BufferOrErr = MemoryBuffer::getFileOrSTDIN(LinkerOutputFilePath);
+    if (!BufferOrErr)
+      return Plugin::error("Failed to open temporary file for lld");
+
+    // Clean up the temporary files afterwards.
+    if (sys::fs::remove(LinkerOutputFilePath))
+      return Plugin::error("Failed to remove temporary file for lld");
+    if (sys::fs::remove(LinkerInputFilePath))
+      return Plugin::error("Failed to remove temporary file for lld");
 
-    return std::move(
-        MemoryBuffer::getFileOrSTDIN(LinkerOutputFilePath.data()).get());
+    return std::move(*BufferOrErr);
   }
 
   /// See GenericDeviceTy::getComputeUnitKind().

std::error_code EC = sys::fs::createTemporaryFile(
"amdgpu-pre-link-jit", ".out", LinkerOutputFilePath);
SmallString<128> LinkerInputFilePath;
std::error_code EC = sys::fs::createTemporaryFile("amdgpu-pre-link-jit",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this "guarantee" uniqueness in a multi-threaded / multi-process environment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No but that could only happen when multiple libraries are initialized all at once, or users explicitly dlopen libraries with OpenMP target offloading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's an issue, the underlying code more or less just tries random filenames until one works. There may be an astronomically slim chance that the stars align and both threads create the same random name at the same exact time and none of the other code detects it, but my gut says that the Linux kernel would handle that fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's sound as much of a guarantee as I was hoping for.

Copy link
Contributor

@shiltian shiltian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

@jhuber6 jhuber6 merged commit 89cdd48 into llvm:main Jan 16, 2024
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
Summary:
This patch cleans up some of the JIT handling for AMDGPU as well as
removing its temporary files. Previously these would be left in the
temporary directory after the program was run. This costs some extra
time, but the correct solution to avoid that is to create a sufficient
entrypoint into `ld.lld` that we can simply pass a memory buffer into.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants