Skip to content

[Clang] Suppress missing architecture error when doing LTO #100652

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 31, 2024

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Jul 25, 2024

Summary:
The nvlink-wrapper can do LTO now, which means we can still create
some LLVM-IR without needing an architecture. In the case that we try to
invoke nvlink internally, that will still fail. This patch simply
defers the error until later so we can use --lto-emit-llvm to get the
IR without specifying an architecture.

@jhuber6 jhuber6 requested review from Artem-B and jlebar July 25, 2024 20:57
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jul 25, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 25, 2024

@llvm/pr-subscribers-libc
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Joseph Huber (jhuber6)

Changes

Summary:
The nvlink-wrapper can do LTO now, which means we can still create
some LLVM-IR without needing an architecture. In the case that we try to
invoke nvlink internally, that will still fail. This patch simply
defers the error until later so we can use --lto-emit-llvm to get the
IR without specifying an architecture.


Full diff: https://github.com/llvm/llvm-project/pull/100652.diff

3 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Cuda.cpp (+5-3)
  • (modified) clang/test/Driver/cuda-cross-compiling.c (+7)
  • (modified) clang/tools/clang-nvlink-wrapper/ClangNVLinkWrapper.cpp (+7)
diff --git a/clang/lib/Driver/ToolChains/Cuda.cpp b/clang/lib/Driver/ToolChains/Cuda.cpp
index e98e574d6cc2b..6e10e3d006767 100644
--- a/clang/lib/Driver/ToolChains/Cuda.cpp
+++ b/clang/lib/Driver/ToolChains/Cuda.cpp
@@ -596,14 +596,16 @@ void NVPTX::Linker::ConstructJob(Compilation &C, const JobAction &JA,
     CmdArgs.push_back("-v");
 
   StringRef GPUArch = Args.getLastArgValue(options::OPT_march_EQ);
-  if (GPUArch.empty()) {
+  if (GPUArch.empty() && !C.getDriver().isUsingLTO()) {
     C.getDriver().Diag(diag::err_drv_offload_missing_gpu_arch)
         << getToolChain().getArchName() << getShortName();
     return;
   }
 
-  CmdArgs.push_back("-arch");
-  CmdArgs.push_back(Args.MakeArgString(GPUArch));
+  if (!GPUArch.empty()) {
+    CmdArgs.push_back("-arch");
+    CmdArgs.push_back(Args.MakeArgString(GPUArch));
+  }
 
   if (Args.hasArg(options::OPT_ptxas_path_EQ))
     CmdArgs.push_back(Args.MakeArgString(
diff --git a/clang/test/Driver/cuda-cross-compiling.c b/clang/test/Driver/cuda-cross-compiling.c
index c2e538c25329e..5f24e7a5accb0 100644
--- a/clang/test/Driver/cuda-cross-compiling.c
+++ b/clang/test/Driver/cuda-cross-compiling.c
@@ -84,6 +84,13 @@
 // MISSING: error: must pass in an explicit nvptx64 gpu architecture to 'ptxas'
 // MISSING: error: must pass in an explicit nvptx64 gpu architecture to 'nvlink'
 
+// Do not error when performing LTO.
+//
+// RUN: %clang -target nvptx64-nvidia-cuda -flto %s -### 2>&1 \
+// RUN:   | FileCheck -check-prefix=MISSING-LTO %s
+
+// MISSING-LTO-NOT: error: must pass in an explicit nvptx64 gpu architecture to 'nvlink'
+
 // RUN: %clang -target nvptx64-nvidia-cuda -flto -c %s -### 2>&1 \
 // RUN:   | FileCheck -check-prefix=GENERIC %s
 // RUN: %clang -target nvptx64-nvidia-cuda -march=sm_52 -march=generic -flto -c %s -### 2>&1 \
diff --git a/clang/tools/clang-nvlink-wrapper/ClangNVLinkWrapper.cpp b/clang/tools/clang-nvlink-wrapper/ClangNVLinkWrapper.cpp
index 3885166e76ca7..ac60c96722c65 100644
--- a/clang/tools/clang-nvlink-wrapper/ClangNVLinkWrapper.cpp
+++ b/clang/tools/clang-nvlink-wrapper/ClangNVLinkWrapper.cpp
@@ -302,6 +302,9 @@ Expected<StringRef> runPTXAs(StringRef File, const ArgList &Args) {
       findProgram(Args, "ptxas", {CudaPath + "/bin", GivenPath});
   if (!PTXAsPath)
     return PTXAsPath.takeError();
+  if (!Args.hasArg(OPT_arch))
+    return createStringError(
+        "must pass in an explicit nvptx64 gpu architecture to 'ptxas'");
 
   auto TempFileOrErr = createTempFile(
       Args, sys::path::stem(Args.getLastArgValue(OPT_o, "a.out")), "cubin");
@@ -693,6 +696,10 @@ Error runNVLink(ArrayRef<StringRef> Files, const ArgList &Args) {
   if (!NVLinkPath)
     return NVLinkPath.takeError();
 
+  if (!Args.hasArg(OPT_arch))
+    return createStringError(
+        "must pass in an explicit nvptx64 gpu architecture to 'nvlink'");
+
   ArgStringList NewLinkerArgs;
   for (const opt::Arg *Arg : Args) {
     // Do not forward arguments only intended for the linker wrapper.

jhuber6 added 2 commits July 26, 2024 16:40
Summary:
The `nvlink-wrapper` can do LTO now, which means we can still create
some LLVM-IR without needing an architecture. In the case that we try to
invoke `nvlink` internally, that will still fail. This patch simply
defers the error until later so we can use `--lto-emit-llvm` to get the
IR without specifying an architecture.
@jhuber6
Copy link
Contributor Author

jhuber6 commented Jul 31, 2024

Ping, would like to remove the hacky code from libc.

@jhuber6 jhuber6 merged commit 2bf58f5 into llvm:main Jul 31, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category libc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants