Skip to content

[Clang] Forward arguments to the device compiler better #125957

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 6, 2025

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Feb 5, 2025

Summary:
Currently we have a subset of arguments that are handled specially to
keep them consistent between host and device compiles, however, this is
extremely hacky as it only works on a few predetermined options. This is
a holdover from the days before the linker wrapper shuttled all of its
arguments through clang. Now that we just use clang, all we need to do
is just use the --device-compiler= option to forward it there and let
the normal toolchain handle it.

For example,

clang -fopenmp --offload-arch=gfx1030,sm_89 -Xarch_nvptx64 -O3 -foffload-lto

will forward the -O3 to the LTO compilation only for the NVPTX
compilation.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Feb 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 5, 2025

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-backend-amdgpu

Author: Joseph Huber (jhuber6)

Changes

Summary:
Currently we have a subset of arguments that are handled specially to
keep them consistent between host and device compiles, however, this is
extremely hacky as it only works on a few predetermined options. This is
a holdover from the days before the linker wrapper shuttled all of its
arguments through clang. Now that we just use clang, all we need to do
is just use the --device-compiler= option to forward it there and let
the normal toolchain handle it.

For example,

clang -fopenmp --offload-arch=gfx1030,sm_89 -Xarch_nvptx64 -O3 -foffload-lto

will forward the -O3 to the LTO compilation only for the NVPTX
compilation.


Patch is 41.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125957.diff

4 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+211-212)
  • (modified) clang/test/Driver/amdgpu-openmp-sanitize-options.c (+2-2)
  • (modified) clang/test/Driver/amdgpu-openmp-toolchain.c (+2-2)
  • (modified) clang/test/Driver/openmp-offload.c (+2-2)
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 27de34634660c3..b957fdc3b4a0d2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -92,8 +92,8 @@ static void CheckCodeGenerationOptions(const Driver &D, const ArgList &Args) {
   if (Args.hasArg(options::OPT_static))
     if (const Arg *A =
             Args.getLastArg(options::OPT_dynamic, options::OPT_mdynamic_no_pic))
-      D.Diag(diag::err_drv_argument_not_allowed_with) << A->getAsString(Args)
-                                                      << "-static";
+      D.Diag(diag::err_drv_argument_not_allowed_with)
+          << A->getAsString(Args) << "-static";
 }
 
 /// Apply \a Work on the current tool chain \a RegularToolChain and any other
@@ -139,8 +139,8 @@ forAllAssociatedToolChains(Compilation &C, const JobAction &JA,
 /// parameter in reciprocal argument strings. Return false if there is an error
 /// parsing the refinement step. Otherwise, return true and set the Position
 /// of the refinement step in the input string.
-static bool getRefinementStep(StringRef In, const Driver &D,
-                              const Arg &A, size_t &Position) {
+static bool getRefinementStep(StringRef In, const Driver &D, const Arg &A,
+                              size_t &Position) {
   const char RefinementStepToken = ':';
   Position = In.find(RefinementStepToken);
   if (Position != StringRef::npos) {
@@ -259,7 +259,8 @@ static void ParseMRecip(const Driver &D, const ArgList &Args,
 
     // If the precision was not specified, also mark the double and half entry
     // as found.
-    if (ValBase.back() != 'f' && ValBase.back() != 'd' && ValBase.back() != 'h') {
+    if (ValBase.back() != 'f' && ValBase.back() != 'd' &&
+        ValBase.back() != 'h') {
       OptionStrings[ValBase.str() + 'd'] = true;
       OptionStrings[ValBase.str() + 'h'] = true;
     }
@@ -498,7 +499,7 @@ static void addMacroPrefixMapArg(const Driver &D, const ArgList &Args,
 
 /// Add a CC1 and CC1AS option to specify the coverage file path prefix map.
 static void addCoveragePrefixMapArg(const Driver &D, const ArgList &Args,
-                                   ArgStringList &CmdArgs) {
+                                    ArgStringList &CmdArgs) {
   for (const Arg *A : Args.filtered(options::OPT_ffile_prefix_map_EQ,
                                     options::OPT_fcoverage_prefix_map_EQ)) {
     StringRef Map = A->getValue();
@@ -589,13 +590,12 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C,
 
   auto *CSPGOGenerateArg = getLastCSProfileGenerateArg(Args);
 
-  auto *ProfileGenerateArg = Args.getLastArg(
-      options::OPT_fprofile_instr_generate,
-      options::OPT_fprofile_instr_generate_EQ,
-      options::OPT_fno_profile_instr_generate);
-  if (ProfileGenerateArg &&
-      ProfileGenerateArg->getOption().matches(
-          options::OPT_fno_profile_instr_generate))
+  auto *ProfileGenerateArg =
+      Args.getLastArg(options::OPT_fprofile_instr_generate,
+                      options::OPT_fprofile_instr_generate_EQ,
+                      options::OPT_fno_profile_instr_generate);
+  if (ProfileGenerateArg && ProfileGenerateArg->getOption().matches(
+                                options::OPT_fno_profile_instr_generate))
     ProfileGenerateArg = nullptr;
 
   if (PGOGenerateArg && ProfileGenerateArg)
@@ -1159,8 +1159,8 @@ void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA,
       }
 
       if (ThroughHeader.empty()) {
-        CmdArgs.push_back(Args.MakeArgString(
-            Twine("-pch-through-hdrstop-") + (YcArg ? "create" : "use")));
+        CmdArgs.push_back(Args.MakeArgString(Twine("-pch-through-hdrstop-") +
+                                             (YcArg ? "create" : "use")));
       } else {
         CmdArgs.push_back(
             Args.MakeArgString(Twine("-pch-through-header=") + ThroughHeader));
@@ -1199,8 +1199,8 @@ void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA,
           continue;
         } else {
           // Ignore the PCH if not first on command line and emit warning.
-          D.Diag(diag::warn_drv_pch_not_first_include) << P
-                                                       << A->getAsString(Args);
+          D.Diag(diag::warn_drv_pch_not_first_include)
+              << P << A->getAsString(Args);
         }
       }
     } else if (A->getOption().matches(options::OPT_isystem_after)) {
@@ -1422,8 +1422,9 @@ static void renderRemarksOptions(const ArgList &Args, ArgStringList &CmdArgs,
       if (Arg *FinalOutput = Args.getLastArg(options::OPT_o))
         F = FinalOutput->getValue();
     } else {
-      if (Format != "yaml" && // For YAML, keep the original behavior.
-          Triple.isOSDarwin() && // Enable this only on darwin, since it's the only platform supporting .dSYM bundles.
+      if (Format != "yaml" &&    // For YAML, keep the original behavior.
+          Triple.isOSDarwin() && // Enable this only on darwin, since it's the
+                                 // only platform supporting .dSYM bundles.
           Output.isFilename())
         F = Output.getFilename();
     }
@@ -1517,7 +1518,7 @@ void AddUnalignedAccessWarning(ArgStringList &CmdArgs) {
       StringRef(*StrictAlignIter) == "+strict-align")
     CmdArgs.push_back("-Wunaligned-access");
 }
-}
+} // namespace
 
 // Each combination of options here forms a signing schema, and in most cases
 // each signing schema is its own incompatible ABI. The default values of the
@@ -1797,7 +1798,7 @@ void RenderAArch64ABI(const llvm::Triple &Triple, const ArgList &Args,
   CmdArgs.push_back("-target-abi");
   CmdArgs.push_back(ABIName);
 }
-}
+} // namespace
 
 void Clang::AddAArch64TargetArgs(const ArgList &Args,
                                  ArgStringList &CmdArgs) const {
@@ -1833,17 +1834,19 @@ void Clang::AddAArch64TargetArgs(const ArgList &Args,
         Val == "1024+" || Val == "2048+") {
       unsigned Bits = 0;
       if (!Val.consume_back("+")) {
-        bool Invalid = Val.getAsInteger(10, Bits); (void)Invalid;
+        bool Invalid = Val.getAsInteger(10, Bits);
+        (void)Invalid;
         assert(!Invalid && "Failed to parse value");
         CmdArgs.push_back(
             Args.MakeArgString("-mvscale-max=" + llvm::Twine(Bits / 128)));
       }
 
-      bool Invalid = Val.getAsInteger(10, Bits); (void)Invalid;
+      bool Invalid = Val.getAsInteger(10, Bits);
+      (void)Invalid;
       assert(!Invalid && "Failed to parse value");
       CmdArgs.push_back(
           Args.MakeArgString("-mvscale-min=" + llvm::Twine(Bits / 128)));
-    // Silently drop requests for vector-length agnostic code as it's implied.
+      // Silently drop requests for vector-length agnostic code as it's implied.
     } else if (Val != "scalable")
       // Handle the unsupported values passed to msve-vector-bits.
       D.Diag(diag::err_drv_unsupported_option_argument)
@@ -2126,9 +2129,9 @@ void Clang::AddPPCTargetArgs(const ArgList &Args,
       ABIName = "elfv2";
       A->claim();
     } else if (V != "altivec")
-      // The ppc64 linux abis are all "altivec" abis by default. Accept and ignore
-      // the option if given as we don't have backend support for any targets
-      // that don't use the altivec abi.
+      // The ppc64 linux abis are all "altivec" abis by default. Accept and
+      // ignore the option if given as we don't have backend support for any
+      // targets that don't use the altivec abi.
       ABIName = A->getValue();
   }
   if (IEEELongDouble)
@@ -2279,7 +2282,7 @@ void Clang::AddSystemZTargetArgs(const ArgList &Args,
   if (HasBackchain && HasPackedStack && !HasSoftFloat) {
     const Driver &D = getToolChain().getDriver();
     D.Diag(diag::err_drv_unsupported_opt)
-      << "-mpacked-stack -mbackchain -mhard-float";
+        << "-mpacked-stack -mbackchain -mhard-float";
   }
   if (HasBackchain)
     CmdArgs.push_back("-mbackchain");
@@ -2434,7 +2437,8 @@ void Clang::AddVETargetArgs(const ArgList &Args, ArgStringList &CmdArgs) const {
 
 void Clang::DumpCompilationDatabase(Compilation &C, StringRef Filename,
                                     StringRef Target, const InputInfo &Output,
-                                    const InputInfo &Input, const ArgList &Args) const {
+                                    const InputInfo &Input,
+                                    const ArgList &Args) const {
   // If this is a dry run, do not create the compilation database file.
   if (C.getArgs().hasArg(options::OPT__HASH_HASH_HASH))
     return;
@@ -2448,8 +2452,8 @@ void Clang::DumpCompilationDatabase(Compilation &C, StringRef Filename,
         Filename, EC,
         llvm::sys::fs::OF_TextWithCRLF | llvm::sys::fs::OF_Append);
     if (EC) {
-      D.Diag(clang::diag::err_drv_compilationdatabase) << Filename
-                                                       << EC.message();
+      D.Diag(clang::diag::err_drv_compilationdatabase)
+          << Filename << EC.message();
       return;
     }
     CompilationDatabase = std::move(File);
@@ -2475,7 +2479,7 @@ void Clang::DumpCompilationDatabase(Compilation &C, StringRef Filename,
   CDB << ", \"" << escape(Input.getFilename()) << "\"";
   if (Output.isFilename())
     CDB << ", \"-o\", \"" << escape(Output.getFilename()) << "\"";
-  for (auto &A: Args) {
+  for (auto &A : Args) {
     auto &O = A->getOption();
     // Skip language selection, which is positional.
     if (O.getID() == options::OPT_x)
@@ -2494,7 +2498,7 @@ void Clang::DumpCompilationDatabase(Compilation &C, StringRef Filename,
     // All other arguments are quoted and appended.
     ArgStringList ASL;
     A->render(Args, ASL);
-    for (auto &it: ASL)
+    for (auto &it : ASL)
       CDB << ", \"" << escape(it) << "\"";
   }
   Buf = "--target=";
@@ -2921,7 +2925,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
   bool AssociativeMath = false;
   bool ReciprocalMath = false;
   bool SignedZeros = true;
-  bool TrappingMath = false; // Implemented via -ffp-exception-behavior
+  bool TrappingMath = false;        // Implemented via -ffp-exception-behavior
   bool TrappingMathPresent = false; // Is trapping-math in args, and not
                                     // overriden by ffp-exception-behavior?
   bool RoundingFPMath = false;
@@ -3022,7 +3026,8 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
 
     switch (A->getOption().getID()) {
     // If this isn't an FP option skip the claim below
-    default: continue;
+    default:
+      continue;
 
     case options::OPT_fcx_limited_range:
       if (GccRangeComplexOption.empty()) {
@@ -3159,20 +3164,48 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
     }
 
     // Options controlling individual features
-    case options::OPT_fhonor_infinities:    HonorINFs = true;         break;
-    case options::OPT_fno_honor_infinities: HonorINFs = false;        break;
-    case options::OPT_fhonor_nans:          HonorNaNs = true;         break;
-    case options::OPT_fno_honor_nans:       HonorNaNs = false;        break;
-    case options::OPT_fapprox_func:         ApproxFunc = true;        break;
-    case options::OPT_fno_approx_func:      ApproxFunc = false;       break;
-    case options::OPT_fmath_errno:          MathErrno = true;         break;
-    case options::OPT_fno_math_errno:       MathErrno = false;        break;
-    case options::OPT_fassociative_math:    AssociativeMath = true;   break;
-    case options::OPT_fno_associative_math: AssociativeMath = false;  break;
-    case options::OPT_freciprocal_math:     ReciprocalMath = true;    break;
-    case options::OPT_fno_reciprocal_math:  ReciprocalMath = false;   break;
-    case options::OPT_fsigned_zeros:        SignedZeros = true;       break;
-    case options::OPT_fno_signed_zeros:     SignedZeros = false;      break;
+    case options::OPT_fhonor_infinities:
+      HonorINFs = true;
+      break;
+    case options::OPT_fno_honor_infinities:
+      HonorINFs = false;
+      break;
+    case options::OPT_fhonor_nans:
+      HonorNaNs = true;
+      break;
+    case options::OPT_fno_honor_nans:
+      HonorNaNs = false;
+      break;
+    case options::OPT_fapprox_func:
+      ApproxFunc = true;
+      break;
+    case options::OPT_fno_approx_func:
+      ApproxFunc = false;
+      break;
+    case options::OPT_fmath_errno:
+      MathErrno = true;
+      break;
+    case options::OPT_fno_math_errno:
+      MathErrno = false;
+      break;
+    case options::OPT_fassociative_math:
+      AssociativeMath = true;
+      break;
+    case options::OPT_fno_associative_math:
+      AssociativeMath = false;
+      break;
+    case options::OPT_freciprocal_math:
+      ReciprocalMath = true;
+      break;
+    case options::OPT_fno_reciprocal_math:
+      ReciprocalMath = false;
+      break;
+    case options::OPT_fsigned_zeros:
+      SignedZeros = true;
+      break;
+    case options::OPT_fno_signed_zeros:
+      SignedZeros = false;
+      break;
     case options::OPT_ftrapping_math:
       if (!TrappingMathPresent && !FPExceptionBehavior.empty() &&
           FPExceptionBehavior != "strict")
@@ -3413,8 +3446,8 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
           << VecLibArg->getAsString(Args);
   }
 
- if (AssociativeMath && ReciprocalMath && !SignedZeros && ApproxFunc &&
-     !TrappingMath)
+  if (AssociativeMath && ReciprocalMath && !SignedZeros && ApproxFunc &&
+      !TrappingMath)
     CmdArgs.push_back("-funsafe-math-optimizations");
 
   if (!SignedZeros)
@@ -3456,8 +3489,8 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
     CmdArgs.push_back(Args.MakeArgString("-fno-rounding-math"));
 
   if (!FPExceptionBehavior.empty())
-    CmdArgs.push_back(Args.MakeArgString("-ffp-exception-behavior=" +
-                      FPExceptionBehavior));
+    CmdArgs.push_back(
+        Args.MakeArgString("-ffp-exception-behavior=" + FPExceptionBehavior));
 
   if (!FPEvalMethod.empty())
     CmdArgs.push_back(Args.MakeArgString("-ffp-eval-method=" + FPEvalMethod));
@@ -3559,8 +3592,7 @@ static void RenderAnalyzerOptions(const ArgList &Args, ArgStringList &CmdArgs,
       CmdArgs.push_back("-analyzer-checker=osx");
       CmdArgs.push_back(
           "-analyzer-checker=security.insecureAPI.decodeValueOfObjCType");
-    }
-    else if (Triple.isOSFuchsia())
+    } else if (Triple.isOSFuchsia())
       CmdArgs.push_back("-analyzer-checker=fuchsia");
 
     CmdArgs.push_back("-analyzer-checker=deadcode");
@@ -3569,7 +3601,8 @@ static void RenderAnalyzerOptions(const ArgList &Args, ArgStringList &CmdArgs,
       CmdArgs.push_back("-analyzer-checker=cplusplus");
 
     if (!Triple.isPS()) {
-      CmdArgs.push_back("-analyzer-checker=security.insecureAPI.UncheckedReturn");
+      CmdArgs.push_back(
+          "-analyzer-checker=security.insecureAPI.UncheckedReturn");
       CmdArgs.push_back("-analyzer-checker=security.insecureAPI.getpw");
       CmdArgs.push_back("-analyzer-checker=security.insecureAPI.gets");
       CmdArgs.push_back("-analyzer-checker=security.insecureAPI.mktemp");
@@ -3892,8 +3925,7 @@ static void RenderOpenCLOptions(const ArgList &Args, ArgStringList &CmdArgs,
       options::OPT_cl_mad_enable,
       options::OPT_cl_no_signed_zeros,
       options::OPT_cl_fp32_correctly_rounded_divide_sqrt,
-      options::OPT_cl_uniform_work_group_size
-  };
+      options::OPT_cl_uniform_work_group_size};
 
   if (Arg *A = Args.getLastArg(options::OPT_cl_std_EQ)) {
     std::string CLStdStr = std::string("-cl-std=") + A->getValue();
@@ -4247,10 +4279,9 @@ static bool RenderModulesOptions(Compilation &C, const Driver &D,
 static void RenderCharacterOptions(const ArgList &Args, const llvm::Triple &T,
                                    ArgStringList &CmdArgs) {
   // -fsigned-char is default.
-  if (const Arg *A = Args.getLastArg(options::OPT_fsigned_char,
-                                     options::OPT_fno_signed_char,
-                                     options::OPT_funsigned_char,
-                                     options::OPT_fno_unsigned_char)) {
+  if (const Arg *A = Args.getLastArg(
+          options::OPT_fsigned_char, options::OPT_fno_signed_char,
+          options::OPT_funsigned_char, options::OPT_fno_unsigned_char)) {
     if (A->getOption().matches(options::OPT_funsigned_char) ||
         A->getOption().matches(options::OPT_fno_signed_char)) {
       CmdArgs.push_back("-fno-signed-char");
@@ -4344,9 +4375,8 @@ static void RenderObjCOptions(const ToolChain &TC, const Driver &D,
     auto *Arg = Args.getLastArg(
         options::OPT_fobjc_convert_messages_to_runtime_calls,
         options::OPT_fno_objc_convert_messages_to_runtime_calls);
-    if (Arg &&
-        Arg->getOption().matches(
-            options::OPT_fno_objc_convert_messages_to_runtime_calls))
+    if (Arg && Arg->getOption().matches(
+                   options::OPT_fno_objc_convert_messages_to_runtime_calls))
       CmdArgs.push_back("-fno-objc-convert-messages-to-runtime-calls");
   }
 
@@ -5191,8 +5221,8 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
     bool Failure =
         Triple.getArchName().substr(Offset).consumeInteger(10, Version);
     if (Failure || Version < 7)
-      D.Diag(diag::err_target_unsupported_arch) << Triple.getArchName()
-                                                << TripleStr;
+      D.Diag(diag::err_target_unsupported_arch)
+          << Triple.getArchName() << TripleStr;
   }
 
   // Push all default warning arguments that are specific to
@@ -5377,8 +5407,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
             Twine("-flto=") + (LTOMode == LTOK_Thin ? "thin" : "full")));
         // PS4 uses the legacy LTO API, which does not support some of the
         // features enabled by -flto-unit.
-        if (!RawTriple.isPS4() ||
-            (D.getLTOMode() == LTOK_Full) || !UnifiedLTO)
+        if (!RawTriple.isPS4() || (D.getLTOMode() == LTOK_Full) || !UnifiedLTO)
           CmdArgs.push_back("-flto-unit");
       }
     }
@@ -5491,7 +5520,8 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
 
     // Render ABI arguments
     switch (TC.getArch()) {
-    default: break;
+    default:
+      break;
     case llvm::Triple::arm:
     case llvm::Triple::armeb:
     case llvm::Triple::thumbeb:
@@ -5807,7 +5837,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   if (Arg *A = Args.getLastArg(options::OPT_mabi_EQ_quadword_atomics)) {
     if (!Triple.isOSAIX() || Triple.isPPC32())
       D.Diag(diag::err_drv_unsupported_opt_for_target)
-        << A->getSpelling() << RawTriple.str();
+          << A->getSpelling() << RawTriple.str();
     CmdArgs.push_back("-mabi=quadword-atomics");
   }
 
@@ -5882,7 +5912,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   }
 
   CodeGenOptions::FramePointerKind FPKeepKind =
-                  getFramePointerKind(Args, RawTriple);
+      getFramePointerKind(Args, RawTriple);
   const char *FPKeepKindStr = nullptr;
   switch (FPKeepKind) {
   case CodeGenOptions::FramePointerKind::None:
@@ -6088,10 +6118,10 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   // complicated ways.
   auto SanitizeArgs = TC.getSanitizerArgs(Args);
 
-  bool IsAsyncUnwindTablesDefault =
-      TC.getDefaultUnwindTableLevel(Args) == ToolChain::UnwindTableLevel::Asynchronous;
-  bool IsSyncUnwindTablesDefault =
-      TC.getDefaultUnwindTableLevel(Args) == ToolChain::UnwindTableLevel::Synchronous;
+  bool IsAsyncUnwindTablesDefault = TC.getDefaultUnwindTableLevel(Args) ==
+                                    ToolChain::UnwindTableLevel::Asynchronous;
+  bool IsSyncUnwindTablesDefault = TC.getDefaultUnwindTableLevel(Args) ==
+                                   ToolChain::UnwindTableLevel::Synchronous;
 
   bool AsyncUnwindTables = Args.hasFlag(
       options::OPT_fasynchronous_unwind_tables,
@@ -6104,7 +6134,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   if (AsyncUnwindTables)
     CmdArgs.push_back("-funwind-tables=2");
   else if (UnwindTables)
-     CmdArgs.push_back("-funwind-tables=1");
+    CmdArgs.push_back("-funwind-tables=1");
 
   // Prepare `-aux-target-cpu` and `-aux-target-feature` unless
   // `--gpu-use-a...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Feb 5, 2025

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)

Changes

Summary:
Currently we have a subset of arguments that are handled specially to
keep them consistent between host and device compiles, however, this is
extremely hacky as it only works on a few predetermined options. This is
a holdover from the days before the linker wrapper shuttled all of its
arguments through clang. Now that we just use clang, all we need to do
is just use the --device-compiler= option to forward it there and let
the normal toolchain handle it.

For example,

clang -fopenmp --offload-arch=gfx1030,sm_89 -Xarch_nvptx64 -O3 -foffload-lto

will forward the -O3 to the LTO compilation only for the NVPTX
compilation.


Patch is 41.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125957.diff

4 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+211-212)
  • (modified) clang/test/Driver/amdgpu-openmp-sanitize-options.c (+2-2)
  • (modified) clang/test/Driver/amdgpu-openmp-toolchain.c (+2-2)
  • (modified) clang/test/Driver/openmp-offload.c (+2-2)
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 27de34634660c3..b957fdc3b4a0d2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -92,8 +92,8 @@ static void CheckCodeGenerationOptions(const Driver &D, const ArgList &Args) {
   if (Args.hasArg(options::OPT_static))
     if (const Arg *A =
             Args.getLastArg(options::OPT_dynamic, options::OPT_mdynamic_no_pic))
-      D.Diag(diag::err_drv_argument_not_allowed_with) << A->getAsString(Args)
-                                                      << "-static";
+      D.Diag(diag::err_drv_argument_not_allowed_with)
+          << A->getAsString(Args) << "-static";
 }
 
 /// Apply \a Work on the current tool chain \a RegularToolChain and any other
@@ -139,8 +139,8 @@ forAllAssociatedToolChains(Compilation &C, const JobAction &JA,
 /// parameter in reciprocal argument strings. Return false if there is an error
 /// parsing the refinement step. Otherwise, return true and set the Position
 /// of the refinement step in the input string.
-static bool getRefinementStep(StringRef In, const Driver &D,
-                              const Arg &A, size_t &Position) {
+static bool getRefinementStep(StringRef In, const Driver &D, const Arg &A,
+                              size_t &Position) {
   const char RefinementStepToken = ':';
   Position = In.find(RefinementStepToken);
   if (Position != StringRef::npos) {
@@ -259,7 +259,8 @@ static void ParseMRecip(const Driver &D, const ArgList &Args,
 
     // If the precision was not specified, also mark the double and half entry
     // as found.
-    if (ValBase.back() != 'f' && ValBase.back() != 'd' && ValBase.back() != 'h') {
+    if (ValBase.back() != 'f' && ValBase.back() != 'd' &&
+        ValBase.back() != 'h') {
       OptionStrings[ValBase.str() + 'd'] = true;
       OptionStrings[ValBase.str() + 'h'] = true;
     }
@@ -498,7 +499,7 @@ static void addMacroPrefixMapArg(const Driver &D, const ArgList &Args,
 
 /// Add a CC1 and CC1AS option to specify the coverage file path prefix map.
 static void addCoveragePrefixMapArg(const Driver &D, const ArgList &Args,
-                                   ArgStringList &CmdArgs) {
+                                    ArgStringList &CmdArgs) {
   for (const Arg *A : Args.filtered(options::OPT_ffile_prefix_map_EQ,
                                     options::OPT_fcoverage_prefix_map_EQ)) {
     StringRef Map = A->getValue();
@@ -589,13 +590,12 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C,
 
   auto *CSPGOGenerateArg = getLastCSProfileGenerateArg(Args);
 
-  auto *ProfileGenerateArg = Args.getLastArg(
-      options::OPT_fprofile_instr_generate,
-      options::OPT_fprofile_instr_generate_EQ,
-      options::OPT_fno_profile_instr_generate);
-  if (ProfileGenerateArg &&
-      ProfileGenerateArg->getOption().matches(
-          options::OPT_fno_profile_instr_generate))
+  auto *ProfileGenerateArg =
+      Args.getLastArg(options::OPT_fprofile_instr_generate,
+                      options::OPT_fprofile_instr_generate_EQ,
+                      options::OPT_fno_profile_instr_generate);
+  if (ProfileGenerateArg && ProfileGenerateArg->getOption().matches(
+                                options::OPT_fno_profile_instr_generate))
     ProfileGenerateArg = nullptr;
 
   if (PGOGenerateArg && ProfileGenerateArg)
@@ -1159,8 +1159,8 @@ void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA,
       }
 
       if (ThroughHeader.empty()) {
-        CmdArgs.push_back(Args.MakeArgString(
-            Twine("-pch-through-hdrstop-") + (YcArg ? "create" : "use")));
+        CmdArgs.push_back(Args.MakeArgString(Twine("-pch-through-hdrstop-") +
+                                             (YcArg ? "create" : "use")));
       } else {
         CmdArgs.push_back(
             Args.MakeArgString(Twine("-pch-through-header=") + ThroughHeader));
@@ -1199,8 +1199,8 @@ void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA,
           continue;
         } else {
           // Ignore the PCH if not first on command line and emit warning.
-          D.Diag(diag::warn_drv_pch_not_first_include) << P
-                                                       << A->getAsString(Args);
+          D.Diag(diag::warn_drv_pch_not_first_include)
+              << P << A->getAsString(Args);
         }
       }
     } else if (A->getOption().matches(options::OPT_isystem_after)) {
@@ -1422,8 +1422,9 @@ static void renderRemarksOptions(const ArgList &Args, ArgStringList &CmdArgs,
       if (Arg *FinalOutput = Args.getLastArg(options::OPT_o))
         F = FinalOutput->getValue();
     } else {
-      if (Format != "yaml" && // For YAML, keep the original behavior.
-          Triple.isOSDarwin() && // Enable this only on darwin, since it's the only platform supporting .dSYM bundles.
+      if (Format != "yaml" &&    // For YAML, keep the original behavior.
+          Triple.isOSDarwin() && // Enable this only on darwin, since it's the
+                                 // only platform supporting .dSYM bundles.
           Output.isFilename())
         F = Output.getFilename();
     }
@@ -1517,7 +1518,7 @@ void AddUnalignedAccessWarning(ArgStringList &CmdArgs) {
       StringRef(*StrictAlignIter) == "+strict-align")
     CmdArgs.push_back("-Wunaligned-access");
 }
-}
+} // namespace
 
 // Each combination of options here forms a signing schema, and in most cases
 // each signing schema is its own incompatible ABI. The default values of the
@@ -1797,7 +1798,7 @@ void RenderAArch64ABI(const llvm::Triple &Triple, const ArgList &Args,
   CmdArgs.push_back("-target-abi");
   CmdArgs.push_back(ABIName);
 }
-}
+} // namespace
 
 void Clang::AddAArch64TargetArgs(const ArgList &Args,
                                  ArgStringList &CmdArgs) const {
@@ -1833,17 +1834,19 @@ void Clang::AddAArch64TargetArgs(const ArgList &Args,
         Val == "1024+" || Val == "2048+") {
       unsigned Bits = 0;
       if (!Val.consume_back("+")) {
-        bool Invalid = Val.getAsInteger(10, Bits); (void)Invalid;
+        bool Invalid = Val.getAsInteger(10, Bits);
+        (void)Invalid;
         assert(!Invalid && "Failed to parse value");
         CmdArgs.push_back(
             Args.MakeArgString("-mvscale-max=" + llvm::Twine(Bits / 128)));
       }
 
-      bool Invalid = Val.getAsInteger(10, Bits); (void)Invalid;
+      bool Invalid = Val.getAsInteger(10, Bits);
+      (void)Invalid;
       assert(!Invalid && "Failed to parse value");
       CmdArgs.push_back(
           Args.MakeArgString("-mvscale-min=" + llvm::Twine(Bits / 128)));
-    // Silently drop requests for vector-length agnostic code as it's implied.
+      // Silently drop requests for vector-length agnostic code as it's implied.
     } else if (Val != "scalable")
       // Handle the unsupported values passed to msve-vector-bits.
       D.Diag(diag::err_drv_unsupported_option_argument)
@@ -2126,9 +2129,9 @@ void Clang::AddPPCTargetArgs(const ArgList &Args,
       ABIName = "elfv2";
       A->claim();
     } else if (V != "altivec")
-      // The ppc64 linux abis are all "altivec" abis by default. Accept and ignore
-      // the option if given as we don't have backend support for any targets
-      // that don't use the altivec abi.
+      // The ppc64 linux abis are all "altivec" abis by default. Accept and
+      // ignore the option if given as we don't have backend support for any
+      // targets that don't use the altivec abi.
       ABIName = A->getValue();
   }
   if (IEEELongDouble)
@@ -2279,7 +2282,7 @@ void Clang::AddSystemZTargetArgs(const ArgList &Args,
   if (HasBackchain && HasPackedStack && !HasSoftFloat) {
     const Driver &D = getToolChain().getDriver();
     D.Diag(diag::err_drv_unsupported_opt)
-      << "-mpacked-stack -mbackchain -mhard-float";
+        << "-mpacked-stack -mbackchain -mhard-float";
   }
   if (HasBackchain)
     CmdArgs.push_back("-mbackchain");
@@ -2434,7 +2437,8 @@ void Clang::AddVETargetArgs(const ArgList &Args, ArgStringList &CmdArgs) const {
 
 void Clang::DumpCompilationDatabase(Compilation &C, StringRef Filename,
                                     StringRef Target, const InputInfo &Output,
-                                    const InputInfo &Input, const ArgList &Args) const {
+                                    const InputInfo &Input,
+                                    const ArgList &Args) const {
   // If this is a dry run, do not create the compilation database file.
   if (C.getArgs().hasArg(options::OPT__HASH_HASH_HASH))
     return;
@@ -2448,8 +2452,8 @@ void Clang::DumpCompilationDatabase(Compilation &C, StringRef Filename,
         Filename, EC,
         llvm::sys::fs::OF_TextWithCRLF | llvm::sys::fs::OF_Append);
     if (EC) {
-      D.Diag(clang::diag::err_drv_compilationdatabase) << Filename
-                                                       << EC.message();
+      D.Diag(clang::diag::err_drv_compilationdatabase)
+          << Filename << EC.message();
       return;
     }
     CompilationDatabase = std::move(File);
@@ -2475,7 +2479,7 @@ void Clang::DumpCompilationDatabase(Compilation &C, StringRef Filename,
   CDB << ", \"" << escape(Input.getFilename()) << "\"";
   if (Output.isFilename())
     CDB << ", \"-o\", \"" << escape(Output.getFilename()) << "\"";
-  for (auto &A: Args) {
+  for (auto &A : Args) {
     auto &O = A->getOption();
     // Skip language selection, which is positional.
     if (O.getID() == options::OPT_x)
@@ -2494,7 +2498,7 @@ void Clang::DumpCompilationDatabase(Compilation &C, StringRef Filename,
     // All other arguments are quoted and appended.
     ArgStringList ASL;
     A->render(Args, ASL);
-    for (auto &it: ASL)
+    for (auto &it : ASL)
       CDB << ", \"" << escape(it) << "\"";
   }
   Buf = "--target=";
@@ -2921,7 +2925,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
   bool AssociativeMath = false;
   bool ReciprocalMath = false;
   bool SignedZeros = true;
-  bool TrappingMath = false; // Implemented via -ffp-exception-behavior
+  bool TrappingMath = false;        // Implemented via -ffp-exception-behavior
   bool TrappingMathPresent = false; // Is trapping-math in args, and not
                                     // overriden by ffp-exception-behavior?
   bool RoundingFPMath = false;
@@ -3022,7 +3026,8 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
 
     switch (A->getOption().getID()) {
     // If this isn't an FP option skip the claim below
-    default: continue;
+    default:
+      continue;
 
     case options::OPT_fcx_limited_range:
       if (GccRangeComplexOption.empty()) {
@@ -3159,20 +3164,48 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
     }
 
     // Options controlling individual features
-    case options::OPT_fhonor_infinities:    HonorINFs = true;         break;
-    case options::OPT_fno_honor_infinities: HonorINFs = false;        break;
-    case options::OPT_fhonor_nans:          HonorNaNs = true;         break;
-    case options::OPT_fno_honor_nans:       HonorNaNs = false;        break;
-    case options::OPT_fapprox_func:         ApproxFunc = true;        break;
-    case options::OPT_fno_approx_func:      ApproxFunc = false;       break;
-    case options::OPT_fmath_errno:          MathErrno = true;         break;
-    case options::OPT_fno_math_errno:       MathErrno = false;        break;
-    case options::OPT_fassociative_math:    AssociativeMath = true;   break;
-    case options::OPT_fno_associative_math: AssociativeMath = false;  break;
-    case options::OPT_freciprocal_math:     ReciprocalMath = true;    break;
-    case options::OPT_fno_reciprocal_math:  ReciprocalMath = false;   break;
-    case options::OPT_fsigned_zeros:        SignedZeros = true;       break;
-    case options::OPT_fno_signed_zeros:     SignedZeros = false;      break;
+    case options::OPT_fhonor_infinities:
+      HonorINFs = true;
+      break;
+    case options::OPT_fno_honor_infinities:
+      HonorINFs = false;
+      break;
+    case options::OPT_fhonor_nans:
+      HonorNaNs = true;
+      break;
+    case options::OPT_fno_honor_nans:
+      HonorNaNs = false;
+      break;
+    case options::OPT_fapprox_func:
+      ApproxFunc = true;
+      break;
+    case options::OPT_fno_approx_func:
+      ApproxFunc = false;
+      break;
+    case options::OPT_fmath_errno:
+      MathErrno = true;
+      break;
+    case options::OPT_fno_math_errno:
+      MathErrno = false;
+      break;
+    case options::OPT_fassociative_math:
+      AssociativeMath = true;
+      break;
+    case options::OPT_fno_associative_math:
+      AssociativeMath = false;
+      break;
+    case options::OPT_freciprocal_math:
+      ReciprocalMath = true;
+      break;
+    case options::OPT_fno_reciprocal_math:
+      ReciprocalMath = false;
+      break;
+    case options::OPT_fsigned_zeros:
+      SignedZeros = true;
+      break;
+    case options::OPT_fno_signed_zeros:
+      SignedZeros = false;
+      break;
     case options::OPT_ftrapping_math:
       if (!TrappingMathPresent && !FPExceptionBehavior.empty() &&
           FPExceptionBehavior != "strict")
@@ -3413,8 +3446,8 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
           << VecLibArg->getAsString(Args);
   }
 
- if (AssociativeMath && ReciprocalMath && !SignedZeros && ApproxFunc &&
-     !TrappingMath)
+  if (AssociativeMath && ReciprocalMath && !SignedZeros && ApproxFunc &&
+      !TrappingMath)
     CmdArgs.push_back("-funsafe-math-optimizations");
 
   if (!SignedZeros)
@@ -3456,8 +3489,8 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
     CmdArgs.push_back(Args.MakeArgString("-fno-rounding-math"));
 
   if (!FPExceptionBehavior.empty())
-    CmdArgs.push_back(Args.MakeArgString("-ffp-exception-behavior=" +
-                      FPExceptionBehavior));
+    CmdArgs.push_back(
+        Args.MakeArgString("-ffp-exception-behavior=" + FPExceptionBehavior));
 
   if (!FPEvalMethod.empty())
     CmdArgs.push_back(Args.MakeArgString("-ffp-eval-method=" + FPEvalMethod));
@@ -3559,8 +3592,7 @@ static void RenderAnalyzerOptions(const ArgList &Args, ArgStringList &CmdArgs,
       CmdArgs.push_back("-analyzer-checker=osx");
       CmdArgs.push_back(
           "-analyzer-checker=security.insecureAPI.decodeValueOfObjCType");
-    }
-    else if (Triple.isOSFuchsia())
+    } else if (Triple.isOSFuchsia())
       CmdArgs.push_back("-analyzer-checker=fuchsia");
 
     CmdArgs.push_back("-analyzer-checker=deadcode");
@@ -3569,7 +3601,8 @@ static void RenderAnalyzerOptions(const ArgList &Args, ArgStringList &CmdArgs,
       CmdArgs.push_back("-analyzer-checker=cplusplus");
 
     if (!Triple.isPS()) {
-      CmdArgs.push_back("-analyzer-checker=security.insecureAPI.UncheckedReturn");
+      CmdArgs.push_back(
+          "-analyzer-checker=security.insecureAPI.UncheckedReturn");
       CmdArgs.push_back("-analyzer-checker=security.insecureAPI.getpw");
       CmdArgs.push_back("-analyzer-checker=security.insecureAPI.gets");
       CmdArgs.push_back("-analyzer-checker=security.insecureAPI.mktemp");
@@ -3892,8 +3925,7 @@ static void RenderOpenCLOptions(const ArgList &Args, ArgStringList &CmdArgs,
       options::OPT_cl_mad_enable,
       options::OPT_cl_no_signed_zeros,
       options::OPT_cl_fp32_correctly_rounded_divide_sqrt,
-      options::OPT_cl_uniform_work_group_size
-  };
+      options::OPT_cl_uniform_work_group_size};
 
   if (Arg *A = Args.getLastArg(options::OPT_cl_std_EQ)) {
     std::string CLStdStr = std::string("-cl-std=") + A->getValue();
@@ -4247,10 +4279,9 @@ static bool RenderModulesOptions(Compilation &C, const Driver &D,
 static void RenderCharacterOptions(const ArgList &Args, const llvm::Triple &T,
                                    ArgStringList &CmdArgs) {
   // -fsigned-char is default.
-  if (const Arg *A = Args.getLastArg(options::OPT_fsigned_char,
-                                     options::OPT_fno_signed_char,
-                                     options::OPT_funsigned_char,
-                                     options::OPT_fno_unsigned_char)) {
+  if (const Arg *A = Args.getLastArg(
+          options::OPT_fsigned_char, options::OPT_fno_signed_char,
+          options::OPT_funsigned_char, options::OPT_fno_unsigned_char)) {
     if (A->getOption().matches(options::OPT_funsigned_char) ||
         A->getOption().matches(options::OPT_fno_signed_char)) {
       CmdArgs.push_back("-fno-signed-char");
@@ -4344,9 +4375,8 @@ static void RenderObjCOptions(const ToolChain &TC, const Driver &D,
     auto *Arg = Args.getLastArg(
         options::OPT_fobjc_convert_messages_to_runtime_calls,
         options::OPT_fno_objc_convert_messages_to_runtime_calls);
-    if (Arg &&
-        Arg->getOption().matches(
-            options::OPT_fno_objc_convert_messages_to_runtime_calls))
+    if (Arg && Arg->getOption().matches(
+                   options::OPT_fno_objc_convert_messages_to_runtime_calls))
       CmdArgs.push_back("-fno-objc-convert-messages-to-runtime-calls");
   }
 
@@ -5191,8 +5221,8 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
     bool Failure =
         Triple.getArchName().substr(Offset).consumeInteger(10, Version);
     if (Failure || Version < 7)
-      D.Diag(diag::err_target_unsupported_arch) << Triple.getArchName()
-                                                << TripleStr;
+      D.Diag(diag::err_target_unsupported_arch)
+          << Triple.getArchName() << TripleStr;
   }
 
   // Push all default warning arguments that are specific to
@@ -5377,8 +5407,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
             Twine("-flto=") + (LTOMode == LTOK_Thin ? "thin" : "full")));
         // PS4 uses the legacy LTO API, which does not support some of the
         // features enabled by -flto-unit.
-        if (!RawTriple.isPS4() ||
-            (D.getLTOMode() == LTOK_Full) || !UnifiedLTO)
+        if (!RawTriple.isPS4() || (D.getLTOMode() == LTOK_Full) || !UnifiedLTO)
           CmdArgs.push_back("-flto-unit");
       }
     }
@@ -5491,7 +5520,8 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
 
     // Render ABI arguments
     switch (TC.getArch()) {
-    default: break;
+    default:
+      break;
     case llvm::Triple::arm:
     case llvm::Triple::armeb:
     case llvm::Triple::thumbeb:
@@ -5807,7 +5837,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   if (Arg *A = Args.getLastArg(options::OPT_mabi_EQ_quadword_atomics)) {
     if (!Triple.isOSAIX() || Triple.isPPC32())
       D.Diag(diag::err_drv_unsupported_opt_for_target)
-        << A->getSpelling() << RawTriple.str();
+          << A->getSpelling() << RawTriple.str();
     CmdArgs.push_back("-mabi=quadword-atomics");
   }
 
@@ -5882,7 +5912,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   }
 
   CodeGenOptions::FramePointerKind FPKeepKind =
-                  getFramePointerKind(Args, RawTriple);
+      getFramePointerKind(Args, RawTriple);
   const char *FPKeepKindStr = nullptr;
   switch (FPKeepKind) {
   case CodeGenOptions::FramePointerKind::None:
@@ -6088,10 +6118,10 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   // complicated ways.
   auto SanitizeArgs = TC.getSanitizerArgs(Args);
 
-  bool IsAsyncUnwindTablesDefault =
-      TC.getDefaultUnwindTableLevel(Args) == ToolChain::UnwindTableLevel::Asynchronous;
-  bool IsSyncUnwindTablesDefault =
-      TC.getDefaultUnwindTableLevel(Args) == ToolChain::UnwindTableLevel::Synchronous;
+  bool IsAsyncUnwindTablesDefault = TC.getDefaultUnwindTableLevel(Args) ==
+                                    ToolChain::UnwindTableLevel::Asynchronous;
+  bool IsSyncUnwindTablesDefault = TC.getDefaultUnwindTableLevel(Args) ==
+                                   ToolChain::UnwindTableLevel::Synchronous;
 
   bool AsyncUnwindTables = Args.hasFlag(
       options::OPT_fasynchronous_unwind_tables,
@@ -6104,7 +6134,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   if (AsyncUnwindTables)
     CmdArgs.push_back("-funwind-tables=2");
   else if (UnwindTables)
-     CmdArgs.push_back("-funwind-tables=1");
+    CmdArgs.push_back("-funwind-tables=1");
 
   // Prepare `-aux-target-cpu` and `-aux-target-feature` unless
   // `--gpu-use-a...
[truncated]

@ye-luo
Copy link
Contributor

ye-luo commented Feb 6, 2025

What is the rule of propagation? It is not immediately clear to me

clang -fopenmp --offload-arch=gfx1030,sm_89 -Xarch_nvptx64 -O3 -foffload-lto

-foffload-lto is an argument to -Xarch_nvptx64 instead of clang.

@jhuber6
Copy link
Contributor Author

jhuber6 commented Feb 6, 2025

What is the rule of propagation? It is not immediately clear to me

clang -fopenmp --offload-arch=gfx1030,sm_89 -Xarch_nvptx64 -O3 -foffload-lto

-foffload-lto is an argument to -Xarch_nvptx64 instead of clang.

That's just an example, -Xarch_nvptx64 will send that argument to the NVPTX toolchain which will then be forwarded to the embedded clang job to link the final image, i.e. you will get optimized LTO only for NVPTX and not for GFX1030. This patch basically just forwards a set of approved options to the clang invocation inside of the linker wrapper.

@ye-luo
Copy link
Contributor

ye-luo commented Feb 6, 2025

I understood that -Xarch_nvptx64 propagates -O3 to the nvptx toolchain but it is not obvious to me that it should pick up all the argument after it.

@jhuber6
Copy link
Contributor Author

jhuber6 commented Feb 6, 2025

I understood that -Xarch_nvptx64 propagates -O3 to the nvptx toolchain but it is not obvious to me that it should pick up all the argument after it.

It doesn't, maybe I worded it poorly, but -O3 is only relevant to the linker wrapper during LTO so I turned on LTO.

@ye-luo
Copy link
Contributor

ye-luo commented Feb 6, 2025

I see what you meant.

@@ -9270,11 +9260,6 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA,
});
}

// If we disable the GPU C library support it needs to be forwarded to the
// link job.
if (!Args.hasFlag(options::OPT_gpulibc, options::OPT_nogpulibc, true))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont see these options in either of the new option sets, is this code not needed anymore or is it handled somewhere else already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is leftover from an ill-conceived attempt at making the AMDGPU target link it automatically if it was present. It's useless now so I just deleted it.

Summary:
Currently we have a subset of arguments that are handled specially to
keep them consistent between host and device compiles, however, this is
extremely hacky as it only works on a few predetermined options. This is
a holdover from the days before the linker wrapper shuttled all of its
arguments through `clang`. Now that we just use clang, all we need to do
is just use the `--device-compiler=` option to forward it there and let
the normal toolchain handle it.

For example,
```console
clang -fopenmp --offload-arch=gfx1030,sm_89 -Xarch_nvptx64 -O3 -foffload-lto
```
will forward the `-O3` to the LTO compilation only for the NVPTX
compilation.
@jhuber6 jhuber6 merged commit 2e18c94 into llvm:main Feb 6, 2025
5 of 7 checks passed
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
Summary:
Currently we have a subset of arguments that are handled specially to
keep them consistent between host and device compiles, however, this is
extremely hacky as it only works on a few predetermined options. This is
a holdover from the days before the linker wrapper shuttled all of its
arguments through `clang`. Now that we just use clang, all we need to do
is just use the `--device-compiler=` option to forward it there and let
the normal toolchain handle it.

For example,
```console
clang -fopenmp --offload-arch=gfx1030,sm_89 -Xarch_nvptx64 -O3 -foffload-lto
```
will forward the `-O3` to the LTO compilation only for the NVPTX
compilation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants