[flang] Add basic -mtune support #95043

Merged
merged 12 commits into llvm:main on Jun 25, 2024

Conversation

AlexisPerry
Contributor

This PR adds -mtune as a valid flang flag and passes the information through to LLVM IR as an attribute on all functions. No architecture-specific optimizations are added at this time.
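
The end result mirrors what clang already emits for C and C++: each function definition carries a "tune-cpu" string attribute in LLVM IR. The sketch below is illustrative only, not code from this patch, and the helper name applyTuneCPU is hypothetical.

// Illustrative sketch, not part of the patch: the PR's intended end state is a
// "tune-cpu" string attribute on every function definition in the module.
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"

static void applyTuneCPU(llvm::Module &m, llvm::StringRef tuneCPU) {
  if (tuneCPU.empty())
    return;
  for (llvm::Function &f : m)
    if (!f.isDeclaration())
      f.addFnAttr("tune-cpu", tuneCPU); // e.g. "tune-cpu"="znver3"
}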

@llvmbot llvmbot added the clang, clang:driver, mlir:llvm, flang:driver, mlir, flang, flang:fir-hlfir, and flang:codegen labels Jun 10, 2024
@llvmbot
Member

llvmbot commented Jun 10, 2024

@llvm/pr-subscribers-flang-fir-hlfir
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-flang-driver

@llvm/pr-subscribers-flang-codegen

Author: Alexis Perry-Holby (AlexisPerry)

Changes

This PR adds -mtune as a valid flang flag and passes the information through to LLVM IR as an attribute on all functions. No architecture-specific optimizations are added at this time.


Patch is 24.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/95043.diff

23 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+4-3)
  • (modified) clang/lib/Driver/ToolChains/Flang.cpp (+8)
  • (modified) flang/include/flang/Frontend/TargetOptions.h (+3)
  • (modified) flang/include/flang/Lower/Bridge.h (+3-3)
  • (modified) flang/include/flang/Optimizer/CodeGen/CGPasses.td (+4)
  • (modified) flang/include/flang/Optimizer/CodeGen/Target.h (+18-1)
  • (modified) flang/include/flang/Optimizer/Dialect/Support/FIRContext.h (+7)
  • (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+3)
  • (modified) flang/lib/Frontend/CompilerInvocation.cpp (+4)
  • (modified) flang/lib/Frontend/FrontendActions.cpp (+2-1)
  • (modified) flang/lib/Lower/Bridge.cpp (+2-1)
  • (modified) flang/lib/Optimizer/CodeGen/CodeGen.cpp (+5-1)
  • (modified) flang/lib/Optimizer/CodeGen/Target.cpp (+11)
  • (modified) flang/lib/Optimizer/CodeGen/TargetRewrite.cpp (+11-1)
  • (modified) flang/lib/Optimizer/CodeGen/TypeConverter.cpp (+2-1)
  • (modified) flang/lib/Optimizer/Dialect/Support/FIRContext.cpp (+18)
  • (modified) flang/tools/bbc/bbc.cpp (+1-1)
  • (modified) flang/tools/tco/tco.cpp (+4)
  • (modified) flang/unittests/Optimizer/FIRContextTest.cpp (+3)
  • (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMAttrDefs.td (+9)
  • (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1)
  • (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+5)
  • (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+3)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index d44faa55c456f..b81f480e1ed2b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5403,6 +5403,7 @@ def module_file_info : Flag<["-"], "module-file-info">, Flags<[]>,
   HelpText<"Provide information about a particular module file">;
 def mthumb : Flag<["-"], "mthumb">, Group<m_Group>;
 def mtune_EQ : Joined<["-"], "mtune=">, Group<m_Group>,
+  Visibility<[ClangOption, FlangOption]>,
   HelpText<"Only supported on AArch64, PowerPC, RISC-V, SPARC, SystemZ, and X86">;
 def multi__module : Flag<["-"], "multi_module">;
 def multiply__defined__unused : Separate<["-"], "multiply_defined_unused">;
@@ -6722,9 +6723,6 @@ def emit_hlfir : Flag<["-"], "emit-hlfir">, Group<Action_Group>,
 
 let Visibility = [CC1Option, CC1AsOption] in {
 
-def tune_cpu : Separate<["-"], "tune-cpu">,
-  HelpText<"Tune for a specific cpu type">,
-  MarshallingInfoString<TargetOpts<"TuneCPU">>;
 def target_abi : Separate<["-"], "target-abi">,
   HelpText<"Target a particular ABI type">,
   MarshallingInfoString<TargetOpts<"ABI">>;
@@ -6751,6 +6749,9 @@ def darwin_target_variant_triple : Separate<["-"], "darwin-target-variant-triple
 
 let Visibility = [CC1Option, CC1AsOption, FC1Option] in {
 
+def tune_cpu : Separate<["-"], "tune-cpu">,
+  HelpText<"Tune for a specific cpu type">,
+  MarshallingInfoString<TargetOpts<"TuneCPU">>;
 def target_cpu : Separate<["-"], "target-cpu">,
   HelpText<"Target a specific cpu type">,
   MarshallingInfoString<TargetOpts<"CPU">>;
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index 42b45dba2bd31..3dc7ee0ea2bff 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -17,6 +17,7 @@
 #include "llvm/Support/Path.h"
 #include "llvm/TargetParser/RISCVISAInfo.h"
 #include "llvm/TargetParser/RISCVTargetParser.h"
+#include "llvm/TargetParser/Host.h"
 
 #include <cassert>
 
@@ -411,6 +412,13 @@ void Flang::addTargetOptions(const ArgList &Args,
   }
 
   // TODO: Add target specific flags, ABI, mtune option etc.
+  if (const Arg *A = Args.getLastArg(options::OPT_mtune_EQ)) {
+    CmdArgs.push_back("-tune-cpu");
+    if (strcmp(A->getValue(), "native") == 0)
+      CmdArgs.push_back(Args.MakeArgString(llvm::sys::getHostCPUName()));
+    else
+      CmdArgs.push_back(A->getValue());
+  }
 }
 
 void Flang::addOffloadOptions(Compilation &C, const InputInfoList &Inputs,
diff --git a/flang/include/flang/Frontend/TargetOptions.h b/flang/include/flang/Frontend/TargetOptions.h
index ef5d270a2185d..a7a7192c55cb1 100644
--- a/flang/include/flang/Frontend/TargetOptions.h
+++ b/flang/include/flang/Frontend/TargetOptions.h
@@ -32,6 +32,9 @@ class TargetOptions {
   /// If given, the name of the target CPU to generate code for.
   std::string cpu;
 
+  /// If given, the name of the target CPU to tune code for.
+  std::string tuneCPU;
+
   /// The list of target specific features to enable or disable, as written on
   /// the command line.
   std::vector<std::string> featuresAsWritten;
diff --git a/flang/include/flang/Lower/Bridge.h b/flang/include/flang/Lower/Bridge.h
index 52110b861b680..4379ed512cdf0 100644
--- a/flang/include/flang/Lower/Bridge.h
+++ b/flang/include/flang/Lower/Bridge.h
@@ -65,11 +65,11 @@ class LoweringBridge {
          const Fortran::lower::LoweringOptions &loweringOptions,
          const std::vector<Fortran::lower::EnvironmentDefault> &envDefaults,
          const Fortran::common::LanguageFeatureControl &languageFeatures,
-         const llvm::TargetMachine &targetMachine) {
+         const llvm::TargetMachine &targetMachine, llvm::StringRef tuneCPU) {
     return LoweringBridge(ctx, semanticsContext, defaultKinds, intrinsics,
                           targetCharacteristics, allCooked, triple, kindMap,
                           loweringOptions, envDefaults, languageFeatures,
-                          targetMachine);
+                          targetMachine, tuneCPU);
   }
 
   //===--------------------------------------------------------------------===//
@@ -148,7 +148,7 @@ class LoweringBridge {
       const Fortran::lower::LoweringOptions &loweringOptions,
       const std::vector<Fortran::lower::EnvironmentDefault> &envDefaults,
       const Fortran::common::LanguageFeatureControl &languageFeatures,
-      const llvm::TargetMachine &targetMachine);
+      const llvm::TargetMachine &targetMachine, const llvm::StringRef tuneCPU);
   LoweringBridge() = delete;
   LoweringBridge(const LoweringBridge &) = delete;
 
diff --git a/flang/include/flang/Optimizer/CodeGen/CGPasses.td b/flang/include/flang/Optimizer/CodeGen/CGPasses.td
index 9a4d327b33bad..989e3943882a1 100644
--- a/flang/include/flang/Optimizer/CodeGen/CGPasses.td
+++ b/flang/include/flang/Optimizer/CodeGen/CGPasses.td
@@ -31,6 +31,8 @@ def FIRToLLVMLowering : Pass<"fir-to-llvm-ir", "mlir::ModuleOp"> {
            "Override module's data layout.">,
     Option<"forcedTargetCPU", "target-cpu", "std::string", /*default=*/"",
            "Override module's target CPU.">,
+    Option<"forcedTuneCPU", "tune-cpu", "std::string", /*default=*/"",
+           "Override module's tune CPU.">,
     Option<"forcedTargetFeatures", "target-features", "std::string",
            /*default=*/"", "Override module's target features.">,
     Option<"applyTBAA", "apply-tbaa", "bool", /*default=*/"false",
@@ -68,6 +70,8 @@ def TargetRewritePass : Pass<"target-rewrite", "mlir::ModuleOp"> {
            "Override module's target triple.">,
     Option<"forcedTargetCPU", "target-cpu", "std::string", /*default=*/"",
            "Override module's target CPU.">,
+    Option<"forcedTuneCPU", "tune-cpu", "std::string", /*default=*/"",
+           "Override module's tune CPU.">,
     Option<"forcedTargetFeatures", "target-features", "std::string",
            /*default=*/"", "Override module's target features.">,
     Option<"noCharacterConversion", "no-character-conversion",
diff --git a/flang/include/flang/Optimizer/CodeGen/Target.h b/flang/include/flang/Optimizer/CodeGen/Target.h
index 3cf6a74a9adb7..a7161152a5c32 100644
--- a/flang/include/flang/Optimizer/CodeGen/Target.h
+++ b/flang/include/flang/Optimizer/CodeGen/Target.h
@@ -76,6 +76,11 @@ class CodeGenSpecifics {
       llvm::StringRef targetCPU, mlir::LLVM::TargetFeaturesAttr targetFeatures,
       const mlir::DataLayout &dl);
 
+  static std::unique_ptr<CodeGenSpecifics>
+  get(mlir::MLIRContext *ctx, llvm::Triple &&trp, KindMapping &&kindMap,
+      llvm::StringRef targetCPU, mlir::LLVM::TargetFeaturesAttr targetFeatures,
+      const mlir::DataLayout &dl, llvm::StringRef tuneCPU);
+
   static TypeAndAttr getTypeAndAttr(mlir::Type t) { return TypeAndAttr{t, {}}; }
 
   CodeGenSpecifics(mlir::MLIRContext *ctx, llvm::Triple &&trp,
@@ -83,7 +88,17 @@ class CodeGenSpecifics {
                    mlir::LLVM::TargetFeaturesAttr targetFeatures,
                    const mlir::DataLayout &dl)
       : context{*ctx}, triple{std::move(trp)}, kindMap{std::move(kindMap)},
-        targetCPU{targetCPU}, targetFeatures{targetFeatures}, dataLayout{&dl} {}
+        targetCPU{targetCPU}, targetFeatures{targetFeatures}, dataLayout{&dl},
+        tuneCPU{""} {}
+
+  CodeGenSpecifics(mlir::MLIRContext *ctx, llvm::Triple &&trp,
+                   KindMapping &&kindMap, llvm::StringRef targetCPU,
+                   mlir::LLVM::TargetFeaturesAttr targetFeatures,
+                   const mlir::DataLayout &dl, llvm::StringRef tuneCPU)
+      : context{*ctx}, triple{std::move(trp)}, kindMap{std::move(kindMap)},
+        targetCPU{targetCPU}, targetFeatures{targetFeatures}, dataLayout{&dl},
+        tuneCPU{tuneCPU} {}
+
   CodeGenSpecifics() = delete;
   virtual ~CodeGenSpecifics() {}
 
@@ -165,6 +180,7 @@ class CodeGenSpecifics {
   virtual unsigned char getCIntTypeWidth() const = 0;
 
   llvm::StringRef getTargetCPU() const { return targetCPU; }
+  llvm::StringRef getTuneCPU() const { return tuneCPU; }
 
   mlir::LLVM::TargetFeaturesAttr getTargetFeatures() const {
     return targetFeatures;
@@ -182,6 +198,7 @@ class CodeGenSpecifics {
   llvm::StringRef targetCPU;
   mlir::LLVM::TargetFeaturesAttr targetFeatures;
   const mlir::DataLayout *dataLayout = nullptr;
+  llvm::StringRef tuneCPU;
 };
 
 } // namespace fir
diff --git a/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h b/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
index 059a10ce2fe51..bd31aa0782493 100644
--- a/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
+++ b/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
@@ -58,6 +58,13 @@ void setTargetCPU(mlir::ModuleOp mod, llvm::StringRef cpu);
 /// Get the target CPU string from the Module or return a null reference.
 llvm::StringRef getTargetCPU(mlir::ModuleOp mod);
 
+/// Set the tune CPU for the module. `cpu` must not be deallocated while
+/// module `mod` is still live.
+void setTuneCPU(mlir::ModuleOp mod, llvm::StringRef cpu);
+
+/// Get the tune CPU string from the Module or return a null reference.
+llvm::StringRef getTuneCPU(mlir::ModuleOp mod);
+
 /// Set the target features for the module.
 void setTargetFeatures(mlir::ModuleOp mod, llvm::StringRef features);
 
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index 7a3baca4c19da..2b1752960f485 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -393,6 +393,9 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> {
     Option<"unsafeFPMath", "unsafe-fp-math",
            "bool", /*default=*/"false",
            "Set the unsafe-fp-math attribute on functions in the module.">,
+    Option<"tuneCPU", "tune-cpu",
+           "llvm::StringRef", /*default=*/"llvm::StringRef{}",
+           "Set the tune-cpu attribute on functions in the module.">,
   ];
 }
 
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index f64a939b785ef..13fda2ec6e035 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -402,6 +402,10 @@ static void parseTargetArgs(TargetOptions &opts, llvm::opt::ArgList &args) {
           args.getLastArg(clang::driver::options::OPT_target_cpu))
     opts.cpu = a->getValue();
 
+  if (const llvm::opt::Arg *a =
+          args.getLastArg(clang::driver::options::OPT_tune_cpu))
+    opts.tuneCPU = a->getValue();
+
   for (const llvm::opt::Arg *currentArg :
        args.filtered(clang::driver::options::OPT_target_feature))
     opts.featuresAsWritten.emplace_back(currentArg->getValue());
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index b1b6391f1439c..a01151dd6346b 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -292,7 +292,8 @@ bool CodeGenAction::beginSourceFileAction() {
       ci.getParsing().allCooked(), ci.getInvocation().getTargetOpts().triple,
       kindMap, ci.getInvocation().getLoweringOpts(),
       ci.getInvocation().getFrontendOpts().envDefaults,
-      ci.getInvocation().getFrontendOpts().features, targetMachine);
+      ci.getInvocation().getFrontendOpts().features, targetMachine,
+      ci.getInvocation().getTargetOpts().tuneCPU);
 
   // Fetch module from lb, so we can set
   mlirModule = std::make_unique<mlir::ModuleOp>(lb.getModule());
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 202efa57d4a36..7df49e3becf17 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -5832,7 +5832,7 @@ Fortran::lower::LoweringBridge::LoweringBridge(
     const Fortran::lower::LoweringOptions &loweringOptions,
     const std::vector<Fortran::lower::EnvironmentDefault> &envDefaults,
     const Fortran::common::LanguageFeatureControl &languageFeatures,
-    const llvm::TargetMachine &targetMachine)
+    const llvm::TargetMachine &targetMachine, const llvm::StringRef tuneCPU)
     : semanticsContext{semanticsContext}, defaultKinds{defaultKinds},
       intrinsics{intrinsics}, targetCharacteristics{targetCharacteristics},
       cooked{&cooked}, context{context}, kindMap{kindMap},
@@ -5889,6 +5889,7 @@ Fortran::lower::LoweringBridge::LoweringBridge(
   fir::setTargetTriple(*module.get(), triple);
   fir::setKindMapping(*module.get(), kindMap);
   fir::setTargetCPU(*module.get(), targetMachine.getTargetCPU());
+  fir::setTuneCPU(*module.get(), tuneCPU);
   fir::setTargetFeatures(*module.get(), targetMachine.getTargetFeatureString());
   fir::support::setMLIRDataLayout(*module.get(),
                                   targetMachine.createDataLayout());
diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
index 9f21c6b0cf097..6e25bcdb0a88e 100644
--- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp
+++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
@@ -3597,6 +3597,9 @@ class FIRToLLVMLowering
     if (!forcedTargetCPU.empty())
       fir::setTargetCPU(mod, forcedTargetCPU);
 
+    if (!forcedTuneCPU.empty())
+      fir::setTuneCPU(mod, forcedTuneCPU);
+
     if (!forcedTargetFeatures.empty())
       fir::setTargetFeatures(mod, forcedTargetFeatures);
 
@@ -3693,7 +3696,8 @@ class FIRToLLVMLowering
       signalPassFailure();
     }
 
-    // Run pass to add comdats to functions that have weak linkage on relevant platforms
+    // Run pass to add comdats to functions that have weak linkage on relevant
+    // platforms
     if (fir::getTargetTriple(mod).supportsCOMDAT()) {
       mlir::OpPassManager comdatPM("builtin.module");
       comdatPM.addPass(mlir::LLVM::createLLVMAddComdats());
diff --git a/flang/lib/Optimizer/CodeGen/Target.cpp b/flang/lib/Optimizer/CodeGen/Target.cpp
index 652e2bddc1b89..25141102a8c43 100644
--- a/flang/lib/Optimizer/CodeGen/Target.cpp
+++ b/flang/lib/Optimizer/CodeGen/Target.cpp
@@ -1113,3 +1113,14 @@ fir::CodeGenSpecifics::get(mlir::MLIRContext *ctx, llvm::Triple &&trp,
   }
   TODO(mlir::UnknownLoc::get(ctx), "target not implemented");
 }
+
+std::unique_ptr<fir::CodeGenSpecifics> fir::CodeGenSpecifics::get(
+    mlir::MLIRContext *ctx, llvm::Triple &&trp, KindMapping &&kindMap,
+    llvm::StringRef targetCPU, mlir::LLVM::TargetFeaturesAttr targetFeatures,
+    const mlir::DataLayout &dl, llvm::StringRef tuneCPU) {
+  std::unique_ptr<fir::CodeGenSpecifics> CGS = fir::CodeGenSpecifics::get(
+      ctx, std::move(trp), std::move(kindMap), targetCPU, targetFeatures, dl);
+
+  CGS->tuneCPU = tuneCPU;
+  return CGS;
+}
diff --git a/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp b/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp
index 8199c5ef7fa26..a101295ba4c13 100644
--- a/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp
+++ b/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp
@@ -89,6 +89,9 @@ class TargetRewrite : public fir::impl::TargetRewritePassBase<TargetRewrite> {
     if (!forcedTargetCPU.empty())
       fir::setTargetCPU(mod, forcedTargetCPU);
 
+    if (!forcedTuneCPU.empty())
+      fir::setTuneCPU(mod, forcedTuneCPU);
+
     if (!forcedTargetFeatures.empty())
       fir::setTargetFeatures(mod, forcedTargetFeatures);
 
@@ -106,7 +109,8 @@ class TargetRewrite : public fir::impl::TargetRewritePassBase<TargetRewrite> {
 
     auto specifics = fir::CodeGenSpecifics::get(
         mod.getContext(), fir::getTargetTriple(mod), fir::getKindMapping(mod),
-        fir::getTargetCPU(mod), fir::getTargetFeatures(mod), *dl);
+        fir::getTargetCPU(mod), fir::getTargetFeatures(mod), *dl,
+        fir::getTuneCPU(mod));
 
     setMembers(specifics.get(), &rewriter, &*dl);
 
@@ -672,12 +676,18 @@ class TargetRewrite : public fir::impl::TargetRewritePassBase<TargetRewrite> {
     auto targetCPU = specifics->getTargetCPU();
     mlir::StringAttr targetCPUAttr =
         targetCPU.empty() ? nullptr : mlir::StringAttr::get(ctx, targetCPU);
+    auto tuneCPU = specifics->getTuneCPU();
+    mlir::StringAttr tuneCPUAttr =
+        tuneCPU.empty() ? nullptr : mlir::StringAttr::get(ctx, tuneCPU);
     auto targetFeaturesAttr = specifics->getTargetFeatures();
 
     for (auto fn : mod.getOps<mlir::func::FuncOp>()) {
       if (targetCPUAttr)
         fn->setAttr("target_cpu", targetCPUAttr);
 
+      if (tuneCPUAttr)
+        fn->setAttr("tune_cpu", tuneCPUAttr);
+
       if (targetFeaturesAttr)
         fn->setAttr("target_features", targetFeaturesAttr);
 
diff --git a/flang/lib/Optimizer/CodeGen/TypeConverter.cpp b/flang/lib/Optimizer/CodeGen/TypeConverter.cpp
index 07d3bd713ce45..2b8f8299cb9e5 100644
--- a/flang/lib/Optimizer/CodeGen/TypeConverter.cpp
+++ b/flang/lib/Optimizer/CodeGen/TypeConverter.cpp
@@ -35,7 +35,8 @@ LLVMTypeConverter::LLVMTypeConverter(mlir::ModuleOp module, bool applyTBAA,
       kindMapping(getKindMapping(module)),
       specifics(CodeGenSpecifics::get(
           module.getContext(), getTargetTriple(module), getKindMapping(module),
-          getTargetCPU(module), getTargetFeatures(module), dl)),
+          getTargetCPU(module), getTargetFeatures(module), dl,
+          getTuneCPU(module))),
       tbaaBuilder(std::make_unique<TBAABuilder>(module->getContext(), applyTBAA,
                                                 forceUnifiedTBAATree)),
       dataLayout{&dl} {
diff --git a/flang/lib/Optimizer/Dialect/Support/FIRContext.cpp b/flang/lib/Optimizer/Dialect/Support/FIRContext.cpp
index c4d00875c45e4..1aa631cb39126 100644
--- a/flang/lib/Optimizer/Dialect/Support/FIRContext.cpp
+++ b/flang/lib/Optimizer/Dialect/Support/FIRContext.cpp
@@ -77,6 +77,24 @@ llvm::StringRef fir::getTargetCPU(mlir::ModuleOp mod) {
   return {};
 }
 
+static constexpr const char *tuneCpuName = "fir.tune_cpu";
+
+void fir::setTuneCPU(mlir::ModuleOp mod, llvm::StringRef cpu) {
+  if (cpu.empty())
+    return;
+
+  auto *ctx = mod.getContext();
+
+  mod->setAttr(tuneCpuName, mlir::StringAttr::get(ctx, cpu));
+}
+
+llvm::StringRef fir::getTuneCPU(mlir::ModuleOp mod) {
+  if (auto attr = mod->getAttrOfType<mlir::StringAttr>(tuneCpuName))
+    return attr.getValue();
+
+  return {};
+}
+
 static constexpr const char *targetFeaturesName = "fir.target_features";
 
 void fir::setTargetFeatures(mlir::ModuleOp mod, llvm::StringRef features) {
diff --git a/flang/tools/bbc/bbc.cpp b/flang/tools/bbc/bbc.cpp
index 3485c1499d3b6..44bddde35c103 100644
--- a/flang/tools/bbc/bbc.cpp
+++ b/flang/tools/bbc/bbc.cpp
@@ -371,7 +371,7 @@ static mlir::LogicalResult convertFortranSourceToMLIR(
       ctx, semanticsContext, defKinds, semanticsContext.intrinsics(),
       semanticsContext.targetCharacteristics(), parsing.allCooked(),
       targetTriple, kindMap, loweringOptions, envDefaults,
-      semanticsContext.languageFeatures(), targetMachine);
+      semanticsContext.languageFeatures(), targetMachine, ""); // FIXME
   mlir::ModuleOp mlirModule = burnside.getModule();
   if (enableOpenMP) {
     if (enableOpenMPGPU && !enableOpenMPDevice) {
diff --git a/flang/tools/tco/tco.cpp b/flang/tools/tco/tco.cpp
index 399ea1362fda4..c8964d46b9cea 100644
--- a/flang/tools/tco/tco.cpp
+++ b/flang/tools/tco/tco.cpp
@@ -58,6 +58,9 @@ static cl::opt<std::string> targetTriple("target",
 static cl::opt<std::string>
     targetCPU("target-cpu", cl::desc("specify a target CPU"), cl::init(""));
 
+static cl::opt<std::string> tuneCPU("tune-cpu", cl::desc("specify a tune CPU"),
+                                    cl::init(""));
+
 static cl::opt<std::string>
     targetFeatures("target-features", cl::desc("specify the target features"),
                    cl::init(""));
@@ -113,6 +116,7 @@ compileFIR(const mlir::PassPipelineCLParser &passPipeline) {
   fir::setTargetTriple(*owningRef, targetTriple);
   fir::setKindMapping(*owningRef, kindMap);
   fir::setTargetCPU(*owningRef, targetCPU);
+  fir::setTuneCPU(*owningRef, tuneCPU);
   fir::setTargetFeatures(*owningRef, targetFeatures);
   // tco is a testing tool, so it will happily use the target independent
   // data layout if none is on the module.
diff --git a/flang/unittests/Optimizer/FIRContextTest.cpp b/flang/unittests/Optimizer/FIRContextTest.cpp
index 49e1ebf23d8aa..3f8b59ac94a95 100644
--- a/flang/unittests/Optimizer/FIRContextTest.cpp
+++ b/flang/unittests/Optimizer/FIRContextTest.cpp
@@ -34,6 +34,7 @@ struct StringAttributesTests : public testing::Test {
       "i10:80,l3:24,a1:8,r54:Double,r62:X86_FP80,r11:PPC_FP128";
   std::string tar...
[truncated]

@llvmbot
Member

llvmbot commented Jun 10, 2024

@llvm/pr-subscribers-mlir-llvm



github-actions bot commented Jun 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Contributor

@Dinistro Dinistro left a comment


This is missing tests for the LLVM import and export. I suspect that this is currently still part of the function's passthrough dictionary, which should be changed.
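
The ModuleImport.cpp and ModuleTranslation.cpp hunks are truncated in the patch excerpt above. The export direction referred to here amounts to copying the MLIR-side value onto the corresponding llvm::Function, roughly as in the sketch below. The helper name exportTuneCPU and reading the value through a generic "tune_cpu" string-attribute lookup are assumptions for illustration, not the patch's actual API.

// Rough sketch for illustration only; not the code added by this PR.
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/Operation.h"
#include "llvm/IR/Function.h"

static void exportTuneCPU(mlir::Operation *funcOp, llvm::Function &llvmFunc) {
  // If the MLIR function carries a tune CPU string, mirror it onto the
  // LLVM IR function as the conventional "tune-cpu" attribute.
  if (auto tune = funcOp->getAttrOfType<mlir::StringAttr>("tune_cpu"))
    llvmFunc.addFnAttr("tune-cpu", tune.getValue());
}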

Contributor

@tarunprabhu tarunprabhu left a comment


Is there a way to test that the attribute is lowered to FIR and perhaps also on to LLVM? flang/test/Lower/ has some tests which check for the presence of certain constructs in the LLVM IR. Could something be written similar to those?

Contributor

@banach-space banach-space left a comment


Thanks for working on this @AlexisPerry ! Could you add some tests?

@AlexisPerry AlexisPerry requested a review from banach-space June 12, 2024 21:29
Contributor

@tarunprabhu tarunprabhu left a comment


Thanks for the changes. LGTM, but wait for @banach-space to approve.

Contributor

@banach-space banach-space left a comment


LGTM, thanks for implementing this 🙏🏻

@AlexisPerry
Contributor Author

I have lost my upstream commit privileges due to inactivity, so once this is fully approved, could someone merge it on my behalf? Thank you.

Contributor

@Dinistro Dinistro left a comment


The MLIR side looks good to me, modulo one remaining nit. Thanks for addressing the comments.

@AlexisPerry
Contributor Author

I believe I've addressed all the review comments and all the checks have passed. Could someone with commit access please merge this on my behalf? Thank you.

@banach-space banach-space merged commit a790279 into llvm:main Jun 25, 2024
7 checks passed
@llvm-ci
Collaborator

llvm-ci commented Jun 25, 2024

LLVM Buildbot has detected a new failure on builder clang-cuda-l4 running on cuda-l4-0 while building clang,flang,mlir at step 3 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/101/builds/691

Here is the relevant piece of the build log for the reference:

Step 3 (annotate) failure: '/buildbot/cuda-build --jobs=' (failure)
...
+ echo @@@STEP_SUMMARY_TEXT@@@@
+ run ninja check-cuda-simple
+ echo '>>> ' ninja check-cuda-simple
+ ninja check-cuda-simple
@@@BUILD_STEP Testing CUDA test-suite@@@
@@@STEP_SUMMARY_CLEAR@@@
@@@STEP_SUMMARY_TEXT@@@@
>>>  ninja check-cuda-simple
[0/40] cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA && /usr/local/bin/lit -vv -j 1 assert-cuda-11.8-c++11-libc++.test axpy-cuda-11.8-c++11-libc++.test algorithm-cuda-11.8-c++11-libc++.test cmath-cuda-11.8-c++11-libc++.test complex-cuda-11.8-c++11-libc++.test math_h-cuda-11.8-c++11-libc++.test new-cuda-11.8-c++11-libc++.test empty-cuda-11.8-c++11-libc++.test printf-cuda-11.8-c++11-libc++.test future-cuda-11.8-c++11-libc++.test builtin_var-cuda-11.8-c++11-libc++.test test_round-cuda-11.8-c++11-libc++.test
-- Testing: 12 tests, 1 workers --
FAIL: test-suite :: External/CUDA/algorithm-cuda-11.8-c++11-libc++.test (1 of 12)
******************** TEST 'test-suite :: External/CUDA/algorithm-cuda-11.8-c++11-libc++.test' FAILED ********************

/buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/algorithm-cuda-11.8-c++11-libc++.test.out --redirect-input /dev/null --summary /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/algorithm-cuda-11.8-c++11-libc++.test.time /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/algorithm-cuda-11.8-c++11-libc++
cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA ; /buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/algorithm-cuda-11.8-c++11-libc++.test.out algorithm.reference_output-cuda-11.8-c++11-libc++

+ cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA
+ /buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/algorithm-cuda-11.8-c++11-libc++.test.out algorithm.reference_output-cuda-11.8-c++11-libc++
/buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target: Comparison failed, textual difference between 'C' and 'S'

********************
FAIL: test-suite :: External/CUDA/assert-cuda-11.8-c++11-libc++.test (2 of 12)
******************** TEST 'test-suite :: External/CUDA/assert-cuda-11.8-c++11-libc++.test' FAILED ********************

/buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/assert-cuda-11.8-c++11-libc++.test.out --redirect-input /dev/null --summary /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/assert-cuda-11.8-c++11-libc++.test.time /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/assert-cuda-11.8-c++11-libc++
cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA ; /buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/assert-cuda-11.8-c++11-libc++.test.out assert.reference_output-cuda-11.8-c++11-libc++

+ cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA
+ /buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/assert-cuda-11.8-c++11-libc++.test.out assert.reference_output-cuda-11.8-c++11-libc++
/buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target: Comparison failed, textual difference between 'e' and 'a'

********************
FAIL: test-suite :: External/CUDA/axpy-cuda-11.8-c++11-libc++.test (3 of 12)
******************** TEST 'test-suite :: External/CUDA/axpy-cuda-11.8-c++11-libc++.test' FAILED ********************

/buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/axpy-cuda-11.8-c++11-libc++.test.out --redirect-input /dev/null --summary /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/axpy-cuda-11.8-c++11-libc++.test.time /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/axpy-cuda-11.8-c++11-libc++
cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA ; /buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/axpy-cuda-11.8-c++11-libc++.test.out axpy.reference_output-cuda-11.8-c++11-libc++

+ cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA
+ /buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/axpy-cuda-11.8-c++11-libc++.test.out axpy.reference_output-cuda-11.8-c++11-libc++
/buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target: Comparison failed, textual difference between '-' and '2'

********************
PASS: test-suite :: External/CUDA/builtin_var-cuda-11.8-c++11-libc++.test (4 of 12)
********** TEST 'test-suite :: External/CUDA/builtin_var-cuda-11.8-c++11-libc++.test' RESULTS **********
exec_time: 0.0000 
hash: "293d0eb9282156edc5422e7a8c9268e3" 
**********
FAIL: test-suite :: External/CUDA/cmath-cuda-11.8-c++11-libc++.test (5 of 12)

@llvm-ci
Collaborator

llvm-ci commented Jun 25, 2024

LLVM Buildbot has detected a new failure on builder flang-aarch64-libcxx running on linaro-flang-aarch64-libcxx while building clang,flang,mlir at step 6 "test-build-unified-tree-check-flang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/89/builds/791

Here is the relevant piece of the build log for the reference:

Step 6 (test-build-unified-tree-check-flang) failure: test (failure)
******************** TEST 'Flang :: Lower/tune-cpu-llvm.f90' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
RUN: at line 1: /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/bin/flang-new -mtune=pentium4 -S -emit-llvm /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/test/Lower/tune-cpu-llvm.f90 -o - | /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/bin/FileCheck /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/test/Lower/tune-cpu-llvm.f90
+ /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/bin/flang-new -mtune=pentium4 -S -emit-llvm /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/test/Lower/tune-cpu-llvm.f90 -o -
+ /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/bin/FileCheck /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/test/Lower/tune-cpu-llvm.f90
flang-new: error: unsupported argument 'pentium4' to option '-mtune='
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/tcwg-buildbot/worker/flang-aarch64-libcxx/build/bin/FileCheck /home/tcwg-buildbot/worker/flang-aarch64-libcxx/llvm-project/flang/test/Lower/tune-cpu-llvm.f90

--

********************


@banach-space
Contributor

Most likely https://github.com/llvm/llvm-project/pull/95043/files#diff-a29a79ef4763ed66987d979a7a8a4ff87d242101fe133d5188577b9ff144b805 requires the X86 target to be enabled. Could somebody either fix or revert? I don't have Git access at the moment.

tarunprabhu added a commit that referenced this pull request Jun 25, 2024
tarunprabhu added a commit that referenced this pull request Jun 25, 2024
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
banach-space pushed a commit that referenced this pull request Jul 16, 2024
Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Initial implementation for the -mtune flag in Flang.

This PR is a clean version of PR #96688, which is a re-land of PR #95043