[flang] Add -f[no-]unroll-loops flag #122906

Merged
merged 5 commits into from
Jan 16, 2025

Conversation

DavidTruby
Member

This patch adds support for the -funroll-loops and -fno-unroll-loops flags, with behaviour similar to clang's. Loop unrolling is enabled by default from -O2 onwards, as in clang.

This patch adds support for the -funroll-loops and -fno-unroll-loops
flags with similar behaviour to clang. funroll-loops is enabled at -O2
onwards as in clang.
@llvmbot added the clang, clang:driver, flang:driver and flang labels Jan 14, 2025
@llvmbot
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-flang-driver

@llvm/pr-subscribers-clang-driver

Author: David Truby (DavidTruby)

Changes

This patch adds support for the -funroll-loops and -fno-unroll-loops flags with similar behaviour to clang. funroll-loops is enabled at -O2 onwards as is the current default.


Full diff: https://github.com/llvm/llvm-project/pull/122906.diff

6 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+2-2)
  • (modified) clang/lib/Driver/ToolChains/Flang.cpp (+6-1)
  • (modified) flang/include/flang/Frontend/CodeGenOptions.def (+1)
  • (modified) flang/lib/Frontend/CompilerInvocation.cpp (+4)
  • (modified) flang/lib/Frontend/FrontendActions.cpp (+2)
  • (added) flang/test/HLFIR/unroll-loops.fir (+43)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 2721c1b5d8dc55..4bab2ae4d8dd5c 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4157,9 +4157,9 @@ def ftrap_function_EQ : Joined<["-"], "ftrap-function=">, Group<f_Group>,
   HelpText<"Issue call to specified function rather than a trap instruction">,
   MarshallingInfoString<CodeGenOpts<"TrapFuncName">>;
 def funroll_loops : Flag<["-"], "funroll-loops">, Group<f_Group>,
-  HelpText<"Turn on loop unroller">, Visibility<[ClangOption, CC1Option]>;
+  HelpText<"Turn on loop unroller">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>;
 def fno_unroll_loops : Flag<["-"], "fno-unroll-loops">, Group<f_Group>,
-  HelpText<"Turn off loop unroller">, Visibility<[ClangOption, CC1Option]>;
+  HelpText<"Turn off loop unroller">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>;
 def ffinite_loops: Flag<["-"],  "ffinite-loops">, Group<f_Group>,
   HelpText<"Assume all non-trivial loops are finite.">, Visibility<[ClangOption, CC1Option]>;
 def fno_finite_loops: Flag<["-"], "fno-finite-loops">, Group<f_Group>,
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a7d0cc99f27d2d..282a4e267b3dfc 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -150,12 +150,17 @@ void Flang::addCodegenOptions(const ArgList &Args,
   if (shouldLoopVersion(Args))
     CmdArgs.push_back("-fversion-loops-for-stride");
 
+  Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
+                            options::OPT_flang_deprecated_no_hlfir,
+                            options::OPT_fno_ppc_native_vec_elem_order,
+                            options::OPT_fppc_native_vec_elem_order});
   Args.addAllArgs(CmdArgs,
                   {options::OPT_flang_experimental_hlfir,
                    options::OPT_flang_deprecated_no_hlfir,
                    options::OPT_fno_ppc_native_vec_elem_order,
                    options::OPT_fppc_native_vec_elem_order,
-                   options::OPT_ftime_report, options::OPT_ftime_report_EQ});
+                   options::OPT_ftime_report, options::OPT_ftime_report_EQ,
+                   options::OPT_funroll_loops, options::OPT_fno_unroll_loops});
 }
 
 void Flang::addPicOptions(const ArgList &Args, ArgStringList &CmdArgs) const {
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 9d03ec88a56b8a..deb8d1aede518b 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -32,6 +32,7 @@ CODEGENOPT(PrepareForThinLTO , 1, 0) ///< Set when -flto=thin is enabled on the
                                      ///< compile step.
 CODEGENOPT(StackArrays, 1, 0) ///< -fstack-arrays (enable the stack-arrays pass)
 CODEGENOPT(LoopVersioning, 1, 0) ///< Enable loop versioning.
+CODEGENOPT(UnrollLoops, 1, 0) ///< Enable loop unrolling
 CODEGENOPT(AliasAnalysis, 1, 0) ///< Enable alias analysis pass
 
 CODEGENOPT(Underscoring, 1, 1)
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 5e7127313c1335..15b1e1e0a24881 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -246,6 +246,10 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
                    clang::driver::options::OPT_fno_loop_versioning, false))
     opts.LoopVersioning = 1;
 
+  opts.UnrollLoops = args.hasFlag(clang::driver::options::OPT_funroll_loops,
+                                  clang::driver::options::OPT_fno_unroll_loops,
+                                  (opts.OptimizationLevel > 1));
+
   opts.AliasAnalysis = opts.OptimizationLevel > 0;
 
   // -mframe-pointer=none/non-leaf/all option.
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 52a18d59c7cda5..b0545a7ac2f99a 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -1028,6 +1028,8 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   si.registerCallbacks(pic, &mam);
   if (ci.isTimingEnabled())
     si.getTimePasses().setOutStream(ci.getTimingStreamLLVM());
+  pto.LoopUnrolling = opts.UnrollLoops;
+  pto.LoopInterleaving = opts.UnrollLoops;
   llvm::PassBuilder pb(targetMachine, pto, pgoOpt, &pic);
 
   // Attempt to load pass plugins and register their callbacks with PB.
diff --git a/flang/test/HLFIR/unroll-loops.fir b/flang/test/HLFIR/unroll-loops.fir
new file mode 100644
index 00000000000000..f645132262f8d6
--- /dev/null
+++ b/flang/test/HLFIR/unroll-loops.fir
@@ -0,0 +1,43 @@
+// RUN: %flang_fc1 -emit-llvm -O1 -funroll-loops -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,UNROLL
+// RUN: %flang_fc1 -emit-llvm -O2 -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,UNROLL
+// RUN: %flang_fc1 -emit-llvm -O1 -fno-unroll-loops -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,NO-UNROLL
+// RUN: %flang_fc1 -emit-llvm -O1 -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,NO-UNROLL
+
+// CHECK-LABEL: @unroll
+// CHECK-SAME: (ptr nocapture writeonly %[[ARG0:.*]])
+func.func @unroll(%arg0: !fir.ref<!fir.array<1000xf64>> {fir.bindc_name = "a"}) {
+  // CHECK:   %[[GEPIV:.*]] = getelementptr i8, ptr %0, i64 -8
+  %scope = fir.dummy_scope : !fir.dscope
+  %c1000 = arith.constant 1000 : index
+  %shape = fir.shape %c1000 : (index) -> !fir.shape<1>
+  %a:2 = hlfir.declare %arg0(%shape) dummy_scope %scope {uniq_name = "unrollEa"} : (!fir.ref<!fir.array<1000xf64>>, !fir.shape<1>, !fir.dscope) -> (!fir.ref<!fir.array<1000xf64>>, !fir.ref<!fir.array<1000xf64>>)
+  %c1 = arith.constant 1 : index
+  fir.do_loop %arg1 = %c1 to %c1000 step %c1 {
+    // CHECK: [[BLK:.*]]:
+
+    // NO-UNROLL-NEXT: %[[PHI:.*]] = phi i64 [ 1, %{{.*}} ], [ %[[NIV:.*]], %[[BLK]] ]
+    // NO-UNROLL-NEXT: %[[IV_D:.*]] = uitofp nneg i64 %[[PHI]] to double
+    // NO-UNROLL-NEXT: %[[GEP:.*]] = getelementptr double, ptr %[[GEPIV]], i64 %[[PHI]]
+    // NO-UNROLL-NEXT: store double %[[IV_D]], ptr %[[GEP]]
+    // NO-UNROLL-NEXT: %[[NIV:.*]] = add nuw nsw i64 %{{.*}}, 1
+    // NO-UNROLL-NEXT: %[[EXIT:.*]] = icmp eq i64 %[[NIV]], 1001
+    // NO-UNROLL-NEXT: br i1 %[[EXIT]], label %{{.*}}, label %[[BLK]]
+
+    // UNROLL-NEXT: %[[PHI:.*]] = phi i64 [ 0, %{{.*}} ], [ %[[NIV:.*]], %[[BLK]] ]
+    // UNROLL-NEXT: %[[IV0:.*]] = or disjoint i64 %[[PHI]], 1
+    // UNROLL-NEXT: %[[IV1:.*]] = add i64 %[[PHI]], 2
+    // UNROLL-NEXT: %[[IV0_D:.*]] = uitofp nneg i64 %[[IV0]] to double
+    // UNROLL-NEXT: %[[IV1_D:.*]] = uitofp nneg i64 %[[IV1]] to double
+    // UNROLL-NEXT: %[[GEP0:.*]] = getelementptr double, ptr %[[ARG0]], i64 %[[PHI]]
+    // UNROLL-NEXT: %[[GEP1:.*]] = getelementptr double, ptr %[[GEPIV]], i64 %[[IV1]]
+    // UNROLL-NEXT: store double %[[IV0_D]], ptr %[[GEP0]]
+    // UNROLL-NEXT: store double %[[IV1_D]], ptr %[[GEP1]]
+    // UNROLL-NEXT: %[[NIV:.*]] = add nuw i64 %[[PHI]], 2
+    // UNROLL-NEXT: %[[EXIT:.*]] = icmp eq i64 %[[NIV]], 1000
+    // UNROLL-NEXT: br i1 %[[EXIT]], label %{{.*}}, label %[[BLK]]
+    %iv = fir.convert %arg1 : (index) -> f64
+    %ai = hlfir.designate %a#0 (%arg1)  : (!fir.ref<!fir.array<1000xf64>>, index) -> !fir.ref<f64>
+    hlfir.assign %iv to %ai : f64, !fir.ref<f64>
+  }
+  return
+}
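
The default chosen in `parseCodeGenArgs` above can be modelled in isolation. The following is a minimal sketch, not the flang implementation: `hasFlag` here is a hypothetical stand-in for `llvm::opt::ArgList::hasFlag` that works on plain strings, and `unrollLoopsEnabled` is an illustrative name that does not appear in the patch. It assumes the usual `hasFlag` behaviour, where the last occurrence of either flag wins and the default applies when neither is given.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::opt::ArgList::hasFlag(Pos, Neg, Default):
// the last occurrence of either flag wins; if neither flag is present, the
// default value is returned.
bool hasFlag(const std::vector<std::string> &args, const std::string &pos,
             const std::string &neg, bool defaultValue) {
  bool result = defaultValue;
  for (const std::string &arg : args) {
    if (arg == pos)
      result = true;
    else if (arg == neg)
      result = false;
  }
  return result;
}

// Illustrative helper mirroring the expression added to parseCodeGenArgs:
// unrolling defaults to on only when the optimisation level is above 1.
bool unrollLoopsEnabled(const std::vector<std::string> &args,
                        unsigned optimizationLevel) {
  return hasFlag(args, "-funroll-loops", "-fno-unroll-loops",
                 optimizationLevel > 1);
}
```

Under this model, the four RUN lines of the new test map onto the two check prefixes as expected: `-O1 -funroll-loops` and plain `-O2` enable unrolling, while `-O1 -fno-unroll-loops` and plain `-O1` do not.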

@llvmbot
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-clang

@@ -0,0 +1,43 @@
// RUN: %flang_fc1 -emit-llvm -O1 -funroll-loops -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,UNROLL
Contributor

The test should probably be in the Integration directory, and possibly be a source-to-LLVM-IR test.

Contributor

A source-to-(at least)-HLFIR test will also check that the -f[no-]unroll-loops option has not disappeared from the frontend, and is being passed correctly to fc1.

Member Author

I guess we should possibly have all 3? A source to HLFIR, this test, and source->llvmir?

Member Author

@DavidTruby DavidTruby Jan 14, 2025

Actually, source->HLFIR isn't useful when the driver test already exists. I'll add a source->llvm integration test.

Comment on lines 153 to 156
Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
options::OPT_flang_deprecated_no_hlfir,
options::OPT_fno_ppc_native_vec_elem_order,
options::OPT_fppc_native_vec_elem_order});
Contributor

Why is this added separately from below?

Member Author

Bad rebase I think 😓

   Args.addAllArgs(CmdArgs,
                   {options::OPT_flang_experimental_hlfir,
                    options::OPT_flang_deprecated_no_hlfir,
                    options::OPT_fno_ppc_native_vec_elem_order,
                    options::OPT_fppc_native_vec_elem_order,
-                   options::OPT_ftime_report, options::OPT_ftime_report_EQ});
+                   options::OPT_ftime_report, options::OPT_ftime_report_EQ,
+                   options::OPT_funroll_loops, options::OPT_fno_unroll_loops});
Contributor

We can add a forwarding test from the driver to the frontend driver.

Comment on lines 16 to 20
! NO-UNROLL-NEXT: %[[IV_D:.*]] = uitofp nneg <2 x i64> %[[VIND]] to <2 x double>
! NO-UNROLL-NEXT: %[[GEP:.*]] = getelementptr double, ptr %[[ARG0]], i64 %[[IND]]
! NO-UNROLL-NEXT: store <2 x double> %[[IV_D]], ptr %[[GEP]]
! NO-UNROLL-NEXT: %[[NIV:.*]] = add nuw i64 %{{.*}}, 2
! NO-UNROLL-NEXT: %[[NVIND]] = add <2 x i64> %[[VIND]], splat (i64 2)
Contributor

@kiranchandramohan kiranchandramohan Jan 14, 2025

Unroll should ideally check for the branch back to the body. And nounroll should probably check that there is no such branch?

Member Author

-funroll-loops doesn't mean that the loop will be fully unrolled, just that some unrolling can occur. So there's a branch back to the body of the loop in both cases.
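
The point about partial unrolling can be illustrated by hand. This is a sketch in C++ rather than Fortran, written manually instead of being produced by the optimiser; the function names and the unroll factor of two are assumptions chosen to resemble the UNROLL checks in the .fir test from this PR. Both versions still contain a backward branch; the unrolled one simply performs two stores per trip.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 1000;

// Rolled form: one store per iteration, so the loop body runs 1000 times.
void fillRolled(std::array<double, N> &a) {
  for (std::size_t i = 0; i < N; ++i)
    a[i] = static_cast<double>(i + 1);
}

// Partially unrolled by a factor of two (done by hand for illustration):
// two stores per iteration, so the loop body runs 500 times. The loop
// itself, and therefore the branch back to its body, is still present.
void fillUnrolledByTwo(std::array<double, N> &a) {
  for (std::size_t i = 0; i < N; i += 2) {
    a[i] = static_cast<double>(i + 1);
    a[i + 1] = static_cast<double>(i + 2);
  }
}
```

Both functions fill the array with the values 1.0 to 1000.0, which is why a test that distinguishes the two has to look at the shape of the loop body rather than at the presence of a back edge.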

Contributor

For small loops, it will fully unroll if the body is small.

The issue with checking for two iterations is that this could be due to interleaving during vectorisation as well.

Member Author

I actually did try to only have (e.g.) 2 iterations of the loop; for some reason the full loop is still emitted, at least at -O1. What I ended up with here was essentially copied from the clang -funroll-loops tests but converted to fortran.

! CHECK-LABEL: @unroll
! CHECK-SAME: (ptr nocapture writeonly %[[ARG0:.*]])
subroutine unroll(a)
real(kind=8), intent(out) :: a(1000)
Contributor

Can this be an integer array to avoid the uitofp ?

Contributor

@kiranchandramohan kiranchandramohan left a comment

LG.

@DavidTruby DavidTruby merged commit 0195ec4 into llvm:main Jan 16, 2025
8 checks passed
@llvm-ci
Collaborator

llvm-ci commented Jan 16, 2025

LLVM Buildbot has detected a new failure on builder ppc64le-flang-rhel-clang running on ppc64le-flang-rhel-test while building clang,flang at step 6 "test-build-unified-tree-check-flang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/17336

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-flang) failure: test (failure)
******************** TEST 'Flang :: HLFIR/unroll-loops.fir' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -fc1 -emit-llvm -O1 -funroll-loops -mllvm -force-vector-width=2 -o- /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/HLFIR/unroll-loops.fir | /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/HLFIR/unroll-loops.fir --check-prefixes=CHECK,UNROLL
+ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -fc1 -emit-llvm -O1 -funroll-loops -mllvm -force-vector-width=2 -o- /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/HLFIR/unroll-loops.fir
+ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/HLFIR/unroll-loops.fir --check-prefixes=CHECK,UNROLL
warning: overriding the module target triple with powerpc64le-unknown-linux-gnu [-Woverride-module]
/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/HLFIR/unroll-loops.fir:26:18: error: UNROLL-NEXT: is not on the line after the previous match
 // UNROLL-NEXT: %[[GEP0:.*]] = getelementptr i64, ptr %[[ARG0]], i64 %[[IND]]
                 ^
<stdin>:21:2: note: 'next' match was here
 %1 = getelementptr i64, ptr %0, i64 %index
 ^
<stdin>:14:51: note: previous match ended here
 %step.add = add <2 x i64> %vec.ind, splat (i64 2)
                                                  ^
<stdin>:15:1: note: non-matching line after previous match is here
 %step.add.2 = add <2 x i64> %vec.ind, splat (i64 4)
^

Input file: <stdin>
Check file: /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/HLFIR/unroll-loops.fir

-dump-input=help explains the following input dump.

Input was:
<<<<<<
         .
         .
         .
        16:  %step.add.3 = add <2 x i64> %vec.ind, splat (i64 6) 
        17:  %step.add.4 = add <2 x i64> %vec.ind, splat (i64 8) 
        18:  %step.add.5 = add <2 x i64> %vec.ind, splat (i64 10) 
        19:  %step.add.6 = add <2 x i64> %vec.ind, splat (i64 12) 
        20:  %step.add.7 = add <2 x i64> %vec.ind, splat (i64 14) 
        21:  %1 = getelementptr i64, ptr %0, i64 %index 
next:26      !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  error: match on wrong line
        22:  %2 = getelementptr i8, ptr %1, i64 16 
        23:  %3 = getelementptr i8, ptr %1, i64 32 
        24:  %4 = getelementptr i8, ptr %1, i64 48 
        25:  %5 = getelementptr i8, ptr %1, i64 64 
        26:  %6 = getelementptr i8, ptr %1, i64 80 
         .
         .
         .
>>>>>>

--
...

@llvm-ci
Collaborator

llvm-ci commented Jan 16, 2025

LLVM Buildbot has detected a new failure on builder ppc64-flang-aix running on ppc64-flang-aix-test while building clang,flang at step 6 "test-build-unified-tree-check-flang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/201/builds/1461

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-flang) failure: test (failure)
******************** TEST 'Flang :: HLFIR/unroll-loops.fir' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/build/bin/flang -fc1 -emit-llvm -O1 -funroll-loops -mllvm -force-vector-width=2 -o- /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/flang/test/HLFIR/unroll-loops.fir | /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/build/bin/FileCheck /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/flang/test/HLFIR/unroll-loops.fir --check-prefixes=CHECK,UNROLL
+ /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/build/bin/flang -fc1 -emit-llvm -O1 -funroll-loops -mllvm -force-vector-width=2 -o- /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/flang/test/HLFIR/unroll-loops.fir
+ /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/build/bin/FileCheck /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/flang/test/HLFIR/unroll-loops.fir --check-prefixes=CHECK,UNROLL
warning: overriding the module target triple with powerpc64-ibm-aix7.2.0.0 [-Woverride-module]
/home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/flang/test/HLFIR/unroll-loops.fir:25:18: error: UNROLL-NEXT: is not on the line after the previous match
 // UNROLL-NEXT: %[[VIND1:.*]] = add <2 x i64> %[[VIND]], splat (i64 2)
                 ^
<stdin>:20:2: note: 'next' match was here
 %vec.ind.next = add <2 x i64> %vec.ind, splat (i64 2)
 ^
<stdin>:17:92: note: previous match ended here
 %vec.ind = phi <2 x i64> [ <i64 1, i64 2>, %vector.ph ], [ %vec.ind.next.4, %vector.body ]
                                                                                           ^
<stdin>:18:1: note: non-matching line after previous match is here
 %1 = getelementptr i64, ptr %0, i64 %index
^

Input file: <stdin>
Check file: /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/flang/test/HLFIR/unroll-loops.fir

-dump-input=help explains the following input dump.

Input was:
<<<<<<
         .
         .
         .
        15: vector.body: ; preds = %vector.body, %vector.ph 
        16:  %index = phi i64 [ 0, %vector.ph ], [ %index.next.4, %vector.body ] 
        17:  %vec.ind = phi <2 x i64> [ <i64 1, i64 2>, %vector.ph ], [ %vec.ind.next.4, %vector.body ] 
        18:  %1 = getelementptr i64, ptr %0, i64 %index 
        19:  store <2 x i64> %vec.ind, ptr %1, align 8, !tbaa !1 
        20:  %vec.ind.next = add <2 x i64> %vec.ind, splat (i64 2) 
next:25      !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  error: match on wrong line
        21:  %gep = getelementptr i64, ptr %invariant.gep, i64 %index 
        22:  store <2 x i64> %vec.ind.next, ptr %gep, align 8, !tbaa !1 
        23:  %vec.ind.next.1 = add <2 x i64> %vec.ind, splat (i64 4) 
        24:  %gep2 = getelementptr i64, ptr %invariant.gep1, i64 %index 
        25:  store <2 x i64> %vec.ind.next.1, ptr %gep2, align 8, !tbaa !1 
         .
         .
         .
>>>>>>

--
...

@mustartt
Member

Hi @DavidTruby, are you investigating this failure on ppc64le-flang-rhel-clang? It has been failing for a few days; could you please update us on a possible fix? I can xfail the test case and open an issue for it while you investigate. Please let me know if there is any way I can help.

kamaub pushed a commit that referenced this pull request Jan 20, 2025
xfail the following 2 test cases that are failing on PowerPC buildbots
`ppc64-flang-aix` and `ppc64le-flang-rhel-clang` due to PR #122906.
Defect opened: #123668.

FAIL: Flang::unroll-loops.fir
FAIL: Flang::unroll-loops.f90
mustartt added a commit to mustartt/llvm-project that referenced this pull request Jan 21, 2025
@DavidTruby
Member Author

I guess the tests in this should have had --target= for the targets I checked them on (x86_64 and aarch64). I can add those and remove the xfail?

@DavidTruby DavidTruby deleted the funroll branch January 22, 2025 12:35