[flang] Add -f[no-]unroll-loops flag #122906
This patch adds support for the -funroll-loops and -fno-unroll-loops flags with similar behaviour to clang. funroll-loops is enabled at -O2 onwards as in clang.
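The default described above can be sketched in isolation. This is a minimal model, not the code from the patch: `resolveFlag` and `unrollLoopsEnabled` are hypothetical names, and `resolveFlag` only imitates the "last occurrence wins, otherwise use the default" behaviour that the patch relies on from `hasFlag`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the hasFlag lookup used in the patch: the last
// occurrence of the positive or negative spelling wins; if neither appears,
// the supplied default is returned.
bool resolveFlag(const std::vector<std::string> &args, const std::string &pos,
                 const std::string &neg, bool defaultValue) {
  assert(pos != neg && "positive and negative spellings must differ");
  bool value = defaultValue;
  for (const std::string &arg : args) {
    if (arg == pos)
      value = true;
    else if (arg == neg)
      value = false;
  }
  return value;
}

// Mirrors the default in the description: unrolling is on from -O2 onwards
// unless one of the two flags says otherwise.
bool unrollLoopsEnabled(const std::vector<std::string> &args,
                        unsigned optLevel) {
  return resolveFlag(args, "-funroll-loops", "-fno-unroll-loops",
                     optLevel > 1);
}
```

Under this model, `unrollLoopsEnabled({}, 1)` is false, `unrollLoopsEnabled({"-funroll-loops"}, 1)` is true, and a trailing `-fno-unroll-loops` overrides an earlier `-funroll-loops`, which matches the four RUN lines in the test.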
@llvm/pr-subscribers-flang-driver @llvm/pr-subscribers-clang-driver
Author: David Truby (DavidTruby)
Changes
This patch adds support for the -funroll-loops and -fno-unroll-loops flags with similar behaviour to clang. funroll-loops is enabled at -O2 onwards as is the current default.
Full diff: https://github.com/llvm/llvm-project/pull/122906.diff
6 Files Affected:
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 2721c1b5d8dc55..4bab2ae4d8dd5c 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4157,9 +4157,9 @@ def ftrap_function_EQ : Joined<["-"], "ftrap-function=">, Group<f_Group>,
HelpText<"Issue call to specified function rather than a trap instruction">,
MarshallingInfoString<CodeGenOpts<"TrapFuncName">>;
def funroll_loops : Flag<["-"], "funroll-loops">, Group<f_Group>,
- HelpText<"Turn on loop unroller">, Visibility<[ClangOption, CC1Option]>;
+ HelpText<"Turn on loop unroller">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>;
def fno_unroll_loops : Flag<["-"], "fno-unroll-loops">, Group<f_Group>,
- HelpText<"Turn off loop unroller">, Visibility<[ClangOption, CC1Option]>;
+ HelpText<"Turn off loop unroller">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>;
def ffinite_loops: Flag<["-"], "ffinite-loops">, Group<f_Group>,
HelpText<"Assume all non-trivial loops are finite.">, Visibility<[ClangOption, CC1Option]>;
def fno_finite_loops: Flag<["-"], "fno-finite-loops">, Group<f_Group>,
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a7d0cc99f27d2d..282a4e267b3dfc 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -150,12 +150,17 @@ void Flang::addCodegenOptions(const ArgList &Args,
if (shouldLoopVersion(Args))
CmdArgs.push_back("-fversion-loops-for-stride");
+ Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
+ options::OPT_flang_deprecated_no_hlfir,
+ options::OPT_fno_ppc_native_vec_elem_order,
+ options::OPT_fppc_native_vec_elem_order});
Args.addAllArgs(CmdArgs,
{options::OPT_flang_experimental_hlfir,
options::OPT_flang_deprecated_no_hlfir,
options::OPT_fno_ppc_native_vec_elem_order,
options::OPT_fppc_native_vec_elem_order,
- options::OPT_ftime_report, options::OPT_ftime_report_EQ});
+ options::OPT_ftime_report, options::OPT_ftime_report_EQ,
+ options::OPT_funroll_loops, options::OPT_fno_unroll_loops});
}
void Flang::addPicOptions(const ArgList &Args, ArgStringList &CmdArgs) const {
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 9d03ec88a56b8a..deb8d1aede518b 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -32,6 +32,7 @@ CODEGENOPT(PrepareForThinLTO , 1, 0) ///< Set when -flto=thin is enabled on the
///< compile step.
CODEGENOPT(StackArrays, 1, 0) ///< -fstack-arrays (enable the stack-arrays pass)
CODEGENOPT(LoopVersioning, 1, 0) ///< Enable loop versioning.
+CODEGENOPT(UnrollLoops, 1, 0) ///< Enable loop unrolling
CODEGENOPT(AliasAnalysis, 1, 0) ///< Enable alias analysis pass
CODEGENOPT(Underscoring, 1, 1)
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 5e7127313c1335..15b1e1e0a24881 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -246,6 +246,10 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
clang::driver::options::OPT_fno_loop_versioning, false))
opts.LoopVersioning = 1;
+ opts.UnrollLoops = args.hasFlag(clang::driver::options::OPT_funroll_loops,
+ clang::driver::options::OPT_fno_unroll_loops,
+ (opts.OptimizationLevel > 1));
+
opts.AliasAnalysis = opts.OptimizationLevel > 0;
// -mframe-pointer=none/non-leaf/all option.
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 52a18d59c7cda5..b0545a7ac2f99a 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -1028,6 +1028,8 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
si.registerCallbacks(pic, &mam);
if (ci.isTimingEnabled())
si.getTimePasses().setOutStream(ci.getTimingStreamLLVM());
+ pto.LoopUnrolling = opts.UnrollLoops;
+ pto.LoopInterleaving = opts.UnrollLoops;
llvm::PassBuilder pb(targetMachine, pto, pgoOpt, &pic);
// Attempt to load pass plugins and register their callbacks with PB.
diff --git a/flang/test/HLFIR/unroll-loops.fir b/flang/test/HLFIR/unroll-loops.fir
new file mode 100644
index 00000000000000..f645132262f8d6
--- /dev/null
+++ b/flang/test/HLFIR/unroll-loops.fir
@@ -0,0 +1,43 @@
+// RUN: %flang_fc1 -emit-llvm -O1 -funroll-loops -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,UNROLL
+// RUN: %flang_fc1 -emit-llvm -O2 -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,UNROLL
+// RUN: %flang_fc1 -emit-llvm -O1 -fno-unroll-loops -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,NO-UNROLL
+// RUN: %flang_fc1 -emit-llvm -O1 -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,NO-UNROLL
+
+// CHECK-LABEL: @unroll
+// CHECK-SAME: (ptr nocapture writeonly %[[ARG0:.*]])
+func.func @unroll(%arg0: !fir.ref<!fir.array<1000xf64>> {fir.bindc_name = "a"}) {
+ // CHECK: %[[GEPIV:.*]] = getelementptr i8, ptr %0, i64 -8
+ %scope = fir.dummy_scope : !fir.dscope
+ %c1000 = arith.constant 1000 : index
+ %shape = fir.shape %c1000 : (index) -> !fir.shape<1>
+ %a:2 = hlfir.declare %arg0(%shape) dummy_scope %scope {uniq_name = "unrollEa"} : (!fir.ref<!fir.array<1000xf64>>, !fir.shape<1>, !fir.dscope) -> (!fir.ref<!fir.array<1000xf64>>, !fir.ref<!fir.array<1000xf64>>)
+ %c1 = arith.constant 1 : index
+ fir.do_loop %arg1 = %c1 to %c1000 step %c1 {
+ // CHECK: [[BLK:.*]]:
+
+ // NO-UNROLL-NEXT: %[[PHI:.*]] = phi i64 [ 1, %{{.*}} ], [ %[[NIV:.*]], %[[BLK]] ]
+ // NO-UNROLL-NEXT: %[[IV_D:.*]] = uitofp nneg i64 %[[PHI]] to double
+ // NO-UNROLL-NEXT: %[[GEP:.*]] = getelementptr double, ptr %[[GEPIV]], i64 %[[PHI]]
+ // NO-UNROLL-NEXT: store double %[[IV_D]], ptr %[[GEP]]
+ // NO-UNROLL-NEXT: %[[NIV:.*]] = add nuw nsw i64 %{{.*}}, 1
+ // NO-UNROLL-NEXT: %[[EXIT:.*]] = icmp eq i64 %[[NIV]], 1001
+ // NO-UNROLL-NEXT: br i1 %[[EXIT]], label %{{.*}}, label %[[BLK]]
+
+ // UNROLL-NEXT: %[[PHI:.*]] = phi i64 [ 0, %{{.*}} ], [ %[[NIV:.*]], %[[BLK]] ]
+ // UNROLL-NEXT: %[[IV0:.*]] = or disjoint i64 %[[PHI]], 1
+ // UNROLL-NEXT: %[[IV1:.*]] = add i64 %[[PHI]], 2
+ // UNROLL-NEXT: %[[IV0_D:.*]] = uitofp nneg i64 %[[IV0]] to double
+ // UNROLL-NEXT: %[[IV1_D:.*]] = uitofp nneg i64 %[[IV1]] to double
+ // UNROLL-NEXT: %[[GEP0:.*]] = getelementptr double, ptr %[[ARG0]], i64 %[[PHI]]
+ // UNROLL-NEXT: %[[GEP1:.*]] = getelementptr double, ptr %[[GEPIV]], i64 %[[IV1]]
+ // UNROLL-NEXT: store double %[[IV0_D]], ptr %[[GEP0]]
+ // UNROLL-NEXT: store double %[[IV1_D]], ptr %[[GEP1]]
+ // UNROLL-NEXT: %[[NIV:.*]] = add nuw i64 %[[PHI]], 2
+ // UNROLL-NEXT: %[[EXIT:.*]] = icmp eq i64 %[[NIV]], 1000
+ // UNROLL-NEXT: br i1 %[[EXIT]], label %{{.*}}, label %[[BLK]]
+ %iv = fir.convert %arg1 : (index) -> f64
+ %ai = hlfir.designate %a#0 (%arg1) : (!fir.ref<!fir.array<1000xf64>>, index) -> !fir.ref<f64>
+ hlfir.assign %iv to %ai : f64, !fir.ref<f64>
+ }
+ return
+}
flang/test/HLFIR/unroll-loops.fir
@@ -0,0 +1,43 @@
// RUN: %flang_fc1 -emit-llvm -O1 -funroll-loops -mllvm -force-vector-width=1 -o- %s | FileCheck %s --check-prefixes=CHECK,UNROLL
The test should probably be in the Integration directory, and possibly be a source-to-LLVM-IR test.
A source to (at least) HLFIR test will also check that the -f[no-]unroll-loops option has not disappeared from the frontend, and is being passed correctly to fc1.
I guess we should possibly have all 3? A source to HLFIR, this test, and source->llvmir?
Actually, source->HLFIR isn't useful when the driver test already exists. I'll add a source->llvm integration test.
Args.addAllArgs(CmdArgs, {options::OPT_flang_experimental_hlfir,
options::OPT_flang_deprecated_no_hlfir,
options::OPT_fno_ppc_native_vec_elem_order,
options::OPT_fppc_native_vec_elem_order});
Why is this added separately from below?
Bad rebase I think 😓
Args.addAllArgs(CmdArgs,
{options::OPT_flang_experimental_hlfir,
options::OPT_flang_deprecated_no_hlfir,
options::OPT_fno_ppc_native_vec_elem_order,
options::OPT_fppc_native_vec_elem_order,
- options::OPT_ftime_report, options::OPT_ftime_report_EQ});
+ options::OPT_ftime_report, options::OPT_ftime_report_EQ,
+ options::OPT_funroll_loops, options::OPT_fno_unroll_loops});
We can add a forwarding test from the driver to the frontend driver.
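What such a forwarding test would check can be illustrated with a simplified model. This is only a sketch under stated assumptions: `forwardArgs` is a hypothetical helper, not part of the clang driver, and it matches arguments by exact spelling rather than by option ID as the real `addAllArgs` call does.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Simplified, hypothetical model of driver-to-frontend forwarding: every
// argument whose spelling appears in `forwarded` is copied to the frontend
// command line, preserving the order given on the driver command line.
std::vector<std::string>
forwardArgs(const std::vector<std::string> &driverArgs,
            const std::vector<std::string> &forwarded) {
  std::vector<std::string> fc1Args;
  for (const std::string &arg : driverArgs) {
    if (std::find(forwarded.begin(), forwarded.end(), arg) != forwarded.end())
      fc1Args.push_back(arg);
  }
  // Forwarding can only drop arguments, never invent new ones.
  assert(fc1Args.size() <= driverArgs.size());
  return fc1Args;
}
```

Keeping the original order matters here because the frontend resolves `-funroll-loops` against `-fno-unroll-loops` by taking whichever comes last.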
! NO-UNROLL-NEXT: %[[IV_D:.*]] = uitofp nneg <2 x i64> %[[VIND]] to <2 x double>
! NO-UNROLL-NEXT: %[[GEP:.*]] = getelementptr double, ptr %[[ARG0]], i64 %[[IND]]
! NO-UNROLL-NEXT: store <2 x double> %[[IV_D]], ptr %[[GEP]]
! NO-UNROLL-NEXT: %[[NIV:.*]] = add nuw i64 %{{.*}}, 2
! NO-UNROLL-NEXT: %[[NVIND]] = add <2 x i64> %[[VIND]], splat (i64 2)
Unroll should ideally check for the branch back to the body. And nounroll should probably check that there is no such branch?
-funroll-loops doesn't mean that the loop will be fully unrolled, just that some unrolling can occur. So there's a branch back to the body of the loop in both cases.
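To make the point about partial unrolling concrete, here is a hand-written C++ analogue of the test loop (a(i) = i). It is an illustration only: `fillRolled` and `fillUnrolledBy2` are hypothetical names, the factor of 2 is chosen to match the UNROLL check lines, and the code says nothing about what the LLVM unroller will actually pick for a given target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rolled form of the loop in the test: a(i) = i for i = 1..n.
std::vector<double> fillRolled(std::size_t n) {
  std::vector<double> a(n);
  for (std::size_t i = 0; i < n; ++i)
    a[i] = static_cast<double>(i + 1);
  return a;
}

// Hand-unrolled by a factor of 2. The loop still branches back to its body;
// it just does so half as often. A scalar epilogue handles odd trip counts.
std::vector<double> fillUnrolledBy2(std::size_t n) {
  std::vector<double> a(n);
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    a[i] = static_cast<double>(i + 1);
    a[i + 1] = static_cast<double>(i + 2);
  }
  for (; i < n; ++i)
    a[i] = static_cast<double>(i + 1);
  assert(i == n && "every element must have been written");
  return a;
}
```

Both functions produce the same array for any `n`; the only difference is how many stores happen per backward branch, which is what the UNROLL and NO-UNROLL check lines distinguish.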
For small loops, it will fully unroll if the body is small.
The issue with checking for two iterations is that this could be due to interleaving during vectorisation as well.
I actually did try to only have (e.g.) 2 iterations of the loop; for some reason the full loop is still emitted, at least at -O1. What I ended up with here was essentially copied from the clang -funroll-loops tests but converted to Fortran.
! CHECK-LABEL: @unroll
! CHECK-SAME: (ptr nocapture writeonly %[[ARG0:.*]])
subroutine unroll(a)
real(kind=8), intent(out) :: a(1000)
Can this be an integer array to avoid the uitofp?
LG.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/17336 Here is the relevant piece of the build log for the reference
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/201/builds/1461 Here is the relevant piece of the build log for the reference
Hi @DavidTruby, are you investigating this failure on
This reverts commit 0195ec4.
I guess the tests in this should have had --target= for the targets I checked them on (x86_64 and aarch64). I can add those and remove the xfail?