Skip to content

[LV] Split checking if tail-folding is possible, collecting masked ops. #77612

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 8, 2024

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Jan 10, 2024

Introduce new canFoldTail helper which only checks if tail-folding is possible, but without modifying MaskedOps.

Just because tail-folding is possible doesn't mean the tail will be folded; that's up to the cost-model to decide. Separating the check if tail-folding is possible and preparing for tail-folding makes sure that MaskedOps is only populated when tail-folding is actually selected.

This allows only creating the header mask if needed after #76635.

@llvmbot
Copy link
Member

llvmbot commented Jan 10, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

Introduce new canFoldTail helper which only checks if tail-folding is possible, but without modifying MaskedOps.

Just because tail-folding is possible doesn't mean the tail will be folded; that's up to the cost-model to decide. Separating the check if tail-folding is possible and preparing for tail-folding makes sure that MaskedOps is only populated when tail-folding is actually selected.

This allows only creating the header mask if needed after #76635.


Full diff: https://github.com/llvm/llvm-project/pull/77612.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h (+6-3)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp (+18-2)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+10-5)
diff --git a/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h b/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
index a509ebf6a7e1b3..af23813da569b5 100644
--- a/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
+++ b/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
@@ -276,9 +276,12 @@ class LoopVectorizationLegality {
   bool canVectorizeFPMath(bool EnableStrictReductions);
 
   /// Return true if we can vectorize this loop while folding its tail by
-  /// masking, and mark all respective loads/stores for masking.
-  /// This object's state is only modified iff this function returns true.
-  bool prepareToFoldTailByMasking();
+  /// masking.
+  bool canFoldTailByMasking() const;
+
+  /// Mark all respective loads/stores for masking. Must only be called when
+  /// ail-folding is possible.
+  void prepareToFoldTailByMasking();
 
   /// Returns the primary induction variable.
   PHINode *getPrimaryInduction() { return PrimaryInduction; }
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
index 37a356c43e29a4..88fabf620a83ab 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
@@ -1525,7 +1525,7 @@ bool LoopVectorizationLegality::canVectorize(bool UseVPlanNativePath) {
   return Result;
 }
 
-bool LoopVectorizationLegality::prepareToFoldTailByMasking() {
+bool LoopVectorizationLegality::canFoldTailByMasking() const {
 
   LLVM_DEBUG(dbgs() << "LV: checking if tail can be folded by masking.\n");
 
@@ -1570,8 +1570,24 @@ bool LoopVectorizationLegality::prepareToFoldTailByMasking() {
 
   LLVM_DEBUG(dbgs() << "LV: can fold tail by masking.\n");
 
-  MaskedOp.insert(TmpMaskedOp.begin(), TmpMaskedOp.end());
   return true;
 }
 
+void LoopVectorizationLegality::prepareToFoldTailByMasking() {
+  // The list of pointers that we can safely read and write to remains empty.
+  SmallPtrSet<Value *, 8> SafePointers;
+
+  // Collect masked ops in temporary set first to avoid partially populating
+  // MaskedOp if a block cannot be predicated.
+  SmallPtrSet<const Instruction *, 8> TmpMaskedOp;
+
+  // Check and mark all blocks for predication, including those that ordinarily
+  // do not need predication such as the header block.
+  for (BasicBlock *BB : TheLoop->blocks()) {
+    bool R = blockCanBePredicated(BB, SafePointers, MaskedOp);
+    (void)R;
+    assert(R && "Must be able to predicate block when tail-folding.");
+  }
+}
+
 } // namespace llvm
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 51ce88480c0880..d2506bd2aee6ae 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -4689,7 +4689,7 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
   // found modulo the vectorization factor is not zero, try to fold the tail
   // by masking.
   // FIXME: look for a smaller MaxVF that does divide TC rather than masking.
-  if (Legal->prepareToFoldTailByMasking()) {
+  if (Legal->canFoldTailByMasking()) {
     CanFoldTailByMasking = true;
     return MaxFactors;
   }
@@ -7307,6 +7307,9 @@ LoopVectorizationPlanner::plan(ElementCount UserVF, unsigned UserIC) {
       CM.invalidateCostModelingDecisions();
   }
 
+  if (CM.foldTailByMasking())
+    Legal->prepareToFoldTailByMasking();
+
   ElementCount MaxUserVF =
       UserVF.isScalable() ? MaxFactors.ScalableVF : MaxFactors.FixedVF;
   bool UserVFIsLegal = ElementCount::isKnownLE(UserVF, MaxUserVF);
@@ -8680,10 +8683,12 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
       VPBB->setName(BB->getName());
     Builder.setInsertPoint(VPBB);
 
-    if (VPBB == HeaderVPBB)
-      RecipeBuilder.createHeaderMask(*Plan);
-    else if (NeedsMasks)
-      RecipeBuilder.createBlockInMask(BB, *Plan);
+    if (NeedsMasks) {
+      if (VPBB == HeaderVPBB)
+        RecipeBuilder.createHeaderMask(*Plan);
+      else
+        RecipeBuilder.createBlockInMask(BB, *Plan);
+    }
 
     // Introduce each ingredient into VPlan.
     // TODO: Model and preserve debug intrinsics in VPlan.

@fhahn fhahn force-pushed the split-can-fold-tail-prepare-for-tailfolding branch from 507e366 to 35741d9 Compare May 31, 2024 04:48
@fhahn fhahn force-pushed the split-can-fold-tail-prepare-for-tailfolding branch from 35741d9 to 2aaf792 Compare July 8, 2024 08:33
Introduce new canFoldTail helper which only checks if tail-folding is
possible, but without modifying MaskedOps.

Just because tail-folding is possible doesn't mean the tail will be
folded; that's up to the cost-model to decide. Separating the check if
tail-folding is possible and preparing for tail-folding makes sure that
MaskedOps is only populated when tail-folding is actually selected.

This allows only creating the header mask if needed after
llvm#76635.
@fhahn fhahn force-pushed the split-can-fold-tail-prepare-for-tailfolding branch from 2aaf792 to 9fa9457 Compare July 8, 2024 11:00
@fhahn fhahn requested a review from alexey-bataev July 8, 2024 11:09
@fhahn
Copy link
Contributor Author

fhahn commented Jul 8, 2024

Updated llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-interleave.ll to show the impact of the PR: before this patch, prepareToFoldTailByMasking is called before setting the tail-folding decision, so we mark blocks as masked, even if the decision later is to not fold the tail. With the patch, legality check is decoupled from marking blocks as predicated. In the changed test, we now can form an interleave group for scalable vectors, which in turn makes that decision more profitable than the fixed-width VF.

Copy link
Member

@alexey-bataev alexey-bataev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG with a nit

Comment on lines 1618 to 1619
bool R = blockCanBePredicated(BB, SafePointers, MaskedOp);
(void)R;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool R = blockCanBePredicated(BB, SafePointers, MaskedOp);
(void)R;
[[maybe_unused]] bool R = blockCanBePredicated(BB, SafePointers, MaskedOp);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated thanks!

Copy link
Collaborator

@ayalz ayalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM2, adding various nits.

bool canFoldTailByMasking() const;

/// Mark all respective loads/stores for masking. Must only be called when
/// ail-folding is possible.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// ail-folding is possible.
/// tail-folding is possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed thanks!

return true;
}

void LoopVectorizationLegality::prepareToFoldTailByMasking() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above comments worth updating:

  // Collect masked ops in temporary set first to avoid partially populating
  // MaskedOp if a block cannot be predicated.

comment needs rephrasing.

  // Check and mark all blocks for predication, including those that ordinarily
  // do not need predication such as the header block.

only Check, w/o mark

LLVM_DEBUG(dbgs() << "LV: Cannot fold tail by masking as requested.\n");

drop "as requested"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

Comment on lines 1611 to 1614
// Collect masked ops in temporary set first to avoid partially populating
// MaskedOp if a block cannot be predicated.
SmallPtrSet<const Instruction *, 8> TmpMaskedOp;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Collect masked ops in temporary set first to avoid partially populating
// MaskedOp if a block cannot be predicated.
SmallPtrSet<const Instruction *, 8> TmpMaskedOp;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed thanks!

// MaskedOp if a block cannot be predicated.
SmallPtrSet<const Instruction *, 8> TmpMaskedOp;

// Check and mark all blocks for predication, including those that ordinarily
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Check and mark all blocks for predication, including those that ordinarily
// Mark all blocks for predication, including those that ordinarily

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

// Check and mark all blocks for predication, including those that ordinarily
// do not need predication such as the header block.
for (BasicBlock *BB : TheLoop->blocks()) {
bool R = blockCanBePredicated(BB, SafePointers, MaskedOp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth considering splitting this into two, as checking if block can be predicated and collecting its masked ops (if it can) seem two distinct operations. Both avoid considering SafePointers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left as is for now for this PR to keep the number of changes small, but will check as follow-up. More generally, we should probably collect and consider safe pointers with tail folding as well

@fhahn fhahn unassigned ayalz Jul 8, 2024
@fhahn fhahn merged commit 0577cda into llvm:main Jul 8, 2024
5 of 6 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 8, 2024

LLVM Buildbot has detected a new failure on builder clang-aarch64-sve-vla-2stage running on linaro-g3-01 while building llvm at step 11 "build stage 2".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/41/builds/456

Here is the relevant piece of the build log for the reference:

Step 11 (build stage 2) failure: 'ninja' (failure)
...
[7900/8584] Building CXX object tools/flang/lib/Optimizer/Transforms/CMakeFiles/obj.FIRTransforms.dir/FunctionAttr.cpp.o
[7901/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/exceptions.cpp.o
[7902/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/descriptor-io.cpp.o
[7903/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/environment.cpp.o
[7904/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/extensions.cpp.o
[7905/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/execute.cpp.o
[7906/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/external-unit.cpp.o
[7907/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/file.cpp.o
[7908/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-output.cpp.o
[7909/8584] Building CXX object tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o
FAILED: tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o 
/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang++ -DFLANG_INCLUDE_TESTS=1 -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/tools/flang/runtime -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/flang/runtime -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/flang/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/tools/flang/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/include -isystem /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/../mlir/include -isystem /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/tools/mlir/include -isystem /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/tools/clang/include -isystem /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/../clang/include -mcpu=neoverse-512tvb -mllvm -scalable-vectorization=preferred -mllvm -treat-scalable-fixed-error-as-warning=false -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion           -Wcovered-switch-default -Wno-nested-anon-types -fno-lto -O3 -DNDEBUG   -U_GLIBCXX_ASSERTIONS -U_LIBCPP_ENABLE_ASSERTIONS -UNDEBUG -std=c++17  -fno-exceptions -funwind-tables -fno-rtti -MD -MT tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o -MF tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o.d -o tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o -c /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/flang/runtime/edit-input.cpp
clang++: ../llvm/llvm/include/llvm/ADT/APInt.h:1503: uint64_t llvm::APInt::getZExtValue() const: Assertion `getActiveBits() <= 64 && "Too many bits for uint64_t"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang++ -DFLANG_INCLUDE_TESTS=1 -DFLANG_LITTLE_ENDIAN=1 -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/tools/flang/runtime -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/flang/runtime -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/flang/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/tools/flang/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/include -isystem /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/../mlir/include -isystem /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/tools/mlir/include -isystem /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/tools/clang/include -isystem /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/../clang/include -mcpu=neoverse-512tvb -mllvm -scalable-vectorization=preferred -mllvm -treat-scalable-fixed-error-as-warning=false -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion -Wcovered-switch-default -Wno-nested-anon-types -fno-lto -O3 -DNDEBUG -U_GLIBCXX_ASSERTIONS -U_LIBCPP_ENABLE_ASSERTIONS -UNDEBUG -std=c++17 -fno-exceptions -funwind-tables -fno-rtti -MD -MT tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o -MF tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o.d -o tools/flang/runtime/CMakeFiles/obj.FortranRuntime.dir/edit-input.cpp.o -c /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/flang/runtime/edit-input.cpp
1.	<eof> parser at end of file
2.	Optimizer
3.	Running pass "require<globals-aa>,function(invalidate<aa>),require<profile-summary>,cgscc(devirt<4>(inline,function-attrs<skip-non-recursive-function-attrs>,argpromotion,openmp-opt-cgscc,function<eager-inv;no-rerun>(sroa<modify-cfg>,early-cse<memssa>,speculative-execution<only-if-divergent-target>,jump-threading,correlated-propagation,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts;speculate-blocks;simplify-cond-branch>,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>,aggressive-instcombine,libcalls-shrinkwrap,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts;speculate-blocks;simplify-cond-branch>,reassociate,constraint-elimination,loop-mssa(loop-instsimplify,loop-simplifycfg,licm<no-allowspeculation>,loop-rotate<header-duplication;no-prepare-for-lto>,licm<allowspeculation>,simple-loop-unswitch<nontrivial;trivial>),simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts;speculate-blocks;simplify-cond-branch>,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>,loop(loop-idiom,indvars,simple-loop-unswitch<nontrivial;trivial>,loop-idiom-vectorize,loop-deletion,loop-unroll-full),sroa<modify-cfg>,vector-combine,mldst-motion<no-split-footer-bb>,gvn<>,sccp,bdce,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>,jump-threading,correlated-propagation,adce,memcpyopt,dse,move-auto-init,loop-mssa(licm<allowspeculation>),coro-elide,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;hoist-common-insts;sink-common-insts;speculate-blocks;simplify-cond-branch>,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>),function-attrs,function(require<should-not-run-function-passes>),coro-split)),function(invalidate<should-not-run-function-passes>),cgscc(devirt<4>())" on module "/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/flang/runtime/edit-input.cpp"
4.	Running pass "cgscc(devirt<4>(inline,function-attrs<skip-non-recursive-function-attrs>,argpromotion,openmp-opt-cgscc,function<eager-inv;no-rerun>(sroa<modify-cfg>,early-cse<memssa>,speculative-execution<only-if-divergent-target>,jump-threading,correlated-propagation,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts;speculate-blocks;simplify-cond-branch>,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>,aggressive-instcombine,libcalls-shrinkwrap,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts;speculate-blocks;simplify-cond-branch>,reassociate,constraint-elimination,loop-mssa(loop-instsimplify,loop-simplifycfg,licm<no-allowspeculation>,loop-rotate<header-duplication;no-prepare-for-lto>,licm<allowspeculation>,simple-loop-unswitch<nontrivial;trivial>),simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts;speculate-blocks;simplify-cond-branch>,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>,loop(loop-idiom,indvars,simple-loop-unswitch<nontrivial;trivial>,loop-idiom-vectorize,loop-deletion,loop-unroll-full),sroa<modify-cfg>,vector-combine,mldst-motion<no-split-footer-bb>,gvn<>,sccp,bdce,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>,jump-threading,correlated-propagation,adce,memcpyopt,dse,move-auto-init,loop-mssa(licm<allowspeculation>),coro-elide,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;hoist-common-insts;sink-common-insts;speculate-blocks;simplify-cond-branch>,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>),function-attrs,function(require<should-not-run-function-passes>),coro-split))" on module "/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/flang/runtime/edit-input.cpp"
5.	Running pass "loop(loop-idiom,indvars,simple-loop-unswitch<nontrivial;trivial>,loop-idiom-vectorize,loop-deletion,loop-unroll-full)" on function "_ZN7Fortran7runtime2io18ConvertHexadecimalILi64EEENS_7decimal24ConversionToBinaryResultIXT_EEERPKcNS3_15FortranRoundingEi"
 #0 0x0000aaaabba4d480 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x79ad480)
 #1 0x0000aaaabba4b350 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x79ab350)
 #2 0x0000aaaabba4c8d4 llvm::sys::CleanupOnSignal(unsigned long) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x79ac8d4)
 #3 0x0000aaaabb9b3f7c CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #4 0x0000ffff896629d0 (linux-vdso.so.1+0x9d0)
 #5 0x0000ffff891b5d78 raise /build/glibc-Q8DG8B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
 #6 0x0000ffff891a2aac abort /build/glibc-Q8DG8B/glibc-2.31/stdlib/abort.c:81:7
 #7 0x0000ffff891af490 __assert_fail_base /build/glibc-Q8DG8B/glibc-2.31/assert/assert.c:89:7
 #8 0x0000ffff891af4f4 (/lib/aarch64-linux-gnu/libc.so.6+0x2d4f4)
 #9 0x0000aaaabd1f0198 (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x9150198)
#10 0x0000aaaabd1e217c (anonymous namespace)::LoopIdiomRecognize::runOnNoncountableLoop() LoopIdiomRecognize.cpp:0:0
#11 0x0000aaaabd1e0c84 llvm::LoopIdiomRecognizePass::run(llvm::Loop&, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>&, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x9140c84)
#12 0x0000aaaabb8b1264 std::optional<llvm::PreservedAnalyses> llvm::PassManager<llvm::Loop, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&>::runSinglePass<llvm::Loop, std::unique_ptr<llvm::detail::PassConcept<llvm::Loop, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&>, std::default_delete<llvm::detail::PassConcept<llvm::Loop, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&>>>>(llvm::Loop&, std::unique_ptr<llvm::detail::PassConcept<llvm::Loop, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&>, std::default_delete<llvm::detail::PassConcept<llvm::Loop, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&>>>&, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>&, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&, llvm::PassInstrumentation&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x7811264)
#13 0x0000aaaabb8b0e9c llvm::PassManager<llvm::Loop, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&>::runWithoutLoopNestPasses(llvm::Loop&, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>&, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x7810e9c)
#14 0x0000aaaabb8b0680 llvm::PassManager<llvm::Loop, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&>::run(llvm::Loop&, llvm::AnalysisManager<llvm::Loop, llvm::LoopStandardAnalysisResults&>&, llvm::LoopStandardAnalysisResults&, llvm::LPMUpdater&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x7810680)
#15 0x0000aaaabb8b20b4 llvm::FunctionToLoopPassAdaptor::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x78120b4)
#16 0x0000aaaabb4a3aac llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x7403aac)
#17 0x0000aaaabaaed73c llvm::CGSCCToFunctionPassAdaptor::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x6a4d73c)
#18 0x0000aaaabaae89bc llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x6a489bc)
#19 0x0000aaaabaaebd84 llvm::DevirtSCCRepeatedPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x6a4bd84)
#20 0x0000aaaabaaea3cc llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x6a4a3cc)
#21 0x0000aaaabb4a2dbc llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x7402dbc)
#22 0x0000aaaabceca810 llvm::ModuleInlinerWrapperPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x8e2a810)
#23 0x0000aaaabb4a2dbc llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x7402dbc)
#24 0x0000aaaabc1e307c (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&, clang::BackendConsumer*) BackendUtil.cpp:0:0
#25 0x0000aaaabc1da364 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x813a364)
#26 0x0000aaaabc1fda78 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x815da78)
#27 0x0000aaaabde800a8 clang::ParseAST(clang::Sema&, bool, bool) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x9de00a8)

@fhahn fhahn deleted the split-can-fold-tail-prepare-for-tailfolding branch July 13, 2024 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants