-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[LV] Invalidate SCEV values in the scalar loop after loop vectorization #69886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Yingwei Zheng (dtcxzyw) ChangesThis PR fixes the assertion failure of Fixes #69097. I wonder if it increases compile time. Waiting for the evaluation result from https://llvm-compile-time-tracker.com/. Full diff: https://github.com/llvm/llvm-project/pull/69886.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index a5d523894bef155..8c8b4a1e2c32f7a 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -3543,7 +3543,10 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State,
// Forget the original basic block.
PSE.getSE()->forgetLoop(OrigLoop);
- PSE.getSE()->forgetBlockAndLoopDispositions();
+ // Now the scalar loop doesn't dominate any of the loop exits.
+ for (auto *Block : OrigLoop->getBlocks())
+ for (auto &Inst : *Block)
+ PSE.getSE()->forgetValue(&Inst);
// After vectorization, the exit blocks of the original loop will have
// additional predecessors. Invalidate SCEVs for the exit phis in case SE
diff --git a/llvm/test/Transforms/LoopVectorize/pr66616.ll b/llvm/test/Transforms/LoopVectorize/pr66616.ll
new file mode 100644
index 000000000000000..8374cfacfb71ecc
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/pr66616.ll
@@ -0,0 +1,35 @@
+; RUN: opt -passes="require<scalar-evolution>,print<scalar-evolution>,loop-vectorize" --verify-scev --verify-scev-strict -S < %s | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; CHECK-LABEL: @pr66616(
+; CHECK: vector.body
+define void @pr66616() {
+entry:
+ br label %for.body
+
+for.cond5.preheader: ; preds = %for.body
+ br label %while.body.i
+
+for.body: ; preds = %for.body, %entry
+ %storemerge12 = phi i8 [ 0, %entry ], [ %inc, %for.body ]
+ %0 = load i8, ptr null, align 1
+ %conv2 = sext i8 %0 to i32
+ %add3 = add i32 %conv2, 1
+ %inc = add i8 %storemerge12, 1
+ %conv1 = zext i8 %inc to i32
+ %tobool.not = icmp eq i32 %conv1, 0
+ br i1 %tobool.not, label %for.cond5.preheader, label %for.body
+
+while.body.i: ; preds = %while.body.i, %for.cond5.preheader
+ %i.addr.09.i = phi i32 [ %dec.i, %while.body.i ], [ %add3, %for.cond5.preheader ]
+ %incdec.ptr48.i = phi ptr [ %incdec.ptr.i, %while.body.i ], [ null, %for.cond5.preheader ]
+ %dec.i = add i32 %i.addr.09.i, 1
+ %incdec.ptr.i = getelementptr i8, ptr %incdec.ptr48.i, i64 1
+ %tobool.not.i = icmp eq i32 %i.addr.09.i, 0
+ br i1 %tobool.not.i, label %for.inc9.loopexit, label %while.body.i
+
+for.inc9.loopexit: ; preds = %while.body.i
+ ret void
+}
|
Looks like it doesn't have an impact on the compile time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for investigating this!
Could you provide more context about the failure in the description? form the issue, it looks like the crash is bool llvm::ScalarEvolution::isLoopEntryGuardedByCond(const Loop *, ICmpInst::Predicate, const SCEV *, const SCEV *): Assertion
isAvailableAtLoopEntry(LHS, L) && "LHS is not available at Loop Entry"`, correct?
In that case, is it possible that we kept around some invalid block & loop dispositions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the correct solution, but given the lack of root cause analysis in the description I cannot suggest what the correct one would be.
IR after vectorizing the loop
Without this patch, the trip count for loop
I will update the test later. |
Simplified test: https://godbolt.org/z/36cqs16os |
…orization Simplify tests.
Thanks for the explanation. As I understand it, the root issue is that SCEV looks through trivial LCSSA phi nodes, and we currently fail to fully invalidate when they become non-trivial. If this understand is correct, I believe it would be sufficient to invalidate SCEVUnknowns used by the LCSSA phi and defined in the loop. I've put up #69909 based on that idea. |
Closed as #69909 is better. |
Fix this an alternative to llvm#69886. The basic problem is that SCEV can look through trivial LCSSA phis. When the phi node later becomes non-trivial, we do invalidate it, but this doesn't catch uses that are not covered by the IR use-def walk, such as thoe in BECounts. Fix this by adding a special invalidation method for LCSSA phis, which will also invalidate all the SCEVUnknowns used by the LCSSA phi node. We should probably also use this invalidation method in other places that add predecessors to exit blocks, such as loop unroling and loop peeling.
This an alternative to #69886. The basic problem is that SCEV can look through trivial LCSSA phis. When the phi node later becomes non-trivial, we do invalidate it, but this doesn't catch uses that are not covered by the IR use-def walk, such as those in BECounts. Fix this by adding a special invalidation method for LCSSA phis, which will also invalidate all the SCEVUnknowns/SCEVAddRecExprs used by the LCSSA phi node and defined in the loop. We should probably also use this invalidation method in other places that add predecessors to exit blocks, such as loop unrolling and loop peeling. Fixes #69097. Fixes #66616. Fixes #63970.
This an alternative to llvm#69886. The basic problem is that SCEV can look through trivial LCSSA phis. When the phi node later becomes non-trivial, we do invalidate it, but this doesn't catch uses that are not covered by the IR use-def walk, such as those in BECounts. Fix this by adding a special invalidation method for LCSSA phis, which will also invalidate all the SCEVUnknowns/SCEVAddRecExprs used by the LCSSA phi node and defined in the loop. We should probably also use this invalidation method in other places that add predecessors to exit blocks, such as loop unrolling and loop peeling. Fixes llvm#69097. Fixes llvm#66616. Fixes llvm#63970.
This an alternative to llvm#69886. The basic problem is that SCEV can look through trivial LCSSA phis. When the phi node later becomes non-trivial, we do invalidate it, but this doesn't catch uses that are not covered by the IR use-def walk, such as those in BECounts. Fix this by adding a special invalidation method for LCSSA phis, which will also invalidate all the SCEVUnknowns/SCEVAddRecExprs used by the LCSSA phi node and defined in the loop. We should probably also use this invalidation method in other places that add predecessors to exit blocks, such as loop unrolling and loop peeling. Fixes llvm#69097. Fixes llvm#66616. Fixes llvm#63970.
This PR fixes incorrect trip counts after vectorizing the first loop.
After loop vectorization, we should forget all SCEV values in the scalar loops since the scalar loop doesn't dominate any of the loop exits.
Fixes #69097.
Fixes #66616.
Fixes #63970.