[LV] Forget LCSSA phi with new pred before other SCEV invalidation. #119897
Conversation
@llvm/pr-subscribers-vectorizers @llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-analysis
Author: Florian Hahn (fhahn)
Changes: When the original SCEV gets invalidated, any SCEV computed at a different scope may also become invalid. Add SCEVs created for scopes as users of the original value to ensure proper invalidation. Fixes #119665.
Full diff: https://github.com/llvm/llvm-project/pull/119897.diff
2 Files Affected:
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index e18133971f5bf0..c7488faa649e23 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -8442,7 +8442,7 @@ ScalarEvolution::getBackedgeTakenInfo(const Loop *L) {
// recusive call to getBackedgeTakenInfo (on a different
// loop), which would invalidate the iterator computed
// earlier.
- return BackedgeTakenCounts.find(L)->second = std::move(Result);
+ return BackedgeTakenCounts[L] = std::move(Result);
}
void ScalarEvolution::forgetAllLoops() {
@@ -9867,8 +9867,10 @@ const SCEV *ScalarEvolution::getSCEVAtScope(const SCEV *V, const Loop *L) {
for (auto &LS : reverse(ValuesAtScopes[V]))
if (LS.first == L) {
LS.second = C;
- if (!isa<SCEVConstant>(C))
+ if (!isa<SCEVConstant>(C)) {
ValuesAtScopesUsers[C].push_back({L, V});
+ SCEVUsers[V].insert(C);
+ }
break;
}
return C;
@@ -14211,7 +14213,8 @@ void ScalarEvolution::forgetBackedgeTakenCounts(const Loop *L,
for (const SCEV *S : {ENT.ExactNotTaken, ENT.SymbolicMaxNotTaken}) {
if (!isa<SCEVConstant>(S)) {
auto UserIt = BECountUsers.find(S);
- assert(UserIt != BECountUsers.end());
+ if (UserIt == BECountUsers.end())
+ continue;
UserIt->second.erase({L, Predicated});
}
}
diff --git a/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll b/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll
new file mode 100644
index 00000000000000..c91f36bfb66af3
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll
@@ -0,0 +1,113 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes='print<scalar-evolution>,loop-vectorize' -force-vector-width=4 -scalar-evolution-classify-expressions=false -S %s | FileCheck %s
+
+; Test case for https://github.com/llvm/llvm-project/issues/119665.
+define void @test_invalidate_scevs_at_scope(ptr %p) {
+; CHECK-LABEL: define void @test_invalidate_scevs_at_scope(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; CHECK: [[VECTOR_PH]]:
+; CHECK-NEXT: br label %[[VECTOR_BODY:.*]]
+; CHECK: [[VECTOR_BODY]]:
+; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, %[[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[P]], align 4
+; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
+; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
+; CHECK-NEXT: [[TMP1:%.*]] = add <4 x i32> [[BROADCAST_SPLAT]], [[VEC_IND]]
+; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
+; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], splat (i32 4)
+; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i32 [[INDEX_NEXT]], 100
+; CHECK-NEXT: br i1 [[TMP2]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK: [[MIDDLE_BLOCK]]:
+; CHECK-NEXT: [[TMP3:%.*]] = extractelement <4 x i32> [[TMP1]], i32 3
+; CHECK-NEXT: br i1 false, label %[[EXIT_1:.*]], label %[[SCALAR_PH]]
+; CHECK: [[SCALAR_PH]]:
+; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i32 [ 100, %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ]
+; CHECK-NEXT: br label %[[LOOP_1:.*]]
+; CHECK: [[LOOP_1]]:
+; CHECK-NEXT: [[IV_1:%.*]] = phi i32 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[IV_1_NEXT:%.*]], %[[LOOP_1]] ]
+; CHECK-NEXT: [[TMP4:%.*]] = load i32, ptr [[P]], align 4
+; CHECK-NEXT: [[ADD_1:%.*]] = add i32 [[TMP4]], [[IV_1]]
+; CHECK-NEXT: [[IV_1_NEXT]] = add i32 [[IV_1]], 1
+; CHECK-NEXT: [[C_1:%.*]] = icmp eq i32 [[IV_1]], 100
+; CHECK-NEXT: br i1 [[C_1]], label %[[EXIT_1]], label %[[LOOP_1]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK: [[EXIT_1]]:
+; CHECK-NEXT: [[ADD_LCSSA:%.*]] = phi i32 [ [[ADD_1]], %[[LOOP_1]] ], [ [[TMP3]], %[[MIDDLE_BLOCK]] ]
+; CHECK-NEXT: [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[ADD_LCSSA]], i32 100)
+; CHECK-NEXT: [[TMP5:%.*]] = sub i32 [[SMAX]], [[ADD_LCSSA]]
+; CHECK-NEXT: [[TMP6:%.*]] = zext i32 [[TMP5]] to i64
+; CHECK-NEXT: [[TMP7:%.*]] = add nuw nsw i64 [[TMP6]], 1
+; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP7]], 4
+; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH2:.*]], label %[[VECTOR_PH3:.*]]
+; CHECK: [[VECTOR_PH3]]:
+; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[TMP7]], 4
+; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[TMP7]], [[N_MOD_VF]]
+; CHECK-NEXT: br label %[[VECTOR_BODY4:.*]]
+; CHECK: [[VECTOR_BODY4]]:
+; CHECK-NEXT: [[INDEX5:%.*]] = phi i64 [ 0, %[[VECTOR_PH3]] ], [ [[INDEX_NEXT8:%.*]], %[[VECTOR_BODY4]] ]
+; CHECK-NEXT: [[VEC_IND6:%.*]] = phi <4 x i64> [ <i64 0, i64 1, i64 2, i64 3>, %[[VECTOR_PH3]] ], [ [[VEC_IND_NEXT7:%.*]], %[[VECTOR_BODY4]] ]
+; CHECK-NEXT: [[TMP8:%.*]] = add i64 [[INDEX5]], 0
+; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds i64, ptr [[P]], i64 [[TMP8]]
+; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds i64, ptr [[TMP9]], i32 0
+; CHECK-NEXT: store <4 x i64> [[VEC_IND6]], ptr [[TMP10]], align 4
+; CHECK-NEXT: [[INDEX_NEXT8]] = add nuw i64 [[INDEX5]], 4
+; CHECK-NEXT: [[VEC_IND_NEXT7]] = add <4 x i64> [[VEC_IND6]], splat (i64 4)
+; CHECK-NEXT: [[TMP11:%.*]] = icmp eq i64 [[INDEX_NEXT8]], [[N_VEC]]
+; CHECK-NEXT: br i1 [[TMP11]], label %[[MIDDLE_BLOCK1:.*]], label %[[VECTOR_BODY4]], !llvm.loop [[LOOP4:![0-9]+]]
+; CHECK: [[MIDDLE_BLOCK1]]:
+; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP7]], [[N_VEC]]
+; CHECK-NEXT: br i1 [[CMP_N]], label %[[EXIT_2:.*]], label %[[SCALAR_PH2]]
+; CHECK: [[SCALAR_PH2]]:
+; CHECK-NEXT: [[BC_RESUME_VAL9:%.*]] = phi i64 [ [[N_VEC]], %[[MIDDLE_BLOCK1]] ], [ 0, %[[EXIT_1]] ]
+; CHECK-NEXT: br label %[[LOOP_2:.*]]
+; CHECK: [[LOOP_2]]:
+; CHECK-NEXT: [[IV_2:%.*]] = phi i64 [ [[BC_RESUME_VAL9]], %[[SCALAR_PH2]] ], [ [[IV_2_NEXT:%.*]], %[[LOOP_2]] ]
+; CHECK-NEXT: [[IV_2_TRUNC:%.*]] = trunc i64 [[IV_2]] to i32
+; CHECK-NEXT: [[IV_2_NEXT]] = add i64 [[IV_2]], 1
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds i64, ptr [[P]], i64 [[IV_2]]
+; CHECK-NEXT: store i64 [[IV_2]], ptr [[GEP]], align 4
+; CHECK-NEXT: [[ADD_2:%.*]] = add i32 [[ADD_LCSSA]], [[IV_2_TRUNC]]
+; CHECK-NEXT: [[C_2:%.*]] = icmp slt i32 [[ADD_2]], 100
+; CHECK-NEXT: br i1 [[C_2]], label %[[LOOP_2]], label %[[EXIT_2]], !llvm.loop [[LOOP5:![0-9]+]]
+; CHECK: [[EXIT_2]]:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop.1
+
+loop.1:
+ %iv.1 = phi i32 [ 0, %entry ], [ %iv.1.next, %loop.1 ]
+ %1 = load i32, ptr %p, align 4
+ %add.1 = add i32 %1, %iv.1
+ %iv.1.next = add i32 %iv.1, 1
+ %c.1 = icmp eq i32 %iv.1, 100
+ br i1 %c.1, label %exit.1, label %loop.1
+
+exit.1:
+ %add.lcssa = phi i32 [ %add.1, %loop.1 ]
+ br label %loop.2
+
+loop.2:
+ %iv.2 = phi i64 [ 0, %exit.1 ], [ %iv.2.next, %loop.2 ]
+ %iv.2.trunc = trunc i64 %iv.2 to i32
+ %iv.2.next = add i64 %iv.2, 1
+ %gep = getelementptr inbounds i64, ptr %p, i64 %iv.2
+ store i64 %iv.2, ptr %gep
+ %add.2 = add i32 %add.lcssa, %iv.2.trunc
+ %c.2 = icmp slt i32 %add.2, 100
+ br i1 %c.2, label %loop.2, label %exit.2
+
+exit.2:
+ ret void
+}
+
+;.
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
+; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1}
+; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"}
+; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META2]], [[META1]]}
+; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META1]], [[META2]]}
+; CHECK: [[LOOP5]] = distinct !{[[LOOP5]], [[META2]], [[META1]]}
+;.
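
For readers skimming the diff: the version of the patch shown above works by registering each value-at-scope as a user of the original SCEV, so that invalidating the original also reaches the derived entries. A minimal, self-contained sketch of that bookkeeping idea (toy types and names, not LLVM's actual data structures) might look like:

```cpp
#include <cstdio>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy stand-ins for SCEV expressions and loops; all names are illustrative.
using Expr = std::string;
using LoopId = int;

struct ToyScalarEvolution {
  // Cached value-at-scope results, keyed by (original expression, loop).
  std::map<std::pair<Expr, LoopId>, Expr> ValuesAtScopes;
  // Reverse map: expressions that were derived from a given expression.
  std::map<Expr, std::set<Expr>> Users;

  void recordValueAtScope(const Expr &V, LoopId L, const Expr &AtScope) {
    ValuesAtScopes[{V, L}] = AtScope;
    // The essence of the first version of this patch, in toy form: the value
    // computed at a scope is registered as a user of the original expression.
    Users[V].insert(AtScope);
  }

  // Forgetting an expression also forgets everything derived from it
  // (toy code: assumes the user graph has no cycles).
  void forget(const Expr &V) {
    for (const Expr &U : Users[V])
      forget(U);
    Users.erase(V);
    for (auto It = ValuesAtScopes.begin(); It != ValuesAtScopes.end();) {
      if (It->first.first == V || It->second == V)
        It = ValuesAtScopes.erase(It);
      else
        ++It;
    }
  }
};

int main() {
  ToyScalarEvolution SE;
  SE.recordValueAtScope("{0,+,1}<loop.1>", /*LoopId=*/2, "%add.lcssa.at.scope");
  SE.forget("{0,+,1}<loop.1>");
  std::printf("cached value-at-scope entries left: %zu\n",
              SE.ValuesAtScopes.size()); // prints 0
  return 0;
}
```

The key point is the transitive walk in `forget`: once the scoped expression is recorded as a user, a single invalidation of the original expression clears everything derived from it.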
Can you please provide some more details on what happens in the test case? Like which SCEV with which SCEVAtScope gets invalidated where?
Added more details for the test, thanks.
Based on your description, I feel like this is something that forgetLcssaPhiWithNewPredecessor() on modified LCSSA phi nodes should be handling -- why doesn't it? Is there a call missing somewhere or does it not do what is necessary here?
IIUC the issue here is that we need to invalidate …
Looking at https://llvm.godbolt.org/z/xdWvqbhaj, …
When the original SCEV gets invalidated, any SCEV computed at a different scope may also become invalid. Add SCEVs created for scopes as users of the original value to ensure proper invalidation. Fixes llvm#119665.
Force-pushed from dc15503 to e3d5b2d.
Had another look and it seems the issue was that we invalidated the SCEV for the phi before …
LGTM
[LV] Forget LCSSA phi with new pred before other SCEV invalidation. (#119897) `forgetLcssaPhiWithNewPredecessor` performs additional invalidation if there is an existing SCEV for the phi, but earlier `forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate the SCEV for the phi. Change the order to first call `forgetLcssaPhiWithNewPredecessor` to ensure it runs before its SCEV gets invalidated too eagerly. Fixes llvm/llvm-project#119665. PR: llvm/llvm-project#119897
(cherry picked from commit 3706dfe)
`forgetLcssaPhiWithNewPredecessor` performs additional invalidation if there is an existing SCEV for the phi, but earlier `forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate the SCEV for the phi. Change the order to first call `forgetLcssaPhiWithNewPredecessor` to ensure it runs before its SCEV gets invalidated too eagerly. Fixes #119665.
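
The ordering issue can be illustrated with a small, self-contained toy (not LLVM code; `forgetBulk` and `forgetLcssaPhi` below are stand-ins for the real `forgetLoop`/`forgetBlockAndLoopDispositions` and `forgetLcssaPhiWithNewPredecessor` calls): the phi-specific invalidation only performs its extra work while a cached entry for the phi still exists, so it must run before the bulk invalidation that erases that entry.

```cpp
#include <cstdio>
#include <map>
#include <set>
#include <string>

// Toy cache mimicking the ordering problem described above; illustrative only.
struct ToyCache {
  std::map<std::string, std::string> CachedSCEV; // value name -> cached expr
  std::set<std::string> ExtraInvalidated;        // phis that got the extra pass

  // Stand-in for the bulk invalidation (forgetLoop /
  // forgetBlockAndLoopDispositions): it just drops the cache entry.
  void forgetBulk(const std::string &V) { CachedSCEV.erase(V); }

  // Stand-in for forgetLcssaPhiWithNewPredecessor: it only does its extra
  // work if a SCEV is still cached for the phi.
  void forgetLcssaPhi(const std::string &Phi) {
    if (CachedSCEV.count(Phi))
      ExtraInvalidated.insert(Phi);
  }
};

int main() {
  // Wrong order: bulk forget first, so the targeted call finds nothing cached
  // and silently skips the additional invalidation.
  ToyCache A;
  A.CachedSCEV["%add.lcssa"] = "(%load + {0,+,1}<loop.1>)";
  A.forgetBulk("%add.lcssa");
  A.forgetLcssaPhi("%add.lcssa");
  std::printf("wrong order, extra invalidation ran: %zu\n",
              A.ExtraInvalidated.count("%add.lcssa")); // prints 0

  // Fixed order (what the patch does): targeted invalidation first, then the
  // broader forget calls.
  ToyCache B;
  B.CachedSCEV["%add.lcssa"] = "(%load + {0,+,1}<loop.1>)";
  B.forgetLcssaPhi("%add.lcssa");
  B.forgetBulk("%add.lcssa");
  std::printf("fixed order, extra invalidation ran: %zu\n",
              B.ExtraInvalidated.count("%add.lcssa")); // prints 1
  return 0;
}
```

Running the toy shows the extra invalidation being skipped in the wrong order and performed in the fixed order, which mirrors why the call order was swapped in the vectorizer.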