Skip to content

[LV] Forget LCSSA phi with new pred before other SCEV invalidation. #119897

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 10, 2025

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Dec 13, 2024

forgetLcssaPhiWithNewPredecessor performs additional invalidation if there is an existing SCEV for the phi, but earlier forgetBlockAndLoopDispositions or forgetLoop may already invalidate the SCEV for the phi.

Change the order to first call forgetLcssaPhiWithNewPredecessor to ensure it runs before its SCEV gets invalidated too eagerly.

Fixes #119665.

@fhahn fhahn requested a review from nikic as a code owner December 13, 2024 16:32
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Dec 13, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 13, 2024

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

When the original SCEV gets invalidate, any SCEV computed at a different scope may also become invalid. Add SCEVs created for scopes as users of the original value to ensure proper invalidation.

Fixes #119665.


Full diff: https://github.com/llvm/llvm-project/pull/119897.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+6-3)
  • (added) llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll (+113)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index e18133971f5bf0..c7488faa649e23 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -8442,7 +8442,7 @@ ScalarEvolution::getBackedgeTakenInfo(const Loop *L) {
   // recusive call to getBackedgeTakenInfo (on a different
   // loop), which would invalidate the iterator computed
   // earlier.
-  return BackedgeTakenCounts.find(L)->second = std::move(Result);
+  return BackedgeTakenCounts[L] = std::move(Result);
 }
 
 void ScalarEvolution::forgetAllLoops() {
@@ -9867,8 +9867,10 @@ const SCEV *ScalarEvolution::getSCEVAtScope(const SCEV *V, const Loop *L) {
   for (auto &LS : reverse(ValuesAtScopes[V]))
     if (LS.first == L) {
       LS.second = C;
-      if (!isa<SCEVConstant>(C))
+      if (!isa<SCEVConstant>(C)) {
         ValuesAtScopesUsers[C].push_back({L, V});
+        SCEVUsers[V].insert(C);
+      }
       break;
     }
   return C;
@@ -14211,7 +14213,8 @@ void ScalarEvolution::forgetBackedgeTakenCounts(const Loop *L,
       for (const SCEV *S : {ENT.ExactNotTaken, ENT.SymbolicMaxNotTaken}) {
         if (!isa<SCEVConstant>(S)) {
           auto UserIt = BECountUsers.find(S);
-          assert(UserIt != BECountUsers.end());
+          if (UserIt == BECountUsers.end())
+            continue;
           UserIt->second.erase({L, Predicated});
         }
       }
diff --git a/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll b/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll
new file mode 100644
index 00000000000000..c91f36bfb66af3
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll
@@ -0,0 +1,113 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes='print<scalar-evolution>,loop-vectorize' -force-vector-width=4 -scalar-evolution-classify-expressions=false -S %s | FileCheck %s
+
+; Test case for https://github.com/llvm/llvm-project/issues/119665.
+define void @test_invalidate_scevs_at_scope(ptr %p) {
+; CHECK-LABEL: define void @test_invalidate_scevs_at_scope(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; CHECK:       [[VECTOR_PH]]:
+; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
+; CHECK:       [[VECTOR_BODY]]:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, %[[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[P]], align 4
+; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
+; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP1:%.*]] = add <4 x i32> [[BROADCAST_SPLAT]], [[VEC_IND]]
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
+; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], splat (i32 4)
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i32 [[INDEX_NEXT]], 100
+; CHECK-NEXT:    br i1 [[TMP2]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK:       [[MIDDLE_BLOCK]]:
+; CHECK-NEXT:    [[TMP3:%.*]] = extractelement <4 x i32> [[TMP1]], i32 3
+; CHECK-NEXT:    br i1 false, label %[[EXIT_1:.*]], label %[[SCALAR_PH]]
+; CHECK:       [[SCALAR_PH]]:
+; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i32 [ 100, %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ]
+; CHECK-NEXT:    br label %[[LOOP_1:.*]]
+; CHECK:       [[LOOP_1]]:
+; CHECK-NEXT:    [[IV_1:%.*]] = phi i32 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[IV_1_NEXT:%.*]], %[[LOOP_1]] ]
+; CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[P]], align 4
+; CHECK-NEXT:    [[ADD_1:%.*]] = add i32 [[TMP4]], [[IV_1]]
+; CHECK-NEXT:    [[IV_1_NEXT]] = add i32 [[IV_1]], 1
+; CHECK-NEXT:    [[C_1:%.*]] = icmp eq i32 [[IV_1]], 100
+; CHECK-NEXT:    br i1 [[C_1]], label %[[EXIT_1]], label %[[LOOP_1]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK:       [[EXIT_1]]:
+; CHECK-NEXT:    [[ADD_LCSSA:%.*]] = phi i32 [ [[ADD_1]], %[[LOOP_1]] ], [ [[TMP3]], %[[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[ADD_LCSSA]], i32 100)
+; CHECK-NEXT:    [[TMP5:%.*]] = sub i32 [[SMAX]], [[ADD_LCSSA]]
+; CHECK-NEXT:    [[TMP6:%.*]] = zext i32 [[TMP5]] to i64
+; CHECK-NEXT:    [[TMP7:%.*]] = add nuw nsw i64 [[TMP6]], 1
+; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP7]], 4
+; CHECK-NEXT:    br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH2:.*]], label %[[VECTOR_PH3:.*]]
+; CHECK:       [[VECTOR_PH3]]:
+; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[TMP7]], 4
+; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[TMP7]], [[N_MOD_VF]]
+; CHECK-NEXT:    br label %[[VECTOR_BODY4:.*]]
+; CHECK:       [[VECTOR_BODY4]]:
+; CHECK-NEXT:    [[INDEX5:%.*]] = phi i64 [ 0, %[[VECTOR_PH3]] ], [ [[INDEX_NEXT8:%.*]], %[[VECTOR_BODY4]] ]
+; CHECK-NEXT:    [[VEC_IND6:%.*]] = phi <4 x i64> [ <i64 0, i64 1, i64 2, i64 3>, %[[VECTOR_PH3]] ], [ [[VEC_IND_NEXT7:%.*]], %[[VECTOR_BODY4]] ]
+; CHECK-NEXT:    [[TMP8:%.*]] = add i64 [[INDEX5]], 0
+; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr inbounds i64, ptr [[P]], i64 [[TMP8]]
+; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i64, ptr [[TMP9]], i32 0
+; CHECK-NEXT:    store <4 x i64> [[VEC_IND6]], ptr [[TMP10]], align 4
+; CHECK-NEXT:    [[INDEX_NEXT8]] = add nuw i64 [[INDEX5]], 4
+; CHECK-NEXT:    [[VEC_IND_NEXT7]] = add <4 x i64> [[VEC_IND6]], splat (i64 4)
+; CHECK-NEXT:    [[TMP11:%.*]] = icmp eq i64 [[INDEX_NEXT8]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[TMP11]], label %[[MIDDLE_BLOCK1:.*]], label %[[VECTOR_BODY4]], !llvm.loop [[LOOP4:![0-9]+]]
+; CHECK:       [[MIDDLE_BLOCK1]]:
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[TMP7]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[CMP_N]], label %[[EXIT_2:.*]], label %[[SCALAR_PH2]]
+; CHECK:       [[SCALAR_PH2]]:
+; CHECK-NEXT:    [[BC_RESUME_VAL9:%.*]] = phi i64 [ [[N_VEC]], %[[MIDDLE_BLOCK1]] ], [ 0, %[[EXIT_1]] ]
+; CHECK-NEXT:    br label %[[LOOP_2:.*]]
+; CHECK:       [[LOOP_2]]:
+; CHECK-NEXT:    [[IV_2:%.*]] = phi i64 [ [[BC_RESUME_VAL9]], %[[SCALAR_PH2]] ], [ [[IV_2_NEXT:%.*]], %[[LOOP_2]] ]
+; CHECK-NEXT:    [[IV_2_TRUNC:%.*]] = trunc i64 [[IV_2]] to i32
+; CHECK-NEXT:    [[IV_2_NEXT]] = add i64 [[IV_2]], 1
+; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i64, ptr [[P]], i64 [[IV_2]]
+; CHECK-NEXT:    store i64 [[IV_2]], ptr [[GEP]], align 4
+; CHECK-NEXT:    [[ADD_2:%.*]] = add i32 [[ADD_LCSSA]], [[IV_2_TRUNC]]
+; CHECK-NEXT:    [[C_2:%.*]] = icmp slt i32 [[ADD_2]], 100
+; CHECK-NEXT:    br i1 [[C_2]], label %[[LOOP_2]], label %[[EXIT_2]], !llvm.loop [[LOOP5:![0-9]+]]
+; CHECK:       [[EXIT_2]]:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop.1
+
+loop.1:
+  %iv.1 = phi i32 [ 0, %entry ], [ %iv.1.next, %loop.1 ]
+  %1 = load i32, ptr %p, align 4
+  %add.1 = add i32 %1, %iv.1
+  %iv.1.next = add i32 %iv.1, 1
+  %c.1 = icmp eq i32 %iv.1, 100
+  br i1 %c.1, label %exit.1, label %loop.1
+
+exit.1:
+  %add.lcssa = phi i32 [ %add.1, %loop.1 ]
+  br label %loop.2
+
+loop.2:
+  %iv.2 = phi i64 [ 0, %exit.1 ], [ %iv.2.next, %loop.2 ]
+  %iv.2.trunc = trunc i64 %iv.2 to i32
+  %iv.2.next = add i64 %iv.2, 1
+  %gep = getelementptr inbounds i64, ptr %p, i64 %iv.2
+  store i64 %iv.2, ptr %gep
+  %add.2 = add i32 %add.lcssa, %iv.2.trunc
+  %c.2 = icmp slt i32 %add.2, 100
+  br i1 %c.2, label %loop.2, label %exit.2
+
+exit.2:
+  ret void
+}
+
+;.
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
+; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1}
+; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"}
+; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META2]], [[META1]]}
+; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META1]], [[META2]]}
+; CHECK: [[LOOP5]] = distinct !{[[LOOP5]], [[META2]], [[META1]]}
+;.

@llvmbot
Copy link
Member

llvmbot commented Dec 13, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

When the original SCEV gets invalidate, any SCEV computed at a different scope may also become invalid. Add SCEVs created for scopes as users of the original value to ensure proper invalidation.

Fixes #119665.


Full diff: https://github.com/llvm/llvm-project/pull/119897.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+6-3)
  • (added) llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll (+113)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index e18133971f5bf0..c7488faa649e23 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -8442,7 +8442,7 @@ ScalarEvolution::getBackedgeTakenInfo(const Loop *L) {
   // recusive call to getBackedgeTakenInfo (on a different
   // loop), which would invalidate the iterator computed
   // earlier.
-  return BackedgeTakenCounts.find(L)->second = std::move(Result);
+  return BackedgeTakenCounts[L] = std::move(Result);
 }
 
 void ScalarEvolution::forgetAllLoops() {
@@ -9867,8 +9867,10 @@ const SCEV *ScalarEvolution::getSCEVAtScope(const SCEV *V, const Loop *L) {
   for (auto &LS : reverse(ValuesAtScopes[V]))
     if (LS.first == L) {
       LS.second = C;
-      if (!isa<SCEVConstant>(C))
+      if (!isa<SCEVConstant>(C)) {
         ValuesAtScopesUsers[C].push_back({L, V});
+        SCEVUsers[V].insert(C);
+      }
       break;
     }
   return C;
@@ -14211,7 +14213,8 @@ void ScalarEvolution::forgetBackedgeTakenCounts(const Loop *L,
       for (const SCEV *S : {ENT.ExactNotTaken, ENT.SymbolicMaxNotTaken}) {
         if (!isa<SCEVConstant>(S)) {
           auto UserIt = BECountUsers.find(S);
-          assert(UserIt != BECountUsers.end());
+          if (UserIt == BECountUsers.end())
+            continue;
           UserIt->second.erase({L, Predicated});
         }
       }
diff --git a/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll b/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll
new file mode 100644
index 00000000000000..c91f36bfb66af3
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/invalidate-scev-at-scope-after-vectorization.ll
@@ -0,0 +1,113 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes='print<scalar-evolution>,loop-vectorize' -force-vector-width=4 -scalar-evolution-classify-expressions=false -S %s | FileCheck %s
+
+; Test case for https://github.com/llvm/llvm-project/issues/119665.
+define void @test_invalidate_scevs_at_scope(ptr %p) {
+; CHECK-LABEL: define void @test_invalidate_scevs_at_scope(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; CHECK:       [[VECTOR_PH]]:
+; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
+; CHECK:       [[VECTOR_BODY]]:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, %[[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[P]], align 4
+; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
+; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i32> [[BROADCAST_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP1:%.*]] = add <4 x i32> [[BROADCAST_SPLAT]], [[VEC_IND]]
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i32 [[INDEX]], 4
+; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], splat (i32 4)
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp eq i32 [[INDEX_NEXT]], 100
+; CHECK-NEXT:    br i1 [[TMP2]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK:       [[MIDDLE_BLOCK]]:
+; CHECK-NEXT:    [[TMP3:%.*]] = extractelement <4 x i32> [[TMP1]], i32 3
+; CHECK-NEXT:    br i1 false, label %[[EXIT_1:.*]], label %[[SCALAR_PH]]
+; CHECK:       [[SCALAR_PH]]:
+; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i32 [ 100, %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ]
+; CHECK-NEXT:    br label %[[LOOP_1:.*]]
+; CHECK:       [[LOOP_1]]:
+; CHECK-NEXT:    [[IV_1:%.*]] = phi i32 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[IV_1_NEXT:%.*]], %[[LOOP_1]] ]
+; CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[P]], align 4
+; CHECK-NEXT:    [[ADD_1:%.*]] = add i32 [[TMP4]], [[IV_1]]
+; CHECK-NEXT:    [[IV_1_NEXT]] = add i32 [[IV_1]], 1
+; CHECK-NEXT:    [[C_1:%.*]] = icmp eq i32 [[IV_1]], 100
+; CHECK-NEXT:    br i1 [[C_1]], label %[[EXIT_1]], label %[[LOOP_1]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK:       [[EXIT_1]]:
+; CHECK-NEXT:    [[ADD_LCSSA:%.*]] = phi i32 [ [[ADD_1]], %[[LOOP_1]] ], [ [[TMP3]], %[[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[ADD_LCSSA]], i32 100)
+; CHECK-NEXT:    [[TMP5:%.*]] = sub i32 [[SMAX]], [[ADD_LCSSA]]
+; CHECK-NEXT:    [[TMP6:%.*]] = zext i32 [[TMP5]] to i64
+; CHECK-NEXT:    [[TMP7:%.*]] = add nuw nsw i64 [[TMP6]], 1
+; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP7]], 4
+; CHECK-NEXT:    br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH2:.*]], label %[[VECTOR_PH3:.*]]
+; CHECK:       [[VECTOR_PH3]]:
+; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[TMP7]], 4
+; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[TMP7]], [[N_MOD_VF]]
+; CHECK-NEXT:    br label %[[VECTOR_BODY4:.*]]
+; CHECK:       [[VECTOR_BODY4]]:
+; CHECK-NEXT:    [[INDEX5:%.*]] = phi i64 [ 0, %[[VECTOR_PH3]] ], [ [[INDEX_NEXT8:%.*]], %[[VECTOR_BODY4]] ]
+; CHECK-NEXT:    [[VEC_IND6:%.*]] = phi <4 x i64> [ <i64 0, i64 1, i64 2, i64 3>, %[[VECTOR_PH3]] ], [ [[VEC_IND_NEXT7:%.*]], %[[VECTOR_BODY4]] ]
+; CHECK-NEXT:    [[TMP8:%.*]] = add i64 [[INDEX5]], 0
+; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr inbounds i64, ptr [[P]], i64 [[TMP8]]
+; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i64, ptr [[TMP9]], i32 0
+; CHECK-NEXT:    store <4 x i64> [[VEC_IND6]], ptr [[TMP10]], align 4
+; CHECK-NEXT:    [[INDEX_NEXT8]] = add nuw i64 [[INDEX5]], 4
+; CHECK-NEXT:    [[VEC_IND_NEXT7]] = add <4 x i64> [[VEC_IND6]], splat (i64 4)
+; CHECK-NEXT:    [[TMP11:%.*]] = icmp eq i64 [[INDEX_NEXT8]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[TMP11]], label %[[MIDDLE_BLOCK1:.*]], label %[[VECTOR_BODY4]], !llvm.loop [[LOOP4:![0-9]+]]
+; CHECK:       [[MIDDLE_BLOCK1]]:
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[TMP7]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[CMP_N]], label %[[EXIT_2:.*]], label %[[SCALAR_PH2]]
+; CHECK:       [[SCALAR_PH2]]:
+; CHECK-NEXT:    [[BC_RESUME_VAL9:%.*]] = phi i64 [ [[N_VEC]], %[[MIDDLE_BLOCK1]] ], [ 0, %[[EXIT_1]] ]
+; CHECK-NEXT:    br label %[[LOOP_2:.*]]
+; CHECK:       [[LOOP_2]]:
+; CHECK-NEXT:    [[IV_2:%.*]] = phi i64 [ [[BC_RESUME_VAL9]], %[[SCALAR_PH2]] ], [ [[IV_2_NEXT:%.*]], %[[LOOP_2]] ]
+; CHECK-NEXT:    [[IV_2_TRUNC:%.*]] = trunc i64 [[IV_2]] to i32
+; CHECK-NEXT:    [[IV_2_NEXT]] = add i64 [[IV_2]], 1
+; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i64, ptr [[P]], i64 [[IV_2]]
+; CHECK-NEXT:    store i64 [[IV_2]], ptr [[GEP]], align 4
+; CHECK-NEXT:    [[ADD_2:%.*]] = add i32 [[ADD_LCSSA]], [[IV_2_TRUNC]]
+; CHECK-NEXT:    [[C_2:%.*]] = icmp slt i32 [[ADD_2]], 100
+; CHECK-NEXT:    br i1 [[C_2]], label %[[LOOP_2]], label %[[EXIT_2]], !llvm.loop [[LOOP5:![0-9]+]]
+; CHECK:       [[EXIT_2]]:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop.1
+
+loop.1:
+  %iv.1 = phi i32 [ 0, %entry ], [ %iv.1.next, %loop.1 ]
+  %1 = load i32, ptr %p, align 4
+  %add.1 = add i32 %1, %iv.1
+  %iv.1.next = add i32 %iv.1, 1
+  %c.1 = icmp eq i32 %iv.1, 100
+  br i1 %c.1, label %exit.1, label %loop.1
+
+exit.1:
+  %add.lcssa = phi i32 [ %add.1, %loop.1 ]
+  br label %loop.2
+
+loop.2:
+  %iv.2 = phi i64 [ 0, %exit.1 ], [ %iv.2.next, %loop.2 ]
+  %iv.2.trunc = trunc i64 %iv.2 to i32
+  %iv.2.next = add i64 %iv.2, 1
+  %gep = getelementptr inbounds i64, ptr %p, i64 %iv.2
+  store i64 %iv.2, ptr %gep
+  %add.2 = add i32 %add.lcssa, %iv.2.trunc
+  %c.2 = icmp slt i32 %add.2, 100
+  br i1 %c.2, label %loop.2, label %exit.2
+
+exit.2:
+  ret void
+}
+
+;.
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
+; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1}
+; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"}
+; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META2]], [[META1]]}
+; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META1]], [[META2]]}
+; CHECK: [[LOOP5]] = distinct !{[[LOOP5]], [[META2]], [[META1]]}
+;.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please provide some more details on what happens in the test case? Like which SCEV with which SCEVAtScope gets invalidated where?

@fhahn
Copy link
Contributor Author

fhahn commented Dec 13, 2024

Can you please provide some more details on what happens in the test case? Like which SCEV with which SCEVAtScope gets invalidated where?

Added more details for the test, thanks

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your description, I feel like this is something that forgetLcssaPhiWithNewPredecessor() on modified LCSSA phi nodes should be handling -- why doesn't it? Is there a call missing somewhere or does it not do what is necessary here?

@fhahn
Copy link
Contributor Author

fhahn commented Dec 13, 2024

Based on your description, I feel like this is something that forgetLcssaPhiWithNewPredecessor() on modified LCSSA phi nodes should be handling -- why doesn't it? Is there a call missing somewhere or does it not do what is necessary here?

IIUC the issue here is that we need to invalidate loop.2's BTC, which contains %add.1's SCEV at scope of %loop.2 which is -100 + (-1 * %l). But I don't think this SCEV is reachable via a chain of IR instructions reachable from the phi node or a SCEV defined inside loop.1`, so won't be covered by the existing forgetLcssaPhiWithNewPredecessor calls in LV, unless I am missing some existing link?

@nikic
Copy link
Contributor

nikic commented Dec 13, 2024

Looking at https://llvm.godbolt.org/z/xdWvqbhaj, %add.lcssa has SCEV ({0,+,1}<nuw><nsw><%loop.1> + %load) with exit value (100 + %load). Invalidating %add.lcssa should find the invalidation roots {0,+,1}<nuw><nsw><%loop.1> and %load. Invalidating %load should invalidate both the exit value and the BECount using it.

fhahn added 3 commits February 8, 2025 09:43
When the original SCEV gets invalidate, any SCEV computed at a different
scope may also become invalid. Add SCEVs created for scopes as users of
the original value to ensure proper invalidation.

Fixes llvm#119665.
@fhahn fhahn force-pushed the scev-at-scope-users branch from dc15503 to e3d5b2d Compare February 8, 2025 10:15
@fhahn fhahn changed the title [SCEV] Add SCEVs for scopes as users of origin SCEV. [LV] Forget LCSSA phi with new pred before other SCEV invalidation. Feb 8, 2025
@fhahn
Copy link
Contributor Author

fhahn commented Feb 8, 2025

Looking at https://llvm.godbolt.org/z/xdWvqbhaj, %add.lcssa has SCEV ({0,+,1}<nuw><nsw><%loop.1> + %load) with exit value (100 + %load). Invalidating %add.lcssa should find the invalidation roots {0,+,1}<nuw><nsw><%loop.1> and %load. Invalidating %load should invalidate both the exit value and the BECount using it.

Had another look and it seems the issue was that we invalidated the SCEV for the phi before forgetLcssaPhiWithNewPredecessor could apply its specific invalidation logic. Updated the patch to first run forgetLcssaPhiWithNewPredecessor, then the other SCEV invalidation.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fhahn fhahn merged commit 3706dfe into llvm:main Feb 10, 2025
9 checks passed
@fhahn fhahn deleted the scev-at-scope-users branch February 10, 2025 16:30
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 10, 2025
…lidation. (#119897)

`forgetLcssaPhiWithNewPredecessor` performs additional invalidation if
there is an existing SCEV for the phi, but earlier
`forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate
the SCEV for the phi.

Change the order to first call `forgetLcssaPhiWithNewPredecessor` to
ensure it runs before its SCEV gets invalidated too eagerly.

Fixes llvm/llvm-project#119665.

PR: llvm/llvm-project#119897
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 10, 2025
…#119897)

`forgetLcssaPhiWithNewPredecessor` performs additional invalidation if
there is an existing SCEV for the phi, but earlier
`forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate
the SCEV for the phi.

Change the order to first call `forgetLcssaPhiWithNewPredecessor` to
ensure it runs before its SCEV gets invalidated too eagerly.

Fixes llvm/llvm-project#119665.

PR: llvm/llvm-project#119897
(cherry picked from commit 3706dfe)
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 10, 2025
…lidation. (#119897)

`forgetLcssaPhiWithNewPredecessor` performs additional invalidation if
there is an existing SCEV for the phi, but earlier
`forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate
the SCEV for the phi.

Change the order to first call `forgetLcssaPhiWithNewPredecessor` to
ensure it runs before its SCEV gets invalidated too eagerly.

Fixes llvm/llvm-project#119665.

PR: llvm/llvm-project#119897
(cherry picked from commit 3706dfe)
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…lvm#119897)

`forgetLcssaPhiWithNewPredecessor` performs additional invalidation if
there is an existing SCEV for the phi, but earlier
`forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate
the SCEV for the phi.

Change the order to first call `forgetLcssaPhiWithNewPredecessor` to
ensure it runs before its SCEV gets invalidated too eagerly.

Fixes llvm#119665.

PR: llvm#119897
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
…lvm#119897)

`forgetLcssaPhiWithNewPredecessor` performs additional invalidation if
there is an existing SCEV for the phi, but earlier
`forgetBlockAndLoopDispositions` or `forgetLoop` may already invalidate
the SCEV for the phi.

Change the order to first call `forgetLcssaPhiWithNewPredecessor` to
ensure it runs before its SCEV gets invalidated too eagerly.

Fixes llvm#119665.

PR: llvm#119897
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms vectorizers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang] Crash at -O2: Assertion `isAvailableAtLoopEntry(LHS, L) && "LHS is not available at Loop Entry"' failed.
3 participants