[LV] Invalidate SCEV values in the scalar loop after loop vectorization #69886

dtcxzyw · 2023-10-22T17:18:56Z

This PR fixes incorrect trip counts after vectorizing the first loop.
After loop vectorization, we should forget all SCEV values in the scalar loops since the scalar loop doesn't dominate any of the loop exits.

Fixes #69097.
Fixes #66616.
Fixes #63970.

llvmbot · 2023-10-22T17:20:10Z

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

This PR fixes the assertion failure of SE.verify() after loop vectorization (with --verify-scev-strict).
After loop vectorization, we should forget all SCEV values in the scalar loops since the scalar loop doesn't dominate any of the loop exits.

Fixes #69097.
Fixes #66616.

I wonder if it increases compile time. Waiting for the evaluation result from https://llvm-compile-time-tracker.com/.

Full diff: https://github.com/llvm/llvm-project/pull/69886.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+4-1)
(added) llvm/test/Transforms/LoopVectorize/pr66616.ll (+35)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index a5d523894bef155..8c8b4a1e2c32f7a 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -3543,7 +3543,10 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State,
 
   // Forget the original basic block.
   PSE.getSE()->forgetLoop(OrigLoop);
-  PSE.getSE()->forgetBlockAndLoopDispositions();
+  // Now the scalar loop doesn't dominate any of the loop exits.
+  for (auto *Block : OrigLoop->getBlocks())
+    for (auto &Inst : *Block)
+      PSE.getSE()->forgetValue(&Inst);
 
   // After vectorization, the exit blocks of the original loop will have
   // additional predecessors. Invalidate SCEVs for the exit phis in case SE
diff --git a/llvm/test/Transforms/LoopVectorize/pr66616.ll b/llvm/test/Transforms/LoopVectorize/pr66616.ll
new file mode 100644
index 000000000000000..8374cfacfb71ecc
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/pr66616.ll
@@ -0,0 +1,35 @@
+; RUN: opt -passes="require<scalar-evolution>,print<scalar-evolution>,loop-vectorize" --verify-scev --verify-scev-strict -S < %s | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; CHECK-LABEL: @pr66616(
+; CHECK: vector.body
+define void @pr66616() {
+entry:
+  br label %for.body
+
+for.cond5.preheader:                              ; preds = %for.body
+  br label %while.body.i
+
+for.body:                                         ; preds = %for.body, %entry
+  %storemerge12 = phi i8 [ 0, %entry ], [ %inc, %for.body ]
+  %0 = load i8, ptr null, align 1
+  %conv2 = sext i8 %0 to i32
+  %add3 = add i32 %conv2, 1
+  %inc = add i8 %storemerge12, 1
+  %conv1 = zext i8 %inc to i32
+  %tobool.not = icmp eq i32 %conv1, 0
+  br i1 %tobool.not, label %for.cond5.preheader, label %for.body
+
+while.body.i:                                     ; preds = %while.body.i, %for.cond5.preheader
+  %i.addr.09.i = phi i32 [ %dec.i, %while.body.i ], [ %add3, %for.cond5.preheader ]
+  %incdec.ptr48.i = phi ptr [ %incdec.ptr.i, %while.body.i ], [ null, %for.cond5.preheader ]
+  %dec.i = add i32 %i.addr.09.i, 1
+  %incdec.ptr.i = getelementptr i8, ptr %incdec.ptr48.i, i64 1
+  %tobool.not.i = icmp eq i32 %i.addr.09.i, 0
+  br i1 %tobool.not.i, label %for.inc9.loopexit, label %while.body.i
+
+for.inc9.loopexit:                                ; preds = %while.body.i
+  ret void
+}

dtcxzyw · 2023-10-22T17:46:18Z

Looks like it doesn't have an impact on the compile time.
https://llvm-compile-time-tracker.com/compare.php?from=3d40411ee88958864316ff9afb23ec2d0166b285&to=72b287aca006db378d0d8c81c0401edd9a6563d5&stat=instructions

fhahn

Thanks for investigating this!

Could you provide more context about the failure in the description? form the issue, it looks like the crash is bool llvm::ScalarEvolution::isLoopEntryGuardedByCond(const Loop *, ICmpInst::Predicate, const SCEV *, const SCEV *): Assertion isAvailableAtLoopEntry(LHS, L) && "LHS is not available at Loop Entry"`, correct?

In that case, is it possible that we kept around some invalid block & loop dispositions?

llvm/test/Transforms/LoopVectorize/pr66616.ll

nikic

I don't think this is the correct solution, but given the lack of root cause analysis in the description I cannot suggest what the correct one would be.

llvm/test/Transforms/LoopVectorize/pr66616.ll

dtcxzyw · 2023-10-23T09:10:54Z

Thanks for investigating this!

Could you provide more context about the failure in the description? form the issue, it looks like the crash is bool llvm::ScalarEvolution::isLoopEntryGuardedByCond(const Loop *, ICmpInst::Predicate, const SCEV *, const SCEV *): Assertion isAvailableAtLoopEntry(LHS, L) && "LHS is not available at Loop Entry"`, correct?

IR after vectorizing the loop %for.body<header><latch><exiting>:

define void @pr66616() {
entry:
  br i1 false, label %scalar.ph, label %vector.ph

vector.ph:                                        ; preds = %entry
  br label %vector.body

vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %0 = load i8, ptr null, align 1
  %broadcast.splatinsert = insertelement <16 x i8> poison, i8 %0, i64 0
  %broadcast.splat = shufflevector <16 x i8> %broadcast.splatinsert, <16 x i8> poison, <16 x i32> zeroinitializer
  %1 = sext <16 x i8> %broadcast.splat to <16 x i32>
  %2 = add <16 x i32> %1, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
  %index.next = add nuw i32 %index, 16
  %3 = icmp eq i32 %index.next, 256
  br i1 %3, label %middle.block, label %vector.body, !llvm.loop !0

middle.block:                                     ; preds = %vector.body
  %4 = extractelement <16 x i32> %2, i32 15
  br i1 true, label %for.cond5.preheader, label %scalar.ph

scalar.ph:                                        ; preds = %entry, %middle.block
  %bc.resume.val = phi i8 [ 0, %middle.block ], [ 0, %entry ]
  br label %for.body

for.cond5.preheader:                              ; preds = %middle.block, %for.body
  %add3.lcssa = phi i32 [ %add3, %for.body ], [ %4, %middle.block ]
  br label %while.body.i

for.body:                                         ; preds = %for.body, %scalar.ph
  %storemerge12 = phi i8 [ %bc.resume.val, %scalar.ph ], [ %inc, %for.body ]
  %5 = load i8, ptr null, align 1
  %conv2 = sext i8 %5 to i32
  %add3 = add i32 %conv2, 1
  %inc = add i8 %storemerge12, 1
  %conv1 = zext i8 %inc to i32
  %tobool.not = icmp eq i32 %conv1, 0
  br i1 %tobool.not, label %for.cond5.preheader, label %for.body, !llvm.loop !3

while.body.i:                                     ; preds = %while.body.i, %for.cond5.preheader
  %i.addr.09.i = phi i32 [ %dec.i, %while.body.i ], [ %add3.lcssa, %for.cond5.preheader ]
  %incdec.ptr48.i = phi ptr [ %incdec.ptr.i, %while.body.i ], [ null, %for.cond5.preheader ]
  %dec.i = add i32 %i.addr.09.i, 1
  %incdec.ptr.i = getelementptr i8, ptr %incdec.ptr48.i, i64 1
  %tobool.not.i = icmp eq i32 %i.addr.09.i, 0
  br i1 %tobool.not.i, label %for.inc9.loopexit, label %while.body.i

for.inc9.loopexit:                                ; preds = %while.body.i
  ret void
}

Without this patch, the trip count for loop %while.body.i<header><latch><exiting> is:
(-1 + (-1 * (sext i8 %5 to i32))<nsw>)<nsw>
However, we cannot access %5 in BB %while.body.i since %for.body doesn't dominate %while.body.i.
The correct trip count should be (-1 * %add3.lcssa)<nsw>.

In that case, is it possible that we kept around some invalid block & loop dispositions?
forgetValue also cleanups block & loop dispositions.

I will update the test later.

dtcxzyw · 2023-10-23T09:34:59Z

Simplified test: https://godbolt.org/z/36cqs16os

…orization Simplify tests.

nikic · 2023-10-23T10:02:04Z

Thanks for the explanation. As I understand it, the root issue is that SCEV looks through trivial LCSSA phi nodes, and we currently fail to fully invalidate when they become non-trivial. If this understand is correct, I believe it would be sufficient to invalidate SCEVUnknowns used by the LCSSA phi and defined in the loop. I've put up #69909 based on that idea.

dtcxzyw · 2023-10-23T10:58:58Z

Closed as #69909 is better.

Fix this an alternative to llvm#69886. The basic problem is that SCEV can look through trivial LCSSA phis. When the phi node later becomes non-trivial, we do invalidate it, but this doesn't catch uses that are not covered by the IR use-def walk, such as thoe in BECounts. Fix this by adding a special invalidation method for LCSSA phis, which will also invalidate all the SCEVUnknowns used by the LCSSA phi node. We should probably also use this invalidation method in other places that add predecessors to exit blocks, such as loop unroling and loop peeling.

This an alternative to #69886. The basic problem is that SCEV can look through trivial LCSSA phis. When the phi node later becomes non-trivial, we do invalidate it, but this doesn't catch uses that are not covered by the IR use-def walk, such as those in BECounts. Fix this by adding a special invalidation method for LCSSA phis, which will also invalidate all the SCEVUnknowns/SCEVAddRecExprs used by the LCSSA phi node and defined in the loop. We should probably also use this invalidation method in other places that add predecessors to exit blocks, such as loop unrolling and loop peeling. Fixes #69097. Fixes #66616. Fixes #63970.

This an alternative to llvm#69886. The basic problem is that SCEV can look through trivial LCSSA phis. When the phi node later becomes non-trivial, we do invalidate it, but this doesn't catch uses that are not covered by the IR use-def walk, such as those in BECounts. Fix this by adding a special invalidation method for LCSSA phis, which will also invalidate all the SCEVUnknowns/SCEVAddRecExprs used by the LCSSA phi node and defined in the loop. We should probably also use this invalidation method in other places that add predecessors to exit blocks, such as loop unrolling and loop peeling. Fixes llvm#69097. Fixes llvm#66616. Fixes llvm#63970.

[LV] Invalidate SCEV values in the scalar loop after loop vectorization

72b287a

dtcxzyw requested review from nikic and fhahn October 22, 2023 17:18

llvmbot added vectorizers llvm:transforms labels Oct 22, 2023

fhahn reviewed Oct 23, 2023

View reviewed changes

nikic reviewed Oct 23, 2023

View reviewed changes

llvm/test/Transforms/LoopVectorize/pr66616.ll Outdated Show resolved Hide resolved

fixup! [LV] Invalidate SCEV values in the scalar loop after loop vect…

56f3214

…orization Simplify tests.

nikic mentioned this pull request Oct 23, 2023

[SCEV][LV] Invalidate LCSSA exit phis more thoroughly #69909

Merged

dtcxzyw closed this Oct 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[LV] Invalidate SCEV values in the scalar loop after loop vectorization #69886

[LV] Invalidate SCEV values in the scalar loop after loop vectorization #69886

Uh oh!

dtcxzyw commented Oct 22, 2023 •

edited

Loading

Uh oh!

llvmbot commented Oct 22, 2023

Uh oh!

dtcxzyw commented Oct 22, 2023

Uh oh!

fhahn left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nikic left a comment

Uh oh!

Uh oh!

dtcxzyw commented Oct 23, 2023

Uh oh!

dtcxzyw commented Oct 23, 2023

Uh oh!

nikic commented Oct 23, 2023

Uh oh!

dtcxzyw commented Oct 23, 2023

Uh oh!

Uh oh!

[LV] Invalidate SCEV values in the scalar loop after loop vectorization #69886

[LV] Invalidate SCEV values in the scalar loop after loop vectorization #69886

Uh oh!

Conversation

dtcxzyw commented Oct 22, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Oct 22, 2023

Uh oh!

dtcxzyw commented Oct 22, 2023

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nikic left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

dtcxzyw commented Oct 23, 2023

Uh oh!

dtcxzyw commented Oct 23, 2023

Uh oh!

nikic commented Oct 23, 2023

Uh oh!

dtcxzyw commented Oct 23, 2023

Uh oh!

Uh oh!

dtcxzyw commented Oct 22, 2023 •

edited

Loading