[LoopVectorize] In LoopVectorize.cpp start using getSymbolicMaxBackedgeTakenCount #108833

david-arm · 2024-09-16T14:16:22Z

LoopVectorizationLegality currently only treats a loop as legal to vectorise
if PredicatedScalarEvolution::getBackedgeTakenCount returns a valid SCEV, or
more precisely that the loop must have an exact backedge taken count.
Therefore, in LoopVectorize.cpp we can safely replace all calls to
getBackedgeTakenCount with calls to getSymbolicMaxBackedgeTakenCount, since
the result is the same.

This also helps prepare the loop vectoriser for PR #88385.
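
As a rough illustration of why the two queries agree today, here is a toy model in plain C++. It is only a sketch and not LLVM code: `ExitCount`, the two free functions and `example` are hypothetical names, and `std::nullopt` stands in for SCEVCouldNotCompute. It assumes the exact count needs every exit to be countable, while the symbolic maximum is taken over the countable exits only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model only: these names are hypothetical and are not LLVM APIs.
// std::nullopt plays the role of SCEVCouldNotCompute for one loop exit.
using ExitCount = std::optional<std::uint64_t>;

// Exact count: defined only if every exit is countable, in which case it is
// the smallest per-exit count.
ExitCount exactBackedgeTakenCount(const std::vector<ExitCount> &Exits) {
  ExitCount Min;
  for (const ExitCount &E : Exits) {
    if (!E)
      return std::nullopt; // one uncountable exit poisons the exact count
    Min = Min ? std::min(*Min, *E) : *E;
  }
  return Min;
}

// Symbolic maximum: smallest count among the countable exits, skipping the
// uncountable ones. Unknown only if no exit is countable.
ExitCount symbolicMaxBackedgeTakenCount(const std::vector<ExitCount> &Exits) {
  ExitCount Min;
  for (const ExitCount &E : Exits) {
    if (!E)
      continue;
    Min = Min ? std::min(*Min, *E) : *E;
  }
  return Min;
}

void example() {
  // Only countable exits: both queries return the same value.
  std::vector<ExitCount> Countable = {ExitCount(9), ExitCount(99)};
  assert(exactBackedgeTakenCount(Countable) ==
         symbolicMaxBackedgeTakenCount(Countable));

  // An uncountable early exit: no exact count, but a symbolic maximum exists.
  std::vector<ExitCount> EarlyExit = {std::nullopt, ExitCount(99)};
  assert(!exactBackedgeTakenCount(EarlyExit));
  assert(symbolicMaxBackedgeTakenCount(EarlyExit) == ExitCount(99));
}
```

In this model the two functions differ only when some exit is uncountable, which is the case that legality currently rejects.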

llvmbot · 2024-09-16T14:16:56Z

@llvm/pr-subscribers-llvm-transforms

Author: David Sherwood (david-arm)

Full diff: https://github.com/llvm/llvm-project/pull/108833.diff

1 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+3-3)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index f726b171969a30..5ab0fd12c538f3 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -907,7 +907,7 @@ Value *getRuntimeVF(IRBuilderBase &B, Type *Ty, ElementCount VF) {
 
 const SCEV *createTripCountSCEV(Type *IdxTy, PredicatedScalarEvolution &PSE,
                                 Loop *OrigLoop) {
-  const SCEV *BackedgeTakenCount = PSE.getBackedgeTakenCount();
+  const SCEV *BackedgeTakenCount = PSE.getSymbolicMaxBackedgeTakenCount();
   assert(!isa<SCEVCouldNotCompute>(BackedgeTakenCount) && "Invalid loop count");
 
   ScalarEvolution &SE = *PSE.getSE();
@@ -4090,7 +4090,7 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
     unsigned MaxVFtimesIC =
         UserIC ? *MaxPowerOf2RuntimeVF * UserIC : *MaxPowerOf2RuntimeVF;
     ScalarEvolution *SE = PSE.getSE();
-    const SCEV *BackedgeTakenCount = PSE.getBackedgeTakenCount();
+    const SCEV *BackedgeTakenCount = PSE.getSymbolicMaxBackedgeTakenCount();
     const SCEV *ExitCount = SE->getAddExpr(
         BackedgeTakenCount, SE->getOne(BackedgeTakenCount->getType()));
     const SCEV *Rem = SE->getURemExpr(
@@ -9584,7 +9584,7 @@ static bool processLoopInVPlanNativePath(
     ProfileSummaryInfo *PSI, LoopVectorizeHints &Hints,
     LoopVectorizationRequirements &Requirements) {
 
-  if (isa<SCEVCouldNotCompute>(PSE.getBackedgeTakenCount())) {
+  if (isa<SCEVCouldNotCompute>(PSE.getSymbolicMaxBackedgeTakenCount())) {
     LLVM_DEBUG(dbgs() << "LV: cannot compute the outer-loop trip count\n");
     return false;
   }

paulwalker-arm

I think you might be taking the "create small patches" idea a bit too far but I suppose this change does ensure LoopVectorize is not relying on the more restricted behaviour of getBackedgeTakenCount().

fhahn

I think you might be taking the "create small patches" idea a bit too far but I suppose this change does ensure LoopVectorize is not relying on the more restricted behaviour of getBackedgeTakenCount().

Yeah, I think as is this may be more confusing for the reader, at least without a comment. It might also help to clarify with an assertion that getBackedgeTakenCount is not SCEVCouldNotCompute.

fhahn · 2024-09-18T09:39:02Z

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

@@ -9584,7 +9584,7 @@ static bool processLoopInVPlanNativePath(
    ProfileSummaryInfo *PSI, LoopVectorizeHints &Hints,
    LoopVectorizationRequirements &Requirements) {

-  if (isa<SCEVCouldNotCompute>(PSE.getBackedgeTakenCount())) {
+  if (isa<SCEVCouldNotCompute>(PSE.getSymbolicMaxBackedgeTakenCount())) {


Does this change the behavior to allow vectorizing loops where getBackedgeTakenCount is SCEVCouldNotCompute? That may need a test case, and would it be an unintended change?

You're absolutely right. I was being too eager! I looked and there is no defence against loops with uncountable exits, since LoopVectorizationLegality doesn't filter those out in advance. I've reverted this change.

Would it be possible to add a test case?

Given my patch no longer changes this code, are you asking if I can add a negative test for the native vplan case that proves we don't vectorise outer loops with an early exit? I can have a look, although I'm not sure that's even possible.

Anyway, I'm not sure if this is what you wanted, but I added a test for an outer loop with an early exit. It didn't seem to be covered by any existing tests, although the legality code bails out too early to even attempt calculating a backedge taken count, due to the form of the conditional branch that is rejected by canVectorizeOuterLoop. I hacked the legality code and verified that we can get a symbolic max backedge taken count for the outer loop, but not a normal one.

can add a negative test for the native vplan case that proves we don't vectorise outer loops with an early exit?

yes that was what I had in mind to guard against the regression in the future, thanks!
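
For readers who want a picture of the loop shape under discussion, the following is a hedged source-level sketch of an outer loop with a data-dependent early exit. It is not the contents of outer_loop_early_exit.ll; the function name and parameters are hypothetical, and it assumes `Stop` has at least as many entries as `Rows`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example, not taken from the LLVM test suite.
// Precondition (assumed): Stop.size() >= Rows.size().
std::int64_t sumRowsUntilFlag(const std::vector<std::vector<std::int64_t>> &Rows,
                              const std::vector<bool> &Stop) {
  std::int64_t Sum = 0;
  // Outer loop: the latch exit (I reaching Rows.size()) is countable.
  for (std::size_t I = 0; I < Rows.size(); ++I) {
    // Early exit: whether it is taken depends on data loaded inside the loop,
    // so its exit count cannot be computed ahead of time.
    if (Stop[I])
      break;
    // Inner loop: a single countable exit.
    for (std::size_t J = 0; J < Rows[I].size(); ++J)
      Sum += Rows[I][J];
  }
  return Sum;
}
```

For a loop of this shape, only the latch exit of the outer loop bounds the iteration count, which corresponds to the situation described above where a symbolic maximum is available but an exact count is not.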

david-arm · 2024-09-27T12:19:34Z

I've added comments explaining why it's ok to use the symbolic max variant. I also renamed a variable in createInitialVPlan that was shadowing a variable of the same name in the VPlan class.

paulwalker-arm · 2024-09-27T14:24:41Z

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

+    // Currently only loops with countable exits are vectorized so it's safe to
+    // use getSymbolicMaxBackedgeTakenCount as it should give the same result
+    // as getBackedgeTakenCount.


To me, the comment suggests that when uncountable exits are supported, calling getSymbolicMaxBackedgeTakenCount becomes unsafe, which I think is the opposite of what you mean.

Perhaps something akin to:

Currently only loops with countable exits are vectorized, but calling getSymbolicMaxBackedgeTakenCount allows enablement work for loops with uncountable exits whilst also ensuring the symbolic maximum and known back-edge taken count remain identical for loops with countable exits.

Seems like a good suggestion! I've rebased the patch to see if the libc++ tests start passing now, and updated the comment. Two for the price of one. :)

* Updated comments around calls to getSymbolicMaxBackedgeTakenCount

* Updated assert.
* Add outer loop vectorisation test that has an early exit in the outer loop. I checked and there are no existing tests for this. It fails in LoopVectorizationLegality so we never reach the point in LoopVectorize.cpp where we attempt to calculate the backedge taken count.

fhahn

LGTM, thanks!

fhahn · 2024-10-01T19:59:19Z

llvm/test/Transforms/LoopVectorize/outer_loop_early_exit.ll

+  br i1 %cmp.early, label %for.early, label %for.body.inner
+
+for.body.inner:
+  %indvars.iv = phi i64 [ 0, %for.body ], [ %indvars.iv.next, %for.body.inner ]


Suggested change

-  %indvars.iv = phi i64 [ 0, %for.body ], [ %indvars.iv.next, %for.body.inner ]
+  %iv = phi i64 [ 0, %for.body ], [ %indvars.iv.next, %for.body.inner ]

fhahn · 2024-10-01T20:01:41Z

can add a negative test for the native vplan case that proves we don't vectorise outer loops with an early exit?

yes that was what I had in mind to guard against the regression in the future, thanks!

david-arm requested review from fhahn, paulwalker-arm and huntergr-arm September 16, 2024 14:16

llvmbot added vectorizers llvm:transforms labels Sep 16, 2024

paulwalker-arm approved these changes Sep 18, 2024

fhahn reviewed Sep 18, 2024

david-arm force-pushed the create_tc branch from e0dc97f to 459661e Compare September 27, 2024 12:17

paulwalker-arm reviewed Sep 27, 2024

david-arm added 2 commits September 30, 2024 12:12

Address review comment

d12dbae

* Updated comments around calls to getSymbolicMaxBackedgeTakenCount

david-arm force-pushed the create_tc branch from 459661e to d12dbae Compare September 30, 2024 12:13

paulwalker-arm approved these changes Sep 30, 2024

fhahn reviewed Sep 30, 2024

david-arm added 2 commits October 1, 2024 09:11

Re-add "assert(!isa<SCEVCouldNotCompute>(BackedgeTakenCount)"

ef5f963

fhahn reviewed Oct 1, 2024

Tidy up the new test!

dd74b6b

david-arm merged commit 0b24031 into llvm:main Oct 2, 2024
8 checks passed

david-arm deleted the create_tc branch October 3, 2024 08:52
