Skip to content

[LoopVectorize] In LoopVectorize.cpp start using getSymbolicMaxBackedgeTakenCount #108833

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 2, 2024

Conversation

david-arm
Copy link
Contributor

LoopVectorizationLegality currently only treats a loop as legal to vectorise
if PredicatedScalarEvolution::getBackedgeTakenCount returns a valid SCEV, or
more precisely that the loop must have an exact backedge taken count.
Therefore, in LoopVectorize.cpp we can safely replace all calls to
getBackedgeTakenCount with calls to getSymbolicMaxBackedgeTakenCount, since
the result is the same.

This also helps prepare the loop vectoriser for PR #88385.

@llvmbot
Copy link
Member

llvmbot commented Sep 16, 2024

@llvm/pr-subscribers-llvm-transforms

Author: David Sherwood (david-arm)

Changes

LoopVectorizationLegality currently only treats a loop as legal to vectorise
if PredicatedScalarEvolution::getBackedgeTakenCount returns a valid SCEV, or
more precisely that the loop must have an exact backedge taken count.
Therefore, in LoopVectorize.cpp we can safely replace all calls to
getBackedgeTakenCount with calls to getSymbolicMaxBackedgeTakenCount, since
the result is the same.

This also helps prepare the loop vectoriser for PR #88385.


Full diff: https://github.com/llvm/llvm-project/pull/108833.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+3-3)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index f726b171969a30..5ab0fd12c538f3 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -907,7 +907,7 @@ Value *getRuntimeVF(IRBuilderBase &B, Type *Ty, ElementCount VF) {
 
 const SCEV *createTripCountSCEV(Type *IdxTy, PredicatedScalarEvolution &PSE,
                                 Loop *OrigLoop) {
-  const SCEV *BackedgeTakenCount = PSE.getBackedgeTakenCount();
+  const SCEV *BackedgeTakenCount = PSE.getSymbolicMaxBackedgeTakenCount();
   assert(!isa<SCEVCouldNotCompute>(BackedgeTakenCount) && "Invalid loop count");
 
   ScalarEvolution &SE = *PSE.getSE();
@@ -4090,7 +4090,7 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
     unsigned MaxVFtimesIC =
         UserIC ? *MaxPowerOf2RuntimeVF * UserIC : *MaxPowerOf2RuntimeVF;
     ScalarEvolution *SE = PSE.getSE();
-    const SCEV *BackedgeTakenCount = PSE.getBackedgeTakenCount();
+    const SCEV *BackedgeTakenCount = PSE.getSymbolicMaxBackedgeTakenCount();
     const SCEV *ExitCount = SE->getAddExpr(
         BackedgeTakenCount, SE->getOne(BackedgeTakenCount->getType()));
     const SCEV *Rem = SE->getURemExpr(
@@ -9584,7 +9584,7 @@ static bool processLoopInVPlanNativePath(
     ProfileSummaryInfo *PSI, LoopVectorizeHints &Hints,
     LoopVectorizationRequirements &Requirements) {
 
-  if (isa<SCEVCouldNotCompute>(PSE.getBackedgeTakenCount())) {
+  if (isa<SCEVCouldNotCompute>(PSE.getSymbolicMaxBackedgeTakenCount())) {
     LLVM_DEBUG(dbgs() << "LV: cannot compute the outer-loop trip count\n");
     return false;
   }

Copy link
Collaborator

@paulwalker-arm paulwalker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might be taking the "create small patches" idea a bit too far but I suppose this change does ensure LoopVectorize is not relying on the more restricted behaviour of getBackedgeTakenCount().

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might be taking the "create small patches" idea a bit too far but I suppose this change does ensure LoopVectorize is not relying on the more restricted behaviour of getBackedgeTakenCount().

Yeah, I think as is this may be more confusing for the reader, at least without a comment. Might also clarify to with an assertion that getBackedgeTakenCount is not SCEVCouldNotCompute....

@@ -9584,7 +9584,7 @@ static bool processLoopInVPlanNativePath(
ProfileSummaryInfo *PSI, LoopVectorizeHints &Hints,
LoopVectorizationRequirements &Requirements) {

if (isa<SCEVCouldNotCompute>(PSE.getBackedgeTakenCount())) {
if (isa<SCEVCouldNotCompute>(PSE.getSymbolicMaxBackedgeTakenCount())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the change the behavior to allow vectorizing loops where getBackedgeTakenCount is SCEVCouldNotCompute? May need a test case and would be an unintended change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. I was being too eager! I looked and there is no defence against loops with uncountable exits, since LoopVectorizationLegality doesn't filter those out in advance. I've reverted this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add a test case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given my patch no longer changes this code are you asking if I can add a negative test for the native vplan case that proves we don't vectorise outer loops with an early exit? I can have a look, although I'm not sure that's even possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, not sure if this is what you wanted or not but I added a test for an outer loop with an early exit. It didn't seem to be covered by any existing tests, although the legality code bails out too early to even attempt calculating a backedge taken count, due to the form of the conditional branch that is rejected by canVectorizeOuterLoop. I hacked the legality code and verified that we can get a symbolic max backedge take count for the outer loop, but not a normal one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can add a negative test for the native vplan case that proves we don't vectorise outer loops with an early exit?

yes that was what I had in mind to guard against the regression in the future, thanks!

@david-arm
Copy link
Contributor Author

I've added comments explaining why it's ok to use the symbolic max variant. I also fixed rename a variable in createInitialVPlan that was shadowing a variable of the same name in the VPlan class.

Comment on lines 4078 to 4080
// Currently only loops with countable exits are vectorized so it's safe to
// use getSymbolicMaxBackedgeTakenCount as it should give the same result
// as getBackedgeTakenCount.
Copy link
Collaborator

@paulwalker-arm paulwalker-arm Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me the comment suggests when uncountable exists are support calling getSymbolicMaxBackedgeTakenCount becomes unsafe? which I think is the opposite of what you mean.

Perhaps something akin to:

Currently only loops with countable exits are vectorized, but calling getSymbolicMaxBackedgeTakenCount allows enablement work for loops with uncountable exits whilst also ensuring the symbolic maximum and known back-edge taken count remain identical for loops with countable exits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a good suggestion! I've rebased the patch to see if the libc++ tests start passing now, and updated the comment. Two for the price of one. :)

…kenCount

LoopVectorizationLegality currently only treats a loop as legal
to vectorise if PredicatedScalarEvolution::getBackedgeTakenCount
returns a valid SCEV, or more precisely that the loop must have
an exact backedge taken count. Therefore, in LoopVectorize.cpp
we can safely replace all calls to getBackedgeTakenCount with
calls to getSymbolicMaxBackedgeTakenCount, since the result is
the same.

This also helps prepare the loop vectoriser for PR llvm#88385.
* Updated comments around calls to getSymbolicMaxBackedgeTakenCount
* Updated assert.
* Add outer loop vectorisation test that has an early exit in the
outer loop. I checked and there are no existing tests for this. It
fails in LoopVectorizationLegality so we never reach the point
in LoopVectorize.cpp where attempt to calculate the backedge taken
count.
Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

br i1 %cmp.early, label %for.early, label %for.body.inner

for.body.inner:
%indvars.iv = phi i64 [ 0, %for.body ], [ %indvars.iv.next, %for.body.inner ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
%indvars.iv = phi i64 [ 0, %for.body ], [ %indvars.iv.next, %for.body.inner ]
%iv = phi i64 [ 0, %for.body ], [ %indvars.iv.next, %for.body.inner ]

@@ -9584,7 +9584,7 @@ static bool processLoopInVPlanNativePath(
ProfileSummaryInfo *PSI, LoopVectorizeHints &Hints,
LoopVectorizationRequirements &Requirements) {

if (isa<SCEVCouldNotCompute>(PSE.getBackedgeTakenCount())) {
if (isa<SCEVCouldNotCompute>(PSE.getSymbolicMaxBackedgeTakenCount())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can add a negative test for the native vplan case that proves we don't vectorise outer loops with an early exit?

yes that was what I had in mind to guard against the regression in the future, thanks!

@david-arm david-arm merged commit 0b24031 into llvm:main Oct 2, 2024
8 checks passed
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024
…geTakenCount (llvm#108833)

LoopVectorizationLegality currently only treats a loop as legal to vectorise
if PredicatedScalarEvolution::getBackedgeTakenCount returns a valid
SCEV, or more precisely that the loop must have an exact backedge taken
count. Therefore, in LoopVectorize.cpp we can safely replace all calls to
getBackedgeTakenCount with calls to getSymbolicMaxBackedgeTakenCount,
since the result is the same.

This also helps prepare the loop vectoriser for PR llvm#88385.
@david-arm david-arm deleted the create_tc branch October 3, 2024 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants