[LV] Handle scalable VFs in optimizeForVFAndUF #82669
Given a scalable VF of the form <NumElts * VScale>, this patch adds the ability to discharge a backedge test for a loop whose trip count is between (NumElts, MinVScale*NumElts).

A couple of notes on this:
* Annoyingly, I could not figure out how to write a test for this case. My attempt is checked in as test32_i8 in f67ef1a, but LV uses a fixed vector in that case and ignores the force flags.
* This depends on 9eb5f94 to avoid appearing like a regression. Since SCEV doesn't know any upper bound on vscale without the vscale_range attribute (it doesn't query TTI), the ranges overflow on the multiply. Arguably, this is fixing a bug in the current LV code, since in theory vscale can be large enough to overflow for real, but no actual target is going to see that case.
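The condition described above can be sketched in standalone C++. This is only an illustrative model, not LLVM code: `VectorStep`, `mulNoOverflow`, and `backedgeNeverTaken` are hypothetical names, and the trip count is modeled as a single known upper bound rather than a SCEV expression. It assumes the minimum vscale is known (for example from `vscale_range`) and gives up when the product does not fit in 64 bits.

```cpp
#include <cstdint>

// Hypothetical model (not LLVM's API) of one vector iteration's width.
struct VectorStep {
  uint64_t KnownMin; // NumElts in <NumElts * VScale>
  uint64_t UF;       // unroll factor
  bool Scalable;     // true for a scalable VF, false for a fixed VF
};

// Multiply A and B into Out; return false if the product would wrap.
bool mulNoOverflow(uint64_t A, uint64_t B, uint64_t &Out) {
  if (B != 0 && A > UINT64_MAX / B)
    return false;
  Out = A * B;
  return true;
}

// True if a single vector iteration is guaranteed to cover the whole loop,
// so the backedge test can be discharged. MinVScale is the smallest vscale
// the target guarantees (assumed known here).
bool backedgeNeverTaken(uint64_t MaxTripCount, const VectorStep &S,
                        uint64_t MinVScale) {
  if (MaxTripCount == 0) // mirrors the TripCount->isZero() bail-out
    return false;
  uint64_t Lanes = 0, Covered = 0;
  if (!mulNoOverflow(S.KnownMin, S.UF, Lanes))
    return false; // be conservative when the width cannot be represented
  if (!mulNoOverflow(Lanes, S.Scalable ? MinVScale : 1, Covered))
    return false;
  return MaxTripCount <= Covered;
}
```

With `KnownMin = 8`, `UF = 1`, and `MinVScale = 2`, a trip count of 16 qualifies for a scalable VF but not for a fixed VF of 8, which is the extra range this patch is meant to cover.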
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-analysis
Author: Philip Reames (preames)
Changes: Given a scalable VF of the form <NumElts * VScale>, this patch adds the ability to discharge a backedge test for a loop whose trip count is between (NumElts, MinVScale*NumElts).
Full diff: https://github.com/llvm/llvm-project/pull/82669.diff 3 Files Affected:
diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h b/llvm/include/llvm/Analysis/ScalarEvolution.h
index 0880f9c65aa45d..5828cc156cc785 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -570,6 +570,7 @@ class ScalarEvolution {
   const SCEV *getPtrToIntExpr(const SCEV *Op, Type *Ty);
   const SCEV *getTruncateExpr(const SCEV *Op, Type *Ty, unsigned Depth = 0);
   const SCEV *getVScale(Type *Ty);
+  const SCEV *getElementCount(Type *Ty, ElementCount EC);
   const SCEV *getZeroExtendExpr(const SCEV *Op, Type *Ty, unsigned Depth = 0);
   const SCEV *getZeroExtendExprImpl(const SCEV *Op, Type *Ty,
                                     unsigned Depth = 0);
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 4b2db80bc1ec30..e1e6742e50efec 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -509,6 +509,13 @@ const SCEV *ScalarEvolution::getVScale(Type *Ty) {
   return S;
 }
 
+const SCEV *ScalarEvolution::getElementCount(Type *Ty, ElementCount EC) {
+  const SCEV *Res = getConstant(Ty, EC.getKnownMinValue());
+  if (EC.isScalable())
+    Res = getMulExpr(Res, getVScale(Ty));
+  return Res;
+}
+
 SCEVCastExpr::SCEVCastExpr(const FoldingSetNodeIDRef ID, SCEVTypes SCEVTy,
                            const SCEV *op, Type *ty)
     : SCEV(ID, SCEVTy, computeExpressionSize(op)), Op(op), Ty(ty) {}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 9c3f35112b592f..a01eaa3c6c8b3a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -626,8 +626,8 @@ void VPlanTransforms::optimizeForVFAndUF(VPlan &Plan, ElementCount BestVF,
       Plan.getCanonicalIV()->getStartValue()->getLiveInIRValue()->getType();
   const SCEV *TripCount = createTripCountSCEV(IdxTy, PSE);
   ScalarEvolution &SE = *PSE.getSE();
-  const SCEV *C =
-      SE.getConstant(TripCount->getType(), BestVF.getKnownMinValue() * BestUF);
+  ElementCount NumElements = BestVF.multiplyCoefficientBy(BestUF);
+  const SCEV *C = SE.getElementCount(TripCount->getType(), NumElements);
   if (TripCount->isZero() ||
       !SE.isKnownPredicate(CmpInst::ICMP_ULE, TripCount, C))
     return;
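The multiply by vscale in `getElementCount` is where the range concern from the description comes in. The following is a rough standalone model of that reasoning, not LLVM's `ConstantRange` or SCEV implementation: `URange`, `mulRange`, and `knownULE` are hypothetical names, and the choice to model an unbounded vscale as `[1, UINT64_MAX]` is an assumption made for illustration.

```cpp
#include <cstdint>

// Hypothetical closed unsigned interval [Lo, Hi]; not LLVM's ConstantRange.
struct URange {
  uint64_t Lo;
  uint64_t Hi;
};

// Multiply two ranges. If the largest product could wrap, nothing useful is
// known about the result, so fall back to the full range.
URange mulRange(URange A, URange B) {
  if (B.Hi != 0 && A.Hi > UINT64_MAX / B.Hi)
    return URange{0, UINT64_MAX};
  return URange{A.Lo * B.Lo, A.Hi * B.Hi};
}

// TC <= C is provable from ranges alone only if max(TC) <= min(C).
bool knownULE(URange TC, URange C) { return TC.Hi <= C.Lo; }
```

In this model, `mulRange({8, 8}, {1, UINT64_MAX})` collapses to the full range, so even a trip count of exactly 8 can no longer be proven to fit, which is how the change could look like a regression without a known bound on vscale. With a bounded vscale such as `{2, 16}`, the product is `{16, 128}` and a trip count of 16 is provably covered.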
LGTM, thanks!