Commit df9ba13

[LV] Handle scalable VFs in optimizeForVFAndUF (#82669)
Given a scalable VF of the form <NumElts * VScale>, this patch adds the ability to discharge a backedge test for a loop whose trip count is between (NumElts, MinVScale*NumElts).

A couple of notes on this:
* Annoyingly, I could not figure out how to write a test for this case. My attempt is checked in as test32_i8 in f67ef1a, but LV uses a fixed vector in that case and ignores the force flags.
* This depends on 9eb5f94 to avoid appearing like a regression. Since SCEV doesn't know any upper bound on vscale without the vscale_range attribute (it doesn't query TTI), the ranges overflow on the multiply. Arguably, this fixes a bug in the current LV code, since in theory vscale can be large enough to overflow for real, but no actual target is going to see that case.
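
The overflow concern in the second note can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: `minOfProduct` is a hypothetical helper, not an LLVM API, and it does not reproduce SCEV's actual range computation. It assumes vscale lies in `[lo, hi]` and reports the lower bound of `k * vscale` in a `bits`-wide unsigned type, falling back to 0 (no useful bound) when the product can wrap.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): lower bound of k * vscale in a
// `bits`-wide unsigned type, assuming vscale is in [lo, hi].
// If k * hi does not fit in `bits` bits, the product range can wrap, so the
// only safe lower bound is 0.
constexpr uint64_t minOfProduct(uint64_t k, uint64_t lo, uint64_t hi,
                                unsigned bits) {
  const uint64_t maxVal =
      bits >= 64 ? UINT64_MAX : ((uint64_t{1} << bits) - 1);
  if (k != 0 && hi > maxVal / k)
    return 0; // k * hi overflows: the range wraps, no useful bound
  return k * lo;
}

// With a bounded vscale (as vscale_range would give), the bound is useful.
static_assert(minOfProduct(4, 2, 16, 32) == 8, "bounded vscale");
// With vscale unbounded in a 32-bit type, the multiply can wrap.
static_assert(minOfProduct(4, 1, 0xFFFFFFFFull, 32) == 0, "unbounded vscale");
```

In this model, an upper bound on vscale is what makes the lower bound of the product usable for a comparison against the trip count.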
1 parent 6325dd5 commit df9ba13

3 files changed, +10 -2 lines changed

llvm/include/llvm/Analysis/ScalarEvolution.h

Lines changed: 1 addition & 0 deletions
@@ -570,6 +570,7 @@ class ScalarEvolution {
   const SCEV *getPtrToIntExpr(const SCEV *Op, Type *Ty);
   const SCEV *getTruncateExpr(const SCEV *Op, Type *Ty, unsigned Depth = 0);
   const SCEV *getVScale(Type *Ty);
+  const SCEV *getElementCount(Type *Ty, ElementCount EC);
   const SCEV *getZeroExtendExpr(const SCEV *Op, Type *Ty, unsigned Depth = 0);
   const SCEV *getZeroExtendExprImpl(const SCEV *Op, Type *Ty,
                                     unsigned Depth = 0);

llvm/lib/Analysis/ScalarEvolution.cpp

Lines changed: 7 additions & 0 deletions
@@ -509,6 +509,13 @@ const SCEV *ScalarEvolution::getVScale(Type *Ty) {
   return S;
 }
 
+const SCEV *ScalarEvolution::getElementCount(Type *Ty, ElementCount EC) {
+  const SCEV *Res = getConstant(Ty, EC.getKnownMinValue());
+  if (EC.isScalable())
+    Res = getMulExpr(Res, getVScale(Ty));
+  return Res;
+}
+
 SCEVCastExpr::SCEVCastExpr(const FoldingSetNodeIDRef ID, SCEVTypes SCEVTy,
                            const SCEV *op, Type *ty)
     : SCEV(ID, SCEVTy, computeExpressionSize(op)), Op(op), Ty(ty) {}
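
To make the meaning of the new `getElementCount` helper concrete, here is a standalone toy model of the value its expression denotes once vscale is fixed. `ToyElementCount` and `evalElementCount` are hypothetical names introduced for illustration; they stand in for `llvm::ElementCount` and the SCEV expression and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for llvm::ElementCount (hypothetical, illustration only).
struct ToyElementCount {
  uint64_t KnownMin; // coefficient, as returned by getKnownMinValue()
  bool Scalable;     // true for a count of the form <KnownMin x vscale>
};

// Value denoted by the expression getElementCount builds, for a given vscale:
// KnownMin for fixed counts, KnownMin * vscale for scalable counts.
constexpr uint64_t evalElementCount(ToyElementCount EC, uint64_t VScale) {
  uint64_t Res = EC.KnownMin;
  if (EC.Scalable)
    Res *= VScale;
  return Res;
}

static_assert(evalElementCount({4, false}, 8) == 4, "fixed count ignores vscale");
static_assert(evalElementCount({4, true}, 8) == 32, "scalable count scales");
```

The model mirrors the structure of the added function: start from the constant coefficient and multiply by vscale only in the scalable case.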

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

Lines changed: 2 additions & 2 deletions
@@ -618,8 +618,8 @@ void VPlanTransforms::optimizeForVFAndUF(VPlan &Plan, ElementCount BestVF,
       Plan.getCanonicalIV()->getStartValue()->getLiveInIRValue()->getType();
   const SCEV *TripCount = createTripCountSCEV(IdxTy, PSE);
   ScalarEvolution &SE = *PSE.getSE();
-  const SCEV *C =
-      SE.getConstant(TripCount->getType(), BestVF.getKnownMinValue() * BestUF);
+  ElementCount NumElements = BestVF.multiplyCoefficientBy(BestUF);
+  const SCEV *C = SE.getElementCount(TripCount->getType(), NumElements);
   if (TripCount->isZero() ||
       !SE.isKnownPredicate(CmpInst::ICMP_ULE, TripCount, C))
     return;
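
The effect of this change on the early-exit test can be sketched with plain integers. The function below is a hypothetical model, not LLVM code: it assumes the trip count is summarized by a known maximum and vscale by a known minimum, and it asks whether the trip count is nonzero and provably at most VF * UF elements.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the guard in optimizeForVFAndUF: the optimization
// proceeds only if the trip count is nonzero and known to be <= VF * UF.
// For a scalable VF, the guaranteed element count is KnownMinVF * UF * MinVScale.
constexpr bool canDropBackedge(uint64_t MaxTripCount, uint64_t KnownMinVF,
                               bool Scalable, uint64_t UF,
                               uint64_t MinVScale) {
  if (MaxTripCount == 0)
    return false; // mirrors the TripCount->isZero() bail-out
  uint64_t MinElements = KnownMinVF * UF;
  if (Scalable)
    MinElements *= MinVScale; // smallest element count any vscale can give
  return MaxTripCount <= MinElements;
}

// Trip count 6 with coefficient 4, UF 1: too large for the coefficient alone,
// but covered once a minimum vscale of 2 is taken into account.
static_assert(!canDropBackedge(6, 4, false, 1, 2), "coefficient only");
static_assert(canDropBackedge(6, 4, true, 1, 2), "scaled by min vscale");
```

The second assertion is an instance of the newly handled case: a trip count above NumElts that still fits within MinVScale*NumElts.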
