-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[VPlan] Add getSCEVExprForVPValue util, use to get trip count SCEV (NFC) #94464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add an initial version of VPScalarEvolution, which can be used to return a SCEV expression for a VPValue. The initial implementation only returns SCEVs for live-in IR values (by constructing a SCEV based on the live-in IR value) and VPExpandSCEVRecipe. This is enough to serve its first use, getting a SCEV for a VPlan's trip count, but will be extended in the future.
@llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesAdd an initial version of VPScalarEvolution, which can be used to return a SCEV expression for a VPValue. The initial implementation only returns SCEVs for live-in IR values (by constructing a SCEV based on the live-in IR value) and VPExpandSCEVRecipe. This is enough to serve its first use, getting a SCEV for a VPlan's trip count, but will be extended in the future. Full diff: https://github.com/llvm/llvm-project/pull/94464.diff 5 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index c7c19ef456c7c..14197259a9f98 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -978,8 +978,9 @@ Value *getRuntimeVF(IRBuilderBase &B, Type *Ty, ElementCount VF) {
return B.CreateElementCount(Ty, VF);
}
-const SCEV *createTripCountSCEV(Type *IdxTy, PredicatedScalarEvolution &PSE,
- Loop *OrigLoop) {
+static const SCEV *createTripCountSCEV(Type *IdxTy,
+ PredicatedScalarEvolution &PSE,
+ Loop *OrigLoop) {
const SCEV *BackedgeTakenCount = PSE.getBackedgeTakenCount();
assert(!isa<SCEVCouldNotCompute>(BackedgeTakenCount) && "Invalid loop count");
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 943edc3520869..de37262d2c3de 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -79,9 +79,6 @@ Value *getRuntimeVF(IRBuilderBase &B, Type *Ty, ElementCount VF);
Value *createStepForVF(IRBuilderBase &B, Type *Ty, ElementCount VF,
int64_t Step);
-const SCEV *createTripCountSCEV(Type *IdxTy, PredicatedScalarEvolution &PSE,
- Loop *CurLoop = nullptr);
-
/// A range of powers-of-2 vectorization factors with fixed start and
/// adjustable end. The range includes start and excludes end, e.g.,:
/// [1, 16) = {1, 2, 4, 8}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
index 90bbf2d5d99fa..e9f01c1a334e1 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
@@ -9,6 +9,7 @@
#include "VPlanAnalysis.h"
#include "VPlan.h"
#include "llvm/ADT/TypeSwitch.h"
+#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;
@@ -265,3 +266,15 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
CachedTypes[V] = ResultTy;
return ResultTy;
}
+
+const SCEV *VPScalarEvolution::getSCEV(VPValue *V) {
+ if (V->isLiveIn())
+ return SE.getSCEV(V->getLiveInIRValue());
+
+ // TODO: Support constructing SCEVs for more recipes as needed.
+ return TypeSwitch<const VPRecipeBase *, const SCEV *>(V->getDefiningRecipe())
+ .Case<VPExpandSCEVRecipe>(
+ [](const VPExpandSCEVRecipe *R) { return R->getSCEV(); })
+ .Default(
+ [this](const VPRecipeBase *) { return SE.getCouldNotCompute(); });
+}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
index 7d310b1b31b6f..9d549778092c9 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
@@ -23,6 +23,8 @@ class VPWidenIntOrFpInductionRecipe;
class VPWidenMemoryRecipe;
struct VPWidenSelectRecipe;
class VPReplicateRecipe;
+class SCEV;
+class ScalarEvolution;
class Type;
/// An analysis for type-inference for VPValues.
@@ -61,6 +63,19 @@ class VPTypeAnalysis {
LLVMContext &getContext() { return Ctx; }
};
+/// A light wrapper over ScalarEvolution to construct SCEV expressions for
+/// VPValues and recipes.
+class VPScalarEvolution {
+ ScalarEvolution &SE;
+
+public:
+ VPScalarEvolution(ScalarEvolution &SE) : SE(SE) {}
+
+ /// Return the SCEV expression for \p V. Returns SCEVCouldNotCompute if no
+ /// SCEV expression could be constructed.
+ const SCEV *getSCEV(VPValue *V);
+};
+
} // end namespace llvm
#endif // LLVM_TRANSFORMS_VECTORIZE_VPLANANALYSIS_H
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index ab3b5cf2b9dab..541246266e1d5 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -674,10 +674,8 @@ void VPlanTransforms::optimizeForVFAndUF(VPlan &Plan, ElementCount BestVF,
m_BranchOnCond(m_Not(m_ActiveLaneMask(m_VPValue(), m_VPValue())))))
return;
- Type *IdxTy =
- Plan.getCanonicalIV()->getStartValue()->getLiveInIRValue()->getType();
- const SCEV *TripCount = createTripCountSCEV(IdxTy, PSE);
ScalarEvolution &SE = *PSE.getSE();
+ const SCEV *TripCount = VPScalarEvolution(SE).getSCEV(Plan.getTripCount());
ElementCount NumElements = BestVF.multiplyCoefficientBy(BestUF);
const SCEV *C = SE.getElementCount(TripCount->getType(), NumElements);
if (TripCount->isZero() ||
|
This looks ok to me, but not yet clear how the new Will it just be a helper class? If so, why not a static method? If not, then how will it be composed across different VPlans and the analysis part? I don't have a strong opinion here, just trying to understand the direction. |
In the future, I am planning to extend it to support constructing SCEVs for any recipe (e.g. Users are expected to be spread across new analyses and transforms. It could also be a single |
Right, with spread usage comes passing the helper classes around, unless they're part of some other existing object that already gets passed around, or is in the global context. In an object, this could even be in the |
I think at this point it is not yet clear if it is desirable to add a common interface to VPValue/VPRecipeBase to get corresponding SCEVs, or if it makes more sense to have more specific interfaces returning more targeted information required to make vectorization decisions. This is why I think a lightweight helper functions for constructing trivial SCEVs would be a good start. WDYT? |
ping :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking of alternatives to consider for discussion.
ScalarEvolution &SE = *PSE.getSE(); | ||
const SCEV *TripCount = VPScalarEvolution(SE).getSCEV(Plan.getTripCount()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suffice to have, in this case, as an alternative,
const SCEV *TripCount = VPScalarEvolution(SE).getSCEV(Plan.getTripCount()); | |
VPValue *TripCountVPV = Plan.getTripCount(); | |
auto *Expand = dyn_cast<VPExpandSCEVRecipe>(TripCountVPV); | |
const SCEV *TripCount = Expand ? Expand->getSCEV() : SE.getSCEV(TripCountVPV->getLiveInIRValue()); |
asserting TripCountVPV is either an Expand or a live-in?
Perhaps could be folded into Plan.getTripCountSCEV(SE), or have vputils::getSCEVExprForVPValue() complement vputils::getOrCreateVPValueForSCEVExpr().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified by moving to a single function, but kept declaration and definition in VPAnalysis for now
ping :) |
ScalarEvolution &SE = *PSE.getSE(); | ||
const SCEV *TripCount = | ||
vputils::getSCEVExprForVPValue(Plan.getTripCount(), SE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, independently - BOC below would better use an unconditional branch than BranchOnCond with true
condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will also entail to remove the region and header phis.
@@ -26,6 +26,8 @@ struct VPWidenSelectRecipe; | |||
class VPReplicateRecipe; | |||
class VPRecipeBase; | |||
class VPlan; | |||
class SCEV; | |||
class ScalarEvolution; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: better placed above between LLVMContext and VPValue, along with Type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to VPlanUtils.h
/// Return the SCEV expression for \p V. Returns SCEVCouldNotCompute if no | ||
/// SCEV expression could be constructed. | ||
const SCEV *getSCEVExprForVPValue(VPValue *V, ScalarEvolution &SE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this more naturally belong to VPlan.h and VPlan.cpp, alongside other vputils functions, in particular the complementary vputils::getOrCreateVPValueForSCEVExpr()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VPlan.h
has grown quite a bit by now, so I think it would possibly make sense to split things up by functionality a bit (same for VPlan.cpp). it doesn't necessarily have to go into VPlanAnalysis.cpp
but could be moved into a separate VPlanUtils.h as an alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, +1 to reorganize all vputils
functions in files of their own.
Most of these would presumably promote to a type system of VPValues, or other refinements like introducing an extract-first-lane as only user to capture onlyFirstLaneUsed()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to VPlanUtils.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to VPlanUtils
@@ -26,6 +26,8 @@ struct VPWidenSelectRecipe; | |||
class VPReplicateRecipe; | |||
class VPRecipeBase; | |||
class VPlan; | |||
class SCEV; | |||
class ScalarEvolution; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to VPlanUtils.h
/// Return the SCEV expression for \p V. Returns SCEVCouldNotCompute if no | ||
/// SCEV expression could be constructed. | ||
const SCEV *getSCEVExprForVPValue(VPValue *V, ScalarEvolution &SE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to VPlanUtils.cpp
ScalarEvolution &SE = *PSE.getSE(); | ||
const SCEV *TripCount = | ||
vputils::getSCEVExprForVPValue(Plan.getTripCount(), SE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will also entail to remove the region and header phis.
ping :) |
ping :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this, thanks, with various minor comments, if @rengolin is.
const SCEV *vputils::getSCEVExprForVPValue(VPValue *V, ScalarEvolution &SE) { | ||
if (V->isLiveIn()) | ||
return SE.getSCEV(V->getLiveInIRValue()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to work well for Values at VPlan's boundary: can complement live-ins with live-out/VPIRInstructions which could also simply retrieve SE.getSCEV() of the instruction they wrap; as follow-up, along with concrete use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep
@@ -67,6 +67,7 @@ class VPTypeAnalysis { | |||
// Collect a VPlan's ephemeral recipes (those used only by an assume). | |||
void collectEphemeralRecipesForVPlan(VPlan &Plan, | |||
DenseSet<VPRecipeBase *> &EphRecipes); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed, thanks!
ScalarEvolution &SE = *PSE.getSE(); | ||
const SCEV *TripCount = | ||
vputils::getSCEVExprForVPValue(Plan.getTripCount(), SE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth asserting that the SCEV returned could be computed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, added assert.
Loop *OrigLoop) { | ||
static const SCEV *createTripCountSCEV(Type *IdxTy, | ||
PredicatedScalarEvolution &PSE, | ||
Loop *OrigLoop) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly here or as follow-up:
The call to createTripCountSCEV() in LVP::selectEpilogueVectorizationFactor() has available VPlans from which to retrieve a TripCount VPValue and call the new API instead of continuing to call createTripCountSCEV().
The other two calls to createTripCountSCEV() are inside calls to VPlan::createInitialVPlan(), setting its first parameter; should they be inlined into createInitialVPlan() itself instead?
Thereby retiring createTripCountSCEV altogether, leaving only vputils::getSCEVExprForVPValue(Plan.getTripCount(), SE) as the way to obtain the SCEV of a loop's trip count.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated the PR to also inline createTripCountSCEV
.
|
||
/// Return the SCEV expression for \p V. Returns SCEVCouldNotCompute if no | ||
/// SCEV expression could be constructed. | ||
const SCEV *getSCEVExprForVPValue(VPValue *V, ScalarEvolution &SE); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This complements getOrCreateVPValueForSCEVExpr()
above and should be placed next to it.
(Otherwise would have suffice to call it getSCEVFor() or getSCEV(), as it currently delegates to SE.getSCEV() or SCEVExpandRecioe.getSCEV(). Could even have getOrCreateSCEVExprForVPValue()
more consistently, but getSCEV() may typically create them - in contrast to getExistingSCEV().)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, comments should addressed!
@rengolin WDYT about the latest version?
Loop *OrigLoop) { | ||
static const SCEV *createTripCountSCEV(Type *IdxTy, | ||
PredicatedScalarEvolution &PSE, | ||
Loop *OrigLoop) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated the PR to also inline createTripCountSCEV
.
@@ -67,6 +67,7 @@ class VPTypeAnalysis { | |||
// Collect a VPlan's ephemeral recipes (those used only by an assume). | |||
void collectEphemeralRecipesForVPlan(VPlan &Plan, | |||
DenseSet<VPRecipeBase *> &EphRecipes); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed, thanks!
ScalarEvolution &SE = *PSE.getSE(); | ||
const SCEV *TripCount = | ||
vputils::getSCEVExprForVPValue(Plan.getTripCount(), SE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, added assert.
const SCEV *vputils::getSCEVExprForVPValue(VPValue *V, ScalarEvolution &SE) { | ||
if (V->isLiveIn()) | ||
return SE.getSCEV(V->getLiveInIRValue()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep
|
||
/// Return the SCEV expression for \p V. Returns SCEVCouldNotCompute if no | ||
/// SCEV expression could be constructed. | ||
const SCEV *getSCEVExprForVPValue(VPValue *V, ScalarEvolution &SE); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the additional cleanup, adding few final nits.
bool RequiresScalarEpilogueCheck, | ||
bool TailFolded, Loop *TheLoop) { | ||
VPIRBasicBlock *Entry = createVPIRBasicBlockFor(TheLoop->getLoopPreheader()); | ||
VPBasicBlock *VecPreheader = new VPBasicBlock("vector.ph"); | ||
auto Plan = std::make_unique<VPlan>(Entry, VecPreheader); | ||
Plan->TripCount = | ||
vputils::getOrCreateVPValueForSCEVExpr(*Plan, TripCount, SE); | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: keeping the inlined block seems a bit odd, worth unboxing, or adding a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
: Builder.createICmp(CmpInst::ICMP_EQ, Plan->getTripCount(), | ||
&Plan->getVectorTripCount(), | ||
ScalarLatchTerm->getDebugLoc(), "cmp.n"); | ||
TailFolded ? Plan->getOrAddLiveIn(ConstantInt::getTrue( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if the inlined block is unboxed, this change can be avoided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undone, thanks!
@@ -3477,8 +3474,8 @@ class VPlan { | |||
/// middle VPBasicBlock. If a check is needed to guard executing the scalar | |||
/// epilogue loop, it will be added to the middle block, together with | |||
/// VPBasicBlocks for the scalar preheader and exit blocks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would be good to explain what IdxTy
stands for, and/or use a more descriptive name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to InductionTy
and documented, thanks!
…FC) (llvm#94464) Add a new getSCEVExprForVPValue utility which can be used to get a SCEV expression for a VPValue. The initial implementation only returns SCEVs for live-in IR values (by constructing a SCEV based on the live-in IR value) and VPExpandSCEVRecipe. This is enough to serve its first use, getting a SCEV for a VPlan's trip count, but will be extended in the future. It also removes createTripCountSCEV, as the new helper can be used to retrieve the SCEV from the VPlan. PR: llvm#94464
Add a new getSCEVExprForVPValue utility which can be used to get a SCEV expression for a VPValue. The initial implementation only returns SCEVs for live-in IR values (by constructing a SCEV based on the live-in IR value) and VPExpandSCEVRecipe. This is enough to serve its first use, getting a SCEV for a VPlan's trip count, but will be extended in the future.
It also removes createTripCountSCEV, as the new helper can be used to retrieve the SCEV from the VPlan.