-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[VPlan] Move auxiliary declarations out of VPlan.h (NFC). #124104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-vectorizers Author: Florian Hahn (fhahn) ChangesNothing in VPlan.h directly depends on VPTransformState, VPCostContext, VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate header to reduce the size of widely used VPlan.h. This is a first step towards more cleanly separating declarations in VPlan. Besides reducing VPlan.h's size, this also allows including additional VPlan-related headers in VPlanHelpers.h for use there. An example is using VPDominatorTree in VPTransformState Patch is 58.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/124104.diff 13 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h b/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
index bc44ec11edb7b0..fc10a518d39ef8 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
@@ -40,6 +40,7 @@ class OptimizationRemarkEmitter;
class TargetTransformInfo;
class TargetLibraryInfo;
class VPRecipeBuilder;
+struct VFRange;
/// VPlan-based builder utility analogous to IRBuilder.
class VPBuilder {
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 7167e2179af535..777822d71e80a5 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -59,6 +59,7 @@
#include "VPlan.h"
#include "VPlanAnalysis.h"
#include "VPlanHCFGBuilder.h"
+#include "VPlanHelpers.h"
#include "VPlanPatternMatch.h"
#include "VPlanTransforms.h"
#include "VPlanUtils.h"
diff --git a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
index 44745bfd46f891..5dc84a9132160e 100644
--- a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
+++ b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
@@ -23,6 +23,7 @@ class LoopVectorizationCostModel;
class TargetLibraryInfo;
class TargetTransformInfo;
struct HistogramInfo;
+struct VFRange;
/// A chain of instructions that form a partial reduction.
/// Designed to match: reduction_bin_op (bin_op (extend (A), (extend (B))),
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index f1228368804beb..6cd88897accd43 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -19,6 +19,7 @@
#include "VPlan.h"
#include "LoopVectorizationPlanner.h"
#include "VPlanCFG.h"
+#include "VPlanHelpers.h"
#include "VPlanPatternMatch.h"
#include "VPlanTransforms.h"
#include "VPlanUtils.h"
@@ -400,8 +401,8 @@ void VPTransformState::packScalarIntoVectorValue(VPValue *Def,
set(Def, VectorValue);
}
-BasicBlock *
-VPBasicBlock::createEmptyBasicBlock(VPTransformState::CFGState &CFG) {
+BasicBlock *VPBasicBlock::createEmptyBasicBlock(VPTransformState &State) {
+ auto &CFG = State.CFG;
// BB stands for IR BasicBlocks. VPBB stands for VPlan VPBasicBlocks.
// Pred stands for Predessor. Prev stands for Previous - last visited/created.
BasicBlock *PrevBB = CFG.PrevBB;
@@ -412,7 +413,8 @@ VPBasicBlock::createEmptyBasicBlock(VPTransformState::CFGState &CFG) {
return NewBB;
}
-void VPBasicBlock::connectToPredecessors(VPTransformState::CFGState &CFG) {
+void VPBasicBlock::connectToPredecessors(VPTransformState &State) {
+ auto &CFG = State.CFG;
BasicBlock *NewBB = CFG.VPBB2IRBB[this];
// Hook up the new basic block to its predecessors.
for (VPBlockBase *PredVPBlock : getHierarchicalPredecessors()) {
@@ -467,7 +469,7 @@ void VPIRBasicBlock::execute(VPTransformState *State) {
"other blocks must be terminated by a branch");
}
- connectToPredecessors(State->CFG);
+ connectToPredecessors(*State);
}
VPIRBasicBlock *VPIRBasicBlock::clone() {
@@ -494,7 +496,7 @@ void VPBasicBlock::execute(VPTransformState *State) {
// * the exit of a replicate region.
State->CFG.VPBB2IRBB[this] = NewBB;
} else {
- NewBB = createEmptyBasicBlock(State->CFG);
+ NewBB = createEmptyBasicBlock(*State);
State->Builder.SetInsertPoint(NewBB);
// Temporarily terminate with unreachable until CFG is rewired.
@@ -507,7 +509,7 @@ void VPBasicBlock::execute(VPTransformState *State) {
State->CFG.PrevBB = NewBB;
State->CFG.VPBB2IRBB[this] = NewBB;
- connectToPredecessors(State->CFG);
+ connectToPredecessors(*State);
}
// 2. Fill the IR basic block with IR instructions.
@@ -616,6 +618,11 @@ bool VPBasicBlock::isExiting() const {
}
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+void VPBlockBase::print(raw_ostream &O) const {
+ VPSlotTracker SlotTracker(getPlan());
+ print(O, "", SlotTracker);
+}
+
void VPBlockBase::printSuccessors(raw_ostream &O, const Twine &Indent) const {
if (getSuccessors().empty()) {
O << Indent << "No successors\n";
@@ -1460,58 +1467,6 @@ void VPUser::printOperands(raw_ostream &O, VPSlotTracker &SlotTracker) const {
}
#endif
-void VPInterleavedAccessInfo::visitRegion(VPRegionBlock *Region,
- Old2NewTy &Old2New,
- InterleavedAccessInfo &IAI) {
- ReversePostOrderTraversal<VPBlockShallowTraversalWrapper<VPBlockBase *>>
- RPOT(Region->getEntry());
- for (VPBlockBase *Base : RPOT) {
- visitBlock(Base, Old2New, IAI);
- }
-}
-
-void VPInterleavedAccessInfo::visitBlock(VPBlockBase *Block, Old2NewTy &Old2New,
- InterleavedAccessInfo &IAI) {
- if (VPBasicBlock *VPBB = dyn_cast<VPBasicBlock>(Block)) {
- for (VPRecipeBase &VPI : *VPBB) {
- if (isa<VPWidenPHIRecipe>(&VPI))
- continue;
- assert(isa<VPInstruction>(&VPI) && "Can only handle VPInstructions");
- auto *VPInst = cast<VPInstruction>(&VPI);
-
- auto *Inst = dyn_cast_or_null<Instruction>(VPInst->getUnderlyingValue());
- if (!Inst)
- continue;
- auto *IG = IAI.getInterleaveGroup(Inst);
- if (!IG)
- continue;
-
- auto NewIGIter = Old2New.find(IG);
- if (NewIGIter == Old2New.end())
- Old2New[IG] = new InterleaveGroup<VPInstruction>(
- IG->getFactor(), IG->isReverse(), IG->getAlign());
-
- if (Inst == IG->getInsertPos())
- Old2New[IG]->setInsertPos(VPInst);
-
- InterleaveGroupMap[VPInst] = Old2New[IG];
- InterleaveGroupMap[VPInst]->insertMember(
- VPInst, IG->getIndex(Inst),
- Align(IG->isReverse() ? (-1) * int(IG->getFactor())
- : IG->getFactor()));
- }
- } else if (VPRegionBlock *Region = dyn_cast<VPRegionBlock>(Block))
- visitRegion(Region, Old2New, IAI);
- else
- llvm_unreachable("Unsupported kind of VPBlock.");
-}
-
-VPInterleavedAccessInfo::VPInterleavedAccessInfo(VPlan &Plan,
- InterleavedAccessInfo &IAI) {
- Old2NewTy Old2New;
- visitRegion(Plan.getVectorLoopRegion(), Old2New, IAI);
-}
-
void VPSlotTracker::assignName(const VPValue *V) {
assert(!VPValue2Name.contains(V) && "VPValue already has a name!");
auto *UV = V->getUnderlyingValue();
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 11ba7f06735134..70ff82c2bb7541 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -17,7 +17,6 @@
/// 4. VPInstruction, a concrete Recipe and VPUser modeling a single planned
/// instruction;
/// 5. The VPlan class holding a candidate for vectorization;
-/// 6. The VPlanPrinter class providing a way to print a plan in dot format;
/// These are documented in docs/VectorizationPlan.rst.
//
//===----------------------------------------------------------------------===//
@@ -34,10 +33,7 @@
#include "llvm/ADT/Twine.h"
#include "llvm/ADT/ilist.h"
#include "llvm/ADT/ilist_node.h"
-#include "llvm/Analysis/DomTreeUpdater.h"
#include "llvm/Analysis/IVDescriptors.h"
-#include "llvm/Analysis/LoopInfo.h"
-#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Analysis/VectorUtils.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/FMF.h"
@@ -54,7 +50,7 @@ class BasicBlock;
class DominatorTree;
class InnerLoopVectorizer;
class IRBuilderBase;
-class LoopInfo;
+struct VPTransformState;
class raw_ostream;
class RecurrenceDescriptor;
class SCEV;
@@ -62,11 +58,11 @@ class Type;
class VPBasicBlock;
class VPRegionBlock;
class VPlan;
+class VPLane;
class VPReplicateRecipe;
class VPlanSlp;
class Value;
class LoopVectorizationCostModel;
-class LoopVersioning;
struct VPCostContext;
@@ -74,324 +70,8 @@ namespace Intrinsic {
typedef unsigned ID;
}
-/// Returns a calculation for the total number of elements for a given \p VF.
-/// For fixed width vectors this value is a constant, whereas for scalable
-/// vectors it is an expression determined at runtime.
-Value *getRuntimeVF(IRBuilderBase &B, Type *Ty, ElementCount VF);
-
-/// Return a value for Step multiplied by VF.
-Value *createStepForVF(IRBuilderBase &B, Type *Ty, ElementCount VF,
- int64_t Step);
-
-/// A helper function that returns the reciprocal of the block probability of
-/// predicated blocks. If we return X, we are assuming the predicated block
-/// will execute once for every X iterations of the loop header.
-///
-/// TODO: We should use actual block probability here, if available. Currently,
-/// we always assume predicated blocks have a 50% chance of executing.
-inline unsigned getReciprocalPredBlockProb() { return 2; }
-
-/// A range of powers-of-2 vectorization factors with fixed start and
-/// adjustable end. The range includes start and excludes end, e.g.,:
-/// [1, 16) = {1, 2, 4, 8}
-struct VFRange {
- // A power of 2.
- const ElementCount Start;
-
- // A power of 2. If End <= Start range is empty.
- ElementCount End;
-
- bool isEmpty() const {
- return End.getKnownMinValue() <= Start.getKnownMinValue();
- }
-
- VFRange(const ElementCount &Start, const ElementCount &End)
- : Start(Start), End(End) {
- assert(Start.isScalable() == End.isScalable() &&
- "Both Start and End should have the same scalable flag");
- assert(isPowerOf2_32(Start.getKnownMinValue()) &&
- "Expected Start to be a power of 2");
- assert(isPowerOf2_32(End.getKnownMinValue()) &&
- "Expected End to be a power of 2");
- }
-
- /// Iterator to iterate over vectorization factors in a VFRange.
- class iterator
- : public iterator_facade_base<iterator, std::forward_iterator_tag,
- ElementCount> {
- ElementCount VF;
-
- public:
- iterator(ElementCount VF) : VF(VF) {}
-
- bool operator==(const iterator &Other) const { return VF == Other.VF; }
-
- ElementCount operator*() const { return VF; }
-
- iterator &operator++() {
- VF *= 2;
- return *this;
- }
- };
-
- iterator begin() { return iterator(Start); }
- iterator end() {
- assert(isPowerOf2_32(End.getKnownMinValue()));
- return iterator(End);
- }
-};
-
using VPlanPtr = std::unique_ptr<VPlan>;
-/// In what follows, the term "input IR" refers to code that is fed into the
-/// vectorizer whereas the term "output IR" refers to code that is generated by
-/// the vectorizer.
-
-/// VPLane provides a way to access lanes in both fixed width and scalable
-/// vectors, where for the latter the lane index sometimes needs calculating
-/// as a runtime expression.
-class VPLane {
-public:
- /// Kind describes how to interpret Lane.
- enum class Kind : uint8_t {
- /// For First, Lane is the index into the first N elements of a
- /// fixed-vector <N x <ElTy>> or a scalable vector <vscale x N x <ElTy>>.
- First,
- /// For ScalableLast, Lane is the offset from the start of the last
- /// N-element subvector in a scalable vector <vscale x N x <ElTy>>. For
- /// example, a Lane of 0 corresponds to lane `(vscale - 1) * N`, a Lane of
- /// 1 corresponds to `((vscale - 1) * N) + 1`, etc.
- ScalableLast
- };
-
-private:
- /// in [0..VF)
- unsigned Lane;
-
- /// Indicates how the Lane should be interpreted, as described above.
- Kind LaneKind;
-
-public:
- VPLane(unsigned Lane) : Lane(Lane), LaneKind(VPLane::Kind::First) {}
- VPLane(unsigned Lane, Kind LaneKind) : Lane(Lane), LaneKind(LaneKind) {}
-
- static VPLane getFirstLane() { return VPLane(0, VPLane::Kind::First); }
-
- static VPLane getLaneFromEnd(const ElementCount &VF, unsigned Offset) {
- assert(Offset > 0 && Offset <= VF.getKnownMinValue() &&
- "trying to extract with invalid offset");
- unsigned LaneOffset = VF.getKnownMinValue() - Offset;
- Kind LaneKind;
- if (VF.isScalable())
- // In this case 'LaneOffset' refers to the offset from the start of the
- // last subvector with VF.getKnownMinValue() elements.
- LaneKind = VPLane::Kind::ScalableLast;
- else
- LaneKind = VPLane::Kind::First;
- return VPLane(LaneOffset, LaneKind);
- }
-
- static VPLane getLastLaneForVF(const ElementCount &VF) {
- return getLaneFromEnd(VF, 1);
- }
-
- /// Returns a compile-time known value for the lane index and asserts if the
- /// lane can only be calculated at runtime.
- unsigned getKnownLane() const {
- assert(LaneKind == Kind::First);
- return Lane;
- }
-
- /// Returns an expression describing the lane index that can be used at
- /// runtime.
- Value *getAsRuntimeExpr(IRBuilderBase &Builder, const ElementCount &VF) const;
-
- /// Returns the Kind of lane offset.
- Kind getKind() const { return LaneKind; }
-
- /// Returns true if this is the first lane of the whole vector.
- bool isFirstLane() const { return Lane == 0 && LaneKind == Kind::First; }
-
- /// Maps the lane to a cache index based on \p VF.
- unsigned mapToCacheIndex(const ElementCount &VF) const {
- switch (LaneKind) {
- case VPLane::Kind::ScalableLast:
- assert(VF.isScalable() && Lane < VF.getKnownMinValue());
- return VF.getKnownMinValue() + Lane;
- default:
- assert(Lane < VF.getKnownMinValue());
- return Lane;
- }
- }
-
- /// Returns the maxmimum number of lanes that we are able to consider
- /// caching for \p VF.
- static unsigned getNumCachedLanes(const ElementCount &VF) {
- return VF.getKnownMinValue() * (VF.isScalable() ? 2 : 1);
- }
-};
-
-/// VPTransformState holds information passed down when "executing" a VPlan,
-/// needed for generating the output IR.
-struct VPTransformState {
- VPTransformState(const TargetTransformInfo *TTI, ElementCount VF, unsigned UF,
- LoopInfo *LI, DominatorTree *DT, IRBuilderBase &Builder,
- InnerLoopVectorizer *ILV, VPlan *Plan,
- Loop *CurrentParentLoop, Type *CanonicalIVTy);
- /// Target Transform Info.
- const TargetTransformInfo *TTI;
-
- /// The chosen Vectorization Factor of the loop being vectorized.
- ElementCount VF;
-
- /// Hold the index to generate specific scalar instructions. Null indicates
- /// that all instances are to be generated, using either scalar or vector
- /// instructions.
- std::optional<VPLane> Lane;
-
- struct DataState {
- // Each value from the original loop, when vectorized, is represented by a
- // vector value in the map.
- DenseMap<VPValue *, Value *> VPV2Vector;
-
- DenseMap<VPValue *, SmallVector<Value *, 4>> VPV2Scalars;
- } Data;
-
- /// Get the generated vector Value for a given VPValue \p Def if \p IsScalar
- /// is false, otherwise return the generated scalar. \See set.
- Value *get(VPValue *Def, bool IsScalar = false);
-
- /// Get the generated Value for a given VPValue and given Part and Lane.
- Value *get(VPValue *Def, const VPLane &Lane);
-
- bool hasVectorValue(VPValue *Def) { return Data.VPV2Vector.contains(Def); }
-
- bool hasScalarValue(VPValue *Def, VPLane Lane) {
- auto I = Data.VPV2Scalars.find(Def);
- if (I == Data.VPV2Scalars.end())
- return false;
- unsigned CacheIdx = Lane.mapToCacheIndex(VF);
- return CacheIdx < I->second.size() && I->second[CacheIdx];
- }
-
- /// Set the generated vector Value for a given VPValue, if \p
- /// IsScalar is false. If \p IsScalar is true, set the scalar in lane 0.
- void set(VPValue *Def, Value *V, bool IsScalar = false) {
- if (IsScalar) {
- set(Def, V, VPLane(0));
- return;
- }
- assert((VF.isScalar() || V->getType()->isVectorTy()) &&
- "scalar values must be stored as (0, 0)");
- Data.VPV2Vector[Def] = V;
- }
-
- /// Reset an existing vector value for \p Def and a given \p Part.
- void reset(VPValue *Def, Value *V) {
- assert(Data.VPV2Vector.contains(Def) && "need to overwrite existing value");
- Data.VPV2Vector[Def] = V;
- }
-
- /// Set the generated scalar \p V for \p Def and the given \p Lane.
- void set(VPValue *Def, Value *V, const VPLane &Lane) {
- auto &Scalars = Data.VPV2Scalars[Def];
- unsigned CacheIdx = Lane.mapToCacheIndex(VF);
- if (Scalars.size() <= CacheIdx)
- Scalars.resize(CacheIdx + 1);
- assert(!Scalars[CacheIdx] && "should overwrite existing value");
- Scalars[CacheIdx] = V;
- }
-
- /// Reset an existing scalar value for \p Def and a given \p Lane.
- void reset(VPValue *Def, Value *V, const VPLane &Lane) {
- auto Iter = Data.VPV2Scalars.find(Def);
- assert(Iter != Data.VPV2Scalars.end() &&
- "need to overwrite existing value");
- unsigned CacheIdx = Lane.mapToCacheIndex(VF);
- assert(CacheIdx < Iter->second.size() &&
- "need to overwrite existing value");
- Iter->second[CacheIdx] = V;
- }
-
- /// Add additional metadata to \p To that was not present on \p Orig.
- ///
- /// Currently this is used to add the noalias annotations based on the
- /// inserted memchecks. Use this for instructions that are *cloned* into the
- /// vector loop.
- void addNewMetadata(Instruction *To, const Instruction *Orig);
-
- /// Add metadata from one instruction to another.
- ///
- /// This includes both the original MDs from \p From and additional ones (\see
- /// addNewMetadata). Use this for *newly created* instructions in the vector
- /// loop.
- void addMetadata(Value *To, Instruction *From);
-
- /// Set the debug location in the builder using the debug location \p DL.
- void setDebugLocFrom(DebugLoc DL);
-
- /// Construct the vector value of a scalarized value \p V one lane at a time.
- void packScalarIntoVectorValue(VPValue *Def, const VPLane &Lane);
-
- /// Hold state information used when constructing the CFG of the output IR,
- /// traversing the VPBasicBlocks and generating corresponding IR BasicBlocks.
- struct CFGState {
- /// The previous VPBasicBlock visited. Initially set to null.
- VPBasicBlock *PrevVPBB = nullptr;
-
- /// The previous IR BasicBlock created or used. Initially set to the new
- /// header BasicBlock.
- BasicBlock *PrevBB = nullptr;
-
- /// The last IR BasicBlock in the output IR. Set to the exit block of the
- /// vector loop.
- BasicBlock *ExitBB = nullptr;
-
- /// A mapping of each VPBasicBlock to the corresponding BasicBlock. In case
- /// of replication, maps the BasicBlock of the last replica created.
- SmallDenseMap<VPBasicBlock *, BasicBlock *> VPBB2IRBB;
-
- /// Updater for the DominatorTree.
- DomTreeUpdater DTU;
-
- CFGState(DominatorTree *DT)
- : DTU(DT, DomTreeUpdater::UpdateStrategy::Lazy) {}
-
- /// Returns the BasicBlock* mapped to the pre-header of the loop region
- /// containing \p R.
- BasicBlock *getPreheaderBBFor(VPRecipeBase *R);
- } CFG;
-
- /// Hold a pointer to LoopInfo to register new basic blocks in the loop.
- LoopInfo *LI;
-
- /// Hold a reference to the IRBuilder used to generate output IR code.
- IRBuilderBase &Builder;
-
- /// Hold a pointer to InnerLoopVectorizer to reuse its IR generation methods.
- InnerLoopVectorizer *ILV;
-
- /// Pointer to the VPlan code is generated for.
- VPlan *Plan;
-
- /// The parent loop object for the current scope, or nullptr.
- Loop *CurrentParentLoop = nullptr;
-
- /// LoopVersioning. It's only set up (non-null) if memchecks were
- /// used.
- ///
- /// This is currently only used to add no-alias metadata based on the
- /// memchecks. The actually versioning is performed manually.
- LoopVersioning *LVer = nullptr;
-
- /// Map SCEVs to their expanded values. Populated when executing
- /// VPExpandSCEVRecipes.
- DenseMap<const SCEV *, Value *> ExpandedSCEVs;
-
- /// VPlan-based type analysis.
- VPTypeAnalysis TypeAnalysis;
-};
-
/// VPBlockBase is the building block of the Hierarchical Control-Flow Graph.
/// A VPBlockBase can be either a VPBasicBlock or a VPRegionBlock.
class VPBlockBase {
@@ -659,10 +339,7 @@ class VPBlockBase {
VPSlotTracker &SlotTracker) const = 0;
/// Print plain-text dump of this VPlan to \p O.
- void print(raw_ostream &O) const {
- VPSlotTracker SlotTracker(getPlan());
- print(O, "", SlotTracker);
- }
+ void print(raw_ostream &O) const;
/// Print the successors of...
[truncated]
|
/// Returns a compile-time known value for the lane index and asserts if the | ||
/// lane can only be calculated at runtime. | ||
unsigned getKnownLane() const { | ||
assert(LaneKind == Kind::First); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertion message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks
unsigned Lane; | ||
|
||
/// Indicates how the Lane should be interpreted, as described above. | ||
Kind LaneKind; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add default initializers and skip default initialization in constructors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks
unsigned mapToCacheIndex(const ElementCount &VF) const { | ||
switch (LaneKind) { | ||
case VPLane::Kind::ScalableLast: | ||
assert(VF.isScalable() && Lane < VF.getKnownMinValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertion message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added thanks
assert(VF.isScalable() && Lane < VF.getKnownMinValue()); | ||
return VF.getKnownMinValue() + Lane; | ||
default: | ||
assert(Lane < VF.getKnownMinValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertion message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added thanks
/// Returns the maxmimum number of lanes that we are able to consider | ||
/// caching for \p VF. | ||
static unsigned getNumCachedLanes(const ElementCount &VF) { | ||
return VF.getKnownMinValue() * (VF.isScalable() ? 2 : 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we replace all these magic constants with the named ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function isn't used, removed in 7a831eb
LoopInfo *LI; | ||
|
||
/// Hold a reference to the IRBuilder used to generate output IR code. | ||
IRBuilderBase &Builder; | ||
|
||
/// Hold a pointer to InnerLoopVectorizer to reuse its IR generation methods. | ||
InnerLoopVectorizer *ILV; | ||
|
||
/// Pointer to the VPlan code is generated for. | ||
VPlan *Plan; | ||
|
||
/// The parent loop object for the current scope, or nullptr. | ||
Loop *CurrentParentLoop = nullptr; | ||
|
||
/// LoopVersioning. It's only set up (non-null) if memchecks were | ||
/// used. | ||
/// | ||
/// This is currently only used to add no-alias metadata based on the | ||
/// memchecks. The actually versioning is performed manually. | ||
LoopVersioning *LVer = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use references instead of pointers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment it is only set if memory checks are needed hence the pointer. Others like VPlan
ILV
& co could be references, but I'd prefer to update that separately and keep this PR mostly a plain move
for (VPRecipeBase &VPI : *VPBB) { | ||
if (isa<VPWidenPHIRecipe>(&VPI)) | ||
continue; | ||
assert(isa<VPInstruction>(&VPI) && "Can only handle VPInstructions"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this assertion? cast<>
will assert itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, removed, thanks!
} else if (VPRegionBlock *Region = dyn_cast<VPRegionBlock>(Block)) | ||
visitRegion(Region, Old2New, IAI); | ||
else | ||
llvm_unreachable("Unsupported kind of VPBlock."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add braces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks
d57ee93
to
fda19f9
Compare
Given the size of the changes in this patch I wonder if it's worth considering breaking down into a fewer smaller patches, if possible? I'm just thinking potential disruption to other developers. If all the changes in this patch depend upon each other that's fine though. |
I originally thought it might be less disruptive to move things together to their new locations, instead of moving them piece by piece, potentially causing conflicts on each commit. There are some bits that could be moved independently (e.g. VPInterleavedAccessInfo/VPlanSLP), if that would be preferred. I think the definitions moved in this patch hopefully shouldn't cause too much pain. There are other moves that might be more painful, like moving most recipe definitions, which I plan to propose separately, to further reduce the main header size and allow for better further recipe additions. |
Nothing in VPlan.h directly depends on VPTransformState, VPCostContext, VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate header to reduce the size of widely used VPlan.h. This is a first step towards more cleanly separating declarations in VPlan. Besides reducing VPlan.h's size, this also allows including additional VPlan-related headers in VPlanHelpers.h for use there. An example is using VPDominatorTree in VPTransformState (llvm#117138).
f39424d
to
4ef2284
Compare
ping :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Kind LaneKind = Kind::First; | ||
|
||
public: | ||
VPLane(unsigned Lane) : Lane(Lane) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a shame this isn't a simple copy and paste of the code in VPlan.h. It's ok - no need to revert or anything. But for large refactoring patches like this I personally feel it's easier for the reviewer (and leads to faster reviews!) if it's done in phases, i.e.
- Do simple copy and paste, followed by
- NFC tidy-up patches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep agreed. The current version pulls in a few edits as suggested by @alexey-bataev plus some very small edits that were required for the move, but the former would probably have been better to do separately.
…(#124104) Nothing in VPlan.h directly depends on VPTransformState, VPCostContext, VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate header to reduce the size of widely used VPlan.h. This is a first step towards more cleanly separating declarations in VPlan. Besides reducing VPlan.h's size, this also allows including additional VPlan-related headers in VPlanHelpers.h for use there. An example is using VPDominatorTree in VPTransformState (llvm/llvm-project#117138). PR: llvm/llvm-project#124104
Nothing in VPlan.h directly depends on VPTransformState, VPCostContext, VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate header to reduce the size of widely used VPlan.h.
This is a first step towards more cleanly separating declarations in VPlan.
Besides reducing VPlan.h's size, this also allows including additional VPlan-related headers in VPlanHelpers.h for use there. An example is using VPDominatorTree in VPTransformState
(#117138).