-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[VPlan] Build initial VPlan 0 using HCFGBuilder for inner loops. (NFC) #124432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
85721ea
b8ef9b4
d2bdde2
953eeff
6a872d7
efc4b83
7a9e7ab
f8fa5c0
83e933a
73e2beb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -9298,6 +9298,7 @@ static void addExitUsersForFirstOrderRecurrences( | |||||||||||
VPlanPtr | ||||||||||||
LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | ||||||||||||
|
||||||||||||
using namespace llvm::VPlanPatternMatch; | ||||||||||||
SmallPtrSet<const InterleaveGroup<Instruction> *, 1> InterleaveGroups; | ||||||||||||
|
||||||||||||
// --------------------------------------------------------------------------- | ||||||||||||
|
@@ -9321,6 +9322,10 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | |||||||||||
PSE, RequiresScalarEpilogueCheck, | ||||||||||||
CM.foldTailByMasking(), OrigLoop); | ||||||||||||
|
||||||||||||
// Build hierarchical CFG. | ||||||||||||
VPlanHCFGBuilder HCFGBuilder(OrigLoop, LI, *Plan); | ||||||||||||
HCFGBuilder.buildHierarchicalCFG(); | ||||||||||||
|
||||||||||||
// Don't use getDecisionAndClampRange here, because we don't know the UF | ||||||||||||
// so this function is better to be conservative, rather than to split | ||||||||||||
// it up into different VPlans. | ||||||||||||
|
@@ -9371,12 +9376,8 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | |||||||||||
// Construct recipes for the instructions in the loop | ||||||||||||
// --------------------------------------------------------------------------- | ||||||||||||
|
||||||||||||
// Scan the body of the loop in a topological order to visit each basic block | ||||||||||||
// after having visited its predecessor basic blocks. | ||||||||||||
LoopBlocksDFS DFS(OrigLoop); | ||||||||||||
DFS.perform(LI); | ||||||||||||
|
||||||||||||
VPBasicBlock *HeaderVPBB = Plan->getVectorLoopRegion()->getEntryBasicBlock(); | ||||||||||||
VPRegionBlock *LoopRegion = Plan->getVectorLoopRegion(); | ||||||||||||
VPBasicBlock *HeaderVPBB = LoopRegion->getEntryBasicBlock(); | ||||||||||||
VPBasicBlock *VPBB = HeaderVPBB; | ||||||||||||
BasicBlock *HeaderBB = OrigLoop->getHeader(); | ||||||||||||
bool NeedsMasks = | ||||||||||||
|
@@ -9389,26 +9390,70 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | |||||||||||
RecipeBuilder.collectScaledReductions(Range); | ||||||||||||
|
||||||||||||
auto *MiddleVPBB = Plan->getMiddleBlock(); | ||||||||||||
|
||||||||||||
// Scan the body of the loop in a topological order to visit each basic block | ||||||||||||
// after having visited its predecessor basic blocks. | ||||||||||||
ReversePostOrderTraversal<VPBlockShallowTraversalWrapper<VPBlockBase *>> RPOT( | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Move the above comment over here:
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done thanks |
||||||||||||
HeaderVPBB); | ||||||||||||
|
||||||||||||
VPBasicBlock::iterator MBIP = MiddleVPBB->getFirstNonPhi(); | ||||||||||||
for (BasicBlock *BB : make_range(DFS.beginRPO(), DFS.endRPO())) { | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also remove
above which become dead. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done thanks |
||||||||||||
// Relevant instructions from basic block BB will be grouped into VPRecipe | ||||||||||||
// ingredients and fill a new VPBasicBlock. | ||||||||||||
if (VPBB != HeaderVPBB) | ||||||||||||
VPBB->setName(BB->getName()); | ||||||||||||
Builder.setInsertPoint(VPBB); | ||||||||||||
VPBlockBase *PrevVPBB = nullptr; | ||||||||||||
for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(RPOT)) { | ||||||||||||
// Handle VPBBs down to the latch. | ||||||||||||
if (VPBB == LoopRegion->getExiting()) { | ||||||||||||
assert(!HCFGBuilder.getIRBBForVPB(VPBB) && | ||||||||||||
"the latch block shouldn't have a corresponding IRBB"); | ||||||||||||
Comment on lines
+9404
to
+9405
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying. Should this discrepancy be fixed, perhaps as part of integrating HCFGBuilder with skeleton creation, to form "buildLoop" VPlan whose loop VPBB's are in full 1-1 correspondence with the original IRBB's of the loop? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good, currently they are quite intertwined already, but once we use it in the inner loop path I'll continue to further refactor/improve on the current structure |
||||||||||||
VPBlockUtils::connectBlocks(PrevVPBB, VPBB); | ||||||||||||
break; | ||||||||||||
} | ||||||||||||
|
||||||||||||
if (VPBB == HeaderVPBB) | ||||||||||||
// Create mask based on the IR BB corresponding to VPBB. | ||||||||||||
// TODO: Predicate directly based on VPlan. | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we set the insert point of Builder once, here, as done now, possibly to first non-Phi of VPBB? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfortunately that doesn't play nice with remove the recipes. |
||||||||||||
Builder.setInsertPoint(VPBB, VPBB->begin()); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: setting insert point here to first non phi of VPBB instead of beginning of VPBB doesn't play well with recipe removal?
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We need to insert at the begining here, as the blocks other than the header may contain VPWidenPhis that need to be converted to VPBlend, and the mask needs to be inserted at the beginning of the block for those blends. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So non-phi recipes that compute a block's mask must be introduced before the phi recipes of the block, because the latter is eventually replaced by blends that use the former? This calls for a FIXME note. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added a FIXME for now. We should be able to sink the blends at the right place. The whole mask creation logic will be replaced when rewriting to perform predication on VPlan, cleaning up the logic here. |
||||||||||||
if (VPBB == HeaderVPBB) { | ||||||||||||
Builder.setInsertPoint(VPBB, VPBB->getFirstNonPhi()); | ||||||||||||
RecipeBuilder.createHeaderMask(); | ||||||||||||
else if (NeedsMasks) | ||||||||||||
RecipeBuilder.createBlockInMask(BB); | ||||||||||||
} else if (NeedsMasks) { | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should the header mask also be created only if NeedsMasks? (Seems independent) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Currently |
||||||||||||
// FIXME: At the moment, masks need to be placed at the beginning of the | ||||||||||||
// block, as blends introduced for phi nodes need to use it. The created | ||||||||||||
// blends should be sunk after the mask recipes. | ||||||||||||
RecipeBuilder.createBlockInMask(HCFGBuilder.getIRBBForVPB(VPBB)); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Time to move edge masks and block masks to be cached according to VPBB's rather than IRBB's? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it is somewhat a chicken-and-egg problem. Without this patch, we cannot do that, because we only ever create the flattened CFG in VPlan. I might be missing another option, but I think once this patch lands as first step we can move the mask creation to be based on the VPBlocks. WDYT? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I gave this a try, but it doesn't work out with the flattening of the CFG here, because we might already removed edges earlier before querying a mask on that edge. We could first compute all masks in a separate loop, but there also are quite a few places that need updating, include There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, another suggestion: getIRBBForVPB() is needed only here - for non-header (non-latch) (non-empty?) VPBB, whose recipes are all expected to have underlying Instructions, which know their parental BasicBlock? Can the latter be retrieved as in below rather than hacking HCFGBuilder to record and expose this VRBB-to-IRBB mapping? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfrotunately we don't create recipes for branches, so there may be empty blocks for which we would need to create masks (because there succesors may require them at the moment). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, let's go with HCFGBuilder.getIRBBForVPB(), for now ...
(Recipes are created for internal conditional branches but not for unconditional branches (including early exits) - which imply a single successor. An empty block also implies a single predecessor due to lack of phi recipes. Figuring out the BasicBlock from the single predecessor and/or successor of an empty VPBB seems cumbersome; getIRBBForVPB() could record the mapping for empty VPBB's only; simplest to probably to record all VPBB's in loop, for now.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added a FXIME to the declaration, thanks There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
? |
||||||||||||
} | ||||||||||||
|
||||||||||||
// Convert input VPInstructions to widened recipes. | ||||||||||||
for (VPRecipeBase &R : make_early_inc_range(*VPBB)) { | ||||||||||||
auto *SingleDef = cast<VPSingleDefRecipe>(&R); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done thanks |
||||||||||||
auto *UnderlyingValue = SingleDef->getUnderlyingValue(); | ||||||||||||
// Skip recipes that do not need transforming, including canonical IV, | ||||||||||||
// wide canonical IV and VPInstructions without underlying values. The | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please add a FIXME to stop relying on the underlying values of VPInstructions. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, thanks! |
||||||||||||
// latter are added above for masking. | ||||||||||||
// FIXME: Migrate code relying on the underlying instruction from VPlan0 | ||||||||||||
// to construct recipes below to not use the underlying instruction. | ||||||||||||
if (isa<VPCanonicalIVPHIRecipe, VPWidenCanonicalIVRecipe>(&R) || | ||||||||||||
(isa<VPInstruction>(&R) && !UnderlyingValue)) | ||||||||||||
continue; | ||||||||||||
|
||||||||||||
// Introduce each ingredient into VPlan. | ||||||||||||
// TODO: Model and preserve debug intrinsics in VPlan. | ||||||||||||
for (Instruction &I : drop_end(BB->instructionsWithoutDebug(false))) { | ||||||||||||
Instruction *Instr = &I; | ||||||||||||
// FIXME: VPlan0, which models a copy of the original scalar loop, should | ||||||||||||
// not use VPWidenPHIRecipe to model the phis. | ||||||||||||
assert((isa<VPWidenPHIRecipe>(&R) || isa<VPInstruction>(&R)) && | ||||||||||||
UnderlyingValue && "unsupported recipe"); | ||||||||||||
|
||||||||||||
if (isa<VPInstruction>(&R) && | ||||||||||||
(cast<VPInstruction>(&R)->getOpcode() == | ||||||||||||
VPInstruction::BranchOnCond || | ||||||||||||
(cast<VPInstruction>(&R)->getOpcode() == Instruction::Switch))) { | ||||||||||||
R.eraseFromParent(); | ||||||||||||
break; | ||||||||||||
} | ||||||||||||
|
||||||||||||
// TODO: Gradually replace uses of underlying instruction by analyses on | ||||||||||||
// VPlan. | ||||||||||||
Instruction *Instr = cast<Instruction>(UnderlyingValue); | ||||||||||||
Builder.setInsertPoint(SingleDef); | ||||||||||||
SmallVector<VPValue *, 4> Operands; | ||||||||||||
auto *Phi = dyn_cast<PHINode>(Instr); | ||||||||||||
if (Phi && Phi->getParent() == HeaderBB) { | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Worth commenting that the other operand of the header phi - the one across the back-edge, will be added later? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, thanks |
||||||||||||
// The backedge value will be added in fixHeaderPhis later. | ||||||||||||
Operands.push_back(Plan->getOrAddLiveIn( | ||||||||||||
Phi->getIncomingValueForBlock(OrigLoop->getLoopPreheader()))); | ||||||||||||
} else { | ||||||||||||
|
@@ -9420,15 +9465,16 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | |||||||||||
// in the exit block, a uniform store recipe will be created for the final | ||||||||||||
// invariant store of the reduction. | ||||||||||||
StoreInst *SI; | ||||||||||||
if ((SI = dyn_cast<StoreInst>(&I)) && | ||||||||||||
if ((SI = dyn_cast<StoreInst>(Instr)) && | ||||||||||||
Legal->isInvariantAddressOfReduction(SI->getPointerOperand())) { | ||||||||||||
// Only create recipe for the final invariant store of the reduction. | ||||||||||||
if (!Legal->isInvariantStoreOfReduction(SI)) | ||||||||||||
continue; | ||||||||||||
auto *Recipe = new VPReplicateRecipe( | ||||||||||||
SI, make_range(Operands.begin(), Operands.end()), | ||||||||||||
true /* IsUniform */); | ||||||||||||
Recipe->insertBefore(*MiddleVPBB, MBIP); | ||||||||||||
if (Legal->isInvariantStoreOfReduction(SI)) { | ||||||||||||
auto *Recipe = new VPReplicateRecipe( | ||||||||||||
SI, make_range(Operands.begin(), Operands.end()), | ||||||||||||
true /* IsUniform */); | ||||||||||||
Recipe->insertBefore(*MiddleVPBB, MBIP); | ||||||||||||
} | ||||||||||||
R.eraseFromParent(); | ||||||||||||
Comment on lines
+9476
to
+9477
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
otherwise R seems to be dropped altogether? Is this case tested? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There may be multiple stores to the invariant address of a reduction ( There are tests covering the cases with single and multiple stores |
||||||||||||
continue; | ||||||||||||
} | ||||||||||||
|
||||||||||||
|
@@ -9438,25 +9484,29 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | |||||||||||
Recipe = RecipeBuilder.handleReplication(Instr, Operands, Range); | ||||||||||||
|
||||||||||||
RecipeBuilder.setRecipe(Instr, Recipe); | ||||||||||||
if (isa<VPHeaderPHIRecipe>(Recipe)) { | ||||||||||||
// VPHeaderPHIRecipes must be kept in the phi section of HeaderVPBB. In | ||||||||||||
// the following cases, VPHeaderPHIRecipes may be created after non-phi | ||||||||||||
// recipes and need to be moved to the phi section of HeaderVPBB: | ||||||||||||
// * tail-folding (non-phi recipes computing the header mask are | ||||||||||||
// introduced earlier than regular header phi recipes, and should appear | ||||||||||||
// after them) | ||||||||||||
// * Optimizing truncates to VPWidenIntOrFpInductionRecipe. | ||||||||||||
|
||||||||||||
assert((HeaderVPBB->getFirstNonPhi() == VPBB->end() || | ||||||||||||
CM.foldTailByMasking() || isa<TruncInst>(Instr)) && | ||||||||||||
"unexpected recipe needs moving"); | ||||||||||||
Comment on lines
-9450
to
-9452
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This assert is associated with the preceding explanation. Both become obsolete? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated to only move truncated inductions, for which it still is needed |
||||||||||||
if (isa<VPWidenIntOrFpInductionRecipe>(Recipe) && isa<TruncInst>(Instr)) { | ||||||||||||
// Optimized a truncate to VPWidenIntOrFpInductionRecipe. It needs to be | ||||||||||||
// moved to the phi section in the header. | ||||||||||||
Recipe->insertBefore(*HeaderVPBB, HeaderVPBB->getFirstNonPhi()); | ||||||||||||
} else | ||||||||||||
VPBB->appendRecipe(Recipe); | ||||||||||||
} | ||||||||||||
|
||||||||||||
VPBlockUtils::insertBlockAfter(Plan->createVPBasicBlock(""), VPBB); | ||||||||||||
VPBB = cast<VPBasicBlock>(VPBB->getSingleSuccessor()); | ||||||||||||
} else { | ||||||||||||
Builder.insert(Recipe); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit:
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done thanks |
||||||||||||
} | ||||||||||||
if (Recipe->getNumDefinedValues() == 1) | ||||||||||||
SingleDef->replaceAllUsesWith(Recipe->getVPSingleValue()); | ||||||||||||
else | ||||||||||||
assert(Recipe->getNumDefinedValues() == 0 && | ||||||||||||
"Unexpected multidef recipe"); | ||||||||||||
R.eraseFromParent(); | ||||||||||||
} | ||||||||||||
|
||||||||||||
// Flatten the CFG in the loop. Masks for blocks have already been generated | ||||||||||||
// and added to recipes as needed. To do so, first disconnect VPBB from its | ||||||||||||
// successors. Then connect VPBB to the previously visited VPBB. | ||||||||||||
for (auto *Succ : to_vector(VPBB->getSuccessors())) | ||||||||||||
VPBlockUtils::disconnectBlocks(VPBB, Succ); | ||||||||||||
if (PrevVPBB) | ||||||||||||
VPBlockUtils::connectBlocks(PrevVPBB, VPBB); | ||||||||||||
PrevVPBB = VPBB; | ||||||||||||
} | ||||||||||||
|
||||||||||||
// After here, VPBB should not be used. | ||||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -75,7 +75,7 @@ class PlainCFGBuilder { | |
: TheLoop(Lp), LI(LI), Plan(P) {} | ||
|
||
/// Build plain CFG for TheLoop and connects it to Plan's entry. | ||
void buildPlainCFG(); | ||
void buildPlainCFG(DenseMap<VPBlockBase *, BasicBlock *> &VPB2IRBB); | ||
}; | ||
} // anonymous namespace | ||
|
||
|
@@ -237,10 +237,10 @@ bool PlainCFGBuilder::isExternalDef(Value *Val) { | |
// Instruction definition is in outermost loop PH. | ||
return false; | ||
|
||
// Check whether Instruction definition is in the loop exit. | ||
BasicBlock *Exit = TheLoop->getUniqueExitBlock(); | ||
assert(Exit && "Expected loop with single exit."); | ||
if (InstParent == Exit) { | ||
// Check whether Instruction definition is in a loop exit. | ||
SmallVector<BasicBlock *> ExitBlocks; | ||
TheLoop->getExitBlocks(ExitBlocks); | ||
if (is_contained(ExitBlocks, InstParent)) { | ||
// Instruction definition is in outermost loop exit. | ||
return false; | ||
} | ||
Comment on lines
+241
to
246
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No longer expecting a loop with single exit? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is needed for the inner loop multi-exit support. The native path still won't vectorize early exits, due to checks in legality I think. I can add a test case, thanks |
||
|
@@ -283,6 +283,7 @@ VPValue *PlainCFGBuilder::getOrCreateVPOperand(Value *IRVal) { | |
void PlainCFGBuilder::createVPInstructionsForVPBB(VPBasicBlock *VPBB, | ||
BasicBlock *BB) { | ||
VPIRBuilder.setInsertPoint(VPBB); | ||
// TODO: Model and preserve debug intrinsics in VPlan. | ||
for (Instruction &InstRef : BB->instructionsWithoutDebug(false)) { | ||
Instruction *Inst = &InstRef; | ||
|
||
|
@@ -308,6 +309,14 @@ void PlainCFGBuilder::createVPInstructionsForVPBB(VPBasicBlock *VPBB, | |
continue; | ||
} | ||
|
||
if (auto *SI = dyn_cast<SwitchInst>(Inst)) { | ||
SmallVector<VPValue *> Ops = {getOrCreateVPOperand(SI->getCondition())}; | ||
for (auto Case : SI->cases()) | ||
Ops.push_back(getOrCreateVPOperand(Case.getCaseValue())); | ||
VPIRBuilder.createNaryOp(Instruction::Switch, Ops, Inst); | ||
continue; | ||
} | ||
|
||
Comment on lines
+312
to
+319
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Teaching PlainCFGBuilder about switch statements - does this imply that "native" can now vectorize outerloops with switches, and worth testing? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The native path still won't vectorize switches, due to checks in legality I think. I can add a test case, thanks |
||
VPValue *NewVPV; | ||
if (auto *Phi = dyn_cast<PHINode>(Inst)) { | ||
// Phi node's operands may have not been visited at this point. We create | ||
|
@@ -334,7 +343,8 @@ void PlainCFGBuilder::createVPInstructionsForVPBB(VPBasicBlock *VPBB, | |
} | ||
|
||
// Main interface to build the plain CFG. | ||
void PlainCFGBuilder::buildPlainCFG() { | ||
void PlainCFGBuilder::buildPlainCFG( | ||
DenseMap<VPBlockBase *, BasicBlock *> &VPB2IRBB) { | ||
// 0. Reuse the top-level region, vector-preheader and exit VPBBs from the | ||
// skeleton. These were created directly rather than via getOrCreateVPBB(), | ||
// revisit them now to update BB2VPBB. Note that header/entry and | ||
|
@@ -423,6 +433,14 @@ void PlainCFGBuilder::buildPlainCFG() { | |
// Set VPBB successors. We create empty VPBBs for successors if they don't | ||
// exist already. Recipes will be created when the successor is visited | ||
// during the RPO traversal. | ||
if (auto *SI = dyn_cast<SwitchInst>(BB->getTerminator())) { | ||
SmallVector<VPBlockBase *> Succs = { | ||
getOrCreateVPBB(SI->getDefaultDest())}; | ||
for (auto Case : SI->cases()) | ||
Succs.push_back(getOrCreateVPBB(Case.getCaseSuccessor())); | ||
VPBB->setSuccessors(Succs); | ||
continue; | ||
} | ||
auto *BI = cast<BranchInst>(BB->getTerminator()); | ||
unsigned NumSuccs = succ_size(BB); | ||
if (NumSuccs == 1) { | ||
|
@@ -476,11 +494,14 @@ void PlainCFGBuilder::buildPlainCFG() { | |
// have a VPlan couterpart. Fix VPlan phi nodes by adding their corresponding | ||
// VPlan operands. | ||
fixPhiNodes(); | ||
|
||
for (const auto &[IRBB, VPB] : BB2VPBB) | ||
VPB2IRBB[VPB] = IRBB; | ||
} | ||
|
||
void VPlanHCFGBuilder::buildPlainCFG() { | ||
PlainCFGBuilder PCFGBuilder(TheLoop, LI, Plan); | ||
PCFGBuilder.buildPlainCFG(); | ||
PCFGBuilder.buildPlainCFG(VPB2IRBB); | ||
} | ||
|
||
// Public interface to build a H-CFG. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -46,7 +46,7 @@ define void @vector_reverse_i64(ptr nocapture noundef writeonly %A, ptr nocaptur | |
; CHECK-NEXT: LV: Found an estimated cost of 0 for VF vscale x 4 For instruction: br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit, !llvm.loop !0 | ||
; CHECK-NEXT: LV: Using user VF vscale x 4. | ||
; CHECK-NEXT: LV: Loop does not require scalar epilogue | ||
; CHECK-NEXT: LV: Scalarizing: %i.0 = add nsw i32 %i.0.in8, -1 | ||
; CHECK: LV: Scalarizing: %i.0 = add nsw i32 %i.0.in8, -1 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why/Is this line no longer NEXT? Same below. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We now also include printing the VPlan after HCFG construction, should add a test to check that or drop the extra output? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok, whatever you prefer. |
||
; CHECK-NEXT: LV: Scalarizing: %idxprom = zext i32 %i.0 to i64 | ||
; CHECK-NEXT: LV: Scalarizing: %arrayidx = getelementptr inbounds i32, ptr %B, i64 %idxprom | ||
; CHECK-NEXT: LV: Scalarizing: %arrayidx3 = getelementptr inbounds i32, ptr %A, i64 %idxprom | ||
|
@@ -295,7 +295,7 @@ define void @vector_reverse_f32(ptr nocapture noundef writeonly %A, ptr nocaptur | |
; CHECK-NEXT: LV: Found an estimated cost of 0 for VF vscale x 4 For instruction: br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit, !llvm.loop !0 | ||
; CHECK-NEXT: LV: Using user VF vscale x 4. | ||
; CHECK-NEXT: LV: Loop does not require scalar epilogue | ||
; CHECK-NEXT: LV: Scalarizing: %i.0 = add nsw i32 %i.0.in8, -1 | ||
; CHECK: LV: Scalarizing: %i.0 = add nsw i32 %i.0.in8, -1 | ||
; CHECK-NEXT: LV: Scalarizing: %idxprom = zext i32 %i.0 to i64 | ||
; CHECK-NEXT: LV: Scalarizing: %arrayidx = getelementptr inbounds float, ptr %B, i64 %idxprom | ||
; CHECK-NEXT: LV: Scalarizing: %arrayidx3 = getelementptr inbounds float, ptr %A, i64 %idxprom | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps
VPlanHCFGBuilder::buildingHierarchicalCFG()
which creates initial VPBB's inside loop region ( viaPlainCFGBuilder::buildPlainCFG()
) should now be combined withVPlan::createInitialVPlan()
which creates initial VPBB's elsewhere?Can be done as follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, maybe best to merge as follow-up