-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe #136173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
365d1e2
ed0bb64
6ec61b2
8290d2f
bb31e27
99bae91
c762c23
78a89df
91122e4
f47a24c
464ec72
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried downloading this patch and applying to the HEAD of LLVM and There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah perhaps this is my mistake. You did say it depends upon #113903. :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah that's the case :). Let me know if you have any issues applying it after applying 113903 too. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2581,9 +2581,15 @@ expandVPMulAccumulateReduction(VPMulAccumulateReductionRecipe *MulAcc) { | |
MulAcc->hasNoSignedWrap(), MulAcc->getDebugLoc()); | ||
Mul->insertBefore(MulAcc); | ||
|
||
auto *Red = new VPReductionRecipe( | ||
MulAcc->getRecurrenceKind(), FastMathFlags(), MulAcc->getChainOp(), Mul, | ||
MulAcc->getCondOp(), MulAcc->isOrdered(), MulAcc->getDebugLoc()); | ||
// Generate VPReductionRecipe. | ||
VPReductionRecipe *Red = nullptr; | ||
if (unsigned ScaleFactor = MulAcc->getVFScaleFactor(); ScaleFactor > 1) | ||
Red = new VPPartialReductionRecipe(Instruction::Add, MulAcc->getChainOp(), | ||
Mul, MulAcc->getCondOp(), ScaleFactor); | ||
else | ||
Red = new VPReductionRecipe(MulAcc->getRecurrenceKind(), FastMathFlags(), | ||
MulAcc->getChainOp(), Mul, MulAcc->getCondOp(), | ||
MulAcc->isOrdered(), MulAcc->getDebugLoc()); | ||
Red->insertBefore(MulAcc); | ||
|
||
MulAcc->replaceAllUsesWith(Red); | ||
|
@@ -2911,12 +2917,43 @@ static void tryToCreateAbstractReductionRecipe(VPReductionRecipe *Red, | |
Red->replaceAllUsesWith(AbstractR); | ||
} | ||
|
||
/// This function tries to create an abstract recipe from a partial reduction to | ||
/// hide its mul and extends from cost estimation. | ||
static void | ||
tryToCreateAbstractPartialReductionRecipe(VPPartialReductionRecipe *PRed) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this need to be given the (note that the cost part is still missing) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. At this point we've already created the partial reduction and clamped the range so I don't think we need to do any costing (like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The way I read the code is that at the point of getting to this point in the code, it has recognised a reduction so there is a In fact, why wouldn't the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The existing method would need some modifications, as a VPPartialReductionRecipe is used inside the loop instead of a widened add or similar. There's three parts to partial reductions in vplan; the VPPartialReductionRecipe itself, the matching PHI recipe with VF scaling, and an out-of-loop reduction. We currently automatically create both the PHI and the VPPartialReduction in tryToCreateWidenRecipe if the loop exit instruction is present in a map regardless of the VF, so we need to do the same here. However, I think that means we need to perhaps rethink how we create partial reductions. We could create a normal reduction in the original vplan, detect whether a transformation to a partial reduction is appropriate here, and if so clamp the range and perform the transformation to a VPMulAccumulateReductionRecipe here (and replace the PHI node with a new one having the correct scaling). Deciding whether to remove the early detection code will depend on runtime cost I guess, since if we're planning all sensible VFs based on MaxBW then we can just do the detection in a transform. Do we want to hold up the costing work behind a refactor? |
||
if (PRed->getOpcode() != Instruction::Add) | ||
return; | ||
|
||
using namespace llvm::VPlanPatternMatch; | ||
auto *BinOp = PRed->getBinOp(); | ||
if (!match(BinOp, | ||
m_Mul(m_ZExtOrSExt(m_VPValue()), m_ZExtOrSExt(m_VPValue())))) | ||
return; | ||
|
||
auto *BinOpR = cast<VPWidenRecipe>(BinOp->getDefiningRecipe()); | ||
VPWidenCastRecipe *Ext0R = dyn_cast<VPWidenCastRecipe>(BinOpR->getOperand(0)); | ||
VPWidenCastRecipe *Ext1R = dyn_cast<VPWidenCastRecipe>(BinOpR->getOperand(1)); | ||
|
||
// TODO: Make work with extends of different signedness | ||
if (Ext0R->hasMoreThanOneUniqueUser() || Ext1R->hasMoreThanOneUniqueUser() || | ||
Ext0R->getOpcode() != Ext1R->getOpcode()) | ||
return; | ||
|
||
auto *AbstractR = new VPMulAccumulateReductionRecipe( | ||
PRed, BinOpR, Ext0R, Ext1R, Ext0R->getResultType(), | ||
PRed->getVFScaleFactor()); | ||
AbstractR->insertBefore(PRed); | ||
PRed->replaceAllUsesWith(AbstractR); | ||
} | ||
|
||
void VPlanTransforms::convertToAbstractRecipes(VPlan &Plan, VPCostContext &Ctx, | ||
VFRange &Range) { | ||
for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>( | ||
vp_depth_first_deep(Plan.getVectorLoopRegion()))) { | ||
for (VPRecipeBase &R : *VPBB) { | ||
if (auto *Red = dyn_cast<VPReductionRecipe>(&R)) | ||
if (auto *PRed = dyn_cast<VPPartialReductionRecipe>(&R)) | ||
tryToCreateAbstractPartialReductionRecipe(PRed); | ||
else if (auto *Red = dyn_cast<VPReductionRecipe>(&R)) | ||
tryToCreateAbstractReductionRecipe(Red, Ctx, Range); | ||
} | ||
} | ||
|
Uh oh!
There was an error while loading. Please reload this page.