[LV]Set tailfolding styles before computing feasible max VF. #91403

Closed
53 changes: 42 additions & 11 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1540,6 +1540,14 @@ class LoopVectorizationCostModel {
}
}

/// Disables previously chosen tail folding policy, sets it to None. Expects,
/// that the tail policy was selected.
Collaborator

Independent of this patch: perhaps this should set a new policy/style, one that explicitly states there is no tail (NoTail) to fold or to leave unfolded, rather than "disabling" and having None mean both an unfolded tail and no tail.

Member Author

Maybe. This needs to be investigated separately.
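
The distinction raised in this thread can be sketched in isolation. The following is a minimal, hypothetical model and not the patch's code: `TailStyle`, `TailPolicySketch`, `markNoTail()` and the `NoTail` enumerator are names assumed for illustration, and the real `TailFoldingStyle` enum has more members than shown here.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical, simplified stand-in for LLVM's TailFoldingStyle. NoTail is the
// extra state floated in the review; it is not part of the patch.
enum class TailStyle { None, Data, NoTail };

class TailPolicySketch {
  std::optional<std::pair<TailStyle, TailStyle>> Chosen;

public:
  void select(TailStyle S) { Chosen = std::make_pair(S, S); }

  // Same shape as disableTailFolding() in the patch: reset the pair to None.
  void disable() {
    assert(Chosen && "Tail folding must be selected.");
    Chosen = std::make_pair(TailStyle::None, TailStyle::None);
  }

  // Hypothetical alternative: record that the loop has no tail at all.
  void markNoTail() {
    assert(Chosen && "Tail folding must be selected.");
    Chosen = std::make_pair(TailStyle::NoTail, TailStyle::NoTail);
  }

  // True only for styles that actually mask the loop body.
  bool foldsTail() const {
    return Chosen && Chosen->first != TailStyle::None &&
           Chosen->first != TailStyle::NoTail;
  }

  // True only when the absence of a tail was recorded explicitly.
  bool hasNoTail() const {
    return Chosen && Chosen->first == TailStyle::NoTail;
  }
};
```

With `disable()`, a later query cannot tell "tail left unfolded" from "no tail exists"; with the assumed `markNoTail()`, `hasNoTail()` can answer that question directly.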

void disableTailFolding() {
assert(ChosenTailFoldingStyle && "Tail folding must be selected.");
ChosenTailFoldingStyle =
std::make_pair(TailFoldingStyle::None, TailFoldingStyle::None);
}

/// Returns true if all loop blocks should be masked to fold tail loop.
bool foldTailByMasking() const {
// TODO: check if it is possible to check for None style independent of
@@ -1631,6 +1639,11 @@ class LoopVectorizationCostModel {
ElementCount MaxSafeVF,
bool FoldTailByMasking);

/// Checks if the scalable vectorization is supported and enabled. The result
/// is stored in \p IsScalableVectorizationAllowed and used later, if
/// requested.
Comment on lines +1642 to +1644
Collaborator

Suggested change
/// Checks if the scalable vectorization is supported and enabled. The result
/// is stored in \p IsScalableVectorizationAllowed and used later, if
/// requested.
/// Checks if scalable vectorization is supported and enabled. Caches the result to avoid repeated debug dumps for repeated queries.

bool isScalableVectorizationAllowed();
Collaborator

Not const because of debug dumps?

Member Author

No, it may set IsScalableVectorizationAllowed if it has not been set yet.
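
The caching pattern discussed here, and the reason the query cannot be `const`, can be shown with a small standalone sketch. It assumes a hypothetical class `ScalableQuerySketch` with two boolean inputs standing in for the target and hint checks; the evaluation counter exists only to make the caching observable and has no counterpart in the patch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the cost model's cached query.
class ScalableQuerySketch {
  bool TargetSupportsScalable;
  bool DisabledByHint;
  std::optional<bool> Cached; // empty until the first query
  unsigned Evaluations = 0;   // illustration only: counts full evaluations

public:
  ScalableQuerySketch(bool Supports, bool Disabled)
      : TargetSupportsScalable(Supports), DisabledByHint(Disabled) {}

  // Non-const because the first call fills the cache.
  bool isAllowed() {
    if (Cached)
      return *Cached;

    ++Evaluations;
    // Pessimistic default, so every early return below leaves 'false' cached.
    Cached = false;
    if (!TargetSupportsScalable)
      return false;
    if (DisabledByHint)
      return false;

    Cached = true;
    return true;
  }

  unsigned evaluations() const { return Evaluations; }
};
```

Repeated calls return the stored answer without re-running the checks, which in the real function also avoids emitting the same remarks and debug output more than once.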


/// \return the maximum legal scalable VF, based on the safe max number
/// of elements.
ElementCount getMaxLegalScalableVF(unsigned MaxSafeElements);
@@ -1695,6 +1708,9 @@ class LoopVectorizationCostModel {
std::optional<std::pair<TailFoldingStyle, TailFoldingStyle>>
ChosenTailFoldingStyle;

/// true if scalable vectorization is supported and enabled.
std::optional<bool> IsScalableVectorizationAllowed;

/// A map holding scalar costs for different vectorization factors. The
/// presence of a cost for an instruction in the mapping indicates that the
/// instruction will be scalarized when vectorizing with the associated
@@ -4189,15 +4205,18 @@ bool LoopVectorizationCostModel::runtimeChecksRequired() {
return false;
}

ElementCount
LoopVectorizationCostModel::getMaxLegalScalableVF(unsigned MaxSafeElements) {
bool LoopVectorizationCostModel::isScalableVectorizationAllowed() {
if (IsScalableVectorizationAllowed)
return *IsScalableVectorizationAllowed;

IsScalableVectorizationAllowed = false;
if (!TTI.supportsScalableVectors() && !ForceTargetSupportsScalableVectors)
return ElementCount::getScalable(0);
return false;

if (Hints->isScalableVectorizationDisabled()) {
reportVectorizationInfo("Scalable vectorization is explicitly disabled",
"ScalableVectorizationDisabled", ORE, TheLoop);
return ElementCount::getScalable(0);
return false;
}

LLVM_DEBUG(dbgs() << "LV: Scalable vectorization is available\n");
@@ -4217,7 +4236,7 @@ LoopVectorizationCostModel::getMaxLegalScalableVF(unsigned MaxSafeElements) {
"Scalable vectorization not supported for the reduction "
"operations found in this loop.",
"ScalableVFUnfeasible", ORE, TheLoop);
return ElementCount::getScalable(0);
return false;
}

// Disable scalable vectorization if the loop contains any instructions
@@ -4229,9 +4248,20 @@ LoopVectorizationCostModel::getMaxLegalScalableVF(unsigned MaxSafeElements) {
reportVectorizationInfo("Scalable vectorization is not supported "
"for all element types found in this loop.",
"ScalableVFUnfeasible", ORE, TheLoop);
return ElementCount::getScalable(0);
return false;
}

IsScalableVectorizationAllowed = true;
return true;
}

ElementCount
LoopVectorizationCostModel::getMaxLegalScalableVF(unsigned MaxSafeElements) {
if (!isScalableVectorizationAllowed())
return ElementCount::getScalable(0);

auto MaxScalableVF = ElementCount::getScalable(
std::numeric_limits<ElementCount::ScalarTy>::max());
Collaborator

Independent of this patch: this potentially overrides MaxSafeElements. Is it worth adding an assert?

Does the report below refer to getScalable(MaxSafeElements / *MaxVScale) returning false, rather than getMaxVScale() returning false?

Member Author

  1. I do not know how to add the assertion here without looking into the internals of LoopAccessAnalysis. Legal->isSafeForAnyVectorWidth() checks that MaxSafeVectorWidthInBits == UINT_MAX, and MaxSafeElements = bit_floor(MaxSafeVectorWidthInBits / WidestType).

  2. Not only that. It also applies when MaxSafeElements < MaxVScale.
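
The arithmetic in this reply can be made concrete with a short worked sketch. The helper names (`bitFloor`, `maxSafeElements`, `scalableKnownMin`) and the sample numbers are assumptions for illustration; only the two formulas, `bit_floor(MaxSafeVectorWidthInBits / WidestType)` and `MaxSafeElements / MaxVScale`, come from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Largest power of two that is <= X (0 for X == 0). Hand-rolled so the sketch
// does not depend on C++20's <bit> header.
inline uint64_t bitFloor(uint64_t X) {
  if (X == 0)
    return 0;
  uint64_t P = 1;
  while (P <= X / 2)
    P *= 2;
  return P;
}

// MaxSafeElements = bit_floor(MaxSafeVectorWidthInBits / WidestType).
inline uint64_t maxSafeElements(uint64_t SafeWidthBits, uint64_t WidestTypeBits) {
  assert(WidestTypeBits != 0 && "element type must have a width");
  return bitFloor(SafeWidthBits / WidestTypeBits);
}

// Known-minimum element count of the scalable VF: MaxSafeElements / MaxVScale.
// A result of 0 means no usable scalable VF.
inline uint64_t scalableKnownMin(uint64_t MaxSafeElts, uint64_t MaxVScale) {
  assert(MaxVScale != 0 && "vscale upper bound must be known and nonzero");
  return MaxSafeElts / MaxVScale;
}
```

For example, a safe width of 384 bits with a 32-bit widest type gives 12 elements, rounded down to 8. With an assumed maximum vscale of 16 the integer division yields 0, which is the `MaxSafeElements < MaxVScale` case mentioned in the reply; with a maximum vscale of 4 it yields 2.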

if (Legal->isSafeForAnyVectorWidth())
return MaxScalableVF;

@@ -4434,6 +4464,11 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
InterleaveInfo.invalidateGroupsRequiringScalarEpilogue();
}

// If we don't know the precise trip count, or if the trip count that we
// found modulo the vectorization factor is not zero, try to fold the tail
// by masking.
// FIXME: look for a smaller MaxVF that does divide TC rather than masking.
Comment on lines +4467 to +4470
Collaborator

This comment should continue to appear below, before the if (foldTailByMasking()) part that actually folds the tail, rather than here, where the code tries to avoid tail folding when the precise trip count is known to be a multiple of any VF we choose, possibly times UserIC (i.e., not necessarily a power of 2).

A different comment is needed here to explain why the tail folding style is set at this point (before we are sure there is a tail, and possibly reset below once we are sure there isn't), ahead of the call to computeFeasibleMaxVF(MaxTC, UserVF, /* FoldTail */ true), rather than below, before the first time it is checked explicitly via foldTailByMasking().

Perhaps that last boolean parameter of computeFeasibleMaxVF() is insufficient/redundant?
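
The ordering under discussion (select a style first, compute the feasible maximum VF, then reset the style if no tail remains) can be sketched as follows. This is a simplified model under stated assumptions: `CostModelSketch` and its members are hypothetical, VFs are plain integers, and the assertion inside `computeFeasibleMaxVF` merely stands in for whatever use the real computation makes of the chosen style.

```cpp
#include <cassert>
#include <optional>

enum class Style { None, Masked }; // hypothetical two-state simplification

struct CostModelSketch {
  std::optional<Style> Chosen;

  void setTailFoldingStyle(Style S) { Chosen = S; }

  void disableTailFolding() {
    assert(Chosen && "Tail folding must be selected.");
    Chosen = Style::None;
  }

  bool foldTailByMasking() const { return Chosen && *Chosen != Style::None; }

  // Assumption: the feasible-max-VF computation needs a selected style, so
  // calling it before setTailFoldingStyle() trips the assertion.
  unsigned computeFeasibleMaxVF(unsigned WidestVF) const {
    assert(Chosen && "style must be selected before computing the max VF");
    return WidestVF;
  }

  unsigned computeMaxVF(unsigned TripCount, unsigned WidestVF) {
    // 1. Select the style up front, before knowing whether a tail exists.
    setTailFoldingStyle(Style::Masked);
    // 2. Compute the feasible maximum VF with the style already in place.
    unsigned MaxVF = computeFeasibleMaxVF(WidestVF);
    // 3. If the trip count is a multiple of MaxVF there is no tail, so the
    //    earlier choice is reset.
    if (MaxVF != 0 && TripCount % MaxVF == 0)
      disableTailFolding();
    return MaxVF;
  }
};
```

With a trip count of 64 and a widest VF of 8 the style ends up reset to None; with a trip count of 70 the masked style stays selected and `foldTailByMasking()` returns true.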

setTailFoldingStyles(isScalableVectorizationAllowed(), UserIC);
FixedScalableVFPair MaxFactors = computeFeasibleMaxVF(MaxTC, UserVF, true);

// Avoid tail folding if the trip count is known to be a multiple of any VF
@@ -4465,15 +4500,11 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
if (Rem->isZero()) {
// Accept MaxFixedVF if we do not have a tail.
LLVM_DEBUG(dbgs() << "LV: No tail will remain for any chosen VF.\n");
disableTailFolding();
return MaxFactors;
}
}

// If we don't know the precise trip count, or if the trip count that we
// found modulo the vectorization factor is not zero, try to fold the tail
// by masking.
// FIXME: look for a smaller MaxVF that does divide TC rather than masking.
setTailFoldingStyles(MaxFactors.ScalableVF.isScalable(), UserIC);
if (foldTailByMasking()) {
if (getTailFoldingStyle() == TailFoldingStyle::DataWithEVL) {
LLVM_DEBUG(