[VPlan] Use VPInstruction for uniform binops. #141429

Open
wants to merge 3 commits into
base: main
17 changes: 16 additions & 1 deletion llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -8395,7 +8395,7 @@ VPRecipeBuilder::tryToWidenHistogram(const HistogramInfo *HI,
return new VPHistogramRecipe(Opcode, HGramOps, HI->Store->getDebugLoc());
}

VPReplicateRecipe *
VPSingleDefRecipe *
VPRecipeBuilder::handleReplication(Instruction *I, ArrayRef<VPValue *> Operands,
VFRange &Range) {
bool IsUniform = LoopVectorizationPlanner::getDecisionAndClampRange(
@@ -8453,6 +8453,21 @@ VPRecipeBuilder::handleReplication(Instruction *I, ArrayRef<VPValue *> Operands,
assert((Range.Start.isScalar() || !IsUniform || !IsPredicated ||
(Range.Start.isScalable() && isa<IntrinsicInst>(I))) &&
"Should not predicate a uniform recipe");
if (IsUniform && !IsPredicated) {
Collaborator

Suggested change
if (IsUniform && !IsPredicated) {
if (IsUniform) {
assert(!IsPredicated && "IsUniform implies unpredicated");

if not asserted earlier?

Contributor Author

Added assert to #140623

VPInstruction *VPI = nullptr;
if (Instruction::isCast(I->getOpcode())) {
VPI = new VPInstructionWithType(I->getOpcode(), Operands, I->getType(),
VPIRFlags(*I), I->getDebugLoc(),
I->getName());
Comment on lines +8459 to +8461
Collaborator

Could VPInstructionWithType also represent binary ops which are uniform - thereby placing IsSingleScalar alongside Type?

Contributor Author

I think that could work, the only question is what to do with ResultTy for opcodes that do not need it but are single-scalar (potentially any opcode)

} else if (Instruction::isBinaryOp(I->getOpcode())) {
VPI = new VPInstruction(I->getOpcode(), Operands, VPIRFlags(*I),
I->getDebugLoc(), I->getName(), true);
Collaborator

Suggested change
I->getDebugLoc(), I->getName(), true);
I->getDebugLoc(), I->getName(), IsUniform);

Contributor Author

Also done in #140623

}
Collaborator

else - a uniform replicate recipe still takes care of all other cases?

Contributor Author

Yep

if (VPI) {
VPI->setUnderlyingValue(I);
return VPI;
}
}
auto *Recipe = new VPReplicateRecipe(I, Operands, IsUniform, BlockInMask,
VPIRMetadata(*I, LVer));
return Recipe;
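
As a rough illustration of the dispatch this hunk introduces, here is a minimal self-contained sketch. `OpClass`, `RecipeKind`, and `chooseRecipe` are hypothetical names invented for this note, not LLVM APIs; the sketch only mirrors the branch structure (uniform and unpredicated casts and binary ops become a `VPInstruction` variant, everything else still falls through to a replicate recipe).

```cpp
#include <cassert>

// Hypothetical stand-ins for the opcode classes tested in the hunk
// (Instruction::isCast / Instruction::isBinaryOp); not the LLVM API.
enum class OpClass { Cast, BinaryOp, Other };

// Which kind of recipe the sketch would build for an instruction.
enum class RecipeKind { InstructionWithType, Instruction, Replicate };

// Mirrors the branch structure of the hunk: a uniform, unpredicated cast or
// binary op becomes a single-scalar instruction; all other cases keep using
// a replicate recipe.
RecipeKind chooseRecipe(OpClass Op, bool IsUniform, bool IsPredicated) {
  if (IsUniform && !IsPredicated) {
    if (Op == OpClass::Cast)
      return RecipeKind::InstructionWithType;
    if (Op == OpClass::BinaryOp)
      return RecipeKind::Instruction;
  }
  return RecipeKind::Replicate;
}
```

Under these assumptions, a uniform load or call (modelled here as `OpClass::Other`) still yields `RecipeKind::Replicate`, which matches the "Yep" answer in the thread above.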
2 changes: 1 addition & 1 deletion llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
@@ -199,7 +199,7 @@ class VPRecipeBuilder {
/// Build a VPReplicationRecipe for \p I using \p Operands. If it is
/// predicated, add the mask as last operand. Range.End may be decreased to
/// ensure same recipe behavior from \p Range.Start to \p Range.End.
VPReplicateRecipe *handleReplication(Instruction *I,
VPSingleDefRecipe *handleReplication(Instruction *I,
ArrayRef<VPValue *> Operands,
VFRange &Range);

13 changes: 7 additions & 6 deletions llvm/lib/Transforms/Vectorize/VPlan.h
@@ -876,6 +876,9 @@ class VPInstruction : public VPRecipeWithIRFlags,
public VPUnrollPartAccessor<1> {
friend class VPlanSlp;

/// True if the VPInstruction produces a single scalar value.
bool IsSingleScalar;
Collaborator

Being single scalar is conceptually part of a VPValue's type, complementing its element type. A non single scalar value is either a vector of VF (fixed, scalable, EVL) elements or a collection of VF scalars, which may also deserve clear indication(?), potentially extended in the future to support other widths considering SLP starting with interleave groups, mixed vectorization, etc.

VPlan keeps VPValues agnostic of their Type, relying on VPTypeAnalysis to infer the element type, taking care of caching it. Only recipes that change their type contain that information, as in VPInstructionWithType casts, loads, calls, and initial VPValues. Narrowing bit widths introduces such casts to convey newly inferred element types.

Should single-scalar rely on similar analysis with only initial VPValues, and recipes that change it (vector to scalar reduction and extraction, broadcast and build steps), know about it?

Contributor Author

Should single-scalar rely on similar analysis with only initial VPValues, and recipes that change it (vector to scalar reduction and extraction, broadcast and build steps), know about it?

I think in some cases it can also be seen as property of the producer (e.g. if we proved uniformity for the VF for the specific operation + operands). In those cases, whether a single scalar is produced doesn't depend on either users or operands (e.g. adding a new wide user doesn't change what is produced by the recipe).

I think contrary to scalar types, whether the result is a single scalar is currently queried in many more places during various transforms and would be needed when executing each VPInstruction.

The proposed patch would allow us to start the transition by using a similar field to current VPReplicateRecipe and hence simplify VPReplicateRecipe to only handle the replicating case.

Similarly we could also use VPInstruction for various VPWiden*Recipes.

It might be worth consolidating and simplifying the existing recipe classes first, before adjusting how we reason/represent single-scalar/uniformity?

Collaborator

I think in some cases it can also be seen as property of the producer (e.g. if we proved uniformity for the VF for the specific operation + operands). In those cases, whether a single scalar is produced doesn't depend on either users or operands (e.g. adding a new wide user doesn't change what is produced by the recipe).

Indeed in some cases the producer knows it generates a single scalar, as in reducing a vector to a scalar and extracting a scalar from a vector.
If "we proved uniformity for the VF for the specific operation + operands" and then the operands change - that proof may need to be revisited?

I think contrary to scalar types, whether the result is a single scalar is currently queried in many more places during various transforms and would be needed when executing each VPInstruction.

Execution and cost arguably need complete type information - both number of elements and type of each element. In addition to the relative frequency of other usages, is the issue of cache-invalidation or updating the type information when recipes are added/removed/replaced/updated.

The proposed patch would allow us to start the transition by using a similar field to current VPReplicateRecipe and hence simplify VPReplicateRecipe to only handle the replicating case.

This is indeed a good incremental step forward, although replicate recipe still handles remaining uniform cases?

Similarly we could also use VPInstruction for various VPWiden*Recipes.

It might be worth consolidating and simplifying the existing recipe classes first, before adjusting how we reason/represent single-scalar/uniformity?

Would be good to make incremental progress alongside clarifying roadmap goal. If current representation is expected to be revisited, worth a note. I.e., reason about the inconsistency between element type information recorded where needed and propagated via type analysis, versus recording IsSingleScalar in each VPInstruction - all other VPSingleDefRecipe's are implicitly non single scalar? All other VPValues are implicitly single scalar - live-ins?

Should vputils::isSingleScalar(VPInstruction* VPI) be updated to return VPI::isSingleScalar(), w/o considering its operands if it's not?

Contributor Author

Indeed in some cases the producer knows it generates a single scalar, as in reducing a vector to a scalar and extracting a scalar from a vector. If "we proved uniformity for the VF for the specific operation + operands" and then the operands change - that proof may need to be revisited?

Yep I think that would be good to re-visit at some point.

Execution and cost arguably need complete type information - both number of elements and type of each element. In addition to the relative frequency of other usages, is the issue of cache-invalidation or updating the type information when recipes are added/removed/replaced/updated.

Most recipes w/o attached type can construct IR instruction w/o querying the type explicitly as opcode + operands is sufficient in most cases. For cost, both will be needed.

This is indeed a good incremental step forward, although replicate recipe still handles remaining uniform cases?

Yep there's still some way to go to complete the full transition. Would probably be good to start with one of the simplest cases first (casts, #140623), then binops, then the rest.

Similarly we could also use VPInstruction for various VPWiden*Recipes.

It might be worth consolidating and simplifying the existing recipe classes first, before adjusting how we reason/represent single-scalar/uniformity?

Would be good to make incremental progress alongside clarifying roadmap goal. If current representation is expected to be revisited, worth a note. I.e., reason about the inconsistency between element type information recorded where needed and propagated via type analysis, versus recording IsSingleScalar in each VPInstruction - all other VPSingleDefRecipe's are implicitly non single scalar? All other VPValues are implicitly single scalar - live-ins?

There are some recipes that are always scalar (e.g. VPCanonicalIVPhi), but after the transition almost all other cases should be VPInstruction with IsSingleScalar set.

Should vputils::isSingleScalar(VPInstruction* VPI) be updated to return VPI::isSingleScalar(), w/o considering its operands if it's not?

I think at least in the first patch (casts, https://github.com/llvm/llvm-project/pull/1406230) it is too early, as we don't yet set it for all relevant opcodes. But once we set it for all relevant opcodes, IsSingleScalar should be the only source used to decide whether a VPInstruction produces a scalar value, with convertToSingleScalar being the only place where we convert to SingleScalar explicitly.
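
To make the two positions in this thread concrete, here is a small sketch contrasting a flag stored on the producer with a property inferred from operands. `ToyValue`, `isSingleScalarByFlag`, and `isSingleScalarByOperands` are hypothetical and do not model VPlan classes. The inference rule used here ("all operands single scalar implies the result is single scalar") is an assumption that only makes sense for side-effect-free elementwise operations.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy value; not a VPlan class.
struct ToyValue {
  bool StoredSingleScalar = false; // flag recorded when the value is created
  bool IsLiveIn = false;           // live-ins are treated as single scalars
  std::vector<const ToyValue *> Operands;
};

// Producer-side query: trust the flag recorded at construction time.
bool isSingleScalarByFlag(const ToyValue &V) {
  return V.IsLiveIn || V.StoredSingleScalar;
}

// Analysis-style query: infer the property from the operands. A value with
// no operands that is not a live-in is conservatively treated as wide.
bool isSingleScalarByOperands(const ToyValue &V) {
  if (V.IsLiveIn)
    return true;
  if (V.Operands.empty())
    return false;
  for (const ToyValue *Op : V.Operands)
    if (!isSingleScalarByOperands(*Op))
      return false;
  return true;
}
```

In this toy model, replacing one operand of a flagged value with a wide value leaves `isSingleScalarByFlag` unchanged while `isSingleScalarByOperands` flips to false, which is the "proof may need to be revisited" concern raised above.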


public:
/// VPlan opcodes, extending LLVM IR with idiomatics instructions.
enum {
@@ -966,7 +969,7 @@ class VPInstruction : public VPRecipeWithIRFlags,

VPInstruction(unsigned Opcode, ArrayRef<VPValue *> Operands,
const VPIRFlags &Flags, DebugLoc DL = {},
const Twine &Name = "");
const Twine &Name = "", bool IsSingleScalar = false);

VP_CLASSOF_IMPL(VPDef::VPInstructionSC)

@@ -1051,7 +1054,8 @@ class VPInstructionWithType : public VPInstruction {
VPInstructionWithType(unsigned Opcode, ArrayRef<VPValue *> Operands,
Type *ResultTy, const VPIRFlags &Flags, DebugLoc DL,
const Twine &Name = "")
: VPInstruction(Opcode, Operands, Flags, DL, Name), ResultTy(ResultTy) {}
: VPInstruction(Opcode, Operands, Flags, DL, Name, true),
ResultTy(ResultTy) {}

static inline bool classof(const VPRecipeBase *R) {
// VPInstructionWithType are VPInstructions with specific opcodes requiring
@@ -1087,10 +1091,7 @@ class VPInstructionWithType : public VPInstruction {

/// Return the cost of this VPInstruction.
InstructionCost computeCost(ElementCount VF,
VPCostContext &Ctx) const override {
// TODO: Compute accurate cost after retiring the legacy cost model.
return 0;
}
VPCostContext &Ctx) const override;

Type *getResultType() const { return ResultTy; }

38 changes: 26 additions & 12 deletions llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -408,9 +408,9 @@ template class VPUnrollPartAccessor<3>;

VPInstruction::VPInstruction(unsigned Opcode, ArrayRef<VPValue *> Operands,
const VPIRFlags &Flags, DebugLoc DL,
const Twine &Name)
const Twine &Name, bool IsSingleScalar)
: VPRecipeWithIRFlags(VPDef::VPInstructionSC, Operands, Flags, DL),
Opcode(Opcode), Name(Name.str()) {
IsSingleScalar(IsSingleScalar), Opcode(Opcode), Name(Name.str()) {
assert(flagsValidForOpcode(getOpcode()) &&
"Set flags not supported for the provided opcode");
}
@@ -841,7 +841,8 @@ bool VPInstruction::isVectorToScalar() const {
}

bool VPInstruction::isSingleScalar() const {
return getOpcode() == VPInstruction::ResumePhi ||
// TODO: Set IsSingleScalar for ResumePhi and PHI.
return IsSingleScalar || getOpcode() == VPInstruction::ResumePhi ||
getOpcode() == Instruction::PHI;
}

@@ -965,7 +966,7 @@ void VPInstruction::dump() const {

void VPInstruction::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "EMIT ";
O << Indent << (isSingleScalar() ? "SINGLE-SCALAR " : "EMIT ");
Collaborator

Needs rebasing to emit EMIT-SCALAR.

Contributor Author

Done, thanks


if (hasResult()) {
printAsOperand(O, SlotTracker);
@@ -1049,15 +1050,17 @@ void VPInstruction::print(raw_ostream &O, const Twine &Indent,

void VPInstructionWithType::execute(VPTransformState &State) {
State.setDebugLocFrom(getDebugLoc());
switch (getOpcode()) {
case Instruction::ZExt:
case Instruction::Trunc: {
if (Instruction::isCast(getOpcode())) {
Collaborator

This is needed now that SExt/UIToFP/? are also represented as VPInstructionWithType?
Could alternatively introduce additional case Instruction::'s, but isCast is more general.

Contributor Author

Yep, now we need to generate code for any cast
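
A small sketch of why a range predicate is more general than listing `case` labels. The opcode numbering below is hypothetical and chosen only for illustration (the real values come from LLVM's opcode tables); `handledBySwitch` is also a made-up helper that stands for the old `switch` shape.

```cpp
#include <cassert>

// Hypothetical opcode numbering for illustration only. The casts occupy one
// contiguous range, delimited by CastOpsBegin and CastOpsEnd.
enum ToyOpcode : unsigned {
  Add,
  CastOpsBegin,
  Trunc = CastOpsBegin,
  ZExt,
  SExt,
  UIToFP,
  CastOpsEnd,
  StepVector = CastOpsEnd
};

// Range check that covers every cast opcode at once.
bool isCast(unsigned Opcode) {
  return Opcode >= CastOpsBegin && Opcode < CastOpsEnd;
}

// Old shape: only the explicitly listed casts are handled.
bool handledBySwitch(unsigned Opcode) {
  switch (Opcode) {
  case ZExt:
  case Trunc:
    return true;
  default:
    return false;
  }
}
```

With this numbering, `SExt` and `UIToFP` are missed by `handledBySwitch` but accepted by `isCast`, while `StepVector` is rejected by both and would still need its own case.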

Value *Op = State.get(getOperand(0), VPLane(0));
Value *Cast = State.Builder.CreateCast(Instruction::CastOps(getOpcode()),
Op, ResultTy);
if (auto *I = dyn_cast<Instruction>(Cast))
applyFlags(*I);
Comment on lines +1057 to +1058
Collaborator

This is needed now that flagged casts are represented as VPInstructionWithType's?

Contributor Author

Yep

State.set(this, Cast, VPLane(0));
break;
return;
}

switch (getOpcode()) {
case VPInstruction::StepVector: {
Value *StepVector =
State.Builder.CreateStepVector(VectorType::get(ResultTy, State.VF));
@@ -1069,10 +1072,19 @@ void VPInstructionWithType::execute(VPTransformState &State) {
}
}

InstructionCost VPInstructionWithType::computeCost(ElementCount VF,
VPCostContext &Ctx) const {
// TODO: Compute cost for VPInstructions without underlying values once
// the legacy cost model has been retired.
if (!getUnderlyingValue())
return 0;
return Ctx.getLegacyCost(cast<Instruction>(getUnderlyingValue()), VF);
}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPInstructionWithType::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
O << Indent << "EMIT ";
O << Indent << (isSingleScalar() ? "SINGLE-SCALAR " : "EMIT ");
Collaborator

Suggested change
O << Indent << (isSingleScalar() ? "SINGLE-SCALAR " : "EMIT ");
O << Indent << "EMIT" << (isSingleScalar() ? "-SCALAR" : "") << " ";

consistent with VPInstruction::print().

Contributor Author

Done thanks
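
For reference, the prefix produced by the suggested form can be sketched with standard streams. `printPrefix` is a hypothetical helper written for this note; the real code writes to a `raw_ostream` inside `print()`.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds the line prefix the same way as the suggested change: "EMIT",
// an optional "-SCALAR" suffix, then a single space.
std::string printPrefix(const std::string &Indent, bool IsSingleScalar) {
  std::ostringstream O;
  O << Indent << "EMIT" << (IsSingleScalar ? "-SCALAR" : "") << " ";
  return O.str();
}
```

Under this sketch, a single-scalar recipe prints `EMIT-SCALAR ` and any other recipe prints `EMIT `, each followed by the operand text.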

printAsOperand(O, SlotTracker);
O << " = ";

@@ -1585,23 +1597,25 @@ bool VPIRFlags::flagsValidForOpcode(unsigned Opcode) const {
switch (OpType) {
case OperationType::OverflowingBinOp:
return Opcode == Instruction::Add || Opcode == Instruction::Sub ||
Opcode == Instruction::Mul ||
Opcode == Instruction::Mul || Opcode == Instruction::Shl ||
Opcode == VPInstruction::VPInstruction::CanonicalIVIncrementForPart;
case OperationType::DisjointOp:
return Opcode == Instruction::Or;
case OperationType::PossiblyExactOp:
return Opcode == Instruction::AShr;
return Opcode == Instruction::AShr || Opcode == Instruction::LShr ||
Opcode == Instruction::SDiv || Opcode == Instruction::UDiv;
case OperationType::GEPOp:
return Opcode == Instruction::GetElementPtr ||
Opcode == VPInstruction::PtrAdd;
case OperationType::FPMathOp:
return Opcode == Instruction::FAdd || Opcode == Instruction::FMul ||
Opcode == Instruction::FSub || Opcode == Instruction::FNeg ||
Opcode == Instruction::FDiv || Opcode == Instruction::FRem ||
Opcode == Instruction::FPTrunc || Opcode == Instruction::FPExt ||
Opcode == Instruction::FCmp || Opcode == Instruction::Select ||
Opcode == VPInstruction::WideIVStep;
case OperationType::NonNegOp:
return Opcode == Instruction::ZExt;
return Opcode == Instruction::UIToFP || Opcode == Instruction::ZExt;
break;
case OperationType::Cmp:
return Opcode == Instruction::ICmp;
23 changes: 21 additions & 2 deletions llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -154,6 +154,10 @@ static bool sinkScalarOperands(VPlan &Plan) {
if (auto *RepR = dyn_cast<VPReplicateRecipe>(SinkCandidate)) {
if (!ScalarVFOnly && RepR->isSingleScalar())
continue;
Comment on lines 154 to 156
Collaborator

Could be simplified and generalized to

Suggested change
if (auto *RepR = dyn_cast<VPReplicateRecipe>(SinkCandidate)) {
if (!ScalarVFOnly && RepR->isSingleScalar())
continue;
if (!isa<VPReplicateRecipe, VPInstruction, VPScalarIVStepsRecipe>(SinkCandidate))
continue;
if (!ScalarVFOnly && vputils::isSingleScalar(SinkCandidate))
continue;

?
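
The filter proposed in this suggestion can be sketched on a toy model as follows. `Kind`, `Candidate`, and `isSinkCandidate` are hypothetical names, and the `SingleScalar` field stands in for whatever `vputils::isSingleScalar` would return. Note that this follows the reviewer's suggested form; the patch as posted additionally requires an underlying value for `VPInstruction` candidates, which is not modelled here.

```cpp
#include <cassert>

// Hypothetical recipe kinds; WidenOther stands for any recipe class that is
// not eligible for sinking.
enum class Kind { Replicate, Instruction, ScalarIVSteps, WidenOther };

// Toy candidate: its kind plus a precomputed single-scalar answer.
struct Candidate {
  Kind K;
  bool SingleScalar;
};

// Suggested filter: first restrict by recipe kind, then skip single-scalar
// candidates unless the plan only has a scalar VF.
bool isSinkCandidate(const Candidate &C, bool ScalarVFOnly) {
  if (C.K != Kind::Replicate && C.K != Kind::Instruction &&
      C.K != Kind::ScalarIVSteps)
    return false;
  if (!ScalarVFOnly && C.SingleScalar)
    return false;
  return true;
}
```

One consequence of this shape is that the single-scalar check applies uniformly to all three kinds, instead of being repeated per `dyn_cast` branch.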

} else if (auto *RepR = dyn_cast<VPInstruction>(SinkCandidate)) {
if ((!ScalarVFOnly && RepR->isSingleScalar()) ||
!RepR->getUnderlyingValue())
Comment on lines +157 to +159
Collaborator

So now non-single-scalar VPInstructions with underlying values are also subject to sinking?
Why require an underlying value?
Worth updating above "// Try to sink each replicate or scalar IV steps recipe in the worklist."

Suggested change
} else if (auto *RepR = dyn_cast<VPInstruction>(SinkCandidate)) {
if ((!ScalarVFOnly && RepR->isSingleScalar()) ||
!RepR->getUnderlyingValue())
} else if (auto *VPI = dyn_cast<VPInstruction>(SinkCandidate)) {
if ((!ScalarVFOnly && VPI->isSingleScalar()) ||
!VPI->getUnderlyingValue())

continue;
} else if (!isa<VPScalarIVStepsRecipe>(SinkCandidate))
continue;

@@ -196,6 +200,15 @@ static bool sinkScalarOperands(VPlan &Plan) {
SinkCandidate->replaceUsesWithIf(Clone, [SinkTo](VPUser &U, unsigned) {
return cast<VPRecipeBase>(&U)->getParent() != SinkTo;
});
} else {
Collaborator

Worth a comment how non-single-scalar VPInstructions are sunk.

if (auto *VPI = dyn_cast<VPInstruction>(SinkCandidate)) {
Comment on lines +203 to +204
Collaborator

Suggested change
} else {
if (auto *VPI = dyn_cast<VPInstruction>(SinkCandidate)) {
} else if (auto *VPI = dyn_cast<VPInstruction>(SinkCandidate)) {

auto *OldCand = SinkCandidate;
SinkCandidate = new VPReplicateRecipe(VPI->getUnderlyingInstr(),
SinkCandidate->operands(), true,
nullptr /*Mask*/);
Comment on lines +206 to +208
Collaborator

Can VPI be cloned instead?

SinkCandidate->insertBefore(OldCand);
OldCand->replaceAllUsesWith(SinkCandidate);
}
}
SinkCandidate->moveBefore(*SinkTo, SinkTo->getFirstNonPhi());
for (VPValue *Op : SinkCandidate->operands())
@@ -1047,8 +1060,14 @@ static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
unsigned ExtOpcode = match(R.getOperand(0), m_SExt(m_VPValue()))
? Instruction::SExt
: Instruction::ZExt;
auto *VPC =
new VPWidenCastRecipe(Instruction::CastOps(ExtOpcode), A, TruncTy);
VPSingleDefRecipe *VPC;
if (vputils::isSingleScalar(R.getVPSingleValue()))
Collaborator

Suggested change
if (vputils::isSingleScalar(R.getVPSingleValue()))
if (vputils::isSingleScalar(Trunc))

Contributor Author

Done in #140623

VPC = new VPInstructionWithType(Instruction::CastOps(ExtOpcode), {A},
TruncTy, {}, {});
else
VPC = new VPWidenCastRecipe(Instruction::CastOps(ExtOpcode), A,
Collaborator

Should both be VPInstructionWithType, one with IsSingleScalar turned on and the other off?

Contributor Author

Yep, there's an earlier WIP patch to use VPInstructionWithType also for VPWidenCastRecipe (#129712), although I think it would probably make sense to re-visit this after support for uniform casts land.

TruncTy);

if (auto *UnderlyingExt = R.getOperand(0)->getUnderlyingValue()) {
// UnderlyingExt has distinct return type, used to retain legacy cost.
VPC->setUnderlyingValue(UnderlyingExt);
@@ -29,10 +29,10 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: [[STEPS:vp.*]] = SCALAR-STEPS [[IV]], ir<1>, [[VF]]
; CHECK-NEXT: CLONE [[GEP_IDX:.*]] = getelementptr inbounds ir<%indices>, [[STEPS]]
; CHECK-NEXT: CLONE [[IDX:.*]] = load [[GEP_IDX]]
; CHECK-NEXT: CLONE [[EXT_IDX:.*]] = zext [[IDX]]
; CHECK-NEXT: SINGLE-SCALAR [[EXT_IDX:.*]] = zext [[IDX]]
; CHECK-NEXT: CLONE [[GEP_BUCKET:.*]] = getelementptr inbounds ir<%buckets>, [[EXT_IDX]]
; CHECK-NEXT: CLONE [[HISTVAL:.*]] = load [[GEP_BUCKET]]
; CHECK-NEXT: CLONE [[UPDATE:.*]] = add nsw [[HISTVAL]], ir<1>
; CHECK-NEXT: SINGLE-SCALAR [[UPDATE:.*]] = add nsw [[HISTVAL]], ir<1>
; CHECK-NEXT: CLONE store [[UPDATE]], [[GEP_BUCKET]]
; CHECK-NEXT: EMIT [[IV_NEXT]] = add nuw [[IV]], [[VFxUF]]
; CHECK-NEXT: EMIT branch-on-count [[IV_NEXT]], [[VTC]]
@@ -46,7 +46,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Successor(s): ir-bb<for.exit>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi [[VTC]], ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi [[VTC]], ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -93,7 +93,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Successor(s): ir-bb<for.exit>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi [[VTC]], ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi [[VTC]], ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -44,7 +44,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -91,7 +91,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -143,7 +143,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -190,7 +190,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -241,7 +241,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -288,7 +288,7 @@ target triple = "aarch64-unknown-linux-gnu"
; CHECK-NEXT: Successor(s): ir-bb<for.cond.cleanup>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
8 changes: 4 additions & 4 deletions llvm/test/Transforms/LoopVectorize/AArch64/vplan-printing.ll
@@ -48,8 +48,8 @@ define i32 @print_partial_reduction(ptr %a, ptr %b) {
; CHECK-NEXT: Successor(s): ir-bb<exit>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<%bc.resume.val> = resume-phi vp<[[VEC_TC]]>, ir<0>
; CHECK-NEXT: EMIT vp<%bc.merge.rdx> = resume-phi vp<[[RED_RESULT]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<%bc.resume.val> = resume-phi vp<[[VEC_TC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<%bc.merge.rdx> = resume-phi vp<[[RED_RESULT]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -114,8 +114,8 @@ define i32 @print_partial_reduction(ptr %a, ptr %b) {
; CHECK-NEXT: No successors
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<scalar.ph>:
; CHECK-NEXT: EMIT vp<[[EP_RESUME:%.+]]> = resume-phi ir<1024>, ir<0>
; CHECK-NEXT: EMIT vp<[[EP_MERGE:%.+]]> = resume-phi vp<[[RED_RESULT]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[EP_RESUME:%.+]]> = resume-phi ir<1024>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[EP_MERGE:%.+]]> = resume-phi vp<[[RED_RESULT]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<for.body>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<for.body>:
@@ -42,7 +42,7 @@ target triple = "arm64-apple-ios"
; CHECK-NEXT: Successor(s): ir-bb<exit>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<loop>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<loop>:
@@ -89,7 +89,7 @@ target triple = "arm64-apple-ios"
; CHECK-NEXT: Successor(s): ir-bb<exit>, scalar.ph
; CHECK-EMPTY:
; CHECK-NEXT: scalar.ph:
; CHECK-NEXT: EMIT vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: SINGLE-SCALAR vp<[[RESUME:%.+]]> = resume-phi vp<[[VTC]]>, ir<0>
; CHECK-NEXT: Successor(s): ir-bb<loop>
; CHECK-EMPTY:
; CHECK-NEXT: ir-bb<loop>:
@@ -94,7 +94,7 @@ define void @safe_dep(ptr %p) {
; CHECK-NEXT: CLONE ir<%a1> = getelementptr ir<%p>, vp<[[STEPS]]>
; CHECK-NEXT: vp<[[VPTR1:%.+]]> = vector-pointer ir<%a1>
; CHECK-NEXT: WIDEN ir<%v> = load vp<[[VPTR1]]>
; CHECK-NEXT: CLONE ir<%offset> = add vp<[[STEPS]]>, ir<100>
; CHECK-NEXT: SINGLE-SCALAR ir<%offset> = add vp<[[STEPS]]>, ir<100>
; CHECK-NEXT: CLONE ir<%a2> = getelementptr ir<%p>, ir<%offset>
; CHECK-NEXT: vp<[[VPTR2:%.+]]> = vector-pointer ir<%a2>
; CHECK-NEXT: WIDEN store vp<[[VPTR2]]>, ir<%v>