[SandboxVec] Move seed collection into its own separate pass #127132

Merged
merged 3 commits into from
Feb 18, 2025
4 changes: 4 additions & 0 deletions llvm/include/llvm/SandboxIR/Region.h
@@ -120,6 +120,10 @@ class Region {
/// Set \p I as the \p Idx'th element in the auxiliary vector.
/// NOTE: This is for internal use, it does not set the metadata.
void setAux(unsigned Idx, Instruction *I);
/// Helper for dropping Aux metadata for \p I.
void dropAuxMetadata(Instruction *I);
/// Remove instruction \p I from Aux and drop metadata.
void removeFromAux(Instruction *I);

public:
Region(Context &Ctx, TargetTransformInfo &TTI);
@@ -16,14 +16,23 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/SandboxIR/Constant.h"
#include "llvm/SandboxIR/Pass.h"
#include "llvm/SandboxIR/PassManager.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/InstrMaps.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/Legality.h"

namespace llvm::sandboxir {

class BottomUpVec final : public FunctionPass {
/// This is a simple bottom-up vectorizer Region pass.
/// It expects a "seed slice" as an input in the Region's Aux vector.
/// The "seed slice" is a vector of instructions that can be used as a starting
/// point for vectorization, like stores to consecutive memory addresses.
/// Starting from the seed instructions, it walks up the def-use chain looking
/// for more instructions that can be vectorized. This pass will generate vector
/// code if it can legally vectorize the code, regardless of whether it is
/// profitable or not. For now profitability is checked at the end of the region
/// pass pipeline by a dedicated pass that accepts or rejects the IR
/// transaction, depending on the cost.
class BottomUpVec final : public RegionPass {
Collaborator:

Now that we have, in the aux vector, an extra way in which information can flow from one pass to the next, we should document very carefully what each pass expects to get there. This class should have a documentation comment anyway, so please add that and describe what needs to be passed in the aux vector.

Contributor Author:

Yes, that's a great point. Even though it is safe to run the pass without having set the Aux vector, it really expects it to be populated. I am wondering if there is a more structured way of expressing this, perhaps using a different subclass of the RegionPass?

bool Change = false;
std::unique_ptr<LegalityAnalysis> Legality;
/// The original instructions that are potentially dead after vectorization.
@@ -55,16 +64,9 @@ class BottomUpVec final : public FunctionPass {
/// Entry point for vectorization starting from \p Seeds.
bool tryVectorize(ArrayRef<Value *> Seeds);

/// The PM containing the pipeline of region passes.
RegionPassManager RPM;

public:
BottomUpVec(StringRef Pipeline);
bool runOnFunction(Function &F, const Analyses &A) final;
void printPipeline(raw_ostream &OS) const final {
OS << getName() << "\n";
RPM.printPipeline(OS);
}
BottomUpVec() : RegionPass("bottom-up-vec") {}
bool runOnRegion(Region &Rgn, const Analyses &A) final;
};

} // namespace llvm::sandboxir
@@ -0,0 +1,40 @@
//===- SeedCollection.h -----------------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// The seed-collection pass of the bottom-up vectorizer.
//

#ifndef LLVM_TRANSFORMS_VECTORIZE_SANDBOXVECTORIZER_PASSES_SEEDCOLLECTION_H
#define LLVM_TRANSFORMS_VECTORIZE_SANDBOXVECTORIZER_PASSES_SEEDCOLLECTION_H

#include "llvm/SandboxIR/Pass.h"
#include "llvm/SandboxIR/PassManager.h"

namespace llvm::sandboxir {

/// This pass collects the instructions that can become vectorization "seeds",
/// like stores to consecutive memory addresses. It then goes over the collected
/// seeds, slicing them into appropriately sized chunks, creating a Region with
/// the seed slice as the Auxiliary vector and runs the region pass pipeline.
class SeedCollection final : public FunctionPass {
Collaborator:

Same here, this should probably explain that it runs a pipeline of region passes, passing the collected seeds in the aux vector to the first pass in the pipeline.

Contributor Author:

Done

Collaborator:

I don't see anything new. Did you forget to push a commit?

Contributor Author:

Oops, I pushed it to the wrong branch...


/// The PM containing the pipeline of region passes.
RegionPassManager RPM;

public:
SeedCollection(StringRef Pipeline);
bool runOnFunction(Function &F, const Analyses &A) final;
void printPipeline(raw_ostream &OS) const final {
OS << getName() << "\n";
RPM.printPipeline(OS);
}
};

} // namespace llvm::sandboxir

#endif // LLVM_TRANSFORMS_VECTORIZE_SANDBOXVECTORIZER_PASSES_SEEDCOLLECTION_H
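
The hand-off described in the two class comments (a function-level pass creates a Region, stores a seed slice in its Aux vector, and runs a pipeline of region passes on it) can be pictured with a small standalone model. This is only a sketch under stated assumptions: `ToyRegion`, `ToyRegionPass`, `toyBottomUpVec` and `toySeedCollection` are hypothetical names, integers stand in for instructions, and the slicing is a plain fixed-size chop rather than the `SeedBundle::getSlice()` logic the real pass uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a Region: only the auxiliary vector is modeled.
struct ToyRegion {
  std::vector<int> Aux; // ints stand in for Instruction pointers
};

// A region pass is modeled as a named callable that returns "changed".
struct ToyRegionPass {
  std::string Name;
  std::function<bool(ToyRegion &)> Run;
};

// Consumer: like the bottom-up pass, it expects a seed slice of at least two
// elements in the Aux vector.
bool toyBottomUpVec(ToyRegion &Rgn) {
  assert(Rgn.Aux.size() >= 2 && "Bad slice!");
  return true; // pretend some vector code was generated
}

// Producer: chops the seeds into fixed-size slices (a simplification), builds
// one region per slice, publishes the slice through Aux, runs the pipeline,
// then clears Aux again.
bool toySeedCollection(const std::vector<int> &Seeds, std::size_t SliceElms,
                       const std::vector<ToyRegionPass> &Pipeline) {
  bool Change = false;
  if (SliceElms < 2)
    return Change;
  for (std::size_t Offset = 0; Offset + SliceElms <= Seeds.size();
       Offset += SliceElms) {
    ToyRegion Rgn;
    Rgn.Aux.assign(Seeds.begin() + Offset,
                   Seeds.begin() + Offset + SliceElms);
    for (const ToyRegionPass &P : Pipeline)
      Change |= P.Run(Rgn);
    Rgn.Aux.clear();
  }
  return Change;
}
```

In this model the only channel between the producer and the first pass in the pipeline is `Aux`, which is why the review asks for each pass to document what it expects to find there.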
25 changes: 19 additions & 6 deletions llvm/lib/SandboxIR/Region.cpp
@@ -40,8 +40,10 @@ Region::Region(Context &Ctx, TargetTransformInfo &TTI)

CreateInstCB = Ctx.registerCreateInstrCallback(
[this](Instruction *NewInst) { add(NewInst); });
EraseInstCB = Ctx.registerEraseInstrCallback(
[this](Instruction *ErasedInst) { remove(ErasedInst); });
EraseInstCB = Ctx.registerEraseInstrCallback([this](Instruction *ErasedInst) {
remove(ErasedInst);
removeFromAux(ErasedInst);
});
}

Region::~Region() {
@@ -84,11 +86,22 @@ void Region::setAux(unsigned Idx, Instruction *I) {
Aux[Idx] = I;
}

void Region::dropAuxMetadata(Instruction *I) {
auto *LLVMI = cast<llvm::Instruction>(I->Val);
LLVMI->setMetadata(AuxMDKind, nullptr);
}

void Region::removeFromAux(Instruction *I) {
auto It = find(Aux, I);
if (It == Aux.end())
return;
dropAuxMetadata(I);
Aux.erase(It);
}

void Region::clearAux() {
for (unsigned Idx : seq<unsigned>(0, Aux.size())) {
auto *LLVMI = cast<llvm::Instruction>(Aux[Idx]->Val);
LLVMI->setMetadata(AuxMDKind, nullptr);
}
for (unsigned Idx : seq<unsigned>(0, Aux.size()))
dropAuxMetadata(Aux[Idx]);
Aux.clear();
}
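
The effect of the new erase callback can be illustrated with a standalone model: when an instruction is erased, the region also removes it from Aux and drops its tag, so Aux never keeps a pointer to an erased instruction. The sketch below is an assumption-laden toy, not the SandboxIR API: `ToyInstr`, `ToyContext` and `ToyRegion` are hypothetical, a boolean stands in for the aux metadata, and the toy `setAux` is assumed to tag every element.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Toy instruction: the flag models the aux metadata on the underlying IR.
struct ToyInstr {
  bool HasAuxMD = false;
};

// Toy context: runs every registered callback when an instruction is erased.
struct ToyContext {
  std::vector<std::function<void(ToyInstr *)>> EraseCallbacks;
  void erase(ToyInstr *I) {
    for (auto &CB : EraseCallbacks)
      CB(I);
  }
};

class ToyRegion {
  std::vector<ToyInstr *> Aux;

  // Helper shared by removeFromAux() and clearAux().
  void dropAuxMetadata(ToyInstr *I) { I->HasAuxMD = false; }

public:
  // NOTE: unlike the real Region, this toy never unregisters its callback, so
  // it must outlive every erase() call on the context.
  explicit ToyRegion(ToyContext &Ctx) {
    Ctx.EraseCallbacks.push_back([this](ToyInstr *I) { removeFromAux(I); });
  }
  // Assumption: setting Aux tags every element.
  void setAux(const std::vector<ToyInstr *> &NewAux) {
    Aux = NewAux;
    for (ToyInstr *I : Aux)
      I->HasAuxMD = true;
  }
  // No-op when I is not in Aux, mirroring the early return in the diff.
  void removeFromAux(ToyInstr *I) {
    auto It = std::find(Aux.begin(), Aux.end(), I);
    if (It == Aux.end())
      return;
    dropAuxMetadata(I);
    Aux.erase(It);
  }
  void clearAux() {
    for (ToyInstr *I : Aux)
      dropAuxMetadata(I);
    Aux.clear();
  }
  const std::vector<ToyInstr *> &getAux() const { return Aux; }
};
```

Erasing from the middle of the vector shifts the later elements down by one, so any consumer that relies on Aux indices should read them after erasures have happened.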

1 change: 1 addition & 0 deletions llvm/lib/Transforms/Vectorize/CMakeLists.txt
@@ -9,6 +9,7 @@ add_llvm_component_library(LLVMVectorize
SandboxVectorizer/Legality.cpp
SandboxVectorizer/Passes/BottomUpVec.cpp
SandboxVectorizer/Passes/RegionsFromMetadata.cpp
SandboxVectorizer/Passes/SeedCollection.cpp
SandboxVectorizer/Passes/TransactionAcceptOrRevert.cpp
SandboxVectorizer/SandboxVectorizer.cpp
SandboxVectorizer/SandboxVectorizerPassBuilder.cpp
@@ -14,20 +14,10 @@
#include "llvm/SandboxIR/Module.h"
#include "llvm/SandboxIR/Region.h"
#include "llvm/SandboxIR/Utils.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizerPassBuilder.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/SeedCollector.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/VecUtils.h"

namespace llvm {

static cl::opt<unsigned>
OverrideVecRegBits("sbvec-vec-reg-bits", cl::init(0), cl::Hidden,
cl::desc("Override the vector register size in bits, "
"which is otherwise found by querying TTI."));
static cl::opt<bool>
AllowNonPow2("sbvec-allow-non-pow2", cl::init(false), cl::Hidden,
cl::desc("Allow non-power-of-2 vectorization."));

#ifndef NDEBUG
static cl::opt<bool>
AlwaysVerify("sbvec-always-verify", cl::init(false), cl::Hidden,
@@ -37,10 +27,6 @@ static cl::opt<bool>

namespace sandboxir {

BottomUpVec::BottomUpVec(StringRef Pipeline)
: FunctionPass("bottom-up-vec"),
RPM("rpm", Pipeline, SandboxVectorizerPassBuilder::createRegionPass) {}

static SmallVector<Value *, 4> getOperand(ArrayRef<Value *> Bndl,
unsigned OpIdx) {
SmallVector<Value *, 4> Operands;
@@ -413,90 +399,29 @@ Value *BottomUpVec::vectorizeRec(ArrayRef<Value *> Bndl,
}

bool BottomUpVec::tryVectorize(ArrayRef<Value *> Bndl) {
Change = false;
DeadInstrCandidates.clear();
Legality->clear();
vectorizeRec(Bndl, {}, /*Depth=*/0);
tryEraseDeadInstrs();
return Change;
}

bool BottomUpVec::runOnFunction(Function &F, const Analyses &A) {
bool BottomUpVec::runOnRegion(Region &Rgn, const Analyses &A) {
const auto &SeedSlice = Rgn.getAux();
assert(SeedSlice.size() >= 2 && "Bad slice!");
Function &F = *SeedSlice[0]->getParent()->getParent();
IMaps = std::make_unique<InstrMaps>(F.getContext());
Legality = std::make_unique<LegalityAnalysis>(
A.getAA(), A.getScalarEvolution(), F.getParent()->getDataLayout(),
F.getContext(), *IMaps);
Change = false;
const auto &DL = F.getParent()->getDataLayout();
unsigned VecRegBits =
OverrideVecRegBits != 0
? OverrideVecRegBits
: A.getTTI()
.getRegisterBitWidth(TargetTransformInfo::RGK_FixedWidthVector)
.getFixedValue();

// TODO: Start from innermost BBs first
for (auto &BB : F) {
SeedCollector SC(&BB, A.getScalarEvolution());
for (SeedBundle &Seeds : SC.getStoreSeeds()) {
unsigned ElmBits =
Utils::getNumBits(VecUtils::getElementType(Utils::getExpectedType(
Seeds[Seeds.getFirstUnusedElementIdx()])),
DL);

auto DivideBy2 = [](unsigned Num) {
auto Floor = VecUtils::getFloorPowerOf2(Num);
if (Floor == Num)
return Floor / 2;
return Floor;
};
// Try to create the largest vector supported by the target. If it fails
// reduce the vector size by half.
for (unsigned SliceElms = std::min(VecRegBits / ElmBits,
Seeds.getNumUnusedBits() / ElmBits);
SliceElms >= 2u; SliceElms = DivideBy2(SliceElms)) {
if (Seeds.allUsed())
break;
// Keep trying offsets after FirstUnusedElementIdx, until we vectorize
// the slice. This could be quite expensive, so we enforce a limit.
for (unsigned Offset = Seeds.getFirstUnusedElementIdx(),
OE = Seeds.size();
Offset + 1 < OE; Offset += 1) {
// Seeds are getting used as we vectorize, so skip them.
if (Seeds.isUsed(Offset))
continue;
if (Seeds.allUsed())
break;

auto SeedSlice =
Seeds.getSlice(Offset, SliceElms * ElmBits, !AllowNonPow2);
if (SeedSlice.empty())
continue;

assert(SeedSlice.size() >= 2 && "Should have been rejected!");

// TODO: Refactor to remove the unnecessary copy to SeedSliceVals.
SmallVector<Value *> SeedSliceVals(SeedSlice.begin(),
SeedSlice.end());
// Create an empty region. Instructions get added to the region
// automatically by the callbacks.
auto &Ctx = F.getContext();
Region Rgn(Ctx, A.getTTI());
// Save the state of the IR before we make any changes. The
// transaction gets accepted/reverted by the tr-accept-or-revert pass.
Ctx.save();
// Try to vectorize starting from the seed slice. The returned value
// is true if we found vectorizable code and generated some vector
// code for it. It does not mean that the code is profitable.
bool VecSuccess = tryVectorize(SeedSliceVals);
if (VecSuccess)
// WARNING: All passes should return false, except those that
// accept/revert the state.
Change |= RPM.runOnRegion(Rgn, A);
}
}
}
}
return Change;
// TODO: Refactor to remove the unnecessary copy to SeedSliceVals.
SmallVector<Value *> SeedSliceVals(SeedSlice.begin(), SeedSlice.end());
// Try to vectorize starting from the seed slice. The returned value
// is true if we found vectorizable code and generated some vector
// code for it. It does not mean that the code is profitable.
return tryVectorize(SeedSliceVals);
}

} // namespace sandboxir
@@ -21,14 +21,15 @@ REGION_PASS("null", ::llvm::sandboxir::NullPass)
REGION_PASS("print-instruction-count", ::llvm::sandboxir::PrintInstructionCount)
REGION_PASS("tr-accept", ::llvm::sandboxir::TransactionAlwaysAccept)
REGION_PASS("tr-accept-or-revert", ::llvm::sandboxir::TransactionAcceptOrRevert)
REGION_PASS("bottom-up-vec", ::llvm::sandboxir::BottomUpVec)

#undef REGION_PASS

#ifndef FUNCTION_PASS_WITH_PARAMS
#define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS_NAME)
#endif

FUNCTION_PASS_WITH_PARAMS("bottom-up-vec", ::llvm::sandboxir::BottomUpVec)
FUNCTION_PASS_WITH_PARAMS("seed-collection", ::llvm::sandboxir::SeedCollection)
FUNCTION_PASS_WITH_PARAMS("regions-from-metadata", ::llvm::sandboxir::RegionsFromMetadata)

#undef FUNCTION_PASS_WITH_PARAMS
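
Registry files of this shape are usually consumed with the X-macro technique: the including file defines `REGION_PASS` to expand into whatever it needs, pulls in the list, and the list undefines the macro again. The sketch below shows that technique in a single file. It is an illustration only: the `Toy*` classes and `createToyRegionPass` are hypothetical, and the list is an inline macro here because a self-contained example cannot `#include` a separate `.def` file.

```cpp
#include <memory>
#include <string>

// Hypothetical pass hierarchy used only for this sketch.
struct ToyPass {
  virtual ~ToyPass() = default;
  virtual std::string name() const = 0;
};
struct ToyNullPass final : ToyPass {
  std::string name() const override { return "null"; }
};
struct ToyBottomUpVec final : ToyPass {
  std::string name() const override { return "bottom-up-vec"; }
};

// Stand-in for the registry file: one X(NAME, CLASS_NAME) entry per pass.
#define TOY_REGION_PASSES(X)                                                   \
  X("null", ToyNullPass)                                                       \
  X("bottom-up-vec", ToyBottomUpVec)

// Factory: each registry entry expands into one name comparison.
// Returns nullptr for names that are not registered as region passes.
std::unique_ptr<ToyPass> createToyRegionPass(const std::string &Name) {
#define REGION_PASS(NAME, CLASS_NAME)                                          \
  if (Name == NAME)                                                            \
    return std::make_unique<CLASS_NAME>();
  TOY_REGION_PASSES(REGION_PASS)
#undef REGION_PASS
  return nullptr;
}
```

With a factory like this, moving a pass between the region list and the function list changes which factory can create it by name, which matches the move of "bottom-up-vec" to `REGION_PASS` in this diff.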
@@ -0,0 +1,98 @@
//===- SeedCollection.cpp - Seed collection pass --------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Vectorize/SandboxVectorizer/Passes/SeedCollection.h"
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/SandboxIR/Module.h"
#include "llvm/SandboxIR/Region.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizerPassBuilder.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/SeedCollector.h"
#include "llvm/Transforms/Vectorize/SandboxVectorizer/VecUtils.h"

namespace llvm {

static cl::opt<unsigned>
OverrideVecRegBits("sbvec-vec-reg-bits", cl::init(0), cl::Hidden,
cl::desc("Override the vector register size in bits, "
"which is otherwise found by querying TTI."));
static cl::opt<bool>
AllowNonPow2("sbvec-allow-non-pow2", cl::init(false), cl::Hidden,
cl::desc("Allow non-power-of-2 vectorization."));

namespace sandboxir {
SeedCollection::SeedCollection(StringRef Pipeline)
: FunctionPass("seed-collection"),
RPM("rpm", Pipeline, SandboxVectorizerPassBuilder::createRegionPass) {}

bool SeedCollection::runOnFunction(Function &F, const Analyses &A) {
bool Change = false;
const auto &DL = F.getParent()->getDataLayout();
unsigned VecRegBits =
OverrideVecRegBits != 0
? OverrideVecRegBits
: A.getTTI()
.getRegisterBitWidth(TargetTransformInfo::RGK_FixedWidthVector)
.getFixedValue();

// TODO: Start from innermost BBs first
for (auto &BB : F) {
SeedCollector SC(&BB, A.getScalarEvolution());
for (SeedBundle &Seeds : SC.getStoreSeeds()) {
unsigned ElmBits =
Utils::getNumBits(VecUtils::getElementType(Utils::getExpectedType(
Seeds[Seeds.getFirstUnusedElementIdx()])),
DL);

auto DivideBy2 = [](unsigned Num) {
auto Floor = VecUtils::getFloorPowerOf2(Num);
if (Floor == Num)
return Floor / 2;
return Floor;
};
// Try to create the largest vector supported by the target. If it fails
// reduce the vector size by half.
for (unsigned SliceElms = std::min(VecRegBits / ElmBits,
Seeds.getNumUnusedBits() / ElmBits);
SliceElms >= 2u; SliceElms = DivideBy2(SliceElms)) {
if (Seeds.allUsed())
break;
// Keep trying offsets after FirstUnusedElementIdx, until we vectorize
// the slice. This could be quite expensive, so we enforce a limit.
for (unsigned Offset = Seeds.getFirstUnusedElementIdx(),
OE = Seeds.size();
Offset + 1 < OE; Offset += 1) {
// Seeds are getting used as we vectorize, so skip them.
if (Seeds.isUsed(Offset))
continue;
if (Seeds.allUsed())
break;

auto SeedSlice =
Seeds.getSlice(Offset, SliceElms * ElmBits, !AllowNonPow2);
if (SeedSlice.empty())
continue;

assert(SeedSlice.size() >= 2 && "Should have been rejected!");

// Create a region containing the seed slice.
auto &Ctx = F.getContext();
Region Rgn(Ctx, A.getTTI());
// TODO: Replace save() with a save pass in the pass pipeline.
Ctx.save();
Rgn.setAux(SeedSlice);
// Run the region pass pipeline.
Change |= RPM.runOnRegion(Rgn, A);
Rgn.clearAux();
}
}
}
}
return Change;
}
} // namespace sandboxir
} // namespace llvm
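
The slice-size search in `runOnFunction` starts from the largest slice that fits both the vector register and the unused seed bits, then shrinks it with `DivideBy2` until fewer than two elements remain. The standalone sketch below lists the sizes that outer loop would visit. It rests on assumptions: `floorPowerOf2` is a stand-in for `VecUtils::getFloorPowerOf2` (taken to return the largest power of two not exceeding its argument), `sliceSizes` is a hypothetical helper, and `ElmBits` is assumed to be non-zero.

```cpp
#include <algorithm>
#include <vector>

// Stand-in for VecUtils::getFloorPowerOf2 (assumption): the largest power of
// two that is <= Num, or 0 when Num is 0.
unsigned floorPowerOf2(unsigned Num) {
  if (Num == 0)
    return 0;
  unsigned Pow = 1;
  while (Pow <= Num / 2)
    Pow *= 2;
  return Pow;
}

// Same shape as the DivideBy2 lambda in the pass: a power of two is halved,
// anything else is rounded down to the nearest power of two.
unsigned divideBy2(unsigned Num) {
  unsigned Floor = floorPowerOf2(Num);
  return Floor == Num ? Floor / 2 : Floor;
}

// Hypothetical helper: the slice sizes (in elements) the outer loop visits.
// ElmBits must be non-zero.
std::vector<unsigned> sliceSizes(unsigned VecRegBits, unsigned UnusedBits,
                                 unsigned ElmBits) {
  std::vector<unsigned> Sizes;
  for (unsigned SliceElms = std::min(VecRegBits / ElmBits, UnusedBits / ElmBits);
       SliceElms >= 2u; SliceElms = divideBy2(SliceElms))
    Sizes.push_back(SliceElms);
  return Sizes;
}
```

For example, with a 256-bit register, 192 unused seed bits and 32-bit elements, this helper yields 6, 4, 2: the first step rounds 6 down to 4 rather than halving it to 3.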