Skip to content

[SandboxVectorizer] Add container class to track and manage SeedBundles #112048

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,8 @@ class SeedBundle {
NumUnusedBits += Utils::getNumBits(I);
}

virtual void insert(Instruction *I, ScalarEvolution &SE) = 0;

unsigned getFirstUnusedElementIdx() const {
for (unsigned ElmIdx : seq<unsigned>(0, Seeds.size()))
if (!isUsed(ElmIdx))
Expand Down Expand Up @@ -96,6 +98,9 @@ class SeedBundle {
MutableArrayRef<Instruction *>
getSlice(unsigned StartIdx, unsigned MaxVecRegBits, bool ForcePowOf2);

/// \Returns the number of seed elements in the bundle.
std::size_t size() const { return Seeds.size(); }

protected:
SmallVector<Instruction *> Seeds;
/// The lanes that we have already vectorized.
Expand Down Expand Up @@ -148,7 +153,7 @@ template <typename LoadOrStoreT> class MemSeedBundle : public SeedBundle {
"Expected LoadInst or StoreInst!");
assert(isa<LoadOrStoreT>(MemI) && "Expected Load or Store!");
}
void insert(sandboxir::Instruction *I, ScalarEvolution &SE) {
void insert(sandboxir::Instruction *I, ScalarEvolution &SE) override {
assert(isa<LoadOrStoreT>(I) && "Expected a Store or a Load!");
auto Cmp = [&SE](Instruction *I0, Instruction *I1) {
return Utils::atLowerAddress(cast<LoadOrStoreT>(I0),
Expand All @@ -162,5 +167,123 @@ template <typename LoadOrStoreT> class MemSeedBundle : public SeedBundle {
using StoreSeedBundle = MemSeedBundle<sandboxir::StoreInst>;
using LoadSeedBundle = MemSeedBundle<sandboxir::LoadInst>;

/// Class to conveniently track Seeds within SeedBundles. Saves newly collected
/// seeds in the proper bundle. Supports constant-time removal, as seeds and
/// entire bundles are vectorized and marked used to signify removal. Iterators
/// skip bundles that are completely used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

completely used -> completely unused ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterators skip bundles that are completely used, because the vectorizer will set an instruction as "used" when it is vectorized (or we have concluded that it can't be vectorized). We don't actually remove them from the bundle so that "removal" is a constant time operation. I will elaborate on this in the iterator comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes you are referring to the used flag. I was thinking of something tagged with the used flag as unused.

class SeedContainer {
// Use the same key for different seeds if they are the same type and
// reference the same pointer, even if at different offsets. This directs
// potentially vectorizable seeds into the same bundle.
using KeyT = std::tuple<Value *, Type *, Instruction::Opcode>;
// Trying to vectorize too many seeds at once is expensive in
// compilation-time. Use a vector of bundles (all with the same key) to
// partition the candidate set into more manageable units. Each bundle is
// size-limited by sbvec-seed-bundle-size-limit. TODO: There might be a
// better way to divide these than by simple insertion order.
using ValT = SmallVector<std::unique_ptr<SeedBundle>>;
using BundleMapT = MapVector<KeyT, ValT>;
// Map from {pointer, Type, Opcode} to a vector of bundles.
BundleMapT Bundles;
// Allows finding a particular Instruction's bundle.
DenseMap<Instruction *, SeedBundle *> SeedLookupMap;

ScalarEvolution &SE;

template <typename LoadOrStoreT> KeyT getKey(LoadOrStoreT *LSI) const;

public:
SeedContainer(ScalarEvolution &SE) : SE(SE) {}

class iterator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a brief description, mentioning what it iterates over and whether the iterator follows a specific order.

BundleMapT *Map = nullptr;
BundleMapT::iterator MapIt;
ValT *Vec = nullptr;
size_t VecIdx;

public:
using difference_type = std::ptrdiff_t;
using value_type = SeedBundle;
using pointer = value_type *;
using reference = value_type &;
using iterator_category = std::input_iterator_tag;

/// Iterates over the \p Map of SeedBundle Vectors, starting at \p MapIt,
/// and \p Vec at \p VecIdx, skipping vectors that are completely
/// used. Iteration order over the keys {Pointer, Type, Opcode} follows
/// DenseMap iteration order. For a given key, the vectors of
/// SeedBundles will be returned in insertion order. As in the
/// pseudo code below:
///
/// for Key,Value in Bundles
/// for SeedBundleVector in Value
/// for SeedBundle in SeedBundleVector
/// if !SeedBundle.allUsed() ...
///
/// Note that the bundles themselves may have additional ordering, created
/// by the subclasses by insertAt. The bundles themselves may also have used
/// instructions.
iterator(BundleMapT &Map, BundleMapT::iterator MapIt, ValT *Vec, int VecIdx)
: Map(&Map), MapIt(MapIt), Vec(Vec), VecIdx(VecIdx) {}
value_type &operator*() {
assert(Vec != nullptr && "Already at end!");
return *(*Vec)[VecIdx];
}
// Skip completely used bundles by repeatedly calling operator++().
void skipUsed() {
while (Vec && VecIdx < Vec->size() && this->operator*().allUsed())
++(*this);
}
// Iterators iterate over the bundles
iterator &operator++() {
assert(VecIdx >= 0 && "Already at end!");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sterling-Augustine I'm seeing "comparison of unsigned expression >= 0 is always true" warnings due to VecIdx being unsigned - please can you take a look?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #112392

Sorry for the trouble

++VecIdx;
if (VecIdx >= Vec->size()) {
assert(MapIt != Map->end() && "Already at end!");
VecIdx = 0;
++MapIt;
if (MapIt != Map->end())
Vec = &MapIt->second;
else {
Vec = nullptr;
}
}
skipUsed();
return *this;
}
iterator operator++(int) {
auto Copy = *this;
++(*this);
return Copy;
}
bool operator==(const iterator &Other) const {
assert(Map == Other.Map && "Iterator of different objects!");
return MapIt == Other.MapIt && VecIdx == Other.VecIdx;
}
bool operator!=(const iterator &Other) const { return !(*this == Other); }
};
using const_iterator = BundleMapT::const_iterator;
template <typename LoadOrStoreT> void insert(LoadOrStoreT *LSI);
// To support constant-time erase, these just mark the element used, rather
// than actually removing them from the bundle.
bool erase(Instruction *I);
bool erase(const KeyT &Key) { return Bundles.erase(Key); }
iterator begin() {
if (Bundles.empty())
return end();
auto BeginIt =
iterator(Bundles, Bundles.begin(), &Bundles.begin()->second, 0);
BeginIt.skipUsed();
return BeginIt;
}
iterator end() { return iterator(Bundles, Bundles.end(), nullptr, 0); }
unsigned size() const { return Bundles.size(); }

#ifndef NDEBUG
LLVM_DUMP_METHOD void dump() const;
#endif // NDEBUG
};

} // namespace llvm::sandboxir

#endif // LLVM_TRANSFORMS_VECTORIZE_SANDBOXVECTORIZER_SEEDCOLLECTOR_H
68 changes: 68 additions & 0 deletions llvm/lib/Transforms/Vectorize/SandboxVectorizer/SeedCollector.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,10 @@
using namespace llvm;
namespace llvm::sandboxir {

cl::opt<unsigned> SeedBundleSizeLimit(
"sbvec-seed-bundle-size-limit", cl::init(32), cl::Hidden,
cl::desc("Limit the size of the seed bundle to cap compilation time."));

MutableArrayRef<Instruction *> SeedBundle::getSlice(unsigned StartIdx,
unsigned MaxVecRegBits,
bool ForcePowerOf2) {
Expand Down Expand Up @@ -61,4 +65,68 @@ MutableArrayRef<Instruction *> SeedBundle::getSlice(unsigned StartIdx,
return {};
}

template <typename LoadOrStoreT>
SeedContainer::KeyT SeedContainer::getKey(LoadOrStoreT *LSI) const {
assert((isa<LoadInst>(LSI) || isa<StoreInst>(LSI)) &&
"Expected Load or Store!");
Value *Ptr = Utils::getMemInstructionBase(LSI);
Instruction::Opcode Op = LSI->getOpcode();
Type *Ty = Utils::getExpectedType(LSI);
if (auto *VTy = dyn_cast<VectorType>(Ty))
Ty = VTy->getElementType();
return {Ptr, Ty, Op};
}

// Explicit instantiations
template SeedContainer::KeyT
SeedContainer::getKey<LoadInst>(LoadInst *LSI) const;
template SeedContainer::KeyT
SeedContainer::getKey<StoreInst>(StoreInst *LSI) const;

bool SeedContainer::erase(Instruction *I) {
assert((isa<LoadInst>(I) || isa<StoreInst>(I)) && "Expected Load or Store!");
auto It = SeedLookupMap.find(I);
if (It == SeedLookupMap.end())
return false;
SeedBundle *Bndl = It->second;
Bndl->setUsed(I);
return true;
}

template <typename LoadOrStoreT> void SeedContainer::insert(LoadOrStoreT *LSI) {
// Find the bundle containing seeds for this symbol and type-of-access.
auto &BundleVec = Bundles[getKey(LSI)];
// Fill this vector of bundles front to back so that only the last bundle in
// the vector may have available space. This avoids iteration to find one with
// space.
if (BundleVec.empty() || BundleVec.back()->size() == SeedBundleSizeLimit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will this work when instructions are inserted in out-of-order offsets? Can you move instructions between bundles to make space for one that "completes" another bundle, or do you avoid this by pre-sorting by offsets before insertion (I assumed this is not the case due to the logic in insert checking Utils::atLowerAddress) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally this won't be a problem. Seeds are inserted into the bundle at the proper offset position in memory by the calculations in insert and then calling insertAt.

But you are right that if we exceed the seed count limit for a bundle, we could encounter a situation where a vectorizable group isn't found because it is split across two bundles. The limit is either 32 loads or stores with the same base symbol (a nice power-of-two, so hopefully vectorizable). I wouldn't expect to encounter it frequently for now, but long-term this is a problem that will need to be fixed, as noted in the TODO on line 184. Perhaps I should expand that explanation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG.

BundleVec.emplace_back(std::make_unique<MemSeedBundle<LoadOrStoreT>>(LSI));
else
BundleVec.back()->insert(LSI, SE);

SeedLookupMap[LSI] = BundleVec.back().get();
}

// Explicit instantiations
template void SeedContainer::insert<LoadInst>(LoadInst *);
template void SeedContainer::insert<StoreInst>(StoreInst *);

#ifndef NDEBUG
void SeedContainer::dump() const {
for (const auto &Pair : Bundles) {
auto [I, Ty, Opc] = Pair.first;
const auto &SeedsVec = Pair.second;
std::string RefType = dyn_cast<LoadInst>(I) ? "Load"
: dyn_cast<StoreInst>(I) ? "Store"
: "Other";
dbgs() << "[Inst=" << *I << " Ty=" << Ty << " " << RefType << "]\n";
for (const auto &SeedPtr : SeedsVec) {
SeedPtr->dump(dbgs());
dbgs() << "\n";
}
}
dbgs() << "\n";
}
#endif // NDEBUG

} // namespace llvm::sandboxir
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,15 @@ struct SeedBundleTest : public testing::Test {
}
};

// Stub class to make the abstract base class testable.
class SeedBundleForTest : public sandboxir::SeedBundle {
public:
using sandboxir::SeedBundle::SeedBundle;
void insert(sandboxir::Instruction *I, ScalarEvolution &SE) override {
insertAt(Seeds.end(), I);
}
};

TEST_F(SeedBundleTest, SeedBundle) {
parseIR(C, R"IR(
define void @foo(float %v0, i32 %i0, i16 %i1, i8 %i2) {
Expand All @@ -66,7 +75,7 @@ define void @foo(float %v0, i32 %i0, i16 %i1, i8 %i2) {
// Assume first two instructions are identical in the number of bits.
const unsigned IOBits = sandboxir::Utils::getNumBits(I0, DL);
// Constructor
sandboxir::SeedBundle SBO(I0);
SeedBundleForTest SBO(I0);
EXPECT_EQ(*SBO.begin(), I0);
// getNumUnusedBits after constructor
EXPECT_EQ(SBO.getNumUnusedBits(), IOBits);
Expand Down Expand Up @@ -103,7 +112,7 @@ define void @foo(float %v0, i32 %i0, i16 %i1, i8 %i2) {
EXPECT_EQ(BundleBits, 88u);
auto Seeds = Insts;
// Constructor
sandboxir::SeedBundle SB1(std::move(Seeds));
SeedBundleForTest SB1(std::move(Seeds));
// getNumUnusedBits after constructor
EXPECT_EQ(SB1.getNumUnusedBits(), BundleBits);
// setUsed with index
Expand Down Expand Up @@ -196,3 +205,66 @@ define void @foo(ptr %ptrA, float %val, ptr %ptr) {
sandboxir::LoadSeedBundle LB(std::move(Loads), SE);
EXPECT_THAT(LB, testing::ElementsAre(L0, L1, L2, L3));
}

TEST_F(SeedBundleTest, Container) {
parseIR(C, R"IR(
define void @foo(ptr %ptrA, float %val, ptr %ptrB) {
bb:
%gepA0 = getelementptr float, ptr %ptrA, i32 0
%gepA1 = getelementptr float, ptr %ptrA, i32 1
%gepB0 = getelementptr float, ptr %ptrB, i32 0
%gepB1 = getelementptr float, ptr %ptrB, i32 1
store float %val, ptr %gepA0
store float %val, ptr %gepA1
store float %val, ptr %gepB0
store float %val, ptr %gepB1
ret void
}
)IR");
Function &LLVMF = *M->getFunction("foo");

DominatorTree DT(LLVMF);
TargetLibraryInfoImpl TLII;
TargetLibraryInfo TLI(TLII);
DataLayout DL(M->getDataLayout());
LoopInfo LI(DT);
AssumptionCache AC(LLVMF);
ScalarEvolution SE(LLVMF, TLI, AC, DT, LI);

sandboxir::Context Ctx(C);
auto &F = *Ctx.createFunction(&LLVMF);
auto &BB = *F.begin();
auto It = std::next(BB.begin(), 4);
auto *S0 = cast<sandboxir::StoreInst>(&*It++);
auto *S1 = cast<sandboxir::StoreInst>(&*It++);
auto *S2 = cast<sandboxir::StoreInst>(&*It++);
auto *S3 = cast<sandboxir::StoreInst>(&*It++);
sandboxir::SeedContainer SC(SE);
// Check begin() end() when empty.
EXPECT_EQ(SC.begin(), SC.end());

SC.insert(S0);
SC.insert(S1);
SC.insert(S2);
SC.insert(S3);
unsigned Cnt = 0;
SmallVector<sandboxir::SeedBundle *> Bndls;
for (auto &SeedBndl : SC) {
EXPECT_EQ(SeedBndl.size(), 2u);
++Cnt;
Bndls.push_back(&SeedBndl);
}
EXPECT_EQ(Cnt, 2u);

// Mark them "Used" to check if operator++ skips them in the next loop.
for (auto *SeedBndl : Bndls)
for (auto Lane : seq<unsigned>(SeedBndl->size()))
SeedBndl->setUsed(Lane);
// Check if iterator::operator++ skips used lanes.
Cnt = 0;
for (auto &SeedBndl : SC) {
(void)SeedBndl;
++Cnt;
}
EXPECT_EQ(Cnt, 0u);
}
Loading