-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SLP] Use named structs in vectorizeStores() (NFC) #132781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-vectorizers Author: Gaëtan Bossu (gbossu) ChangesThis is a mostly straightforward replacement of the previous I had done that change in my local tree to help me better understand the code. It’s not very invasive, so I thought I’d create a PR for it. Full diff: https://github.com/llvm/llvm-project/pull/132781.diff 1 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 95d697bbd734a..a3d282bdb615f 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -19923,31 +19923,51 @@ bool SLPVectorizerPass::vectorizeStores(
BoUpSLP::ValueSet VectorizedStores;
bool Changed = false;
+ /// A store instruction and the distance of its address to a base pointer.
+ struct CandidateInstr {
+ CandidateInstr(unsigned InstrIdx, int DistToBasePtr)
+ : InstrIdx(InstrIdx), DistToBasePtr(DistToBasePtr) {}
+ unsigned InstrIdx;
+ int DistToBasePtr;
+ };
struct StoreDistCompare {
- bool operator()(const std::pair<unsigned, int> &Op1,
- const std::pair<unsigned, int> &Op2) const {
- return Op1.second < Op2.second;
+ bool operator()(const CandidateInstr &Op1,
+ const CandidateInstr &Op2) const {
+ return Op1.DistToBasePtr < Op2.DistToBasePtr;
}
};
- // A set of pairs (index of store in Stores array ref, Distance of the store
- // address relative to base store address in units).
- using StoreIndexToDistSet =
- std::set<std::pair<unsigned, int>, StoreDistCompare>;
- auto TryToVectorize = [&](const StoreIndexToDistSet &Set) {
+
+ /// A group of store instructions that we'll try to bundle together.
+ /// They are ordered using their signed distance to the address of this
+ /// set's BaseInstr.
+ struct CandidateBundle {
+ CandidateBundle(unsigned BaseInstrIdx) { reset(BaseInstrIdx); }
+ void reset(unsigned NewBaseInstr) {
+ BaseInstrIdx = NewBaseInstr;
+ Instrs.clear();
+ Instrs.emplace(NewBaseInstr, 0);
+ }
+ // TODO: This should probably just be an std::map
+ using CandidateSet = std::set<CandidateInstr, StoreDistCompare>;
+ unsigned BaseInstrIdx;
+ CandidateSet Instrs;
+ };
+
+ auto TryToVectorize = [&](const CandidateBundle::CandidateSet &StoreSeq) {
int PrevDist = -1;
BoUpSLP::ValueList Operands;
// Collect the chain into a list.
- for (auto [Idx, Data] : enumerate(Set)) {
- if (Operands.empty() || Data.second - PrevDist == 1) {
- Operands.push_back(Stores[Data.first]);
- PrevDist = Data.second;
- if (Idx != Set.size() - 1)
+ for (auto [Idx, Data] : enumerate(StoreSeq)) {
+ if (Operands.empty() || Data.DistToBasePtr - PrevDist == 1) {
+ Operands.push_back(Stores[Data.InstrIdx]);
+ PrevDist = Data.DistToBasePtr;
+ if (Idx != StoreSeq.size() - 1)
continue;
}
auto E = make_scope_exit([&, &DataVar = Data]() {
Operands.clear();
- Operands.push_back(Stores[DataVar.first]);
- PrevDist = DataVar.second;
+ Operands.push_back(Stores[DataVar.InstrIdx]);
+ PrevDist = DataVar.DistToBasePtr;
});
if (Operands.size() <= 1 ||
@@ -20214,7 +20234,8 @@ bool SLPVectorizerPass::vectorizeStores(
// Need to store the index of the very first store separately, since the set
// may be reordered after the insertion and the first store may be moved. This
// container allows to reduce number of calls of getPointersDiff() function.
- SmallVector<std::pair<unsigned, StoreIndexToDistSet>> SortedStores;
+ SmallVector<CandidateBundle> SortedStores;
+
// Inserts the specified store SI with the given index Idx to the set of the
// stores. If the store with the same distance is found already - stop
// insertion, try to vectorize already found stores. If some stores from this
@@ -20248,31 +20269,27 @@ bool SLPVectorizerPass::vectorizeStores(
// dependencies and no need to waste compile time to try to vectorize them.
// - Try to vectorize the sequence {1, {1, 0}, {3, 2}}.
auto FillStoresSet = [&](unsigned Idx, StoreInst *SI) {
- for (std::pair<unsigned, StoreIndexToDistSet> &Set : SortedStores) {
+ for (CandidateBundle &StoreSeq : SortedStores) {
std::optional<int> Diff = getPointersDiff(
- Stores[Set.first]->getValueOperand()->getType(),
- Stores[Set.first]->getPointerOperand(),
+ Stores[StoreSeq.BaseInstrIdx]->getValueOperand()->getType(),
+ Stores[StoreSeq.BaseInstrIdx]->getPointerOperand(),
SI->getValueOperand()->getType(), SI->getPointerOperand(), *DL, *SE,
/*StrictCheck=*/true);
if (!Diff)
continue;
- auto It = Set.second.find(std::make_pair(Idx, *Diff));
- if (It == Set.second.end()) {
- Set.second.emplace(Idx, *Diff);
+ auto It = StoreSeq.Instrs.find({Idx, *Diff});
+ if (It == StoreSeq.Instrs.end()) {
+ StoreSeq.Instrs.emplace(Idx, *Diff);
return;
}
// Try to vectorize the first found set to avoid duplicate analysis.
- TryToVectorize(Set.second);
- unsigned ItIdx = It->first;
- int ItDist = It->second;
- StoreIndexToDistSet PrevSet;
- copy_if(Set.second, std::inserter(PrevSet, PrevSet.end()),
- [&](const std::pair<unsigned, int> &Pair) {
- return Pair.first > ItIdx;
- });
- Set.second.clear();
- Set.first = Idx;
- Set.second.emplace(Idx, 0);
+ TryToVectorize(StoreSeq.Instrs);
+ unsigned ItIdx = It->InstrIdx;
+ int ItDist = It->DistToBasePtr;
+ CandidateBundle::CandidateSet PrevSet;
+ copy_if(StoreSeq.Instrs, std::inserter(PrevSet, PrevSet.end()),
+ [&](const CandidateInstr &I) { return I.InstrIdx > ItIdx; });
+ StoreSeq.reset(Idx);
// Insert stores that followed previous match to try to vectorize them
// with this store.
unsigned StartIdx = ItIdx + 1;
@@ -20280,24 +20297,23 @@ bool SLPVectorizerPass::vectorizeStores(
// Distances to previously found dup store (or this store, since they
// store to the same addresses).
SmallVector<int> Dists(Idx - StartIdx, 0);
- for (const std::pair<unsigned, int> &Pair : reverse(PrevSet)) {
+ for (const CandidateInstr &Store : reverse(PrevSet)) {
// Do not try to vectorize sequences, we already tried.
- if (VectorizedStores.contains(Stores[Pair.first]))
+ if (VectorizedStores.contains(Stores[Store.InstrIdx]))
break;
- unsigned BI = Pair.first - StartIdx;
+ unsigned BI = Store.InstrIdx - StartIdx;
UsedStores.set(BI);
- Dists[BI] = Pair.second - ItDist;
+ Dists[BI] = Store.DistToBasePtr - ItDist;
}
for (unsigned I = StartIdx; I < Idx; ++I) {
unsigned BI = I - StartIdx;
if (UsedStores.test(BI))
- Set.second.emplace(I, Dists[BI]);
+ StoreSeq.Instrs.emplace(I, Dists[BI]);
}
return;
}
- auto &Res = SortedStores.emplace_back();
- Res.first = Idx;
- Res.second.emplace(Idx, 0);
+ // We did not find a comparable store, start a new sequence.
+ SortedStores.emplace_back(Idx);
};
Type *PrevValTy = nullptr;
for (auto [I, SI] : enumerate(Stores)) {
@@ -20307,8 +20323,8 @@ bool SLPVectorizerPass::vectorizeStores(
PrevValTy = SI->getValueOperand()->getType();
// Check that we do not try to vectorize stores of different types.
if (PrevValTy != SI->getValueOperand()->getType()) {
- for (auto &Set : SortedStores)
- TryToVectorize(Set.second);
+ for (CandidateBundle &StoreSeq : SortedStores)
+ TryToVectorize(StoreSeq.Instrs);
SortedStores.clear();
PrevValTy = SI->getValueOperand()->getType();
}
@@ -20316,8 +20332,8 @@ bool SLPVectorizerPass::vectorizeStores(
}
// Final vectorization attempt.
- for (auto &Set : SortedStores)
- TryToVectorize(Set.second);
+ for (CandidateBundle &StoreSeq : SortedStores)
+ TryToVectorize(StoreSeq.Instrs);
return Changed;
}
|
@llvm/pr-subscribers-llvm-transforms Author: Gaëtan Bossu (gbossu) ChangesThis is a mostly straightforward replacement of the previous I had done that change in my local tree to help me better understand the code. It’s not very invasive, so I thought I’d create a PR for it. Full diff: https://github.com/llvm/llvm-project/pull/132781.diff 1 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 95d697bbd734a..a3d282bdb615f 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -19923,31 +19923,51 @@ bool SLPVectorizerPass::vectorizeStores(
BoUpSLP::ValueSet VectorizedStores;
bool Changed = false;
+ /// A store instruction and the distance of its address to a base pointer.
+ struct CandidateInstr {
+ CandidateInstr(unsigned InstrIdx, int DistToBasePtr)
+ : InstrIdx(InstrIdx), DistToBasePtr(DistToBasePtr) {}
+ unsigned InstrIdx;
+ int DistToBasePtr;
+ };
struct StoreDistCompare {
- bool operator()(const std::pair<unsigned, int> &Op1,
- const std::pair<unsigned, int> &Op2) const {
- return Op1.second < Op2.second;
+ bool operator()(const CandidateInstr &Op1,
+ const CandidateInstr &Op2) const {
+ return Op1.DistToBasePtr < Op2.DistToBasePtr;
}
};
- // A set of pairs (index of store in Stores array ref, Distance of the store
- // address relative to base store address in units).
- using StoreIndexToDistSet =
- std::set<std::pair<unsigned, int>, StoreDistCompare>;
- auto TryToVectorize = [&](const StoreIndexToDistSet &Set) {
+
+ /// A group of store instructions that we'll try to bundle together.
+ /// They are ordered using their signed distance to the address of this
+ /// set's BaseInstr.
+ struct CandidateBundle {
+ CandidateBundle(unsigned BaseInstrIdx) { reset(BaseInstrIdx); }
+ void reset(unsigned NewBaseInstr) {
+ BaseInstrIdx = NewBaseInstr;
+ Instrs.clear();
+ Instrs.emplace(NewBaseInstr, 0);
+ }
+ // TODO: This should probably just be an std::map
+ using CandidateSet = std::set<CandidateInstr, StoreDistCompare>;
+ unsigned BaseInstrIdx;
+ CandidateSet Instrs;
+ };
+
+ auto TryToVectorize = [&](const CandidateBundle::CandidateSet &StoreSeq) {
int PrevDist = -1;
BoUpSLP::ValueList Operands;
// Collect the chain into a list.
- for (auto [Idx, Data] : enumerate(Set)) {
- if (Operands.empty() || Data.second - PrevDist == 1) {
- Operands.push_back(Stores[Data.first]);
- PrevDist = Data.second;
- if (Idx != Set.size() - 1)
+ for (auto [Idx, Data] : enumerate(StoreSeq)) {
+ if (Operands.empty() || Data.DistToBasePtr - PrevDist == 1) {
+ Operands.push_back(Stores[Data.InstrIdx]);
+ PrevDist = Data.DistToBasePtr;
+ if (Idx != StoreSeq.size() - 1)
continue;
}
auto E = make_scope_exit([&, &DataVar = Data]() {
Operands.clear();
- Operands.push_back(Stores[DataVar.first]);
- PrevDist = DataVar.second;
+ Operands.push_back(Stores[DataVar.InstrIdx]);
+ PrevDist = DataVar.DistToBasePtr;
});
if (Operands.size() <= 1 ||
@@ -20214,7 +20234,8 @@ bool SLPVectorizerPass::vectorizeStores(
// Need to store the index of the very first store separately, since the set
// may be reordered after the insertion and the first store may be moved. This
// container allows to reduce number of calls of getPointersDiff() function.
- SmallVector<std::pair<unsigned, StoreIndexToDistSet>> SortedStores;
+ SmallVector<CandidateBundle> SortedStores;
+
// Inserts the specified store SI with the given index Idx to the set of the
// stores. If the store with the same distance is found already - stop
// insertion, try to vectorize already found stores. If some stores from this
@@ -20248,31 +20269,27 @@ bool SLPVectorizerPass::vectorizeStores(
// dependencies and no need to waste compile time to try to vectorize them.
// - Try to vectorize the sequence {1, {1, 0}, {3, 2}}.
auto FillStoresSet = [&](unsigned Idx, StoreInst *SI) {
- for (std::pair<unsigned, StoreIndexToDistSet> &Set : SortedStores) {
+ for (CandidateBundle &StoreSeq : SortedStores) {
std::optional<int> Diff = getPointersDiff(
- Stores[Set.first]->getValueOperand()->getType(),
- Stores[Set.first]->getPointerOperand(),
+ Stores[StoreSeq.BaseInstrIdx]->getValueOperand()->getType(),
+ Stores[StoreSeq.BaseInstrIdx]->getPointerOperand(),
SI->getValueOperand()->getType(), SI->getPointerOperand(), *DL, *SE,
/*StrictCheck=*/true);
if (!Diff)
continue;
- auto It = Set.second.find(std::make_pair(Idx, *Diff));
- if (It == Set.second.end()) {
- Set.second.emplace(Idx, *Diff);
+ auto It = StoreSeq.Instrs.find({Idx, *Diff});
+ if (It == StoreSeq.Instrs.end()) {
+ StoreSeq.Instrs.emplace(Idx, *Diff);
return;
}
// Try to vectorize the first found set to avoid duplicate analysis.
- TryToVectorize(Set.second);
- unsigned ItIdx = It->first;
- int ItDist = It->second;
- StoreIndexToDistSet PrevSet;
- copy_if(Set.second, std::inserter(PrevSet, PrevSet.end()),
- [&](const std::pair<unsigned, int> &Pair) {
- return Pair.first > ItIdx;
- });
- Set.second.clear();
- Set.first = Idx;
- Set.second.emplace(Idx, 0);
+ TryToVectorize(StoreSeq.Instrs);
+ unsigned ItIdx = It->InstrIdx;
+ int ItDist = It->DistToBasePtr;
+ CandidateBundle::CandidateSet PrevSet;
+ copy_if(StoreSeq.Instrs, std::inserter(PrevSet, PrevSet.end()),
+ [&](const CandidateInstr &I) { return I.InstrIdx > ItIdx; });
+ StoreSeq.reset(Idx);
// Insert stores that followed previous match to try to vectorize them
// with this store.
unsigned StartIdx = ItIdx + 1;
@@ -20280,24 +20297,23 @@ bool SLPVectorizerPass::vectorizeStores(
// Distances to previously found dup store (or this store, since they
// store to the same addresses).
SmallVector<int> Dists(Idx - StartIdx, 0);
- for (const std::pair<unsigned, int> &Pair : reverse(PrevSet)) {
+ for (const CandidateInstr &Store : reverse(PrevSet)) {
// Do not try to vectorize sequences, we already tried.
- if (VectorizedStores.contains(Stores[Pair.first]))
+ if (VectorizedStores.contains(Stores[Store.InstrIdx]))
break;
- unsigned BI = Pair.first - StartIdx;
+ unsigned BI = Store.InstrIdx - StartIdx;
UsedStores.set(BI);
- Dists[BI] = Pair.second - ItDist;
+ Dists[BI] = Store.DistToBasePtr - ItDist;
}
for (unsigned I = StartIdx; I < Idx; ++I) {
unsigned BI = I - StartIdx;
if (UsedStores.test(BI))
- Set.second.emplace(I, Dists[BI]);
+ StoreSeq.Instrs.emplace(I, Dists[BI]);
}
return;
}
- auto &Res = SortedStores.emplace_back();
- Res.first = Idx;
- Res.second.emplace(Idx, 0);
+ // We did not find a comparable store, start a new sequence.
+ SortedStores.emplace_back(Idx);
};
Type *PrevValTy = nullptr;
for (auto [I, SI] : enumerate(Stores)) {
@@ -20307,8 +20323,8 @@ bool SLPVectorizerPass::vectorizeStores(
PrevValTy = SI->getValueOperand()->getType();
// Check that we do not try to vectorize stores of different types.
if (PrevValTy != SI->getValueOperand()->getType()) {
- for (auto &Set : SortedStores)
- TryToVectorize(Set.second);
+ for (CandidateBundle &StoreSeq : SortedStores)
+ TryToVectorize(StoreSeq.Instrs);
SortedStores.clear();
PrevValTy = SI->getValueOperand()->getType();
}
@@ -20316,8 +20332,8 @@ bool SLPVectorizerPass::vectorizeStores(
}
// Final vectorization attempt.
- for (auto &Set : SortedStores)
- TryToVectorize(Set.second);
+ for (CandidateBundle &StoreSeq : SortedStores)
+ TryToVectorize(StoreSeq.Instrs);
return Changed;
}
|
FYI @alexey-bataev , let me know if you're fine with this small change. If so, I wouldn't mind pushing other NFC patches as I browse through the code 🙂 |
/// A group of store instructions that we'll try to bundle together. | ||
/// They are ordered using their signed distance to the address of this | ||
/// set's BaseInstr. | ||
struct CandidateBundle { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bundle
is reserved for scheduling model. Better to use something like StoresIndicesDistsSet
or keep the original name.
Also, better to move these structs out of function into the anonymous namespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bundle
is reserved for scheduling model
Agree, I think it's better not to use it. That actually did confuse me when I first saw this word in the SLPVectorizer 😅 I'd still prefer to use a word that describes the intent rather than a word that just gives an implementation detail. What would you think about ComparableStores
, or ComparableMemInsts
in case it's later extended to loads? Reading through the code, I think this is what we use the data structure for: it collects all pointers based off of a same base, and tries to splice that list to create vector ops.
Edit: I find RelatedMemInsts
is maybe a better name compared to ComparableMemInsts
.
Also, better to move these structs out of function into the anonymous namespace
Agree again, I just assumed you'd prefer it this way because you seem to prefer lambdas over free functions. I will move the declarations out 🙂
I made some small changes @alexey-bataev , let me know if that works for you 🙂 |
b36818f
to
df0add1
Compare
Any more comments @alexey-bataev ? |
df0add1
to
8687fac
Compare
I have pushed some more changes, the code is even simpler now after changing the |
I'd be happy if we could push this small change upstream @alexey-bataev , I think it's an incremental improvement for readability. What do you think? |
Ping |
std::optional<unsigned> PrevInst = StoreSeq.getInstIdx(/*PtrDist=*/*Diff); | ||
if (!PrevInst) { | ||
StoreSeq.insert(Idx, *Diff); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid double lookup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But what is the problem with it? Decoupled operations are almost always more readable, and as I mentioned here this won't impact performance, especially as there already was a double lookup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to have a single lookup, I'm hoping this can be committed now
8687fac
to
ab37453
Compare
Ping |
/// \p PtrDist. | ||
/// Does nothing if there is already an instruction with that \p PtrDist. | ||
/// \returns The previously associated Instruction index, or std::nullopt | ||
std::optional<unsigned> insertOrLookup(unsigned InstrIdx, int PtrDist) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool insertOrLookup(unsigned InstrIdx, int PtrDist) {
and use InstrIdx, where needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the value this function returns is the previous index (if it exists), not the current index (which is what the parameter would be at the time of calling).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's exactly why it would have been clearer to have two different functions, one for inserting, one for doing lookups.
/// A group of instructions that we'll try to bundle together using vector ops. | ||
/// They are ordered using the signed distance of their address operand to the | ||
/// address of this group's BaseInstr. | ||
struct RelatedStoreInsts { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make it a class and make BaseInstrIdx private
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I kept this public is to avoid mixing struct/class semantics, because the Instrs
attribute will still be public. I had a WIP change locally to encapsulate more within RelatedStoreInsts
and make it truly own its attributes. However, that requires a lot of reorganization and I'm not happy with the result I got at that point.
TLDR: I'll make at least BaseInstrIdx
private, I agree it's better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay even making BaseInstrIdx
private is not straight-forward and goes beyond the scope of this change. 😞 As I said I have some plans to simplify vectorizeStores()
further and use a bit more OOP, but I want to keep that initial patch simple and concise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What prevents you from making BaseInstrIdx (and Instrs) private and make it a class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to refactor the code one step at a time so it's easy to review (and follow in the commits history as well). Currently my goal is to get rid of the std::pair<int, std::set<std::pair<...>>>
data structure that was used, and introduce something a bit more readable. Once this is done, I'll open a follow-up PR that brings OOP and simplifies vectorizeStores()
further, to the point where there is some encapsulation and RelatedStoreInsts
is an actual class.
I'll apply the latest nits, and then I do hope this can be committed. It's not clear to me why there is so much push-back for a small change that is a (tiny) step towards making the code somewhat more readable. |
This is a mostly straightforward replacement of the previous std::pair<int, std::set<std::pair<...>>> data structure with slightly more readable alternatives. One can use a more standard std::map for mapping pointer distances to instruction indices. This is a tiny change which hopefully will enable further refactoring.
ab37453
to
35e2206
Compare
@alexey-bataev Are you fine with the changes now? |
Thank you 🙂 |
Merging on @gbossu's request (as he does not have commit access yet). |
This is a mostly straightforward replacement of the previous
std::pair<int, std::set<std::pair<...>>>
data structure used inSLPVectorizerPass::vectorizeStores()
with slightly more readable alternatives.I had done that change in my local tree to help me better understand the code. It’s not very invasive, so I thought I’d create a PR for it.