Commit c160c3c

[AArch64] Add an AArch64 pass for loop idiom transformations
We have added a new pass that looks for loops such as the following:

  while (i != max_len)
    if (a[i] != b[i])
      break;

  ... use index i ...

Although similar to a memcmp, this is slightly different because instead of returning the difference between the values of the first non-matching pair of bytes, it returns the index of the first mismatch. As such, we are not able to lower this to a memcmp call.

The new pass can now spot such idioms and transform them into a specialised predicated loop that gives a significant performance improvement for AArch64. It is intended as a stop-gap solution until this can be handled by the vectoriser, which doesn't currently deal with early exits.

This specialised loop makes use of a generic intrinsic that counts the trailing zero elements in a predicate vector. This was added in https://reviews.llvm.org/D159283 and for SVE we end up with brkb & incp instructions.

Although we have added this pass only for AArch64, it was written in a generic way so that in theory it could be used by other targets. Currently the pass requires scalable vector support and needs to know the minimum page size for the target, however it's possible to make it work for fixed-width vectors too. Also, the llvm.experimental.cttz.elts intrinsic used by the pass has generic lowering, but can be made efficient for targets with instructions similar to SVE's brkb, cntp and incp.

Original version of patch was posted on Phabricator: https://reviews.llvm.org/D158291

Patch co-authored by Kerry McLaughlin (@kmclaughlin-arm) and David Sherwood (@david-arm)

See the original discussion on Discourse: https://discourse.llvm.org/t/aarch64-target-specific-loop-idiom-recognition/72383
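To make the idiom from the commit message concrete, here is a minimal, self-contained C++ sketch. It is not the code the pass emits. `mismatchIndexScalar` spells out the scalar loop (the increment is an assumption, since the snippet in the message elides it), and `mismatchIndexChunked` imitates the shape of a predicated loop with a fixed 16-lane boolean array standing in for a scalable predicate vector. `kLanes`, `Predicate` and `countTrailingZeroElts` are hypothetical names introduced only for illustration; the last one mimics the role of `llvm.experimental.cttz.elts` by counting the false lanes before the first true lane.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar form of the idiom: returns the index of the first mismatch in
// [i, max_len), or max_len if there is none. Assumes i <= max_len.
std::size_t mismatchIndexScalar(const std::uint8_t *a, const std::uint8_t *b,
                                std::size_t i, std::size_t max_len) {
  while (i != max_len) {
    if (a[i] != b[i])
      break;
    ++i; // assumed increment; the commit message snippet does not show it
  }
  return i;
}

// Hypothetical fixed-width stand-in for a predicate vector.
constexpr std::size_t kLanes = 16;
using Predicate = std::array<bool, kLanes>;

// Counts the false elements before the first true element, starting at
// lane 0. Returns kLanes when every lane is false.
std::size_t countTrailingZeroElts(const Predicate &p) {
  std::size_t n = 0;
  while (n < kLanes && !p[n])
    ++n;
  return n;
}

// Chunked form: compare up to kLanes bytes per iteration, build a mismatch
// predicate, and locate the first mismatching lane with a single count.
std::size_t mismatchIndexChunked(const std::uint8_t *a, const std::uint8_t *b,
                                 std::size_t i, std::size_t max_len) {
  while (i < max_len) {
    // Lanes at or beyond max_len stay inactive (false), like a loop predicate.
    const std::size_t active = std::min(kLanes, max_len - i);
    Predicate mismatch{};
    for (std::size_t lane = 0; lane < active; ++lane)
      mismatch[lane] = a[i + lane] != b[i + lane];

    const std::size_t first = countTrailingZeroElts(mismatch);
    if (first < active)
      return i + first; // early exit at the first mismatching byte
    i += active;
  }
  return i;
}
```

Both functions return the same index for any input with `i <= max_len`. The chunked version only reads bytes inside `[i, max_len)`, so it sidesteps the page-boundary question that a real vector load has to deal with.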
1 parent d4cec1c commit c160c3c

12 files changed (+2423, -0 lines)
llvm/include/llvm/Analysis/TargetTransformInfo.h

Lines changed: 8 additions & 0 deletions
@@ -1155,6 +1155,9 @@ class TargetTransformInfo {
   /// \return The associativity of the cache level, if available.
   std::optional<unsigned> getCacheAssociativity(CacheLevel Level) const;
 
+  /// \return The minimum architectural page size for the target.
+  std::optional<unsigned> getMinPageSize() const;
+
   /// \return How much before a load we should place the prefetch
   /// instruction. This is currently measured in number of
   /// instructions.
@@ -1889,6 +1892,7 @@ class TargetTransformInfo::Concept {
   virtual std::optional<unsigned> getCacheSize(CacheLevel Level) const = 0;
   virtual std::optional<unsigned> getCacheAssociativity(CacheLevel Level)
       const = 0;
+  virtual std::optional<unsigned> getMinPageSize() const = 0;
 
   /// \return How much before a load we should place the prefetch
   /// instruction. This is currently measured in number of
@@ -2475,6 +2479,10 @@ class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {
     return Impl.getCacheAssociativity(Level);
   }
 
+  std::optional<unsigned> getMinPageSize() const override {
+    return Impl.getMinPageSize();
+  }
+
   /// Return the preferred prefetch distance in terms of instructions.
   ///
   unsigned getPrefetchDistance() const override {

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

Lines changed: 2 additions & 0 deletions
@@ -494,6 +494,8 @@ class TargetTransformInfoImplBase {
     llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
   }
 
+  std::optional<unsigned> getMinPageSize() const { return {}; }
+
   unsigned getPrefetchDistance() const { return 0; }
   unsigned getMinPrefetchStride(unsigned NumMemAccesses,
                                 unsigned NumStridedMemAccesses,
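The three header changes above follow the usual TargetTransformInfo layering: a base implementation supplies a conservative default (an empty optional), `Concept` declares the virtual hook, and `Model` forwards to whatever implementation it wraps. The sketch below reproduces that layering in a self-contained form. All type names (`BaseImplSketch`, `PagedTargetImplSketch`, `TTISketch`) are hypothetical stand-ins, and the 4096 returned by the target implementation is an assumed value chosen for illustration, not something stated in this diff.

```cpp
#include <memory>
#include <optional>
#include <utility>

// Hypothetical stand-in for the base implementation: page size unknown.
struct BaseImplSketch {
  std::optional<unsigned> getMinPageSize() const { return {}; }
};

// Hypothetical target implementation that hides the default.
// The value 4096 is an assumption made for this example.
struct PagedTargetImplSketch : BaseImplSketch {
  std::optional<unsigned> getMinPageSize() const { return 4096; }
};

// Hypothetical stand-in for the public wrapper with its Concept/Model pair.
class TTISketch {
  struct Concept {
    virtual ~Concept() = default;
    virtual std::optional<unsigned> getMinPageSize() const = 0;
  };

  template <typename T> struct Model final : Concept {
    T Impl;
    explicit Model(T Impl) : Impl(std::move(Impl)) {}
    // Forward to the wrapped implementation, as the diff does.
    std::optional<unsigned> getMinPageSize() const override {
      return Impl.getMinPageSize();
    }
  };

  std::unique_ptr<const Concept> TTIImpl;

public:
  template <typename T>
  explicit TTISketch(T Impl)
      : TTIImpl(std::make_unique<Model<T>>(std::move(Impl))) {}

  std::optional<unsigned> getMinPageSize() const {
    return TTIImpl->getMinPageSize();
  }
};
```

With this shape, a target that never overrides the hook reports "unknown" rather than a guessed page size, so a client pass can decline to transform instead of relying on an incorrect value.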

llvm/lib/Analysis/TargetTransformInfo.cpp

Lines changed: 4 additions & 0 deletions
@@ -753,6 +753,10 @@ TargetTransformInfo::getCacheAssociativity(CacheLevel Level) const {
   return TTIImpl->getCacheAssociativity(Level);
 }
 
+std::optional<unsigned> TargetTransformInfo::getMinPageSize() const {
+  return TTIImpl->getMinPageSize();
+}
+
 unsigned TargetTransformInfo::getPrefetchDistance() const {
   return TTIImpl->getPrefetchDistance();
 }
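The commit message says the pass needs the minimum page size but does not spell out how it is used. One plausible use, sketched here as an assumption rather than as the pass's actual logic, is to check that a wide read stays inside a single page, so that reading past the scalar loop's bound cannot touch an unmapped page. `staysWithinOnePage` is a hypothetical helper; it treats an unknown or non-power-of-two page size as "cannot prove it", which matches the conservative empty-optional default.

```cpp
#include <cstdint>
#include <optional>

// Returns true only when the byte range [addr, addr + numBytes) provably
// lies within one page of the given minimum page size.
bool staysWithinOnePage(std::uint64_t addr, std::uint64_t numBytes,
                        std::optional<unsigned> minPageSize) {
  // Unknown page size: be conservative and report "not provable".
  if (!minPageSize)
    return false;

  const std::uint64_t pageSize = *minPageSize;
  // The masking below is only valid for a non-zero power of two.
  if (pageSize == 0 || (pageSize & (pageSize - 1)) != 0)
    return false;

  if (numBytes == 0)
    return true; // an empty range touches no page

  const std::uint64_t last = addr + numBytes - 1;
  if (last < addr)
    return false; // address arithmetic wrapped around

  // Same page when the first and last byte share all bits above the offset.
  const std::uint64_t pageMask = ~(pageSize - 1);
  return (addr & pageMask) == (last & pageMask);
}
```

A caller would pass the result of `getMinPageSize()` straight through as the third argument, and fall back to the scalar loop whenever the helper returns false.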

llvm/lib/Target/AArch64/AArch64.h

Lines changed: 1 addition & 0 deletions
@@ -88,6 +88,7 @@ void initializeAArch64DeadRegisterDefinitionsPass(PassRegistry&);
 void initializeAArch64ExpandPseudoPass(PassRegistry &);
 void initializeAArch64GlobalsTaggingPass(PassRegistry &);
 void initializeAArch64LoadStoreOptPass(PassRegistry&);
+void initializeAArch64LoopIdiomTransformLegacyPassPass(PassRegistry &);
 void initializeAArch64LowerHomogeneousPrologEpilogPass(PassRegistry &);
 void initializeAArch64MIPeepholeOptPass(PassRegistry &);
 void initializeAArch64O0PreLegalizerCombinerPass(PassRegistry &);
