[SLP]Support vectorization of small strided loads only graph. #101659

alexey-bataev · 2024-08-02T12:12:20Z

If the graph consists of a single strided-load node, the compiler should
still try to vectorize it.
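
The effect of the one-line change can be sketched with a small standalone model. This is not LLVM's code: `Entry`, `EntryState::Other`, and both function names are hypothetical, and the reduction-only gather conditions present in the real `isFullyVectorizableTinyTree` are deliberately left out. Only the `Vectorize` and `StridedVectorize` state names come from the diff.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the tree entry states; only Vectorize and
// StridedVectorize appear in the patch, Other is a placeholder.
enum class EntryState { Vectorize, StridedVectorize, Other };

// Hypothetical, minimal stand-in for a tree entry.
struct Entry {
  EntryState State;
};

// Simplified model of the single-node check before the patch:
// a one-node tree is accepted only when its state is Vectorize.
bool acceptsSingleNodeBefore(const std::vector<Entry> &Tree) {
  return Tree.size() == 1 && Tree[0].State == EntryState::Vectorize;
}

// Simplified model after the patch: StridedVectorize is accepted as well.
bool acceptsSingleNodeAfter(const std::vector<Entry> &Tree) {
  return Tree.size() == 1 &&
         (Tree[0].State == EntryState::Vectorize ||
          Tree[0].State == EntryState::StridedVectorize);
}
```

In this model a tree holding one `StridedVectorize` entry is rejected by the first function and accepted by the second, which mirrors why the RISC-V reduction test below starts to vectorize.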

Created using spr 1.3.5

llvmbot · 2024-08-02T12:12:50Z

@llvm/pr-subscribers-llvm-transforms

Author: Alexey Bataev (alexey-bataev)

Changes

If the graph consists of a single strided-load node, the compiler should
still try to vectorize it.

Full diff: https://github.com/llvm/llvm-project/pull/101659.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+1)
(modified) llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll (+3-47)

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 8d2ce6bad6af7..9502148399ece 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10074,6 +10074,7 @@ bool BoUpSLP::isFullyVectorizableTinyTree(bool ForReduction) const {
   // We only handle trees of heights 1 and 2.
   if (VectorizableTree.size() == 1 &&
       (VectorizableTree[0]->State == TreeEntry::Vectorize ||
+       VectorizableTree[0]->State == TreeEntry::StridedVectorize ||
        (ForReduction &&
         AreVectorizableGathers(VectorizableTree[0].get(),
                                VectorizableTree[0]->Scalars.size()) &&
diff --git a/llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll b/llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll
index 77bd894eb78f1..ff3d2c4c59394 100644
--- a/llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll
+++ b/llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll
@@ -146,53 +146,9 @@ entry:
 define i64 @red_strided_ld_16xi64(ptr %ptr) {
 ; CHECK-LABEL: @red_strided_ld_16xi64(
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[LD0:%.*]] = load i64, ptr [[PTR:%.*]], align 8
-; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 2
-; CHECK-NEXT:    [[LD1:%.*]] = load i64, ptr [[GEP]], align 8
-; CHECK-NEXT:    [[ADD_1:%.*]] = add nuw nsw i64 [[LD0]], [[LD1]]
-; CHECK-NEXT:    [[GEP_1:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 4
-; CHECK-NEXT:    [[LD2:%.*]] = load i64, ptr [[GEP_1]], align 8
-; CHECK-NEXT:    [[ADD_2:%.*]] = add nuw nsw i64 [[ADD_1]], [[LD2]]
-; CHECK-NEXT:    [[GEP_2:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 6
-; CHECK-NEXT:    [[LD3:%.*]] = load i64, ptr [[GEP_2]], align 8
-; CHECK-NEXT:    [[ADD_3:%.*]] = add nuw nsw i64 [[ADD_2]], [[LD3]]
-; CHECK-NEXT:    [[GEP_3:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 8
-; CHECK-NEXT:    [[LD4:%.*]] = load i64, ptr [[GEP_3]], align 8
-; CHECK-NEXT:    [[ADD_4:%.*]] = add nuw nsw i64 [[ADD_3]], [[LD4]]
-; CHECK-NEXT:    [[GEP_4:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 10
-; CHECK-NEXT:    [[LD5:%.*]] = load i64, ptr [[GEP_4]], align 8
-; CHECK-NEXT:    [[ADD_5:%.*]] = add nuw nsw i64 [[ADD_4]], [[LD5]]
-; CHECK-NEXT:    [[GEP_5:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 12
-; CHECK-NEXT:    [[LD6:%.*]] = load i64, ptr [[GEP_5]], align 8
-; CHECK-NEXT:    [[ADD_6:%.*]] = add nuw nsw i64 [[ADD_5]], [[LD6]]
-; CHECK-NEXT:    [[GEP_6:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 14
-; CHECK-NEXT:    [[LD7:%.*]] = load i64, ptr [[GEP_6]], align 8
-; CHECK-NEXT:    [[ADD_7:%.*]] = add nuw nsw i64 [[ADD_6]], [[LD7]]
-; CHECK-NEXT:    [[GEP_7:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 16
-; CHECK-NEXT:    [[LD8:%.*]] = load i64, ptr [[GEP_7]], align 8
-; CHECK-NEXT:    [[ADD_8:%.*]] = add nuw nsw i64 [[ADD_7]], [[LD8]]
-; CHECK-NEXT:    [[GEP_8:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 18
-; CHECK-NEXT:    [[LD9:%.*]] = load i64, ptr [[GEP_8]], align 8
-; CHECK-NEXT:    [[ADD_9:%.*]] = add nuw nsw i64 [[ADD_8]], [[LD9]]
-; CHECK-NEXT:    [[GEP_9:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 20
-; CHECK-NEXT:    [[LD10:%.*]] = load i64, ptr [[GEP_9]], align 8
-; CHECK-NEXT:    [[ADD_10:%.*]] = add nuw nsw i64 [[ADD_9]], [[LD10]]
-; CHECK-NEXT:    [[GEP_10:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 22
-; CHECK-NEXT:    [[LD11:%.*]] = load i64, ptr [[GEP_10]], align 8
-; CHECK-NEXT:    [[ADD_11:%.*]] = add nuw nsw i64 [[ADD_10]], [[LD11]]
-; CHECK-NEXT:    [[GEP_11:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 24
-; CHECK-NEXT:    [[LD12:%.*]] = load i64, ptr [[GEP_11]], align 8
-; CHECK-NEXT:    [[ADD_12:%.*]] = add nuw nsw i64 [[ADD_11]], [[LD12]]
-; CHECK-NEXT:    [[GEP_12:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 26
-; CHECK-NEXT:    [[LD13:%.*]] = load i64, ptr [[GEP_12]], align 8
-; CHECK-NEXT:    [[ADD_13:%.*]] = add nuw nsw i64 [[ADD_12]], [[LD13]]
-; CHECK-NEXT:    [[GEP_13:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 28
-; CHECK-NEXT:    [[LD14:%.*]] = load i64, ptr [[GEP_13]], align 8
-; CHECK-NEXT:    [[ADD_14:%.*]] = add nuw nsw i64 [[ADD_13]], [[LD14]]
-; CHECK-NEXT:    [[GEP_14:%.*]] = getelementptr inbounds i64, ptr [[PTR]], i64 30
-; CHECK-NEXT:    [[LD15:%.*]] = load i64, ptr [[GEP_14]], align 8
-; CHECK-NEXT:    [[ADD_15:%.*]] = add nuw nsw i64 [[ADD_14]], [[LD15]]
-; CHECK-NEXT:    ret i64 [[ADD_15]]
+; CHECK-NEXT:    [[TMP0:%.*]] = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr align 8 [[PTR:%.*]], i64 16, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>, i32 16)
+; CHECK-NEXT:    [[TMP1:%.*]] = call i64 @llvm.vector.reduce.add.v16i64(<16 x i64> [[TMP0]])
+; CHECK-NEXT:    ret i64 [[TMP1]]
 ;
 entry:
   %ld0 = load i64, ptr %ptr
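
For readers less familiar with the intrinsics in the new CHECK lines, the following standalone C++ sketch models what the two forms of `red_strided_ld_16xi64` compute. It is an illustration under assumptions, not code from the patch: the function names `reduceScalar` and `reduceStrided` are hypothetical, the all-true mask is modeled by simply visiting every lane, and the stride is treated as a byte count (16 bytes, i.e. every second `i64`), matching the `i64 16` operand in the test.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Scalar form: the chain of 16 loads and 15 adds removed from the CHECK lines.
int64_t reduceScalar(const int64_t *Ptr) {
  int64_t Sum = Ptr[0];
  for (int I = 1; I < 16; ++I)
    Sum += Ptr[2 * I]; // element offsets 2, 4, ..., 30
  return Sum;
}

// Model of a strided load with a byte stride and all lanes active,
// followed by an add reduction over the loaded lanes.
int64_t reduceStrided(const int64_t *Ptr, int64_t StrideBytes, unsigned EVL) {
  const char *Base = reinterpret_cast<const char *>(Ptr);
  int64_t Sum = 0;
  for (unsigned Lane = 0; Lane < EVL; ++Lane) {
    int64_t Value;
    // Each lane reads one i64 at Base + Lane * StrideBytes.
    std::memcpy(&Value, Base + static_cast<int64_t>(Lane) * StrideBytes,
                sizeof(Value));
    Sum += Value;
  }
  return Sum;
}
```

With a 31-element input buffer, `reduceScalar(Data)` and `reduceStrided(Data, 16, 16)` read the same 16 elements and return the same sum.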

alexey-bataev · 2024-08-05T10:48:12Z

Ping!

RKSimon

LGTM

[𝘀𝗽𝗿] initial version

f3a976f

Created using spr 1.3.5

llvmbot added vectorizers llvm:transforms labels Aug 2, 2024

alexey-bataev requested a review from RKSimon August 2, 2024 12:12

alexey-bataev requested review from preames and topperc August 2, 2024 12:59

RKSimon approved these changes Aug 5, 2024

alexey-bataev merged commit 799fd3d into main Aug 5, 2024
7 of 10 checks passed

alexey-bataev deleted the users/alexey-bataev/spr/slpsupport-vectorization-of-small-strided-loads-only-graph branch August 5, 2024 16:51
