[RISCV] Use VP strided load in concat_vectors combine #98131
Conversation
After #98112 and #98111 this should be the last use of riscv_masked_strided_load. The diff is due to vp_load not having the same generic combine for bitcasts, where `(conv (load x)) -> (load (conv*) x)`. I don't think this makes much of a difference on RVV, and it doesn't seem to affect other patterns.
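For context, a minimal, hypothetical sketch of the kind of fold being referenced, i.e. how `(conv (load x)) -> (load (conv*) x)` works for plain ISD::LOAD nodes (the real DAGCombiner code carries many more legality checks; no equivalent exists for ISD::VP_LOAD):

```cpp
// Hypothetical, heavily simplified sketch of the generic
// (bitcast (load x)) -> (load x) fold that plain loads get:
// instead of loading in the source type and bitcasting, load
// directly in the destination type over the same memory operand.
static SDValue combineBitcastOfLoad(SDNode *N, SelectionDAG &DAG) {
  // N is an ISD::BITCAST whose operand may be a load.
  SDValue Src = N->getOperand(0);
  EVT DstVT = N->getValueType(0);
  auto *Ld = dyn_cast<LoadSDNode>(Src);
  if (!Ld || !Ld->isSimple() || !Src.hasOneUse() ||
      DstVT.getSizeInBits() != Src.getValueSizeInBits())
    return SDValue();
  // Load directly in the destination type, reusing the original
  // chain, pointer and MachineMemOperand.
  SDValue NewLd = DAG.getLoad(DstVT, SDLoc(N), Ld->getChain(),
                              Ld->getBasePtr(), Ld->getMemOperand());
  // Rewire chain users of the old load to the new one.
  DAG.ReplaceAllUsesOfValueWith(SDValue(Ld, 1), NewLd.getValue(1));
  return NewLd;
}
```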
@llvm/pr-subscribers-backend-risc-v
Author: Luke Lau (lukel97)
Full diff: https://github.com/llvm/llvm-project/pull/98131.diff — 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 7972b9abc456c..a6f2641b48004 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -16205,18 +16205,10 @@ static SDValue performCONCAT_VECTORSCombine(SDNode *N, SelectionDAG &DAG,
if (MustNegateStride)
Stride = DAG.getNegative(Stride, DL, Stride.getValueType());
- SDVTList VTs = DAG.getVTList({WideVecVT, MVT::Other});
- SDValue IntID =
- DAG.getTargetConstant(Intrinsic::riscv_masked_strided_load, DL,
- Subtarget.getXLenVT());
-
SDValue AllOneMask =
DAG.getSplat(WideVecVT.changeVectorElementType(MVT::i1), DL,
DAG.getConstant(1, DL, MVT::i1));
- SDValue Ops[] = {BaseLd->getChain(), IntID, DAG.getUNDEF(WideVecVT),
- BaseLd->getBasePtr(), Stride, AllOneMask};
-
uint64_t MemSize;
if (auto *ConstStride = dyn_cast<ConstantSDNode>(Stride);
ConstStride && ConstStride->getSExtValue() >= 0)
@@ -16232,8 +16224,11 @@ static SDValue performCONCAT_VECTORSCombine(SDNode *N, SelectionDAG &DAG,
BaseLd->getPointerInfo(), BaseLd->getMemOperand()->getFlags(), MemSize,
Align);
- SDValue StridedLoad = DAG.getMemIntrinsicNode(ISD::INTRINSIC_W_CHAIN, DL, VTs,
- Ops, WideVecVT, MMO);
+ SDValue StridedLoad = DAG.getStridedLoadVP(
+ WideVecVT, DL, BaseLd->getChain(), BaseLd->getBasePtr(), Stride,
+ AllOneMask,
+ DAG.getConstant(N->getNumOperands(), DL, Subtarget.getXLenVT()), MMO);
+
for (SDValue Ld : N->ops())
DAG.makeEquivalentMemoryOrdering(cast<LoadSDNode>(Ld), StridedLoad);
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-combine.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-combine.ll
index 0e1105848440a..cdf0d35843620 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-combine.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-combine.ll
@@ -9,9 +9,9 @@
define void @widen_2xv4i16(ptr %x, ptr %z) {
; CHECK-LABEL: widen_2xv4i16:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetivli zero, 8, e16, m1, ta, ma
-; CHECK-NEXT: vle16.v v8, (a0)
-; CHECK-NEXT: vse16.v v8, (a1)
+; CHECK-NEXT: vsetivli zero, 2, e64, m1, ta, ma
+; CHECK-NEXT: vle64.v v8, (a0)
+; CHECK-NEXT: vse64.v v8, (a1)
; CHECK-NEXT: ret
%a = load <4 x i16>, ptr %x
%b.gep = getelementptr i8, ptr %x, i64 8
@@ -52,9 +52,9 @@ define void @widen_3xv4i16(ptr %x, ptr %z) {
define void @widen_4xv4i16(ptr %x, ptr %z) {
; CHECK-LABEL: widen_4xv4i16:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetivli zero, 16, e16, m2, ta, ma
-; CHECK-NEXT: vle16.v v8, (a0)
-; CHECK-NEXT: vse16.v v8, (a1)
+; CHECK-NEXT: vsetivli zero, 4, e64, m2, ta, ma
+; CHECK-NEXT: vle64.v v8, (a0)
+; CHECK-NEXT: vse64.v v8, (a1)
; CHECK-NEXT: ret
%a = load <4 x i16>, ptr %x
%b.gep = getelementptr i8, ptr %x, i64 8
@@ -90,9 +90,9 @@ define void @widen_4xv4i16_unaligned(ptr %x, ptr %z) {
;
; RV64-MISALIGN-LABEL: widen_4xv4i16_unaligned:
; RV64-MISALIGN: # %bb.0:
-; RV64-MISALIGN-NEXT: vsetivli zero, 16, e16, m2, ta, ma
-; RV64-MISALIGN-NEXT: vle16.v v8, (a0)
-; RV64-MISALIGN-NEXT: vse16.v v8, (a1)
+; RV64-MISALIGN-NEXT: vsetivli zero, 4, e64, m2, ta, ma
+; RV64-MISALIGN-NEXT: vle64.v v8, (a0)
+; RV64-MISALIGN-NEXT: vse64.v v8, (a1)
; RV64-MISALIGN-NEXT: ret
%a = load <4 x i16>, ptr %x, align 1
%b.gep = getelementptr i8, ptr %x, i64 8
; CHECK-NEXT: vsetivli zero, 8, e16, m1, ta, ma
; CHECK-NEXT: vle16.v v8, (a0)
; CHECK-NEXT: vse16.v v8, (a1)
; CHECK-NEXT: vsetivli zero, 2, e64, m1, ta, ma
This may cause an unaligned access error?
The combine checks the alignment with TLI.isLegalStridedLoadStore(WideVecVT, Align), so I think this should be fine. This is what it was originally combining to anyway; the generic DAG combine was just re-combining it to a smaller SEW.
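For readers following the thread, the guard sits earlier in performCONCAT_VECTORSCombine; roughly this shape (a paraphrased sketch using variable names from the diff above, not the verbatim code):

```cpp
// Simplified sketch of the legality check: bail out of the combine
// unless the target reports that a strided load/store of the wide
// vector type at this alignment is legal.
Align Alignment = BaseLd->getAlign();
if (!TLI.isLegalStridedLoadStore(WideVecVT, Alignment))
  return SDValue();
```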
So the alignment of `<4 x i16>` is 64 bits, IIUC? Then this makes sense to me.
LGTM.
…98111) RISCVGatherScatterLowering is the last user of riscv_masked_strided_{load,store} after #98131 and #98112; this patch changes it to emit the VP equivalent instead. This allows us to remove the masked_strided intrinsics so we only have one lowering path. riscv_masked_strided_{load,store} didn't have AVL operands and were always VLMAX, so this passes the fixed or scalable element count in as the EVL instead, which RISCVVectorPeephole should now convert to VLMAX after #97800. For loads we also use a vp_select to get passthru (mask undisturbed) behaviour.
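A rough sketch of what that IR-level lowering could look like, assuming placeholder names (emitStridedVPLoad, InsertPt, BasePtr, Stride, Mask, PassThru are illustrative, not the patch's actual code):

```cpp
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
using namespace llvm;

// Hedged sketch, not the actual patch: emit
// llvm.experimental.vp.strided.load with the full element count as
// the EVL (the removed intrinsic was implicitly VLMAX), then select
// on the mask to recover the passthru (mask-undisturbed) semantics.
static Value *emitStridedVPLoad(Instruction *InsertPt, Value *BasePtr,
                                Value *Stride, Value *Mask,
                                Value *PassThru) {
  IRBuilder<> Builder(InsertPt);
  auto *VecTy = cast<VectorType>(PassThru->getType());
  // Fixed or scalable element count as the EVL; RISCVVectorPeephole
  // can later fold this back to VLMAX.
  Value *EVL = Builder.CreateElementCount(Builder.getInt32Ty(),
                                          VecTy->getElementCount());
  CallInst *VPLoad = Builder.CreateIntrinsic(
      Intrinsic::experimental_vp_strided_load,
      {VecTy, BasePtr->getType(), Stride->getType()},
      {BasePtr, Stride, Mask, EVL});
  // Masked-off lanes take the passthru value (mask undisturbed).
  return Builder.CreateSelect(Mask, VPLoad, PassThru);
}
```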