-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[RISCV] Emit VP strided load in mgather combine. NFCI #98112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This combine is a duplication of the transform in RISCVGatherScatterLowering but at the SelectionDAG level, so similarly to llvm#98111 we can replace the use of riscv_masked_strided_load with a VP strided load. Unlike llvm#98111 we don't require llvm#97800 or llvm#97798 since it only operates on fixed vectors with a non-zero stride.
@llvm/pr-subscribers-backend-risc-v Author: Luke Lau (lukel97) ChangesThis combine is a duplication of the transform in RISCVGatherScatterLowering but at the SelectionDAG level, so similarly to #98111 we can replace the use of riscv_masked_strided_load with a VP strided load. Unlike #98111 we don't require #97800 or #97798 since it only operates on fixed vectors with a non-zero stride. Full diff: https://github.com/llvm/llvm-project/pull/98112.diff 1 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 7972b9abc456c..e488b9f327582 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -17060,15 +17060,16 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
SDValue BasePtr = DAG.getNode(ISD::ADD, DL, PtrVT, MGN->getBasePtr(),
DAG.getConstant(Addend, DL, PtrVT));
- SDVTList VTs = DAG.getVTList({VT, MVT::Other});
- SDValue IntID =
- DAG.getTargetConstant(Intrinsic::riscv_masked_strided_load, DL,
- XLenVT);
- SDValue Ops[] =
- {MGN->getChain(), IntID, MGN->getPassThru(), BasePtr,
- DAG.getConstant(StepNumerator, DL, XLenVT), MGN->getMask()};
- return DAG.getMemIntrinsicNode(ISD::INTRINSIC_W_CHAIN, DL, VTs,
- Ops, VT, MGN->getMemOperand());
+ SDValue EVL = DAG.getElementCount(DL, Subtarget.getXLenVT(),
+ VT.getVectorElementCount());
+ SDValue StridedLoad =
+ DAG.getStridedLoadVP(VT, DL, MGN->getChain(), BasePtr,
+ DAG.getConstant(StepNumerator, DL, XLenVT),
+ MGN->getMask(), EVL, MGN->getMemOperand());
+ SDValue VPSelect = DAG.getNode(ISD::VP_SELECT, DL, VT, MGN->getMask(),
+ StridedLoad, MGN->getPassThru(), EVL);
+ return DAG.getMergeValues({VPSelect, SDValue(StridedLoad.getNode(), 1)},
+ DL);
}
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
After llvm#98112 and llvm#98111 this should be the last use of riscv_masked_strided_load. The diff is due to vp_load not having the same generic combine for bitcasts where `(conv (load x)) -> (load (conv*)x)`. I don't think this makes much of a difference on RVV, and it doesn't seem to affect other patterns.
This combine is a duplication of the transform in RISCVGatherScatterLowering but at the SelectionDAG level, so similarly to llvm#98111 we can replace the use of riscv_masked_strided_load with a VP strided load. Unlike llvm#98111 we don't require llvm#97800 or llvm#97798 since it only operates on fixed vectors with a non-zero stride.
RISCVGatherScatterLowering is the last user of riscv_masked_strided_{load,store} after llvm#98131 and llvm#98112, this patch changes it to emit the VP equivalent instead. This allows us to remove the masked_strided intrinsics so we have only have one lowering path. riscv_masked_strided_{load,store} didn't have AVL operands and were always VLMAX, so this passes in the fixed or scalable element count to the EVL instead, which RISCVVectorPeephole should now convert to VLMAX after llvm#97800
RISCVGatherScatterLowering is the last user of riscv_masked_strided_{load,store} after llvm#98131 and llvm#98112, this patch changes it to emit the VP equivalent instead. This allows us to remove the masked_strided intrinsics so we have only have one lowering path. riscv_masked_strided_{load,store} didn't have AVL operands and were always VLMAX, so this passes in the fixed or scalable element count to the EVL instead, which RISCVVectorPeephole should now convert to VLMAX after llvm#97800
RISCVGatherScatterLowering is the last user of riscv_masked_strided_{load,store} after llvm#98131 and llvm#98112, this patch changes it to emit the VP equivalent instead. This allows us to remove the masked_strided intrinsics so we have only have one lowering path. riscv_masked_strided_{load,store} didn't have AVL operands and were always VLMAX, so this passes in the fixed or scalable element count to the EVL instead, which RISCVVectorPeephole should now convert to VLMAX after llvm#97800
…98111) RISCVGatherScatterLowering is the last user of riscv_masked_strided_{load,store} after #98131 and #98112, this patch changes it to emit the VP equivalent instead. This allows us to remove the masked_strided intrinsics so we have only have one lowering path. riscv_masked_strided_{load,store} didn't have AVL operands and were always VLMAX, so this passes in the fixed or scalable element count to the EVL instead, which RISCVVectorPeephole should now convert to VLMAX after #97800. For loads we also use a vp_select to get passthru (mask undisturbed) behaviour
…98111) Summary: RISCVGatherScatterLowering is the last user of riscv_masked_strided_{load,store} after #98131 and #98112, this patch changes it to emit the VP equivalent instead. This allows us to remove the masked_strided intrinsics so we have only have one lowering path. riscv_masked_strided_{load,store} didn't have AVL operands and were always VLMAX, so this passes in the fixed or scalable element count to the EVL instead, which RISCVVectorPeephole should now convert to VLMAX after #97800. For loads we also use a vp_select to get passthru (mask undisturbed) behaviour Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60250735
This combine is a duplication of the transform in RISCVGatherScatterLowering but at the SelectionDAG level, so similarly to #98111 we can replace the use of riscv_masked_strided_load with a VP strided load.
Unlike #98111 we don't require #97800 or #97798 since it only operates on fixed vectors with a non-zero stride.