[LLVM][SelectionDAG] Remove scalable vector restriction from poison analysis. #102504
@@ -1501,18 +1501,23 @@ define <vscale x 8 x i32> @vwadd_vx_splat_zext_i1(<vscale x 8 x i1> %va, i16 %b)
; RV32: # %bb.0:
; RV32-NEXT: slli a0, a0, 16
; RV32-NEXT: srli a0, a0, 16
; RV32-NEXT: vsetvli a1, zero, e32, m4, ta, mu
; RV32-NEXT: vsetvli a1, zero, e32, m4, ta, ma
; RV32-NEXT: vmv.v.x v8, a0
; RV32-NEXT: vadd.vi v8, v8, 1, v0.t
; RV32-NEXT: addi a0, a0, 1
; RV32-NEXT: vmerge.vxm v8, v8, a0, v0
; RV32-NEXT: ret
;
; RV64-LABEL: vwadd_vx_splat_zext_i1:
; RV64: # %bb.0:
; RV64-NEXT: slli a0, a0, 48
; RV64-NEXT: srli a0, a0, 48
; RV64-NEXT: vsetvli a1, zero, e32, m4, ta, mu
; RV64-NEXT: vsetvli a1, zero, e16, m2, ta, ma
; RV64-NEXT: vmv.v.x v12, a0
; RV64-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; RV64-NEXT: vmv.v.x v8, a0
; RV64-NEXT: vadd.vi v8, v8, 1, v0.t
; RV64-NEXT: li a0, 1
; RV64-NEXT: vsetvli zero, zero, e16, m2, ta, mu
; RV64-NEXT: vwaddu.vx v8, v12, a0, v0.t
Review comment: (non-blocking for this review, continuing from previous) This one looks a bit more questionable. It looks like we may need a guard in the vwadd combine for the case where the RHS is a legal immediate. A vwadd.vi form would be really useful here, but we don't have one. Ignoring the passthru issue, which form do you think is likely better: vwaddu.vx with the immediate in a register, or vzext.vf + vadd.vi?

Reply: vzext.vf + vadd.vi seems like more work in the vector ALU, so I think vwaddu.vx with the immediate in a register is better. Hopefully we can find a way to fix this to match the RV32 codegen.
; RV64-NEXT: ret
%zb = zext i16 %b to i32
%head = insertelement <vscale x 8 x i32> poison, i32 %zb, i32 0
@@ -1570,20 +1575,23 @@ define <vscale x 8 x i32> @vwadd_vx_splat_sext_i1(<vscale x 8 x i1> %va, i16 %b)
; RV32: # %bb.0:
; RV32-NEXT: slli a0, a0, 16
; RV32-NEXT: srai a0, a0, 16
; RV32-NEXT: vsetvli a1, zero, e32, m4, ta, mu
; RV32-NEXT: vsetvli a1, zero, e32, m4, ta, ma
; RV32-NEXT: vmv.v.x v8, a0
; RV32-NEXT: li a0, 1
; RV32-NEXT: vsub.vx v8, v8, a0, v0.t
; RV32-NEXT: addi a0, a0, -1
; RV32-NEXT: vmerge.vxm v8, v8, a0, v0
; RV32-NEXT: ret
;
; RV64-LABEL: vwadd_vx_splat_sext_i1:
; RV64: # %bb.0:
; RV64-NEXT: slli a0, a0, 48
; RV64-NEXT: srai a0, a0, 48
; RV64-NEXT: vsetvli a1, zero, e32, m4, ta, mu
; RV64-NEXT: vsetvli a1, zero, e16, m2, ta, ma
; RV64-NEXT: vmv.v.x v12, a0
; RV64-NEXT: vsetvli zero, zero, e32, m4, ta, ma
; RV64-NEXT: vmv.v.x v8, a0
; RV64-NEXT: li a0, 1
; RV64-NEXT: vsub.vx v8, v8, a0, v0.t
; RV64-NEXT: vsetvli zero, zero, e16, m2, ta, mu
; RV64-NEXT: vwsub.vx v8, v12, a0, v0.t
Review comment: (non-blocking for this review, continuing from previous) In this case, the RV32 codegen looks clearly better than the RV64 codegen.
; RV64-NEXT: ret
%sb = sext i16 %b to i32
%head = insertelement <vscale x 8 x i32> poison, i32 %sb, i32 0
(not blocking discussion for this review)
@topperc What's your feeling on this change? The before and after look about equal to me; do you agree?
Agreed, it looks about equal.
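
For readers comparing the RV32 sequences, here is a rough element-wise model of the two forms, not code from the patch. It assumes the `mu` lines are the pre-patch output and the `ma` lines the post-patch output (the flattened diff does not mark this), and that the masked `vadd.vi`/`vsub.vx` leaves inactive lanes undisturbed. All function names are hypothetical; the point is only that a masked add of a constant onto a splat gives the same lanes as a `vmerge.vxm` of the pre-adjusted scalar.

```python
MASK32 = 0xFFFFFFFF  # model e32 elements as unsigned 32-bit values


def splat(x, n):
    """Model vmv.v.x: broadcast scalar x to n lanes."""
    return [x & MASK32] * n


def masked_add(vec, delta, mask):
    """Model a masked vadd/vsub (v0.t) with mask-undisturbed inactive lanes."""
    return [(v + delta) & MASK32 if m else v for v, m in zip(vec, mask)]


def merge_scalar(vec, scalar, mask):
    """Model vmerge.vxm: lane = scalar where mask is set, else vec lane."""
    return [scalar & MASK32 if m else v for v, m in zip(vec, mask)]


def masked_form(b, delta, mask):
    """Splat b, then add delta only on active lanes."""
    return masked_add(splat(b, len(mask)), delta, mask)


def merge_form(b, delta, mask):
    """Splat b, adjust the scalar (addi), then merge on active lanes."""
    return merge_scalar(splat(b, len(mask)), b + delta, mask)


mask = [True, False, True, True, False, False, True, False]
for b in (0, 1, 0x7FFF, 0xFFFF, -1, -0x8000):
    for delta in (1, -1):  # +1 for the zext i1 test, -1 for the sext i1 test
        assert masked_form(b, delta, mask) == merge_form(b, delta, mask)
```

This only checks value equivalence per lane; it says nothing about which sequence is cheaper on a given core, which is the open question in the comments above.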