[SelectionDAG] Add computeKnownBits support for ISD::STEP_VECTOR (#80452)
This handles two cases where we can work out some known-zero bits for
ISD::STEP_VECTOR.
The first case handles when we know the low bits are zero because the step
amount is a power of two. This is taken from https://reviews.llvm.org/D128159,
and even though the original patch didn't end up landing this case due to it
not having any test difference, I've included it here for completeness's sake.
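To make the low-bits rule concrete, here is a minimal standalone sketch in
plain C++ rather than LLVM's APInt/KnownBits types. The helper name
knownLowZeroBits is hypothetical and not part of LLVM; it only models the
power-of-two rule described above, assuming 64-bit elements.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models a step vector
// <0, step, 2*step, ...> with 64-bit elements and returns how many low bits
// are zero in every element, using only the power-of-two rule.
unsigned knownLowZeroBits(std::uint64_t step) {
  // Only handle steps of the form 2^k; anything else reports no known bits.
  bool isPowerOfTwo = step != 0 && (step & (step - 1)) == 0;
  if (!isPowerOfTwo)
    return 0;

  // Every element is i * 2^k, so its low k bits are zero. Find k.
  unsigned k = 0;
  while ((step & 1) == 0) {
    step >>= 1;
    ++k;
  }
  assert(k < 64 && "a nonzero 64-bit step has at most 63 trailing zeros");
  return k;
}
```

With a step of 8 this returns 3, which is the situation exercised by the
lo_bits_known_zero test: the `and` with 7 only touches bits that are already
known to be zero.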
The second case applies when vscale_range gives us an upper bound on vscale.
We can use this to work out the upper bound on the number of elements, and
thus what the maximum step value will be. From the maximum step value we then
know which high bits are zero.
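The high-bits reasoning can be sketched the same way. This is a standalone
illustration under stated assumptions, not the code in SelectionDAG: the
helper name knownHighZeroBits and its parameters are hypothetical, elements
are assumed to be 64 bits wide, and the caller is assumed to supply the upper
bound of vscale_range as maxVScale.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): for a 64-bit step vector
// <0, step, 2*step, ...> of type <vscale x minNumElts x i64>, returns how many
// high bits are zero in every element, or 0 if nothing can be concluded.
unsigned knownHighZeroBits(std::uint64_t step, std::uint64_t minNumElts,
                           std::uint64_t maxVScale) {
  assert(minNumElts > 0 && maxVScale > 0 && "element count must be positive");

  // Upper bound on the element count: maxVScale * minNumElts.
  if (maxVScale > UINT64_MAX / minNumElts)
    return 0; // the bound does not fit in 64 bits
  std::uint64_t maxNumElts = maxVScale * minNumElts;

  // The last element, (maxNumElts - 1) * step, is the largest one.
  std::uint64_t lastIndex = maxNumElts - 1;
  if (lastIndex != 0 && step > UINT64_MAX / lastIndex)
    return 0; // the product wraps, so the high bits are not known
  std::uint64_t maxValue = lastIndex * step;

  // Every element is <= maxValue, so the leading zeros of maxValue are zero
  // in all elements.
  unsigned leadingZeros = 0;
  for (std::uint64_t bit = std::uint64_t{1} << 63;
       bit != 0 && (maxValue & bit) == 0; bit >>= 1)
    ++leadingZeros;
  return leadingZeros;
}
```

For step 1, minNumElts 2 and maxVScale 4 (the vscale_range(2, 4) case in
hi_bits_known_zero) the largest element is 7, so the sketch returns 61. With a
step of all ones the product wraps and the sketch returns 0, mirroring
hi_bits_known_zero_overflow.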
On its own, computing the known high bits results in some small improvements
for RVV with -mrvv-vector-bits=zvl across the llvm-test-suite. However, I'm
hoping to be able to use this later to reduce the LMUL in index calculations
for vrgather/indexed accesses.
---------
Co-authored-by: Philip Reames <[email protected]>
llvm/test/CodeGen/RISCV/rvv/stepvector.ll (42 additions & 0 deletions)
@@ -733,3 +733,45 @@ entry:
   %3 = shl <vscale x 16 x i64> %2, %1
   ret <vscale x 16 x i64> %3
 }
+
+; maximum step is 4 * 2 = 8, so maximum step value is 7, so hi 61 bits are known
+; zero
+define <vscale x 2 x i64> @hi_bits_known_zero() vscale_range(2, 4) {
+; CHECK-LABEL: hi_bits_known_zero:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli a0, zero, e64, m2, ta, ma
+; CHECK-NEXT:    vmv.v.i v8, 0
+; CHECK-NEXT:    ret
+  %step = call <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
+  %and = and <vscale x 2 x i64> %step, shufflevector(<vscale x 2 x i64> insertelement(<vscale x 2 x i64> poison, i64 u0xfffffffffffffff8, i32 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer)
+  ret <vscale x 2 x i64> %and
+}
+
+; the maximum step here overflows so don't set the known hi bits
+define <vscale x 2 x i64> @hi_bits_known_zero_overflow() vscale_range(2, 4) {
+; CHECK-LABEL: hi_bits_known_zero_overflow:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli a0, zero, e64, m2, ta, ma
+; CHECK-NEXT:    vid.v v8
+; CHECK-NEXT:    li a0, -1
+; CHECK-NEXT:    vmul.vx v8, v8, a0
+; CHECK-NEXT:    vand.vi v8, v8, -8
+; CHECK-NEXT:    ret
+  %step = call <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
+  %step.mul = mul <vscale x 2 x i64> %step, shufflevector(<vscale x 2 x i64> insertelement(<vscale x 2 x i64> poison, i64 u0xffffffffffffffff, i32 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer)
+  %and = and <vscale x 2 x i64> %step.mul, shufflevector(<vscale x 2 x i64> insertelement(<vscale x 2 x i64> poison, i64 u0xfffffffffffffff8, i32 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer)
+  ret <vscale x 2 x i64> %and
+}
+
+; step values are multiple of 8, so lo 3 bits are known zero
+define <vscale x 2 x i64> @lo_bits_known_zero() {
+; CHECK-LABEL: lo_bits_known_zero:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli a0, zero, e64, m2, ta, ma
+; CHECK-NEXT:    vmv.v.i v8, 0
+; CHECK-NEXT:    ret
+  %step = call <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
+  %step.mul = mul <vscale x 2 x i64> %step, shufflevector(<vscale x 2 x i64> insertelement(<vscale x 2 x i64> poison, i64 8, i32 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer)
+  %and = and <vscale x 2 x i64> %step.mul, shufflevector(<vscale x 2 x i64> insertelement(<vscale x 2 x i64> poison, i64 7, i32 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer)