[IA] Add support for [de]interleave{3,5,7} (#139373)
This adds support for lowering deinterleave and interleave intrinsics
for factors 3, 5 and 7 into target-specific memory intrinsics.
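As a rough sketch (not part of the patch), the lane-wise semantics of the `llvm.vector.interleave3` / `llvm.vector.deinterleave3` intrinsics can be modeled on Python lists:

```python
def interleave3(a, b, c):
    # Models llvm.vector.interleave3: produces [a0, b0, c0, a1, b1, c1, ...]
    out = []
    for x, y, z in zip(a, b, c):
        out += [x, y, z]
    return out

def deinterleave3(v):
    # Models llvm.vector.deinterleave3: the inverse, splitting out every
    # third lane starting at offsets 0, 1 and 2.
    return v[0::3], v[1::3], v[2::3]
```

Lowering `store (interleave3 a, b, c)` to a segmented store (e.g. RVV's `vsseg3`) lets the hardware perform this lane shuffle as part of the memory access.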
Notably, this doesn't add support for handling higher factors constructed
by composing interleave intrinsics, e.g. factor 6 from interleave3
+ interleave2.
I initially tried this, but it became very complex very quickly. For
example, because there are now multiple factors involved,
interleaveLeafValues is no longer symmetric between interleaving and
deinterleaving. There are also two ways of representing a factor-6
deinterleave: it can be done as either 1 deinterleave3 and 3
deinterleave2s OR 1 deinterleave2 and 2 deinterleave3s.
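That ambiguity can be illustrated with a small Python model (the helper names below are hypothetical, not LLVM APIs): both decompositions recover the same six lanes from a factor-6 interleaved vector.

```python
def deinterleave2(v):
    return v[0::2], v[1::2]

def deinterleave3(v):
    return v[0::3], v[1::3], v[2::3]

def deinterleave6_via_d2(v):
    # Decomposition 1: one deinterleave2, then a deinterleave3 on each half.
    x, y = deinterleave2(v)
    a, c, e = deinterleave3(x)
    b, d, f = deinterleave3(y)
    return a, b, c, d, e, f

def deinterleave6_via_d3(v):
    # Decomposition 2: one deinterleave3, then a deinterleave2 on each third.
    x, y, z = deinterleave3(v)
    a, d = deinterleave2(x)
    b, e = deinterleave2(y)
    c, f = deinterleave2(z)
    return a, b, c, d, e, f
```

Both functions yield identical results on any factor-6 interleaved input, so a pattern matcher would have to recognize either tree shape.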
I'm not sure the complexity of supporting arbitrary factors is warranted
given that we only need to support a small number of factors currently:
SVE only needs factors 2, 3 and 4, whilst RVV only needs 2, 3, 4, 5, 6, 7 and 8.
My preference would be to just add interleave6 and deinterleave6
intrinsics to avoid all this ambiguity, but I'll defer that discussion to
a later patch.
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleave-store.ll (33 additions, 0 deletions)
@@ -181,6 +181,17 @@ define void @vector_interleave_store_v4f64_v2f64(<2 x double> %a, <2 x double> %
   ret void
 }
 
+define void @vector_interleave_store_factor3(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, ptr %p) {
+; CHECK-LABEL: vector_interleave_store_factor3:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:    vsseg3e32.v v8, (a0)
+; CHECK-NEXT:    ret
+  %v = call <12 x i32> @llvm.vector.interleave3(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
+  store <12 x i32> %v, ptr %p
+  ret void
+}
+
 define void @vector_interleave_store_factor4(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i32> %d, ptr %p) {
 ; CHECK-LABEL: vector_interleave_store_factor4:
 ; CHECK:       # %bb.0:
@@ -194,6 +205,28 @@ define void @vector_interleave_store_factor4(<4 x i32> %a, <4 x i32> %b, <4 x i3
   ret void
 }
 
+define void @vector_interleave_store_factor5(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i32> %d, <4 x i32> %e, ptr %p) {
+; CHECK-LABEL: vector_interleave_store_factor5:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:    vsseg5e32.v v8, (a0)
+; CHECK-NEXT:    ret
+  %v = call <20 x i32> @llvm.vector.interleave5(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i32> %d, <4 x i32> %e)
+  store <20 x i32> %v, ptr %p
+  ret void
+}
+
+define void @vector_interleave_store_factor7(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i32> %d, <4 x i32> %e, <4 x i32> %f, <4 x i32> %g, ptr %p) {
+; CHECK-LABEL: vector_interleave_store_factor7:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:    vsseg7e32.v v8, (a0)
+; CHECK-NEXT:    ret
+  %v = call <28 x i32> @llvm.vector.interleave7(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i32> %d, <4 x i32> %e, <4 x i32> %f, <4 x i32> %g)
+  store <28 x i32> %v, ptr %p
+  ret void
+}
+
 define void @vector_interleave_store_factor8(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i32> %d, <4 x i32> %e, <4 x i32> %f, <4 x i32> %g, <4 x i32> %h, ptr %p) {