Commit b1297dd

[AArch64][NFC] Add test as a representative of scalarizing a vector integer division
Scalarization is the last resort when vectorizing a bundle of integer divisions. Currently, the cost estimate for scalarizing a vector division can be considerably overestimated, as in this motivating test case, where the vector cost should not deviate much from the scalar cost. A future patch will try to improve the scalarization cost.
1 parent 639a7ac commit b1297dd

1 file changed: 23 additions, 0 deletions
@@ -0,0 +1,23 @@
+; RUN: opt -mtriple=aarch64 -passes=slp-vectorizer -debug-only=SLP -S -disable-output < %s 2>&1 | FileCheck %s
+
+define <4 x i8> @v4i8(<4 x i8> %a, <4 x i8> %b)
+{
+; CHECK: SLP: Found cost = 18 for VF=4
+  %a0 = extractelement <4 x i8> %a, i64 0
+  %a1 = extractelement <4 x i8> %a, i64 1
+  %a2 = extractelement <4 x i8> %a, i64 2
+  %a3 = extractelement <4 x i8> %a, i64 3
+  %b0 = extractelement <4 x i8> %b, i64 0
+  %b1 = extractelement <4 x i8> %b, i64 1
+  %b2 = extractelement <4 x i8> %b, i64 2
+  %b3 = extractelement <4 x i8> %b, i64 3
+  %1 = sdiv i8 %a0, undef
+  %2 = sdiv i8 %a1, 1
+  %3 = sdiv i8 %a2, 2
+  %4 = sdiv i8 %a3, 4
+  %r0 = insertelement <4 x i8> poison, i8 %1, i32 0
+  %r1 = insertelement <4 x i8> %r0, i8 %2, i32 1
+  %r2 = insertelement <4 x i8> %r1, i8 %3, i32 2
+  %r3 = insertelement <4 x i8> %r2, i8 %4, i32 3
+  ret <4 x i8> %r3
+}
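
For reference, the four scalar divisions in the test form a bundle whose vector counterpart would look roughly like the function below. This is a hand-written sketch, not output from the SLP vectorizer, and the function name @v4i8_vec is hypothetical. Advanced SIMD on AArch64 has no vector integer divide instruction, so in general a vector sdiv has to be split back into per-lane scalar operations during lowering. That is the sense in which the commit message expects the vector cost to stay close to the scalar cost.

```llvm
; Hypothetical vector form of the bundle costed in the test above.
; The divisor lanes mirror the scalar divisors: undef, 1, 2, 4.
define <4 x i8> @v4i8_vec(<4 x i8> %a, <4 x i8> %b) {
  ; %b is unused, as it is effectively unused in the scalar test
  ; (its extracted lanes never feed the divisions).
  %r = sdiv <4 x i8> %a, <i8 undef, i8 1, i8 2, i8 4>
  ret <4 x i8> %r
}
```

Assuming an LLVM build with the AArch64 target enabled, running llc -mtriple=aarch64 on a file containing this function is one way to inspect how the division is actually lowered.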
