-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AArch64][NFC] Add test as a representative of scalarizing a vector i… #114107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Sushant Gokhale (sushgokh) Changes…nteger division The last resort to vectorize a bundle of integer divisions is considered scalarizing it. Currently, the cost estimates for scalarizing a vector division can be considerably overestimated as is the scenario with this motivating test case i.e. vector cost should not deviate much from the scalar cost. Future patch will try to improve the scalarization cost. Full diff: https://github.com/llvm/llvm-project/pull/114107.diff 1 Files Affected:
diff --git a/llvm/test/Transforms/SLPVectorizer/AArch64/scalarizing-vector-cost.ll b/llvm/test/Transforms/SLPVectorizer/AArch64/scalarizing-vector-cost.ll
new file mode 100644
index 00000000000000..12bbf6e043ca91
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/AArch64/scalarizing-vector-cost.ll
@@ -0,0 +1,23 @@
+; RUN: opt -mtriple=aarch64 -passes=slp-vectorizer -debug-only=SLP -S -disable-output < %s 2>&1 | FileCheck %s
+
+define <4 x i8> @v4i8(<4 x i8> %a, <4 x i8> %b)
+{
+; CHECK: SLP: Found cost = 18 for VF=4
+ %a0 = extractelement <4 x i8> %a, i64 0
+ %a1 = extractelement <4 x i8> %a, i64 1
+ %a2 = extractelement <4 x i8> %a, i64 2
+ %a3 = extractelement <4 x i8> %a, i64 3
+ %b0 = extractelement <4 x i8> %b, i64 0
+ %b1 = extractelement <4 x i8> %b, i64 1
+ %b2 = extractelement <4 x i8> %b, i64 2
+ %b3 = extractelement <4 x i8> %b, i64 3
+ %1 = sdiv i8 %a0, undef
+ %2 = sdiv i8 %a1, 1
+ %3 = sdiv i8 %a2, 2
+ %4 = sdiv i8 %a3, 4
+ %r0 = insertelement <4 x i8> poison, i8 %1, i32 0
+ %r1 = insertelement <4 x i8> %r0, i8 %2, i32 1
+ %r2 = insertelement <4 x i8> %r1, i8 %3, i32 2
+ %r3 = insertelement <4 x i8> %r2, i8 %4, i32 3
+ ret <4 x i8> %r3
+}
|
|
||
define <4 x i8> @v4i8(<4 x i8> %a, <4 x i8> %b) | ||
{ | ||
; CHECK: SLP: Found cost = 18 for VF=4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to avoid checks for debug outputs, use -pass-remarks-output=
instead
…nteger division The last resort to vectorize a bundle of integer divisions is considered scalarizing it. Currently, the cost estimates for scalarizing a vector division can be considerably overestimated as is the scenario with this motivating test case i.e. vector cost should not deviate much from the scalar cost. Future patch will try to improve the scalarization cost.
b1297dd
to
d38f624
Compare
ping |
1 similar comment
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks OK. LGTM
Feel free to put up costmodel changes too if you have them already, even if they include the tests again. It can be good to separate the tests out but don't let the code review block other patches. If nothing else there might be other tests that are useful to add or change that becomes clearer with the cost model is changed.
Thanks @davemgreen for the approval and your opinion. |
…nteger division
The last resort to vectorize a bundle of integer divisions is considered scalarizing it. Currently, the cost estimates for scalarizing a vector division can be considerably overestimated as is the scenario with this motivating test case i.e. vector cost should not deviate much from the scalar cost.
Future patch will try to improve the scalarization cost.