-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AArch64][GlobalISel] Improve codegen for G_VECREDUCE_{SMIN,SMAX,UMIN,UMAX} for odd-sized vectors #82740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64][GlobalISel] Improve codegen for G_VECREDUCE_{SMIN,SMAX,UMIN,UMAX} for odd-sized vectors #82740
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1074,6 +1074,13 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST) | |
{s16, v8s16}, | ||
{s32, v2s32}, | ||
{s32, v4s32}}) | ||
.moreElementsIf( | ||
[=](const LegalityQuery &Query) { | ||
return Query.Types[1].isVector() && | ||
Query.Types[1].getElementType() != s8 && | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The i8 issues look like they might be coming from the calling convention of v3i8 types. They might look better of the inputs were vectors. They might not be quite correct at the moment if they were enabled though. I wouldn't be against just using moreElementsToNextPow2 in the long run, if it doesn't need a more precise "cost model" than that. |
||
Query.Types[1].getNumElements() & 1; | ||
}, | ||
LegalizeMutations::moreElementsToNextPow2(1)) | ||
.clampMaxNumElements(1, s64, 2) | ||
.clampMaxNumElements(1, s32, 4) | ||
.clampMaxNumElements(1, s16, 8) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This builds a vector with undef, then inserts identity elements into it? Could it instead just build directly, with the new identity element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using G_BUILD_VECTOR seems to cause quite a few regressions: https://gist.github.com/dc03-work/66c179f4d380cfac3ed7b36525a583ef.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like there is a combine to clear up the undef version, but not the direct one.