Pick non-power-of-2 load/store cost improvements #7993

Merged
merged 2 commits into from
Jan 18, 2024

@fhahn fhahn commented Jan 18, 2024

No description provided.

fhahn added 2 commits January 18, 2024 11:32
Extend cost-model test coverage for vectors with non-power-of-2
elements.

(cherry-picked from 47c6815)
Improve cost computation for odd vector mem ops by breaking them down
into smaller power-of-2 parts and summing up the cost of those parts.

This fixes the current cost estimates, which for the most part
underestimated the cost, due to using getTypeLegalizationCost, which
widens to the next power-of-2 in a single step in most cases. This
doesn't reflect the actual cost.
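
To illustrate the difference between the two estimates, here is a minimal standalone sketch. It is not the code from the patch and does not use any LLVM API: `splitIntoPow2Parts`, `partCost`, `sumOfPartsCost` and `widenedCost` are hypothetical names, and the per-part cost (one unit per started 128-bit chunk) is an assumption chosen only to make the arithmetic visible, not a real target cost.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: split an element count into descending
// power-of-2 parts, e.g. 7 -> {4, 2, 1}.
std::vector<unsigned> splitIntoPow2Parts(unsigned numElts) {
  std::vector<unsigned> parts;
  for (unsigned bit = 1u << 31; bit != 0; bit >>= 1)
    if (numElts & bit)
      parts.push_back(bit);
  return parts;
}

// Assumed toy cost of one power-of-2 memory op: one unit per started
// 128-bit chunk. Real targets use their own cost tables.
unsigned partCost(unsigned numElts, unsigned eltBits) {
  assert(numElts != 0 && eltBits != 0 && "expected a non-empty vector");
  return (numElts * eltBits + 127) / 128;
}

// Sum-of-parts estimate: cost each power-of-2 piece separately.
unsigned sumOfPartsCost(unsigned numElts, unsigned eltBits) {
  unsigned cost = 0;
  for (unsigned part : splitIntoPow2Parts(numElts))
    cost += partCost(part, eltBits);
  return cost;
}

// Single-step estimate: widen to the next power of 2 and cost that
// as one operation.
unsigned widenedCost(unsigned numElts, unsigned eltBits) {
  assert(numElts != 0 && "expected a non-empty vector");
  unsigned widened = 1;
  while (widened < numElts)
    widened <<= 1;
  return partCost(widened, eltBits);
}
```

Under these assumed costs, a vector of three 32-bit elements is priced as two operations (a 2-element part plus a 1-element part) by `sumOfPartsCost`, while `widenedCost` prices it as a single 4-element operation, which is the kind of underestimate described above.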

See https://llvm.godbolt.org/z/vMsnxMf1v for codegen for the tests.

Note that there is a special case for v3i8, for which current codegen is
pretty bad, due to automatic widening to v4i8, which in turn requires
the conversion to go through memory ops on the stack. I am planning on
fixing that as a follow-up, but I am not yet sure where to best fix
this.

At the moment, there are almost no cases in which such vector operations
will be generated automatically. The motivating case is non-power-of-2
SLP vectorization: llvm#77790

PR: llvm#78181

(cherry-picked from e473daa)

fhahn commented Jan 18, 2024

@swift-ci please test


fhahn commented Jan 18, 2024

@swift-ci please test llvm

@fhahn fhahn merged commit da1f620 into swiftlang:stable/20230725 Jan 18, 2024
@fhahn fhahn deleted the aarch64-vec3-load-store-cost branch January 18, 2024 18:18