[SLP] Limit GEP lists based on width of index computation. #1403

fhahn · 2020-07-01T17:48:06Z

D68667 introduced a tighter limit to the number of GEPs to simplify
together. The limit was based on the vector element size of the pointer,
but the pointers themselves are not actually put in vectors.

IIUC we try to vectorize the index computations here, so we should base
the limit on the vector element size of the computation of the index.
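
As a rough illustration of why the choice of element size matters, the sketch below computes how many lanes fit in one vector register for a given element width. It is not the SLP vectorizer's code: the helper names `maxElts` and `chunkLength`, the 128-bit register width, and the 16-bit index width are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of lanes of width eltBits that fit in a
// vector register of width vecRegBits.
constexpr unsigned maxElts(unsigned vecRegBits, unsigned eltBits) {
  return eltBits == 0 ? 0 : vecRegBits / eltBits;
}

// Hypothetical helper: how many of the remaining GEPs would be considered
// together, given the per-register lane limit.
unsigned chunkLength(unsigned remainingGEPs, unsigned vecRegBits,
                     unsigned eltBits) {
  assert(eltBits > 0 && "element width must be non-zero");
  unsigned limit = maxElts(vecRegBits, eltBits);
  return remainingGEPs < limit ? remainingGEPs : limit;
}
```

Under these assumptions, sizing by a 64-bit pointer allows only 2 GEPs per group, whereas sizing by a 16-bit index computation allows 8, which is enough to keep a wider pattern together.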

This fixes the test regression on AArch64 and also restores
vectorization of an important pattern in SPEC2006/464.h264ref on
AArch64 (@test_i16_extend). We get a large benefit from doing a single
load up front and then processing the index computations in vectors.

Note that we could probably improve the AArch64 codegen even further if
we did zexts to i32 instead of i64 for the sub operands and then did
a single vector sext on the result of the subtractions. AArch64 provides
dedicated vector instructions to do so. Sketch of proof in Alive:
https://alive2.llvm.org/ce/z/A4xYAB
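
The equivalence behind that suggestion can also be spot-checked with scalar arithmetic. The sketch below is not the Alive proof: it assumes the sub operands are 16-bit unsigned values (suggested by the test name, not confirmed here), and the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Current form: zero-extend both operands to 64 bits, then subtract.
constexpr int64_t wideSub(uint16_t a, uint16_t b) {
  return static_cast<int64_t>(a) - static_cast<int64_t>(b);
}

// Proposed form: zero-extend to 32 bits, subtract, then sign-extend the
// 32-bit result to 64 bits. The difference lies in [-65535, 65535], so the
// 32-bit subtraction cannot overflow.
constexpr int64_t narrowSubThenSext(uint16_t a, uint16_t b) {
  int32_t diff = static_cast<int32_t>(a) - static_cast<int32_t>(b);
  return static_cast<int64_t>(diff);
}

// Compare both forms on boundary values and a strided sample of inputs.
bool sampledInputsAgree() {
  const uint32_t edges[] = {0u, 1u, 32767u, 32768u, 65534u, 65535u};
  for (uint32_t a : edges)
    for (uint32_t b : edges)
      if (wideSub(static_cast<uint16_t>(a), static_cast<uint16_t>(b)) !=
          narrowSubThenSext(static_cast<uint16_t>(a), static_cast<uint16_t>(b)))
        return false;
  for (uint32_t a = 0; a <= 65535u; a += 251u)
    for (uint32_t b = 0; b <= 65535u; b += 257u)
      if (wideSub(static_cast<uint16_t>(a), static_cast<uint16_t>(b)) !=
          narrowSubThenSext(static_cast<uint16_t>(a), static_cast<uint16_t>(b)))
        return false;
  return true;
}

// Small self-check on the most negative difference.
inline void selfCheck() {
  assert(wideSub(0, 65535) == narrowSubThenSext(0, 65535));
}
```

The two forms agree because the difference of two zero-extended 16-bit values always fits in a signed 32-bit integer, so sign-extending the narrow result loses nothing.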

Reviewers: craig.topper, RKSimon, xbolva00, ABataev, spatel

Reviewed By: ABataev, spatel

Differential Revision: https://reviews.llvm.org/D82418

fhahn · 2020-07-01T17:48:17Z

@swift-ci please test

fhahn merged commit 705d120 into swiftlang:apple/stable/20200108 Jul 2, 2020

fhahn deleted the slp-cost-fix branch July 2, 2020 22:02
