Commit ed66d16

[LoopVectorize] Make needsExtract notice scalarized instructions
LoopVectorizationCostModel::needsExtract should recognise instructions that have been widened by scalarizing as scalar instructions, which therefore do not need an extract when used by later scalarized instructions.

This fixes an incorrect cost calculation in computePredInstDiscount, which adds a scalarization overhead cost when it shouldn't, though I haven't come up with a test case where it makes a difference. It will make a difference when the cost model switches to using the cost kind TCK_CodeSize for optsize, as without this change the test LoopVectorize/X86/small-size.ll gets worse.
1 parent 38cadab commit ed66d16

3 files changed: +164 -163 lines changed


llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Lines changed: 2 additions & 1 deletion
@@ -1741,7 +1741,8 @@ class LoopVectorizationCostModel {
   bool needsExtract(Value *V, ElementCount VF) const {
     Instruction *I = dyn_cast<Instruction>(V);
     if (VF.isScalar() || !I || !TheLoop->contains(I) ||
-        TheLoop->isLoopInvariant(I))
+        TheLoop->isLoopInvariant(I) ||
+        getWideningDecision(I, VF) == CM_Scalarize)
       return false;
 
     // Assume we can vectorize V (and hence we need extraction) if the
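
The effect of the added condition can be illustrated with a small standalone model. This is only a sketch, not LLVM code: ToyInstruction, ToyCostModel, WideningDecision, needsExtractOld and needsExtractNew are hypothetical stand-ins invented for illustration, and because the rest of the real needsExtract is not shown in the hunk, the sketch simply assumes that any value passing the early checks is vectorized and so needs an extract.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for LLVM concepts; none of this is the real LLVM API.
enum class WideningDecision { Unknown, Widen, Scalarize };

struct ToyInstruction {
  std::string Name;
  bool InLoop;        // models TheLoop->contains(I)
  bool LoopInvariant; // models TheLoop->isLoopInvariant(I)
};

struct ToyCostModel {
  // Per-instruction widening decisions (the real model also keys on VF).
  std::unordered_map<const ToyInstruction *, WideningDecision> Decisions;

  WideningDecision getWideningDecision(const ToyInstruction *I) const {
    auto It = Decisions.find(I);
    return It == Decisions.end() ? WideningDecision::Unknown : It->second;
  }
};

// Behaviour before the commit: only the scalar-VF, non-instruction,
// out-of-loop and loop-invariant cases skip the extract.
bool needsExtractOld(const ToyCostModel &CM, const ToyInstruction *I,
                     unsigned VF) {
  (void)CM; // the old check never consults the widening decision
  if (VF == 1 || !I || !I->InLoop || I->LoopInvariant)
    return false;
  return true; // assumption: the value is vectorized
}

// Behaviour after the commit: a value whose widening decision is Scalarize
// is already available as scalars, so its users need no extract.
bool needsExtractNew(const ToyCostModel &CM, const ToyInstruction *I,
                     unsigned VF) {
  if (VF == 1 || !I || !I->InLoop || I->LoopInvariant ||
      CM.getWideningDecision(I) == WideningDecision::Scalarize)
    return false;
  return true; // assumption: the value is vectorized
}
```

In this model, a scalarized load feeding a scalarized store is charged for an extract by needsExtractOld but not by needsExtractNew, while a load with a Widen decision is treated the same by both.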

llvm/test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll

Lines changed: 2 additions & 2 deletions
@@ -170,8 +170,8 @@ entry:
 ; VF_2-LABEL: Checking a loop in 'i64_factor_8'
 ; VF_2: Found an estimated cost of 8 for VF 2 For instruction: %tmp2 = load i64, ptr %tmp0, align 8
 ; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: %tmp3 = load i64, ptr %tmp1, align 8
-; VF_2-NEXT: Found an estimated cost of 12 for VF 2 For instruction: store i64 %tmp2, ptr %tmp0, align 8
-; VF_2-NEXT: Found an estimated cost of 12 for VF 2 For instruction: store i64 %tmp3, ptr %tmp1, align 8
+; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: store i64 %tmp2, ptr %tmp0, align 8
+; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: store i64 %tmp3, ptr %tmp1, align 8
 for.body:
   %i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
   %tmp0 = getelementptr inbounds %i64.8, ptr %data, i64 %i, i32 2
