[LoopVectorize] Ensure fairness when selecting epilogue VFs
Whilst rebasing PR llvm#116247 I discovered an issue where
PR llvm#108190 seems to have unintentionally introduced an
unfairness in selecting epilogue VFs by making potentially
better choices for fixed-width VFs compared to scalable VFs.
When considering whether epilogue vectorisation is profitable
or not the latest algorithm appears to be:
  bool IsProfitable = false;
  if (VF.isFixed())
    IsProfitable = (IC * VF.getFixedValue())
                     >= EpilogueVectorizationMinVF;
  else
    IsProfitable = (getVScaleForTuning() * VF.getKnownMinValue())
                     >= EpilogueVectorizationMinVF;
Instead, the estimate for the number of scalar iterations
processed in the main vector loop should be
(IC * estimatedRuntimeVF)
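
To make the difference concrete, here is a minimal standalone sketch of the two checks. It is not the LoopVectorize code: `SketchVF`, `estimatedRuntimeVF`, `isProfitableQuoted` and `isProfitableFair` are hypothetical names invented for illustration, and the vscale tuning value and minimum VF threshold are passed as plain parameters rather than read from the cost model.

```cpp
#include <cstdint>

// Hypothetical stand-in for an element count: a known minimum number of
// lanes, plus a flag saying whether the real count is a multiple of vscale.
struct SketchVF {
  std::uint64_t KnownMin;
  bool Scalable;
};

// Estimated lanes at run time. Fixed-width VFs are exact; scalable VFs are
// multiplied by the (assumed) vscale-for-tuning value.
constexpr std::uint64_t estimatedRuntimeVF(SketchVF VF,
                                           std::uint64_t VScaleForTuning) {
  return VF.Scalable ? VF.KnownMin * VScaleForTuning : VF.KnownMin;
}

// The check as quoted above: the interleave count IC is only taken into
// account for fixed-width VFs.
constexpr bool isProfitableQuoted(SketchVF VF, std::uint64_t IC,
                                  std::uint64_t VScaleForTuning,
                                  std::uint64_t MinVF) {
  return VF.Scalable ? VScaleForTuning * VF.KnownMin >= MinVF
                     : IC * VF.KnownMin >= MinVF;
}

// The proposed estimate: IC * estimatedRuntimeVF for both kinds of VF.
constexpr bool isProfitableFair(SketchVF VF, std::uint64_t IC,
                                std::uint64_t VScaleForTuning,
                                std::uint64_t MinVF) {
  return IC * estimatedRuntimeVF(VF, VScaleForTuning) >= MinVF;
}
```

For example, assuming a threshold of 16, IC = 2 and a vscale tuning value of 2, a fixed VF of 8 passes both checks (2 * 8 = 16), whereas a scalable VF of vscale x 4 fails the quoted check (2 * 4 = 8) but passes the fair one (2 * 2 * 4 = 16), even though both main loops are estimated to process the same number of scalar iterations.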
; DEFAULT-NEXT: [[TMP26:%.*]] = mul i64 [[TMP25]], 2
+; DEFAULT-NEXT: [[TMP27:%.*]] = insertelement <vscale x 2 x i16> zeroinitializer, i16 [[BC_MERGE_RDX]], i32 0
+; DEFAULT-NEXT: [[BROADCAST_SPLATINSERT9:%.*]] = insertelement <vscale x 2 x i16> poison, i16 [[X]], i64 0
+; DEFAULT-NEXT: [[BROADCAST_SPLAT10:%.*]] = shufflevector <vscale x 2 x i16> [[BROADCAST_SPLATINSERT9]], <vscale x 2 x i16> poison, <vscale x 2 x i32> zeroinitializer
; CHECK-NEXT: [[VEC_IND:%.*]] = phi <vscale x 2 x i64> [ [[INDUCTION]], [[SCALAR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[FOR_BODY]] ]
+; CHECK-NEXT: [[TMP31:%.*]] = shl <vscale x 2 x i64> [[VEC_IND]], shufflevector (<vscale x 2 x i64> insertelement (<vscale x 2 x i64> poison, i64 1, i64 0), <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer)
+; CHECK-NEXT: [[TMP32:%.*]] = getelementptr inbounds float, ptr [[B]], <vscale x 2 x i64> [[TMP31]]
+; CHECK-NEXT: [[WIDE_MASKED_GATHER1:%.*]] = call <vscale x 2 x float> @llvm.masked.gather.nxv2f32.nxv2p0(<vscale x 2 x ptr> [[TMP32]], i32 4, <vscale x 2 x i1> shufflevector (<vscale x 2 x i1> insertelement (<vscale x 2 x i1> poison, i1 true, i64 0), <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer), <vscale x 2 x float> poison)