Skip to content

[LV][VPlan] Prevent calculate cost for skiped instructions in precomputeCosts(). #127966

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 25, 2025

Conversation

ElvisWang123
Copy link
Contributor

@ElvisWang123 ElvisWang123 commented Feb 20, 2025

Skip calculating instruction costs for exit conditions in precomputeCosts() when it should be skipped.

Reported from: #115744 (comment)
Godbolt for reduced test cases: https://godbolt.org/z/fr4YMeqcv

…uteCosts()

Skip calaulating instruction costs for exit conditions in
precomputeCosts() when it should be skiped.
@llvmbot
Copy link
Member

llvmbot commented Feb 20, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Elvis Wang (ElvisWang123)

Changes

Skip calculating instruction costs for exit conditions in precomputeCosts() when it should be skipped.

Reported from: #115744 (comment)
Godbolt for reduced test cases: https://godbolt.org/z/d85rfr9hn


Full diff: https://github.com/llvm/llvm-project/pull/127966.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+1)
  • (added) llvm/test/Transforms/LoopVectorize/X86/vplan-based-cost-assertion.ll (+86)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 584cda34f902e..73478becf75e5 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7291,6 +7291,7 @@ LoopVectorizationPlanner::precomputeCosts(VPlan &Plan, ElementCount VF,
   for (unsigned I = 0; I != ExitInstrs.size(); ++I) {
     Instruction *CondI = ExitInstrs[I];
     if (!OrigLoop->contains(CondI) ||
+        CostCtx.skipCostComputation(CondI, VF.isVector()) ||
         !CostCtx.SkipCostComputation.insert(CondI).second)
       continue;
     InstructionCost CondICost = CostCtx.getLegacyCost(CondI, VF);
diff --git a/llvm/test/Transforms/LoopVectorize/X86/vplan-based-cost-assertion.ll b/llvm/test/Transforms/LoopVectorize/X86/vplan-based-cost-assertion.ll
new file mode 100644
index 0000000000000..5c515af35d991
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/X86/vplan-based-cost-assertion.ll
@@ -0,0 +1,86 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=loop-vectorize -mtriple=x86_64-unknown-linux-gnu -S | FileCheck %s
+
+; Check if the vplan-based cost model select same VF to the legacy cost model.
+; Reduced from: https://github.com/llvm/llvm-project/issues/115744#issuecomment-2670479463
+
+; int *e;
+; int f;
+; void g() {
+;     for (; e[f]; f++)
+;           e[b(f + 9)];
+; }
+
+define void @g(i64 %n) {
+; CHECK-LABEL: @g(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = trunc i64 [[N:%.*]] to i32
+; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[TMP0]], 1
+; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[TMP1]], 8
+; CHECK-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_SCEVCHECK:%.*]]
+; CHECK:       vector.scevcheck:
+; CHECK-NEXT:    [[TMP2:%.*]] = icmp ugt i64 [[N]], 4294967295
+; CHECK-NEXT:    br i1 [[TMP2]], label [[SCALAR_PH]], label [[ENTRY:%.*]]
+; CHECK:       vector.ph:
+; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i32 [[TMP1]], 8
+; CHECK-NEXT:    [[N_VEC:%.*]] = sub i32 [[TMP1]], [[N_MOD_VF]]
+; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i64> poison, i64 [[N]], i64 0
+; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i64> [[BROADCAST_SPLATINSERT]], <4 x i64> poison, <4 x i32> zeroinitializer
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       vector.body:
+; CHECK-NEXT:    [[SELECT:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[SELECT_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, [[ENTRY]] ], [ [[VEC_IND_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[ENTRY]] ], [ [[TMP9:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[VEC_PHI1:%.*]] = phi <4 x i32> [ zeroinitializer, [[ENTRY]] ], [ [[TMP10:%.*]], [[LOOP]] ]
+; CHECK-NEXT:    [[STEP_ADD:%.*]] = add <4 x i32> [[VEC_IND]], splat (i32 4)
+; CHECK-NEXT:    [[TMP3:%.*]] = zext <4 x i32> [[VEC_IND]] to <4 x i64>
+; CHECK-NEXT:    [[TMP4:%.*]] = zext <4 x i32> [[STEP_ADD]] to <4 x i64>
+; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq <4 x i64> [[BROADCAST_SPLAT]], [[TMP3]]
+; CHECK-NEXT:    [[TMP6:%.*]] = icmp eq <4 x i64> [[BROADCAST_SPLAT]], [[TMP4]]
+; CHECK-NEXT:    [[TMP7:%.*]] = select <4 x i1> [[TMP5]], <4 x i32> zeroinitializer, <4 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP8:%.*]] = select <4 x i1> [[TMP6]], <4 x i32> zeroinitializer, <4 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP9]] = or <4 x i32> [[TMP7]], [[VEC_PHI]]
+; CHECK-NEXT:    [[TMP10]] = or <4 x i32> [[TMP8]], [[VEC_PHI1]]
+; CHECK-NEXT:    [[SELECT_NEXT]] = add nuw i32 [[SELECT]], 8
+; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i32> [[STEP_ADD]], splat (i32 4)
+; CHECK-NEXT:    [[TMP11:%.*]] = icmp eq i32 [[SELECT_NEXT]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[TMP11]], label [[MIDDLE_BLOCK:%.*]], label [[LOOP]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK:       middle.block:
+; CHECK-NEXT:    [[BIN_RDX:%.*]] = or <4 x i32> [[TMP10]], [[TMP9]]
+; CHECK-NEXT:    [[TMP12:%.*]] = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> [[BIN_RDX]])
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i32 [[TMP1]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
+; CHECK:       scalar.ph:
+; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i32 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY1:%.*]] ], [ 0, [[VECTOR_SCEVCHECK]] ]
+; CHECK-NEXT:    [[BC_MERGE_RDX:%.*]] = phi i32 [ [[TMP12]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY1]] ], [ 0, [[VECTOR_SCEVCHECK]] ]
+; CHECK-NEXT:    br label [[LOOP1:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP1]] ]
+; CHECK-NEXT:    [[SELECT1:%.*]] = phi i32 [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[SELECT_NEXT1:%.*]], [[LOOP1]] ]
+; CHECK-NEXT:    [[IV_WIDEN:%.*]] = zext i32 [[IV]] to i64
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp eq i64 [[N]], [[IV_WIDEN]]
+; CHECK-NEXT:    [[SELECT_I:%.*]] = select i1 [[EXITCOND]], i32 0, i32 0
+; CHECK-NEXT:    [[SELECT_NEXT1]] = or i32 [[SELECT_I]], [[SELECT1]]
+; CHECK-NEXT:    [[IV_NEXT]] = add i32 [[IV]], 1
+; CHECK-NEXT:    br i1 [[EXITCOND]], label [[EXIT]], label [[LOOP1]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK:       exit:
+; CHECK-NEXT:    [[SPEC_SELECT_I_LCSSA:%.*]] = phi i32 [ [[SELECT_NEXT1]], [[LOOP1]] ], [ [[TMP12]], [[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
+  %select = phi i32 [ 0, %entry ], [ %select.next, %loop ]
+  %iv.widen = zext i32 %iv to i64
+  %exitcond = icmp eq i64 %n, %iv.widen
+  %select.i = select i1 %exitcond, i32 0, i32 0
+  %select.next = or i32 %select.i, %select
+  %iv.next = add i32 %iv, 1
+  br i1 %exitcond, label %exit, label %loop
+
+exit:
+  %spec.select.i.lcssa = phi i32 [ %select.next, %loop ]
+  ret void
+}

@@ -7291,6 +7291,7 @@ LoopVectorizationPlanner::precomputeCosts(VPlan &Plan, ElementCount VF,
for (unsigned I = 0; I != ExitInstrs.size(); ++I) {
Instruction *CondI = ExitInstrs[I];
if (!OrigLoop->contains(CondI) ||
CostCtx.skipCostComputation(CondI, VF.isVector()) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we skip adding the instructions directly in the loop above?

Comment on lines 84 to 86
exit:
store i32 %select.next, ptr %f, align 4
ret void
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly simpler?

Suggested change
exit:
store i32 %select.next, ptr %f, align 4
ret void
exit:
ret i32 %select.next,

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -passes=loop-vectorize -mtriple=x86_64-unknown-linux-gnu -S | FileCheck %s

; Check if the vplan-based cost model select same VF to the legacy cost model.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to group with other tests for various crashes, llvm/test/Transforms/LoopVectorize/X86/cost-model.

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@ElvisWang123 ElvisWang123 merged commit 8009c1f into llvm:main Feb 25, 2025
11 checks passed
@ElvisWang123 ElvisWang123 deleted the fix-skip-insts-in-exit-condtion branch February 25, 2025 03:09
@fhahn
Copy link
Contributor

fhahn commented Feb 25, 2025

I think it would be good to pick this on the 20.x release branch. I requested a back port: #128694

fhahn pushed a commit to fhahn/llvm-project that referenced this pull request Feb 26, 2025
…uteCosts(). (llvm#127966)

Skip calculating instruction costs for exit conditions in
precomputeCosts() when it should be skipped.

Reported from:
llvm#115744 (comment)
Godbolt for reduced test cases: https://godbolt.org/z/fr4YMeqcv

(cherry picked from commit 8009c1f)
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Mar 11, 2025
…uteCosts(). (llvm#127966)

Skip calculating instruction costs for exit conditions in
precomputeCosts() when it should be skipped.

Reported from:
llvm#115744 (comment)
Godbolt for reduced test cases: https://godbolt.org/z/fr4YMeqcv

(cherry picked from commit 8009c1f)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants