[LoopPeel] Avoid unprofitable last iteration peeling with equalities #143588

preames · 2025-06-10T19:18:46Z

The peeling profitability logic considers two types of conditions: monotonic relational and equalities. Monotonic conditions can only toggle from true to false (or vice versa) once, but equalities (even with no wrap) can toggle twice on successive iterations.

The forward peeling code has a special case for this where we will advance the peel count by one to ensure we get both transitions, but we didn't consider this in the recently added reverse peeling code. This isn't unsound, but it does result in us peeling a loop without being able to actully eliminate the condition in the loop.

This patch guards against identifying the second transition. An alternate approach would be to extend the reverse peeling to allow more than one iteration, and peel off two iterations.

The peeling profitability logic considers two types of conditions: monotonic relational and equalities. Monotonic conditions can only toggle from true to false (or vice versa) once, but equalities (even with no wrap) can toggle twice on successive iterations. The forward peeling code has a special case for this where we will advance the peel count by one to ensure we get both transitions, but we didn't consider this in the recently added reverse peeling code. This isn't unsound, but it does result in us peeling a loop without being able to actully eliminate the condition in the loop. This patch guards against identifying the second transition. An alternate approach would be to extend the reverse peeling to allow more than one iteration, and peel off two iterations.

llvmbot · 2025-06-10T19:19:34Z

@llvm/pr-subscribers-llvm-transforms

Author: Philip Reames (preames)

Changes

The peeling profitability logic considers two types of conditions: monotonic relational and equalities. Monotonic conditions can only toggle from true to false (or vice versa) once, but equalities (even with no wrap) can toggle twice on successive iterations.

The forward peeling code has a special case for this where we will advance the peel count by one to ensure we get both transitions, but we didn't consider this in the recently added reverse peeling code. This isn't unsound, but it does result in us peeling a loop without being able to actully eliminate the condition in the loop.

This patch guards against identifying the second transition. An alternate approach would be to extend the reverse peeling to allow more than one iteration, and peel off two iterations.

Full diff: https://github.com/llvm/llvm-project/pull/143588.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Utils/LoopPeel.cpp (+17-3)
(modified) llvm/test/Transforms/LoopUnroll/peel-last-iteration-with-variable-trip-count.ll (+7-29)

diff --git a/llvm/lib/Transforms/Utils/LoopPeel.cpp b/llvm/lib/Transforms/Utils/LoopPeel.cpp
index f343962548259..7e56c09dcc87d 100644
--- a/llvm/lib/Transforms/Utils/LoopPeel.cpp
+++ b/llvm/lib/Transforms/Utils/LoopPeel.cpp
@@ -381,9 +381,23 @@ static bool shouldPeelLastIteration(Loop &L, CmpPredicate Pred,
   const SCEV *ValAtSecondToLastIter = LeftAR->evaluateAtIteration(
       SE.getMinusSCEV(BTC, SE.getOne(BTC->getType())), SE);
 
-  return SE.isKnownPredicate(ICmpInst::getInversePredicate(Pred), ValAtLastIter,
-                             RightSCEV) &&
-         SE.isKnownPredicate(Pred, ValAtSecondToLastIter, RightSCEV);
+  CmpPredicate InvPred = ICmpInst::getInversePredicate(Pred);
+  if (!SE.isKnownPredicate(InvPred, ValAtLastIter, RightSCEV) ||
+      !SE.isKnownPredicate(Pred, ValAtSecondToLastIter, RightSCEV))
+    return false;
+
+  // For a monotonic predicate, we've found the single transition point and
+  // don't need to check further.
+  if (!ICmpInst::isEquality(Pred))
+    return true;
+
+  // For an equality, the condition can toggle twice.  For the peeling to be
+  // profitable, we need to know that this is the first toggle.  The two
+  // toggles are separate by one iteration.  If the BTC isn't at least two
+  // this will overflow, but that's okay correctness wise.
+  const SCEV *ValAtPriorIter = LeftAR->evaluateAtIteration(
+      SE.getMinusSCEV(BTC, SE.getConstant(BTC->getType(), 2)), SE);
+  return SE.isKnownPredicate(Pred, ValAtPriorIter, RightSCEV);
 }
 
 // Return the number of iterations to peel off from the beginning and end of the
diff --git a/llvm/test/Transforms/LoopUnroll/peel-last-iteration-with-variable-trip-count.ll b/llvm/test/Transforms/LoopUnroll/peel-last-iteration-with-variable-trip-count.ll
index 5704be33f0f0b..34b53c54d9bc6 100644
--- a/llvm/test/Transforms/LoopUnroll/peel-last-iteration-with-variable-trip-count.ll
+++ b/llvm/test/Transforms/LoopUnroll/peel-last-iteration-with-variable-trip-count.ll
@@ -176,39 +176,18 @@ define i32 @peel_last_with_trip_count_check_lcssa_phi_cmp_not_invar2(i32 %n) {
 ; CHECK-SAME: i32 [[N:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
 ; CHECK-NEXT:    [[SUB:%.*]] = add i32 [[N]], -2
-; CHECK-NEXT:    [[TMP0:%.*]] = add i32 [[N]], -1
-; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i32 [[TMP0]], 0
-; CHECK-NEXT:    br i1 [[TMP1]], label %[[ENTRY_SPLIT:.*]], label %[[EXIT_PEEL_BEGIN:.*]]
-; CHECK:       [[ENTRY_SPLIT]]:
 ; CHECK-NEXT:    br label %[[LOOP:.*]]
 ; CHECK:       [[LOOP]]:
-; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ 0, %[[ENTRY_SPLIT]] ], [ [[IV_NEXT1:%.*]], %[[LOOP]] ]
-; CHECK-NEXT:    [[C:%.*]] = icmp ne i32 [[IV]], [[SUB]]
-; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[C]], i32 1, i32 2
-; CHECK-NEXT:    call void @foo(i32 [[SEL]])
-; CHECK-NEXT:    [[IV_NEXT1]] = add nuw i32 [[IV]], 1
-; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 [[N]], 1
-; CHECK-NEXT:    [[EC1:%.*]] = icmp ne i32 [[IV_NEXT1]], [[TMP2]]
-; CHECK-NEXT:    br i1 [[EC1]], label %[[LOOP]], label %[[EXIT_PEEL_BEGIN_LOOPEXIT:.*]], !llvm.loop [[LOOP2:![0-9]+]]
-; CHECK:       [[EXIT_PEEL_BEGIN_LOOPEXIT]]:
-; CHECK-NEXT:    [[DOTPH:%.*]] = phi i32 [ [[IV_NEXT1]], %[[LOOP]] ]
-; CHECK-NEXT:    br label %[[EXIT_PEEL_BEGIN]]
-; CHECK:       [[EXIT_PEEL_BEGIN]]:
-; CHECK-NEXT:    [[TMP3:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[DOTPH]], %[[EXIT_PEEL_BEGIN_LOOPEXIT]] ]
-; CHECK-NEXT:    br label %[[LOOP_PEEL:.*]]
-; CHECK:       [[LOOP_PEEL]]:
+; CHECK-NEXT:    [[TMP3:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
 ; CHECK-NEXT:    [[C_PEEL:%.*]] = icmp ne i32 [[TMP3]], [[SUB]]
 ; CHECK-NEXT:    [[SEL_LCSSA:%.*]] = select i1 [[C_PEEL]], i32 1, i32 2
 ; CHECK-NEXT:    call void @foo(i32 [[SEL_LCSSA]])
-; CHECK-NEXT:    [[IV_NEXT:%.*]] = add i32 [[TMP3]], 1
+; CHECK-NEXT:    [[IV_NEXT]] = add i32 [[TMP3]], 1
 ; CHECK-NEXT:    [[EC:%.*]] = icmp ne i32 [[IV_NEXT]], [[N]]
-; CHECK-NEXT:    br i1 [[EC]], label %[[EXIT_PEEL_NEXT:.*]], label %[[EXIT_PEEL_NEXT]]
-; CHECK:       [[EXIT_PEEL_NEXT]]:
-; CHECK-NEXT:    br label %[[LOOP_PEEL_NEXT:.*]]
-; CHECK:       [[LOOP_PEEL_NEXT]]:
-; CHECK-NEXT:    br label %[[EXIT:.*]]
+; CHECK-NEXT:    br i1 [[EC]], label %[[LOOP]], label %[[EXIT:.*]]
 ; CHECK:       [[EXIT]]:
-; CHECK-NEXT:    ret i32 [[SEL_LCSSA]]
+; CHECK-NEXT:    [[SEL_LCSSA1:%.*]] = phi i32 [ [[SEL_LCSSA]], %[[LOOP]] ]
+; CHECK-NEXT:    ret i32 [[SEL_LCSSA1]]
 ;
 entry:
   %sub = add i32 %n, -2
@@ -245,7 +224,7 @@ define i32 @peel_last_with_trip_count_check_lcssa_phi_cmp_not_invar3(i32 %n) {
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw i32 [[IV]], 1
 ; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 [[N]], 1
 ; CHECK-NEXT:    [[EC:%.*]] = icmp ne i32 [[IV_NEXT]], [[TMP1]]
-; CHECK-NEXT:    br i1 [[EC]], label %[[LOOP]], label %[[EXIT_PEEL_BEGIN_LOOPEXIT:.*]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK-NEXT:    br i1 [[EC]], label %[[LOOP]], label %[[EXIT_PEEL_BEGIN_LOOPEXIT:.*]], !llvm.loop [[LOOP2:![0-9]+]]
 ; CHECK:       [[EXIT_PEEL_BEGIN_LOOPEXIT]]:
 ; CHECK-NEXT:    [[DOTPH:%.*]] = phi i32 [ [[IV_NEXT]], %[[LOOP]] ]
 ; CHECK-NEXT:    br label %[[EXIT_PEEL_BEGIN]]
@@ -363,7 +342,7 @@ define void @peel_last_with_trip_count_check_nested_loop(i32 %n) {
 ; CHECK-NEXT:    [[IV_NEXT1]] = add nuw i32 [[IV1]], 1
 ; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 [[N]], 1
 ; CHECK-NEXT:    [[EXITCOND_NOT1:%.*]] = icmp eq i32 [[IV_NEXT1]], [[TMP2]]
-; CHECK-NEXT:    br i1 [[EXITCOND_NOT1]], label %[[OUTER_HEADER_LOOPEXIT_PEEL_BEGIN_LOOPEXIT]], label %[[INNER_HEADER]], !llvm.loop [[LOOP4:![0-9]+]]
+; CHECK-NEXT:    br i1 [[EXITCOND_NOT1]], label %[[OUTER_HEADER_LOOPEXIT_PEEL_BEGIN_LOOPEXIT]], label %[[INNER_HEADER]], !llvm.loop [[LOOP3:![0-9]+]]
 ;
 entry:
   %sub = add i32 %n, -1
@@ -392,5 +371,4 @@ inner.latch:
 ; CHECK: [[META1]] = !{!"llvm.loop.peeled.count", i32 1}
 ; CHECK: [[LOOP2]] = distinct !{[[LOOP2]], [[META1]]}
 ; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META1]]}
-; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META1]]}
 ;.

llvm/lib/Transforms/Utils/LoopPeel.cpp

fhahn · 2025-06-12T15:56:49Z

llvm/lib/Transforms/Utils/LoopPeel.cpp

+
+  // For a monotonic predicate, we've found the single transition point and
+  // don't need to check further.
+  if (!ICmpInst::isEquality(Pred))


Should we use SE.getMonotonicPredicateType additionally here, to handle the case where the AddRec may wrap, which could make something like i < x not motontonic?

I'm confused. In order to get here, we have to have identified a predicate which is either a nssw equality, or a monotonic predicate. (Along the min/max intrinsic, we check the same conditions, but using different names, etc..). What's the point of adding the additional monotonic check here?

preames requested a review from fhahn June 10, 2025 19:18

llvmbot added the llvm:transforms label Jun 10, 2025

preames mentioned this pull request Jun 10, 2025

[LoopPeel] Allow bidirection condition change for last iteration peeling #143562

Open

fhahn reviewed Jun 12, 2025

View reviewed changes

Merge branch 'main' into pr-peel-last-equality-edge-case

318af92

preames changed the title ~~[LoopPeel] Avoid unprofitable last iteration peeling with equalitis~~ [LoopPeel] Avoid unprofitable last iteration peeling with equalities Jun 17, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[LoopPeel] Avoid unprofitable last iteration peeling with equalities #143588

[LoopPeel] Avoid unprofitable last iteration peeling with equalities #143588

Uh oh!

preames commented Jun 10, 2025

Uh oh!

llvmbot commented Jun 10, 2025

Uh oh!

Uh oh!

fhahn Jun 12, 2025

Uh oh!

preames Jun 16, 2025 •

edited

Loading

Uh oh!

Uh oh!

[LoopPeel] Avoid unprofitable last iteration peeling with equalities #143588

Are you sure you want to change the base?

[LoopPeel] Avoid unprofitable last iteration peeling with equalities #143588

Uh oh!

Conversation

preames commented Jun 10, 2025

Uh oh!

llvmbot commented Jun 10, 2025

Uh oh!

Uh oh!

fhahn Jun 12, 2025

Choose a reason for hiding this comment

Uh oh!

preames Jun 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

preames Jun 16, 2025 •

edited

Loading