Skip to content

[Reland][SCEV] teach isImpliedViaOperations about samesign #133711

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 2, 2025

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Mar 31, 2025

This patch relands #124270.
Closes #126409.

The root cause is that we incorrectly preserve the samesign flag after truncating operands of an icmp:
https://alive2.llvm.org/ce/z/4NE9gS

artagnon and others added 3 commits March 31, 2025 17:36
Use CmpPredicate::getMatching in isImpliedCondBalancedTypes to pass
samesign information to isImpliedViaOperations, and teach it to call
CmpPredicate::getPreferredSignedPredicate, effectively making it
optimize with samesign information.
@dtcxzyw dtcxzyw requested review from artagnon and fhahn March 31, 2025 12:32
@dtcxzyw dtcxzyw requested a review from nikic as a code owner March 31, 2025 12:32
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Mar 31, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 31, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch relands #124270.
Closes #126409.

The root cause is that we incorrectly preserve the samesign flag after truncating operands of an icmp:
https://alive2.llvm.org/ce/z/4NE9gS


Patch is 21.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133711.diff

5 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+21-18)
  • (modified) llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll (+2-2)
  • (modified) llvm/test/Analysis/ScalarEvolution/implied-via-division.ll (+131-33)
  • (modified) llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll (+16-8)
  • (added) llvm/test/Transforms/IndVarSimplify/pr126409.ll (+31)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 600a061d4435e..32e669a400c13 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -11804,8 +11804,10 @@ bool ScalarEvolution::isImpliedCond(CmpPredicate Pred, const SCEV *LHS,
                                           MaxValue)) {
         const SCEV *TruncFoundLHS = getTruncateExpr(FoundLHS, NarrowType);
         const SCEV *TruncFoundRHS = getTruncateExpr(FoundRHS, NarrowType);
-        if (isImpliedCondBalancedTypes(Pred, LHS, RHS, FoundPred, TruncFoundLHS,
-                                       TruncFoundRHS, CtxI))
+        // We cannot preserve samesign after truncation.
+        if (isImpliedCondBalancedTypes(
+                Pred, LHS, RHS, static_cast<ICmpInst::Predicate>(FoundPred),
+                TruncFoundLHS, TruncFoundRHS, CtxI))
           return true;
       }
     }
@@ -11862,15 +11864,13 @@ bool ScalarEvolution::isImpliedCondBalancedTypes(
   }
 
   // Check whether the found predicate is the same as the desired predicate.
-  // FIXME: use CmpPredicate::getMatching here.
-  if (FoundPred == static_cast<CmpInst::Predicate>(Pred))
-    return isImpliedCondOperands(Pred, LHS, RHS, FoundLHS, FoundRHS, CtxI);
+  if (auto P = CmpPredicate::getMatching(FoundPred, Pred))
+    return isImpliedCondOperands(*P, LHS, RHS, FoundLHS, FoundRHS, CtxI);
 
   // Check whether swapping the found predicate makes it the same as the
   // desired predicate.
-  // FIXME: use CmpPredicate::getMatching here.
-  if (ICmpInst::getSwappedCmpPredicate(FoundPred) ==
-      static_cast<CmpInst::Predicate>(Pred)) {
+  if (auto P = CmpPredicate::getMatching(
+          ICmpInst::getSwappedCmpPredicate(FoundPred), Pred)) {
     // We can write the implication
     // 0.  LHS Pred      RHS  <-   FoundLHS SwapPred  FoundRHS
     // using one of the following ways:
@@ -11881,22 +11881,23 @@ bool ScalarEvolution::isImpliedCondBalancedTypes(
     // Forms 1. and 2. require swapping the operands of one condition. Don't
     // do this if it would break canonical constant/addrec ordering.
     if (!isa<SCEVConstant>(RHS) && !isa<SCEVAddRecExpr>(LHS))
-      return isImpliedCondOperands(FoundPred, RHS, LHS, FoundLHS, FoundRHS,
-                                   CtxI);
+      return isImpliedCondOperands(ICmpInst::getSwappedCmpPredicate(*P), RHS,
+                                   LHS, FoundLHS, FoundRHS, CtxI);
     if (!isa<SCEVConstant>(FoundRHS) && !isa<SCEVAddRecExpr>(FoundLHS))
-      return isImpliedCondOperands(Pred, LHS, RHS, FoundRHS, FoundLHS, CtxI);
+      return isImpliedCondOperands(*P, LHS, RHS, FoundRHS, FoundLHS, CtxI);
 
     // There's no clear preference between forms 3. and 4., try both.  Avoid
     // forming getNotSCEV of pointer values as the resulting subtract is
     // not legal.
     if (!LHS->getType()->isPointerTy() && !RHS->getType()->isPointerTy() &&
-        isImpliedCondOperands(FoundPred, getNotSCEV(LHS), getNotSCEV(RHS),
-                              FoundLHS, FoundRHS, CtxI))
+        isImpliedCondOperands(ICmpInst::getSwappedCmpPredicate(*P),
+                              getNotSCEV(LHS), getNotSCEV(RHS), FoundLHS,
+                              FoundRHS, CtxI))
       return true;
 
     if (!FoundLHS->getType()->isPointerTy() &&
         !FoundRHS->getType()->isPointerTy() &&
-        isImpliedCondOperands(Pred, LHS, RHS, getNotSCEV(FoundLHS),
+        isImpliedCondOperands(*P, LHS, RHS, getNotSCEV(FoundLHS),
                               getNotSCEV(FoundRHS), CtxI))
       return true;
 
@@ -12572,14 +12573,16 @@ bool ScalarEvolution::isImpliedViaOperations(CmpPredicate Pred, const SCEV *LHS,
     return false;
 
   // We only want to work with GT comparison so far.
-  if (Pred == ICmpInst::ICMP_ULT || Pred == ICmpInst::ICMP_SLT) {
+  if (ICmpInst::isLT(Pred)) {
     Pred = ICmpInst::getSwappedCmpPredicate(Pred);
     std::swap(LHS, RHS);
     std::swap(FoundLHS, FoundRHS);
   }
 
+  CmpInst::Predicate P = Pred.getPreferredSignedPredicate();
+
   // For unsigned, try to reduce it to corresponding signed comparison.
-  if (Pred == ICmpInst::ICMP_UGT)
+  if (P == ICmpInst::ICMP_UGT)
     // We can replace unsigned predicate with its signed counterpart if all
     // involved values are non-negative.
     // TODO: We could have better support for unsigned.
@@ -12592,10 +12595,10 @@ bool ScalarEvolution::isImpliedViaOperations(CmpPredicate Pred, const SCEV *LHS,
                                 FoundRHS) &&
           isImpliedCondOperands(ICmpInst::ICMP_SGT, RHS, MinusOne, FoundLHS,
                                 FoundRHS))
-        Pred = ICmpInst::ICMP_SGT;
+        P = ICmpInst::ICMP_SGT;
     }
 
-  if (Pred != ICmpInst::ICMP_SGT)
+  if (P != ICmpInst::ICMP_SGT)
     return false;
 
   auto GetOpFromSExt = [&](const SCEV *S) {
diff --git a/llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll b/llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll
index 93c6bc08af2a0..4d569cc69fa2b 100644
--- a/llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll
+++ b/llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll
@@ -5,9 +5,9 @@
 define i32 @exit_count_samesign(i32 %iter.count, ptr %ptr) {
 ; CHECK-LABEL: 'exit_count_samesign'
 ; CHECK-NEXT:  Determining loop execution counts for: @exit_count_samesign
-; CHECK-NEXT:  Loop %inner.loop: backedge-taken count is (-1 + (1 smax {(-1 + %iter.count)<nsw>,+,-1}<nsw><%outer.loop>))<nsw>
+; CHECK-NEXT:  Loop %inner.loop: backedge-taken count is {(-2 + %iter.count),+,-1}<nw><%outer.loop>
 ; CHECK-NEXT:  Loop %inner.loop: constant max backedge-taken count is i32 2147483646
-; CHECK-NEXT:  Loop %inner.loop: symbolic max backedge-taken count is (-1 + (1 smax {(-1 + %iter.count)<nsw>,+,-1}<nsw><%outer.loop>))<nsw>
+; CHECK-NEXT:  Loop %inner.loop: symbolic max backedge-taken count is {(-2 + %iter.count),+,-1}<nw><%outer.loop>
 ; CHECK-NEXT:  Loop %inner.loop: Trip multiple is 1
 ; CHECK-NEXT:  Loop %outer.loop: <multiple exits> Unpredictable backedge-taken count.
 ; CHECK-NEXT:  Loop %outer.loop: Unpredictable constant max backedge-taken count.
diff --git a/llvm/test/Analysis/ScalarEvolution/implied-via-division.ll b/llvm/test/Analysis/ScalarEvolution/implied-via-division.ll
index a1d30406095ec..d83301243ef30 100644
--- a/llvm/test/Analysis/ScalarEvolution/implied-via-division.ll
+++ b/llvm/test/Analysis/ScalarEvolution/implied-via-division.ll
@@ -2,12 +2,10 @@
 ; RUN: opt < %s -disable-output -passes="print<scalar-evolution>" \
 ; RUN:   -scalar-evolution-classify-expressions=0 2>&1 | FileCheck %s
 
-declare void @llvm.experimental.guard(i1, ...)
-
-define void @test_1(i32 %n) nounwind {
-; Prove that (n > 1) ===> (n / 2 > 0).
-; CHECK-LABEL: 'test_1'
-; CHECK-NEXT:  Determining loop execution counts for: @test_1
+define void @implied1(i32 %n) {
+; Prove that (n s> 1) ===> (n / 2 s> 0).
+; CHECK-LABEL: 'implied1'
+; CHECK-NEXT:  Determining loop execution counts for: @implied1
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + %n.div.2)<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + %n.div.2)<nsw>
@@ -29,10 +27,35 @@ exit:
   ret void
 }
 
-define void @test_1neg(i32 %n) nounwind {
-; Prove that (n > 0) =\=> (n / 2 > 0).
-; CHECK-LABEL: 'test_1neg'
-; CHECK-NEXT:  Determining loop execution counts for: @test_1neg
+define void @implied1_samesign(i32 %n) {
+; Prove that (n > 1) ===> (n / 2 s> 0).
+; CHECK-LABEL: 'implied1_samesign'
+; CHECK-NEXT:  Determining loop execution counts for: @implied1_samesign
+; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
+; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: Trip multiple is 1
+;
+entry:
+  %cmp1 = icmp samesign ugt i32 %n, 1
+  %n.div.2 = sdiv i32 %n, 2
+  call void @llvm.assume(i1 %cmp1)
+  br label %header
+
+header:
+  %indvar = phi i32 [ %indvar.next, %header ], [ 0, %entry ]
+  %indvar.next = add i32 %indvar, 1
+  %exitcond = icmp sgt i32 %n.div.2, %indvar.next
+  br i1 %exitcond, label %header, label %exit
+
+exit:
+  ret void
+}
+
+define void @implied1_neg(i32 %n) {
+; Prove that (n s> 0) =\=> (n / 2 s> 0).
+; CHECK-LABEL: 'implied1_neg'
+; CHECK-NEXT:  Determining loop execution counts for: @implied1_neg
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
@@ -54,10 +77,10 @@ exit:
   ret void
 }
 
-define void @test_2(i32 %n) nounwind {
-; Prove that (n >= 2) ===> (n / 2 > 0).
-; CHECK-LABEL: 'test_2'
-; CHECK-NEXT:  Determining loop execution counts for: @test_2
+define void @implied2(i32 %n) {
+; Prove that (n s>= 2) ===> (n / 2 s> 0).
+; CHECK-LABEL: 'implied2'
+; CHECK-NEXT:  Determining loop execution counts for: @implied2
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + %n.div.2)<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + %n.div.2)<nsw>
@@ -79,10 +102,35 @@ exit:
   ret void
 }
 
-define void @test_2neg(i32 %n) nounwind {
-; Prove that (n >= 1) =\=> (n / 2 > 0).
-; CHECK-LABEL: 'test_2neg'
-; CHECK-NEXT:  Determining loop execution counts for: @test_2neg
+define void @implied2_samesign(i32 %n) {
+; Prove that (n >= 2) ===> (n / 2 s> 0).
+; CHECK-LABEL: 'implied2_samesign'
+; CHECK-NEXT:  Determining loop execution counts for: @implied2_samesign
+; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
+; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
+; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
+; CHECK-NEXT:  Loop %header: Trip multiple is 1
+;
+entry:
+  %cmp1 = icmp samesign uge i32 %n, 2
+  %n.div.2 = sdiv i32 %n, 2
+  call void @llvm.assume(i1 %cmp1)
+  br label %header
+
+header:
+  %indvar = phi i32 [ %indvar.next, %header ], [ 0, %entry ]
+  %indvar.next = add i32 %indvar, 1
+  %exitcond = icmp sgt i32 %n.div.2, %indvar.next
+  br i1 %exitcond, label %header, label %exit
+
+exit:
+  ret void
+}
+
+define void @implied2_neg(i32 %n) {
+; Prove that (n s>= 1) =\=> (n / 2 s> 0).
+; CHECK-LABEL: 'implied2_neg'
+; CHECK-NEXT:  Determining loop execution counts for: @implied2_neg
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
@@ -104,10 +152,10 @@ exit:
   ret void
 }
 
-define void @test_3(i32 %n) nounwind {
-; Prove that (n > -2) ===> (n / 2 >= 0).
-; CHECK-LABEL: 'test_3'
-; CHECK-NEXT:  Determining loop execution counts for: @test_3
+define void @implied3(i32 %n) {
+; Prove that (n s> -2) ===> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied3'
+; CHECK-NEXT:  Determining loop execution counts for: @implied3
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (1 + %n.div.2)<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741824
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (1 + %n.div.2)<nsw>
@@ -129,10 +177,35 @@ exit:
   ret void
 }
 
-define void @test_3neg(i32 %n) nounwind {
+define void @implied3_samesign(i32 %n) {
+; Prove that (n > -2) ===> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied3_samesign'
+; CHECK-NEXT:  Determining loop execution counts for: @implied3_samesign
+; CHECK-NEXT:  Loop %header: backedge-taken count is (1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1
+; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: Trip multiple is 1
+;
+entry:
+  %cmp1 = icmp samesign ugt i32 %n, -2
+  %n.div.2 = sdiv i32 %n, 2
+  call void @llvm.assume(i1 %cmp1)
+  br label %header
+
+header:
+  %indvar = phi i32 [ %indvar.next, %header ], [ 0, %entry ]
+  %indvar.next = add i32 %indvar, 1
+  %exitcond = icmp sge i32 %n.div.2, %indvar
+  br i1 %exitcond, label %header, label %exit
+
+exit:
+  ret void
+}
+
+define void @implied3_neg(i32 %n) {
 ; Prove that (n > -3) =\=> (n / 2 >= 0).
-; CHECK-LABEL: 'test_3neg'
-; CHECK-NEXT:  Determining loop execution counts for: @test_3neg
+; CHECK-LABEL: 'implied3_neg'
+; CHECK-NEXT:  Determining loop execution counts for: @implied3_neg
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (0 smax (1 + %n.div.2)<nsw>)
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741824
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (0 smax (1 + %n.div.2)<nsw>)
@@ -154,10 +227,10 @@ exit:
   ret void
 }
 
-define void @test_4(i32 %n) nounwind {
-; Prove that (n >= -1) ===> (n / 2 >= 0).
-; CHECK-LABEL: 'test_4'
-; CHECK-NEXT:  Determining loop execution counts for: @test_4
+define void @implied4(i32 %n) {
+; Prove that (n s>= -1) ===> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied4'
+; CHECK-NEXT:  Determining loop execution counts for: @implied4
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (1 + %n.div.2)<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741824
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (1 + %n.div.2)<nsw>
@@ -179,10 +252,35 @@ exit:
   ret void
 }
 
-define void @test_4neg(i32 %n) nounwind {
-; Prove that (n >= -2) =\=> (n / 2 >= 0).
-; CHECK-LABEL: 'test_4neg'
-; CHECK-NEXT:  Determining loop execution counts for: @test_4neg
+define void @implied4_samesign(i32 %n) {
+; Prove that (n >= -1) ===> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied4_samesign'
+; CHECK-NEXT:  Determining loop execution counts for: @implied4_samesign
+; CHECK-NEXT:  Loop %header: backedge-taken count is (1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1
+; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: Trip multiple is 1
+;
+entry:
+  %cmp1 = icmp samesign uge i32 %n, -1
+  %n.div.2 = sdiv i32 %n, 2
+  call void @llvm.assume(i1 %cmp1)
+  br label %header
+
+header:
+  %indvar = phi i32 [ %indvar.next, %header ], [ 0, %entry ]
+  %indvar.next = add i32 %indvar, 1
+  %exitcond = icmp sge i32 %n.div.2, %indvar
+  br i1 %exitcond, label %header, label %exit
+
+exit:
+  ret void
+}
+
+define void @implied4_neg(i32 %n) {
+; Prove that (n s>= -2) =\=> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied4_neg'
+; CHECK-NEXT:  Determining loop execution counts for: @implied4_neg
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (0 smax (1 + %n.div.2)<nsw>)
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741824
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (0 smax (1 + %n.div.2)<nsw>)
diff --git a/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll b/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll
index 1207f47c5e3c9..c4e26c98ed24a 100644
--- a/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll
+++ b/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll
@@ -68,28 +68,32 @@ define i32 @iv_zext_zext_gt_slt(i32 %iter.count, ptr %ptr) {
 ; CHECK-LABEL: define i32 @iv_zext_zext_gt_slt(
 ; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
-; CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[ITER_COUNT]] to i64
+; CHECK-NEXT:    [[TMP0:%.*]] = add nsw i32 [[ITER_COUNT]], -1
 ; CHECK-NEXT:    br label %[[OUTER_LOOP:.*]]
 ; CHECK:       [[PH_LOOPEXIT:.*]]:
 ; CHECK-NEXT:    br label %[[PH:.*]]
 ; CHECK:       [[PH]]:
+; CHECK-NEXT:    [[INDVARS_IV_NEXT3:%.*]] = add i32 [[INDVARS_IV1:%.*]], -1
 ; CHECK-NEXT:    br label %[[OUTER_LOOP]]
 ; CHECK:       [[OUTER_LOOP]]:
-; CHECK-NEXT:    [[INDVARS_IV1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT2:%.*]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
-; CHECK-NEXT:    [[INDVARS_IV_NEXT2]] = add nsw i64 [[INDVARS_IV1]], -1
+; CHECK-NEXT:    [[INDVARS_IV1]] = phi i32 [ [[INDVARS_IV_NEXT3]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT:    [[IV_OUTER:%.*]] = phi i32 [ [[IV_OUTER_1:%.*]], %[[PH]] ], [ [[ITER_COUNT]], %[[ENTRY]] ]
+; CHECK-NEXT:    [[IV_OUTER_1]] = add nsw i32 [[IV_OUTER]], -1
+; CHECK-NEXT:    [[INDVARS_IV_NEXT2:%.*]] = zext nneg i32 [[IV_OUTER_1]] to i64
 ; CHECK-NEXT:    [[GEP_OUTER:%.*]] = getelementptr i8, ptr [[PTR]], i64 [[INDVARS_IV_NEXT2]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP_OUTER]], align 1
-; CHECK-NEXT:    [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i64 [[INDVARS_IV1]], 1
+; CHECK-NEXT:    [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i32 [[IV_OUTER]], 1
 ; CHECK-NEXT:    br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
 ; CHECK:       [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV1]] to i64
 ; CHECK-NEXT:    br label %[[INNER_LOOP:.*]]
 ; CHECK:       [[INNER_LOOP]]:
 ; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
 ; CHECK-NEXT:    [[GEP_INNER:%.*]] = getelementptr i8, ptr [[PTR]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP_INNER]], align 1
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
-; CHECK-NEXT:    [[EXIT_COND_INNER:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT]], [[INDVARS_IV_NEXT2]]
-; CHECK-NEXT:    br i1 [[EXIT_COND_INNER]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT:    br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
 ; CHECK:       [[EXIT:.*:]]
 ; CHECK-NEXT:    ret i32 0
 ;
@@ -424,28 +428,32 @@ define i32 @iv_sext_sext_gt_slt(i32 %iter.count, ptr %ptr) {
 ; CHECK-LABEL: define i32 @iv_sext_sext_gt_slt(
 ; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    [[TMP1:%.*]] = add nsw i32 [[ITER_COUNT]], -1
 ; CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[ITER_COUNT]] to i64
 ; CHECK-NEXT:    br label %[[OUTER_LOOP:.*]]
 ; CHECK:       [[PH_LOOPEXIT:.*]]:
 ; CHECK-NEXT:    br label %[[PH:.*]]
 ; CHECK:       [[PH]]:
+; CHECK-NEXT:    [[INDVARS_IV_NEXT3:%.*]] = add i32 [[INDVARS_IV2:%.*]], -1
 ; CHECK-NEXT:    br label %[[OUTER_LOOP]]
 ; CHECK:       [[OUTER_LOOP]]:
 ; CHECK-NEXT:    [[INDVARS_IV1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT2:%.*]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT:    [[INDVARS_IV2]] = phi i32 [ [[INDVARS_IV_NEXT3]], %[[PH]] ], [ [[TMP1]], %[[ENTRY]] ]
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT2]] = add nsw i64 [[INDVARS_IV1]], -1
 ; CHECK-NEXT:    [[GEP_OUTER:%.*]] = getelementptr i8, ptr [[PTR]], i64 [[INDVARS_IV_NEXT2]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP_OUTER]], align 1
 ; CHECK-NEXT:    [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i64 [[INDVARS_IV1]], 1
 ; CHECK-NEXT:    br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
 ; CHECK:       [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV2]] to i64
 ; CHECK-NEXT:    br label %[[INNER_LOOP:.*]]
 ; CHECK:       [[INNER_LOOP]]:
 ; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
 ; CHECK-NEXT:    [[GEP_INNER:%.*]] = getelementptr i8, ptr [[PTR]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP_INNER]], align 1
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
-; CHECK-NEXT:    [[EXIT_COND_INNER:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT]], [[INDVARS_IV_NEXT2]]
-; CHECK-NEXT:    br i1 [[EXIT_COND_INNER]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT:    br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
 ; CHECK:       [[EXIT:.*:]]
 ; CHECK-NEXT:    ret i32 0
 ;
diff --git a/llvm/test/Transforms/IndVarSimplify/pr126409.ll b/llvm/test/Transforms/IndVarSimplify/pr126409.ll
new file mode 100644
index 0000000000000..a491ce81878de
--- /dev/null
+++ b/llvm/test/Transforms/IndVarSi...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Mar 31, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch relands #124270.
Closes #126409.

The root cause is that we incorrectly preserve the samesign flag after truncating operands of an icmp:
https://alive2.llvm.org/ce/z/4NE9gS


Patch is 21.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133711.diff

5 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+21-18)
  • (modified) llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll (+2-2)
  • (modified) llvm/test/Analysis/ScalarEvolution/implied-via-division.ll (+131-33)
  • (modified) llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll (+16-8)
  • (added) llvm/test/Transforms/IndVarSimplify/pr126409.ll (+31)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 600a061d4435e..32e669a400c13 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -11804,8 +11804,10 @@ bool ScalarEvolution::isImpliedCond(CmpPredicate Pred, const SCEV *LHS,
                                           MaxValue)) {
         const SCEV *TruncFoundLHS = getTruncateExpr(FoundLHS, NarrowType);
         const SCEV *TruncFoundRHS = getTruncateExpr(FoundRHS, NarrowType);
-        if (isImpliedCondBalancedTypes(Pred, LHS, RHS, FoundPred, TruncFoundLHS,
-                                       TruncFoundRHS, CtxI))
+        // We cannot preserve samesign after truncation.
+        if (isImpliedCondBalancedTypes(
+                Pred, LHS, RHS, static_cast<ICmpInst::Predicate>(FoundPred),
+                TruncFoundLHS, TruncFoundRHS, CtxI))
           return true;
       }
     }
@@ -11862,15 +11864,13 @@ bool ScalarEvolution::isImpliedCondBalancedTypes(
   }
 
   // Check whether the found predicate is the same as the desired predicate.
-  // FIXME: use CmpPredicate::getMatching here.
-  if (FoundPred == static_cast<CmpInst::Predicate>(Pred))
-    return isImpliedCondOperands(Pred, LHS, RHS, FoundLHS, FoundRHS, CtxI);
+  if (auto P = CmpPredicate::getMatching(FoundPred, Pred))
+    return isImpliedCondOperands(*P, LHS, RHS, FoundLHS, FoundRHS, CtxI);
 
   // Check whether swapping the found predicate makes it the same as the
   // desired predicate.
-  // FIXME: use CmpPredicate::getMatching here.
-  if (ICmpInst::getSwappedCmpPredicate(FoundPred) ==
-      static_cast<CmpInst::Predicate>(Pred)) {
+  if (auto P = CmpPredicate::getMatching(
+          ICmpInst::getSwappedCmpPredicate(FoundPred), Pred)) {
     // We can write the implication
     // 0.  LHS Pred      RHS  <-   FoundLHS SwapPred  FoundRHS
     // using one of the following ways:
@@ -11881,22 +11881,23 @@ bool ScalarEvolution::isImpliedCondBalancedTypes(
     // Forms 1. and 2. require swapping the operands of one condition. Don't
     // do this if it would break canonical constant/addrec ordering.
     if (!isa<SCEVConstant>(RHS) && !isa<SCEVAddRecExpr>(LHS))
-      return isImpliedCondOperands(FoundPred, RHS, LHS, FoundLHS, FoundRHS,
-                                   CtxI);
+      return isImpliedCondOperands(ICmpInst::getSwappedCmpPredicate(*P), RHS,
+                                   LHS, FoundLHS, FoundRHS, CtxI);
     if (!isa<SCEVConstant>(FoundRHS) && !isa<SCEVAddRecExpr>(FoundLHS))
-      return isImpliedCondOperands(Pred, LHS, RHS, FoundRHS, FoundLHS, CtxI);
+      return isImpliedCondOperands(*P, LHS, RHS, FoundRHS, FoundLHS, CtxI);
 
     // There's no clear preference between forms 3. and 4., try both.  Avoid
     // forming getNotSCEV of pointer values as the resulting subtract is
     // not legal.
     if (!LHS->getType()->isPointerTy() && !RHS->getType()->isPointerTy() &&
-        isImpliedCondOperands(FoundPred, getNotSCEV(LHS), getNotSCEV(RHS),
-                              FoundLHS, FoundRHS, CtxI))
+        isImpliedCondOperands(ICmpInst::getSwappedCmpPredicate(*P),
+                              getNotSCEV(LHS), getNotSCEV(RHS), FoundLHS,
+                              FoundRHS, CtxI))
       return true;
 
     if (!FoundLHS->getType()->isPointerTy() &&
         !FoundRHS->getType()->isPointerTy() &&
-        isImpliedCondOperands(Pred, LHS, RHS, getNotSCEV(FoundLHS),
+        isImpliedCondOperands(*P, LHS, RHS, getNotSCEV(FoundLHS),
                               getNotSCEV(FoundRHS), CtxI))
       return true;
 
@@ -12572,14 +12573,16 @@ bool ScalarEvolution::isImpliedViaOperations(CmpPredicate Pred, const SCEV *LHS,
     return false;
 
   // We only want to work with GT comparison so far.
-  if (Pred == ICmpInst::ICMP_ULT || Pred == ICmpInst::ICMP_SLT) {
+  if (ICmpInst::isLT(Pred)) {
     Pred = ICmpInst::getSwappedCmpPredicate(Pred);
     std::swap(LHS, RHS);
     std::swap(FoundLHS, FoundRHS);
   }
 
+  CmpInst::Predicate P = Pred.getPreferredSignedPredicate();
+
   // For unsigned, try to reduce it to corresponding signed comparison.
-  if (Pred == ICmpInst::ICMP_UGT)
+  if (P == ICmpInst::ICMP_UGT)
     // We can replace unsigned predicate with its signed counterpart if all
     // involved values are non-negative.
     // TODO: We could have better support for unsigned.
@@ -12592,10 +12595,10 @@ bool ScalarEvolution::isImpliedViaOperations(CmpPredicate Pred, const SCEV *LHS,
                                 FoundRHS) &&
           isImpliedCondOperands(ICmpInst::ICMP_SGT, RHS, MinusOne, FoundLHS,
                                 FoundRHS))
-        Pred = ICmpInst::ICMP_SGT;
+        P = ICmpInst::ICMP_SGT;
     }
 
-  if (Pred != ICmpInst::ICMP_SGT)
+  if (P != ICmpInst::ICMP_SGT)
     return false;
 
   auto GetOpFromSExt = [&](const SCEV *S) {
diff --git a/llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll b/llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll
index 93c6bc08af2a0..4d569cc69fa2b 100644
--- a/llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll
+++ b/llvm/test/Analysis/ScalarEvolution/exit-count-samesign.ll
@@ -5,9 +5,9 @@
 define i32 @exit_count_samesign(i32 %iter.count, ptr %ptr) {
 ; CHECK-LABEL: 'exit_count_samesign'
 ; CHECK-NEXT:  Determining loop execution counts for: @exit_count_samesign
-; CHECK-NEXT:  Loop %inner.loop: backedge-taken count is (-1 + (1 smax {(-1 + %iter.count)<nsw>,+,-1}<nsw><%outer.loop>))<nsw>
+; CHECK-NEXT:  Loop %inner.loop: backedge-taken count is {(-2 + %iter.count),+,-1}<nw><%outer.loop>
 ; CHECK-NEXT:  Loop %inner.loop: constant max backedge-taken count is i32 2147483646
-; CHECK-NEXT:  Loop %inner.loop: symbolic max backedge-taken count is (-1 + (1 smax {(-1 + %iter.count)<nsw>,+,-1}<nsw><%outer.loop>))<nsw>
+; CHECK-NEXT:  Loop %inner.loop: symbolic max backedge-taken count is {(-2 + %iter.count),+,-1}<nw><%outer.loop>
 ; CHECK-NEXT:  Loop %inner.loop: Trip multiple is 1
 ; CHECK-NEXT:  Loop %outer.loop: <multiple exits> Unpredictable backedge-taken count.
 ; CHECK-NEXT:  Loop %outer.loop: Unpredictable constant max backedge-taken count.
diff --git a/llvm/test/Analysis/ScalarEvolution/implied-via-division.ll b/llvm/test/Analysis/ScalarEvolution/implied-via-division.ll
index a1d30406095ec..d83301243ef30 100644
--- a/llvm/test/Analysis/ScalarEvolution/implied-via-division.ll
+++ b/llvm/test/Analysis/ScalarEvolution/implied-via-division.ll
@@ -2,12 +2,10 @@
 ; RUN: opt < %s -disable-output -passes="print<scalar-evolution>" \
 ; RUN:   -scalar-evolution-classify-expressions=0 2>&1 | FileCheck %s
 
-declare void @llvm.experimental.guard(i1, ...)
-
-define void @test_1(i32 %n) nounwind {
-; Prove that (n > 1) ===> (n / 2 > 0).
-; CHECK-LABEL: 'test_1'
-; CHECK-NEXT:  Determining loop execution counts for: @test_1
+define void @implied1(i32 %n) {
+; Prove that (n s> 1) ===> (n / 2 s> 0).
+; CHECK-LABEL: 'implied1'
+; CHECK-NEXT:  Determining loop execution counts for: @implied1
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + %n.div.2)<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + %n.div.2)<nsw>
@@ -29,10 +27,35 @@ exit:
   ret void
 }
 
-define void @test_1neg(i32 %n) nounwind {
-; Prove that (n > 0) =\=> (n / 2 > 0).
-; CHECK-LABEL: 'test_1neg'
-; CHECK-NEXT:  Determining loop execution counts for: @test_1neg
+define void @implied1_samesign(i32 %n) {
+; Prove that (n > 1) ===> (n / 2 s> 0).
+; CHECK-LABEL: 'implied1_samesign'
+; CHECK-NEXT:  Determining loop execution counts for: @implied1_samesign
+; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
+; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: Trip multiple is 1
+;
+entry:
+  %cmp1 = icmp samesign ugt i32 %n, 1
+  %n.div.2 = sdiv i32 %n, 2
+  call void @llvm.assume(i1 %cmp1)
+  br label %header
+
+header:
+  %indvar = phi i32 [ %indvar.next, %header ], [ 0, %entry ]
+  %indvar.next = add i32 %indvar, 1
+  %exitcond = icmp sgt i32 %n.div.2, %indvar.next
+  br i1 %exitcond, label %header, label %exit
+
+exit:
+  ret void
+}
+
+define void @implied1_neg(i32 %n) {
+; Prove that (n s> 0) =\=> (n / 2 s> 0).
+; CHECK-LABEL: 'implied1_neg'
+; CHECK-NEXT:  Determining loop execution counts for: @implied1_neg
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
@@ -54,10 +77,10 @@ exit:
   ret void
 }
 
-define void @test_2(i32 %n) nounwind {
-; Prove that (n >= 2) ===> (n / 2 > 0).
-; CHECK-LABEL: 'test_2'
-; CHECK-NEXT:  Determining loop execution counts for: @test_2
+define void @implied2(i32 %n) {
+; Prove that (n s>= 2) ===> (n / 2 s> 0).
+; CHECK-LABEL: 'implied2'
+; CHECK-NEXT:  Determining loop execution counts for: @implied2
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + %n.div.2)<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + %n.div.2)<nsw>
@@ -79,10 +102,35 @@ exit:
   ret void
 }
 
-define void @test_2neg(i32 %n) nounwind {
-; Prove that (n >= 1) =\=> (n / 2 > 0).
-; CHECK-LABEL: 'test_2neg'
-; CHECK-NEXT:  Determining loop execution counts for: @test_2neg
+define void @implied2_samesign(i32 %n) {
+; Prove that (n >= 2) ===> (n / 2 s> 0).
+; CHECK-LABEL: 'implied2_samesign'
+; CHECK-NEXT:  Determining loop execution counts for: @implied2_samesign
+; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
+; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
+; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
+; CHECK-NEXT:  Loop %header: Trip multiple is 1
+;
+entry:
+  %cmp1 = icmp samesign uge i32 %n, 2
+  %n.div.2 = sdiv i32 %n, 2
+  call void @llvm.assume(i1 %cmp1)
+  br label %header
+
+header:
+  %indvar = phi i32 [ %indvar.next, %header ], [ 0, %entry ]
+  %indvar.next = add i32 %indvar, 1
+  %exitcond = icmp sgt i32 %n.div.2, %indvar.next
+  br i1 %exitcond, label %header, label %exit
+
+exit:
+  ret void
+}
+
+define void @implied2_neg(i32 %n) {
+; Prove that (n s>= 1) =\=> (n / 2 s> 0).
+; CHECK-LABEL: 'implied2_neg'
+; CHECK-NEXT:  Determining loop execution counts for: @implied2_neg
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741822
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (-1 + (1 smax %n.div.2))<nsw>
@@ -104,10 +152,10 @@ exit:
   ret void
 }
 
-define void @test_3(i32 %n) nounwind {
-; Prove that (n > -2) ===> (n / 2 >= 0).
-; CHECK-LABEL: 'test_3'
-; CHECK-NEXT:  Determining loop execution counts for: @test_3
+define void @implied3(i32 %n) {
+; Prove that (n s> -2) ===> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied3'
+; CHECK-NEXT:  Determining loop execution counts for: @implied3
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (1 + %n.div.2)<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741824
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (1 + %n.div.2)<nsw>
@@ -129,10 +177,35 @@ exit:
   ret void
 }
 
-define void @test_3neg(i32 %n) nounwind {
+define void @implied3_samesign(i32 %n) {
+; Prove that (n > -2) ===> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied3_samesign'
+; CHECK-NEXT:  Determining loop execution counts for: @implied3_samesign
+; CHECK-NEXT:  Loop %header: backedge-taken count is (1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1
+; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: Trip multiple is 1
+;
+entry:
+  %cmp1 = icmp samesign ugt i32 %n, -2
+  %n.div.2 = sdiv i32 %n, 2
+  call void @llvm.assume(i1 %cmp1)
+  br label %header
+
+header:
+  %indvar = phi i32 [ %indvar.next, %header ], [ 0, %entry ]
+  %indvar.next = add i32 %indvar, 1
+  %exitcond = icmp sge i32 %n.div.2, %indvar
+  br i1 %exitcond, label %header, label %exit
+
+exit:
+  ret void
+}
+
+define void @implied3_neg(i32 %n) {
 ; Prove that (n > -3) =\=> (n / 2 >= 0).
-; CHECK-LABEL: 'test_3neg'
-; CHECK-NEXT:  Determining loop execution counts for: @test_3neg
+; CHECK-LABEL: 'implied3_neg'
+; CHECK-NEXT:  Determining loop execution counts for: @implied3_neg
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (0 smax (1 + %n.div.2)<nsw>)
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741824
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (0 smax (1 + %n.div.2)<nsw>)
@@ -154,10 +227,10 @@ exit:
   ret void
 }
 
-define void @test_4(i32 %n) nounwind {
-; Prove that (n >= -1) ===> (n / 2 >= 0).
-; CHECK-LABEL: 'test_4'
-; CHECK-NEXT:  Determining loop execution counts for: @test_4
+define void @implied4(i32 %n) {
+; Prove that (n s>= -1) ===> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied4'
+; CHECK-NEXT:  Determining loop execution counts for: @implied4
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (1 + %n.div.2)<nsw>
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741824
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (1 + %n.div.2)<nsw>
@@ -179,10 +252,35 @@ exit:
   ret void
 }
 
-define void @test_4neg(i32 %n) nounwind {
-; Prove that (n >= -2) =\=> (n / 2 >= 0).
-; CHECK-LABEL: 'test_4neg'
-; CHECK-NEXT:  Determining loop execution counts for: @test_4neg
+define void @implied4_samesign(i32 %n) {
+; Prove that (n >= -1) ===> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied4_samesign'
+; CHECK-NEXT:  Determining loop execution counts for: @implied4_samesign
+; CHECK-NEXT:  Loop %header: backedge-taken count is (1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1
+; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (1 + %n.div.2)<nsw>
+; CHECK-NEXT:  Loop %header: Trip multiple is 1
+;
+entry:
+  %cmp1 = icmp samesign uge i32 %n, -1
+  %n.div.2 = sdiv i32 %n, 2
+  call void @llvm.assume(i1 %cmp1)
+  br label %header
+
+header:
+  %indvar = phi i32 [ %indvar.next, %header ], [ 0, %entry ]
+  %indvar.next = add i32 %indvar, 1
+  %exitcond = icmp sge i32 %n.div.2, %indvar
+  br i1 %exitcond, label %header, label %exit
+
+exit:
+  ret void
+}
+
+define void @implied4_neg(i32 %n) {
+; Prove that (n s>= -2) =\=> (n / 2 s>= 0).
+; CHECK-LABEL: 'implied4_neg'
+; CHECK-NEXT:  Determining loop execution counts for: @implied4_neg
 ; CHECK-NEXT:  Loop %header: backedge-taken count is (0 smax (1 + %n.div.2)<nsw>)
 ; CHECK-NEXT:  Loop %header: constant max backedge-taken count is i32 1073741824
 ; CHECK-NEXT:  Loop %header: symbolic max backedge-taken count is (0 smax (1 + %n.div.2)<nsw>)
diff --git a/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll b/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll
index 1207f47c5e3c9..c4e26c98ed24a 100644
--- a/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll
+++ b/llvm/test/Transforms/IndVarSimplify/iv-ext-samesign.ll
@@ -68,28 +68,32 @@ define i32 @iv_zext_zext_gt_slt(i32 %iter.count, ptr %ptr) {
 ; CHECK-LABEL: define i32 @iv_zext_zext_gt_slt(
 ; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
-; CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[ITER_COUNT]] to i64
+; CHECK-NEXT:    [[TMP0:%.*]] = add nsw i32 [[ITER_COUNT]], -1
 ; CHECK-NEXT:    br label %[[OUTER_LOOP:.*]]
 ; CHECK:       [[PH_LOOPEXIT:.*]]:
 ; CHECK-NEXT:    br label %[[PH:.*]]
 ; CHECK:       [[PH]]:
+; CHECK-NEXT:    [[INDVARS_IV_NEXT3:%.*]] = add i32 [[INDVARS_IV1:%.*]], -1
 ; CHECK-NEXT:    br label %[[OUTER_LOOP]]
 ; CHECK:       [[OUTER_LOOP]]:
-; CHECK-NEXT:    [[INDVARS_IV1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT2:%.*]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
-; CHECK-NEXT:    [[INDVARS_IV_NEXT2]] = add nsw i64 [[INDVARS_IV1]], -1
+; CHECK-NEXT:    [[INDVARS_IV1]] = phi i32 [ [[INDVARS_IV_NEXT3]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT:    [[IV_OUTER:%.*]] = phi i32 [ [[IV_OUTER_1:%.*]], %[[PH]] ], [ [[ITER_COUNT]], %[[ENTRY]] ]
+; CHECK-NEXT:    [[IV_OUTER_1]] = add nsw i32 [[IV_OUTER]], -1
+; CHECK-NEXT:    [[INDVARS_IV_NEXT2:%.*]] = zext nneg i32 [[IV_OUTER_1]] to i64
 ; CHECK-NEXT:    [[GEP_OUTER:%.*]] = getelementptr i8, ptr [[PTR]], i64 [[INDVARS_IV_NEXT2]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP_OUTER]], align 1
-; CHECK-NEXT:    [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i64 [[INDVARS_IV1]], 1
+; CHECK-NEXT:    [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i32 [[IV_OUTER]], 1
 ; CHECK-NEXT:    br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
 ; CHECK:       [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV1]] to i64
 ; CHECK-NEXT:    br label %[[INNER_LOOP:.*]]
 ; CHECK:       [[INNER_LOOP]]:
 ; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
 ; CHECK-NEXT:    [[GEP_INNER:%.*]] = getelementptr i8, ptr [[PTR]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP_INNER]], align 1
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
-; CHECK-NEXT:    [[EXIT_COND_INNER:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT]], [[INDVARS_IV_NEXT2]]
-; CHECK-NEXT:    br i1 [[EXIT_COND_INNER]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT:    br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
 ; CHECK:       [[EXIT:.*:]]
 ; CHECK-NEXT:    ret i32 0
 ;
@@ -424,28 +428,32 @@ define i32 @iv_sext_sext_gt_slt(i32 %iter.count, ptr %ptr) {
 ; CHECK-LABEL: define i32 @iv_sext_sext_gt_slt(
 ; CHECK-SAME: i32 [[ITER_COUNT:%.*]], ptr [[PTR:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    [[TMP1:%.*]] = add nsw i32 [[ITER_COUNT]], -1
 ; CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[ITER_COUNT]] to i64
 ; CHECK-NEXT:    br label %[[OUTER_LOOP:.*]]
 ; CHECK:       [[PH_LOOPEXIT:.*]]:
 ; CHECK-NEXT:    br label %[[PH:.*]]
 ; CHECK:       [[PH]]:
+; CHECK-NEXT:    [[INDVARS_IV_NEXT3:%.*]] = add i32 [[INDVARS_IV2:%.*]], -1
 ; CHECK-NEXT:    br label %[[OUTER_LOOP]]
 ; CHECK:       [[OUTER_LOOP]]:
 ; CHECK-NEXT:    [[INDVARS_IV1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT2:%.*]], %[[PH]] ], [ [[TMP0]], %[[ENTRY]] ]
+; CHECK-NEXT:    [[INDVARS_IV2]] = phi i32 [ [[INDVARS_IV_NEXT3]], %[[PH]] ], [ [[TMP1]], %[[ENTRY]] ]
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT2]] = add nsw i64 [[INDVARS_IV1]], -1
 ; CHECK-NEXT:    [[GEP_OUTER:%.*]] = getelementptr i8, ptr [[PTR]], i64 [[INDVARS_IV_NEXT2]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP_OUTER]], align 1
 ; CHECK-NEXT:    [[EXIT_COND_OUTER:%.*]] = icmp samesign ugt i64 [[INDVARS_IV1]], 1
 ; CHECK-NEXT:    br i1 [[EXIT_COND_OUTER]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[PH]]
 ; CHECK:       [[INNER_LOOP_PREHEADER]]:
+; CHECK-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[INDVARS_IV2]] to i64
 ; CHECK-NEXT:    br label %[[INNER_LOOP:.*]]
 ; CHECK:       [[INNER_LOOP]]:
 ; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ 0, %[[INNER_LOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[INNER_LOOP]] ]
 ; CHECK-NEXT:    [[GEP_INNER:%.*]] = getelementptr i8, ptr [[PTR]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[GEP_INNER]], align 1
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
-; CHECK-NEXT:    [[EXIT_COND_INNER:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT]], [[INDVARS_IV_NEXT2]]
-; CHECK-NEXT:    br i1 [[EXIT_COND_INNER]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT:    br i1 [[EXITCOND]], label %[[INNER_LOOP]], label %[[PH_LOOPEXIT]]
 ; CHECK:       [[EXIT:.*:]]
 ; CHECK-NEXT:    ret i32 0
 ;
diff --git a/llvm/test/Transforms/IndVarSimplify/pr126409.ll b/llvm/test/Transforms/IndVarSimplify/pr126409.ll
new file mode 100644
index 0000000000000..a491ce81878de
--- /dev/null
+++ b/llvm/test/Transforms/IndVarSi...
[truncated]

Copy link
Contributor

@artagnon artagnon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for following up on this! This is quite a non-trivial fix, and the inter-diff seems obviously correct, although I can't be certain that there are other pending issues: I think you created some kind of fuzz-service for LLVM; if possible, can we fuzz the patch pre-landing?

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 2, 2025

Thanks for following up on this! This is quite a non-trivial fix, and the inter-diff seems obviously correct, although I can't be certain that there are other pending issues: I think you created some kind of fuzz-service for LLVM; if possible, can we fuzz the patch pre-landing?

Csmith generated 2 million tests, and all the tests passed. Is it good enough to merge?

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for tracking it down!

TruncFoundRHS, CtxI))
// We cannot preserve samesign after truncation.
if (isImpliedCondBalancedTypes(
Pred, LHS, RHS, static_cast<ICmpInst::Predicate>(FoundPred),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of wonder whether we should have some more explicit API for this like FoundPred.withoutSameSign(). Doing this by going through a static_cast is a bit subtle...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will fix it in a follow up patch.

Copy link
Contributor

@artagnon artagnon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fuzzing! This change LGTM!

@dtcxzyw dtcxzyw merged commit f066d75 into llvm:main Apr 2, 2025
14 checks passed
@dtcxzyw dtcxzyw deleted the perf/reapply-124270 branch April 2, 2025 10:45
dtcxzyw added a commit that referenced this pull request Apr 2, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Apr 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SCEV: buggy samesign handling in #124270
4 participants