Skip to content

[SCEV] Prove no-self-wrap from negative power of two step #101416

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

preames
Copy link
Collaborator

@preames preames commented Jul 31, 2024

We have existing code which reasons about a step evenly dividing the iteration space is a finite loop with a single exit implying no-self-wrap. The sign of the step doesn't effect this.

We have existing code which reasons about a step evenly dividing
the iteration space is a finite loop with a single exit implying
no-self-wrap.  The sign of the step doesn't effect this.

Not really a fan of the conditional negate logic here, but it seemed
like a bit of overkill to have isKnownToBeAPossibleNegativePowerOfTwo.
Any better ideas on how to structure this?
@preames preames requested a review from nikic as a code owner July 31, 2024 21:51
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 31, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 31, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Philip Reames (preames)

Changes

We have existing code which reasons about a step evenly dividing the iteration space is a finite loop with a single exit implying no-self-wrap. The sign of the step doesn't effect this.

Not really a fan of the conditional negate logic here, but it seemed like a bit of overkill to have isKnownToBeAPossibleNegativePowerOfTwo. Any better ideas on how to structure this?


Full diff: https://github.com/llvm/llvm-project/pull/101416.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+11-7)
  • (modified) llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll (+14-12)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 264ac392b16d1..755fc9a6d35bc 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -9157,13 +9157,17 @@ ScalarEvolution::ExitLimit ScalarEvolution::computeExitLimitFromICmp(
     if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(LHS))
       InnerLHS = ZExt->getOperand();
     if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(InnerLHS);
-        AR && !AR->hasNoSelfWrap() && AR->getLoop() == L && AR->isAffine() &&
-        isKnownToBeAPowerOfTwo(AR->getStepRecurrence(*this), /*OrZero=*/true)) {
-      auto Flags = AR->getNoWrapFlags();
-      Flags = setFlags(Flags, SCEV::FlagNW);
-      SmallVector<const SCEV *> Operands{AR->operands()};
-      Flags = StrengthenNoWrapFlags(this, scAddRecExpr, Operands, Flags);
-      setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
+        AR && !AR->hasNoSelfWrap() && AR->getLoop() == L && AR->isAffine()) {
+      const SCEV *Step = AR->getStepRecurrence(*this);
+      if (isKnownNegative(Step))
+        Step = getNegativeSCEV(Step);
+      if (isKnownToBeAPowerOfTwo(Step, /*OrZero=*/true)) {
+        auto Flags = AR->getNoWrapFlags();
+        Flags = setFlags(Flags, SCEV::FlagNW);
+        SmallVector<const SCEV *> Operands{AR->operands()};
+        Flags = StrengthenNoWrapFlags(this, scAddRecExpr, Operands, Flags);
+        setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
+      }
     }
   }
 
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll b/llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll
index 7c9498304e939..eda28e24f1b0e 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll
@@ -455,15 +455,16 @@ define void @vscale_countdown_ne(ptr nocapture %A, i32 %n) mustprogress vscale_r
 ; CHECK-NEXT:    %start = sub i32 %n, %vscale
 ; CHECK-NEXT:    --> ((-1 * vscale)<nsw> + %n) U: full-set S: full-set
 ; CHECK-NEXT:    %iv = phi i32 [ %sub, %for.body ], [ %start, %entry ]
-; CHECK-NEXT:    --> {((-1 * vscale)<nsw> + %n),+,(-1 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {((-1 * vscale)<nsw> + %n),+,(-1 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: ((vscale * (-1 + (-1 * (((-2 * vscale)<nsw> + %n) /u vscale))<nsw>)<nsw>) + %n) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:    %arrayidx = getelementptr inbounds i32, ptr %A, i32 %iv
-; CHECK-NEXT:    --> {((4 * %n) + (-4 * vscale)<nsw> + %A),+,(-4 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {((4 * %n) + (-4 * vscale)<nsw> + %A),+,(-4 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: ((4 * %n) + (vscale * (-4 + (-4 * (((-2 * vscale)<nsw> + %n) /u vscale)))) + %A) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:    %sub = sub i32 %iv, %vscale
-; CHECK-NEXT:    --> {((-2 * vscale)<nsw> + %n),+,(-1 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {((-2 * vscale)<nsw> + %n),+,(-1 * vscale)<nsw>}<nw><%for.body> U: full-set S: full-set Exits: ((vscale * (-2 + (-1 * (((-2 * vscale)<nsw> + %n) /u vscale))<nsw>)) + %n) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @vscale_countdown_ne
-; CHECK-NEXT:  Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT:  Loop %for.body: Unpredictable constant max backedge-taken count.
-; CHECK-NEXT:  Loop %for.body: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT:  Loop %for.body: backedge-taken count is (((-2 * vscale)<nsw> + %n) /u vscale)
+; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 2147483647
+; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (((-2 * vscale)<nsw> + %n) /u vscale)
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
 ;
 entry:
   %vscale = call i32 @llvm.vscale.i32()
@@ -495,15 +496,16 @@ define void @vscalex4_countdown_ne(ptr nocapture %A, i32 %n) mustprogress vscale
 ; CHECK-NEXT:    %start = sub i32 %n, %VF
 ; CHECK-NEXT:    --> ((-4 * vscale)<nsw> + %n) U: full-set S: full-set
 ; CHECK-NEXT:    %iv = phi i32 [ %sub, %for.body ], [ %start, %entry ]
-; CHECK-NEXT:    --> {((-4 * vscale)<nsw> + %n),+,(-4 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {((-4 * vscale)<nsw> + %n),+,(-4 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: ((vscale * (-4 + (-4 * (((-8 * vscale)<nsw> + %n) /u (4 * vscale)<nuw><nsw>))<nsw>)<nsw>) + %n) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:    %arrayidx = getelementptr inbounds i32, ptr %A, i32 %iv
-; CHECK-NEXT:    --> {((4 * %n) + (-16 * vscale)<nsw> + %A),+,(-16 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {((4 * %n) + (-16 * vscale)<nsw> + %A),+,(-16 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: ((4 * %n) + (vscale * (-16 + (-16 * (((-8 * vscale)<nsw> + %n) /u (4 * vscale)<nuw><nsw>)))) + %A) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:    %sub = sub i32 %iv, %VF
-; CHECK-NEXT:    --> {((-8 * vscale)<nsw> + %n),+,(-4 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    --> {((-8 * vscale)<nsw> + %n),+,(-4 * vscale)<nsw>}<nw><%for.body> U: full-set S: full-set Exits: ((vscale * (-8 + (-4 * (((-8 * vscale)<nsw> + %n) /u (4 * vscale)<nuw><nsw>))<nsw>)) + %n) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @vscalex4_countdown_ne
-; CHECK-NEXT:  Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT:  Loop %for.body: Unpredictable constant max backedge-taken count.
-; CHECK-NEXT:  Loop %for.body: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT:  Loop %for.body: backedge-taken count is (((-8 * vscale)<nsw> + %n) /u (4 * vscale)<nuw><nsw>)
+; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is i32 536870911
+; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (((-8 * vscale)<nsw> + %n) /u (4 * vscale)<nuw><nsw>)
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
 ;
 entry:
   %vscale = call i32 @llvm.vscale.i32()

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really a fan of the conditional negate logic here, but it seemed like a bit of overkill to have isKnownToBeAPossibleNegativePowerOfTwo. Any better ideas on how to structure this?

I think it would be okay to add an OrNegative parameter to the function. Computing getNegativeSCEV() may be expensive if it's not just negating a constant.

@preames
Copy link
Collaborator Author

preames commented Aug 1, 2024

Not really a fan of the conditional negate logic here, but it seemed like a bit of overkill to have isKnownToBeAPossibleNegativePowerOfTwo. Any better ideas on how to structure this?

I think it would be okay to add an OrNegative parameter to the function. Computing getNegativeSCEV() may be expensive if it's not just negating a constant.

Will do. Update pending later today.

return all_of(Mul->operands(), NonRecursive) && (OrZero || isKnownNonZero(S));
return all_of(Mul->operands(), NonRecursive) &&
(OrZero || isKnownNonZero(S)) &&
(!OrNegative || llvm::count_if(Mul->operands(), [this](const SCEV *S) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why we need this condition.

If OrNegative=false the previous code is correct. If OrNegative=true, then we're fine with either a positive or negative power of two. So even if we have two negative factors, the result will be a positive power of two (or zero) and still be fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I was originally thinking of this as power-of-two XOR negative power-of-two, and then generalized slightly, but you're right, this condition is a remnant of the XOR model. Will kill shortly.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -455,15 +455,16 @@ define void @vscale_countdown_ne(ptr nocapture %A, i32 %n) mustprogress vscale_r
; CHECK-NEXT: %start = sub i32 %n, %vscale
; CHECK-NEXT: --> ((-1 * vscale)<nsw> + %n) U: full-set S: full-set
; CHECK-NEXT: %iv = phi i32 [ %sub, %for.body ], [ %start, %entry ]
; CHECK-NEXT: --> {((-1 * vscale)<nsw> + %n),+,(-1 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %for.body: Computable }
; CHECK-NEXT: --> {((-1 * vscale)<nsw> + %n),+,(-1 * vscale)<nsw>}<%for.body> U: full-set S: full-set Exits: ((vscale * (-1 + (-1 * (((-2 * vscale)<nsw> + %n) /u vscale))<nsw>)<nsw>) + %n) LoopDispositions: { %for.body: Computable }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the TODO above.

@preames preames merged commit f0944f4 into llvm:main Aug 1, 2024
4 of 6 checks passed
@preames preames deleted the pr-scev-no-self-wrap-from-neg-power-of-two branch August 1, 2024 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants