-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SCEV] Use context sensitive reasoning in howFarToZero #94525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change builds on 0a357ad which supported non-constant strides in howFarToZero, but used only context insensative reasoning. This change does two things: 1) Directly use context sensative queries to prove facts established before the loop. Note that we technically only need facts known at the latch, but using facts known on entry is a conservative approximation which will cover most everything. 2) For the non-zero check, we can usually prove non-zero from the finite assumption implied by mustprogress. This eliminates the need to do the context sensative query in the common case.
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-analysis Author: Philip Reames (preames) ChangesThis change builds on 0a357ad which supported non-constant strides in howFarToZero, but used only context insensative reasoning. This change does two things:
Full diff: https://github.com/llvm/llvm-project/pull/94525.diff 4 Files Affected:
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 9808308cbfed9..8b947870e6da4 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -10485,7 +10485,7 @@ ScalarEvolution::ExitLimit ScalarEvolution::howFarToZero(const SCEV *V,
const SCEV *Step = getSCEVAtScope(AddRec->getOperand(1), L->getParentLoop());
const SCEVConstant *StepC = dyn_cast<SCEVConstant>(Step);
- if (!isLoopInvariant(Step, L) || !isKnownNonZero(Step))
+ if (!isLoopInvariant(Step, L))
return getCouldNotCompute();
// For positive steps (counting up until unsigned overflow):
@@ -10493,8 +10493,10 @@ ScalarEvolution::ExitLimit ScalarEvolution::howFarToZero(const SCEV *V,
// For negative steps (counting down to zero):
// N = Start/-Step
// First compute the unsigned distance from zero in the direction of Step.
- bool CountDown = isKnownNegative(Step);
- if (!CountDown && !isKnownNonNegative(Step))
+ const SCEV *Zero = getZero(AddRec->getType());
+ bool CountDown = isLoopEntryGuardedByCond(L, ICmpInst::ICMP_SLT, Step, Zero);
+ if (!CountDown &&
+ !isLoopEntryGuardedByCond(L, ICmpInst::ICMP_SGE, Step, Zero))
return getCouldNotCompute();
const SCEV *Distance = CountDown ? Start : getNegativeSCEV(Start);
@@ -10513,7 +10515,6 @@ ScalarEvolution::ExitLimit ScalarEvolution::howFarToZero(const SCEV *V,
// Explicitly handling this here is necessary because getUnsignedRange
// isn't context-sensitive; it doesn't know that we only care about the
// range inside the loop.
- const SCEV *Zero = getZero(Distance->getType());
const SCEV *One = getOne(Distance->getType());
const SCEV *DistancePlusOne = getAddExpr(Distance, One);
if (isLoopEntryGuardedByCond(L, ICmpInst::ICMP_NE, DistancePlusOne, Zero)) {
@@ -10533,6 +10534,14 @@ ScalarEvolution::ExitLimit ScalarEvolution::howFarToZero(const SCEV *V,
// will have undefined behavior due to wrapping.
if (ControlsOnlyExit && AddRec->hasNoSelfWrap() &&
loopHasNoAbnormalExits(AddRec->getLoop())) {
+
+ // If the stride is zero, the loop must be infinite. Most loops are
+ // finite by assumption, in which case the step being zero implies UB
+ // must execute if the loop is entered.
+ if (!loopIsFiniteByAssumption(L) &&
+ !isLoopEntryGuardedByCond(L, ICmpInst::ICMP_NE, Step, Zero))
+ return getCouldNotCompute();
+
const SCEV *Exact =
getUDivExpr(Distance, CountDown ? getNegativeSCEV(Step) : Step);
const SCEV *ConstantMax = getCouldNotCompute();
@@ -10547,7 +10556,7 @@ ScalarEvolution::ExitLimit ScalarEvolution::howFarToZero(const SCEV *V,
}
// Solve the general equation.
- if (!StepC)
+ if (!StepC || StepC->getValue()->isZero())
return getCouldNotCompute();
const SCEV *E = SolveLinEquationWithOverflow(StepC->getAPInt(),
getNegativeSCEV(Start), *this);
diff --git a/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll b/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
index 413bd21554c98..15e672d186c26 100644
--- a/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
+++ b/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
@@ -918,13 +918,13 @@ define void @crash(ptr %ptr) {
; CHECK-NEXT: %incdec.ptr112 = getelementptr inbounds i8, ptr %text.addr.5, i64 -1
; CHECK-NEXT: --> {(-1 + null)<nuw><nsw>,+,-1}<nw><%while.cond111> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %while.cond111: Computable, %while.body: Variant }
; CHECK-NEXT: %lastout.2271 = phi ptr [ %incdec.ptr126, %while.body125 ], [ %ptr, %while.end117 ]
-; CHECK-NEXT: --> {%ptr,+,1}<nuw><%while.body125> U: full-set S: full-set Exits: {(-2 + (-1 * (ptrtoint ptr %ptr to i64)) + %ptr),+,-1}<nw><%while.cond111> LoopDispositions: { %while.body125: Computable }
+; CHECK-NEXT: --> {%ptr,+,1}<nuw><%while.body125> U: full-set S: full-set Exits: {(2 + (ptrtoint ptr %ptr to i64) + %ptr),+,1}<nw><%while.cond111> LoopDispositions: { %while.body125: Computable }
; CHECK-NEXT: %incdec.ptr126 = getelementptr inbounds i8, ptr %lastout.2271, i64 1
-; CHECK-NEXT: --> {(1 + %ptr),+,1}<nuw><%while.body125> U: full-set S: full-set Exits: {(-1 + (-1 * (ptrtoint ptr %ptr to i64)) + %ptr),+,-1}<nw><%while.cond111> LoopDispositions: { %while.body125: Computable }
+; CHECK-NEXT: --> {(1 + %ptr),+,1}<nuw><%while.body125> U: full-set S: full-set Exits: {(3 + (ptrtoint ptr %ptr to i64) + %ptr),+,1}<nw><%while.cond111> LoopDispositions: { %while.body125: Computable }
; CHECK-NEXT: Determining loop execution counts for: @crash
-; CHECK-NEXT: Loop %while.body125: backedge-taken count is {(-2 + (-1 * (ptrtoint ptr %ptr to i64))),+,-1}<nw><%while.cond111>
+; CHECK-NEXT: Loop %while.body125: backedge-taken count is {(2 + (ptrtoint ptr %ptr to i64)),+,1}<nw><%while.cond111>
; CHECK-NEXT: Loop %while.body125: constant max backedge-taken count is i64 -2
-; CHECK-NEXT: Loop %while.body125: symbolic max backedge-taken count is {(-2 + (-1 * (ptrtoint ptr %ptr to i64))),+,-1}<nw><%while.cond111>
+; CHECK-NEXT: Loop %while.body125: symbolic max backedge-taken count is {(2 + (ptrtoint ptr %ptr to i64)),+,1}<nw><%while.cond111>
; CHECK-NEXT: Loop %while.body125: Trip multiple is 1
; CHECK-NEXT: Loop %while.cond111: Unpredictable backedge-taken count.
; CHECK-NEXT: Loop %while.cond111: Unpredictable constant max backedge-taken count.
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll b/llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
index ecf13320a5e81..2d02cb6194f4c 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
@@ -271,9 +271,10 @@ define void @ne_nsw_pos_step(ptr nocapture %A, i32 %n, i32 %s) mustprogress {
;
; CHECK-LABEL: 'ne_nsw_pos_step'
; CHECK-NEXT: Determining loop execution counts for: @ne_nsw_pos_step
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable constant max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (((-1 * %s) + %n) /u %s)
+; CHECK-NEXT: Loop %for.body: constant max backedge-taken count is i32 -1
+; CHECK-NEXT: Loop %for.body: symbolic max backedge-taken count is (((-1 * %s) + %n) /u %s)
+; CHECK-NEXT: Loop %for.body: Trip multiple is 1
;
entry:
%pos_step = icmp sgt i32 %s, 0
@@ -299,9 +300,10 @@ define void @ne_nsw_neg_step(ptr nocapture %A, i32 %n, i32 %s) mustprogress {
;
; CHECK-LABEL: 'ne_nsw_neg_step'
; CHECK-NEXT: Determining loop execution counts for: @ne_nsw_neg_step
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable constant max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (((-1 * %n) + %s) /u (-1 * %s))
+; CHECK-NEXT: Loop %for.body: constant max backedge-taken count is i32 -2
+; CHECK-NEXT: Loop %for.body: symbolic max backedge-taken count is (((-1 * %n) + %s) /u (-1 * %s))
+; CHECK-NEXT: Loop %for.body: Trip multiple is 1
;
entry:
%neg_step = icmp slt i32 %s, 0
@@ -327,9 +329,10 @@ define void @ne_nsw_nonneg_step(ptr nocapture %A, i32 %n, i32 %s) mustprogress {
;
; CHECK-LABEL: 'ne_nsw_nonneg_step'
; CHECK-NEXT: Determining loop execution counts for: @ne_nsw_nonneg_step
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable constant max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (((-1 * %s) + %n) /u %s)
+; CHECK-NEXT: Loop %for.body: constant max backedge-taken count is i32 -1
+; CHECK-NEXT: Loop %for.body: symbolic max backedge-taken count is (((-1 * %s) + %n) /u %s)
+; CHECK-NEXT: Loop %for.body: Trip multiple is 1
;
entry:
%nonneg_step = icmp sge i32 %s, 0
@@ -381,9 +384,10 @@ define void @ne_nuw_pos_step(ptr nocapture %A, i32 %n, i32 %s) mustprogress {
;
; CHECK-LABEL: 'ne_nuw_pos_step'
; CHECK-NEXT: Determining loop execution counts for: @ne_nuw_pos_step
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable constant max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (((-1 * %s) + %n) /u %s)
+; CHECK-NEXT: Loop %for.body: constant max backedge-taken count is i32 -1
+; CHECK-NEXT: Loop %for.body: symbolic max backedge-taken count is (((-1 * %s) + %n) /u %s)
+; CHECK-NEXT: Loop %for.body: Trip multiple is 1
;
entry:
%pos_step = icmp sgt i32 %s, 0
@@ -409,9 +413,10 @@ define void @ne_nuw_neg_step(ptr nocapture %A, i32 %n, i32 %s) mustprogress {
;
; CHECK-LABEL: 'ne_nuw_neg_step'
; CHECK-NEXT: Determining loop execution counts for: @ne_nuw_neg_step
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable constant max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (((-1 * %n) + %s) /u (-1 * %s))
+; CHECK-NEXT: Loop %for.body: constant max backedge-taken count is i32 -2
+; CHECK-NEXT: Loop %for.body: symbolic max backedge-taken count is (((-1 * %n) + %s) /u (-1 * %s))
+; CHECK-NEXT: Loop %for.body: Trip multiple is 1
;
entry:
%neg_step = icmp slt i32 %s, 0
@@ -437,9 +442,10 @@ define void @ne_nuw_nonneg_step(ptr nocapture %A, i32 %n, i32 %s) mustprogress {
;
; CHECK-LABEL: 'ne_nuw_nonneg_step'
; CHECK-NEXT: Determining loop execution counts for: @ne_nuw_nonneg_step
-; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable constant max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT: Loop %for.body: backedge-taken count is (((-1 * %s) + %n) /u %s)
+; CHECK-NEXT: Loop %for.body: constant max backedge-taken count is i32 -1
+; CHECK-NEXT: Loop %for.body: symbolic max backedge-taken count is (((-1 * %s) + %n) /u %s)
+; CHECK-NEXT: Loop %for.body: Trip multiple is 1
;
entry:
%nonneg_step = icmp sge i32 %s, 0
diff --git a/llvm/test/Transforms/LoopRotate/pr56260.ll b/llvm/test/Transforms/LoopRotate/pr56260.ll
index 70b68e7cf6db3..f9772eed4560f 100644
--- a/llvm/test/Transforms/LoopRotate/pr56260.ll
+++ b/llvm/test/Transforms/LoopRotate/pr56260.ll
@@ -17,16 +17,7 @@ define void @main() {
; CHECK-NEXT: [[TOBOOL3_NOT1:%.*]] = icmp eq i32 [[INC]], 0
; CHECK-NEXT: br i1 [[TOBOOL3_NOT1]], label [[L0_PREHEADER_LOOPEXIT]], label [[L1_PREHEADER_LR_PH:%.*]]
; CHECK: L1.preheader.lr.ph:
-; CHECK-NEXT: br label [[L1_PREHEADER:%.*]]
-; CHECK: L1.preheader:
-; CHECK-NEXT: [[SPEC_SELECT3:%.*]] = phi i32 [ [[INC]], [[L1_PREHEADER_LR_PH]] ], [ [[SPEC_SELECT:%.*]], [[L0_LATCH:%.*]] ]
-; CHECK-NEXT: [[K_02:%.*]] = phi i32 [ 0, [[L1_PREHEADER_LR_PH]] ], [ [[SPEC_SELECT3]], [[L0_LATCH]] ]
-; CHECK-NEXT: [[TOBOOL8_NOT:%.*]] = icmp eq i32 [[K_02]], 0
-; CHECK-NEXT: br label [[L0_LATCH]]
-; CHECK: L0.latch:
-; CHECK-NEXT: [[SPEC_SELECT]] = add nsw i32 [[SPEC_SELECT3]], [[INC]]
-; CHECK-NEXT: [[TOBOOL3_NOT:%.*]] = icmp eq i32 [[SPEC_SELECT]], 0
-; CHECK-NEXT: br i1 [[TOBOOL3_NOT]], label [[L0_L0_PREHEADER_LOOPEXIT_CRIT_EDGE:%.*]], label [[L1_PREHEADER]]
+; CHECK-NEXT: br label [[L0_L0_PREHEADER_LOOPEXIT_CRIT_EDGE:%.*]]
;
entry:
br label %L0.preheader
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Looking at this code, I think we can further improve the reasoning here in a couple of ways. Let me sketch out a few ideas here, so reviewers can see where this is going.
I'm probably going to post a patch for the first of those. Not sure I'm going to continue through with the later two, as they're a bit more involved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one does have some overhead: https://llvm-compile-time-tracker.com/compare.php?from=21711f89b9d85028160611f725bd33d7832d1d46&to=d89246df52708a8fd1fb126526f104ac89240fef&stat=instructions:u
It would be better to use applyLoopGuards, which should be cheaper and also provide better results for the queries you are performing here.
@nikic ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately the overhead stays about the same: http://llvm-compile-time-tracker.com/compare.php?from=21711f89b9d85028160611f725bd33d7832d1d46&to=234423fe3d82181c2cc1a7a614f60b46dea70297&stat=instructions%3Au
But it's small enough that I'm okay with landing this if it's useful.
return getCouldNotCompute(); | ||
|
||
// Specialize step for this loop so we get context sensative facts below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Specialize step for this loop so we get context sensative facts below. | |
// Specialize step for this loop so we get context sensitive facts below. |
@@ -10533,6 +10536,13 @@ ScalarEvolution::ExitLimit ScalarEvolution::howFarToZero(const SCEV *V, | |||
// will have undefined behavior due to wrapping. | |||
if (ControlsOnlyExit && AddRec->hasNoSelfWrap() && | |||
loopHasNoAbnormalExits(AddRec->getLoop())) { | |||
|
|||
// If the stride is zero, the loop must be infinite. Most loops are | |||
// finite by assumption, in which case the step being zero implies UB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Most loops are finite by assumption" is a C++-ism...
This change builds on 0a357ad which supported non-constant strides in howFarToZero, but used only context insensitive reasoning. This change does two things: 1) Directly use context sensitive queries to prove facts established before the loop. Note that we technically only need facts known at the latch, but using facts known on entry is a conservative approximation which will cover most everything. 2) For the non-zero check, we can usually prove non-zero from the finite assumption implied by mustprogress. This eliminates the need to do the context sensitive query in the common case.
…tion::howFarToZero` (#131522) llvm/llvm-project#94525 assumes that the loop will be infinite when the stride is zero. However, it doesn't hold when the start value of addrec is also zero. Closes llvm/llvm-project#131465.
…rToZero` (llvm#131522) llvm#94525 assumes that the loop will be infinite when the stride is zero. However, it doesn't hold when the start value of addrec is also zero. Closes llvm#131465. (cherry picked from commit c5a491e)
…tion::howFarToZero` (#131522) llvm/llvm-project#94525 assumes that the loop will be infinite when the stride is zero. However, it doesn't hold when the start value of addrec is also zero. Closes llvm/llvm-project#131465. (cherry picked from commit c5a491e)
This change builds on 0a357ad which supported non-constant strides in howFarToZero, but used only context insensitive reasoning.
This change does two things:
before the loop. Note that we technically only need facts known
at the latch, but using facts known on entry is a conservative
approximation which will cover most everything.
finite assumption implied by mustprogress. This eliminates the
need to do the context sensitive query in the common case.