Commit b812e57

preames and nikic authored
[SCEV] Consolidate code for proving wrap flags of controlling finite IVs (llvm#101404)
The canAssumeNoSelfWrap routine in howManyLessThans was doing two subtly inter-related things. First, it was proving no-self-wrap. This exactly duplicates the existing logic in the caller. Second, it was establishing the precondition for the nw->nsw/nuw inference. Specifically, we need to know that *this* exit must be taken for the inference to be sound. Otherwise, another (possibly abnormal) exit could be taken in the iteration where this IV would become poison.

This change moves all of that logic into the caller, and caches the resulting nuw/nsw flags in the AddRec. This centralizes the logic in one place, and makes it clear that it all depends on controlling the sole exit.

We do lose a couple of cases with SCEV predication. Specifically, if SCEV predication was able to convert e.g. zext(addrec) into an addrec(zext) using predication, but didn't record the nuw fact on the new addrec, then the consuming code can no longer fix this up. I don't think this case particularly matters.

---------

Co-authored-by: Nikita Popov <[email protected]>
1 parent d4f6fcf commit b812e57
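
The no-self-wrap half of the argument rests on the claim that, with a stride that evenly divides the iteration space, an IV that self-wraps can only revisit values it has already produced. The following is a minimal standalone sketch of that claim on an 8-bit IV. It is not LLVM code: the function name `revisitsSameValuesAfterSelfWrap` is hypothetical, and "self-wrap" is modelled informally as the IV having travelled a full 2^8 from its start.

```cpp
#include <cstdint>
#include <set>

// Illustrative model only (not part of LLVM): simulate the 8-bit recurrence
// {Start,+,Step} and report whether every value produced after the IV has
// travelled a full 2^8 was already produced before that point.
bool revisitsSameValuesAfterSelfWrap(uint8_t Start, uint8_t Step) {
  if (Step == 0)
    return true; // A zero stride never moves, so it trivially repeats.

  std::set<uint8_t> Seen;
  unsigned Travelled = 0; // Total distance covered, not truncated to 8 bits.
  uint8_t IV = Start;

  // Phase 1: record every value produced before the self-wrap.
  while (Travelled < 256) {
    Seen.insert(IV);
    IV = static_cast<uint8_t>(IV + Step);
    Travelled += Step;
  }

  // Phase 2: after the self-wrap, look for any value not seen before.
  for (int I = 0; I < 256; ++I) {
    if (!Seen.count(IV))
      return false;
    IV = static_cast<uint8_t>(IV + Step);
  }
  return true;
}
```

In this model a stride of 4 returns `true` for any start, while a stride of 3 starting at 0 returns `false`, because the first value after the wrap is 2, which is not a multiple of 3. With a loop-invariant RHS, repeating the same values means the exit test repeats the same outcomes, which is what lets the commit conclude that a self-wrapping IV would make the loop infinite.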

3 files changed: 29 additions, 55 deletions
llvm/lib/Analysis/ScalarEvolution.cpp

Lines changed: 27 additions & 47 deletions
@@ -9136,11 +9136,12 @@ ScalarEvolution::ExitLimit ScalarEvolution::computeExitLimitFromICmp(
   }

   // If this loop must exit based on this condition (or execute undefined
-  // behaviour), and we can prove the test sequence produced must repeat
-  // the same values on self-wrap of the IV, then we can infer that IV
-  // doesn't self wrap because if it did, we'd have an infinite (undefined)
-  // loop. Note that a stride of 0 is trivially no-self-wrap by definition.
+  // behaviour), see if we can improve wrap flags. This is essentially
+  // a must execute style proof.
   if (ControllingFiniteLoop && isLoopInvariant(RHS, L)) {
+    // If we can prove the test sequence produced must repeat the same values
+    // on self-wrap of the IV, then we can infer that IV doesn't self wrap
+    // because if it did, we'd have an infinite (undefined) loop.
     // TODO: We can peel off any functions which are invertible *in L*. Loop
     // invariant terms are effectively constants for our purposes here.
     auto *InnerLHS = LHS;
@@ -9156,6 +9157,25 @@ ScalarEvolution::ExitLimit ScalarEvolution::computeExitLimitFromICmp(
       Flags = StrengthenNoWrapFlags(this, scAddRecExpr, Operands, Flags);
       setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
     }
+
+    // For a slt/ult condition with a positive step, can we prove nsw/nuw?
+    // From no-self-wrap, this follows trivially from the fact that every
+    // (un)signed-wrapped, but not self-wrapped value must be LT than the
+    // last value before (un)signed wrap. Since we know that last value
+    // didn't exit, nor will any smaller one.
+    if (Pred == ICmpInst::ICMP_SLT || Pred == ICmpInst::ICMP_ULT) {
+      auto WrapType = Pred == ICmpInst::ICMP_SLT ? SCEV::FlagNSW : SCEV::FlagNUW;
+      if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(LHS);
+          AR && AR->getLoop() == L && AR->isAffine() &&
+          !AR->getNoWrapFlags(WrapType) && AR->hasNoSelfWrap() &&
+          isKnownPositive(AR->getStepRecurrence(*this))) {
+        auto Flags = AR->getNoWrapFlags();
+        Flags = setFlags(Flags, WrapType);
+        SmallVector<const SCEV*> Operands{AR->operands()};
+        Flags = StrengthenNoWrapFlags(this, scAddRecExpr, Operands, Flags);
+        setNoWrapFlags(const_cast<SCEVAddRecExpr *>(AR), Flags);
+      }
+    }
   }

   switch (Pred) {
@@ -12769,35 +12789,6 @@ ScalarEvolution::howManyLessThans(const SCEV *LHS, const SCEV *RHS,

   const SCEVAddRecExpr *IV = dyn_cast<SCEVAddRecExpr>(LHS);
   bool PredicatedIV = false;
-
-  auto canAssumeNoSelfWrap = [&](const SCEVAddRecExpr *AR) {
-    // Can we prove this loop *must* be UB if overflow of IV occurs?
-    // Reasoning goes as follows:
-    // * Suppose the IV did self wrap.
-    // * If Stride evenly divides the iteration space, then once wrap
-    //   occurs, the loop must revisit the same values.
-    // * We know that RHS is invariant, and that none of those values
-    //   caused this exit to be taken previously. Thus, this exit is
-    //   dynamically dead.
-    // * If this is the sole exit, then a dead exit implies the loop
-    //   must be infinite if there are no abnormal exits.
-    // * If the loop were infinite, then it must either not be mustprogress
-    //   or have side effects. Otherwise, it must be UB.
-    // * It can't (by assumption), be UB so we have contradicted our
-    //   premise and can conclude the IV did not in fact self-wrap.
-    if (!isLoopInvariant(RHS, L))
-      return false;
-
-    if (!isKnownToBeAPowerOfTwo(AR->getStepRecurrence(*this), /*OrZero=*/true,
-                                /*OrNegative*/ true))
-      return false;
-
-    if (!ControlsOnlyExit || !loopHasNoAbnormalExits(L))
-      return false;
-
-    return loopIsFiniteByAssumption(L);
-  };
-
   if (!IV) {
     if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(LHS)) {
       const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(ZExt->getOperand());
@@ -12948,21 +12939,10 @@ ScalarEvolution::howManyLessThans(const SCEV *LHS, const SCEV *RHS,
         Stride = getUMaxExpr(Stride, getOne(Stride->getType()));
       }
     }
-  } else if (!Stride->isOne() && !NoWrap) {
-    auto isUBOnWrap = [&]() {
-      // From no-self-wrap, we need to then prove no-(un)signed-wrap. This
-      // follows trivially from the fact that every (un)signed-wrapped, but
-      // not self-wrapped value must be LT than the last value before
-      // (un)signed wrap. Since we know that last value didn't exit, nor
-      // will any smaller one.
-      return canAssumeNoSelfWrap(IV);
-    };
-
+  } else if (!NoWrap) {
     // Avoid proven overflow cases: this will ensure that the backedge taken
-    // count will not generate any unsigned overflow. Relaxed no-overflow
-    // conditions exploit NoWrapFlags, allowing to optimize in presence of
-    // undefined behaviors like the case of C language.
-    if (canIVOverflowOnLT(RHS, Stride, IsSigned) && !isUBOnWrap())
+    // count will not generate any unsigned overflow.
+    if (canIVOverflowOnLT(RHS, Stride, IsSigned))
       return getCouldNotCompute();
   }
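
The comment moved into computeExitLimitFromICmp says that a value which has (un)signed-wrapped but not self-wrapped is less than the last value before the wrap. For the unsigned case this can be checked exhaustively on a small bit width. The sketch below does so for 8 bits. It is a standalone illustration rather than LLVM code: both function names are hypothetical, and "not self-wrapped" is modelled informally as the IV having travelled less than 2^8 from its start.

```cpp
#include <cstdint>

// Illustrative model only (not part of LLVM): for the 8-bit recurrence
// {Start,+,Step}, check that every value produced after the first unsigned
// wrap, but before the IV has travelled a full 2^8, is unsigned-less-than
// the last value produced before that wrap.
bool wrappedValuesStayBelowLastPreWrap(uint8_t Start, uint8_t Step) {
  if (Step == 0)
    return true; // Nothing ever wraps with a zero stride.

  unsigned Wide = Start; // Running value, kept un-truncated.

  // Advance to the last value that still fits in 8 bits.
  while (Wide + Step < 256)
    Wide += Step;
  const uint8_t LastPreWrap = static_cast<uint8_t>(Wide);

  // Visit the values that have unsigned-wrapped but not yet self-wrapped.
  for (Wide += Step; Wide < Start + 256u; Wide += Step) {
    const uint8_t Wrapped = static_cast<uint8_t>(Wide);
    if (!(Wrapped < LastPreWrap))
      return false;
  }
  return true;
}

// Exhaustively check every start and every signed-positive 8-bit step.
bool holdsForAllPositiveSteps() {
  for (unsigned Start = 0; Start < 256; ++Start)
    for (unsigned Step = 1; Step < 128; ++Step)
      if (!wrappedValuesStayBelowLastPreWrap(static_cast<uint8_t>(Start),
                                             static_cast<uint8_t>(Step)))
        return false;
  return true;
}
```

Under this model, with a loop that continues while `IV <u RHS`, the last pre-wrap value did not exit, so no smaller wrapped value can exit either. An IV that unsigned-wraps therefore keeps iterating until it self-wraps. Once no-self-wrap has been proven and this exit is known to be taken, nuw follows, and the signed case is analogous with slt and nsw.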

llvm/test/Analysis/ScalarEvolution/trip-count-implied-addrec.ll

Lines changed: 0 additions & 6 deletions
@@ -239,12 +239,6 @@ define void @neg_rhs_wrong_range(i16 %n.raw) mustprogress {
 ; CHECK-NEXT: Loop %for.body: Unpredictable backedge-taken count.
 ; CHECK-NEXT: Loop %for.body: Unpredictable constant max backedge-taken count.
 ; CHECK-NEXT: Loop %for.body: Unpredictable symbolic max backedge-taken count.
-; CHECK-NEXT: Loop %for.body: Predicated backedge-taken count is ((-1 + (2 umax (-1 + (zext i8 (trunc i16 %n.raw to i8) to i16))<nsw>)) /u 2)
-; CHECK-NEXT: Predicates:
-; CHECK-NEXT: {2,+,2}<nw><%for.body> Added Flags: <nusw>
-; CHECK-NEXT: Loop %for.body: Predicated symbolic max backedge-taken count is ((-1 + (2 umax (-1 + (zext i8 (trunc i16 %n.raw to i8) to i16))<nsw>)) /u 2)
-; CHECK-NEXT: Predicates:
-; CHECK-NEXT: {2,+,2}<nw><%for.body> Added Flags: <nusw>
 ;
 entry:
   %n.and = and i16 %n.raw, 255

llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll

Lines changed: 2 additions & 2 deletions
@@ -377,7 +377,7 @@ define void @vscale_slt_noflags(ptr nocapture %A, i32 %n) mustprogress vscale_ra
 ; CHECK-NEXT: %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.05
 ; CHECK-NEXT: --> {%A,+,(4 * vscale)<nuw><nsw>}<%for.body> U: full-set S: full-set Exits: ((4 * vscale * ((-1 + %n) /u vscale)) + %A) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT: %add = add i32 %i.05, %vscale
-; CHECK-NEXT: --> {vscale,+,vscale}<nw><%for.body> U: full-set S: full-set Exits: (vscale * (1 + ((-1 + %n) /u vscale))<nuw>) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT: --> {vscale,+,vscale}<nuw><nsw><%for.body> U: [2,-2147483648) S: [2,-2147483648) Exits: (vscale * (1 + ((-1 + %n) /u vscale))<nuw>) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT: Determining loop execution counts for: @vscale_slt_noflags
 ; CHECK-NEXT: Loop %for.body: backedge-taken count is ((-1 + %n) /u vscale)
 ; CHECK-NEXT: Loop %for.body: constant max backedge-taken count is i32 1073741822
@@ -415,7 +415,7 @@ define void @vscalex4_ult_noflags(ptr nocapture %A, i32 %n) mustprogress vscale_
 ; CHECK-NEXT: %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.05
 ; CHECK-NEXT: --> {%A,+,(16 * vscale)<nuw><nsw>}<%for.body> U: full-set S: full-set Exits: ((16 * vscale * ((-1 + %n) /u (4 * vscale)<nuw><nsw>)) + %A) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT: %add = add i32 %i.05, %VF
-; CHECK-NEXT: --> {(4 * vscale)<nuw><nsw>,+,(4 * vscale)<nuw><nsw>}<nw><%for.body> U: [0,-3) S: [-2147483648,2147483645) Exits: (vscale * (4 + (4 * ((-1 + %n) /u (4 * vscale)<nuw><nsw>))<nuw><nsw>)<nuw>) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT: --> {(4 * vscale)<nuw><nsw>,+,(4 * vscale)<nuw><nsw>}<nuw><%for.body> U: [8,-3) S: [-2147483648,2147483645) Exits: (vscale * (4 + (4 * ((-1 + %n) /u (4 * vscale)<nuw><nsw>))<nuw><nsw>)<nuw>) LoopDispositions: { %for.body: Computable }
 ; CHECK-NEXT: Determining loop execution counts for: @vscalex4_ult_noflags
 ; CHECK-NEXT: Loop %for.body: backedge-taken count is ((-1 + %n) /u (4 * vscale)<nuw><nsw>)
 ; CHECK-NEXT: Loop %for.body: constant max backedge-taken count is i32 536870910
