
Commit 29fa37e

[SCEV] If max BTC is zero, then so is the exact BTC [2 of 2]
This extends D108921 into a generic rule applied when constructing ExitLimits along all paths. The remaining paths (primarily howFarToZero) don't have the same reasoning about UB sensitivity as the howManyLessThans ones did. Instead, the remaining cause for max counts being more precise than exact counts is that we apply context-sensitive loop guards on the max path, and not on the exact path. That choice is mildly suspect, but out of scope for this patch.

The MVETailPredication.cpp change deserves a bit of explanation. We were previously figuring out that two SCEVs happened to be equal because they happened to be identical. When we optimized one with context-sensitive information, but not the other, we lost the ability to prove them equal. So, cover this case by subtracting and then applying loop guards again. Without this, we see changes in test/CodeGen/Thumb2/mve-blockplacement.ll.

Differential Revision: https://reviews.llvm.org/D109015
1 parent 3d157cf commit 29fa37e

3 files changed: 19 additions and 13 deletions
llvm/lib/Analysis/ScalarEvolution.cpp

Lines changed: 6 additions & 4 deletions
@@ -7635,6 +7635,12 @@ ScalarEvolution::ExitLimit::ExitLimit(
     const SCEV *E, const SCEV *M, bool MaxOrZero,
     ArrayRef<const SmallPtrSetImpl<const SCEVPredicate *> *> PredSetList)
     : ExactNotTaken(E), MaxNotTaken(M), MaxOrZero(MaxOrZero) {
+  // If we prove the max count is zero, so is the symbolic bound. This happens
+  // in practice due to differences in a) how context sensitive we've chosen
+  // to be and b) how we reason about bounds impied by UB.
+  if (MaxNotTaken->isZero())
+    ExactNotTaken = MaxNotTaken;
+
   assert((isa<SCEVCouldNotCompute>(ExactNotTaken) ||
           !isa<SCEVCouldNotCompute>(MaxNotTaken)) &&
          "Exact is not allowed to be less precise than Max");
@@ -11939,10 +11945,6 @@ ScalarEvolution::howManyLessThans(const SCEV *LHS, const SCEV *RHS,
     } else {
       MaxBECount = computeMaxBECountForLT(
           Start, Stride, RHS, getTypeSizeInBits(LHS->getType()), IsSigned);
-      // If we prove the max count is zero, so is the symbolic bound. This can
-      // happen due to differences in how we reason about bounds impied by UB.
-      if (MaxBECount->isZero())
-        BECount = MaxBECount;
     }
 
     if (isa<SCEVCouldNotCompute>(MaxBECount) &&
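The rule added to the ExitLimit constructor can be tried out in isolation with a toy model. This is only a sketch under stated assumptions, not the ScalarEvolution API: `ToyCount` and `ToyExitLimit` are hypothetical names, counts are plain integers rather than SCEV expressions, and `std::nullopt` stands in for `SCEVCouldNotCompute`. It needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a SCEV trip count. std::nullopt plays the role
// of SCEVCouldNotCompute; a value is a known constant count.
using ToyCount = std::optional<std::uint64_t>;

// Hypothetical miniature of ExitLimit holding only the two counts.
struct ToyExitLimit {
  ToyCount ExactNotTaken;
  ToyCount MaxNotTaken;

  ToyExitLimit(ToyCount Exact, ToyCount Max)
      : ExactNotTaken(Exact), MaxNotTaken(Max) {
    // The exact count can never exceed the max count, so a max of zero
    // forces the exact count to zero as well.
    if (MaxNotTaken.has_value() && *MaxNotTaken == 0)
      ExactNotTaken = MaxNotTaken;

    // Same shape as the invariant asserted in the real constructor:
    // a computable exact count requires a computable max count.
    assert((!ExactNotTaken.has_value() || MaxNotTaken.has_value()) &&
           "Exact is not allowed to be less precise than Max");
  }
};
```

In this model, `ToyExitLimit(std::nullopt, ToyCount{0})` ends up with an exact count of zero, while `ToyExitLimit(ToyCount{7}, ToyCount{10})` keeps its exact count of 7. Placing the rule in the constructor means every path that builds a limit gets it, which is the point of moving it out of howManyLessThans.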

llvm/lib/Target/ARM/MVETailPredication.cpp

Lines changed: 12 additions & 8 deletions
@@ -293,14 +293,18 @@ bool MVETailPredication::IsSafeActiveMask(IntrinsicInst *ActiveLaneMask,
     // Check for equality of TC and Ceil by calculating SCEV expression
     // TC - Ceil and test it for zero.
     //
-    bool Zero = SE->getMinusSCEV(
-                      SE->getBackedgeTakenCount(L),
-                      SE->getUDivExpr(SE->getAddExpr(SE->getMulExpr(Ceil, VW),
-                                                     SE->getNegativeSCEV(VW)),
-                                      VW))
-                    ->isZero();
-
-    if (!Zero) {
+    const SCEV *Sub =
+      SE->getMinusSCEV(SE->getBackedgeTakenCount(L),
+                       SE->getUDivExpr(SE->getAddExpr(SE->getMulExpr(Ceil, VW),
+                                                      SE->getNegativeSCEV(VW)),
+                                       VW));
+
+    // Use context sensitive facts about the path to the loop to refine. This
+    // comes up as the backedge taken count can incorporate context sensitive
+    // reasoning, and our RHS just above doesn't.
+    Sub = SE->applyLoopGuards(Sub, L);
+
+    if (!Sub->isZero()) {
       LLVM_DEBUG(dbgs() << "ARM TP: possible overflow in sub expression.\n");
       return false;
     }
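Why the difference of the two expressions only becomes zero once loop guards are applied can be shown with a small numeric sketch. Everything here is an assumption made for illustration: the guard `n >= 1`, the two count formulas, and the helper names are hypothetical and are not taken from the ARM backend or from ScalarEvolution. One side has already been simplified using the guard, the other has not, so they differ as written but agree on every value the guard allows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical count that was simplified WITH the guard n >= 1 in hand.
inline std::uint32_t countWithGuard(std::uint32_t n) { return n - 1; }

// The same quantity written WITHOUT using the guard: umax(n, 1) - 1.
inline std::uint32_t countWithoutGuard(std::uint32_t n) {
  return std::max<std::uint32_t>(n, 1) - 1;
}

// Difference of the two forms, with 32-bit wraparound.
inline std::uint32_t difference(std::uint32_t n) {
  return countWithGuard(n) - countWithoutGuard(n);
}

// Brute-force stand-in for "apply the guard, then test for zero": checks
// the difference for every n in [1, limit]. Keep limit small.
inline bool isZeroUnderGuard(std::uint32_t limit) {
  for (std::uint32_t n = 1; n != 0 && n <= limit; ++n)
    if (difference(n) != 0)
      return false;
  return true;
}
```

Without the guard, `difference(0)` is nonzero, so the two forms cannot be proven equal in general. Restricted to `n >= 1`, the difference is zero for every value checked, which is the toy analogue of subtracting first and refining the result with guard information afterwards.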

llvm/test/Analysis/ScalarEvolution/max-trip-count.ll

Lines changed: 1 addition & 1 deletion
@@ -523,7 +523,7 @@ exit:
 ; of context sensativity.
 define void @ne_zero_max_btc(i32 %a) {
 ; CHECK-LABEL: Determining loop execution counts for: @ne_zero_max_btc
-; CHECK: Loop %for.body: backedge-taken count is (-1 + (zext i32 (1 umax (1 smin %a)) to i64))<nsw>
+; CHECK: Loop %for.body: backedge-taken count is 0
 ; CHECK: Loop %for.body: max backedge-taken count is 0
 entry:
   %cmp = icmp slt i32 %a, 1
